Stewart C. Loken (Chairman)
Director
Information and Computing Sciences
Division
Lawrence Berkeley Laboratory
University of California Berkeley, CA
94720
Richard J. Beach
Manager
Electronic Documents Laboratory
Xerox Palo Alto Research Center
Palo Alto, CA 94304
Kirk T. McDonald
Professor of Physics
Princeton University
Princeton, NJ 08544
Theodore D. Schultz
Research Staff Member
IBM Thomas J. Watson Research
Center
Yorktown Heights, NY 10598
Harry Thacker
Physicist
Fermi National Accelerator Laboratory
Batavia, IL 60510
and Professor of Physics
University of Virginia
Joseph Worth
Senior Physicist
AT&T Bell Laboratories
Holmdel, NJ 07733
Consultants
Miriam Forman
Deputy Executive Secretary
American Physical Society
New York, NY 10017
Peter Adams
Editor
Physical Review B
George Basbas
Editor
Physical Review Letters
Gene Wells
Editor
Physical Review Letters
Peggy Judd
Director
Management Information Systems
American Physical Society
Woodbury, NY 11797
Peggy Sutherland
American Physical Society
Woodbury, NY 11797
8 December 1988
Lawrence Berkeley Laboratory Berkeley, CA
- Review of Charge
- Presentation on LBL Libraries (H. Griffin)
8 February 1989
American Institute of Physics Woodbury, NY
- Integrated Publishing System (P. Judd)
- Tour of Woodbury Facility (C. Fleming)
- Electronic Publishing Overview (P. Judd)
- AIP Information Technology Committee history/status (T.
Ingoldsby)
- Submission of Author-Prepared Manuscripts (P. Sutherland, P. Judd,
L. Siddons, D. Marlette)
- Advantages of Computerized Composition System (D. Carlin)
- SPIN, FIZ-Database Availability of Journal Material (P.
Parisi)
- CD-ROM Update (T. Ingoldsby)
- Optical Disk for APS Archive (S. Quillen)
9 February 1989
American Physical Society, New York, NY
- IEEE/UMI Project (D. Staiger)
- AMS CD-ROM Project (T. Kuzma)
17-18 April 1989
AT&T Bell Laboratories Holmdel, NJ
- Tour of AT&T Bell Laboratories
- Demonstration of CD-ROM Systems
- Document Preparation at Bell Laboratories
16-17 July 1989
Xerox Palo Alto Research Center Palo Alto, CA
- The Electronic Document Laboratory (R. Beach)
- Dialog (J. Maxon-Dadd)
27-28 September 1989
American Physical Society, Washington, DC
- Defense Technical Information Center (K. Molholm)
- Association of Research Libraries
- American Chemical Society Initiatives (R. Marks)
- Preparation of What's New (R. Park)
25-27 March 1990
IBM Thomas J. Watson Research Center, Yorktown Heights, NY
- Preparation of Final Report
In this appendix we enumerate and give some details
about a few important hosts, physics information systems, physics
preprint databases, and databases in several sciences, all connected to
on-line services. Three of these provide full text. No attempt whatsoever
is made to be complete. Many more information services that could be
studied as models are given in various directories of databases and
information systems (Ref. 7).
A few important hosts:
- STN International is a consortium of the Chemical Abstracts Service
(CAS) of the American Chemical Society (ACS), Fachinformationszentrum
Energie, Physik, Mathematik GmbH (FIZ Karlsruhe), and the Japan
Information Center of Science and Technology (JICST). Interactive
access is available by dialing up through the commercial networks
Telenet, Tymnet, and CompuServe, or by dialing up directly.
- Bibliographic Retrieval Service (BRS) and Dialog Information
Services, Inc. (DIALOG) are private firms hosting enormous numbers of
databases that are accessible by dial-up over commercial networks like
TELENET and TYMNET. Some institutions provide quicker access to these
services.
Two physics and one biomedical information system(s):
- PINET, the Physics Information System of the AIP. This system
offers gateways to BITNET, Internet, Telemail, etc. (for users who have
no e-mail facility at their home institutions), two bibliographic
databases (SPIN and General Physics Advance Abstracts, both
described below), two bulletin boards, several searchable lists (job
notices, announcements, and news items including What's New), a
calendar of physics meetings, an AIP publications catalogue with
on-line ordering, and the biannual Directory of Physics and Astronomy
Staff. The AIP is both producer and host; the interactive access is by
direct dial-up to an 800 number or over the Internet using the Telnet
protocol.
- SIS, the Superconductivity Information System of the DOE. This
system offers a number of services in response to the great interest in
high-temperature superconductivity: the Preprint Database and Energy
Database described below, a bulletin board, a work-in-progress
database, and an e-mail facility for SIS users. The producer and host
is the Integrated Technical Information System (ITIS), Office of
Scientific and Technical Information (OSTI) of the DOE. Interactive
access for some users is by dial-up over the government's FTS network;
access for anyone is by dial-up over Telenet.
- MEDLARS, the Medical Literature Analysis and Retrieval System at
the National Library of Medicine of the National Institutes of Health,
produces the bibliographic biomedical database MEDLINE (see below). It also
operates an on-line network to provide health professionals with
remote-terminal interactive access to more than 20 databases produced
at the NLM and elsewhere. This system is currently accessed by 15,000
institutions and private individuals.
Two Bibliographic Databases of Physics Preprints:
- The Preprint Database (PDB) of the Superconductivity Information
System (SIS), referred to above, lists bibliographic information
including author-provided abstracts for preprints related to
high-temperature superconductivity that are collected in a variety of
ways.
- The High-Energy Physics (HEP) database lists bibliographic
information (but does not include abstracts) for high-energy preprints
coming into the library of the Stanford Linear Accelerator (SLAC) and
subsequent publication information when known, using the Stanford
Information Retrieval System (SPIRES). SLAC is both producer and host.
Interactive access over one of the networks connecting mainframes is
available to many high-energy physics groups. It is also available by
dialing up over the TELENET commercial network. Noninteractive access,
in which requests are sent and answers are later returned, is available
from QSPIRES over BITNET.
Some Bibliographic Databases of Published Literature in Physics,
Mathematics, Chemistry, Biology, and Biomedicine:
- INSPEC lists bibliographic information and abstracts of published
papers, covering physics (on-line version of Physics Abstracts),
electrical and electronics engineering, computer science, and library
and information science, but not chemistry. The producer is the
Institution of Electrical Engineers (IEE) in London; the hosts include
DIALOG and BRS.
- PHYS, the on-line version of Physics Briefs, lists bibliographic
information and abstracts of published papers in physics, physical
chemistry, biophysics, geophysics, astronomy, and astrophysics. The
producers are FIZ Karlsruhe in cooperation with the Deutsche
Physikalische Gesellschaft and the AIP. The host is STN
International.
- SPIN (Searchable Physics Information Notices) is a subset of PHYS,
listing bibliographic information and abstracts for all articles in
publications of the APS, the AIP, and its other member societies,
including the Russian and Chinese translation journals. The producer is
the AIP; the hosts are DIALOG and, recently, the AIP's PINET (see
above). The contents of SPIN eventually appear as part of PHYS (see
above).
- General Physics Advance Abstracts is an advance listing of most of
the entries in SPIN (see above), giving similar information about
papers accepted for, but in advance of, publication (hence, the
publication information is tentative and less complete, and the
abstracts may still be preliminary). The producer is the AIP; the host,
the AIP's PINET.
- CA SEARCH, the on-line version of Chemical Abstracts, lists
bibliographic information, mostly with abstracts, for the published
chemical literature, patents, technical reports, etc. The producer is
the Chemical Abstracts Service of the American Chemical Society; the
hosts include BRS, DIALOG, and STN.
- DOE ENERGY is a bibliographic database with abstracts of published
information (and some unpublished) of interest to the DOE. The producer
is OSTI of the DOE. The hosts include DIALOG, STN, and ITIS-OSTI-DOE
itself. Access for the general public is by dialup; subscribers to SIS
(see above) have access through SIS.
- BIOSIS PREVIEWS, the on-line version of Biological Abstracts and
Biological Abstracts/RRM (Reports, Reviews, Meetings) combined, lists
bibliographical information and (for entries in Bio. Abst. since 1976)
abstracts in the biological and biomedical sciences. The producer is
BIOSIS; the hosts include BRS, DIALOG, and STN.
- MEDLINE (MEDLARS Online), the on-line version of Index Medicus and
other medical indexes, lists bibliographic information for the
biomedical literature, about 60% with abstracts. The producer is the
U.S. National Library of Medicine (NLM), an organization set up by
Congressional mandate as a component of the National Institutes of
Health, which makes MEDLINE available through MEDLARS (see above); the
hosts are BRS, DIALOG, Mead Data Central, and others.
Three Full-Text Databases in Mathematics, Chemistry, and
Biomedicine:
- Math/Sci Online includes all information in four printed
publications: Mathematical Reviews, Current Mathematical Publications,
Current Index to Statistics, and Index to Statistics and Probability,
which means full text including equations. This is possible because the
contents are coded in AMS-TeX, the AMS's version of TeX, and the AMS
provides the program for decoding and printing on paper or a graphics
terminal. The producer is the AMS with the American Statistical
Association and the Institute of Mathematical Statistics; the hosts are
BRS, DIALOG, EasyNet, and others.
- CJO (Chemical Journals Online) is a database containing full text
of the chemical journals of The Royal Society of Chemistry and of John
Wiley & Sons and of 19 primary journals published by the ACS. The
producers are the ACS, the RSC, and JW&S; the host is STN (of which
the ACS is a part). Illustrations and many tables are not included;
equations are in a decipherable language.
- Comprehensive Core Medical Library (CCML) is a database containing
full text of numerous major medical textbooks and medical journals.
Both the producer and host are BRS.
Selected CD-ROM Products in Science and Technology
The following is a representative list of current CD-ROM database products
covering science and technology. It is not intended to provide a complete
catalog, but to indicate the range of products that are available.
- Science Citation Index
Published by ISI
Covers 1986-1990
- Contains citations to articles in 3300 of the world's leading
science and technical journals indexed by cited author, cited work,
cited patent, title word, journal title, author name, and author
address.
- NTIS
Several publishers
1983-1990
- Bibliographic citations and abstracts to government-sponsored
research and development reports produced by the National Technical
Information Service.
- Math/Sci
Silver Platter
1990
- CD-ROM version of the American Mathematical Society's MathSci
Database.
- SciTech Plus
Bowker
1990
- Includes American Men & Women of Science, Directory of American
Research & Technology, Corporate Technology Directory, listing of
sci-tech books from Books in Print, and sci-tech journals from
Ulrich's.
- Computer Library
Ziff-Davis
1990
- Bibliographic information and abstracts of articles from over 120
computer, technical, and business periodicals.
- Thomas Register
DIALOG
1990
- CD-ROM version of Thomas Register of American Manufacturers.
- Compendex
Dialog
1990
- Engineering Index and Engineering Meetings.
- Biological Abstracts
Silver Platter
1990
- CD-ROM version of BIOSIS.
- Medline
Several Publishers
1990
- National Library of Medicine's database.
- Life Sciences Collection
Cambridge
1982-1990
- Abstracts from more than 5000 journals and other sources covering
many fields in the life sciences.
- CD-GENE
- DNA and Protein (amino acid) sequences in GENBANK, EMBL, and
Protein Information Resources.
- McGraw-Hill Science & Technology Reference Set
- McGraw-Hill Concise Encyclopedia of Science and Technology and the
McGraw-Hill Dictionary of Scientific and Technical Terms.
- Kirk-Othmer Encyclopedia of Chemical Technology
Wiley
- PC-PDF
International Center for Diffraction Data
- CD-ROM version of the Powder Diffraction File.
- Registry of Mass Spectral Data
Wiley
- CD-ROM version of the 1987 Registry of Mass Spectral Data.
- Geophysics of North America
National Oceanic and Atmospheric Administration (NOAA)
- Includes magnetics, gravity, earthquake seismology, thermal aspect,
stress data, and satellite imagery.
In this appendix we describe three current
experiments in delivering and using technical information electronically,
all of which have been referred to in Sec. II. The experiments are in the
fields of mathematics, electrical engineering, and chemistry.
- Math/Sci Disc (on CD-ROMs through University Microfilms, Inc.)
-
For many years, the AMS has published Mathematical Reviews, a
prestigious monthly reviewing journal, and Current Mathematical
Publications, a triweekly current awareness journal, both in the
mathematical sciences. Similarly, the Association for Computing
Machinery has been publishing the monthly Computing Reviews and the
annual ACM Guide to Computing Literature.
Since 1982 the full text of these journals, including fully
formatted equations, has been available on-line through BRS, DIALOG,
and ESA-IRS, as well as via several other systems. The availability
of full text (including equations) is made possible by having the
article entirely encoded in AMS-TeX, one of the TeX family of
formatting languages (recall that TeX was developed under the
sponsorship of the AMS) and by supplying all users with the TeX
programs needed to transform TeX code into fully formatted text.
If provision of truly full text is one peculiarity of this
experiment, the relatively small size of these two journals is the
other; all these journals from 1985 to 1988 plus an index for
searching easily fit on one CD-ROM. The AMS now makes all this
material available on a CD-ROM product produced for them by
SilverPlatter International. Every six months the AMS brings out a
new disc which, until now, has contained all the text of the previous
disc, all additions for the previous six months, and a fully updated
index. To fill in the large time gap between issuance of discs, users
are encouraged to refer to the on-paper or on-line versions of these
journals.
The discs are leased to users, who must then return the previous
disc. What will happen to the time span of articles included and to the
index when it becomes necessary to issue a second disc is not yet
clear, since the situation has yet to arise.
The AMS has also introduced a novel pricing scheme. Whereas a
library with no subscription to Mathematical Reviews (on paper) would
pay $3510 to lease MathSci Disc for a year, a library with a
subscription to Mathematical Reviews would pay only $2106 for MathSci
Disc; an individual at an institution already having a subscription
to MathSci Disc would have to pay only $351 for an additional
subscription for his/her personal use only. The AMS believes that
typical users of MR are inclined to wish to do their own, unhurried
searching, so that a personal copy of MathSci Disc (and, of course,
one's own CD-ROM reader and workstation) should have special
appeal.
Perhaps anticipating the day when there will be two or more
MathSci Discs, the AMS is also considering the possibility of
supplying this information on magnetic tapes for uploading onto local
mainframes or LAN servers. The fees to be charged and other issues
will be negotiated.
- IEEE/IEE Publications Ondisc (IPO)
-
This experiment, described briefly in Sec. II, is a collaborative
effort of the IEEE, the IEE of the UK, and UMI (University
Microfilms, Inc.) to provide a large amount of the electro-technical
literature on CD-ROMs together with some machine searching. Until
very recently, the system they developed was in a "beta test"
(described below). It has just been decided to market the product
commercially. The information presented here comes from two meetings
with David Staiger, then Director of Publications of the IEEE (one
meeting with the Task Force, the other with the AIP Subcommittee on
Electronic Information Technologies to which one Task Force member
belongs), and from extensive conversations with Staiger and Dan
Crawley, the Associate General Manager of the IEEE. A detailed report
with emphasis on the user's point of view has been published by Mary
Holland, Librarian for National Semiconductor Corp., Santa Clara, CA
(Ref. 8).
-
General Idea
- A large amount of IEEE and IEE literature for search,
retrieval, and printing is provided through UMI on CD-ROMs. For
now, this is viewed as a delivery mode that parallels the
traditional on-paper subscriptions. It also allows the IEEE to
have a presence and to develop experience in a mode that is
becoming popular and can lead to better things.
-
Contents:
- Text, tables, figures, etc.: All IEEE and IEE published
journals (60), the published proceedings of about 1/3 of all
IEEE-sponsored conferences (150), and a complete set of new
IEEE standards (1000-2000) and magazines (20). The beta test
began with this literature for 1988, about 200,000 pages in
all, which was updated twice monthly.
- INSPEC for indexing: Bibliographic information (titles,
authors, institutions, publication information, abstracts) for
all the above articles (which constitute about 26% of the
electrotechnical literature) but no others, drawn from the
IEE's INSPEC database.
-
Production of Discs.
- Image Discs: Using a dedicated computer, scanner, and
workstations in a system of their own design, UMI bit-maps the
articles as full pages (omitting the advertisement pages in the
magazines) at 300 dpi and with the same compression as on a fax
machine (CCITT Group 4 standard for compression algorithms), giving an
average compression ratio of about 1/10. These bit mappings are
recorded on a WORM disk, which is then used as a master in
making the CD-ROMs. These disks, which are not machine
readable, appear twice monthly.
- Index Discs: The selected portions of the INSPEC database
are stored on CD-ROMs in machine-readable form and keyed to the
bit-mapped articles. These index discs thus provide a
searchable index to the articles, although the appearance of
the entries in this database can lag behind the appearance of
the articles by as much as four months. An updated replacement
disc appears monthly. It is estimated that a single index disc
will fill up in about six years, unless it is decided to
include a larger fraction of INSPEC than is now included.
-
Just-completed beta test by UMI:
- Sites: Four U.S. universities (Michigan, Illinois,
Stanford, and Polytechnic University of Brooklyn), two U.S.
government labs (NIST and NRL), four U.S. industrial labs (GE,
Xerox-Webster, Hughes Aircraft, and Hewlett-Packard), and two
U.K. sites (Imperial College and GEC Hirst Research).
- Installations: At each site, UMI installed a UMI Image
Workstation consisting of a Sigma Designs LView high-resolution
monitor with a 386 NCR processor, 4 Mb RAM, 30 Mb hard disk,
and two Toshiba internal CD-ROM drives. There are attached an
inkjet printer for printing only abstracts and a Canon Series
II laser printer for printing full pages.
-
Features:
- Searching: All 59 of INSPEC's fields can be searched, or
the set of fields to be searched can be selected. Search
criteria can include the Boolean AND, OR, and NOT combinations
and four kinds of proximity operators.
- Retrieval: A list of "hits" can be displayed, and a cursor
can be used to select any title for viewing, downloading, or
printing. The disk number containing the article appears on
screen, whereupon the correct disk must be inserted, after
which the first page appears, a far cry from a completely
automated system. The hit list is stored so that further hits
can be viewed without reinserting the index disk.
- On-screen display: pages can be "turned" either way and any
portion of a page can be magnified or reduced.
- Printing: only full pages can be printed.
-
User reactions in the beta test:
- User information was gathered in the test, much of it through a
questionnaire presented to the user after accessing the database.
These answers were unreliable because users were reported to be
impatient to start searching. Connection time, actual
publications printed, and number of pages printed were also
monitored.
- Miriam Forman of the APS, with the aid of some Task Force
members, sought out acquaintances at several of the test sites
for reaction to the system. This anecdotal survey brought a
number of reactions: it would be more useful if the
installation were closer to the user; it is useful mainly as a
very sophisticated copier, especially for students; the quality
of full-article printing with the laser printer was much better
than photocopying the original; while the index disc was copied
onto the workstation's hard disk, the image discs were kept at
the reference desk and handed out on request; and the searching
was good for names and titles, primitive beyond that. No one
saw this as a universal resource, and so there were no
complaints about the narrowness of the search window, although
some did use it to browse and/or search by subject.
- Decision to market: It has just been decided to market
the IPO product essentially as tested, as a worthwhile parallel
delivery mode for which there may be some demand, fully realizing
its shortcomings. The system is seen by the IEEE as only the first
step in a process that will lead ultimately to fully electronic
publishing delivered to the user in a number of different modes,
including on-line, and possibly much expanded in the literature it
covers.
- Pricing: In the marketing of IPO now contemplated, an
initial package will be offered of three back years and the current
year. To commercial research libraries, where paying for off-prints
is believed to be undesirable, the initial package will probably
cost about $26K with no charge for off-prints. To academic
libraries, the package will cost $12K-13K, but off-prints will
cost $0.20/page. In subsequent years, the subscription price will
be $8K or $4K per year, depending on the kind of library. All these
disks will be leased, not sold, so that a library wishing to have a
durable archive of this literature will have to retain an on-paper
subscription.
- Compensating the publishers: Although 95% of the
contents will come from the IEEE (the other 5% coming from the
IEE), the journals are distinct and will be compensated distinctly,
in proportion to the number of off-prints made, whether or not the
user has paid separately for each off-print.
-
Problems:
- Limited searching. Because the index disc contains only
that portion of the INSPEC database relating to the documents
on the image discs, every search is far from complete, missing
about 70% of the literature, and so must be repeated elsewhere
(perhaps in INSPEC-on-line from DIALOG or BRS). To solve this
problem, the entire INSPEC index could be put on the index
CD-ROMs, but this would shorten the time until the index
required more than one CD-ROM.
- Index on several CD-ROMs. It is estimated that in six years
or less the index itself will occupy more than one CD-ROM. At
that time, each search will have to be duplicated if it is to
extend over more than that period. It is felt by IEEE people
that by that time the system will have evolved, possibly to an
on-line index coupled to storage on CD-ROMs.
- Inefficient searching. The searching algorithm is not as
expert as it could be, and only the INSPEC fields, not the full
text, are searchable. Staiger of the IEEE believes that
full-text searching would increase the hit rate by only 20%,
although it might also greatly increase the efficiency if
combined with a smarter search algorithm.
- Slow, cumbersome access. The problems associated with
having to access a whole library of CD-ROMs are obvious. When,
eventually, full text is stored, the number of CD-ROMs will be
greatly reduced, but then two other problems arise: (1)
interactive searching of full text, now possible in principle,
is far more difficult if it must go over several CD-ROMs, and
(2) since a much greater part of the literature will now reside
on one or a few CD-ROMs, the problem of simultaneous users is
exacerbated.
- Full-text storage. The problem of storing full text
digitally is difficult. It is less difficult if all the
material is produced by one organization, the IEEE, but when
the operation is so complex as to produce 200,000 pages a year,
enforcing a uniform standard even then is difficult. And if
this system is ever to cover the whole electro-technical
literature, with 5,000 publishers, digital storage will be
impossible unless there is a single widely-accepted standard
imposed on publishers (if not on authors), or some machine
translation between formatting languages. In this connection,
the IEEE has been moving to promote such a standard, like
SGML.
- Impact on other media. The impact on the IEEE on-paper
offerings of making IPO universally available can only be
conjectured, but the contemplated leasing (rather than selling)
is consistent with the philosophy that IPO will be a parallel,
additional mode to on-paper, at least for the moment.
- Chemistry Online Retrieval Experiment (CORE)
-
This experiment, in its first stages, involves the ACS, the Online
Computer Library Center (OCLC), Bellcore, and Cornell University,
each of which plays a distinct role:
- The ACS is supplying the computer code that generates the text
of 20 ACS journals for the past 10 years, text that is now
available on-line in the database Chemical Journals Online (CJO)
from STN. This code will be updated through the course of the
experiment as new issues of these journals are published. The ACS
also is providing microfilms of the full pages of the journals as
published.
- Bellcore is producing a tape that will contain two files: the
ASCII code of the text, stripped of all formatting instructions and
therefore suitable for searching, and a bit mapping of the original
pages at 300 dpi, obtained from scanning the microfilm. At this
point, no hybrid file is being produced in which the figures are
bit-mapped and the text is stored in ASCII, nor does the stored
ASCII code contain all the formatting instructions needed to
generate the textual part of the journal pages. Bellcore is also
providing a rather simple search program, Superbook, to be used in
the search studies.
- OCLC is providing a more comprehensive search facility,
X-MEMEX, which supports a large variety of search strategies. OCLC
will also have access to the tapes produced by Bellcore so that
they can do their own experiments on searching and using.
- Cornell, through its Mann Library, will install these tapes on
an appropriate LAN that will be available for use by university
chemists. Cornell psychologists will be able to monitor this usage
to study the man-system interface, the way the system is used, and
the effectiveness of various search strategies.
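Both IPO and CORE, as described above, pair a machine-searchable text record with bit-mapped page images that are only displayed or printed. The following Python sketch illustrates that pairing under stated assumptions: the record fields, sample titles, disc numbers, and function names are all hypothetical, and the code is not the UMI, Superbook, or X-MEMEX software.

```python
from collections import defaultdict

# Hypothetical sample records: searchable text plus the number of the
# image disc that holds the bit-mapped pages of the article.
RECORDS = [
    {"id": 1, "text": "Flux pinning in high-temperature superconductors", "image_disc": 3},
    {"id": 2, "text": "Noise in superconducting quantum interference devices", "image_disc": 3},
    {"id": 3, "text": "High-temperature fatigue of solder joints", "image_disc": 7},
]


def tokenize(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return [word.strip(".,;:()").lower() for word in text.split()]


def build_index(records):
    """Map each word to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for record in records:
        for word in tokenize(record["text"]):
            index[word].add(record["id"])
    return index


def search(index, all_of, none_of=()):
    """Boolean search: every word in all_of (AND), no word in none_of (NOT)."""
    if not all_of:
        return set()
    hits = set.intersection(*(index.get(word.lower(), set()) for word in all_of))
    for word in none_of:
        hits -= index.get(word.lower(), set())
    return hits


def discs_needed(records, hits):
    """Return the image discs a user would have to load to view the hits."""
    return sorted({r["image_disc"] for r in records if r["id"] in hits})


INDEX = build_index(RECORDS)
HITS = search(INDEX, all_of=["high-temperature"], none_of=["solder"])
print(sorted(HITS), discs_needed(RECORDS, HITS))
```

In IPO the searchable text is limited to the INSPEC fields, whereas in CORE it is the full ASCII text of each article; in this sketch only the contents of the hypothetical "text" field would differ.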
In this report, and in thinking about electronic information problems
generally, certain numbers that characterize the literature produced by
the APS are constantly coming up. We have collected some of these here.
- Characters (bytes) on a Physical Review page of solid text: 6.5 Kb
[2 columns/page x 61 lines/col x 54 characters (bytes)/line]. Size of
1989 Physical Review, Physical Review Letters, and Reviews of Modern
Physics: 50,000 pages.
- Bytes in all 1989 Physical Review and Physical Review Letters if
all text: 325 Mb (50,000 pages x 6.5 Kb/page).
- Fraction of Physical Review pages devoted to figures: 15% (This is
a very crude estimate based on a random sampling of all pages with
numbers ending in 26 or 76 in the issues PRB-Oct 1, PRB-Oct 15, and
PRA-Oct 15 in 1989. The percentages found were 20, 8, and 15,
respectively; the percentage for all three issues was 15).
- Storage requirements for figures:
-
- Resolution:
-
- Fax, current ("Group 3" standards): 200 dots per inch
(dpi)
- Fax, future ("Group 4" standards): 400 dpi
- A typical laser printer: 300 dpi
- IEEE/IEE/UMI experiment: 300 dpi
- Very good printing on paper: 600 dpi
- High-quality printing with good grey scale: 1200 dpi
- Compression factor (C) when bit mapping:
-
- Average figure for text, graphics, illustrations: 1/10
- Good figure for line drawings: 1/20
- Storage of text versus storage of line drawings in Physical
Review:
-
- Text (see above): 52 kbit/page
- Area of text: 65 sq. in./page
- Bits at 300 dpi, no compression: 5.85 Mbit/page
- Bits at 300 dpi, compression 1/20: 0.29 Mbit/page
- Expansion ratio by which storage must be increased when an
area of straight text is replaced by a graphic figure to be
reproduced with 300 dpi, assuming a compression factor of 1/20:
5.6
- Expansion factor if resolution is R dpi and compression
ratio is C: $5.6 \times (R/300)^2 \times 20C \approx 0.00124 \times R^2 \times C$
- Effective expansion ratio if graphics occupies 15% of area
and text occupies the rest assuming 300 dpi and compression of
1/20: 1.8
- Transmitting Physical Review pages and issues
electronically:
-
- At 9,600 bits/sec:
-
- about 0.2 pages/sec if just text (52 kbit/page)
- about 0.1 page/sec if ASCII text with 15% graphics (compression
ratio 1/20)
- At 1.5 Mbit/sec:
-
- about 30 pages/sec if just text
- about 15 pages/sec if ASCII text, 15% graphics, etc.
- 1 year of APS publications in 1/2 hour, if just text
- 1 year of APS publications in 1 hour, if ASCII text, 15%
graphics, etc.
- Putting a current year of Physical Review on CD-ROM, without
index:
-
- Bit-mapping 50,000 entire pages with a compression factor of
1/10 (the factor believed to be used by the IEEE experiment): 3,650
Mb, or about 6 CD-ROMs
- Text in ASCII code, illustrations bit-mapped with compression
1/20: 585 Mb, or about 1 CD-ROM
- The fractions of the periodical literature (serials) in
English:
-
- APS journals: 1/9
- AIP journals: 1/9
- AIP translation journals: 1/9
- The size of world's scientific periodical literature: 150
million pages
This is a very crude estimate by Feinberg (Ref. 6), that could
easily be off by a factor of 2 or 3. It is the product of three
factors:
- 9 million abstracts in decade 1977-86: 6 million in Chemical
Abstracts, 3 million in Bio Abstracts, 1 million in Physics
Abstracts (overlaps are neglected)
- factor of 3 to include all years before 1977
- 5 pages per paper (below, we assume big Physical Review pages);
Feinberg (Ref. 6) rounds up to 150 million pages
- Storing the world's scientific periodical literature:
-
- Assuming that all is ASCII-coded text:
150 million pages = 1 Tb = one 12-in. reel of "digital paper" tape = 1,600
CD-ROMs
- Assuming all is bit-mapped at 300 dpi, compression ratio 1/10:
150 million pages = 11 Tb = eleven 12-in. reels of digital paper tape = 18,000
CD-ROMs
- Size of the total physics periodical literature in English: 8
million pages
-
This estimate is obtained by a quite different but also crude method,
based on two assumptions:
- Last year the total physics periodical literature in English
was 9 times that in the Physical Review, or 450,000 pages.
- The rate of growth is and always has been exponential with a
doubling time of ten years. Thus, there will be about 8.2 million
pages produced in the next ten years, and the total literature
until now should be about the same, 8.2 million pages.
- Storing the world's English-language physics periodical
literature:
- Roughly 1/20 the figures for the scientific periodical literature
given above.
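Several of the storage figures in this appendix follow from a few lines of arithmetic. The Python sketch below reproduces them from the quantities quoted above. The function names are illustrative, decimal units are assumed (1 Mb taken as 10^6 bytes, following the appendix's use of Mb for megabytes), the effective expansion ratio of 1.8 is taken as quoted rather than rederived, and the 600 Mb CD-ROM capacity is an assumption inferred from the statement that 3,650 Mb fills about 6 CD-ROMs.

```python
# Quantities quoted in the appendix.
COLUMNS_PER_PAGE = 2
LINES_PER_COLUMN = 61
CHARS_PER_LINE = 54
TEXT_KB_PER_PAGE = 6.5        # rounded page size used in the appendix
PAGES_PER_YEAR = 50_000
TEXT_AREA_SQ_IN = 65          # area of text per page
EFFECTIVE_EXPANSION = 1.8     # quoted for 15% graphics, 300 dpi, C = 1/20

# Assumption: nominal CD-ROM capacity in megabytes (not stated in the text).
CD_ROM_MB = 600


def characters_per_page():
    """Characters (bytes) on a page of solid text."""
    return COLUMNS_PER_PAGE * LINES_PER_COLUMN * CHARS_PER_LINE


def text_bits_per_page():
    """Bits needed to store one page as character codes (52 kbit)."""
    return TEXT_KB_PER_PAGE * 1000 * 8


def bitmap_bits_per_page(dpi, compression):
    """Bits needed to bit-map the text area at the given resolution."""
    return TEXT_AREA_SQ_IN * dpi ** 2 * compression


def expansion_ratio(dpi, compression):
    """Storage for a bit-mapped area relative to the same area as text."""
    return bitmap_bits_per_page(dpi, compression) / text_bits_per_page()


def yearly_text_mb():
    """One year of pages stored as text only, in megabytes."""
    return PAGES_PER_YEAR * TEXT_KB_PER_PAGE / 1000


def yearly_bitmap_mb(dpi, compression):
    """One year of pages stored entirely as bit maps, in megabytes."""
    return PAGES_PER_YEAR * bitmap_bits_per_page(dpi, compression) / 8 / 1e6


def transfer_hours(megabytes, bits_per_second):
    """Time to transmit the given volume over a line of the given speed."""
    return megabytes * 8e6 / bits_per_second / 3600


text_mb = yearly_text_mb()
mixed_mb = text_mb * EFFECTIVE_EXPANSION
bitmap_mb = yearly_bitmap_mb(300, 1 / 10)

print("characters per page:", characters_per_page())
print("expansion ratio at 300 dpi, C = 1/20:", round(expansion_ratio(300, 1 / 20), 1))
print("one year, text only (Mb):", round(text_mb))
print("one year, text plus bit-mapped figures (Mb):", round(mixed_mb))
print("one year, all bit-mapped (Mb):", round(bitmap_mb))
print("CD-ROMs, all bit-mapped:", round(bitmap_mb / CD_ROM_MB, 1))
print("hours at 1.5 Mbit/sec, text only:", round(transfer_hours(text_mb, 1.5e6), 2))
print("hours at 1.5 Mbit/sec, with figures:", round(transfer_hours(mixed_mb, 1.5e6), 2))
```

Changing the resolution or compression arguments shows how sensitive the storage estimates are to those two choices, since the bit-mapped size grows with the square of the resolution.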
This appendix describes the operations at the APS Editorial Offices in Ridge,
NY. Editorial operations, in contrast to production operations, at Ridge
are based on paper records of manuscripts and correspondence. The paper
file is the authoritative source and record of communications, history,
and status of the editorial handling of a manuscript. There is
considerable electronic assistance given to record keeping, management of
the editorial process, and communications.
For record keeping and management there are two databases, one for the
administrative record and status of manuscripts (data for over 100,000
manuscripts have been accumulated over the last decade and a half) and
one for referees (more than 15,000).
Communications are now assisted by e-mail, fax, and telex. Reports
from referees and status inquiries from authors are now routinely
received via e-mail, fax, and telex. The traffic is growing. There were
7,550 incoming e-mail communications in 1988 and 13,600 in 1989 (an
average of one per manuscript submitted). The sum of fax and telex is
about half that of e-mail and growing at the same rate. E-mail is now
used for about 30% of referee reports. Outgoing correspondence is
generally limited to reminders to referees and status reports to
authors.
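
The traffic figures in this paragraph can be put side by side with a short calculation. The Python sketch below is illustrative only; the variable names are invented for this example, and the 1989 submission count is the rounded "13.5K manuscripts" quoted later in this appendix.

```python
# E-mail traffic figures quoted in the text.
EMAIL_1988 = 7_550
EMAIL_1989 = 13_600
SUBMISSIONS_1989 = 13_500   # rounded figure quoted later in this appendix

# Year-over-year growth of incoming e-mail.
email_growth = EMAIL_1989 / EMAIL_1988 - 1

# Incoming e-mail messages per manuscript submitted in 1989.
emails_per_manuscript = EMAIL_1989 / SUBMISSIONS_1989

# Fax plus telex is described as "about half that of e-mail".
fax_telex_1989 = EMAIL_1989 / 2

print(f"e-mail growth 1988-89: {email_growth:.0%}")
print(f"e-mail per manuscript: {emails_per_manuscript:.2f}")
print(f"fax + telex (approx.): {fax_telex_1989:.0f}")
```

On these figures, incoming e-mail grew by about 80% in one year, which is consistent with the statement that it averages one message per manuscript submitted.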
A capability for receiving manuscripts submitted electronically now
exists as part of an experimental enterprise which is restricted perforce
to a few manuscripts per day. Most of these submissions are requests by
the authors to use their keystrokes for production in the compuscript
program of the APS Liaison Office at Woodbury, NY.
Some reflect simply a desire to take advantage of the conveniences and
strengths of e-mail communications. Such authors are automatically
informed of the compuscript program and invited to participate.
There are two requirements for a successful electronic submission: The
file must be readable at Ridge in a way that allows suitable hard copies
to be made for referral (in paper) to reviewers, and figures must be
received promptly (by fax or overnight mail) at the Editorial Office.
To qualify for the compuscript program, an electronic submission must
be prepared under REVTEX. When the paper is accepted for publication, the
author's keystrokes are transmitted to the Woodbury office by the Ridge
office (or transmitted directly to Woodbury by the author).
Production at Ridge (all of PRL, 6,000 pages per year, and 4,000 pages'
worth of Rapid Communications for PR) is based on keyboarding copy-edited
manuscripts into a VAX/750 running UNIX. Camera-ready copy is produced and
sent to the printer (PRL to Canterbury Press in Rome, NY, and the Rapid
Communication pages to Woodbury for collation with the rest of PR).
All operations at Ridge are on paper. All incoming material
(manuscripts, referee reports, communications from authors) must be
converted to paper for the office to operate. Outgoing communications
from the editor to referee or author must be duplicated by a paper copy
in the paper file for the manuscript. When a manuscript is submitted
electronically it is converted to paper. When it is accepted it is keyed
(unless part of the compuscript program) as input to the
computer-assisted typesetter. Finally, it is converted back to paper for
distribution to the reader in a paper journal.
At least four barriers to greater use of electronic media exist for
operations at Ridge.
- Figures are not routinely a part of electronic manuscripts.
- A significant number of the people with whom the APS journals
communicate lack either electronic capabilities or an interest in using
them. Even if all incoming communications were electronic, or converted
to electronic at Ridge, referral to many referees would have to be
converted to paper copy.
- The hardware and software capabilities at Ridge could not support
paperless operations. Even less-paper operations would require hardware
and software upgrades.
- There is a general reluctance to work entirely with electronic media
among people who find paper superior.
Meanwhile, the journals are becoming more international than ever.
In 1989, for the first time, more than half (52%) of the submissions to
Ridge were non-U.S. in origin. In the last three years all the growth in
annual submissions was non-U.S. in origin. From October 1989 through
February 1990, 40% of electronic submissions were from the U.S.; 60% were
non-U.S. in origin. (Electronic submissions during this period totaled 95.)
The leading countries in submissions (all forms) in 1989 were West
Germany (6.9%), Japan (6.3%), Canada (4.6%), France (4.0%), India (3.6%),
and China (3.6%). For comparison, the U.S. leaders were California
(7.4%), New York (6.2%), and New Jersey (3.7%). (Percentages refer to the
total submissions of 13.5K manuscripts.)
The Task Force recommends that the APS hire an individual to track
developments in electronic information technology. The following is a
suggested job description for this individual.
- Monitor experiments with APS journals on tape to be conducted at
Xerox and other places; organize follow-up workshop.
- Gather information on the state of the art; report to APS
Publication managers on:
  - Hardware and software developments.
  - Other databases, including delivery, charging, and
    publisher-payment systems.
  - Current and projected usage by, and needs of, the physics
    community.
- Maintain and enhance contact with other scientific societies,
monitoring their electronic journal programs, especially the
IEEE-IEE-UMI experiment with CD-ROMs, the ACS-Bellcore-Cornell research
with the text of ACS journals, and the AMS marketing of Math.
Reviews on CD-ROMs and tapes.
- Work with the AIP and other partners of the APS now involved in
SPIN, FIZ, and STN on current and possible future electronic products
and their distribution.
- Propose policy alternatives to APS management on electronic
journals and their distribution, including financial and marketing
information.
- Execute policy decisions.
|