APPENDIX A-Task Force Members

    Stewart C. Loken (Chairman)
    Director
    Information and Computing Sciences
    Division
    Lawrence Berkeley Laboratory
    University of California Berkeley, CA
    94720

    Richard J. Beach
    Manager
    Electronic Documents Laboratory
    Xerox Palo Alto Research Center
    Palo Alto, CA 94304

    Kirk T. McDonald
    Professor of Physics
    Princeton University
    Princeton, NJ 08544

    Theodore D. Schultz
    Research Staff Member
    IBM Thomas J. Watson Research
    Center
    Yorktown Heights, NY 10598

    Harry Thacker
    Physicist
    Fermi National Accelerator Laboratory
    Batavia, IL 60510
    and Professor of Physics
    University of Virginia

    Joseph Worth
    Senior Physicist
    AT&T Bell Laboratories
    Holmdel, NJ 07733

    

Consultants

    Miriam Forman
    Deputy Executive Secretary
    American Physical Society
    New York, NY 10017

    Peter Adams
    Editor
    Physical Review B

    George Basbas
    Editor
    Physical Review Letters

    Gene Wells
    Editor
    Physical Review Letters

    Peggy Judd
    Director
    Management Information Systems
    American Physical Society
    Woodbury, NY 11797

    Peggy Sutherland
    American Physical Society
    Woodbury, NY 11797
    

APPENDIX B-Task Force Meetings

8 December 1988

Lawrence Berkeley Laboratory, Berkeley, CA

  • Review of Charge
  • Presentation on LBL Libraries (H. Griffin)

8 February 1989

American Institute of Physics, Woodbury, NY

  • Integrated Publishing System (P. Judd)
  • Tour of Woodbury Facility (C. Fleming)
  • Electronic Publishing Overview (P. Judd)
  • AIP Information Technology Committee history/status (T. Ingoldsby)
  • Submission of Author-Prepared Manuscripts (P. Sutherland, P. Judd, L. Siddons, D. Marlette)
  • Advantages of Computerized Composition System (D. Carlin)
  • SPIN, FIZ-Database Availability of Journal Material (P. Parisi)
  • CD-ROM Update (T. Ingoldsby)
  • Optical Disk for APS Archive (S. Quillen)

9 February 1989

American Physical Society, New York, NY

  • IEEE/UMI Project (D. Staiger)
  • AMS CD-ROM Project (T. Kuzma)

17-18 April 1989

AT&T Bell Laboratories, Holmdel, NJ

  • Tour of AT&T Bell Laboratories
  • Demonstration of CD-ROM Systems
  • Document Preparation at Bell Laboratories

16-17 July 1989

Xerox Palo Alto Research Center, Palo Alto, CA

  • The Electronic Document Laboratory (R. Beach)
  • Dialog (J. Maxon-Dadd)

27-28 September 1989

American Physical Society, Washington, DC

  • Defense Technical Information Center (K. Molholm)
  • Association of Research Libraries
  • American Chemical Society Initiatives (R. Marks)
  • Preparation of What's New (R. Park)

25-27 March 1990

IBM Thomas J. Watson Research Center, Yorktown Heights, NY

  • Preparation of Final Report

APPENDIX C-Some Scientific Databases

In this appendix we enumerate and give some details about a few important hosts, physics information systems, physics preprint databases, and databases in several sciences, all connected to on-line services. Three of these provide full text. No attempt is made to be complete. Many more information services that could be studied as models are listed in various directories of databases and information systems (Ref. 7).

A few important hosts:

  • STN International is a consortium of the Chemical Abstracts Service (CAS) of the American Chemical Society (ACS), Fachinformationszentrum Energie, Physik, Mathematik GmbH (FIZ Karlsruhe), and the Japan Information Center of Science and Technology (JICST). Interactive access is available by dialing up through the commercial networks Telenet, Tymnet, and CompuServe, or by dialing up directly.
  • Bibliographic Retrieval Service (BRS) and Dialog Information Services, Inc. (DIALOG) are private firms hosting enormous numbers of databases that are accessible by dial-up over commercial networks like TELENET and TYMNET. Some institutions provide quicker access to these services.

Two physics and one biomedical information system(s):

  • PINET, the Physics Information System of the AIP. This system offers gateways to BITNET, Internet, Telemail, etc. (for users who have no e-mail facility at their home institutions), two bibliographic databases (SPIN and General Physics Advance Abstracts, both described below), two bulletin boards, several searchable lists (job notices, announcements, and news items including What's New), a calendar of physics meetings, an AIP publications catalogue with on-line ordering, and the biannual Directory of Physics and Astronomy Staff. The AIP is both producer and host; interactive access is by direct dial-up to an 800 number or over the Internet using the Telnet protocol.
  • SIS, the Superconductivity Information System of the DOE. This system offers a number of services in response to the great interest in high-temperature superconductivity: the Preprint Database and Energy Database described below, a bulletin board, a work-in-progress database, and an e-mail facility for SIS users. The producer and host is the Integrated Technical Information System (ITIS), Office of Scientific and Technical Information (OSTI) of the DOE. Interactive access for some users is by dial-up over the government's FTS network; access for anyone is by dial-up over Telenet.
  • MEDLARS, the Medical Literature Analysis and Retrieval System at the National Library of Medicine of the National Institutes of Health, produces the full-text biomedical database MEDLINE (see below). It also operates an on-line network to provide health professionals with remote-terminal interactive access to more than 20 databases produced at the NLM and elsewhere. This system is currently accessed by 15,000 institutions and private individuals.

Two Bibliographic Databases of Physics Preprints:

  • The Preprint Database (PDB) of the Superconductivity Information System (SIS), referred to above, lists bibliographic information including author-provided abstracts for preprints related to high-temperature superconductivity that are collected in a variety of ways.
  • The High-Energy Physics (HEP) database lists bibliographic information (but does not include abstracts) for high-energy preprints coming into the library of the Stanford Linear Accelerator Center (SLAC), and subsequent publication information when known, using the Stanford Information Retrieval System (SPIRES). SLAC is both producer and host. Interactive access over one of the networks connecting mainframes is available to many high-energy physics groups. It is also available by dialing up over the TELENET commercial network. Noninteractive access, in which requests are sent and answers are later returned, is available from QSPIRES over BITNET.

Some Bibliographic Databases of Published Literature in Physics, Mathematics, Chemistry, Biology, and Biomedicine:

  • INSPEC lists bibliographic information and abstracts of published papers, covering physics (on-line version of Physics Abstracts), electrical and electronics engineering, computer science, and library and information science, but not chemistry. The producer is the Institution of Electrical Engineers (IEE) in London; the hosts include DIALOG and BRS.
  • PHYS, the on-line version of Physics Briefs, lists bibliographic information and abstracts of published papers in physics, physical chemistry, biophysics, geophysics, astronomy, and astrophysics. The producers are FIZ Karlsruhe in cooperation with the Deutsche Physikalische Gesellschaft and the AIP. The host is STN International.
  • SPIN (Searchable Physics Information Notices) is a subset of PHYS, listing bibliographic information and abstracts for all articles in publications of the APS, the AIP, and its other member societies, including the Russian and Chinese translation journals. The producer is the AIP; the hosts are DIALOG and, recently, the AIP's PINET (see above). The contents of SPIN eventually appear as part of PHYS (see above).
  • General Physics Advance Abstracts is an advance listing of most of the entries in SPIN (see above), giving similar information about papers accepted for, but in advance of, publication (hence, the publication information is tentative and less complete, and the abstracts may still be preliminary). The producer is the AIP; the host, the AIP's PINET.
  • CA SEARCH, the on-line version of Chemical Abstracts, lists bibliographic information, mostly with abstracts, for the published chemical literature, patents, technical reports, etc. The producer is the Chemical Abstracts Service of the American Chemical Society; the hosts include BRS, DIALOG, and STN.
  • DOE ENERGY is a bibliographic database with abstracts of published information (and some unpublished) of interest to the DOE. The producer is OSTI of the DOE. The hosts include DIALOG, STN, and ITIS-OSTI-DOE itself. Access for the general public is by dialup; subscribers to SIS (see above) have access through SIS.
  • BIOSIS PREVIEWS, the on-line version of Biological Abstracts and Biological Abstracts/RRM (Reports, Review, Meetings) combined, lists bibliographical information and (for entries in Bio. Abst. since 1976) abstracts in the biological and biomedical sciences. The producer is BIOSIS; the hosts include BRS, DIALOG, and STN.
  • MEDLINE (MEDLARS Online), the on-line version of Index Medicus and other medical indexes, lists bibliographic information for the biomedical literature, about 60% with abstracts. The producer is the U.S. National Library of Medicine (NLM), an organization set up by Congressional mandate as a component of the National Institutes of Health, which makes MEDLINE available through MEDLARS (see above); the hosts are BRS, DIALOG, Mead Data Central, and others.

Three Full-Text Databases in Mathematics, Chemistry, and Biomedicine:

  • Math/Sci Online includes all information in four printed publications: Mathematical Reviews, Current Mathematical Publications, Current Index to Statistics, and Index to Statistics and Probability, which means full text including equations. This is possible because the contents are coded in AMSTEX, the AMS's version of TEX, and the AMS provides the program for decoding and printing on paper or a graphics terminal. The producer is the AMS with the American Statistical Association and the Institute of Mathematical Statistics; the hosts are BRS, DIALOG, EasyNet, and others.
  • CJO (Chemical Journals Online) is a database containing full text of the chemical journals of The Royal Society of Chemistry and of John Wiley & Sons and of 19 primary journals published by the ACS. The producers are the ACS, the RSC, and JW&S; the host is STN (of which the ACS is a part). Illustrations and many tables are not included; equations are in a decipherable language.
  • Comprehensive Core Medical Library (CCML) is a database containing full text of numerous major medical textbooks and medical journals. Both the producer and host are BRS.

APPENDIX D-Selected CD-ROM Products

Selected CD-ROM Products in Science and Technology

The following is a representative list of current CD-ROM database products covering science and technology. It is not intended to provide a complete catalog, but to indicate the range of products that are available.

  • Science Citation Index (ISI; covers 1986-1990): Contains citations to articles in 3300 of the world's leading science and technical journals, indexed by cited author, cited work, cited patent, title word, journal title, author name, and author address.
  • NTIS (several publishers; 1983-1990): Bibliographic citations and abstracts to government-sponsored research and development reports produced by the National Technical Information Service.
  • Math/Sci (Silver Platter; 1990): CD-ROM version of the American Mathematical Society's MathSci Database.
  • SciTech Plus (Bowker; 1990): Includes American Men & Women of Science, Directory of American Research & Technology, Corporate Technology Directory, listing of sci-tech books from Books in Print, and sci-tech journals from Ulrich's.
  • Computer Library (Ziff-Davis; 1990): Bibliographic information and abstracts of articles from over 120 computer, technical, and business periodicals.
  • Thomas Register (DIALOG; 1990): CD-ROM version of Thomas Register of American Manufacturers.
  • Compendex (DIALOG; 1990): Engineering Index and Engineering Meetings.
  • Biological Abstracts (Silver Platter; 1990): CD-ROM version of BIOSIS.
  • Medline (several publishers; 1990): National Library of Medicine's database.
  • Life Sciences Collection (Cambridge; 1982-1990): Abstracts from more than 5000 journals and other sources covering many fields in the life sciences.
  • CD-GENE: DNA and protein (amino acid) sequences in GENBANK, EMBL, and Protein Information Resources.
  • McGraw-Hill Science & Technology Reference Set: McGraw-Hill Concise Encyclopedia of Science and Technology and the McGraw-Hill Dictionary of Scientific and Technical Terms.
  • Kirk-Othmer Encyclopedia of Chemical Technology (Wiley).
  • PC-PDF (International Center for Diffraction Data): CD-ROM version of the Powder Diffraction File.
  • Registry of Mass Spectral Data (Wiley): CD-ROM version of the 1987 Registry of Mass Spectral Data.
  • Geophysics of North America (National Oceanic and Atmospheric Administration, NOAA): Includes magnetics, gravity, earthquake seismology, thermal aspect, stress data, and satellite imagery.

APPENDIX E-Full-Text Database Experiments

In this appendix we describe three current experiments in delivering and using technical information electronically, all of which have been referred to in Sec. II. The experiments are in the fields of mathematics, electrical engineering, and chemistry.

Math/Sci Disc (on CD-ROMs through University Microfilms, Inc.)
For many years, the AMS has published Mathematical Reviews, a prestigious monthly reviewing journal, and Current Mathematical Publications, a triweekly current awareness journal, both in the mathematical sciences. Similarly, the Association for Computing Machinery has been publishing the monthly Computing Reviews and the annual ACM Guide to Computing Literature.

Since 1982 the full text of these journals, including fully formatted equations, has been available on-line through BRS, DIALOG, and ESA-IRS, as well as via several other systems. The availability of full text (including equations) is made possible by having the article entirely encoded in AMSTEX, one of the TEX family of formatting languages (recall that TEX was developed under the sponsorship of the AMS) and by supplying all users with the TEX programs needed to transform TEX code into fully formatted text.

If provision of truly full text is one peculiarity of this experiment, the relatively small size of these two journals is the other; all these journals from 1985 to 1988 plus an index for searching easily fit on one CD-ROM. The AMS now makes all this material available on a CD-ROM product produced for them by SilverPlatter International. Every six months the AMS brings out a new disc which, until now, has contained all the text of the previous disc, all additions for the previous six months, and a fully updated index. To fill in the large time gap between issuance of discs, users are encouraged to refer to the on-paper or on-line versions of these journals.

The discs are leased to users, who must then return the previous disc. What will happen to both the timespan of articles included and the index when it becomes necessary to issue a second disc is not yet clear, since the situation has yet to arise.

The AMS has also introduced a novel pricing scheme. Whereas a library with no subscription to Mathematical Reviews (on paper) would pay $3510 to lease MathSci Disc for a year, a library with a subscription to Mathematical Reviews would pay only $2106 for MathSci Disc; an individual at an institution already having a subscription to MathSci Disc would have to pay only $351 for an additional subscription for his/her personal use only. The AMS believes that typical users of MR are inclined to wish to do their own, unhurried searching, so that a personal copy of MathSci Disc (and, of course, one's own CD-ROM reader and workstation) should have special appeal.

Perhaps anticipating the day when there will be two or more MathSci Discs, the AMS is also considering the possibility of supplying this information on magnetic tapes for uploading onto local mainframes or LAN servers. The fees to be charged and other issues will be negotiated.

IEEE/IEE Publications Ondisc (IPO)
This experiment, described briefly in Sec. II, is a collaborative effort of the IEEE, the IEE of the UK, and UMI (University Microfilms, Inc.) to provide a large amount of the electro-technical literature on CD-ROMs together with some machine searching. Until very recently, the system they developed was in a "beta test" (described below). It has just been decided to market the product commercially. The information presented here comes from two meetings with David Staiger, then Director of Publications of the IEEE (one meeting with the Task Force, the other with the AIP Subcommittee on Electronic Information Technologies to which one Task Force member belongs), and from extensive conversations with Staiger and Dan Crawley, the Associate General Manager of the IEEE. A detailed report with emphasis on the user's point of view has been published by Mary Holland, Librarian for National Semiconductor Corp., Santa Clara, CA (Ref. 8).
  • General Idea
    • A large amount of IEEE and IEE literature for search, retrieval, and printing is provided through UMI on CD-ROMs. For now, this is viewed as a delivery mode that parallels the traditional on-paper subscriptions. It also allows the IEEE to have a presence and to develop experience in a mode that is becoming popular and can lead to better things.
  • Contents:
    • Text, tables, figures, etc.: All IEEE and IEE published journals (60), the published proceedings of about 1/3 of all IEEE-sponsored conferences (150), and a complete set of new IEEE standards (1000-2000) and magazines (20). The beta test began with this literature for 1988, about 200,000 pages in all, which was updated twice monthly.
    • INSPEC for indexing: Bibliographic information (titles, authors, institutions, publication information, abstracts) for all the above articles (which constitute about 26% of the electrotechnical literature) but no others, drawn from the IEE's INSPEC database.
  • Production of Discs.
    • Image Discs: Using a dedicated computer, scanner, and workstations in a system of their own design, UMI bit-maps the articles as full pages (omitting the advertisement pages in the magazines) at 300 dpi and with the same compression as on a fax machine (the CCITT Group 4 standard for compression algorithms), giving an average compression ratio of about 1/10. These bit mappings are recorded on a WORM disk, which is then used as a master in making the CD-ROMs. These disks, which are not machine readable, appear twice monthly.
    • Index Discs: The selected portions of the INSPEC database are stored on CD-ROMs in machine-readable form and keyed to the bit-mapped articles. These index discs thus provide a searchable index to the articles, although the appearance of the entries in this database can lag behind the appearance of the articles by as much as four months. An updated replacement disc appears monthly. It is estimated that a single index disc will fill up in about six years, unless it is decided to include a larger fraction of INSPEC than is now included.
  • Just-completed beta test by UMI:
    • Sites: Four U.S. universities (Michigan, Illinois, Stanford, and Polytechnic University of Brooklyn), two U.S. government labs (NIST and NRL), four U.S. industrial labs (GE, Xerox-Webster, Hughes Aircraft, and Hewlett-Packard), and two U.K. sites (Imperial College and GEC Hirst Research).
    • Installations: At each site, UMI installed a UMI Image Workstation consisting of a Sigma Designs LView high-resolution monitor with a 386 NCR processor, 4 Mb RAM, 30 Mb hard disk, and two Toshiba internal CD-ROM drives. There are attached an inkjet printer for printing only abstracts and a Canon Series II laser printer for printing full pages.
  • Features:
    • Searching: All 59 of INSPEC's fields can be searched, or the set of fields to be searched can be selected. Search criteria can include the Boolean AND, OR, and NOT combinations and four kinds of proximity operators.
    • Retrieval: A list of "hits" can be displayed, and a cursor can be used to select any title for viewing, downloading, or printing. The disk number containing the article appears on screen, whereupon the correct disk must be inserted, after which the first page appears-a far cry from a completely automated system. The hit list is stored so that further hits can be viewed without reinserting the index disk.
    • On-screen display: pages can be "turned" either way and any portion of a page can be magnified or reduced.
    • Printing: only full pages can be printed.
  • User reactions in the beta test:
    • User information was gathered in the test, much of it through a questionnaire presented to users after they accessed the database. These answers were unreliable because users were reported to be impatient to start searching. Connection time, actual publications printed, and number of pages printed were also monitored.
    • Miriam Forman of the APS, with the aid of some Task Force members, sought out acquaintances at several of the test sites for reaction to the system. This anecdotal survey brought a number of reactions: it would be more useful if the installation were closer to the user; it is useful mainly as a very sophisticated copier, especially for students; the quality of full-article printing with the laser printer was much better than photocopying the original; while the index disc was copied onto the workstation's hard disk, the image discs were kept at the reference desk and handed out on request; and the searching was good for names and titles, primitive beyond that. No one saw this as a universal resource, and so there were no complaints about the narrowness of the search window, although some did use it to browse and/or search by subject.
  • Decision to market: It has just been decided to market the IPO product essentially as tested, as a worthwhile parallel delivery mode for which there may be some demand, fully realizing its shortcomings. The system is seen by the IEEE as only the first step in a process that will lead ultimately to fully electronic publishing delivered to the user in a number of different modes, including on-line, and possibly much expanded in the literature it covers.
  • Pricing: In the marketing of IPO now contemplated, an initial package will be offered of three back years and the current year. To commercial research libraries, where paying for off-prints is believed to be undesirable, the initial package will probably cost about $26K with no charge for off-prints. To academic libraries, the package will cost $12K-13K, but off-prints will cost $0.20/page. In subsequent years, the subscription price will be $8K or $4K per year, depending on the kind of library. All these disks will be leased, not sold, so that a library wishing to have a durable archive of this literature will have to retain an on-paper subscription.
  • Compensating the publishers: Although 95% of the contents will come from the IEEE (the other 5% coming from the IEE), the journals are distinct and will be compensated distinctly, in proportion to the number of off-prints made, whether or not the user has paid separately for each off-print.
  • Problems:
    • Limited searching. Because the index disc contains only that portion of the INSPEC database relating to the documents on the image discs, every search is far from complete, missing about 70% of the literature, and so must be repeated elsewhere (perhaps in INSPEC-on-line from DIALOG or BRS). To solve this problem, the entire INSPEC index could be put on the index CD-ROMs, but this would shorten the time until the index required more than one CD-ROM.
    • Index on several CD-ROMs. It is estimated that in six years or less the index itself will occupy more than one CD-ROM. At that time, each search will have to be duplicated if it is to extend over more than that period. It is felt by IEEE people that by that time the system will have evolved, possibly to an on-line index coupled to storage on CD-ROMs.
    • Inefficient searching. The searching algorithm is not as expert as it could be, and only the INSPEC fields, not the full text, are searchable. Staiger of the IEEE believes that full-text searching would increase the hit rate by only 20%, although it might also greatly increase the efficiency if combined with a smarter search algorithm.
    • Slow, cumbersome access. The problems associated with having to access a whole library of CD-ROMs are obvious. When, eventually, full text is stored, the number of CD-ROMs will be greatly reduced, but then two other problems arise: (1) interactive searching of full text, now possible in principle, is far more difficult if it must go over several CD-ROMs, and (2) since a much greater part of the literature will now reside on one or a few CD-ROMs, the problem of simultaneous users is exacerbated.
    • Full-text storage. The problem of storing full text digitally is difficult. It is less difficult if all the material is produced by one organization, the IEEE, but when the operation is so complex as to produce 200,000 pages a year, enforcing a uniform standard even then is difficult. And if this system is ever to cover the whole electro-technical literature, with 5,000 publishers, digital storage will be impossible unless there is a single widely-accepted standard imposed on publishers (if not on authors), or some machine translation between formatting languages. In this connection, the IEEE has been moving to promote such a standard, like SGML.
    • Impact on other media. The impact on the IEEE on-paper offerings of making IPO universally available can only be conjectured, but the contemplated leasing (rather than selling) is consistent with the philosophy that IPO will be a parallel, additional mode to on-paper, at least for the moment.

Chemistry Online Retrieval Experiment (CORE)
This experiment, in its first stages, involves the ACS, the Online Computer Library Center (OCLC), Bellcore, and Cornell University, each of which plays a distinct role:
  • The ACS is supplying the computer code that generates the text of 20 ACS journals for the past 10 years, text that is now available on-line in the database Chemical Journals Online (CJO) from STN. This code will be updated through the course of the experiment as new issues of these journals are published. The ACS also is providing microfilms of the full pages of the journals as published.
  • Bellcore is producing a tape that will contain two files: the ASCII code of the text, stripped of all formatting instructions and therefore suitable for searching, and a bit mapping of the original pages at 300 dpi, obtained from scanning the microfilm. At this point, no hybrid file is being produced in which the figures are bit-mapped and the text is stored in ASCII, nor does the stored ASCII code contain all the formatting instructions needed to generate the textual part of the journal pages. Bellcore is also providing a rather simple search program, Superbook, to be used in the search studies.
  • OCLC is providing a more comprehensive search facility, X-MEMEX, which supports a large variety of search strategies. OCLC will also have access to the tapes produced by Bellcore so that they can do their own experiments on searching and using.
  • Cornell, through its Mann Library, will install these tapes on an appropriate LAN that will be available for use by university chemists. Cornell psychologists will be able to monitor this usage to study the man-system interface, the way the system is used, and the effectiveness of various search strategies.

APPENDIX F-Some useful numbers

In this report, and in thinking about electronic information problems generally, certain numbers that characterize the literature produced by the APS are constantly coming up. We have collected some of these here.

  • Characters (bytes) on a Physical Review page of solid text: 6.5 Kb [2 columns/page x 61 lines/col x 54 characters (bytes)/line]. Size of 1989 Physical Review, Physical Review Letters, and Reviews of Modern Physics: 50,000 pages.
  • Bytes in all 1989 Physical Review and Physical Review Letters if all text: 325 Mb (50,000 pages x 6.5 Kb/page).
  • Fraction of Physical Review pages devoted to figures: 15% (This is a very crude estimate based on a random sampling of all pages with numbers ending in 26 or 76 in the issues PRB-Oct 1, PRB-Oct 15, and PRA-Oct 15 in 1989. The percentages found were 20, 8, and 15, respectively; the percentage for all three issues was 15).
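The page and yearly text sizes above follow from simple multiplication. The short sketch below reproduces them, assuming one byte per character and decimal units (1 Mb = 1000 Kb); the function and constant names are illustrative only and do not come from the report.

```python
# Page geometry quoted in the list above for a Physical Review page of solid text.
COLUMNS_PER_PAGE = 2
LINES_PER_COLUMN = 61
CHARS_PER_LINE = 54        # assumption: one byte per character
PAGES_PER_YEAR = 50_000    # 1989 page count quoted above


def bytes_per_page() -> int:
    """Bytes on one page of solid text."""
    return COLUMNS_PER_PAGE * LINES_PER_COLUMN * CHARS_PER_LINE


def year_megabytes(page_kb: float = 6.5) -> float:
    """Megabytes in a year of pure text, using the rounded per-page figure."""
    return PAGES_PER_YEAR * page_kb / 1000


if __name__ == "__main__":
    print(bytes_per_page())   # 6588 bytes, quoted in the list as 6.5 Kb
    print(year_megabytes())   # 325.0
```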
Storage requirements for figures:
Resolution:
  • Fax, current ("Group 3" standards): 200 dots per inch (dpi)
  • Fax, future ("Group 4" standards): 400 dpi
  • A typical laser printer: 300 dpi
  • IEEE/IEE/UMI experiment: 300 dpi
  • Very good printing on paper: 600 dpi
  • High-quality printing with good grey scale: 1200 dpi
Compression factor (C) when bit mapping:
  • Average figure for text, graphics, illustrations: 1/10
  • Good figure for line drawings: 1/20
Storage of text versus storage of line drawings in Physical Review:
  • Text (see above): 52 kbit/page
  • Area of text: 65 sq. in./page
  • Bits at 300 dpi, no compression: 5.85 Mbit/page
  • Bits at 300 dpi, compression 1/20: 0.29 Mbit/page
  • Expansion ratio by which storage must be increased when an area of straight text is replaced by a graphic figure to be reproduced with 300 dpi, assuming a compression factor of 1/20: 5.6
  • Expansion factor if resolution is R dpi and compression ratio is C: 5.6 x (R/300)^2 x 20C = 0.00124 x R^2 x C
  • Effective expansion ratio if graphics occupies 15% of area and text occupies the rest assuming 300 dpi and compression of 1/20: 1.8
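The chain of figures in this list can be checked with a few lines of arithmetic. The sketch below is only an illustration under the stated inputs (52 kbit of text per page, 65 sq. in. of text area); the function names are hypothetical. Note that a plain area-weighted average of text and graphics gives roughly 1.7, slightly below the rounded value of 1.8 quoted above.

```python
TEXT_BITS_PER_PAGE = 52_000   # 6.5 Kb/page x 8 bits/byte, as quoted above
TEXT_AREA_SQ_IN = 65.0        # text area per page, as quoted above


def bitmap_bits_per_page(dpi: float, compression: float) -> float:
    """Bits needed to bit-map the full text area at a given resolution.

    `compression` is the ratio used in the text (e.g. 1/20), so smaller is better.
    """
    return TEXT_AREA_SQ_IN * dpi ** 2 * compression


def expansion_ratio(dpi: float, compression: float) -> float:
    """Storage for a bit-mapped area relative to the same area of ASCII text."""
    return bitmap_bits_per_page(dpi, compression) / TEXT_BITS_PER_PAGE


def effective_expansion(graphics_fraction: float, dpi: float, compression: float) -> float:
    """Area-weighted average: text stays at 1, graphics cost expansion_ratio."""
    return (1 - graphics_fraction) + graphics_fraction * expansion_ratio(dpi, compression)


if __name__ == "__main__":
    print(bitmap_bits_per_page(300, 1.0))        # 5850000.0 bits, i.e. 5.85 Mbit/page
    print(expansion_ratio(300, 1 / 20))          # about 5.6
    print(effective_expansion(0.15, 300, 1 / 20))  # about 1.7
```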
Transmitting Physical Review pages and issues electronically:
At 9,600 bits/sec:
  • About 0.2 page/sec (roughly 5 sec/page) if just text, at 52 kbit/page
  • About 0.1 page/sec if ASCII text with 15% graphics (compression ratio 1/20)
At 1.5 Mbit/sec:
  • About 30 pages/sec if just text
  • About 15 pages/sec if ASCII text, 15% graphics, etc.
  • 1 year of APS publications in 1/2 hour, if just text
  • 1 year of APS publications in 1 hour, if ASCII text, 15% graphics, etc.
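A rough check of these transmission estimates can be made from the per-page figures given earlier in this appendix. The sketch assumes 52 kbit per page of text, 50,000 pages per year, and the effective expansion ratio of 1.8 for pages with 15% graphics; protocol overhead is ignored, and the function names are illustrative.

```python
TEXT_BITS_PER_PAGE = 52_000   # bits per page of solid text, from this appendix
PAGES_PER_YEAR = 50_000       # one year of APS publications, from this appendix


def pages_per_second(line_rate_bps: float, expansion: float = 1.0) -> float:
    """Pages sent per second; `expansion` scales the page size for graphics."""
    return line_rate_bps / (TEXT_BITS_PER_PAGE * expansion)


def hours_for_year(line_rate_bps: float, expansion: float = 1.0) -> float:
    """Hours needed to send a full year of pages at the given line rate."""
    seconds = PAGES_PER_YEAR / pages_per_second(line_rate_bps, expansion)
    return seconds / 3600


if __name__ == "__main__":
    for rate in (9_600, 1_500_000):
        print(rate, pages_per_second(rate), pages_per_second(rate, 1.8))
    print(hours_for_year(1_500_000))        # about 0.48 hour, text only
    print(hours_for_year(1_500_000, 1.8))   # about 0.87 hour with 15% graphics
```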
Putting a current year of Physical Review on CD-ROM, without index:
  • Bit-mapping 50,000 entire pages with a compression factor of 1/10 (the factor believed to be used by the IEEE experiment): 3,650 Mb, or about 6 CD-ROMs
  • Text in ASCII code, illustrations bit-mapped with compression 1/20: 585 Mb, or about 1 CD-ROM
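The two disc counts can be reproduced as follows. The CD-ROM capacity of 600 Mb is an assumption chosen to be consistent with the counts quoted in this appendix; it is not stated in the text, and the function names are illustrative.

```python
CD_ROM_MB = 600  # assumption: usable capacity per CD-ROM, in megabytes


def bitmapped_year_mb(pages: int = 50_000, area_sq_in: float = 65.0,
                      dpi: int = 300, compression: float = 1 / 10) -> float:
    """Megabytes needed to bit-map every page of a year at the given settings."""
    bits = pages * area_sq_in * dpi ** 2 * compression
    return bits / 8 / 1e6


def mixed_year_mb(text_mb: float = 325.0, expansion: float = 1.8) -> float:
    """Megabytes for ASCII text plus bit-mapped figures, using the 1.8 ratio."""
    return text_mb * expansion


if __name__ == "__main__":
    full = bitmapped_year_mb()
    mixed = mixed_year_mb()
    print(full, full / CD_ROM_MB)    # about 3656 Mb, about 6 discs
    print(mixed, mixed / CD_ROM_MB)  # about 585 Mb, about 1 disc
```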
The fractions of the periodical literature (serials) in English:
  • APS journals: 1/9
  • AIP journals: 1/9
  • AIP translation journals: 1/9

The size of the world's scientific periodical literature: 150 million pages

This is a very crude estimate by Feinberg (Ref. 6), that could easily be off by a factor of 2 or 3. It is the product of three factors:

  • 9 million abstracts in decade 1977-86: 6 million in Chemical Abstracts, 3 million in Bio Abstracts, 1 million in Physics Abstracts (overlaps are neglected)
  • factor of 3 to include all years before 1977
  • 5 pages per paper (below, we assume big Physical Review pages); Feinberg (Ref. 6) rounds up to 150 million pages
Storing the world's scientific periodical literature:
  • Assuming that all is ASCII-coded text:
    150 million pages= 1 Tb=1 12-in. reel "digital paper" tape=1,600 CD-ROMs
  • Assuming all is bit-mapped at 300 dpi, compression ratio 1/10: 150 million pages=11 Tb=11 12-in. reels digital paper tape=18,000 CD-ROMs
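The page estimate and the two storage figures can be reproduced with the sketch below. It assumes the per-page values used elsewhere in this appendix (6.5 Kb of text, 65 sq. in. bit-mapped at 300 dpi) and a hypothetical CD-ROM capacity of 600 Mb; the function names are illustrative.

```python
CD_ROM_MB = 600  # assumption: usable capacity per CD-ROM, in megabytes


def feinberg_pages(abstracts: float = 9e6, history_factor: float = 3,
                   pages_per_paper: float = 5) -> float:
    """Product of the three factors listed above, before rounding up."""
    return abstracts * history_factor * pages_per_paper


def ascii_terabytes(pages: float = 150e6, page_kb: float = 6.5) -> float:
    """Terabytes if every page is stored as ASCII text."""
    return pages * page_kb * 1e3 / 1e12


def bitmapped_terabytes(pages: float = 150e6, area_sq_in: float = 65.0,
                        dpi: int = 300, compression: float = 1 / 10) -> float:
    """Terabytes if every page is bit-mapped at the given settings."""
    bits = pages * area_sq_in * dpi ** 2 * compression
    return bits / 8 / 1e12


def cd_roms(terabytes: float) -> float:
    """Number of CD-ROMs needed for a given number of terabytes."""
    return terabytes * 1e6 / CD_ROM_MB


if __name__ == "__main__":
    print(feinberg_pages())                                      # 135 million pages
    print(ascii_terabytes(), cd_roms(ascii_terabytes()))         # about 1 Tb, about 1,600 discs
    print(bitmapped_terabytes(), cd_roms(bitmapped_terabytes())) # about 11 Tb, about 18,000 discs
```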
Size of the total physics periodical literature in English: 8 million pages
This estimate is obtained by a quite different but also crude method, based on two assumptions:
  • Last year the total physics periodical literature in English was 9 times that in the Physical Review, or 450,000 pages.
  • The rate of growth is and always has been exponential with a doubling time of ten years. Thus, there will be about 8.2 million pages produced in the next ten years, and the total literature until now should be about the same, 8.2 million pages.
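The second assumption has a convenient consequence that can be checked numerically: if output has always doubled every ten years, the total produced up to now equals the total produced in the next ten years. The sketch below treats the growth as continuous, which is an assumption; the text does not say how the yearly figure was compounded, so the absolute total printed here (about 6.5 million pages) is of the same order as the 8.2 million quoted but not identical to it.

```python
import math

pages_last_year = 450_000   # current annual output, from the first assumption
doubling_time = 10.0        # years, from the second assumption

# Continuous model (an assumption):
#   rate(t) = pages_last_year * 2 ** (t / doubling_time), with t = 0 now.
k = math.log(2) / doubling_time

# Integral of rate(t) from minus infinity to 0: everything published so far.
total_to_date = pages_last_year / k

# Integral of rate(t) from 0 to doubling_time: the next ten years.
next_decade = pages_last_year * (math.exp(k * doubling_time) - 1) / k

print(f"to date:     {total_to_date / 1e6:.1f} million pages")
print(f"next decade: {next_decade / 1e6:.1f} million pages")
```

The two totals agree for any doubling time, because the output over one doubling period always equals everything that came before it.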
Storing the world's English-language physics periodical literature:
Roughly 1/20 the figures for the scientific periodical literature given above.

APPENDIX G-APS Operations Today

This appendix describes the operations at the APS Editorial Offices in Ridge, NY. Editorial operations, in contrast to production operations, at Ridge are based on paper records of manuscripts and correspondence. The paper file is the authoritative source and record of communications, history, and status of the editorial handling of a manuscript. There is considerable electronic assistance given to record keeping, management of the editorial process, and communications.

For record keeping and management there are two databases, one for the administrative record and status of manuscripts (data for over 100,000 manuscripts have been accumulated over the last decade and a half) and one for referees (more than 15,000).

Communications are now assisted by e-mail, fax, and telex. Reports from referees and status inquiries from authors are now routinely received via e-mail, fax, and telex. The traffic is growing. There were 7,550 incoming e-mail communications in 1988 and 13,600 in 1989 (an average of one per manuscript submitted). The sum of fax and telex is about half that of e-mail and growing at the same rate. E-mail is now used for about 30% of referee reports. Outgoing correspondence is generally limited to reminders to referees and status reports to authors.

A capability for receiving manuscripts submitted electronically now exists as part of an experimental enterprise which is restricted perforce to a few manuscripts per day. Most of these submissions are requests by the authors to use their keystrokes for production in the compuscript program of the APS Liaison Office at Woodbury, NY.

Some reflect simply a desire to take advantage of the conveniences and strengths of e-mail communications. Such authors are automatically informed of the compuscript program and invited to participate.

There are two requirements for a successful electronic submission: The file must be readable at Ridge in a way that allows suitable hard copies to be made for referral (in paper) to reviewers, and figures must be received promptly (by fax or overnight mail) at the Editorial Office.

To qualify for the compuscript program, an electronic submission must be prepared under REVTEX. When the paper is accepted for publication, the author's keystrokes are transmitted to the Woodbury office by the Ridge office (or transmitted directly to Woodbury by the author).

Production at Ridge (all of PRL, 6,000 pages per year, and 4,000 pages' worth of Rapid Communications for PR) is based on keyboarding copy-edited manuscripts into a VAX/750 with UNIX. Camera-ready copy is produced and sent to the printer (PRL to Canterbury Press in Rome, NY, and the Rapid Communication pages to Woodbury for collation with the rest of PR).

All operations at Ridge are on paper. All incoming material (manuscripts, referee reports, communications from authors) must be converted to paper for the office to operate. Outgoing communications from the editor to referee or author must be duplicated by a paper copy in the paper file for the manuscript. When a manuscript is submitted electronically it is converted to paper. When it is accepted it is keyed (unless part of the compuscript program) as input to the computer-assisted typesetter. Finally, it is converted back to paper for distribution to the reader in a paper journal.

At least four barriers to greater use of electronic media exist for operations at Ridge.

  1. Figures are not routinely a part of electronic manuscripts.
  2. A significant number of the people with whom the APS journals communicate lack either electronic capabilities or an interest in using them. Even if all incoming communications were electronic, or converted to electronic at Ridge, referral to many referees would have to be converted to paper copy.
  3. The hardware and software capabilities at Ridge could not support paperless operations. Even less-paper operations would require hardware and software upgrades.
  4. There is a general reluctance to work entirely with electronic media among people who find paper superior.

Meanwhile, the journals are becoming more international than ever. In 1989, for the first time, more than half (52%) of the submissions to Ridge were non-U.S. in origin. In the last three years all the growth in annual submissions was non-U.S. in origin. From October 1989 through February 1990, 40% of electronic submissions were from the U.S.; 60% were non-U.S. in origin. (Electronic submissions during this period totaled 95.) The leading countries in submissions (all forms) in 1989 were West Germany (6.9%), Japan (6.3%), Canada (4.6%), France (4.0%), India (3.6%), and China (3.6%). For comparison, the U.S. leaders were California (7.4%), New York (6.2%), and New Jersey (3.7%). (Percentages refer to the total submissions of 13.5K manuscripts.)

APPENDIX H-Job Description

The Task Force recommends that the APS hire an individual to track developments in electronic information technology. The following is a suggested job description for this individual.

  1. Monitor experiments with APS journals on tape to be conducted at Xerox and other places; organize follow-up workshop.
  2. Gather information on the state of the art; report to APS Publication managers on:
    • Hardware and software developments.
    • Other databases, including delivery, charging, and publisher-payment systems.
    • Current and projected usage by, and needs of, the physics community.
  3. Maintain and enhance contact with other scientific societies, monitoring their electronic journal programs, especially the IEEE-IEE-UMI experiment with CD-ROMs, the ACS-Bellcore-Cornell research with the text of ACS journals, and the AMS marketing of Math. Reviews on CD-ROMs and tapes.
  4. Work with the AIP and other partners of the APS now involved in SPIN, FIZ, and STN on current and possible future electronic products and their distribution.
  5. Propose policy alternatives to APS management on electronic journals and their distribution, including financial and marketing information.
  6. Execute policy decisions.