Speaker: Mick Draper (CERN)

Panel Discussion: Current Thinking Panel

We decided we wanted to try to scan all the preprints arriving, about 12,000 preprints a year. So we started a project to scan these and put them on-line. Clearly we were asking the wrong question; rather than asking how we could make our archive available, we should have asked what we could do to improve the process of getting these things on-line. This was clearly an oversight at the time, and since it has now been addressed by Paul in the E print archives, I think the scanning aspect is something we shouldn't really talk about today. I think it's a dying art; it will disappear. We are still forced to scan 40%. We get the other 60% from the bulletin boards or internally from inside CERN.

The other reason I'm here today is to listen to what the people in the U.S. plan to do in this area. Just coincidentally, I have a meeting in two weeks' time with Elsevier, who want to come and discuss with us how they can collaborate with us on electronic journals. So I'm very interested to hear what the feeling is of the people around here. We're also looking to do an accelerator electronic journal, because this is a very expensive journal and we have a lot of authors and editors at CERN.

We started a preprint server this year where we store certain papers which are not available on the bulletin boards, and we also have links to the E-prints on Paul's servers. We do the same sort of thing as Pat described earlier. We try to link things into a library system, and we try to make the services to the community as good as possible. One of the other reasons I'm here in this part of the world is to discuss collaboration with SLAC and DESY, so that we don't do the same things over and over again in different places, but do them once and once only.

I have no real goal for this talk; it's just a lot of random thoughts. One of the things mentioned earlier, I think by Annette, was the Glasgow conference. This was the major high energy physics conference, which happens every two years (the so-called Rochester series). We offered our services, in a sense, to set up a server for these guys. So we actually went there, and we got in touch beforehand and set up the mechanisms so that the abstracts were submitted in the correct way and the papers were sent in TeX, and they all ended up on that server: an E print-like server which we made available on day one of the conference to the countries' delegates. Just as a little aside, we then saw the publisher's worry, because one of the main sources of income of a conference is photocopies of talks. Now, if we can put them on line on the Web and they're available to everyone free, why pay money and carry home kilograms of paper with you?

As for some thoughts for discussion, one of the things someone just noted down, and which I hope will be addressed today, is how we solve the problem of the non-total coverage of papers on the E print servers. There are lots of papers which do not get to the E print servers today in high energy physics. I'm not even talking about the other disciplines. There are lots of accelerator papers which do not get onto the E print servers. There are lots of experimental papers which don't get onto the E print servers. At CERN, we have no central control over who puts the preprints where. We can't tell people to send things to hep-ex. People will do it if they want to, or they won't if they don't. The papers all come through us anyway, because my group runs the publishing service of CERN. But I don't feel that I can take someone's paper and send it to Paul's server. I'm not the author.

Andy just said you can't force them to send it to a journal either.

Yes. They are not actually producing preprints, but they can't be bothered sending for further reason to the bulletin boards. Perhaps on the same topic is why the experimentalists are so unwilling to do this. It's been touched on before. I also feel that nowadays we have certainly four very large collaborations of 400 physicists who do all the peer reviewing in house, inside the collaboration. Now, that's not necessarily a good idea. It's a little bit incestuous, perhaps, that these guys are all doing it inside. And it's going to get worse with the LHC, if the LHC ever gets approved. There will be two large experiments with 1,000 physicists each, reviewing their own papers and then submitting them to wherever. So I think there are two very different communities, as someone said before: the one-man-and-a-boy theorist approach and the huge experimental collaboration approach. It's up to us at CERN, I think, to try to encourage the collaborations to submit their papers to hep-ex. I think we will try to do that. I'm not sure how much success we'll have.

One of the other areas which was touched on in the E-print mailing lists before the conference was the importance or non-importance of the computing platforms. I think we shouldn't underestimate this. When we put up a preprint server, the biggest source of complaints we received was people who couldn't view the PostScript files. They could see the abstracts and the authors, etc., but when they used Mosaic on their Macintosh they just got all this gibberish thrown at them. So I think it's something which we have to address if we're talking about E-journals. In the end one has to have a system where people can actually see them.

The other thing I would say is that at CERN, apart from just preprints, we also publish official reports. We publish eight to ten CERN reports a year. Do you want to see a 500 page PostScript file on your Macintosh? It's not obvious to me that you want to see it in one chunk, or blob, just being thrown at you. So we have to think of ways to split these things up into the bits which people actually want to see. Among the major CERN reports are those of the CERN schools of physics or accelerators or computing. These are very, very large documents, and people probably only want to look at section 1.2, which talks about the thing they want to see. So we have to find ways of automating the process of splitting these things into pieces. We actually have something on line which we could demonstrate. It's just a very boring thing. We just took a table of contents, ran it through a program, and made links to the different chapters. It's not terribly modern, but it does seem to work.
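The table-of-contents approach described above can be sketched in a few lines. This is only an illustration, not the program used at CERN: the plain-text line format ("section number, title, page"), the function names `parse_toc` and `toc_to_html`, the `section-{section}.ps` file naming, and the sample entries are all assumptions made for the example.

```python
import html
import re

# Assumed TOC line format: "<section number> <title> <page>", e.g. "1.2 Beam dynamics 17".
TOC_LINE = re.compile(r"^(\d+(?:\.\d+)*)\s+(.+?)\s+(\d+)$")


def parse_toc(text):
    """Return a list of (section, title, page) tuples from plain-text TOC lines."""
    entries = []
    for line in text.splitlines():
        match = TOC_LINE.match(line.strip())
        if match:  # lines that do not look like TOC entries are skipped
            section, title, page = match.groups()
            entries.append((section, title, int(page)))
    return entries


def toc_to_html(entries, url_pattern="section-{section}.ps"):
    """Render TOC entries as an HTML list with one link per section.

    url_pattern is a hypothetical naming scheme for the per-section files.
    """
    items = []
    for section, title, page in entries:
        href = html.escape(url_pattern.format(section=section))
        label = html.escape(f"{section} {title}")
        items.append(f'<li><a href="{href}">{label}</a> (p. {page})</li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Made-up sample entries, purely for illustration.
SAMPLE_TOC = """\
1 Introduction 1
1.2 Beam dynamics 17
2 Detectors & readout 45
"""

if __name__ == "__main__":
    print(toc_to_html(parse_toc(SAMPLE_TOC)))
```

A reader who only wants section 1.2 then follows one link instead of fetching the whole report. Producing the per-section files themselves is a separate step that this sketch does not cover.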

The other thing, which I think Pat mentioned, is relatively important to us. I also support a library. Standards could be helpful to us in getting the data from Paul's E print servers into our library systems. We probably all do it; there's probably people sitting there [INAUDIBLE]. We all write programs to take your stuff and manipulate it, rather than having it pre-formatted for us in the first place. And I think that perhaps touches on the point someone made yesterday at dinner, which is how we can convince you to do these things.

The future for us at CERN is to go the way SLAC have gone. They've been way ahead of the game by having SPIRES connected intimately with the Web. We've been, in a sense, lagging behind. We have the Web, but we don't have a library system connected to the Web that easily. We don't have SPIRES at CERN, for people who don't know. We made a decision, not me, some time ago to go with a commercial product called ALEPH from an Israeli company. So we have an integrated library system which is not SPIRES, and we have problems exchanging data between SLAC and CERN and DESY, but anyway we have it, it's there, and it works. We're going to make it available on the Web at the end of this year. But there is one of the other neat things, and perhaps something which I don't think anyone's mentioned so far. One talks about electronic journals and putting Phys Rev Letters on line, or whatever. What we would like to do is to give everyone their own journal; you just have a personalized journal. Why have Phys Rev Letters? Maybe you're interested in two thoughts of that and one thought of this. So you can have your own journal. One of the things our library system does offer is the possibility to be informed about, or sent details of, anything relating to "XYZ". And then if you have all the stuff available in PostScript form, or electronic form, you have your own electronic journal. I think that's quite an interesting way to go.
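The personalized journal idea amounts to filtering incoming records against each reader's stated interests. A minimal sketch of that filtering follows, assuming records carry a set of keywords. The `Record` fields, the `personal_journal` function, and the sample records are hypothetical and do not reflect how ALEPH stores or matches data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A bibliographic record; the fields are illustrative, not ALEPH's schema."""
    title: str
    keywords: frozenset
    url: str


def personal_journal(records, interests):
    """Return the records sharing at least one keyword with the reader's interests.

    Matching is case-insensitive and on whole keywords only.
    """
    wanted = {k.lower() for k in interests}
    return [r for r in records if wanted & {k.lower() for k in r.keywords}]


# Made-up placeholder records, purely for illustration.
RECORDS = [
    Record("Record A", frozenset({"accelerators", "magnets"}), "a.ps"),
    Record("Record B", frozenset({"lattice QCD"}), "b.ps"),
    Record("Record C", frozenset({"detectors", "Accelerators"}), "c.ps"),
]

if __name__ == "__main__":
    for record in personal_journal(RECORDS, ["accelerators"]):
        print(record.title, record.url)
```

Each reader's list of interests plays the role of the "XYZ" profile, and the matching records, linked to their electronic form, make up that reader's own journal.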