

3 The current system

Currently the ADS system consists of four semi-autonomous (to the user) abstract services covering Astronomy, Instrumentation, Physics, and Astronomy Preprints. Combined, there are nearly 1.5 million abstracts and bibliographic references in the system. The Astronomy Service is by far the most advanced, and accounts for $\sim 85$% of all ADS use; it should be noted, however, that the Instrumentation Service contains more abstracts than Astronomy, and a subset of that service is used by the Society of Photo-Optical Instrumentation Engineers as the basis of the official SPIE publications web site.

Everything that follows refers only to the Astronomy Service.

3.1 Data

Here is a brief overview of the data in the ADS system; a complete description is in DATA.

3.1.1 Abstracts

The ADS began with the abstracts from the NASA STI database; in printed form these abstracts were the union of the International Aerospace Abstracts and the NASA Scientific and Technical Aerospace Reports (NASA STAR). While the STI branch has had to substantially cut back on its abstracting of the journal literature, we still get abstracts of NASA reports and other materials from them.

We now receive basic bibliographic information (title, author, page number) from essentially every journal of astronomy. Most also send us abstracts; some cannot send abstracts but allow us to scan their journals, and we build abstracts from the scans through optical character recognition. Finally, we receive some abstracts from the editors of conference proceedings and from individual authors.

There are $\sim$500 000 different astronomy articles indexed in the ADS; the database is nearly complete for articles in the major journals beginning in 1975.

3.1.2 Bitmaps

The ADS has obtained permission to scan, and make freely available on-line, page images of the back issues of nearly all of the major journals of astronomy. In most cases the bitmaps of current articles are put on-line after a waiting period, to protect the financial integrity of the journal. DATA describes the current status of these efforts.

We plan to provide for each collaborating journal, in perpetuity, a database of page images (bitmaps) from volume 1 page 1 to the first issue which the journal considers to be fully on-line as published. This will provide (along with the indexing and the more recent archives held by the journals) a complete electronic digital library of the major literature in astronomy.

In the longer term we plan to scan old observatory reports and defunct journals, so as eventually to have a full historical collection on-line. This work is beginning with a collaboration with the Harvard Preservation Project (Eichhorn et al. 1997; Corbin & Coletti 1995).

3.1.3 Links

ADS responds to a query with a list of references and a set of hyperlinks showing what data is available for each reference. There are $\sim$1.73 million hyperlinks in the ADS, of which $\sim$31% are to sources external to the ADS project.

The largest number of external links are to SIMBAD, NED, and the electronic journals. A rapidly growing number, although still small in comparison to the others, are to data tables created by the journals and maintained by the CDS and the ADC at Goddard. SEARCH describes the system of hyperlinks in detail.

3.1.4 Citations and references

The use of citation histories is a well known and effective tool for academic research (Garfield 1979); their inclusion in the ADS has been planned since the conception of the service. In 1996 the AAS purchased a subset of the Science Citation Index from the Institute for Scientific Information, to be used in the ADS; this was updated in 1998. This subset only contains references which were already in the ADS, thus it is seriously incomplete in referring to articles in the non-astronomical literature. This citation information currently spans January 1982-September 1998.

The electronic journals all have machine-readable, web-accessible reference pages. The ADS points to these with a hyperlink where possible. Several publishers allow us to use these to maintain citation histories; we do this using our reference resolver software (see ARCHITECTURE). The same software is also used by some publishers to check the validity of their references before publication.

Additionally we use optical character recognition to create reference and citation lists for the historical literature, after it is scanned (Demleitner et al. 1999).
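
As a minimal illustration of how reference lists yield citation histories, the sketch below inverts a mapping from each paper to the papers it references into a mapping from each paper to the papers that cite it. The identifiers and the function name `build_citations` are hypothetical, and the sketch assumes that references have already been resolved to identifiers; it does not represent the ADS reference resolver.

```python
from collections import defaultdict


def build_citations(references):
    """Invert {paper: [papers it references]} into {paper: [papers citing it]}."""
    citations = defaultdict(list)
    for citing, cited_list in references.items():
        for cited in cited_list:
            # Record each citing paper at most once per cited paper.
            if citing not in citations[cited]:
                citations[cited].append(citing)
    # Sort the citing papers so the output is deterministic.
    return {cited: sorted(citing) for cited, citing in citations.items()}


# Hypothetical identifiers, for illustration only.
references = {
    "paperC": ["paperA", "paperB"],
    "paperD": ["paperA"],
}
citations = build_citations(references)
```

Here `citations["paperA"]` lists both `paperC` and `paperD`. A paper that is cited but absent from the input mapping still receives an entry, whereas a reference that was never resolved to an identifier cannot be counted at all.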

3.1.5 Collaboration with CDS/SIMBAD

The Strasbourg Data Center (CDS) has long maintained several of the most important data services for astronomy (e.g. Jung 1971; Jung et al. 1973; Genova et al. 2000); access to parts of the CDS data via ADS is a key feature of the ADS.

ADS users are able to make joint queries of the ADS bibliographic database and the CDS/SIMBAD bibliographic database. When SIMBAD contains information on an object which is referred to in a paper whose reference is returned by ADS, then ADS also returns a pointer to the SIMBAD data. When a paper has a data table which is kept on-line at the CDS, the ADS returns a pointer to it. The CDS-ADS collaboration is at the heart of Urania (Sect. 4). More recently ADS has entered into a collaboration with the NASA/IPAC Extragalactic Database (NED; Helou & Madore 1988; Madore et al. 1992) which is similar to the SIMBAD portion of the CDS-ADS collaboration.

3.2 Search engine

The basic design assumption behind the search engine, and other user interfaces, is that the user is an expert astronomer. This differs from the majority of information retrieval systems, which assume that the user is a librarian. The default behavior of the system is to return more relevant information, rather than just the most relevant information, assuming that the user can easily separate the wheat from the chaff. In the language of information retrieval this is favoring recall over precision. SEARCH describes the user interface in detail.
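
The trade-off between recall and precision can be made concrete with a small sketch. It uses the standard definitions (precision is the fraction of returned items that are relevant; recall is the fraction of relevant items that are returned). The document labels and the two result lists are invented for illustration and do not come from the ADS.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a result list against a relevant set."""
    returned = set(returned)
    relevant = set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four relevant documents exist.
relevant = {"a", "b", "c", "d"}
narrow = ["a", "b"]                                # only the best matches
broad = ["a", "b", "c", "d", "w", "x", "y", "z"]   # everything plausibly related

narrow_scores = precision_recall(narrow, relevant)
broad_scores = precision_recall(broad, relevant)
```

The narrow list returns nothing irrelevant but misses half of the relevant documents; the broad list finds all of them at the cost of extra entries that an expert user is assumed to be able to discard quickly.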

3.3 Hardware and software architecture

The goals of our hardware and software systems are speed of information delivery to the user, and ease of maintenance for the staff. We thus pre-compute many things during our long indexing process for later use by the search engine; we have highly optimized all code which is run by user processes; and we have developed a worldwide network of mirror sites to speed up internet access. ARCHITECTURE describes these systems.
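
The general idea of trading indexing time for query speed can be sketched with a toy inverted index: the expensive pass over all texts happens once, and each query then reduces to a few dictionary lookups. This is an illustrative assumption about the technique in general, not a description of the actual ADS index files; the function names and sample texts are hypothetical.

```python
from collections import defaultdict


def build_index(abstracts):
    """Pre-compute a word -> set of document ids mapping (done once, offline)."""
    index = defaultdict(set)
    for doc_id, text in abstracts.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing any query word, using only lookups."""
    results = set()
    for word in query.lower().split():
        results |= index.get(word, set())
    return sorted(results)


# Hypothetical sample texts, for illustration only.
abstracts = {
    1: "Redshift survey of galaxies",
    2: "Stellar spectra",
    3: "Galaxies and stellar populations",
}
index = build_index(abstracts)
matches = search(index, "stellar galaxies")
```

Combining the per-word results with a union rather than an intersection matches the preference for recall described in Sect. 3.2: a document matching any query word is returned.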

3.4 Data ingest

The basic rule for what books and periodicals the ADS covers is: if it is in the Center for Astrophysics library it should be in the ADS. This is a goal we are still some way from realizing. We have recently adopted a second rule for inclusion: if it is referenced by an article in a major scholarly journal of astronomy it should be in the ADS. DATA describes the ADS coverage and ingest procedures.


Copyright The European Southern Observatory (ESO)