

5 Capabilities, usage patterns, and statistics

5.1 Examples

The ADS answers about 5 000 000 queries per year, covering a wide range of possible query types, from the simplest (and most popular), "give me all the papers written by (some author),'' to complex combinations of subject matter described in natural language and bibliometric information. Each query is essentially the sum of simultaneous queries (e.g. an author query and a title query), where the evidence is combined to give a final relevance ranking (e.g. Belkin et al. 1995).

The ADS once supported index term (keyword) queries, but does not currently. This is due to the incompatibility of the old STI (NASA-STI 1988) keyword system with the keywords assigned by the journals (Abt 1990; A&A 1992; MNRAS 1992). Work is underway to build a transformation between the two systems (Lee et al. 1999; Lee & Dubin 1999).

Here we show four examples of simple, but sophisticated queries, to give an indication of what is possible using the system. A detailed description of available query options is in SEARCH. We encourage the reader to perform these queries now, to see how the passage of time has changed the results.

Figure 1 shows how to make the query "what papers are about the metallicity of M 87 globular clusters?'' This was the first joint query made after the SIMBAD-ADS connection was completed in 1993.

\includegraphics[width=10cm]{DS1780F1.eps}\end{figure} Figure 1: A query to the ADS Abstract Service requesting a listing of papers on the metallicity of M 87 globular clusters. SIMBAD, NED, the ADS phrase index, the ADS word index and the ADS synonym list are all queried, the results are combined and the list shown in Fig. 2 is returned

There are 1 765 papers on M 87 in SIMBAD, NED, or both; there are 6 425 papers which contain the phrase "globular cluster'' in ADS, and there are 25 599 papers in ADS containing "metallicity'' or a synonym (abundance is an example of a synonym for metallicity). The result, which comes in a couple of seconds, is a list of just those 58 papers desired.

Five different indices are mixed in this query: the SIMBAD object--bibcode index query on M 87 is logically OR'd with the NED object--refcode index query for M 87. The ADS phrase index query for "globular cluster'' is (following the user's request) logically AND'd with the ADS word index query on metallicity, where metallicity is replaced by its group of synonyms from the ADS astronomy synonym list (this replacement is under user control). If the user requires a perfect match, then the combination of these simultaneous queries yields the list of 58 papers shown in Fig. 2. Before the establishment of the Urania core, queries like this were nearly impossible.
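
The boolean combination described above can be illustrated with a minimal sketch. It is not the ADS implementation: the index contents, the single-letter bibcodes, and the function names `lookup` and `combined_query` are all hypothetical, and we assume that a "perfect match" means a paper must satisfy the object, phrase, and word conditions simultaneously.

```python
# Hypothetical toy indices: each maps a query term to a set of bibcodes.
simbad = {"M 87": {"A", "B", "C"}}
ned = {"M 87": {"C", "D"}}
phrase_index = {"globular cluster": {"B", "C", "D", "E"}}
word_index = {"metallicity": {"B", "F"}, "abundance": {"D", "G"}}
# Hypothetical synonym list: a word maps to its group of synonyms.
synonyms = {"metallicity": ["metallicity", "abundance"]}


def lookup(index, term):
    """Return the set of papers indexed under term (empty if absent)."""
    return index.get(term, set())


def combined_query(obj, phrase, word, use_synonyms=True):
    # Object query: SIMBAD OR NED.
    object_hits = lookup(simbad, obj) | lookup(ned, obj)
    # Phrase query against the phrase index.
    phrase_hits = lookup(phrase_index, phrase)
    # Word query, optionally expanded to the word's synonym group.
    terms = synonyms.get(word, [word]) if use_synonyms else [word]
    word_hits = set().union(*(lookup(word_index, t) for t in terms))
    # Assumed "perfect match": every condition must hold (logical AND).
    return object_hits & phrase_hits & word_hits


result = combined_query("M 87", "globular cluster", "metallicity")
```

With these toy indices, switching off the synonym expansion shrinks the result, which mirrors the user control over synonym replacement mentioned in the text.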

\includegraphics[width=10cm]{DS1780F2.eps}\end{figure} Figure 2: The top of the list ADS returns when the query shown in Fig. 1 is made

Another simple, but very powerful method for making ADS queries is to use the "Find Similar Abstracts'' feature. Essentially this is an extension of the ability to make natural language queries, whereby the user can choose one or more abstracts to become the natural language query. This can be especially useful when one wants to read in depth on a subject, but only knows one or two authors or papers in the field. This is a typical situation for many researchers, but especially for students.

As an example, suppose one is interested in Ben Bromley's (1994) Ph.D. thesis work. Making an author query on "Bromley'' gets a list of his papers, including his thesis. Next one calls up the abstract of the thesis, goes to the bottom of the page, where the "Find Similar Abstracts'' feature is found, and clicks the "Send'' button. Figure 3 shows the top of the list returned as a result. These are papers listed in order of similarity to Bromley's (1994) thesis; note that the thesis itself is on top, as it matches itself perfectly. This list is a custom bibliography, selected by detailed subject matter.

\includegraphics[width=10cm]{DS1780F3.eps}\end{figure} Figure 3: The top of the list of papers returned when Ben Bromley's (1994) thesis is used as the query

As a third example of ADS use, Fig. 4 shows an intermediate step from the previous example (obtained by clicking on the "Return Query Form'' button, replacing the default "Return Query Results,'' in the "Find Similar Abstracts'' query). Here we make one change from the default setting: we change "Items returned'' from the default "Abstracts'' to "References.'' The result, shown in Fig. 5, lists all the papers which are referenced in the 50 papers most like Bromley's (1994) thesis, sorted by the number of times they appear in the 50 reference lists. Thus one paper appears in 21 reference lists out of 50, another appears in 11 lists out of 50, etc. By this means one has a list of the most cited papers within a very narrowly defined subfield specific to one's personal interest. We are not aware of any other system which currently allows this capability.
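
The ranking by number of appearances in the reference lists can be sketched as follows. This is an illustration under stated assumptions rather than the ADS code: the paper identifiers, the `reference_lists` mapping, and the function name `most_referenced` are hypothetical, and we assume each reference is counted at most once per citing paper.

```python
from collections import Counter

# Hypothetical data: each of the most similar papers maps to its reference list.
reference_lists = {
    "paper1": ["refA", "refB"],
    "paper2": ["refA", "refC"],
    "paper3": ["refA", "refB", "refB"],  # duplicate entry counts only once
}


def most_referenced(similar_papers, reference_lists):
    """Rank references by the number of reference lists they appear in."""
    counts = Counter()
    for paper in similar_papers:
        # set() ensures one count per reference list, as assumed above.
        counts.update(set(reference_lists.get(paper, [])))
    return counts.most_common()


ranked_refs = most_referenced(["paper1", "paper2", "paper3"], reference_lists)
```

Here `ranked_refs` pairs each reference with the number of lists that contain it, most frequent first, which is the same shape of result as the list in Fig. 5.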

\includegraphics[width=10cm]{DS1780F4.eps}\end{figure} Figure 4: A query which returns the papers most cited by the 50 papers most like Ben Bromley's (1994) thesis

\includegraphics[width=10cm]{DS1780F5.eps}\end{figure} Figure 5: The top of the list of papers returned by the query in Fig. 4; these are the most cited papers in a user defined very narrow subfield

Finally we show a somewhat more complex query in Fig. 6. Here we modify the basic query (Bromley's 1994 thesis) by requiring that the papers contain the word "void.'' We do this by changing the logic on the text query to "simple logic'' and adding "+void'' to the query. The papers returned by this query would be very similar to those shown in Fig. 3, but with all papers which do not contain the word "void'' removed. In addition we change "Items returned'' to "Citations,'' and increase the number of papers for which citations are retrieved to the 150 closest matches to the query. The result, shown in Fig. 7, is a list of the papers which most cite the 150 papers most like Bromley's (1994) thesis, modified by the requirement that they contain the word "void.'' Thus one paper cited 26 papers out of the 150, another cited 19, etc. These are the papers with the most extensive discussions of a user defined very narrow subfield. This feature also is unique to the ADS.
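
The two steps of this query (keep only the similar papers that contain a required word, then count which papers cite the most of them) can be sketched as below. The data, the `cited_by` mapping, and the function names are hypothetical, and the required-word test is a plain whole-word match, which is only an assumption about how "simple logic'' treats a "+'' term.

```python
from collections import Counter

# Hypothetical similarity-ranked papers (most similar first) with their text.
ranked = [
    ("p1", "a void in the galaxy distribution"),
    ("p2", "galaxy clustering statistics"),
    ("p3", "the void probability function"),
]
# Hypothetical citation index: paper -> papers that cite it.
cited_by = {"p1": ["c1", "c2"], "p2": ["c3"], "p3": ["c1"]}


def require_word(ranked, word):
    """Keep the papers whose text contains word, preserving rank order."""
    return [pid for pid, text in ranked if word in text.lower().split()]


def top_citing(ranked, word, limit, cited_by):
    """Rank citing papers by how many of the selected papers they cite."""
    selected = require_word(ranked, word)[:limit]
    counts = Counter()
    for pid in selected:
        counts.update(set(cited_by.get(pid, [])))
    return counts.most_common()


citing = top_citing(ranked, "void", 150, cited_by)
```

In this toy case the paper `c1` cites both of the selected papers and so heads the list, analogous to the paper that cited 26 of the 150 in Fig. 7.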

\includegraphics[width=10cm]{DS1780F6.eps}\end{figure} Figure 6: A query which returns the papers which most cite the 150 papers most like Ben Bromley's (1994) thesis, as modified by the requirement that they contain the word "void''

\includegraphics[width=10cm]{DS1780F7.eps}\end{figure} Figure 7: The top of the list of papers returned by the query in Fig. 6; these are the papers with the most extensive discussions of a user defined very narrow subfield

5.2 Use of the system

In September 1998 ADS users made 440 000 queries, and received 8 000 000 bibliographic references, 75 000 full-text articles, and 275 000 abstracts (130 000 were individually selected; the rest were obtained through a bulk retrieval process, which typically retrieves between one and fifty abstracts), as well as citation histories, links to data, and links to other data centers. Of the 75 000 full-text articles accessed through the ADS in September 1998, already 33% were via pointers to the electronic journals. This number increased to 52% in March 1999.

ADS users access and print (either to the screen, or to paper) more actual pages than are printed in the press runs of all but the very largest journals of astronomy. In September 1998, 472 621 page images were downloaded from the ADS archive of scanned bitmaps. About 75% of these were sent directly to a printer, 22% were viewed on the computer screen, and 2% were downloaded into files; FAXing and viewing thumbnail images make up the rest. If the electronic journals provide "pages'' of information at the same rate as the ADS archive, per article accessed (slightly more than 10 pages/article accessed), then more than 750 000 "pages'' were "printed,'' on demand, in September 1998 by ADS users. This is about three times the number of physical pages published in September 1998 by the PASP.

Viewed as an electronic library, the ADS, five years after its inception, provides bibliographic information and services similar to those provided by all the astronomy libraries in the world combined. The Center for Astrophysics Library, an amalgamation of the libraries of the Harvard College Observatory and the Smithsonian Astrophysical Observatory, is one of the largest, most complete, and best managed astronomy libraries in the world. For several years the CfA Library has been keeping records of the number of volumes reshelved, as a proxy for the number of papers read (library users are requested not to reshelve anything themselves). This number has remained steady in recent years, and was 1117 in September 1998 (D.J. Coletti & E.M. Bashinka 1998, personal communication). If the CfA represents 2--3% of the use of astronomy libraries worldwide (the CfA has slightly more than 350 PhDs, the AAS has about 6800 members, and the IAU about 8500; CfA users made 2.4% of ADS queries in September 1998; and 5.7% of articles in the ADS Astronomy database with 1998 publication dates had at least one CfA author), and if other astronomers use their libraries at the same rate as astronomers at the CfA, then worldwide there would have been 37 000--56 000 volumes reshelved in September 1998. In September 1998 ADS provided access to 75 000 full text articles and 130 000 individually selected abstracts, as well as substantial other information; current use of ADS is clearly similar to the sum of all current traditional astronomy library use.
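
The worldwide estimate follows from dividing the CfA count by the assumed CfA share of library use. Writing $N_{\rm CfA}$ for the CfA reshelving count and $f$ for that share (symbols introduced here only for illustration), the two ends of the quoted range are:

\[
N_{\rm world} \approx \frac{N_{\rm CfA}}{f}, \qquad
\frac{1117}{0.03} \approx 37\,000, \qquad
\frac{1117}{0.02} \approx 56\,000 .
\]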

ADS use continues to increase. Figure 8 shows the number of queries made each month to the ADS Abstract Service from April 1993 to September 1998. The dotted straight line shows a yearly doubling, which represents the five year history reasonably well. Since 1996 use has been increasing at a 17 month doubling rate, shown by the dashed line in the figure.

\resizebox{\hsize}{!}{\includegraphics{DS1780F8.eps}}\end{figure} Figure 8: The number of queries made each month to the ADS Abstract Service. The dotted line represents a yearly doubling, while the dashed line represents a doubling period of 17 months, a reasonable match to the recent data

It is difficult to determine the exact number of ADS users. We track usage by the number of unique "cookies'' which access ADS, and by the number of unique IP addresses. There are difficulties with each technique. In addition many non-astronomers find ADS through portal sites like Yahoo, which skews the statistics. In September 1998, 10 000 unique cookies accessed the full-text articles, 17 000 made queries, and 30 000 visited the site. 91% of full-text users had cookies, but only 65% of site visitors.

Figure 9 shows the number of unique users who made a query using the ADS each month from April 1993 to September 1998. Before early 1994 users had user names and passwords in the old, proprietary system, and could be counted exactly; after the ADS became available on the WWW users were defined as unique IP addresses. Note the enormous effect the WWW had on ADS use, a factor of four in the first five weeks. The straight dashed line represents the 17 month doubling period seen recently in the number of queries; the dotted line, which better represents the recent growth, is for a 22 month doubling period. The difference between the two is due to a one third increase in the mean number of queries per month per user (from 19 to 25) since 1996.
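
As a rough consistency check on these two doubling periods (a sketch, assuming pure exponential growth with $t$ measured in months), the mean number of queries per user grows as the ratio of the two exponentials:

\[
\frac{2^{t/17}}{2^{t/22}} = 2^{\,t\left(\frac{1}{17}-\frac{1}{22}\right)} = 2^{\,5t/374} .
\]

Setting this ratio equal to $25/19 \approx 1.32$ gives

\[
t = \frac{374}{5}\,\log_2\!\left(\frac{25}{19}\right) \approx 30~\mbox{months},
\]

or about two and a half years, which is compatible with the interval from 1996 to September 1998.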

\resizebox{\hsize}{!}{\includegraphics{DS1780F9.eps}}\end{figure} Figure 9: The number of users who made a query each month to the ADS Abstract Service. The dashed line represents a doubling every 17 months, the dotted line a doubling every 22 months

From another perspective, consider a single typical research site (STScI) in a typical month (September 1998). The number of unique IP addresses from STScI which access the full-text data is 107, and the number of unique cookies associated with STScI which access the full-text data is 104; the number of unique IP addresses from STScI which make a query to ADS is 148, and the number of cookies is 140. The number of AAS members listing an STScI address is 145 (J. Johnson, personal communication), and the number of different people listing an STScI address in the Astropersons e-mail compilation (Benn & Martin 1995) is 195. Those who access the full-text average one article per day; those who make queries average two per day.

We believe nearly all active astronomy researchers, as well as students and affiliated professionals, use the ADS on a regular basis. Most of the recent exponential growth of use of the ADS is due to an increased number of users; this growth cannot last much longer, since the 17 000 who made queries in September 1998 are probably the majority of all those who could conceivably want to make a query of the technical astronomy literature.


Copyright The European Southern Observatory (ESO)