In the following we will focus on Internet resources that actually provide data, of any kind, as opposed to those describing or documenting an institution or a research project, without giving access to any data set or archive.
One main trend is certainly the growing interconnection between distributed on-line services, the "Weaving of the Astronomy Web" (which was the title of a conference organized in Strasbourg by Egret & Heck [1995]).
More generally, with the development of the Internet, and of a large number of on-line services giving access to data or information, it is clear that tools giving coordinated access to distributed services are needed. This is, for instance, the concern expressed by NASA through the Astrobrowse project (Heikkila et al. [1999]).
In this section we will first describe a tool for managing a "metadata" dictionary of astronomy information services (GLU); then we will show how the existence of such a metadatabase can be used for building efficient search and discovery tools.
The CDS (Centre de Données astronomiques de Strasbourg) has recently developed a tool for managing remote links in a context of distributed heterogeneous services (GLU, Générateur de Liens Uniformes, i.e. Uniform Link Generator; Fernique et al. [1998]).
First developed to ensure efficient interoperability among the several services existing at CDS (VIZIER, SIMBAD, ALADIN, bibliography, etc.; see Genova et al. [2000]), this tool has also been designed for maintaining addresses (URLs) of distributed services (ADS, NED, etc.).
A key element of the system is the "GLU dictionary" maintained by the data providers contributing to the system, and distributed to all sites of a given domain. This dictionary contains knowledge about the participating services (URLs, syntax and semantics of input fields, descriptions, etc.), so that a correct query can be generated automatically for submission to a remote database.
A service provider (data center, archive manager, or webmaster of an astronomical institute) can use GLU to code a query and benefit from the ease with which the system is updated: knowing which service to call, and which answer to expect from this service, the programmer does not have to worry about the precise address of the remote service at a given time, nor about the detailed syntax of the query (expected format of the equatorial coordinates, etc.).
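The principle of generating a query from a dictionary record can be illustrated with a minimal Python sketch. It is not the GLU implementation: the record layout, the service identifiers, the URLs (placed under example.org) and the function name `build_query` are all assumptions made for illustration only.

```python
from urllib.parse import urlencode

# Hypothetical dictionary records: identifiers, URLs and parameter names are
# placeholders for illustration, not actual GLU entries.
DICTIONARY = {
    "object.by_name": {
        "description": "Basic data for a named astronomical object",
        "url": "http://objects.example.org/query",
        "fields": {"name": "ident"},  # logical field -> remote parameter
    },
    "images.by_position": {
        "description": "Sky image around an equatorial position",
        "url": "http://images.example.org/cutout",
        "fields": {"ra_deg": "ra", "dec_deg": "dec"},
    },
}


def build_query(service_id, **values):
    """Generate the query URL for a service from its dictionary record."""
    record = DICTIONARY[service_id]
    missing = sorted(set(record["fields"]) - set(values))
    if missing:
        raise ValueError("missing input fields: " + ", ".join(missing))
    # Translate logical field names into the names the remote service expects.
    params = {remote: values[field] for field, remote in record["fields"].items()}
    return record["url"] + "?" + urlencode(params)


print(build_query("object.by_name", name="M 31"))
```

In this sketch the calling program only names the service and supplies logical field values; if the remote address or parameter names change, only the dictionary record has to be updated.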
The example of GLU demonstrates the usefulness of storing in a database the knowledge about information services (their address, purpose, domain of coverage, query syntax, etc.). In a second step, such a database can be queried when the challenge is to find out who is providing what, for a given object, region of the sky, or domain of interest.
Several projects are working toward providing general solutions.
Astrobrowse is a project that began within the United States astrophysics community, primarily within NASA data centers, to develop a user agent that significantly streamlines the process of locating astronomical data on the web. Several prototype implementations are already available. With any of these prototypes, a user can already query thousands of resources without having to deal with out-of-date URLs, or spend time figuring out how to use each resource's unique input formats.
Given a user's selection of web-based astronomical databases and an object name or coordinates, Astrobrowse will send queries to all databases identified as containing potentially relevant data. It provides links to these resources and allows the user to browse results from each query. Astrobrowse does not recognize, however, when a query yields a null result, nor does it integrate query results into a common format to enable intercomparison.
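The fan-out behaviour described above (one target, many resources, one link per resource) can be sketched in Python as follows. This is an illustration under stated assumptions, not Astrobrowse code: the resource names, URL templates, the `accepts` field and the function `fan_out` are hypothetical. Like the prototypes described, the sketch only builds the query links; it does not detect null results or merge answers.

```python
from urllib.parse import quote_plus

# Hypothetical resource descriptions; names and URL templates are placeholders.
RESOURCES = [
    {"name": "Catalogue A",
     "template": "http://a.example.org/search?target={target}",
     "accepts": {"name", "coordinates"}},
    {"name": "Archive B",
     "template": "http://b.example.org/cone?pos={target}",
     "accepts": {"coordinates"}},
    {"name": "Bibliography C",
     "template": "http://c.example.org/refs?object={target}",
     "accepts": {"name"}},
]


def fan_out(target, kind, selection=None):
    """Return (resource name, query URL) pairs for every relevant resource.

    kind is "name" or "coordinates"; selection optionally restricts the
    search to the resource names chosen by the user.
    """
    links = []
    for resource in RESOURCES:
        if selection is not None and resource["name"] not in selection:
            continue
        if kind not in resource["accepts"]:
            continue  # this resource cannot be queried with this input
        url = resource["template"].format(target=quote_plus(target))
        links.append((resource["name"], url))
    return links


for name, url in fan_out("M 31", "name"):
    print(name, url)
```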
Consider the following scenario: we have a data item I (for example an author's name, the position or name of an astronomical object, a bibliographical reference, etc.), and we would like to know more about it, but we do not know a priori which service S to contact, nor which data types D can be requested. This scenario is typical of a scientist exploring new domains as part of a research procedure.
The GLU dictionary can help answer this question: it can be considered as a reference directory, storing the knowledge about all services accepting data item I as input, for retrieving data D1 or D2. For example, we can easily obtain from such a dictionary the list of all services accepting an author's name as input; the information that can be accessed in return may be an abstract (service ADS), a preprint (LANL/astro-ph), the author's address (RGO e-mail directory) or personal Web page (StarHeads), etc.
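The lookup from a data item I to the available data types D can be sketched as a simple index over service records. The Python fragment below is only an illustration: the tuple layout, the type labels and the function name `discover` are assumptions, and the object-name entry is added purely for contrast. The four author-name entries follow the examples given in the text.

```python
from collections import defaultdict

# (service S, accepted input I, returned data D). The labels used for I and D
# are illustrative, not taken from the actual dictionary.
SERVICES = [
    ("ADS", "author name", "abstract"),
    ("LANL/astro-ph", "author name", "preprint"),
    ("RGO e-mail directory", "author name", "address"),
    ("StarHeads", "author name", "personal Web page"),
    ("SIMBAD", "object name", "basic data"),  # assumed entry, for contrast
]


def discover(input_type):
    """Map each obtainable data type D to the services accepting input I."""
    found = defaultdict(list)
    for service, accepted, returned in SERVICES:
        if accepted == input_type:
            found[returned].append(service)
    return dict(found)


for data_type, services in discover("author name").items():
    print(data_type, "->", ", ".join(services))
```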
Based on such a system, it becomes possible to automatically create a simple interface guiding the user towards any of the services described in the dictionary.
This idea has been developed as a prototype tool, under the name of AstroGLU (Egret et al. [1998]). The aim of this tool is to help users find their way among several dozen (for the moment) possible actions or services.
A number of compromises have to be made between providing the user with the full information (which would be too abundant and thus unusable), and preparing digest lists (which implies hiding some auxiliary information and making somewhat subjective selections).
A resulting issue is that the system places on the same footing services which have very different quantitative or qualitative characteristics. AstroGLU has no efficient way yet to provide the user with a hierarchy of services, as a gastronomic guide would do for restaurants. This might become a necessity in the future, as more and more services become (and remain) available.
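The automatic creation of a simple interface from the dictionary, and the "digest list" compromise, can be sketched as below. This is not the AstroGLU code: the record fields, the descriptions, the truncation length and the function `build_menu` are hypothetical choices for illustration. Note that the sketch, like the prototype described, lists all services at the same level, without any ranking.

```python
from html import escape

# Hypothetical action records; the descriptions are invented for illustration.
ACTIONS = [
    {"input": "author name", "service": "ADS", "returns": "abstract",
     "description": "Abstracts of papers published by the given author"},
    {"input": "author name", "service": "StarHeads",
     "returns": "personal Web page",
     "description": "Personal Web pages of astronomers"},
    {"input": "object name", "service": "SIMBAD", "returns": "basic data",
     "description": "Identifiers and basic data for the given object"},
]


def build_menu(input_type, max_description=30):
    """Return an HTML list of the actions available for one input type.

    Descriptions longer than max_description are cut: this is the digest
    compromise, trading completeness for a list the user can scan quickly.
    """
    items = []
    for action in ACTIONS:
        if action["input"] != input_type:
            continue
        text = action["description"]
        if len(text) > max_description:
            text = text[:max_description].rstrip() + "..."
        items.append("<li>{}: {} ({})</li>".format(
            escape(action["service"]), escape(action["returns"]), escape(text)))
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


print(build_menu("author name"))
```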
Copyright The European Southern Observatory (ESO)