The SIMBAD database is primarily organized per astronomical object. The aim is to provide the user, as far as possible, with all published information (identifications, observational data, bibliographical references, and pointers towards external archives) concerning any given object, or list of objects.
The database is fed through two main channels: the integration of selected catalogues, and the systematic scanning of the astronomical literature.
Catalogues are selected for integration with priority given to those which can help provide an optimal support for the large projects conducted within the astronomical community. A large effort was, for instance, devoted in recent years to stellar catalogues (PPM, HIC, CCDM), in the context of the Hipparcos project, and to multi-wavelength identifications (IRAS PSC, Einstein 1E and 2E catalogues, older X-ray catalogues, the IUE Merged Log, etc.). The Hipparcos and Tycho catalogues (ESA [1997]) have been recently included, and inclusion of the ROSAT All Sky Survey is planned in the near future.
In parallel, the systematic scanning of the bibliography reflects the diversity and general trends of research in astronomy, and takes into account shorter lists. The published lists from the microlensing surveys, or e.g. the EUVE catalogues, were folded into the database as a result of this scanning.
When an object is found in the literature or in a catalogue, its possible cross-identification with objects already in SIMBAD is systematically studied, before entering the reference and the new object name in the database. About 4500 different names of catalogues or object lists from published papers can currently be found in SIMBAD, covering all the wavelength domains from high energy astrophysics to radio.
When no proper name is suggested by the authors, or when the acronym generates an ambiguity with already existing ones, the current practice, shared with the NED database, is to create an acronym within brackets using the initials of the last names of the first three authors, and the year of publication. For example, [HFE83] 366 is the 366th entry in the main table of a paper by Helmer, Fabricius, Einicke and colleagues published in 1983. From the year 2000 on, the year will be noted with four digits (e.g., [ABC2000]).
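The acronym rule described above can be sketched in a few lines of Python. This is only an illustration of the stated convention (initials of the first three authors, two-digit year before 2000, four-digit year from 2000 on); the function names are hypothetical, and the sketch ignores the ambiguity checks performed by the database team.

```python
def make_acronym(author_last_names, year):
    """Build a bracketed acronym from the first three authors and the year.

    Hypothetical helper: only the naming convention itself comes from the text.
    """
    initials = "".join(name[0].upper() for name in author_last_names[:3])
    # Two-digit year before 2000, four digits from 2000 on.
    year_part = f"{year % 100:02d}" if year < 2000 else str(year)
    return f"[{initials}{year_part}]"


def make_identifier(author_last_names, year, entry_number):
    """Combine the acronym with the running number of the entry in the table."""
    return f"{make_acronym(author_last_names, year)} {entry_number}"


print(make_identifier(["Helmer", "Fabricius", "Einicke"], 1983, 366))  # [HFE83] 366
```

Authors beyond the third are simply ignored, which matches the "first three authors" wording.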
Many objects have more than one name: the database contains more than 7.5 million object names for 2.7 million objects. Examples of objects with more than 50 identifiers are the galaxy M 87 in Virgo, the bright stars Procyon and Capella, the quasar 3C 273, and the Crab Nebula.
To help the users with the complex nomenclature of astronomical objects, the CDS now maintains and distributes on-line the Dictionary of Nomenclature of Celestial Objects (first developed by Lortet et al. [1994]; see Sect. 8).
In the following, the word object will be used to designate a star, non stellar object, or collection of objects such as a cluster, which corresponds to an individual entry in SIMBAD. For each object, the following data are included when available:
In the following, a more detailed description of some of these elements is given.
The object type refers to a hierarchical classification of the objects in SIMBAD, derived by the CDS team on the basis of the catalogue identifiers (as proposed by Ochsenbein & Dubois [1992]). From Star to Maser source, or Cluster of Galaxies, some 70 different categories, general, or very specific, are proposed (see examples in Table 1). A complete list is available on line.
This classification is intended to help the user select objects out of the database (e.g. through the filter procedure, see Sect. 4.5). It is also a powerful tool for data cross-checking and quality control. It has been designed to be practical and useful, and complements other features also available in SIMBAD (morphological type or spectral type information, catalogues, and measurements). It can follow the evolution of astronomy, with the introduction of new categories recently appeared in the literature (e.g., in the last years, Low-Mass or High-Mass X-Ray binary, Microlensing event, or Void).
Each class normally has a standard designation, a condensed one (used in tables), and an extended explanation. The classification uses a hierarchy with four levels, reflecting our knowledge of the characteristics of the astronomical object. For instance, an object can be classified as a "Star" (this is level 1). If photometric observations have shown variability of the object, it can be classified as a "Variable star" (this is level 2). Examples of levels 3 and 4 are "Pulsating variable" and "Cepheid".
This hierarchy of object types (and their possible synonyms) is managed in the database in such a way that selecting variable stars (V*) is understood as selecting objects classified as V*, and all subdivisions (e.g. PulsV*, Mira, Cepheid, etc.). If the user is only interested in RR Lyrae type stars, he/she will use the RRLyr type, leaving aside all other variable stars for which the variability mode is different, or not known.
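The hierarchical selection described above can be illustrated with a small sketch. The parent links below are an assumed excerpt built from the type names quoted in the text (the exact placement of Mira and RRLyr under PulsV* is an assumption), and the object names are placeholders, not SIMBAD entries.

```python
# Child -> parent links for a small, assumed excerpt of the hierarchy.
PARENT = {
    "V*": "*",
    "PulsV*": "V*",
    "Mira": "PulsV*",
    "Cepheid": "PulsV*",
    "RRLyr": "PulsV*",
}


def is_a(object_type, selected):
    """True if object_type is the selected type or one of its subdivisions."""
    current = object_type
    while current is not None:
        if current == selected:
            return True
        current = PARENT.get(current)  # None once the top level is passed
    return False


def select(objects, selected):
    """Keep the names of the objects whose type falls under the selected type."""
    return [name for name, otype in objects if is_a(otype, selected)]


# Placeholder objects: (name, object type).
sample = [("obj1", "Cepheid"), ("obj2", "RRLyr"), ("obj3", "*"), ("obj4", "V*")]
print(select(sample, "V*"))     # all variable stars, whatever the subdivision
print(select(sample, "RRLyr"))  # RR Lyrae type stars only
```

Walking up from the object's own type keeps the test cheap and means that a new subdivision only needs one extra parent link.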
The classification emphasizes the physical nature of the object rather than a peculiar emission in some region of the electromagnetic spectrum or the location in peculiar clusters or external galaxies. Therefore objects are classified as peculiar emitters in a given wavelength (such as UV or IR source) only if nothing more about the nature of the object is known, i.e. it cannot be decided on the sole basis of the basic data whether the object is a star, a multiple system, a nebula, or a galaxy. For instance, if an object appears only in the IRAS catalogue, it is automatically classified as IR object: it is left to the user to decide to go further and to derive, e.g. on the basis of the IRAS colors, the probability for the source to be stellar or extragalactic.
Level | Standard name | Short name | Extended explanation |
... | | | |
1 | Star | * | Star |
2 | *inCl | *iC | Star in Cluster |
2 | *inNeb | *iN | Star in Nebula |
2 | *inAssoc | *iA | Star in Association |
2 | *in** | *i* | Star in double system |
2 | V*? | V*? | Star suspected of Variability |
2 | Pec* | Pe* | Peculiar Star |
3 | HB* | HB* | Horizontal Branch Star |
3 | YSO | Y*O | Young Stellar Object |
3 | Em* | Em* | Emission-line Star |
4 | Be* | Be* | Be Star |
... | | | |
1 | Galaxy | G | Galaxy |
2 | PartofG | PoG | Part of a Galaxy |
2 | GinCl | GiC | Galaxy in Cluster of Galaxies |
2 | GinGroup | GiG | Galaxy in Group of Galaxies |
2 | GinPair | GiP | Galaxy in Pair of Galaxies |
2 | High_z_G | HzG | Galaxy with high redshift |
... | | | |
2 | AGN | AGN | Active Galaxy Nucleus |
3 | LINER | LIN | LINER-type Active Galaxy Nucleus |
3 | Seyfert | SyG | Seyfert Galaxy |
4 | Seyfert_1 | Sy1 | Seyfert 1 Galaxy |
4 | Seyfert_2 | Sy2 | Seyfert 2 Galaxy |
3 | Blazar | Bla | Blazar |
4 | BLLac | BLL | BL Lac - type object |
4 | OVV | OVV | Optically Violently Variable |
3 | QSO | QSO | Quasar |
Because there is at most one object type per object, this classification should be used with caution when extracting samples out of the database. This is typically the case for the wavelength types: using IR or X as a criterion cannot generate a sample of all IRAS sources, or all X-ray emitting objects, since a number of them are in fact classified as stars, galaxies, etc.
The coordinates were originally stored in the database in the FK4 system for equinox and epoch 1950.0. A major change took place in 1999, after the publication of the Hipparcos and Tycho catalogues, when they were moved to the International Celestial Reference System (ICRS, see Feissel & Mignard [1998]) at epoch 2000.0. The position data frame has become more complex, grouping together all data needed for computing the coordinates in any reference frame, at any epoch and equinox: the coordinates themselves, the proper motion, the parallax, and the radial velocity or redshift.
All these data contain the same subfields: the original data, displayed with a number of digits consistent with the announced precision of the data; a quality code from "A" (reference data) to "E" (unreliable origin); an error box (either a standard error, or an ellipse); and the bibliographic reference of the data.
In earlier versions of SIMBAD, the determination of the position for another equinox used to take only precession into account. In the current version, a change of equinox takes into account not only the precession but also the proper motion, the reference frame (FK4, FK5, ICRS), and, when they are known, the parallax and radial velocity. When no epoch is specified, the year of the equinox is used by default.
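As a rough illustration of the proper-motion term and of the default-epoch rule mentioned above, the following sketch shifts a position linearly between two epochs. It is a first-order approximation only: precession, the change of reference frame, parallax, and radial velocity are deliberately left out, the function names are hypothetical, and the right-ascension proper motion is assumed to include the cos(declination) factor.

```python
import math

MAS_PER_DEG = 3_600_000.0  # milliarcseconds per degree


def resolve_epoch(equinox_year, epoch=None):
    """When no epoch is specified, the year of the equinox is used by default."""
    return equinox_year if epoch is None else epoch


def apply_proper_motion(ra_deg, dec_deg, pm_ra_cosdec, pm_dec, epoch_from, epoch_to):
    """Shift a position linearly in time (first-order sketch).

    pm_ra_cosdec and pm_dec are in mas/year; pm_ra_cosdec is assumed to
    already include the cos(dec) factor. Not valid close to the poles.
    """
    dt = epoch_to - epoch_from
    dec_new = dec_deg + pm_dec * dt / MAS_PER_DEG
    cos_dec = math.cos(math.radians(dec_deg))
    ra_new = ra_deg + pm_ra_cosdec * dt / (MAS_PER_DEG * cos_dec)
    return ra_new % 360.0, dec_new


# Hypothetical star: 100 mas/year in each coordinate, moved from 2000.0 to 1950.0.
target_epoch = resolve_epoch(1950.0)
print(apply_proper_motion(150.0, 30.0, 100.0, 100.0, 2000.0, target_epoch))
```

A rigorous treatment would propagate the full space motion and then apply the frame rotation, which is what the text describes for the current version of the database.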
Data come from various sources. When astrometric data are available, the most accurate one has been selected for the "basic data". Other values may be available as measurements (in the pos type). The Hipparcos and Tycho catalogues (ESA [1997]) constitute the major source of positions for stars.
The precision of the coordinates varies from object to object, and can reach 1/10 mas. The default display format provides equatorial coordinates in the ICRS system at epoch 2000.0, and in the FK5 system at equinoxes 2000 and 1950, as well as galactic coordinates. Coordinates in the FK4 system, and ecliptic or super-galactic coordinates, can be computed on request.
The proper motions (in right ascension and declination) are given in mas/year, together with their standard errors (also in mas/year). The primary source of proper motions is the Hipparcos and Tycho catalogues (ESA [1997]).
The errors for positions or proper motions are expressed as error ellipses, made of three numbers within brackets: the major axis, the minor axis, and the position angle of the major axis (measured from North to East). Major and minor axes are expressed in mas for the position, and in mas/yr for the proper motion; the position angle is expressed in degrees.
When available, the stellar parallax is given in mas, together with the associated error within brackets. The primary source is the Hipparcos and Tycho catalogues (ESA [1997]).
Radial velocities (in km/s) or redshifts (for extragalactic objects) are currently available for some 160 000 objects. They are stored in their original type (either redshift, or radial velocity in km/s), together with the standard error. They can be displayed in the original type, or forced into either of the two types using the corresponding conversion formula.
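The text does not state which formula is used to translate between the two types. As a sketch only, the conversion below assumes the special-relativistic Doppler relation; the function names are hypothetical and the database may well use a different convention (for instance the low-velocity approximation v = cz).

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s


def velocity_to_redshift(v_km_s):
    """Radial velocity (km/s) to redshift, assuming the relativistic Doppler formula."""
    beta = v_km_s / C_KM_S
    return math.sqrt((1.0 + beta) / (1.0 - beta)) - 1.0


def redshift_to_velocity(z):
    """Redshift to radial velocity (km/s), inverse of velocity_to_redshift."""
    s = (1.0 + z) ** 2
    return C_KM_S * (s - 1.0) / (s + 1.0)


# Round trip on an arbitrary value: the velocity is recovered.
z = velocity_to_redshift(30_000.0)
print(z, redshift_to_velocity(z))
```

Keeping the value in its original type, as the database does, avoids accumulating conversion errors; the conversion is applied only at display time.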
Stellar radial velocity data have been compiled with the collaboration of Observatoire de Marseille.
For extragalactic objects, up-to-date redshift information has recently been imported from the NASA/IPAC Extragalactic Database (NED, Helou et al. [1991]) as a result of the ongoing exchange agreement: the SIMBAD team is providing NED with bibliographic coverage of extragalactic objects for all astronomical journals, and is being given access, in return, to extragalactic data collected by NED.
Tables from individual articles constitute the other major source of information.
B and V magnitudes are given, when possible, in Johnson's UBV system. Both magnitudes may be followed by a semicolon meaning they cannot be made homogeneous to the UBV system. In addition the following flags may appear:
When possible the magnitudes have been taken from the Tycho Reference Catalogue (Høg et al. [1998]), where B and V magnitudes are derived from the original Tycho B_T and V_T magnitudes. Another major source is the UBV compilation of Mermilliod ([1987]). Otherwise the data come from one of the published papers associated with the object.
The spectral types of stars have been taken preferably from the Michigan Catalogues of Two-Dimensional Spectral Types for the HD stars (Houk [1975], et seq.), or from the bibliographical surveys of MK classifications (Jaschek [1978]). In the absence of a full MK classification, the HD spectral type is recorded.
Most of the spectral types need less than 5 characters, but this field can be as long as 36 characters.
The morphological types of galaxies have been selected primarily from the Uppsala General Catalogue of Galaxies (UGC, Nilson [1973]), the Morphological Catalogue of Galaxies (MCG, Vorontsov-Velyaminov [1962]), and other catalogues (see Dubois et al. [1983]).
In complement, the following data, primarily from UGC, are given, when available, for galaxies:
Cross-identifications of stars and galaxies have been searched for SIMBAD entries from (currently) about 4500 source catalogues and tables, included, either completely or partially, in the database. The index of 7.5 million aliases, thus constituted, is one of the unique features of the SIMBAD database.
Aliases may serve as entry points for related catalogues and tables (e.g. in VIZIER). Cross-fertilization of a given research with previous studies of the same object published in the astronomical literature is made directly possible from the alias list.
The index of names and aliases constitutes the basis for the SIMBAD name resolver which provides, in response to any object name, the set of coordinates corresponding to the object position on the celestial sphere, or the list of papers citing the object. The name resolving power of SIMBAD is used by many archives and information systems (such as the archives of Hubble Space Telescope or European Southern Observatory, the High Energy Astrophysics Science Archive Center, the Astrophysics Data System, servers of the Digitized Sky Surveys, etc.).
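The principle of a name resolver built on an alias index can be sketched as follows. This is a toy in-memory model, not the SIMBAD implementation: the data structures and function names are hypothetical, the alias lists are tiny excerpts, and the coordinates are approximate ICRS values in degrees given for illustration only.

```python
import re

# Toy object table: internal key -> approximate ICRS (RA, Dec) in degrees.
OBJECTS = {
    1: (187.706, 12.391),  # M 87 (approximate)
    2: (187.278, 2.052),   # 3C 273 (approximate)
}

# Toy alias lists: every alias of an object points to the same internal key.
ALIASES = {
    1: ["M 87", "NGC 4486", "3C 274"],
    2: ["3C 273", "PG 1226+023"],
}


def normalize(name):
    """Collapse whitespace and ignore case so that 'ngc  4486' matches 'NGC 4486'."""
    return re.sub(r"\s+", " ", name.strip()).upper()


# Index built once: normalized alias -> internal key.
INDEX = {normalize(alias): key for key, names in ALIASES.items() for alias in names}


def resolve(name):
    """Return the coordinates for any known alias, or None if the name is unknown."""
    key = INDEX.get(normalize(name))
    return None if key is None else OBJECTS[key]


print(resolve("ngc  4486"))
print(resolve("M 87") == resolve("3C 274"))
```

Because all aliases map to one internal key, no name is privileged: any of them resolves to the same object and the same position.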
There is no SIMBAD preferred name for objects: all aliases can be equally used. A short list of major catalogues is used internally to put at the top of the list the most common name according to the object type (e.g., Messier or NGC identifier for galaxies and nebulae). All other identifiers are presented in alphabetical order.
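The display order described above (one common name first, then all other identifiers alphabetically) could be modelled as below. The priority list per object type is a hypothetical stand-in for the internal list of major catalogues, which the text does not spell out beyond the Messier and NGC example.

```python
# Hypothetical priority lists: object type -> catalogue prefixes, most common first.
PREFERRED = {"G": ["M", "NGC"]}


def order_identifiers(identifiers, object_type):
    """Put the most common name on top, then sort the others alphabetically."""
    top = None
    for catalogue in PREFERRED.get(object_type, []):
        matches = sorted(i for i in identifiers if i.split()[0] == catalogue)
        if matches:
            top = matches[0]
            break
    rest = sorted(i for i in identifiers if i != top)
    return ([top] if top is not None else []) + rest


print(order_identifiers(["NGC 4486", "3C 274", "M 87"], "G"))
# ['M 87', '3C 274', 'NGC 4486']
```

Only the first line of the list depends on the object type; the sort of the remaining identifiers is a plain string sort in this sketch.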
A command of the SIMBAD native mode ("selectid"), and an option in the sampling form of the WWW interface, allow the user to impose a list of identifiers to be used when displaying object lists.
It is to be noted that for a double system in which the components can be observed separately, SIMBAD frequently includes three entries: A and B components, and an additional entry for the joint system (AB), the latter entry carrying the observational data and references related to the system as a whole. This has to be taken into account in statistical studies such as stellar counts.
Observational data are presently given for the measurement types listed in Table 2.
Name | Observational data | # |
CEL | Ultraviolet photometry from Celescope | 5230 |
Cl.G | Clusters of Galaxies (Abell & Corwin [1989]) | 5345 |
Einstein | Einstein Observatory Soft X-ray Source List | 5668 |
GEN | U B V B1 B2 V1 G Geneva photometry | 3650 |
GJ | Absolute magnitudes and spatial velocities of nearby stars | 2368 |
Hbet | Hβ index | 32278 |
HGAM | Hγ equivalent width | 723 |
IRAS | IRAS Point Source Catalog | 245784 |
IRC | KI photometry from Two Micron Sky Survey | 4880 |
IUE | International Ultraviolet Explorer (Merged Observation Log) | 66805 |
JP11 | UBVRIJKLMNH 11-colour Johnson photometry | 5892 |
MK | Stellar spectral classification in Morgan-Keenan system | 190231 |
oRV | Stellar Radial velocities (also under GCRV) | 68783 |
PLX | Trigonometric parallaxes | 16329 |
pm | Proper motions (from various astrometric catalogues) | 243065 |
pos | Positions (from various astrometric catalogues) | 668953 |
ROT | Rotational velocities (v sin i) | 7181 |
RVEL | Radial velocities of extragalactic objects | 36552 |
SAO | Positions and proper motions from SAO catalogue | 252384 |
TD1 | Ultraviolet magnitudes from TD1 satellite | 25972 |
UBV | Johnson UBV photometry | 141215 |
uvby | Strömgren uvby photometry | 37986 |
V* | Data related to variable stars | 25764 |
z | Redshifts (of distant galaxies and quasars) | 88888 |
For each data type, one can retrieve individual data with their bibliographical references, and, when available, weighted means computed from existing observed values by specialists in the related field.
When measurements are listed as a result of a SIMBAD query, they are normally preceded by a header providing a very short title to each listed parameter.
The important rôle now played by the VIZIER database of catalogues (Ochsenbein et al. [2000]), coming with easier interoperability of services, is changing the strategy for inclusion of observational measurements into SIMBAD. Let us take the example of the Hipparcos and Tycho catalogues (ESA [1997]): once the HIP or TYC identifier is available from SIMBAD it appears convenient enough to provide the user with a WWW link to the corresponding data in VizieR rather than overloading the SIMBAD database with the full Hipparcos and Tycho catalogues. This functionality is currently being implemented for important catalogues which have already been cross-identified. As a complement, the WWW interface includes pointers to external archives, currently: the INES database of the IUE project (Rodriguez-Pascual et al. [1999]); the high-energy observational archives at HEASARC (HEASARC team [1995]).
One of the key features of the SIMBAD astronomical database is the unique coverage of bibliographical references to objects. The bibliographic index contains references to stars from 1950 onwards, and to galaxies and all other objects outside the solar system from 1983 onwards. Presently (November 1999) there are about 3 million references taken from 110 000 papers published in the 100 most important astronomical periodical publications.
Articles are scanned in their entirety, and references to all objects mentioned in the title, in the abstract, in the text, in the figures, or in the tables are included in the bibliography. Tables larger than 1000 objects are usually considered as catalogues and processed separately.
No assessment is made of the relevance of the citation in terms of astronomical contents: the paper can be entirely devoted to the object, or simply give a side mention of it; in both cases this gives a reference in SIMBAD. Note that, for instance, the NED team (Helou et al. [1991]) applies a different strategy when covering the bibliography of extragalactic objects, and tends to select only those papers that appear most relevant. Clearly, the SIMBAD approach favours exhaustiveness, at the cost of increased information noise.
A code (nicknamed bibcode) is assigned to each considered paper: this 19-character bibcode contains in principle enough information to locate the article (including year of publication, journal, volume, page, etc.).
When one retrieves the bibliography of a SIMBAD object, a list of codes is usually given, and - according to the options used - these codes are automatically matched against a bibliographic file which provides the full reference, title and list of authors for each citation, together with an anchor pointing to the electronic version of the article.
Currently, in SIMBAD, about 50% of the objects have no bibliographic reference. Among the most cited objects (more than 2000 references) are the Large Magellanic Cloud, M 31, 3C 273, and the supernova SN 1987A.
The structure of the 19-character bibcode has been defined in close collaboration with the NED group at NASA/IPAC so that both databases share the same coding system (Schmitz et al. [1995]). It is also used, with some adjustments, by the Abstract Service of the Astrophysics Data System (ADS, Kurtz et al. [2000]), and by the electronic journals (see e.g., Boyce & Dalterio [1996]). Reference codes have the following general structure:
YYYYJJJJJVVVVMPPPPA
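where YYYY is the year of publication, JJJJJ the journal abbreviation, VVVV the volume number, M a qualifier for the publication section, PPPP the page number, and A the first letter of the first author's last name. Unused positions are filled with dots. The qualifier M can take the following values:
L | letter |
p | pink page (in MNRAS) |
a-z | issue number within a volume |
A-K | issue designation used by publisher |
Q-Z | to distinguish articles on the same page |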
Example: 1991A&A...246L..24M for Astron. Astrophys. 246, L24, 1991, a Letter to the Editor of Astronomy & Astrophysics, by Motch et al.
For a complete description see Schmitz et al. ([1995]), or the WWW server.
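The fixed-width layout of the reference code makes it straightforward to split into its fields. The sketch below assumes the positional layout suggested by the general structure and the example above (4 characters for the year, 5 for the journal, 4 for the volume, 1 qualifier, 4 for the page, 1 author initial, with dots as padding); the function name and the dictionary keys are hypothetical.

```python
def parse_bibcode(bibcode):
    """Split a 19-character reference code into its fields (positional sketch)."""
    if len(bibcode) != 19:
        raise ValueError(f"expected 19 characters, got {len(bibcode)}")
    return {
        "year": int(bibcode[0:4]),
        "journal": bibcode[4:9].strip("."),     # dots are padding
        "volume": bibcode[9:13].strip("."),
        "qualifier": bibcode[13].strip("."),    # e.g. 'L' for a letter, '' if unused
        "page": bibcode[14:18].strip("."),
        "author_initial": bibcode[18],
    }


print(parse_bibcode("1991A&A...246L..24M"))
```

For the example above this yields year 1991, journal A&A, volume 246, qualifier L, page 24, and author initial M. Special cases covered in the complete description are not handled by this sketch.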
Several types of comments are associated with the references in SIMBAD and normally displayed after the reference:
The astronomical content of SIMBAD results from the complex process of folding into the database a selection of important catalogues, and of surveying the complete astronomical literature.
This can be illustrated by the histogram in V magnitudes of Fig. 1. After the inclusion of the Tycho catalogue, the coverage is reasonably complete for stars to beyond magnitude 10. Many objects in the range 12th to 26th magnitude come from extensive studies of objects in selected sky areas: deep fields, external galaxies, etc.
Some well-known very large catalogues are not part of SIMBAD: for instance the Hubble Telescope Guide Star Catalogue (GSC, Lasker et al. [1990]) is not systematically included (even if GSC identifiers have been added for all Tycho stars present in SIMBAD). This results from a compromise aiming to save database load as well as manpower for cross-identification and quality control. Note that VIZIER and ALADIN give access to the full GSC catalogue (and to even larger catalogues and databases such as USNO-A, DENIS, 2MASS).
Figure 1: Histogram of the number of objects in SIMBAD per magnitude interval (V magnitude; logarithmic scale)
Figure 2 illustrates the increase of the data contents of the database in the years 1990 to 1999.
Copyright The European Southern Observatory (ESO)