
6 Information clustering and advanced user interfaces

A major challenge in current information systems research is finding efficient ways for users to visualize the contents of large databases and to understand the correlations within them. The technologies being developed are likely to be applicable to astronomical information systems. For example, information retrieval by means of "semantic road maps" was first detailed in Doyle ([1961]), using a powerful spatial metaphor which lends itself well to modern distributed computing environments such as the Web. The Kohonen self-organizing feature map (SOM; Kohonen [1982]) is an effective means of building such a visual user interface for information retrieval.

6.1 Interfacing datasets with a self-organizing map

The Kohonen map is, at heart, k-means clustering with the additional constraints that the cluster centers be located on a regular grid (or some other topographic structure) and that their locations on the grid be monotonically related to pairwise proximity (Murtagh & Hernández-Pajares [1995]).
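The grid constraint described above can be illustrated with a minimal sketch in Python. It is not the implementation used for the maps discussed in this section: the function names, the linearly decaying learning rate and the Gaussian neighborhood are assumptions chosen for brevity. Each input vector pulls its closest cluster center towards it, as in k-means, and also pulls the centers of neighboring grid nodes, which is what ties proximity on the grid to proximity in the data.

```python
import math
import random


def best_matching_unit(x, weights):
    """Return the grid node whose cluster center is closest to x."""
    return min(
        weights,
        key=lambda node: sum((a - b) ** 2 for a, b in zip(x, weights[node])),
    )


def train_som(data, rows, cols, n_iter=2000, lr0=0.5, sigma0=None, seed=0):
    """Train a small SOM; returns a dict {(row, col): center vector}.

    Assumed schedule (illustrative only): learning rate and neighborhood
    radius both shrink linearly with the iteration count.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # One cluster center per grid node, initialized from random input vectors.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.5)
        x = rng.choice(data)
        bmu = best_matching_unit(x, weights)
        for (r, c), w in weights.items():
            # Squared distance measured on the grid, not in data space.
            grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
            for k in range(dim):
                w[k] += lr * h * (x[k] - w[k])
    return weights
```

With keyword-based document vectors as `data`, calling `best_matching_unit` on each document would give the grid node at which that document is displayed.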

A regular grid is a convenient output representation space, as it maps directly onto a visual user interface. In a web context, it can easily be made interactive and responsive.

Figure 1 shows an example of such a visual and interactive user interface map, in the context of a set of journal articles described by their keywords. Color is related to the density of document clusters located at regularly spaced nodes of the map, and some of these nodes/clusters are annotated. The map is installed on the Web as a clickable image map, with CGI programs providing access to lists of documents and, through further links, in many cases to the full documents. In the example shown, the user has queried a node and the results are seen in the right-hand panel. Such maps are maintained for (currently) 12000 articles from the Astrophysical Journal, 7000 from Astronomy and Astrophysics, over 2000 astronomical catalogs, and other data holdings. More information on the design of this visual interface and user assessment can be found in Poinçot et al. ([1998,2000]).
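The lookup behind a clickable image map of this kind can be sketched as follows. The CGI programs themselves are not described here, so the function names, the image dimensions and the `index` dictionary are hypothetical; the sketch only shows how a click position in pixels could be converted to a node of the regular grid and then to a list of documents.

```python
def node_from_click(x, y, width, height, rows, cols):
    """Return the (row, col) grid node under a click at pixel (x, y).

    Assumes the map image is width x height pixels and is divided into
    rows x cols equal cells, with pixel (0, 0) at the top-left corner.
    """
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("click outside the map image")
    col = x * cols // width
    row = y * rows // height
    return row, col


def documents_at(node, index):
    """Return the documents stored for a node (empty list if none)."""
    return index.get(node, [])


# Hypothetical index: node -> list of document identifiers.
example_index = {(0, 0): ["doc-a", "doc-b"], (14, 19): ["doc-c"]}
clicked = node_from_click(399, 299, 400, 300, 15, 20)
results = documents_at(clicked, example_index)
```

Integer arithmetic is used for the cell computation so that clicks on cell boundaries are assigned consistently.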


Figure 1: Visual interactive user interface to a set of articles from the journal Astronomy and Astrophysics. Original in color.

6.2 Hyperlink clustering

Guillaume & Murtagh ([2000]) have recently developed a Java-based visualization tool for hyperlink-based data, in XML, consisting of astronomers, astronomical object names, article titles, and possibly other objects (images, tables, etc.). Through weighting, the various types of links can be prioritized. An iterative refinement algorithm was developed to map the nodes (objects) to a regular grid of cells which, as with the Kohonen SOM, are clickable and provide access to the data represented by the cluster. Figure 2 shows an example for an astronomer (Prof. Jean Heyvaerts, Strasbourg Astronomical Observatory).


Figure 2: Visual interactive user interfaces, based on graph edges. Vertices are author names, article titles and (not shown here) astronomical object names. Map for astronomer Jean Heyvaerts. Original in color.
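The idea of iteratively refining a placement of linked objects on a regular grid can be illustrated with the sketch below. The algorithm of Guillaume & Murtagh is not specified here, so this is a generic stand-in and not their method: a random initial placement is improved by swapping the contents of two cells whenever the swap does not increase the weighted sum of grid distances between linked objects. All names, the Manhattan distance and the acceptance rule are assumptions.

```python
import random


def layout_cost(placement, links):
    """Sum over links of weight * Manhattan grid distance between endpoints."""
    total = 0.0
    for (a, b), weight in links.items():
        (ra, ca), (rb, cb) = placement[a], placement[b]
        total += weight * (abs(ra - rb) + abs(ca - cb))
    return total


def refine_layout(nodes, links, rows, cols, n_steps=5000, seed=0):
    """Place nodes on a rows x cols grid, one node per cell at most.

    links maps (node_a, node_b) pairs to weights; a larger weight asks for
    the two nodes to be placed closer together. Returns (placement, cost).
    """
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    if len(nodes) > len(cells):
        raise ValueError("grid too small for the number of nodes")
    rng = random.Random(seed)
    # Empty cells are represented by None so that nodes can move into them.
    occupants = list(nodes) + [None] * (len(cells) - len(nodes))
    rng.shuffle(occupants)

    def placement_of(occ):
        return {n: cell for n, cell in zip(occ, cells) if n is not None}

    best = layout_cost(placement_of(occupants), links)
    if len(cells) >= 2:
        for _ in range(n_steps):
            i, j = rng.sample(range(len(cells)), 2)
            occupants[i], occupants[j] = occupants[j], occupants[i]
            cost = layout_cost(placement_of(occupants), links)
            if cost <= best:
                best = cost
            else:
                # Undo a swap that made the layout worse.
                occupants[i], occupants[j] = occupants[j], occupants[i]
    return placement_of(occupants), best
```

Giving, for example, author-article links a larger weight than article-object links would pull authors closer to their articles, which is one way the prioritization by link type mentioned above could be expressed. The cost is recomputed in full at every step for clarity; an incremental update would be needed for large graphs.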

These new cluster-based visual user interfaces are not computationally demanding. In general they cannot be created in real time, but they are scalable in the sense that many tens of thousands of documents or other objects can be easily handled. The motivation is less document management (see e.g. Cartia[*]) than the interactive user interface itself.

Further information on these visual user interfaces can be found in Guillaume ([2000]) and Poinçot ([1999]).

6.3 Future developments for advanced interfaces

Two directions of development are planned in the near future. Firstly, visual user interfaces need to be coupled together. A comprehensive "master" map is one possibility, but this has the disadvantage of requiring centralized control and/or configuration control. Another possibility is to develop a protocol such that a map can refer a user to other maps in appropriate circumstances. Such a protocol was developed a number of years ago in a system called Ingrid[*], developed by P. Francis at NTT Software Labs in Tokyo (see Guillaume [2000]). However, this work has since been reoriented.

Modern middleware tools may offer a solution: an information sharing bus which connects distributed information maps. It will be interesting to look at the advantages of CORBA (Common Object Request Broker Architecture) or, more likely, EJB (Enterprise Java Beans) for providing this interoperability infrastructure (Lunney & McCaughey [2000]).

A second development path is to note the clustering which is at the core of these visual user interfaces and to ask whether it can be further enhanced to facilitate the construction of query and response agents. It is clear to anyone who uses Internet search engines such as AltaVista, Lycos, etc. that clustering of results is very desirable. A good example of such clustering of search results in practice is the Ask Jeeves search engine[*]. Its query interface also accepts natural language, which is a further advantage.




Copyright The European Southern Observatory (ESO)