

GeoJournal 26.3 323-334
© 1992 (March)

Access and Assimilation:

Pivotal Environmental Information Challenges -
Linking, Archiving, and Exploiting Multi-Lingual and
Multi-Scale Environmental Information Repositories
 

Benking, Heiner; Kampffmeyer, Ulrich, B., Dr.,
Project Consult, Isestr. 63, 2000 Hamburg 13. Germany

 

ABSTRACT: The world-wide environmental exchange of information is hampered by difficulties bridging terminologies and languages so that trans-cultural, cross-sectoral, and multi-dimensional information can be made available in a usable form at low cost.
The alternative concepts and elements described are planned to contribute to HEMIS, UNEP-HEM's world-wide information and reference system. The potential content includes: environmentally relevant organizations, activities, systems, methods, data sources, qualities, and access methods.
The key design element is a multi-lingual thesaurus, which has a harmonizing effect resulting from the structuring of, and strategic access to, information. Documentation and retrieval are a further concern, besides the provision of tools to ease identification, dissemination, and validation of information. The maintenance of validated source data and documents will reduce the volumes of requested information and thereby counteract the present information overkill.
The views expressed in this paper are personal and are not intended to anticipate decisions or commitments.

 
Introduction

The GeoJournal trilogy "Enhancing the Credibility of Ecology" (di Castri et al. 1988) requested "interaction along and across hierarchical scales" and the development of appropriate information exchange and the use of communication technologies to support interdisciplinary cooperation.

The profile of a community depends on the quality and intensity of its communication. The lack of credibility of ecology noted above may therefore be explained by the missing identity and forum in society for sharing and evaluating available information, besides the inhomogeneity caused by the variety of disciplines, terminologies, and languages involved. Conventional processing and management approaches provide no means to structure and handle the extremely heterogeneous subject "Life and Earth". Nevertheless, the scientific community and various institutions have repeatedly requested coordination of the international agenda and harmonization of environmental research activities to allow exchange, comparison and simulation, and thereby secure the widest possible use of available information (BMFT 1985; EEES 1986/87; UNEP 1987; Carter et al. 1990).

Some catalytic function is expected from well-applied communication technologies. Di Castri et al. (1988) request linkage of the efforts of researchers and expect "new opportunities comparable to the introduction of the microscope in the sixteenth century". But a clear distinction must be made between overwhelming "freely available data" (Freedom of Information Act) and assimilated "contagious" information (v. Weizsäcker 1987). One first step towards bridging and linking original data is describing origin, time, context, use, quality, and relevance in a comparable format. Descriptors might be used to structure information and combine sources from different disciplines.

The authors believe that the conclusion drawn by Hulpke (1991) for the field of analytical chemistry, that "relevant and competent interpretation are impossible without access to the original data", also applies to the whole body of environmental data.

The challenge addressed in this article is seen in increasing the credibility of ecology by increasing information exchange, and the availability and usability of information, by developing tools to identify undetected areas of missing information or areas of inadequate or faulty information.

Fig 1 Abstraction Niveaus
Diffuse boundaries exist between the above abstraction levels. Abstraction is reached by inducing knowledge and individual perspectives/perceptions in the typical aggregation and summarization processes. Such a process is irreversible and original data are lost.

The positive thinking of McNeill et al. (1991), "With global communications and ever greater access to information, people are now beginning to exercise responsibility for every part of the world", seems unrealistic if no effective access and selection technologies are implemented to select and distribute relevant information in time. Awareness, without appropriate means to handle the information overflow, might instead result in violence, negligence and apathy.

In the era of "meshing the world's economy and earth's ecology", environmental research and management has to leave the ivory tower of geo/biosciences and make use of the tools available in information analysis and management (Ahituv et al. 1981) or archiving, socio-economic databases, and marketing research in respect to their utilization for inhomogeneous and qualitative information. Modern systems will ease, but not solve, the problems ahead.

The general problem of underuse of data, resulting from a lack of genuine comparability and compatibility or from missing availability, will be addressed; misuse of data is only lightly touched upon.

The article describes requirements for structuring life and earth data and presents alternative design considerations and modules for their management in environmental information and referral systems with a focus on original and quality information. The design concept which is presented can and will not preclude a final system which depends solely on the requirements of a very heterogeneous user community and evolving external factors and cooperations.

 

The Problem Analysis
Complexity and Volume of Information

The world is a complex system, which can only be rudimentarily described by cybernetic models. Scope and quality of statements depend on the image and knowledge of the world and are created according to the information available. Environmental research and management concerns all parts of the world. The possible volume of information is prohibitive and obscures a general view (Benking 1990a). There are no conventional methods available for accessing the information or appropriately identifying gaps or redundancies in the distributed information repositories.

The original information and context are lost after transferral to various abstraction levels (Fig 1) and aggregation niveaus, but the need for their retrieval, or even consultation to study the local factors and circumstances, is undisputed. Going back to the original data by having the experts interview the local specialists was called "softcoupling" by Grossmann (1983, 1989). But this cannot be a viable approach for regional and global issues.

One possible approach to address the above problem might be the preservation and distribution of primary, original, contextual information or, using modern terminology, "metainformation" (information about information) which explains and documents the original data. Modern multi-media information systems can store and later disseminate the backgrounds and circumstances and thereby complement the tendency towards summarization, aggregation, and abstraction. In this respect modern information technology provides means not only to verify specialization on a statistical basis and to reverse generalization, but also to assure the use of "source proved" original data instead of second-hand information.
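
As a minimal sketch of what such a metainformation record could look like, the following Python fragment attaches the attributes named in the Introduction (origin, time, context, use, quality, relevance) to each data set and follows the links from an aggregated data set back to its originals. The class name `MetaRecord`, the function `originals`, and all sample values are illustrative assumptions and not part of the HEMIS design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MetaRecord:
    """Hypothetical 'information about information' for one data set."""
    dataset_id: str
    origin: str      # who collected the data, and where
    time: str        # observation period
    context: str     # circumstances of collection
    use: str         # intended use
    quality: str     # free-text quality statement
    relevance: str   # scope for which the data are relevant
    derived_from: List[str] = field(default_factory=list)  # ids of source data sets


def originals(dataset_id: str, catalogue: Dict[str, MetaRecord]) -> Set[str]:
    """Follow derived_from links back to the original (non-derived) data sets."""
    record = catalogue[dataset_id]
    if not record.derived_from:
        return {dataset_id}
    found: Set[str] = set()
    for parent_id in record.derived_from:
        found |= originals(parent_id, catalogue)
    return found


# Invented sample records: two original series and one aggregate built from them.
records = [
    MetaRecord("soil-001", "field station A", "1989-06", "weekly sampling",
               "monitoring", "validated", "local"),
    MetaRecord("soil-002", "field station B", "1989-06", "weekly sampling",
               "monitoring", "unvalidated", "local"),
    MetaRecord("soil-avg", "regional office", "1989", "annual mean of station data",
               "reporting", "aggregated", "regional", ["soil-001", "soil-002"]),
]
catalogue = {r.dataset_id: r for r in records}

print(sorted(originals("soil-avg", catalogue)))  # ['soil-001', 'soil-002']
```

Keeping the `derived_from` links alongside the descriptive fields is one way to make an aggregation step traceable even though the aggregation itself is irreversible.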

Jeffers (1978) stated that "large central databases lead to disaster in understanding, if the source, definition, and meaning of the data is not known". To avoid this result the authors proposed a system design which can incorporate data repositories in different languages and terminologies and combine them into a single common multi-dimensional descriptor-driven retrieval structure (Benking 1990b; Kampffmeyer 1990).

Specialization vs. Generalization - Segregation vs. Integration
The questionable value of single data was described by J. L. Lions (1991) as "Models without data have no predictive value, data without models will only bring confusion". Abstraction, summarization, and aggregation are indispensable for the development of expertise and its implementation.
 

The Project - Background and Outline

The Harmonization of Environmental Measurement project - HEM - was established by UNEP under the auspices of the Global Environmental Monitoring System (GEMS) as a part of EARTHWATCH (Keune et al. 1991a). The Environmental Experts of the Economic Summit - EEES - (G7) made recommendations for the improvement and harmonization of techniques and practices for environmental measurements. The UNEP Governing Council adopted and the German National Environmental Advisors - SRU - recommended the establishment of the above Centre (EEES 1987).

One activity area of the HEM office is concerned with the collection and dissemination of information about data (Fig 2) and the implementation (Keune et al. 1991b) of a Global Environmental Meta-database/Information System (HEMIS) capable of referencing, navigating, and linking multicompartmental environmental information repositories.

Optimal use of collected data requires that information on its existence is available, that it can be accessed and - most important of all - that the data is compiled and classified in ways which are compatible. Achieving this is the basic aim of harmonization of environmental measurement. Although great care is generally taken to harmonize data collected within programmes, harmonization between programmes remains a major goal for the future.

Given the sheer volume of data sources and the lack of the descriptive information necessary to assess data-relevance and data-quality, it is clear that a system which includes Global Information Systems, Global, Regional, and National Data Centres, and Sectorial Data Centres must also address terminological and linguistic issues.

UNEP appointed a group of international experts (Crain 1990) to advise on and study the proposals and to recommend implementation procedures (UNEP-GEMS Report 8, 1991). No other group was identified which currently records coordinated information on environmental databases, programmes and analytical methods worldwide.



  
INTEGRATED MONITORING:
"Integrated monitoring is defined as the repeated measurement of a range of related environmental variables or indicators in the living and non-living compartments of the environment, and the investigation of the transfer of substances or energy from one compartment to another. Monitoring becomes truly integrated when the measurements or different variables in different compartments are co-ordinated in time and space to provide a comprehensive picture of the system under study." (Wiersma 1990)

CROSS-MEDIA MONITORING:
For the purpose of the Survey, Cross-Media monitoring is taken to occur when several living and non-living environmental variables or indicators are monitored at the same site.


Fig 2: Examples of Tables from UNEP-HEM's "Survey of Environmental Monitoring & Information Management Programmes of International Organizations". The cross referencing of areas, activities, and participants, and the description of individual programmes and organizations, provides insights into possible areas of research, cooperation, and management. (Source: UNEP-HEM Survey 1991)
 


Fig 3 "Blackbox Natur" (Zauberwürfel der Ökologie)
The Rubik's Cube of Ecology is an exhibit, a physical box with the dimensions subject/discipline, scale/size, and time. It was built to demonstrate the "Challenge of Ecology", which according to di Castri et al. (1988) is the "interaction along and across hierarchical scales". One intent was the promotion of advanced visualization, documentation, linking, and retrieval technologies, which can create "Hyper Frames" or "Hyper Images" and might help to navigate in the given scales. The exhibit is part of the "GLOBAL CHANGE" touring exhibition which was opened in May 1990 in the German Federal Chancellery in Bonn.
 



INTEGRATED MONITORING AND MODELLING FOR ENVIRONMENTAL RESEARCH AND MANAGEMENT
 

Fig 4 Integrated monitoring and modelling bridging measurement, theory and validation
(Copy of Poster done by Heiner Benking on behalf of UNEP-HEM for the LOCAL & GLOBAL CHANGE exhibition at Geotechnica 1991, Cologne)

There are several research priorities and areas for integration that are prerequisites for an integrated representation of terrestrial ecosystems:

  • harmonization of classification schemes, definitions and measurements
  • hierarchical Geographic Information Systems (GIS) to handle and transfer data on different scales: resolution levels
  • concurrent management and comparison of landscape or biota dynamics (handling different time slices of one scene; incorporation of the concept of dynamic maps)
  • incorporation of the altitude and height of objects in real world coordinates
  • integration of various coded and non-coded data formats (alphanumeric, vector, and pixel raster)
  • real time feedback of data-base content into mission control and data acquisition and data verification strategies (groundtruth)

Based on an initial awareness of research and pilot projects in these areas, informal consultations were undertaken to ascertain how the integration and the concept of harmonization could be promoted. MBB's MONITORING 2000 Study played an initial role in regard to feasibility considerations.

It is widely thought that object-catalogues, defined over specific scaling ranges, are the best vehicle with which to transfer structural information across a hierarchy of scales. Conceptual "meeting points" like the atmospheric sciences platform have been defined, but it is indispensable to agree upon object definitions and classification schemes early.

The question is thus how technically and logically the contents of one scale range can be transferred or superimposed onto other grids.
Inter-ecosystem comparison and research depends on a variety of processes and phenomena that occur along a hierarchy of physical scales. Understanding the dynamical processes involved and the scales at which they operate, allows those patterns and structures observed in the biosphere to be more clearly defined.
The key to unlocking such complexity lies in the ability to bridge gaps between measurement, theory and validation. This would in turn provide further understanding and support for research and field-work.
Harmonization can therefore play a critical role in the integration of the environmental and ecological sciences and propel pragmatic integrated approaches.

PARTNERS:

  • Centre for Harmonization of Environmental Measurement (HEM)
  • United Nations Environment Programme (UNEP)
  • Forschungsverbund Agrarökosysteme München (FAM)
  • GSF Forschungszentrum für Umwelt und Gesundheit GmbH
  • Messerschmitt-Bölkow-Blohm GmbH
  • Bayerisches Staatsministerium für Landesentwicklung und Umweltfragen
  • Forschungsstelle Ökosystemforschung und Umweltstudien
  • Österreichische Akademie der Wissenschaften
  • Arbeitsgruppe Theoretische Ökologie Forschungszentrum Jülich GmbH

  



 
Scales and Platforms

Objects, objectives, technologies and terminologies are scale related. Definition of hierarchical exchange platforms should be continued not only because of the harmonization effect of object catalogues, but also because of the ability to bridge gaps between measurement, theory, and validation (Benking et al. 1991) (Fig 4). Noting object granularity or resolution, including variance, will ease the transfer between platforms or niveaus. Interaction along and across hierarchical scales (Fig 3), and panning and zooming into repositories, require information management: the storage of raw data, refined data and ancillary data in one structured and fault-tolerant documentation system.

Policy makers are in an awkward situation in regard to the "Freedom of Information Act" distribution policies. More information may in fact be less. The rising demand for "informed discussions" and "educated guesses" asks for expertise in conjunction with first hand information and orientation.

 
Visualization and Quality of Information

Man is image-oriented, aware of changing values of information and therefore requests the "whole picture". A. Clo-Arceduc (1965) requested in "Photographie Aerienne" the use of reliable source material, like aerial imagery, to counteract uncontrolled knowledge and allegations. Optical and analog devices have today found their counterpart in Optical Information Systems which can preserve original data, counteract false pretence, and establish reliable environmental data (Kampffmeyer et al. 1989; Benking et al. 1989). Information is sensitive to interpretation, manipulations, and omissions, and much credibility has been lost already due to biased or faulty information.

I. Jarrett (1964), a computer pioneer, proposed an ethics code to sanction any local or regional manipulation of datafields or views. Even more critical is the fusion and merging of information with modern system technology. Credibility is at stake and the concept of originality is endangered, with photorealistic artificial images (virtual reality) preparing the ground for manipulations and demagogic drives (Benking 1988): "CAD" = Computer-aided Demagogy. This is another good reason to store originals like documents only on write-protected media, and to develop some form of storage or labeling early.

Last but not least, information needs to be trusted, requested, found, understood, and assimilated at the right time to be of any value at all.

The following proposed option to assess quality, sources, completeness, and circumstances together with the data, in a user-friendly and logical way might create some countereffect to the above tendencies and help build up expertise.



What is Hypertext ?

Hypertext programmes allow users to branch directly within applications from one term highlighted on a computer screen to other terms, images, graphics or annotations.
 

What are Hyperlinks ?

Such a connection between the above objects is called Hyperlink.

With such Hyperlinks knowledge-bases can be built which are not restricted by the limitations of relational, hierarchical or index-sequential databases. Hyperlinks can be used for text databases, documentation, georeference, and management information systems.
 

What is a Thesaurus ?

According to the dictionary a thesaurus is a store of words, of synonyms and antonyms, which can be used as a categorized index of terms for use in information retrieval.

Thesauri are tools used in terminology in order to translate the user's language into an artificial but agreed-upon, binding, and more restrictive language. A thesaurus is a hierarchical succession in a scientific representation with normative character. This secures unequivocal access even if the information is unknown.

Some databases today incorporate thesaurus structures which allow users to branch from selection lists to underlying lists. They typically follow the hierarchical ISO-Norm approach. In addition, modern thesauri, similar to hypertext, allow direct hierarchy-independent links. The state of dependence is presented in a network structure. Electronic thesauri may include means like truncated search, lemmatization, matching of synonyms, acronyms, and homonyms, and other methods which ease and simplify utilization.
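
Two of the aids mentioned above, matching of synonyms or acronyms and truncated search, can be sketched in a few lines of Python. The terms in `THESAURUS` and the function names are invented for illustration; lemmatization and homonym handling are deliberately left out of this sketch.

```python
from typing import List, Optional

# Hypothetical mini-thesaurus: preferred term -> synonyms and acronyms.
THESAURUS = {
    "air pollution": ["atmospheric pollution", "air contamination"],
    "geographic information system": ["GIS"],
    "groundwater": ["ground water", "subsurface water"],
}


def preferred_term(query: str) -> Optional[str]:
    """Map a term, synonym, or acronym to its preferred term (case-insensitive)."""
    q = query.strip().lower()
    for term, variants in THESAURUS.items():
        if q == term or q in (v.lower() for v in variants):
            return term
    return None


def truncated_search(pattern: str) -> List[str]:
    """Truncated search: 'ground*' matches every entry starting with 'ground'."""
    stem = pattern.rstrip("*").lower()
    hits = set()
    for term, variants in THESAURUS.items():
        if any(name.lower().startswith(stem) for name in [term, *variants]):
            hits.add(term)
    return sorted(hits)


print(preferred_term("GIS"))        # geographic information system
print(truncated_search("ground*"))  # ['groundwater']
```

Both functions return preferred terms only, so whatever wording the user starts from, the subsequent retrieval step works with the agreed vocabulary.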

Multilingual thesauri are essential, so that searching documents indexed in various languages does not depend on specific knowledge and use of a prevailing language. Whenever possible, indexers and users should have the opportunity of working in their mother tongue or, at least, in a familiar language. In the same way, multilingual thesauri might also be considered as playing an important part in improving control of literature and exploring subjects.



 

Information Management Tools

Data Collections

The availability of 4000 on-line databases plus a growing number of CD-ROM databases does not secure a transparent answer to a specific question. Standards and interchange formats will help to reduce redundancies and to organize objects and entities. But what happens if the content of a subject like the environment is not clearly outlined, or the request is vague?

Recently, intermediary systems have been proposed and developed to provide integrated views of existing heterogeneous repositories. We can see them as translators, but much knowledge has to be induced and they can only be as good as the dictionaries and sourcebooks. This is another reason to invest in terminology and thesauri.

 
Databases for Fact and Reference Retrieval

Conventional (incl. full-text) databases contain entities with a specific semantic homogeneity. Problems arise when the inhomogeneity of a subject like the environment does not allow satisfactory attributable data-models, or if inconsistent results are caused by semantic or idiomatic turns, peculiarities, or oddities like synonyms or homonyms.

The diversity and heterogeneity of environmental data objects prohibit rigid categories or schemata. Additional requirements may be multiple entries, large numbers of interrelated and interlocked fields, a broad variety of field lengths, and a combination of coded and non-coded content. Meta-information of the objects might also need to be contained in the system and requires additional attention, as do units, scales, intervals, and reference systems, as well as all semantic and terminological inconsistencies. Automatic conversion of measurements is possible, but the transfer of qualitative and distinct features requires context-related and knowledge-based transfers. Another major requirement for broad acceptance is the support of search profiles, guided tours, and tactical proposals to define the focus in order to reduce or extend the search argument. Furthermore, generic thesauri, attribute relations, and statistical combinations will help to encircle and identify topics and issues.

Already today the available multi-media information systems contain functions to handle some of the above requirements. "Hyperimages or Frames" (Fig 3) were proposed to help bridge scales, subjects, and times. Kampffmeyer (1990, 1991a) suggested facetted thesauri to manage sections and segments of the above scales including their terminology (Fig 6).

The differentiation of fact and reference databases in literature is questionable. Systems are becoming available to combine hypertext features, descriptor fields, full-text search, and automatic lemmatization techniques (Choros 1981; Bozzi et al. 1991). For a detailed evaluation of descriptor versus full-text search strategies see Kampffmeyer (1991b).
 

The Multilingual Descriptor System MDS of Inventories on the Environment

Origin of the MDS-Thesaurus

A system of European Inventories on the Environment was initiated in 1975 by Decision 76/161/EEC of the Council of Ministers. The task of implementing this decision was entrusted by the Committee for Scientific Information and Documentation (CSTID) to a working group initially called the Environmental Protection Information Group (EPIG), but later renamed the Environmental Information Group (ENIG).

The decision of the Council required the setting up of multilingual inventories of which the most extensive was to cover environmental research programs carried out within the CEC member states. It soon became clear that the existence of six official languages (1975) for the European Community presented problems which required special attention. As a consequence, ENIG set up a small "ad hoc" working group to prepare a Multilingual Descriptors System dedicated to environmental inventories. The MDS group aimed:

1. to create a master-list of "descriptors" to be used for indexing environmental subjects
2. to provide translations of each descriptor in all community languages.

Sources of the MDS-Thesaurus

The MDS group began its work by merging into a single list all the descriptors used in the following national information systems dedicated to environmental research:

  • Umweltforschungskatalog (UFOKAT), issued by the Umweltbundesamt (Environmental Federal Office) in Germany.
  • Register of Research and Surveys produced by the Departments of Environment and Transport in the United Kingdom.
  • Inventory of Environmental Research in the Netherlands, produced by the Study and Information Centre TNO on environmental research (SCMO-TNO).
  • INFOTERRA, the international referral system for sources of environmental information of the United Nations Environment Program (UNEP).

The integrated index list comprised about 2500 terms and was first translated into Dutch, French, German, and English so as to provide a draft document to form the basis for discussions. It soon appeared that the overly diversified sources of the project resulted in a list which was too large and confusing and which often proved to be too detailed for the study of a specific subject. In addition, the list contained some concepts which were known in one language but could not be understood in another. Descriptors were thus grouped in subjects related either to environmental topics or matters in order to simplify discussions about the list and its publishing. The resulting filing system proved to be most useful and was maintained in the MDS experimental issue (1983).

At present, only the words originating from the original national systems are included in the list, but other descriptors and non-descriptors are expected to be added on the basis of the experience acquired when

  • indexing records in CEC environmental inventories,
  • searching for information from the inventories through ECHO, the European Community server, and various lists printed and published from the inventories.

The MDS group broke up several years ago and its projects ENREP (Environmental Research Projects) and ENDOC (Environmental Documentation) were stopped in early 1990. Since that time the thesaurus has been developed in two divergent directions; by TNO in the Netherlands and by CNR in Italy.

The MDS thesaurus has recently been implemented into the ISIS software of UNESCO.
 


INFOterra

Thesaurus of Environmental Terms

INFOterra is the International Environmental Information System of the United Nations Environment Programme - UNEP - with government-designated offices in 140 countries. INFOterra is part of EARTHWATCH, which comprises also the Global Environmental Monitoring System (GEMS), the International Register of Potentially Toxic Chemicals (IRPTC), and the State of the Environment Report (SOE).

The third edition of the Thesaurus of Environmental Terms from 1990 reflects substantial changes which took place since the last edition in 1984. It retains the established format, with changes consisting mostly of amendments and additions to the original list of terms and improvements in the code system and the relational structure. The Thesaurus is intended to identify sources of expertise and, with the third edition, to be used as a tool for indexing and cataloguing documents and other materials relevant to the environment.

INFOterra works in four of six official UN languages (English, French, Spanish and Russian). Chinese and Arabic have so far not been adapted.

 The 1990 INFOterra Thesaurus consists of the following lists:

  1. Outline of Categories and Sub-categories,
  2. List of Terms in Code Sequence,
  3. Categorized List of Terms,
  4. Alphabetical List of Terms.

  


 The Proposed Solution

The Concept

The design of HEMIS is governed by the principles of acceptance, low cost for wide dissemination, the envisioned harmonization effect, and an open system approach to make use of available expertise and modules (Benking 1990; Kampffmeyer 1990; Crain 1990). The final configuration is only secondary and has to follow, besides the above criteria, the mission of the HEM office and the requirements of the core user-community, such as:

 

  • providing a low-cost platform for the widest possible community
  • harmonization of terminology and approaches - easy to use
  • structuring and describing the subject "Environment"
  • through integration and segregation identifying expertise which can be more easily compared and linked with other bodies of knowledge
  • linking the slices for languages or fields/areas to contribute to efforts of networking the international community
  • standardization of access

 

Objectives

One of the main targets of UNEP-HEM is the standardization and harmonization of nomenclature, measurements, and other information for the sake of more efficient environmental research and management. To fulfill this task, information to be harmonized has to be analysed and distributed to all relevant institutions. The harmonization process is already started by having third parties evaluate, contribute, and make amendments to the distributed materials. This has an accelerating effect after agreement on scope and content is reached. UNEP has a primarily catalytic function and will only contribute to and combine the findings and efforts of partner and sister organizations (UNEP-GEMS 1991).

 

Harmonization of Information

There is no chance to harmonize, structure, and standardize the nomenclature and data already existing in, for example, the natural, social, and ecological sciences into one common body of environmental science, since they exist in thousands of locations, hundreds of discipline-oriented terminologies, and tens of languages. The only chance is on the user side in combination with a strong application orientation, thereby allowing easy access to information of different sources, structures, and qualities by standardized access methods and having the mainstream develop itself without regulatory jurisdiction and enforcement. In any case, this is the fastest route to adoption. A neutral, flexible, and extendable facetted thesaurus is therefore the main aspect of the proposed information and reference base (Fig 5).

 

System Configuration

The technical basis for the end-user is, according to the expert-group recommendations, a standard low-cost computer (PC). The system design specifications request an open system approach, not only to serve the internal user, but also to distribute the information world-wide. This might lead to the external use of CD-ROM media, which have enough capacity to distribute high-volume multi-media information.

The proposed concept allows central networked processing as well as local off-line access. The developments require broad acceptance and flexibility according to the user requirements of a very broad community. "One stop shopping" might be ideal for searching for information and depends strongly on developments in telematics. For research and analysis it is considered necessary to download from existing databases, and have locally available, voluminous repositories containing original data. The design has to be flexible, depending on spatial and temporal scope and analysis functionality. Present design outlines request a low-cost, add-on, and easy-to-use operation which is possible at any time and anywhere in the world. Present financial guidelines do not allow an on-line or networked operation.


Fig 5 The Database.

An object-oriented reference system
The proposed system includes different databases and an information management system for the media used (object access database). The databases themselves will be a relational program system available as a standard product. The stored items of information (data set, text file, image etc.) are objects (referred to as documents), which are linked via the unique document identifier with the descriptors.
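
The link between documents and descriptors via a unique document identifier can be illustrated with a small relational sketch. It uses Python's standard `sqlite3` module purely as a stand-in for the unspecified standard relational product; the table names, column names, storage locations, and sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document   (doc_id INTEGER PRIMARY KEY, kind TEXT, location TEXT);
    CREATE TABLE descriptor (desc_id INTEGER PRIMARY KEY, term TEXT UNIQUE);
    -- Link table: one row per (document, descriptor) pair.
    CREATE TABLE doc_descriptor (
        doc_id  INTEGER REFERENCES document(doc_id),
        desc_id INTEGER REFERENCES descriptor(desc_id),
        PRIMARY KEY (doc_id, desc_id)
    );
""")

# Invented sample objects: a data set, an image, and a text file.
conn.executemany("INSERT INTO document VALUES (?, ?, ?)", [
    (1, "data set", "cdrom:/data/so2_1989.dat"),
    (2, "image", "cdrom:/img/forest_1990.tif"),
    (3, "text file", "cdrom:/txt/method_17.txt"),
])
conn.executemany("INSERT INTO descriptor VALUES (?, ?)", [
    (10, "air quality"), (11, "forest damage"), (12, "measurement method"),
])
conn.executemany("INSERT INTO doc_descriptor VALUES (?, ?)", [
    (1, 10), (2, 11), (3, 10), (3, 12),
])


def documents_for(term):
    """Return (doc_id, kind, location) for every document indexed with `term`."""
    return conn.execute(
        """SELECT d.doc_id, d.kind, d.location
           FROM document d
           JOIN doc_descriptor dd ON dd.doc_id = d.doc_id
           JOIN descriptor s ON s.desc_id = dd.desc_id
           WHERE s.term = ?
           ORDER BY d.doc_id""",
        (term,),
    ).fetchall()


print([row[0] for row in documents_for("air quality")])  # [1, 3]
```

Because the objects themselves stay on their storage medium and only identifiers and descriptors are held in the tables, the same scheme works for data sets, texts, and images alike.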


Fig 6 "Slice" structure of a multi-lingual hypertext thesaurus.

Each line in every slice represents one main keyword. Each keyword refers to one unique identifier (ID). The IDs are used for access of the documents via a database. Each keyword may have several predecessors on a higher, same, or lower hierarchical level. The use of several predecessors and successors allows the creation of a multidimensional network of relations. The position in the hierarchy is only for display purposes in the form of a standard hierarchical thesaurus. A list of secondary keywords, synonyms, interpretations, homonyms, acronyms, etc. can be added to each keyword. These items can be retrieved by a "global search" and lead to the main keyword. An explanation of the meaning and usage of the keyword can be added as help text.
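
The slice structure described in this caption can be sketched as a small data structure. In the Python fragment below each keyword carries one ID, one label per language slice, any number of predecessors, a list of secondary keywords, and a help text. The class `Keyword`, the sample terms, and the helper functions are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Keyword:
    id: int
    labels: Dict[str, str]                                  # one label per language slice
    predecessors: List[int] = field(default_factory=list)   # may be several
    secondary: List[str] = field(default_factory=list)      # synonyms, acronyms, ...
    help_text: str = ""


# Invented sample entries with an English and a German slice.
KEYWORDS = {
    1: Keyword(1, {"en": "environment", "de": "Umwelt"}),
    2: Keyword(2, {"en": "water", "de": "Wasser"}, predecessors=[1]),
    3: Keyword(3, {"en": "pollution", "de": "Verschmutzung"}, predecessors=[1]),
    4: Keyword(4, {"en": "water pollution", "de": "Wasserverschmutzung"},
               predecessors=[2, 3],
               secondary=["water contamination", "Gewässerbelastung"],
               help_text="Contamination of surface water or groundwater."),
}


def global_search(text: str) -> Optional[int]:
    """Find the main keyword ID for a label or secondary keyword in any slice."""
    needle = text.strip().lower()
    for kw in KEYWORDS.values():
        names = list(kw.labels.values()) + kw.secondary
        if needle in (name.lower() for name in names):
            return kw.id
    return None


def successors(keyword_id: int) -> List[int]:
    """All keywords that list `keyword_id` among their predecessors."""
    return sorted(k.id for k in KEYWORDS.values() if keyword_id in k.predecessors)


print(global_search("Gewässerbelastung"))  # 4
print(successors(1))                       # [2, 3]
```

Keyword 4 has two predecessors, so it is reachable both from "water" and from "pollution"; this is what turns the displayed hierarchy into a network of relations.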
 

Contents

The system will only contain meta-information. One aspect is the combination of all information available through partner organizations: computerized coded datafiles, and scanned text or images (facsimiles) where coded information is not available or appropriate.

Possible data-holdings include:

  • directories and access information
  • institutions and programs
  • excerpts from publications, maps, graphics, and associated explanatory information
  • standardization and quality regulations for environmental measurements.

 
Another requested content is the ability to locate objects in space and time for cross-media research. The functionality and analytic capabilities depend on the referencing system and the detail maintained with area-codes or world coordinates.
 

Descriptor-oriented Search and Hypertext Functionality

Information is collected centrally, and a harmonization process is initiated with the help of a thesaurus program. The thesaurus is the main tool for harmonization and standardization. Such a tool allows the use of descriptors and keywords that do not appear in the harmonized thesaurus, and the retrieval of information in different languages. Descriptors of other thesauri, translations, synonyms, homonyms, and antonyms may be added. Incorporating systems and de facto standards like INFOTERRA or the NASA Master Directory poses no problem. These directories may be integrated as special "slices" (Fig 6) or via the additional keywords in the synonym field. Modern information processing allows such descriptors to be processed in parallel (Fig 7). The thesaurus includes an internal translator function and ensures that the requested information is found even when it is not associated directly with the search argument.

Besides conventional descriptor-oriented search mechanisms, it is essential to be able to retrieve with a global query field independent of the hierarchically organized descriptors. Associated with the descriptor, fields of one or more hierarchical multi-dimensional thesauri may be used. They are internally organized in a network structure. Hypertext features are an additional method of retrieval; they allow links between different datasets and documents as well as guided tours to bridge issues and topics.

 
Using the THESAURUS for retrieval purposes
 


 
Fig 7 Using independent database modules speeds up the system

 
Creation and Structure of the Thesaurus

The thesaurus program includes a tool for thesaurus definition, which allows a hierarchical thesaurus structure to be created on the computer screen with multiple selection lists. The fields themselves are organized in a relational system with no restrictions on the hierarchy. The thesaurus makes it possible to add translations, synonyms, homonyms, references to other descriptors, etc. to each descriptor displayed in the hierarchy.

The complete thesaurus is object-oriented and contains references to the documents stored (text, data, images, etc.). Each language slice can be translated separately. Descriptors may point to the same identifier. Identifiers are numbers, which take little memory space and are fast to retrieve. They also follow the grouping of information and allow easy linkage. The HEMIS prototype operates with one or several access languages. Access to various terminologies will help clarify concepts and foster interdisciplinary cooperation. It should be noted that language is only one dimension of a multi-dimensional thesaurus.
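
How descriptors in different language slices can point to the same identifier may be illustrated as follows. This is a minimal sketch under assumptions: the dictionary layout, the function names `resolve`, `translate`, and `retrieve`, and all sample terms and document names are hypothetical.

```python
# Hypothetical language slices: language code -> {descriptor -> identifier}.
# Descriptors in different slices point to the same numeric identifier.
slices = {
    "en": {"water": 2, "soil": 5},
    "de": {"wasser": 2, "boden": 5},
    "fr": {"eau": 2, "sol": 5},
}

# Invented document references, keyed by identifier rather than by any word.
documents_by_id = {2: ["doc-17", "doc-42"], 5: ["doc-08"]}


def resolve(term, language=None):
    """Map a descriptor to its identifier, in one slice or across all slices."""
    key = term.casefold()
    languages = [language] if language else list(slices)
    for lang in languages:
        ident = slices.get(lang, {}).get(key)
        if ident is not None:
            return ident
    return None


def translate(term, source, target):
    """Translate a descriptor by going through the shared identifier."""
    ident = resolve(term, source)
    if ident is None:
        return None
    for descriptor, other in slices.get(target, {}).items():
        if other == ident:
            return descriptor
    return None


def retrieve(term, language=None):
    """Return the documents attached to the identifier behind a descriptor."""
    return documents_by_id.get(resolve(term, language), [])


print(retrieve("Wasser", "de"))
print(translate("eau", "fr", "de"))
```

Since documents hang on the identifier, a query in any access language reaches the same documents, and adding a further language only requires a further slice. Homonyms across languages are not handled in this sketch; a search without a stated language simply returns the first slice that matches.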

 

Operation

By the use of an ergonomic, easy-to-use graphic user interface (which is already applied in various optical information system applications), users can retrieve information without prior knowledge of the contents and structure of the information base. The following three ways of access are possible and may be used in conjunction, via:

 

  • keywords, or the first characters of a keyword with truncation and wildcards, entered in a universal retrieval field which searches through the complete thesaurus
  • descriptors, running through the displayed thesaurus and selecting an entry (which very quickly retrieves all information for this entry)
  • guided tours, following the links from one document to the other. Guided tours are instructions, references and pointers to other contextual information (see Hyperlinks), which lead the user from one topic to another in predefined ways.
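
The three access paths listed above can be sketched side by side. The following is an illustration under assumptions only: the function names, the use of shell-style wildcards from Python's `fnmatch` module, and all sample keywords, documents, and tours are invented for this example.

```python
from fnmatch import fnmatchcase

# Invented sample data, for illustration only.
thesaurus = {"air quality": 10, "air pollution": 11, "water quality": 20}
documents = {10: ["doc-1"], 11: ["doc-2", "doc-3"], 20: ["doc-4"]}
tours = {"acid rain": ["doc-2", "doc-4", "doc-1"]}  # predefined reading order


def search_keywords(pattern):
    """Universal retrieval field: '*' truncates, '?' stands for one character."""
    wanted = pattern.casefold()
    return sorted(k for k in thesaurus if fnmatchcase(k, wanted))


def select_descriptor(keyword):
    """Selection of a displayed thesaurus entry: all documents for that entry."""
    return documents.get(thesaurus.get(keyword), [])


def follow_tour(name):
    """Guided tour: yield documents one by one in their predefined order."""
    yield from tours.get(name, [])


# The three paths may be combined: search, pick an entry, then follow a tour.
for keyword in search_keywords("air*"):
    print(keyword, select_descriptor(keyword))
print(list(follow_tour("acid rain")))
```

The sketch keeps the tour as a plain ordered list of document references; a fuller design would presumably attach the instructions and pointers mentioned above to each step.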

 

Conclusions

Information goes through many hands and, today, through many systems and media. Without the promotion of a (virtual) label containing origin and the original purpose, the use of information will remain uncontrolled and be subject to growing suspicion. The question of how best to eliminate the alterations and redundancies which form the major part of the huge body of information, at the same time preserving the original data and first-hand information, has not so far been addressed and might only be tackled with information technology. The prevailing quantitative argumentation could then be reduced and original, more comprehensive, qualitative information be used.

To model at least some of the complexities of the environment, the linking and archiving of relations and contexts might prove helpful. Guided tours will help to digest and assimilate the above complexities and maintain the original line of thought. A new willingness and curiosity, a spirit of play, might develop, resulting from access to original data and the bridging of subjects.

The proposed system provides navigation and evaluation aids, not primarily factual data. Orientation is indispensable for assessment and application and should help to reduce the dissonance between information, knowledge, and expertise.

No way is seen to enforce standards, but mainstream, easy-to-follow, efficient guidance will be appreciated and will, besides, have a harmonization effect.

To be sure, the preliminary prototype described in this article is not the urgently needed "well conceived directory, catalog and inquiry system" (Data Management 1991; Environmental Information Statement 1991). The whole subject is considered much too heterogeneous, but the design could contribute to ongoing considerations, like international "Institutional Measures" (Hansen 1991), the European CORINE catalog of data-sources (CDS and MDS), Earth and Space Science Information Systems ISY-ESSIS (Benking et al. 1992), in addition to other available systems and sources in social, economic, ecological, biological, and geo sciences (Ashdown et al. 1990; ACCIS 1988).

 

References

ACCIS Guide to United Nations Information Sources on the Environment, United Nations, New York, 1988.

ACCIS Directory of United Nations Databases and Information Services; Advisory Committee for the Co-ordination of Information Systems, New York 1990.

Ahituv, N.; Munro, M. C.; Wand, Y.: The Value of Information in Information Analysis, Information & Management 4, 143-150 (1981)

Ashdown, M.; Schaller, J.: Geographic Information Systems and their Application in MAB Projects; Ecosystem Research and Environmental Monitoring, International Co-ordinating Council 11th Session, UNESCO Headquarters, Paris, November 1990.

Benking, H.: Möglichkeiten und Grenzen der Datenpräsentation im Umweltbereich. In: Jaeschke, A.; Page, B. (eds.), Computer Sciences for Environmental Protection. Informatik Fachberichte 170, 155-168 (1988)

Benking, H.; Lessing, H.: Large Scale Biomonitoring for Renaturation. In: GEOÖKODYNAMIK X, 2/3, 277-290 (1989)

Benking, H.; Schmidt von Braun, H.: Geo-/Object Coding for Local Change Assessment. GeoJournal 20, 2, 167-173 (1990)

Benking, H.: Information about Environmental Information: Introduction and Preliminary Design Considerations, Pre-Studies Discussion Paper for the 1. Expert Group Meeting, unpublished hand-out, UNEP-HEM, Febr.- July 1990.

Benking, H.; McGlade, J.; Grossmann, W. D.; Braedt, J.; Keil, K.-H.; Hantschel, R.: Integrated Monitoring and Modelling for Environmental Research and Management. In: Benking, H. (coord.), Local and Global Change Exhibition - Research and Co-ordination efforts of International Organizations, geotechnica, Cologne 1991.

Benking, H.; Kampffmeyer, U. B.: Harmonization of Environmental MetaInformation with a Thesaurus-Based Multi-Lingual Multi-Media Information System, ISY-ESSIS proceedings, Pasadena, February 1992

BMFT: Report of the Technology, Growth and Employment Working Group on further steps regarding "Improvement and Harmonization of Techniques and Practices of Environmental Measurements" to the Tokyo Economic Summit 1986, The Federal Minister of Research and Technology, Bonn, December 1985.

Bozzi, A.; Capelli, G.: Automatic Lemmatization of Latin Texts. In: Best, H.; Mochmann, E.; Thaller, M. (eds.), Computers in the Humanities and Social Sciences, pp. 373-378, K. G. Saur, München 1991.

Carter, G. C.; Diamondstone, B. I.: Directions for Internationally Compatible Environmental Data, Hemisphere Pub. Corp. for CODATA, New York, 1990.

Cogels, M.: MDS Thesaurus - A multilingual Descriptor System Prototype Description for the EEA-TF CORINE, under preparation

Choros, K.: Weighted descriptor indexing of documents, Aktualne Problemy Informacji i Documentacji 26, 22-26 (1981)

Castri, di, F.; Hadley, M.: Enhancing the Credibility of Ecology: Interaction along and across Hierarchical Scales, GeoJournal 17, 1, 3-35 (1988)

Crain, I. K.: A Meta-Database for Harmonization of Environmental Measurements, Discussion Paper, 1. Expert Group Meeting, UNEP-HEM, München 1990.

Data Management for Global Change Research, Policy Statements, Executive Office of the President, Office of Science and Technology Policy (1991)

Directory of Global and Regional Data Sets Supporting Global Change Research: Project Summary; National Geophysical Data Centre (NOAA/NESDIS/EGC 1). Boulder, April 1989.

Environmental Information Statement, International Forum for Environmental Information for the Twenty-First Century. Montreal 1991.

EEES Reports - Environmental Experts of the Economic Summit - Report on Current International Scientific Activities in Improvement and Harmonization of Techniques and Practices of Environmental Measurement. Report on Priority Areas for Improvement and Harmonization ..., Report Ensuring Continuing Progress in Improvement and Harmonization ..., GSF PFU Secretariat, Munich Meeting, December 1986. - Final Report including summit 1987 Declaration June 1987.

Grossmann, W. D.: Systems approaches towards complex systems. In: Messerli, P.; Strucki, E. (eds.), Fachbeiträge der schweizerischen MAB Informationen 19, Bundesamt für Umweltschutz, Bern, 1983.

Grossmann, W. D.: Model- and Strategy-driven Geographical Maps for Ecological Research and Management. In: Deutsches Nationalkommitee, Long-Term Ecological Research - A Global Perspective, UNESCO-MAB Mitteilungen 31, 43-61 (1989)

Gwynne, M. D.: Global Monitoring, Data Management and Assessment within GEMS and GRID: GEMS/PAC, Nairobi, November 1989.

Hansen, P.: International Action and Institutional Measures - New Approaches. Malente Symposium IX, The ECO-Nomic Revolution: Challenge and Opportunity for the 21st Century, hand-out, Dräger Foundation in Cooperation with UNCED, Malente, November 1991.

Hulpke, H.: The Analytical Challenge: Conversion Data to Results. In: Alfred Wegener Stiftung (Herausgeber), Handbook for geotechnica Trade Fair and Congress, pp. 111 - 112, Verlag Rüdiger Flock, Cologne 1991.

Jeffers, J. N. R.: An Introduction to Systems Analysis: with ecological applications. Arnold, London 1978

Kampffmeyer, U. B., Benking, H.: Die Gewinnung, Auswertung und Archivierung verläßlicher Umweltinformationen am Beispiel TOPOGRAMM. In: Jaeschke, A.; Page, B. (eds.), Informatik im Umweltschutz, Informatik Fachberichte 228 (1989)

Kampffmeyer, U.: Global Information Systems: Design, Classification and Multi-lingual Thesauri Development Issues and Concepts Related to the Task of Meta-Database Development. 1. Expert Group Meeting, handout unpublished, UNEP-HEM, July 1990.

Kampffmeyer, U., Benking, H., Keune, H., Theissen, A.: Discussion-paper for the 2. Expert-group Meeting, UNEP-HEM Meta-Database and Information System HEMIS, Chapter 3: Design of HEMIS, hand-out, unpublished, UNEP-HEM, September 1991.

Kampffmeyer, U: Deskriptoren versus Volltext, DGD/LID Tagung Frankfurt "Strategien für Optical Filing - Anwendungen im Pressearchiv", Frankfurt, November 1991b.

Kampffmeyer, U. B.: Von der Datenverarbeitung zur Informationsverarbeitung, GIGATREND, Series 1/91, 2/91, 3/91, Klaes Verlag, Düsseldorf 1991d

Keune, H.; Murray, A. B.; Benking, H.: Harmonization of Environmental Measurement, GeoJournal 23.3, 249-255 (1991)

Keune, H., Theisen, A.: Environmental Databases and Information Management Programmes of International Organizations. In: Hälker, M.; Jaeschke, A. (eds.), Computer Science for Environmental Protection, Informatik Fachberichte 546-553 (1991)

MacNeill, J.; Winsemius, P.; Yakushiji, T.: Beyond Interdependence: The Meshing of the World's Economy and the Earth's Ecology, Oxford University Press, 1991.

The State of the Environment; OECD, Paris, 1991.

UNEP Expert Meeting on Improvement and Harmonization of Environmental Measurement, UNEP-GEMS, Munich, December 1987.

UNEP-GEMS Report Series No. 8: Towards the Design for a Metadatabase for the Harmonization of Environmental Measurement, Report of the Expert Group Meeting, July 26-27, 1990, Nairobi, June 1991.

UNEP-HEM Survey of Environmental Monitoring & Information Management Programmes of International Organizations, UNEP-HEM office, 2nd edition, April 1991

Wiersma, G. B.: Feasibility Study of a Global Network of Integrated Monitoring Sites; Idaho National Engineering Laboratory, Idaho Falls, April 1990.

Weizsäcker, v., E. U.: Ganzheitlicher Umweltschutz. In: Jaeschke, A.; Page, B. (eds.), Informatikanwendungen im Umweltbereich, Informatik Fachberichte 170, pp. 1-7, Springer, Heidelberg 1988.

