Diagnostic and Evaluation Tools for Natural Language Applications - DiET
The DiET project has developed a comprehensive environment for the construction, annotation and maintenance of structured reference data for the diagnosis and evaluation of Natural Language Processing (NLP) applications, aimed at developers and professional evaluators. The system is based on an open client/server architecture. Users annotate test data by choosing from a set of annotation types, and the system provides edit, display and storage functions. The project also produced structured test data at the morphological, syntactic and discourse levels. The user can adapt the data to new domains through customisation functions. Text profiling establishes links between the controlled test items and related phenomena in domain-specific corpora. Lexical replacement allows users to adapt the vocabulary to their specific terminology. The tools also support the set-up of evaluation scenarios and the recording of test results.
The main goal is to develop the methods and tools for the glass box evaluation of NL components. This includes:
The technology used involves the construction, annotation and application of systematic NL test suites. Database and evaluation technologies, as well as statistical and corpus annotation methods, are based on the results of the EU-funded projects TSNLP, FRACAS, TEMAA and EAGLES, and of the commercially funded SLT (Spoken Language Translator) project.
Benefits & Users
Effective and efficient assessment of Natural Language (NL) components is often severely hampered by the lack of suitable test material and technology, which are expensive and time-consuming to develop. Available evaluation tools are highly specialised and unsuitable for reuse. DiET provides a set of reusable diagnostic and evaluation tools and data needed by suppliers to monitor the quality of NLP products and services.
The users in the consortium have developed different evaluation scenarios. Aerospatiale already has significant experience in using and evaluating textual information processing systems for document production and manipulation, where it needs to manage and produce huge amounts of documentation in several different languages. Aerospatiale's scenarios cover two applications: controlled language checking and grammar checking.
The Localisation Resources Centre (LRC) evaluates tools and assists clients with the implementation of suitable tools by providing a consultancy and training service. The LRC's scenarios focus on translation memory systems, which are often used in the localisation process, and investigate the need for new data and annotation types.
As a major technology provider, IBM is a prime user of NLP test suites and tools. A systematic test suite helps to overcome the inherent limits of ad hoc compiled test data, and a ready-made, flexible diagnostics and evaluation tool facilitates quality assurance. The scenario at IBM focuses on the use of the lexical replacement module to generate derived test suites.
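The idea of deriving a test suite by lexical replacement can be illustrated with a minimal sketch. It is not the DiET module itself: the names TestItem and derive_suite, the single "phenomenon" annotation field and the case-sensitive whole-word matching are all assumptions made for illustration. The sketch shows only the general principle that the vocabulary of each test item is swapped for domain terminology while its annotation is carried over unchanged.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TestItem:
    """Hypothetical test item: a sentence plus one annotation value."""
    text: str
    phenomenon: str


def derive_suite(items, lexicon):
    """Return a derived suite in which whole words found in `lexicon`
    are replaced by their domain-specific counterparts.

    Annotations are copied unchanged. Matching is case-sensitive and
    restricted to whole words (an assumption of this sketch).
    """
    if not lexicon:
        return list(items)
    # Longest entries first, so longer terms win over their prefixes.
    words = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")
    return [
        replace(item, text=pattern.sub(lambda m: lexicon[m.group(1)], item.text))
        for item in items
    ]


# Example: adapt a generic test item to aerospace terminology.
base_suite = [TestItem("The engineer repairs the car.", "subject-verb agreement")]
derived = derive_suite(base_suite, {"engineer": "technician", "car": "aircraft"})
print(derived[0].text)  # The technician repairs the aircraft.
```

Because the annotation travels with the item, the derived suite still tests the same linguistic phenomenon, only with vocabulary taken from the target domain.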
Although the benefits have been demonstrated in specific evaluation scenarios, the flexibility and adaptability of the tools allow for a much broader range of applications. The results could be exploited for any kind of construction and maintenance of structured linguistic resources, and a number of external users have already expressed interest.