Generating and browsing multiple taxonomies over a document collection

成果类型:
Article
署名作者:
Spangler, S; Kreulen, JT; Lessler, J
署名单位:
International Business Machines (IBM); IBM USA; General Motors
刊物名称:
JOURNAL OF MANAGEMENT INFORMATION SYSTEMS
ISSN/ISSBN:
0742-1222
发表日期:
2003
页码:
191-212
关键词:
摘要:
We present a novel system and methodology for generating and then browsing multiple taxonomies over a document collection. Taxonomies are generated using a broad set of capabilities, including meta data, key word queries, and automated clustering techniques that serve as a seed taxonomy. The taxonomy editor, eClassifier, provides powerful tools to visualize and edit each taxonomy to make it reflective of the desired theme. Cluster validation tools allow the editor to verify that documents received in the future can be automatically classified into each taxonomy with sufficiently high accuracy. In general, those seeking knowledge from a document collection may have only a vague notion of exactly what they are attempting to understand, and would like to explore related topics and concepts rather than simply being given a set of documents. For this purpose, we have developed MindMap, an interface utilizing multiple taxonomies and the ability to interact with a document collection.