ISSO Y2005 Annual Report

A Text-Mining Technique for Literature Profiling and Information Extraction from

Abstract--Massive amounts of biomedical literature are readily available online in many forms. A great deal of valuable knowledge, including relationships among biomedical entities, is embedded in these resources and needs to be properly extracted, discovered, and utilized. Recognizing and classifying biomedical entity names and terms is an important step in developing efficient knowledge and information extraction techniques for these repositories. This research investigates and develops effective computational methods for literature profiling in the biomedical field. Specifically, this paper presents new techniques for biomedical term identification and classification. We apply feature selection techniques from information retrieval (IR), such as mutual information (MI) and the chi-square statistic (χ²), to select the key features for term identification and classification. We evaluated the method using the GENIA 3.0 corpus, with about 3,000 to more than 34,000 biomedical terms and entity names. The outcome of this project can be applied in various fields, including the aerospace domain. In the aerospace field, there is great interest in discovering the relations between certain changes in the bodies of astronauts and changes in structure at the levels of genes, proteins, and bindings.
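
The abstract names MI and the chi-square statistic as the feature selection criteria but does not give their formulation. The following is a minimal sketch, assuming the standard 2x2 contingency-table definitions used in IR text classification; it is not necessarily the exact variant used in this work. The function names, the feature names (such as `suffix=ase`), and all counts are hypothetical and chosen only for illustration.

```python
import math


def chi_square(n11, n10, n01, n00):
    """Chi-square score for one feature and one class.

    n11: terms that have the feature and belong to the class
    n10: terms that have the feature but are outside the class
    n01: terms that lack the feature but belong to the class
    n00: terms that lack the feature and are outside the class
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    if denom == 0:
        return 0.0  # a marginal is empty, so the feature carries no signal
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def mutual_information(n11, n10, n01, n00):
    """Expected mutual information (in bits) between feature and class."""
    n = n11 + n10 + n01 + n00
    # Each entry: (cell count, feature marginal, class marginal).
    cells = [
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    ]
    mi = 0.0
    for count, feature_total, class_total in cells:
        if count > 0:  # empty cells contribute nothing (0 * log 0 := 0)
            mi += (count / n) * math.log2(n * count / (feature_total * class_total))
    return mi


def rank_features(tables, score_fn, k):
    """Return the k feature names with the highest score under score_fn."""
    ranked = sorted(tables.items(), key=lambda kv: score_fn(*kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical counts for a hypothetical "protein" class; not taken from GENIA.
toy_tables = {
    "suffix=ase": (40, 10, 10, 40),
    "contains_digit": (30, 20, 20, 30),
    "is_capitalized": (25, 25, 25, 25),
}

top_by_chi2 = rank_features(toy_tables, chi_square, k=2)
top_by_mi = rank_features(toy_tables, mutual_information, k=2)
```

On these toy counts both criteria rank `suffix=ase` first and drop `is_capitalized`, which is statistically independent of the class. On real data the two scores can disagree, particularly for rare features, so the cutoff `k` and the choice of criterion would need to be tuned empirically.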