Full Record

Author Özgür, Arzucan
Title Text and Network Mining for Literature-Based Scientific Discovery in Biomedicine.
URL
Publication Date
Date Accessioned
Degree PhD
Discipline/Department Computer Science & Engineering
Degree Level doctoral
University/Publisher University of Michigan
Abstract Most new and important findings in biomedicine are available only in the text of published scientific articles. The first goal of this thesis is to design methods based on natural language processing and machine learning to extract information about genes, proteins, and their interactions from text. We introduce a dependency tree kernel-based relation extraction method to identify the interacting protein pairs in a sentence. We propose two kernel functions based on cosine similarity and edit distance among the dependency tree paths connecting the protein names. Using these kernel functions with supervised and semi-supervised machine learning methods, we report a significant improvement (an F-measure of 59.96% on the AIMED data set) over previous results in the literature. We also address the problem of distinguishing factual information from speculative information. Unlike previous methods that formulate the problem as a sentence classification task, we propose a two-step method to identify the speculative fragments of sentences. First, we use supervised classification to identify the speculation keywords using a diverse set of linguistic features that represent their contexts. Next, we use the syntactic structures of the sentences to resolve the linguistic scopes of these keywords. Our results show that the method is effective in identifying speculative portions of sentences. The speculation keyword identification results are close to the upper bound of human inter-annotator agreement. The second goal of this thesis is to generate new scientific hypotheses using the literature-mined protein/gene interactions. We propose a literature-based discovery approach, where we start with a set of genes known to be related to a given concept and integrate text mining with network centrality analysis to predict novel concept-related genes. We present the application of the proposed approach to two different problems, namely predicting gene-disease associations and predicting genes that are important for vaccine development. Our results provide new insights and hypotheses worth future investigation in these domains and show the effectiveness of the proposed approach for literature-based discovery.
Subjects/Keywords Information Extraction; Natural Language Processing; Text Mining; Bioinformatics; Literature-based Discovery; Network Analysis; Computer Science; Engineering; Science
Contributors Radev, Dragomir Radkov (committee member); Abney, Steven P. (committee member); Athey, Brian D. (committee member); Baveja, Satinder Singh (committee member); Jagadish, Hosagrahar V. (committee member)
Language en
Rights Unrestricted
Country of Publication us
Record ID handle:2027.42/78956
Repository umich
Date Retrieved
Date Indexed 2019-06-03
Grantor University of Michigan, Horace H. Rackham School of Graduate Studies
Issued Date 2010-01-01 00:00:00
Note [thesisdegreename] Ph.D.; [thesisdegreediscipline] Computer Science & Engineering; [thesisdegreegrantor] University of Michigan, Horace H. Rackham School of Graduate Studies; [bitstreamurl] http://deepblue.lib.umich.edu/bitstream/2027.42/78956/1/ozgur_1.pdf;

Sample Search Hits

…TEXT AND NETWORK MINING FOR LITERATURE-BASED SCIENTIFIC DISCOVERY IN BIOMEDICINE by Arzucan Özgür A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer Science and Engineering…

…Background: 1.2.1 Protein Interaction Networks, 1.2.2 Biomedical Information Extraction, 1.2.3 Literature-Based Discovery. Guide to Remaining Chapters…

…V. Literature-Based Discovery of Vaccine Mediated Gene Interaction Networks…

…earliest crucial steps in the lysis of normal and dex-resistant CEM cells, or might serve as a marker for the process.” Description of the literature-based discovery system for identifying gene-disease associations…

…4.2 Gene name normalization example. 5.1 General framework of the literature-based discovery approach. 5.2 Description of the literature-based discovery

…provide new insights and hypotheses worth future investigations in these domains and show the effectiveness of the proposed approach for literature-based discovery. CHAPTER I Introduction 1.1 Motivation The post-genome era, which started with…

…extraction and literature-based discovery. Work more closely related to ours is discussed in the related work sections of the subsequent chapters. We conclude this chapter with a summary of the remaining chapters. 1.2 Background 1.2.1 Protein Interaction…