You searched for +publisher:"Virginia Tech" +contributor:("Heath, Lenwood S.").
Showing records 1 – 30 of 83 total matches.
1.
Nayyar, Krati.
Input Sensitive Analysis of a Minimum Metric Bipartite Matching Algorithm.
Degree: MS, Computer Science and Applications, 2017, Virginia Tech
URL: http://hdl.handle.net/10919/86518
▼ In various business and military settings, there is an expectation of on-demand delivery of supplies and services. Typically, several delivery vehicles (also called servers) carry these supplies. Requests arrive one at a time and when a request arrives, a server is assigned to this request at a cost that is proportional to the distance between the server and the request. Bad assignments will not only lead to larger costs but will also create bottlenecks by increasing delivery time. There is, therefore, a need to design decision-making algorithms that produce cost-effective assignments of servers to requests in real-time.
In this thesis, we consider the online bipartite matching problem where each server can serve exactly one request.
In the online minimum metric bipartite matching problem, we are provided with a set of server locations in a metric space. Requests arrive one at a time and must each be immediately and irrevocably matched to a free server. The total cost of matching all the requests to servers, also known as the cost of the online matching, is the sum of the costs of all the edges in the matching. There are many well-studied models for request generation. We study the problem in the adversarial model, where an adversary who knows the decisions made by the algorithm generates a request sequence to maximize the ratio of the cost of the online matching to the cost of the minimum-cost matching (this ratio is called the competitive ratio). An algorithm is a-competitive if the cost of the online matching is at most 'a' times the minimum cost.
A recently discovered robust and deterministic online algorithm (we refer to this as the robust matching or the RM-Algorithm) was shown to have optimal competitive ratios in the adversarial model and a relatively weaker random arrival model.
We extend the analysis of the RM-Algorithm in the adversarial model and show that the competitive ratio of the algorithm is sensitive to the input, i.e., for "nice" input metric spaces or "nice" server placements, the performance guarantees of the RM-Algorithm are significantly better. In fact, we show that the performance is almost optimal for any fixed metric space and server locations.
Advisors/Committee Members: Raghvendra, Sharath (committeechair), Heath, Lenwood S. (committee member), Murali, T. M. (committee member).
Subjects/Keywords: online algorithms; weighted matching; competitive ratio; input sensitive
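The competitive ratio described in the abstract above can be illustrated with a small sketch. It is not the RM-Algorithm from the thesis: it uses a simple nearest-free-server greedy rule on a one-dimensional metric, and a brute-force search for the minimum-cost matching. The function names and the sample instance are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def greedy_online_cost(servers, requests):
    """Match each arriving request to the nearest free server (1-D metric).

    Assumption: a plain greedy rule, used only to illustrate the definition;
    it is not the RM-Algorithm analyzed in the thesis.
    """
    free = list(servers)
    total = 0
    for r in requests:
        best = min(free, key=lambda s: abs(s - r))
        total += abs(best - r)
        free.remove(best)  # the match is irrevocable
    return total


def optimal_offline_cost(servers, requests):
    """Minimum-cost matching by brute force; only viable for tiny inputs."""
    return min(
        sum(abs(s - r) for s, r in zip(chosen, requests))
        for chosen in permutations(servers, len(requests))
    )


# Hypothetical instance: three servers on a line, two requests arriving in order.
servers = [0, 10, 30]
requests = [6, 10]
online = greedy_online_cost(servers, requests)
offline = optimal_offline_cost(servers, requests)
ratio = online / offline  # cost of online matching over minimum cost
```

On this instance the greedy rule spends its best server on the first request, so the online cost exceeds the offline optimum; the ratio of the two is the quantity the competitive ratio bounds over all request sequences.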
2.
Robertson, Jeffrey Alan.
Entropy Measurements and Ball Cover Construction for Biological Sequences.
Degree: MS, Computer Science and Applications, 2018, Virginia Tech
URL: http://hdl.handle.net/10919/84470
▼ As improving technology makes it easier to select or engineer DNA sequences that produce dangerous proteins, it is important to be able to predict whether a novel DNA sequence is potentially dangerous by determining its taxonomic identity and functional characteristics. These tasks can be facilitated by the ever-increasing amounts of available biological data. Unfortunately, these growing databases can be difficult to take full advantage of because of the corresponding increase in computational and storage costs. Entropy scaling algorithms and data structures present an approach that can expedite this type of analysis by scaling with the amount of entropy contained in the database instead of with the size of the database. Because sets of DNA and protein sequences are biologically meaningful rather than purely random, they exhibit some amount of structure. As biological databases grow, taking advantage of this structure can be extremely beneficial. The entropy scaling sequence similarity search algorithm introduced here demonstrates this by accelerating the biological sequence search tools BLAST and DIAMOND. Tests of the implementation of this algorithm show that while this approach can lead to improved query times, constructing the required entropy scaling indices is difficult and expensive. To improve performance and remove this bottleneck, I investigate several ideas for accelerating the construction of indices that support entropy scaling searches. The results of these tests identify key tradeoffs and demonstrate that there is potential in using these techniques for sequence similarity searches.
Advisors/Committee Members: Heath, Lenwood S. (committeechair), Marathe, Madhav Vishnu (committee member), Eubank, Stephen G. (committee member).
Subjects/Keywords: Bioinformatics; Entropy Scaling; Sequence Search; BLAST
3.
Ni, Ying.
A Machine Learning Approach to Predict Gene Regulatory Networks in Seed Development in Arabidopsis Using Time Series Gene Expression Data.
Degree: MS, Computer Science and Applications, 2016, Virginia Tech
URL: http://hdl.handle.net/10919/81463
▼ Gene regulatory networks (GRNs) provide a natural representation of relationships between regulators and target genes. Though inferring a GRN is a challenging task, many methods, including unsupervised and supervised approaches, have been developed in the literature. However, most of these methods target non-context-specific GRNs. Because the regulatory relationships consistently reprogram under different tissues or biological processes, non-context-specific GRNs may not fit some specific conditions. In addition, a detailed investigation of the prediction results has remained elusive. In this study, I propose to use a machine learning approach to predict GRNs that occur in developmental stage-specific networks and to show how it improves our understanding of the GRN in seed development.
I developed a Beacon GRN inference tool to predict a GRN in seed development in Arabidopsis based on a support vector machine (SVM) local model. Using the time series gene expression levels in seed development and prior known regulatory relationships, I evaluated and predicted the GRN at this specific biological process. The prediction results show that one gene may be controlled by multiple regulators. The targets that are strongly positively correlated with their regulators are mostly expressed at the beginning of seed development. The direct targets were detected when I found a match between the promoter regions of the targets and the regulator's binding sequence. Our prediction provides novel testable hypotheses of a GRN in seed development in Arabidopsis, and the Beacon GRN inference tool provides a valuable model system for context-specific GRN inference.
Advisors/Committee Members: Grene, Ruth (committeechair), Heath, Lenwood S. (committeechair), Li, Song (committee member).
Subjects/Keywords: Network inference; signal transduction pathways; gene expression; support vector machines
4.
Wagner, Mitchell James.
Reconstructing Signaling Pathways Using Regular-Language Constrained Paths.
Degree: MS, Computer Science and Applications, 2018, Virginia Tech
URL: http://hdl.handle.net/10919/85044
▼ Signaling pathways are widely studied in systems biology. Several databases catalog our knowledge of these pathways, including the proteins and interactions that comprise them. However, high-quality curation of this information is slow and painstaking. As a result, many interactions still lack annotation concerning the pathways they participate in. A natural question that arises is whether or not it is possible to automatically leverage existing annotations to identify new interactions for inclusion in a given pathway.
Here, we present RegLinker, an algorithm that achieves this purpose by computing multiple short paths from pathway receptors to transcription factors (TFs) within a background interaction network. The key idea underlying RegLinker is the use of regular-language constraints to control the number of non-pathway edges present in the computed paths. We systematically evaluate RegLinker and alternative approaches against a comprehensive set of 15 signaling pathways and demonstrate that RegLinker exhibits superior recovery of withheld pathway proteins and interactions. These results show the promise of our approach for prioritizing candidates for experimental study and the broader potential of automated analysis to attenuate difficulties of traditional manual inquiry.
Advisors/Committee Members: Murali, T. M. (committeechair), Heath, Lenwood S. (committee member), Prakash, B. Aditya (committee member).
Subjects/Keywords: Regular Languages; Shortest Paths; Signaling Networks
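The idea of constraining paths with a regular language, as described in the abstract above, can be sketched by searching a product of the network with a small automaton. The sketch below is not RegLinker: it assumes one simple constraint (at most a fixed number of non-pathway edges) and returns the cost of a single shortest path, whereas the abstract describes computing multiple short paths. All names and the sample network are hypothetical.

```python
import heapq


def constrained_shortest_path(edges, source, target, max_non_pathway):
    """Cost of the cheapest source-to-target path using at most
    `max_non_pathway` non-pathway edges, or None if no such path exists.

    edges: iterable of directed (u, v, weight, is_pathway) tuples.
    """
    graph = {}
    for u, v, w, is_pathway in edges:
        graph.setdefault(u, []).append((v, w, is_pathway))

    # Search state = (node, non-pathway edges used so far). The counter plays
    # the role of the automaton state for the language "at most k such edges".
    best = {(source, 0): 0}
    heap = [(0, source, 0)]
    while heap:
        cost, node, used = heapq.heappop(heap)
        if node == target:
            return cost
        if cost > best.get((node, used), float("inf")):
            continue  # stale heap entry
        for nxt, w, is_pathway in graph.get(node, []):
            nxt_used = used if is_pathway else used + 1
            if nxt_used > max_non_pathway:
                continue  # path would leave the allowed language
            new_cost = cost + w
            if new_cost < best.get((nxt, nxt_used), float("inf")):
                best[(nxt, nxt_used)] = new_cost
                heapq.heappush(heap, (new_cost, nxt, nxt_used))
    return None


# Hypothetical network: a receptor "r", an intermediate "a", and a TF "tf".
edges = [
    ("r", "a", 1, True),
    ("a", "tf", 1, True),
    ("r", "tf", 1, False),  # a cheaper shortcut that is not annotated to the pathway
]
cost_strict = constrained_shortest_path(edges, "r", "tf", 0)
cost_relaxed = constrained_shortest_path(edges, "r", "tf", 1)
```

Raising the allowed number of non-pathway edges lets the search use the unannotated shortcut, which is the kind of control over non-pathway edges the abstract attributes to the regular-language constraint.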

Virginia Tech
5.
Paul, Ann Molly.
QBank: A Web-Based Dynamic Problem Authoring Tool.
Degree: MS, Computer Science and Applications, 2013, Virginia Tech
URL: http://hdl.handle.net/10919/23889
▼ Widespread accessibility to the Internet and the proliferation of Web 2.0 technologies have led to the growth of online tools for educational content creation, delivery, and assessment. Maintaining high quality of assessment using this medium is made more practical by using tools to author and represent a broad range of assessment problems. A survey of existing problem-authoring tools uncovered two main deficiencies: (a) lack of support for authoring "dynamic" (parameterized) problems, and (b) lack of tools that are independent of a specific publishing format, persistence format, and/or authoring platform.
Dynamic problems are assessment problem templates that support parameterization of the problems by the use of variables. Variables dynamically take values assigned at random to generate different instances of a problem from a template. This provides for greater diversity of authored problems, and permits students to practice with different variations of a problem. In existing problem authoring tools, the problem types supported are often limited to static problems.
A formal definition of an assessment problem structure is presented. This formal definition served as a design aid for a new problem authoring system named QBank, a web-based tool that supports authoring dynamic problems. The proof-of-concept implementation of QBank supports export of questions in CSV format and the Khan Academy Exercise format. The extensible nature of the framework allows future development of features supporting export of authored problems into other publishing and/or persistence formats.
Advisors/Committee Members: Shaffer, Clifford A. (committeechair), Heath, Lenwood S. (committee member), Fox, Edward A. (committee member).
Subjects/Keywords: Formal Problem Definition; Problem Authoring Tool; Question Banking; Computer Education; Parameterized Questions
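The notion of a dynamic (parameterized) problem described in the abstract above can be sketched as a template whose variables receive random values. This is only an illustration of the general idea, not QBank's problem format; the function name, the template syntax (Python's `string.Template`), and the sample problem are assumptions.

```python
import random
from string import Template


def generate_instance(template, variable_ranges, answer_fn, rng):
    """Create one problem instance from a template.

    template: text with $name placeholders (hypothetical format).
    variable_ranges: maps each variable name to an inclusive (low, high) range.
    answer_fn: computes the expected answer from the chosen values.
    rng: a random.Random instance, so instances are reproducible per seed.
    """
    values = {
        name: rng.randint(low, high)
        for name, (low, high) in variable_ranges.items()
    }
    question = Template(template).substitute(values)
    return question, answer_fn(**values), values


# Hypothetical template: each call yields a different addition problem.
rng = random.Random(0)
question, answer, values = generate_instance(
    "What is $a + $b?",
    {"a": (1, 9), "b": (1, 9)},
    lambda a, b: a + b,
    rng,
)
```

Calling `generate_instance` repeatedly with the same `rng` produces different instances of the same template, which is what lets students practice variations of one problem.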
6.
Aggarwal, Deepti.
Inferring Signal Transduction Pathways from Gene Expression Data using Prior Knowledge.
Degree: MS, Electrical Engineering, 2015, Virginia Tech
URL: http://hdl.handle.net/10919/56601
▼ Plants have developed specific responses to external stimuli such as drought, cold, high salinity in soil, and precipitation, in addition to internal developmental stimuli. These stimuli trigger signal transduction pathways in plants, leading to cellular adaptation. A signal transduction pathway is a network of entities that interact with one another in response to a given stimulus. Such participating entities control and affect gene expression in response to the stimulus. For computational purposes, a signal transduction pathway is represented as a network where nodes are biological molecules. The interaction of two nodes is a directed edge.
A plethora of research has been conducted to understand signal transduction pathways. However, there are a limited number of approaches to explore and integrate signal transduction pathways. Therefore, we need a platform to integrate and expand the information of each signal transduction pathway. One of the major computational challenges in inferring signal transduction pathways is that the addition of new nodes and edges can affect the information flow between existing ones in an unknown manner. Here, I develop the Beacon inference engine to address these computational challenges. This software engine employs a network inference approach to predict new edges. First, it uses mutual information and context likelihood relatedness to predict edges from gene expression time-series data. Subsequently, it incorporates prior knowledge to limit false-positive predictions. Finally, a naive Bayes classifier is used to predict new edges. The Beacon inference engine predicts new edges with a recall of 77.6% and a precision of 81.4%. Of the total predicted edges, 24% are new, i.e., they are not present in the prior knowledge.
Advisors/Committee Members: Parikh, Devi (committeechair), Heath, Lenwood S. (committeechair), Yu, Guoqiang (committee member), Grene, Ruth (committee member).
Subjects/Keywords: Signal Transduction Pathways; Gene Expression; Inference Engine
7.
Ye, Jiacheng.
Computing Exact Bottleneck Distance on Random Point Sets.
Degree: MS, Computer Science and Applications, 2020, Virginia Tech
URL: http://hdl.handle.net/10919/98669
▼ Consider the problem of matching taxis to an equal number of requests. While matching them, one objective is to minimize the largest distance between a request and its match. Finding such a matching is called the bottleneck matching problem. In addition, this optimization problem arises in topological data analysis as well as machine learning. In this thesis, I conduct an empirical analysis of a new algorithm, called the FAST-MATCH algorithm, for finding the bottleneck matching. I find that, when a large input is randomly generated from a unit square, the FAST-MATCH algorithm performs substantially faster than the classical methods.
Advisors/Committee Members: Raghvendra, Sharath (committeechair), Heath, Lenwood S. (committee member), Fox, Edward A. (committee member).
Subjects/Keywords: bipartite graph; bottleneck matching
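The bottleneck matching objective described in the abstract above can be made concrete with a small sketch. It is not the FAST-MATCH algorithm: it assumes a textbook approach that binary-searches over the candidate distances and tests each threshold with an augmenting-path perfect-matching check. Function names and the sample points are hypothetical, and the two point sets are assumed to be non-empty and of equal size.

```python
import math


def has_perfect_matching(adj, n):
    """Augmenting-path check: can every left vertex be matched using `adj`?"""
    match_of_right = [-1] * n

    def try_augment(u, seen):
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                if match_of_right[v] == -1 or try_augment(match_of_right[v], seen):
                    match_of_right[v] = u
                    return True
        return False

    return all(try_augment(u, [False] * n) for u in range(n))


def bottleneck_distance(a, b):
    """Smallest t such that a perfect matching exists using only pairs
    at Euclidean distance <= t."""
    n = len(a)
    dist = [[math.dist(p, q) for q in b] for p in a]
    candidates = sorted({d for row in dist for d in row})
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        t = candidates[mid]
        adj = [[j for j in range(n) if dist[i][j] <= t] for i in range(n)]
        if has_perfect_matching(adj, n):
            hi = mid  # threshold t is feasible; try smaller
        else:
            lo = mid + 1  # threshold t is too small
    return candidates[lo]


# Hypothetical instance: two taxis and two requests on a line.
taxis = [(0, 0), (4, 0)]
riders = [(1, 0), (5, 0)]
result = bottleneck_distance(taxis, riders)
```

The binary search is valid because feasibility is monotone in the threshold: any matching that works at distance t also works at every larger distance.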
8.
Maji, Nabanita.
An Interactive Tutorial for NP-Completeness.
Degree: MS, Computer Science and Applications, 2015, Virginia Tech
URL: http://hdl.handle.net/10919/52973
▼ A Theory of Algorithms course is essential to any Computer Science curriculum at both the undergraduate and graduate levels. It is also considered to be difficult material to teach or to learn. In particular the topics of Computational Complexity Theory, reductions, and the NP-Complete class of problems are considered difficult by students.
Numerous algorithm visualizations (AVs) have been developed over the years to portray the dynamic nature of known algorithms commonly taught in undergraduate classes. However, to the best of our knowledge, the instructional material available for NP-Completeness is mostly static and textual, which does little to alleviate the complexity of the topic.
Our aim is to improve the pedagogy of NP-Completeness by providing intuitive, interactive, and easy-to-understand visualizations for standard NP-Complete problems, reductions, and proofs. In this thesis, we present a set of visualizations that we developed using the OpenDSA framework for certain NP-Complete problems. Our paradigm is a three-step process. We first use an AV to illustrate a particular NP-Complete problem. Then we present an exercise to provide first-hand experience with attempting to solve a problem instance. Finally, we present a visualization of a reduction as part of the proof of NP-Completeness.
Our work has been delivered as a collection of modules in OpenDSA, an interactive eTextbook system developed at Virginia Tech. The tutorial has been introduced as a teaching supplement in both a senior undergraduate and a graduate class. We present an analysis of the system use based on records of online interactions by students who used the tutorial. We also present results from a survey of the students.
Advisors/Committee Members: Shaffer, Clifford A. (committeechair), North, Christopher L. (committee member), Heath, Lenwood S. (committee member).
Subjects/Keywords: NP Completeness; Complexity Theory; Reductions; Algorithm Visualization; Computer Science Education; Automated Assessment
9.
Parikh, Nidhi Kiranbhai.
Generating Random Graphs with Tunable Clustering Coefficient.
Degree: MS, Computer Science, 2011, Virginia Tech
URL: http://hdl.handle.net/10919/31591
▼ Most real-world networks exhibit a high clustering coefficient, the probability that two neighbors of a node are also neighbors of each other. We propose four algorithms, CONF-1, CONF-2, THROW-1, and THROW-2, which are based on the configuration model and take a triangle degree sequence (representing the number of triangles/corners at a node) and a single-edge degree sequence (representing the number of single edges/stubs at a node) as input and generate a random graph with a tunable clustering coefficient. We analyze them theoretically and empirically for the case of a regular graph. CONF-1 and CONF-2 generate a random graph with the degree sequence and the clustering coefficient anticipated from the input triangle and single-edge degree sequences. At each time step, CONF-1 chooses each node for creating triangles or single edges with the same probability, while CONF-2 chooses a node for creating triangles or single edges with a probability proportional to its number of unconnected corners or unconnected stubs, respectively. Experimental results match quite well with the anticipated clustering coefficient except for highly dense graphs, in which case the experimental clustering coefficient is higher than the anticipated value. THROW-2 chooses three distinct nodes for creating triangles and two distinct nodes for creating single edges, while they need not be distinct for THROW-1. For THROW-1 and THROW-2, the degree sequence and the clustering coefficient of the generated graph vary from the input. However, the expected degree distribution and the clustering coefficient of the generated graph can also be predicted using analytical results. Experiments show that, for THROW-1 and THROW-2, the results match quite well with the analytical results. Typically, only information about degree sequence or degree distribution is available. We also propose an algorithm DEG that takes degree sequence and clustering coefficient as input and generates a graph with the same properties. Experiments show results for DEG that are quite similar to those for CONF-1 and CONF-2.
Advisors/Committee Members: Heath, Lenwood S. (committeechair), Vullikanti, Anil Kumar S. (committee member), Marathe, Madhav V. (committee member).
Subjects/Keywords: Clustering coefficient; complex networks; random graphs; algorithms
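The clustering coefficient defined in the abstract above, the probability that two neighbors of a node are themselves neighbors, can be computed directly from that definition. The sketch below assumes the global form (the fraction of neighbor pairs, taken over all nodes, that are connected); the thesis may use a different variant. It only measures the coefficient of a given graph and does not implement CONF-1, CONF-2, THROW-1, THROW-2, or DEG. Function names and the sample graph are hypothetical.

```python
from itertools import combinations


def build_adjacency(edge_list):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj):
    """Fraction of neighbor pairs (over all nodes) that are joined by an edge."""
    closed = 0
    total = 0
    for nbrs in adj.values():
        for u, v in combinations(sorted(nbrs), 2):
            total += 1
            if v in adj[u]:
                closed += 1
    return closed / total if total else 0.0


# Hypothetical graph: a triangle a-b-c with one pendant node d attached to c.
graph = build_adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
coefficient = clustering_coefficient(graph)
```

In the sample graph, five neighbor pairs exist and three of them are connected, so the single triangle is counted once from each of its corners.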
10.
Yang, Yanshen.
MCAT: Motif Combining and Association Tool.
Degree: MS, Computer Science and Applications, 2018, Virginia Tech
URL: http://hdl.handle.net/10919/84999
▼ De novo motif discovery in biological sequences is an important and computationally challenging problem. A myriad of algorithms have been developed to solve this problem with varying success, but it can be difficult for even a small number of these tools to reach a consensus. Because individual tools can be better suited for specific scenarios, an ensemble tool that combines the results of many algorithms can yield a more confident and complete result. We present a novel and fast tool MCAT (Motif Combining and Association Tool) for de novo motif discovery by combining six state-of-the-art motif discovery tools (MEME, BioProspector, DECOD, XXmotif, Weeder, and CMF). We apply MCAT to data sets with DNA sequences that come from various species and compare our results with two well-established ensemble motif finding tools, EMD and DynaMIT. The experimental results show that MCAT is able to identify exact match motifs in DNA sequences efficiently, and it has a better performance in practice.
Advisors/Committee Members: Heath, Lenwood S. (committeechair), Zhang, Liqing (committee member), Hauf, Silke (committee member).
Subjects/Keywords: Motif finding
11.
Senthil, Rathna.
IDLE: A Novel Approach to Improving Overlapping Community Detection in Complex Networks.
Degree: MS, Computer Science and Applications, 2016, Virginia Tech
URL: http://hdl.handle.net/10919/65160
Complex systems in areas such as biology, physics, social science, and technology are extensively modeled as networks due to the rich set of tools available for their study and analysis. In such networks, groups of nodes that correspond to functional units or that share some common attributes result in densely connected structures called communities. Community formation is an inherent process, and it is not easy to detect these structures because of the complex ways in which components of these systems interact.
Detecting communities in complex networks is important because it helps us to understand their internal dynamics better, thereby leading to significant insights into the underlying systems. Overlapping communities are formed when nodes in the network simultaneously belong to more than one community, and it has been shown that most real networks naturally contain such an overlapping community structure. In this thesis, I introduce a new approach to overlapping community detection called IDLE that incorporates ideas from another interesting problem: the identification of influential spreaders. Influential spreaders are nodes that play an important role in the propagation of information or diseases in networks. Research suggests that the nodes in the main core identified by k-core decomposition techniques are the most influential spreaders. In my approach, I use these k-cores as candidate seeds for local community detection. Following a well-defined seed selection process, IDLE builds and prunes their corresponding local communities. It then augments the resulting local communities and puts them together to obtain the global overlapping community structure of the network.
My approach improves on current local community detection techniques, which use either random nodes or maximal k-cliques as seeds and do not focus explicitly on detecting overlapping nodes in the network. Hence their results can be significantly improved in building ground-truth overlapping communities. The results of my experiments on real and synthetic networks indicate that IDLE results in enhanced overlapping community detection and thereby a better identification of overlapping nodes that could be important or influential components in the underlying system.
Advisors/Committee Members: Heath, Lenwood S. (committeechair), Prakash, B. Aditya (committee member), Raghvendra, Sharath (committee member).
Subjects/Keywords: Overlapping Community Detection; Complex Networks; Local Expansion
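The abstract above uses k-core decomposition to pick candidate seeds. As a rough illustration of that building block only (not of IDLE itself, whose seed selection, pruning, and augmentation steps are not described here), the following Python sketch computes core numbers by repeatedly peeling a minimum-degree node. The function names `core_numbers` and `main_core` and the toy edge list are assumptions made for this example.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected graph given as edge pairs.

    Standard peeling idea: repeatedly remove a node of minimum remaining
    degree; the largest degree seen at removal time so far is its core number.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Linear scan for the minimum; simple rather than fast.
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def main_core(edges):
    """Nodes with the highest core number (hypothetical seed candidates)."""
    core = core_numbers(edges)
    if not core:
        return set()
    top = max(core.values())
    return {node for node, c in core.items() if c == top}


# Toy graph: a 4-clique (a, b, c, d) with a tail a - e - f.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
             ("b", "d"), ("c", "d"), ("a", "e"), ("e", "f")]
print(main_core(toy_edges))
```

On this toy graph the clique nodes form the main core, while the tail nodes `e` and `f` have core number 1.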
12.
Vijayan, Vinaya.
Understanding and Improving Identification of Somatic Variants.
Degree: PhD, Genetics, Bioinformatics, and Computational Biology, 2016, Virginia Tech
URL: http://hdl.handle.net/10919/72969
It is important to understand the entire spectrum of somatic variants to gain more insight into mutations that occur in different cancers for development of better diagnostic, prognostic, and therapeutic tools. This thesis outlines our work in understanding somatic variant calling, improving the identification of somatic variants from whole genome and whole exome platforms, and identifying biomarkers for lung cancer.
Integrating somatic variants from whole genome and whole exome platforms poses a challenge, as variants identified in the exonic regions of the whole genome platform may not be identified on the whole exome platform and vice versa. Taking a simple union or intersection of the somatic variants from both platforms would lead to inclusion of many false positives (through union) and exclusion of many true variants (through intersection). We develop the first framework to improve the identification of somatic variants on whole genome and exome platforms using a machine learning approach that combines the results from two popular somatic variant callers. Testing on simulated and real data sets shows that our framework identifies variants more accurately than using only one somatic variant caller or using variants from only one platform.
Short tandem repeats (STRs) are repetitive units of 2-6 nucleotides. STRs make up approximately 1% of the human genome and have traditionally been used as genetic markers in population studies. We conduct a series of in silico analyses using the exome data of 32 individuals with lung cancer to identify 103 STRs that could potentially serve as cancer diagnostic markers and 624 STRs that could potentially serve as cancer predisposition markers.
Overall, these studies improve the accuracy of somatic variant identification and highlight the association of STRs with lung cancer.
Advisors/Committee Members: Zhang, Liqing (committeechair), Wu, Xiaowei (committee member), Heath, Lenwood S. (committee member), Franck, Christopher T. (committee member).
Subjects/Keywords: Somatic variants; Somatic variant callers; Somatic point mutations; Short tandem repeat variation; Lung squamous cell carcinoma
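The trade-off the abstract describes between taking the union and the intersection of two call sets can be seen on a tiny made-up example. The Python sketch below is only an illustration under assumed data: the variant identifiers, the two call sets, and the helper `precision_recall` are all hypothetical and are not taken from the thesis or its framework.

```python
def precision_recall(called, truth):
    """Precision and recall of a set of called variants against a truth set."""
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical data: four real variants, each platform misses one of them
# and reports one false positive of its own.
truth = {"v1", "v2", "v3", "v4"}
whole_genome_calls = {"v1", "v2", "v3", "fp_genome"}
whole_exome_calls = {"v2", "v3", "v4", "fp_exome"}

union_calls = whole_genome_calls | whole_exome_calls
intersection_calls = whole_genome_calls & whole_exome_calls

# Union keeps every true variant but also both false positives.
print("union:", precision_recall(union_calls, truth))
# Intersection drops the false positives but also loses v1 and v4.
print("intersection:", precision_recall(intersection_calls, truth))
```

In this toy case the union reaches full recall at reduced precision, and the intersection reaches full precision at half the recall, which is the gap a learned combination of the two call sets would aim to close.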
13.
Hasan, Mohammad Shabbir.
Identifying and Analyzing Indel Variants in the Human Genome Using Computational Approaches.
Degree: PhD, Computer Science and Applications, 2019, Virginia Tech
URL: http://hdl.handle.net/10919/90797
Insertion and deletion (indel), a common form of genetic variation in the human genome, is associated with genetic diseases and cancer. However, indels are heavily understudied due to experimental and computational challenges. This dissertation addresses the computational challenges in three aspects. First, the current approach of representing indels is ambiguous and causes significant database redundancy. A universal positioning system, UPS-indel, is proposed to represent equivalent indels unambiguously, and the UPS-indel algorithm is theoretically proven to find all equivalent indels and is thus exhaustive. Second, a significant number of indels are hidden in DNA reads not mapped to the reference genome. Genesis-indel, a computational pipeline that explores the unmapped reads to identify novel indels that are initially missed, is developed. Genesis-indel has been shown to uncover indels that can be important genetic markers for breast cancer. Finally, mutations occurring in somatic cells play a vital role in transforming healthy cells into cancer cells. Therefore, accurate identification of somatic mutations is essential for a better understanding of cancer genomes. SomaticHunter, an ensemble of two sensitive variant callers, is developed. Simulation studies using whole genome and whole exome sequences have shown that SomaticHunter achieves recall comparable to state-of-the-art somatic mutation callers while delivering the highest precision and therefore the highest F1 score among all the callers compared.
Advisors/Committee Members: Zhang, Liqing (committeechair), Shi, Xinghua (committee member), Wu, Xiaowei (committee member), Huang, Bert (committee member), Heath, Lenwood S. (committee member).
Subjects/Keywords: Genetic Variants; Indel; Somatic Mutation; Next Generation Sequencing
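To make the ambiguity of indel representation mentioned in the abstract concrete, here is a small brute-force Python sketch. It is not the UPS-indel algorithm; it simply lists every start position at which deleting the same number of bases from a reference yields the same resulting sequence. The function name `equivalent_deletions`, the 0-based coordinates, and the example sequences are assumptions for illustration.

```python
def equivalent_deletions(ref, start, length):
    """Return all 0-based start positions whose deletion of `length` bases
    from `ref` produces the same sequence as deleting at `start`.

    Brute force: try every possible start and compare the resulting strings.
    """
    if length <= 0 or start < 0 or start + length > len(ref):
        raise ValueError("deletion must lie inside the reference")
    target = ref[:start] + ref[start + length:]
    return [i for i in range(len(ref) - length + 1)
            if ref[:i] + ref[i + length:] == target]


# Inside the repeat "ACACAC", deleting two bases at any of five positions
# gives the same sequence "GACACT", so one event has five representations.
print(equivalent_deletions("GACACACT", 1, 2))

# Outside a repeat, the deletion has a single representation.
print(equivalent_deletions("GATTC", 1, 1))
```

Each position returned for the repeat example would be a distinct database entry for the same biological event, which is the redundancy an unambiguous representation is meant to remove.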
14.
Torkey, Hanaa A.
Machine Learning Approaches for Identifying microRNA Targets and Conserved Protein Complexes.
Degree: PhD, Computer Science and Applications, 2017, Virginia Tech
URL: http://hdl.handle.net/10919/77536
Much research has been directed toward understanding the roles of essential components in the cell, such as proteins, microRNAs, and genes. This dissertation focuses on two interesting problems in bioinformatics research: microRNA-target prediction and the identification of conserved protein complexes across species. We define the two problems and develop novel approaches for solving them. MicroRNAs are short non-coding RNAs that mediate gene expression. The goal is to predict microRNA targets. Existing methods rely on sequence features to predict targets. These features are neither sufficient nor necessary to identify functional target sites and ignore the cellular conditions in which microRNA and mRNA interact. We developed MicroTarget to predict microRNA-mRNA interactions using heterogeneous data sources. MicroTarget uses expression data to learn a candidate target set for each microRNA. Then, sequence data is used to provide evidence of direct interactions and to rank the predicted targets. The predicted targets overlap with many of the experimentally validated ones. The results indicate that using expression data helps in predicting microRNA targets accurately.
Protein complexes conserved across species specify processes that are core to cell machinery. Methods that have been devised to identify conserved complexes are severely limited by noise in protein-protein interaction (PPI) data. Behind PPIs, there are domains interacting physically to perform the necessary functions. Therefore, employing domains and domain-domain interactions (DDIs) gives a better view of the protein interactions and functions. We developed a novel strategy for local network alignment, DONA. DONA maps proteins into their domains and uses DDIs to improve the network alignment. We developed a novel strategy for constructing an alignment graph and then use this graph to discover the conserved sub-networks. DONA shows better performance in terms of the overlap with known protein complexes, with higher precision and recall rates than existing methods. The results also show better semantic similarity computed with respect to both the biological process and the molecular function of the aligned sub-networks.
Advisors/Committee Members: Heath, Lenwood S. (committeechair), Zhang, Liqing (committee member), Grene, Ruth (committee member), Deng, Xinwei (committee member), ElHefnawi, Mahmoud M. (committee member).
Subjects/Keywords: microRNA target; machine learning; network alignment; protein complex.
15.
Lahn, Nathaniel Adam.
A Separator-Based Framework for Graph Matching Problems.
Degree: PhD, Computer Science and Applications, 2020, Virginia Tech
URL: http://hdl.handle.net/10919/98618
Assume we are given a list of objects and a list of compatible pairs of these objects. A matching consists of a chosen subset of these compatible pairs, where each object participates in at most one chosen pair. For any chosen pair of objects, we say that these two objects are matched. Generally, we seek to maximize the number of compatible matches. A maximum cardinality matching is a matching with the largest possible size. In many cases, there are multiple options for maximizing the number of compatible pairings. While maximizing the size of the matching is often the primary concern, one may also seek to minimize the cost of the matching. This is known as the minimum-cost maximum-cardinality matching problem. These two matching problems have been well studied, since they play a fundamental role in algorithmic theory as well as motivate many practical applications. Our interest is in the design of algorithms for both of these problems that are efficiently scalable, even as the number of objects involved grows very large. To aid in the design of scalable algorithms, we observe that some inputs have good separators, meaning that by removing some subset S of objects, one can divide the remaining objects into two sets V and V', where all pairs of objects between V and V' are incompatible. We design several new algorithms that exploit good separators, and prove that these algorithms scale better than previously existing approaches.
Advisors/Committee Members: Raghvendra, Sharath (committeechair), Murali, T. M. (committee member), Heath, Lenwood S. (committee member), Khanna, Sanjeev (committee member), Back, Godmar Volker (committee member).
Subjects/Keywords: Matching; graphs; graph separators; combinatorial optimization
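The two definitions in the abstract, a matching and a separator, can be written down directly as checks. The Python sketch below is only a restatement of those definitions on a toy input; it does not implement any of the dissertation's algorithms, and the function names `is_matching` and `is_separator` as well as the example objects are assumptions for illustration.

```python
def is_matching(compatible_pairs, chosen):
    """True if every chosen pair is compatible and no object is used twice."""
    compatible = {frozenset(pair) for pair in compatible_pairs}
    used = set()
    for a, b in chosen:
        if a == b or frozenset((a, b)) not in compatible:
            return False
        if a in used or b in used:
            return False
        used.update((a, b))
    return True


def is_separator(compatible_pairs, separator, part_v, part_v_prime):
    """True if, with `separator` removed, no compatible pair joins the two parts."""
    if separator & part_v or separator & part_v_prime or part_v & part_v_prime:
        return False  # the three sets must be pairwise disjoint
    for a, b in compatible_pairs:
        crosses = (a in part_v and b in part_v_prime) or \
                  (a in part_v_prime and b in part_v)
        if crosses:
            return False
    return True


# Toy input: objects 1..5 arranged in a path 1 - 2 - 3 - 4 - 5.
pairs = [(1, 2), (2, 3), (3, 4), (4, 5)]

print(is_matching(pairs, [(1, 2), (3, 4)]))       # each object used once
print(is_matching(pairs, [(1, 2), (2, 3)]))       # object 2 used twice
print(is_separator(pairs, {3}, {1, 2}, {4, 5}))   # removing 3 splits the path
print(is_separator(pairs, set(), {1, 2}, {3, 4, 5}))  # pair (2, 3) still crosses
```

In the path example a single object separates the input into two halves, which is the kind of small separator a divide-and-conquer matching algorithm could exploit.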
16.
Altarawy, Doaa Abdelsalam Ahmed Mohamed.
DeTangle: A Framework for Interactive Prediction and Visualization of Gene Regulatory Networks.
Degree: PhD, Computer Science and Applications, 2017, Virginia Tech
URL: http://hdl.handle.net/10919/85504
With the abundance of biological data, computational prediction of gene regulatory networks (GRNs) from gene expression data has become more feasible. Although incorporating other prior knowledge (PK), along with gene expression, greatly improves prediction accuracy, the accuracy remains low. PK in GRN inference can be categorized into noisy and curated. Several algorithms were proposed to incorporate noisy PK, but none address curated PK. Another challenge is that much of the PK is not stored in databases or is not in a unified, structured format accessible to inference algorithms. Moreover, no GRN inference method exists that supports post-prediction PK.
This thesis addresses those limitations with three solutions: the PEAK algorithm for integrating both curated and noisy PK, Online-PEAK for post-prediction interactive feedback, and DeTangle for visualization and navigation of GRNs. PEAK integrates both curated and noisy PK in GRN inference. We introduce a novel method for GRN inference, CurInf, to effectively integrate curated PK, and we use a previous method, Modified Elastic Net, for noisy PK, which we call NoisInf. Using 100% curated PK, CurInf improves the AUPR accuracy score over NoisInf by 27.3% in synthetic data, 86.5% in E. coli data, and 31.1% in S. cerevisiae data. Moreover, we developed an online algorithm, online-PEAK, that enables the biologist to interact with the inference algorithm, PEAK, through a visual interface to add their domain experience about the structure of the GRN as feedback to the system. We experimentally verified the ability of online-PEAK to improve accuracy incrementally when PK is added by the user, including true and false PK. Even when the noise in PK is 10 times more than true PK, online-PEAK performs better than inference without any PK.
Finally, we present DeTangle, a Web server for interactive GRN prediction and visualization. DeTangle provides a seamless analysis of GRNs, from uploading gene expression data and GRN inference to post-prediction feedback using online-PEAK and visualization and navigation of the predicted GRN. More accurate prediction of GRNs can facilitate studying complex molecular interactions, understanding diseases, and aiding drug design.
Advisors/Committee Members: Heath, Lenwood S. (committeechair), North, Christopher L. (committee member), Grene, Ruth (committee member), Shaffer, Clifford A. (committee member), Ismail, Mahamed (committee member).
Subjects/Keywords: Gene regulation; prior knowledge; gene regulatory network inference; visualization; machine learning
17.
Kakumanu, Akshay.
Effects of Drought on Gene Expression in Maize Reproductive and Leaf Meristem Tissues as Revealed by Deep Sequencing.
Degree: MS in Life Sciences, Plant Pathology, Physiology, and Weed Science, 2012, Virginia Tech
URL: http://hdl.handle.net/10919/33907
Drought is a major environmental stress factor that poses a serious threat to food security. The effects of drought on early reproductive tissue at 1-2 DAP (days after pollination) are irreversible and lead to embryo abortion, directly affecting grain yield. We developed a working RNA-Seq pipeline to study the maize (Zea mays) drought transcriptome, sequenced by Illumina GSIIx technology, to compare drought-treated and well-watered fertilized ovary (1-2 DAP) and basal leaf meristem tissue. The pipeline also identified novel splice junctions: splice variants of previously known gene models and potential novel transcription units. An attempt was also made to exploit the data to understand drought-mediated transcriptional events (e.g., alternative splicing). Gene Ontology (GO) enrichment analysis revealed massive down-regulation of cell division and cell cycle genes in the drought-stressed ovary only. Among GO categories related to carbohydrate metabolism, changes in starch and sucrose metabolism-related genes occurred in the ovary, consistent with a decrease in starch levels, and in sucrose transporter function, with no comparable changes occurring in the leaf meristem. ABA-related processes responded positively, but only in the ovaries. GO enrichment analysis also suggested differential responses to drought between the two tissues in categories such as oxidative stress-related and cell cycle events. The data are discussed in the context of the susceptibility of the maize kernel to drought stress leading to embryo abortion, and the relative robustness of actively dividing vegetative tissue taken at the same time from the same plant subjected to the same conditions.
A hypothesis is formulated, proposing drought-mediated intersecting effects on the expression of invertase genes, glucose signaling (hexokinase 1-dependent and independent), ABA-dependent and independent signaling, antioxidant responses, PCD, phospholipase C effects, and cell cycle related processes.
This work was supported by the National Science Foundation Plant Genome Research Program (grant no. DBI0922747), iPlant Collaborative (NSF DBI-0735191) and also NSF ABI1062472.
Advisors/Committee Members: Grene, Ruth (committeechair), Gillaspy, Glenda E. (committee member), Murali, T. M. (committee member), Heath, Lenwood S. (committee member).
Subjects/Keywords: Drought; Illumina; RNA-Seq; Maize; Ovaries; Leaf Meristem
18.
Belal, Nahla Ahmed.
Two Problems in Computational Genomics.
Degree: PhD, Computer Science, 2011, Virginia Tech
URL: http://hdl.handle.net/10919/26318
This work addresses two novel problems in the field of computational genomics. The first is whole genome alignment and the second is inferring horizontal gene transfer using posets. We define these two problems and present algorithmic approaches for solving them. For whole genome alignment, we define alignment graphs for representing different evolutionary events, and define a scoring function for those graphs. The problem defined is proven to be NP-complete. Two heuristics are presented to solve the problem. One is a dynamic programming approach that is optimal for a class of sequences that we define in this work as breakable arrangements. The other is a greedy approach that is not necessarily optimal; however, unlike the dynamic programming approach, it allows for reversals. For inferring horizontal gene transfer, we define partially ordered sets (posets) among species, with respect to different genes, and infer genes involved in horizontal gene transfer by comparing posets for different genes. The posets are used to construct a tree for each gene. Those trees are then compared and tested for contradiction, where contradictory trees correspond to genes that are candidates for horizontal gene transfer.
Advisors/Committee Members: Heath, Lenwood S. (committeechair), Abdel-Hamid, Ayman (committee member), Murali, T. M. (committee member), Grene, Ruth (committee member), Setubal, João C. (committee member).
Subjects/Keywords: horizontal gene transfer; Two Problems in Computational Genomics; whole genome alignment; dynamic programming; Graph theory; biology and genetics; graph algorithms; partial order sets
19.
Modise, Thero.
Genomic Instability and Gene Dosage Obscures Clues to Virulence Mechanisms of F. tularensis species.
Degree: PhD, Genetics, Bioinformatics, and Computational Biology, 2016, Virginia Tech
URL: http://hdl.handle.net/10919/72885
▼ The pathogen Francisella tularensis subsp. tularensis has been classified as a Center for Disease Control (CDC) select agent. However, little is still known of what makes the bacteria cause disease, especially the highly virulent type A1 strains. The work in this dissertation focused on type A1 strains from the Inzana laboratory, including a wildtype virulent strain TI0902, an avirulent chemical mutant strain TIGB03 with a Single Nucleotide Polymorphism in the wbtK gene, and several complemented mutants, [wbtK+]TIGB03, with dramatic differences in virulence and growth rates. One of the complemented clones (Clone12 or avp-[wbtK+]TIGB03-C12) was avirulent, but protected mice against challenge of a lethal dose of TI0902 and was considered as a possible vaccine strain.
Whole genome sequencing was performed to identify genetic differences between the virulent, avirulent and protective strains using both Roche/454 and Illumina next-generation sequencing technologies. Additionally, RNASeq analysis was performed to identify differentially expressed genes between the different strains. This comprehensive genomic analysis revealed the critical role of transposable elements in inducing genomic instability resulting in large duplications and deletions in the genomes of the chemical mutant and the complemented clones that in turn affect gene dosage and expression of genes known to regulate virulence. For example, whole genome sequencing of the avirulent chemical mutant TIGB03 revealed a large 75.5 kb tandem duplication flanked by transposable elements, while the genome of a virulent Clone01 (vir-[wbtK+]TIGB03-C1) lost one copy of the 75.5 kb tandem duplicated region but gained a tandem duplication of another large 80 kb region that contains a virulence associated transcription factor SspA. RNAseq data showed that the dosage effect of this extra region in Clone1 suppresses expression of MglA regulated genes. Since MglA regulates genes that are known to be crucial for virulence, including the well-studied Francisella Pathogenicity Island (FPI), these results suggest that gene dosage effects arising from large duplications can trigger unknown virulence mechanisms in F. tularensis subsp. tularensis. These results are important especially when designing live vaccine strains that have repeated insertion elements in their genomes.
Advisors/Committee Members: Jensen, Roderick V. (committeechair), Inzana, Thomas J. (committee member), Heath, Lenwood S. (committee member), Good, Deborah J. (committee member).
Subjects/Keywords: Francisella tularensis; genome instability; vaccine; transposase; inversion; duplication; deletion; RNAseq; DNAseq; Assembly; Gene Dosage; Pathogen
APA (6th Edition):
Modise, T. (2016). Genomic Instability and Gene Dosage Obscures Clues to Virulence Mechanisms of F. tularensis species. (Doctoral Dissertation). Virginia Tech. Retrieved from http://hdl.handle.net/10919/72885
Chicago Manual of Style (16th Edition):
Modise, Thero. “Genomic Instability and Gene Dosage Obscures Clues to Virulence Mechanisms of F. tularensis species.” 2016. Doctoral Dissertation, Virginia Tech. Accessed February 28, 2021.
http://hdl.handle.net/10919/72885.
MLA Handbook (7th Edition):
Modise, Thero. “Genomic Instability and Gene Dosage Obscures Clues to Virulence Mechanisms of F. tularensis species.” 2016. Web. 28 Feb 2021.
Vancouver:
Modise T. Genomic Instability and Gene Dosage Obscures Clues to Virulence Mechanisms of F. tularensis species. [Internet] [Doctoral dissertation]. Virginia Tech; 2016. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/10919/72885.
Council of Science Editors:
Modise T. Genomic Instability and Gene Dosage Obscures Clues to Virulence Mechanisms of F. tularensis species. [Doctoral Dissertation]. Virginia Tech; 2016. Available from: http://hdl.handle.net/10919/72885

Virginia Tech
20.
Yang, Kuan.
Ancestral Genome Reconstruction in Bacteria.
Degree: PhD, Genetics, Bioinformatics, and Computational Biology, 2012, Virginia Tech
URL: http://hdl.handle.net/10919/28091
▼ The rapid accumulation of numerous sequenced genomes has provided a golden opportunity for ancestral state reconstruction studies, especially in the whole genome reconstruction area. However, most ancestral genome reconstruction methods developed so far only focus on gene or replicon sequences instead of whole genomes. They rely largely on either detailed modeling of evolutionary events or edit distance computation, both of which can be computationally prohibitive for large data sets. Hence, most of these methods can only be applied to a small number of features and species. In this dissertation, we describe the design, implementation, and evaluation of an ancestral genome reconstruction system (REGEN) for bacteria. It is the first bacterial genome reconstruction tool that focuses on ancestral state reconstruction at the genome scale instead of the gene scale. It not only reconstructs ancestral gene content and contiguous gene runs using either a maximum parsimony or a maximum likelihood criterion but also replicon structures of each ancestor. Based on the reconstructed genomes, it can infer all major events at both the gene scale, such as insertion, deletion, and translocation, and the replicon scale, such as replicon gain, loss, and merge. REGEN finishes by producing a visual representation of the entire evolutionary history of all genomes in the study. With a model-free reconstruction method at its core, the computational requirement for ancestral genome reconstruction is reduced sufficiently for the tool to be applied to large data sets with dozens of genomes and thousands of features. To achieve as accurate a reconstruction as possible, we also develop a homologous gene family prediction tool for preprocessing. Furthermore, we build our in-house Prokaryote Genome Evolution simulator (PEGsim) for evaluation purposes. 
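The abstract mentions reconstructing ancestral gene content under a maximum parsimony criterion. As a rough illustration only, the sketch below applies Fitch's small-parsimony algorithm, a standard textbook technique, to the presence or absence of a single gene on a rooted binary tree. It is not taken from REGEN; the tree shape, the leaf names, and the function `fitch_sets` are hypothetical.

```python
def fitch_sets(tree, leaf_states):
    """Bottom-up pass of Fitch's algorithm for one character.

    `tree` is either a leaf name (str) or a pair (left, right) of subtrees.
    Returns (set of candidate states at this node, minimum number of changes).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_sets(left, leaf_states)
    right_set, right_cost = fitch_sets(right, leaf_states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical example: gene present (1) or absent (0) in four genomes.
tree = (("A", "B"), ("C", "D"))
presence = {"A": 1, "B": 1, "C": 0, "D": 1}
root_set, changes = fitch_sets(tree, presence)
print(root_set, changes)
```

In this toy case the most parsimonious explanation keeps the gene in the root ancestor and places a single loss on the branch leading to genome C.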
The homologous gene family prediction refinement module can refine homologous gene family predictions generated by third party de novo prediction programs by combining phylogeny and local gene synteny. We show that such refinement can be accomplished for up to 80% of homologous gene family predictions with ambiguity (mixed families). The genome evolution simulator, PEGsim, is the first random events based high level bacterial genome evolution simulator with models for all common evolutionary events at the gene, replicon, and genome scales. The concepts of conserved gene runs and horizontal gene transfer (HGT) are also built in. We show the validation of PEGsim itself and the evaluation of the last reconstruction component with simulated data produced by it. REGEN, REconstruction of GENomes, is an ancestral genome reconstruction tool based on the concept of neighboring gene pairs (NGPs). Although it does not cover the reconstruction of actual nucleotide sequences, it is capable of reconstructing gene content, contiguous gene runs, and replicon structure of each ancestor using either a maximum parsimony or a maximum likelihood criterion. Based on the reconstructed genomes, it can infer all major events…
Advisors/Committee Members: Vinatzer, Boris A. (committee member), Dickerman, Allan W. (committee member), Tyler, Brett M. (committee member), Setubal, João C. (committeecochair), Heath, Lenwood S. (committeecochair).
Subjects/Keywords: homology; genomics; NGP; phylogenetics; genome evolution simulation; ancestral genome reconstruction
APA (6th Edition):
Yang, K. (2012). Ancestral Genome Reconstruction in Bacteria. (Doctoral Dissertation). Virginia Tech. Retrieved from http://hdl.handle.net/10919/28091
Chicago Manual of Style (16th Edition):
Yang, Kuan. “Ancestral Genome Reconstruction in Bacteria.” 2012. Doctoral Dissertation, Virginia Tech. Accessed February 28, 2021.
http://hdl.handle.net/10919/28091.
MLA Handbook (7th Edition):
Yang, Kuan. “Ancestral Genome Reconstruction in Bacteria.” 2012. Web. 28 Feb 2021.
Vancouver:
Yang K. Ancestral Genome Reconstruction in Bacteria. [Internet] [Doctoral dissertation]. Virginia Tech; 2012. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/10919/28091.
Council of Science Editors:
Yang K. Ancestral Genome Reconstruction in Bacteria. [Doctoral Dissertation]. Virginia Tech; 2012. Available from: http://hdl.handle.net/10919/28091

Virginia Tech
21.
Ahmadian, Mansooreh.
Hybrid Modeling and Simulation of Stochastic Effects on Biochemical Regulatory Networks.
Degree: PhD, Computer Science and Applications, 2020, Virginia Tech
URL: http://hdl.handle.net/10919/99481
▼ Cell cycle is a process in which a growing cell replicates its DNA and divides into two cells. Progression through the cell cycle is regulated by complex interactions between networks of genes, transcripts, and proteins. These interactions inside the confined volume of a cell are subject to inherent noise. To provide a quantitative description of the cell cycle, several deterministic and stochastic models have been developed. However, deterministic models cannot capture the intrinsic noise. In addition, stochastic modeling poses the following challenges.
First, stochastic models generally require extensive computations, particularly when applied to large networks. Second, the accuracy of stochastic models is highly dependent on the accuracy of the estimated model parameters. The goal of this dissertation is to address these challenges by developing new efficient methods for modeling and simulation of stochastic effects in biochemical networks. The results show that the proposed hybrid model that combines stochastic and deterministic modeling approaches can achieve high computational efficiency while generating accurate simulation results. Moreover, a new machine learning-based method is developed to address the parameter estimation problem in biochemical systems. The results show that the proposed method yields accurate ranges for the model parameters and highlight the potentials of model-free learning for parameter estimation in stochastic modeling of complex biochemical networks.
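To make the contrast between stochastic and deterministic descriptions concrete, here is a minimal sketch of Gillespie's stochastic simulation algorithm for a simple birth-death process, next to the deterministic steady state of the same system. It is not the hybrid method proposed in the dissertation; the reaction scheme, the rate constants, and the function names are assumptions chosen for illustration.

```python
import random


def gillespie_birth_death(k_birth, k_death, x0, t_end, seed=0):
    """Simulate 0 -> X (rate k_birth) and X -> 0 (rate k_death * x).

    Returns a list of (time, molecule count) pairs, starting at (0.0, x0).
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    trajectory = [(t, x)]
    while True:
        birth_rate = k_birth
        death_rate = k_death * x
        total = birth_rate + death_rate
        if total == 0:
            break  # no reaction can fire any more
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its rate.
        if rng.random() * total < birth_rate:
            x += 1
        else:
            x -= 1
        trajectory.append((t, x))
    return trajectory


def deterministic_steady_state(k_birth, k_death):
    """Steady state of dx/dt = k_birth - k_death * x."""
    return k_birth / k_death


# Hypothetical parameters, not values from the dissertation.
traj = gillespie_birth_death(k_birth=10.0, k_death=1.0, x0=0, t_end=5.0)
print(len(traj), traj[-1], deterministic_steady_state(10.0, 1.0))
```

The stochastic trajectory fluctuates around the deterministic value, and every reaction event costs one loop iteration, which hints at why purely stochastic simulation becomes expensive for large networks.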
Advisors/Committee Members: Cao, Young (committeechair), Tyson, John J. (committeechair), Heath, Lenwood S. (committee member), Peccoud, Jean (committee member), Karpatne, Anuj (committee member).
Subjects/Keywords: Cell Cycle Modeling; Hybrid Stochastic Modeling; Cell size control; Parameter estimation; Neural network; Theory-guided machine learning
APA (6th Edition):
Ahmadian, M. (2020). Hybrid Modeling and Simulation of Stochastic Effects on Biochemical Regulatory Networks. (Doctoral Dissertation). Virginia Tech. Retrieved from http://hdl.handle.net/10919/99481
Chicago Manual of Style (16th Edition):
Ahmadian, Mansooreh. “Hybrid Modeling and Simulation of Stochastic Effects on Biochemical Regulatory Networks.” 2020. Doctoral Dissertation, Virginia Tech. Accessed February 28, 2021.
http://hdl.handle.net/10919/99481.
MLA Handbook (7th Edition):
Ahmadian, Mansooreh. “Hybrid Modeling and Simulation of Stochastic Effects on Biochemical Regulatory Networks.” 2020. Web. 28 Feb 2021.
Vancouver:
Ahmadian M. Hybrid Modeling and Simulation of Stochastic Effects on Biochemical Regulatory Networks. [Internet] [Doctoral dissertation]. Virginia Tech; 2020. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/10919/99481.
Council of Science Editors:
Ahmadian M. Hybrid Modeling and Simulation of Stochastic Effects on Biochemical Regulatory Networks. [Doctoral Dissertation]. Virginia Tech; 2020. Available from: http://hdl.handle.net/10919/99481

Virginia Tech
22.
Badr, Eman.
Identifying Splicing Regulatory Elements with de Bruijn Graphs.
Degree: PhD, Computer Science and Applications, 2015, Virginia Tech
URL: http://hdl.handle.net/10919/73366
▼ Splicing regulatory elements (SREs) are short, degenerate sequences on pre-mRNA molecules that enhance or inhibit the splicing process via the binding of splicing factors, proteins that regulate the functioning of the spliceosome. Existing methods for identifying SREs in a genome are either experimental or computational. This work tackles the limitations in the current approaches for identifying SREs. It addresses two major computational problems, identifying variable length SREs utilizing a graph-based model with de Bruijn graphs and discovering co-occurring sets of SREs (combinatorial SREs) utilizing graph mining techniques. In addition, I studied and analyzed the effect of alternative splicing on tissue specificity in humans.
First, I have used a formalism based on de Bruijn graphs that combines genomic structure, word count enrichment analysis, and experimental evidence to identify SREs found in exons. In my approach, SREs are not restricted to a fixed length (i.e., k-mers, for a fixed k). Consequently, the predicted SREs are of different lengths. I identified 2001 putative exonic enhancers and 3080 putative exonic silencers for human genes, with lengths varying from 6 to 15 nucleotides. Many of the predicted SREs overlap with experimentally verified binding sites. My model provides a novel method to predict variable length putative regulatory elements computationally for further experimental investigation.
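For readers unfamiliar with the data structure named above, the following sketch builds a plain de Bruijn graph from a set of sequences: each k-mer contributes an edge from its (k-1)-length prefix to its (k-1)-length suffix, weighted by how often the k-mer occurs. This is only the generic construction, not the dissertation's formalism, which also incorporates genomic structure and experimental evidence. The function name and the toy inputs are assumptions.

```python
from collections import defaultdict


def de_bruijn_graph(sequences, k):
    """Return {(prefix, suffix): count} over all k-mers in `sequences`.

    Nodes are (k-1)-mers; each k-mer adds one to the weight of the edge
    from its first k-1 characters to its last k-1 characters.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    edges = defaultdict(int)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            edges[(kmer[:-1], kmer[1:])] += 1
    return dict(edges)


# Toy input: two short strings used purely as example text.
graph = de_bruijn_graph(["GGGAGG", "GAGGAC"], k=3)
for (src, dst), count in sorted(graph.items()):
    print(src, "->", dst, count)
```

Paths through such a graph spell out words longer than k, which is one way a graph-based model can avoid fixing the element length in advance.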
Second, I developed CoSREM (Combinatorial SRE Miner), a graph mining algorithm for discovering combinatorial SREs. The goal is to identify sets of exonic splicing regulatory elements whether they are enhancers or silencers. Experimental evidence is incorporated through my graph-based model to increase the accuracy of the results. The identified SREs do not have a predefined length, and the algorithm is not limited to identifying only SRE pairs as are current approaches. I identified 37 SRE sets that include both enhancer and silencer elements in human genes. These results intersect with previous results, including some that are experimental. I also show that the SRE set GGGAGG and GAGGAC identified by CoSREM may play a role in exon skipping events in several tumor samples.
Further, I report a genome-wide analysis to study alternative splicing on multiple human tissues, including brain, heart, liver, and muscle. I developed a pipeline to identify tissue-specific exons and hence tissue-specific SREs. Utilizing the publicly available RNA-Seq data set from the Human BodyMap project, I identified 28,100 tissue-specific exons across the four tissues. I identified 1929 exonic splicing enhancers with 99% overlap with previously published experimental and computational databases. A complicated enhancer regulatory network was revealed, where multiple enhancers were found across multiple tissues while some were found only in specific tissues. Putative combinatorial exonic enhancers and silencers were discovered as well, which may be responsible for exon inclusion or exclusion across tissues. Some of the…
Advisors/Committee Members: Heath, Lenwood S. (committeechair), Grene, Ruth (committee member), ElHefnawi, Mahmoud M. (committee member), Shaffer, Clifford A. (committee member), Zhang, Liqing (committee member).
Subjects/Keywords: Alternative splicing; de Bruijn graphs; algorithms; graph mining; splicing regulatory elements
APA (6th Edition):
Badr, E. (2015). Identifying Splicing Regulatory Elements with de Bruijn Graphs. (Doctoral Dissertation). Virginia Tech. Retrieved from http://hdl.handle.net/10919/73366
Chicago Manual of Style (16th Edition):
Badr, Eman. “Identifying Splicing Regulatory Elements with de Bruijn Graphs.” 2015. Doctoral Dissertation, Virginia Tech. Accessed February 28, 2021.
http://hdl.handle.net/10919/73366.
MLA Handbook (7th Edition):
Badr, Eman. “Identifying Splicing Regulatory Elements with de Bruijn Graphs.” 2015. Web. 28 Feb 2021.
Vancouver:
Badr E. Identifying Splicing Regulatory Elements with de Bruijn Graphs. [Internet] [Doctoral dissertation]. Virginia Tech; 2015. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/10919/73366.
Council of Science Editors:
Badr E. Identifying Splicing Regulatory Elements with de Bruijn Graphs. [Doctoral Dissertation]. Virginia Tech; 2015. Available from: http://hdl.handle.net/10919/73366
23.
Krishnan, Siddharth.
Seeing the Forest for the Trees: New approaches to Characterizing and Forecasting Cascades.
Degree: PhD, Computer Science and Applications, 2018, Virginia Tech
URL: http://hdl.handle.net/10919/83362
▼ Cascades are a popular construct to observe and study information propagation (or diffusion) in social media such as Twitter and are defined using notions of influence, activity, or discourse commonality (e.g., hashtags). While these notions of cascades lead to different perspectives, cascades are primarily modeled as trees. In this thesis, we argue for an alternative viewpoint of cascades as forests (of trees), which yields a richer vocabulary of features to understand information propagation. We propose to develop a framework to extract forests and analyze their growth by studying their evolution at the tree level and at the node level. Furthermore, we outline four different problems that use the forest framework. First, we show that such forests of information cascades can be used to design counter-contagion algorithms to disrupt the spread of negative campaigns or rumors. Secondly, we demonstrate how such forests of information cascades can give us a rich set of features (structural and temporal), which can be used to forecast information flow. Thirdly, we argue that cascades modeled as forests can help us glean social network sensors to detect future contagious outbreaks that occur in the social network. To conclude, we show preliminary results of a generative model that can describe information cascades modeled as forests and can generate synthetic cascades with empirical properties mirroring cascades extracted from Twitter.
Advisors/Committee Members: Heath, Lenwood S. (committeechair), Ras, Zbigniew W. (committee member), Mitra, Tanushree (committee member), Ribbens, Calvin J. (committee member), Marathe, Madhav Vishnu (committee member).
Subjects/Keywords: Information cascades; Forecasting
APA (6th Edition):
Krishnan, S. (2018). Seeing the Forest for the Trees: New approaches to Characterizing and Forecasting Cascades. (Doctoral Dissertation). Virginia Tech. Retrieved from http://hdl.handle.net/10919/83362
Chicago Manual of Style (16th Edition):
Krishnan, Siddharth. “Seeing the Forest for the Trees: New approaches to Characterizing and Forecasting Cascades.” 2018. Doctoral Dissertation, Virginia Tech. Accessed February 28, 2021.
http://hdl.handle.net/10919/83362.
MLA Handbook (7th Edition):
Krishnan, Siddharth. “Seeing the Forest for the Trees: New approaches to Characterizing and Forecasting Cascades.” 2018. Web. 28 Feb 2021.
Vancouver:
Krishnan S. Seeing the Forest for the Trees: New approaches to Characterizing and Forecasting Cascades. [Internet] [Doctoral dissertation]. Virginia Tech; 2018. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/10919/83362.
Council of Science Editors:
Krishnan S. Seeing the Forest for the Trees: New approaches to Characterizing and Forecasting Cascades. [Doctoral Dissertation]. Virginia Tech; 2018. Available from: http://hdl.handle.net/10919/83362

Virginia Tech
24.
Arango Argoty, Gustavo Alonso.
Computational Tools for Annotating Antibiotic Resistance in Metagenomic Data.
Degree: PhD, Computer Science and Applications, 2019, Virginia Tech
URL: http://hdl.handle.net/10919/88987
▼ Antimicrobial resistance (AMR) is one of the biggest threats to human public health. It has been estimated that the number of deaths caused by AMR will surpass those caused by cancer by 2050. The seriousness of these projections requires urgent action to understand and control the spread of AMR. In the last few years, metagenomics has stood out as a reliable tool for the analysis of microbial diversity and AMR. Through next-generation sequencing, metagenomic studies can generate millions of short sequencing reads that are processed by computational tools. However, with the rapid adoption of metagenomics, a large amount of data has been generated. This situation requires the development of computational tools and pipelines to manage data scalability, accessibility, and performance. In this thesis, several strategies, ranging from command-line tools and web-based platforms to machine learning, have been developed to address these computational challenges. In particular, these include the development of computational pipelines to process metagenomics data in the cloud and on distributed systems, the development of machine learning and deep learning tools to ease the computational cost of detecting antibiotic resistance genes in metagenomic data, and the integration of crowdsourcing as a way to curate and validate antibiotic resistance genes.
Advisors/Committee Members: Zhang, Liqing (committeechair), Heath, Lenwood S. (committee member), Xiao, Weidong (committee member), Pruden, Amy (committee member), Meng, Na (committee member).
Subjects/Keywords: bioinformatics; metagenomics; antibiotic resistance; machine learning
APA (6th Edition):
Arango Argoty, G. A. (2019). Computational Tools for Annotating Antibiotic Resistance in Metagenomic Data. (Doctoral Dissertation). Virginia Tech. Retrieved from http://hdl.handle.net/10919/88987
Chicago Manual of Style (16th Edition):
Arango Argoty, Gustavo Alonso. “Computational Tools for Annotating Antibiotic Resistance in Metagenomic Data.” 2019. Doctoral Dissertation, Virginia Tech. Accessed February 28, 2021.
http://hdl.handle.net/10919/88987.
MLA Handbook (7th Edition):
Arango Argoty, Gustavo Alonso. “Computational Tools for Annotating Antibiotic Resistance in Metagenomic Data.” 2019. Web. 28 Feb 2021.
Vancouver:
Arango Argoty GA. Computational Tools for Annotating Antibiotic Resistance in Metagenomic Data. [Internet] [Doctoral dissertation]. Virginia Tech; 2019. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/10919/88987.
Council of Science Editors:
Arango Argoty GA. Computational Tools for Annotating Antibiotic Resistance in Metagenomic Data. [Doctoral Dissertation]. Virginia Tech; 2019. Available from: http://hdl.handle.net/10919/88987

Virginia Tech
25.
Jiang, Xiaofang.
Genomics and Transcriptomics Analysis of the Asian Malaria Mosquito Anopheles stephensi.
Degree: PhD, Genetics, Bioinformatics, and Computational Biology, 2016, Virginia Tech
URL: http://hdl.handle.net/10919/79959
▼ Anopheles stephensi is a potent vector of malaria throughout the Indian subcontinent and Middle East. An. stephensi is emerging as a model for molecular and genetic studies of mosquito-parasite interactions. Here we conducted a series of genomic and transcriptomic studies to improve the understanding of the biology of Anopheles stephensi and of mosquitoes in general.
First we reported the genome sequence and annotation of the Indian strain of the type form of An. stephensi. The 221 Mb genome assembly was produced using a combination of 454, Illumina, and PacBio sequencing. This hybrid assembly method was significantly better than assemblies generated from a single data source. A total of 11,789 protein-encoding genes were annotated using a combination of homology and de novo prediction.
Secondly, we demonstrated the presence of complete dosage compensation in An. stephensi by determining that autosomal and X-linked genes have very similar levels of expression in both males and females. The uniformity of average expression levels of autosomal and X-linked genes remained when An. stephensi gene expression was normalized by that of their Ae. aegypti orthologs, strengthening the conclusion of complete dosage compensation in Anopheles.
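One common way to summarize the comparison described above is the ratio of median X-linked expression to median autosomal expression, where a value near 1 is consistent with complete dosage compensation. The sketch below shows that calculation only; it is not the analysis pipeline used in the dissertation, and the gene names, expression values, and function name are hypothetical.

```python
from statistics import median


def x_to_autosome_ratio(expression, chromosome):
    """Median expression of X-linked genes divided by that of autosomal genes.

    `expression` maps gene -> expression level; `chromosome` maps gene ->
    chromosome label, with "X" marking X-linked genes.
    """
    x_values = [v for g, v in expression.items() if chromosome[g] == "X"]
    autosomal = [v for g, v in expression.items() if chromosome[g] != "X"]
    return median(x_values) / median(autosomal)


# Hypothetical toy data for one sample (not values from the dissertation).
expression = {"g1": 10.0, "g2": 12.0, "g3": 11.0, "g4": 9.0, "g5": 13.0}
chromosome = {"g1": "X", "g2": "X", "g3": "2", "g4": "3", "g5": "2"}
ratio = x_to_autosome_ratio(expression, chromosome)
print(ratio)
```

Computing the ratio separately for male and female samples, and again after normalizing by ortholog expression in another species, would mirror the two comparisons the abstract describes.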
Lastly, we investigated trans-splicing events in Anopheles stephensi. We identified six trans-splicing events and all the trans-splicing sites are conserved and present in Ae. aegypti. The proteins encoded by the trans-spliced mRNAs are also highly conserved and their orthologs are co-linearly transcribed in out-groups of family Culicidae. This finding indicates the need to preserve the intact mRNA and protein function of the broken-up genes by trans-splicing during evolution.
In summary, we presented the first genome assembly of Anopheles stephensi and studied two interesting evolutionary events, dosage compensation and trans-splicing, via transcriptomic analysis.
Advisors/Committee Members: Tu, Zhijian Jake (committeechair), Bevan, David R. (committee member), Heath, Lenwood S. (committee member), Zhang, Liqing (committee member), Sharakhov, Igor V. (committee member).
Subjects/Keywords: genomic; comparative transcriptomes; dosage compensation; sex-specific expression; Iso-Seq; trans-splicing
APA (6th Edition):
Jiang, X. (2016). Genomics and Transcriptomics Analysis of the Asian Malaria Mosquito Anopheles stephensi. (Doctoral Dissertation). Virginia Tech. Retrieved from http://hdl.handle.net/10919/79959
Chicago Manual of Style (16th Edition):
Jiang, Xiaofang. “Genomics and Transcriptomics Analysis of the Asian Malaria Mosquito Anopheles stephensi.” 2016. Doctoral Dissertation, Virginia Tech. Accessed February 28, 2021.
http://hdl.handle.net/10919/79959.
MLA Handbook (7th Edition):
Jiang, Xiaofang. “Genomics and Transcriptomics Analysis of the Asian Malaria Mosquito Anopheles stephensi.” 2016. Web. 28 Feb 2021.
Vancouver:
Jiang X. Genomics and Transcriptomics Analysis of the Asian Malaria Mosquito Anopheles stephensi. [Internet] [Doctoral dissertation]. Virginia Tech; 2016. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/10919/79959.
Council of Science Editors:
Jiang X. Genomics and Transcriptomics Analysis of the Asian Malaria Mosquito Anopheles stephensi. [Doctoral Dissertation]. Virginia Tech; 2016. Available from: http://hdl.handle.net/10919/79959

Virginia Tech
26.
Aghamirzaie, Delasa.
Isoform-Specific Expression During Embryo Development in Arabidopsis and Soybean.
Degree: PhD, Genetics, Bioinformatics, and Computational Biology, 2016, Virginia Tech
URL: http://hdl.handle.net/10919/73054
▼ Almost every precursor mRNA (pre-mRNA) in a eukaryotic organism undergoes splicing, in some cases resulting in the formation of more than one splice variant, a process called alternative splicing. RNA-Seq provides a major opportunity to capture the state of the transcriptome, which includes the detection of alternative splicing events. Alternative splicing is a highly regulated process occurring in a complex machinery called the spliceosome. In this dissertation, I focus on identification of different splice variants and splicing factors that are produced during Arabidopsis and soybean embryo development. I developed several data analysis pipelines for the detection and the functional characterization of active splice variants and splicing factors that arise during embryo development. The main goal of this dissertation was to identify transcriptional changes associated with specific stages of embryo development and infer possible associations between known regulatory genes and their targets. We identified several instances of exon skipping and intron retention as products of alternative splicing. The coding potential of the splice variants was evaluated using CodeWise. I developed CodeWise, a weighted support vector machine classifier to assess the coding potential of novel transcripts with respect to RNA secondary structure free energy, conserved domains, and sequence properties. We also examined the effect of alternative splicing on the domain composition of resulting protein isoforms. The majority of splice variant pairs encode proteins with identical domains or similar domains with truncation, and in less than 10% of the cases alternative splicing results in gain or loss of a conserved domain. I constructed several possible regulatory networks that occur at specific stages of embryo development.
In addition, in order to gain a better understanding of splicing regulation, we developed the concept of co-splicing networks, as a group of transcripts containing common RNA-binding motifs, which are co-expressed with a specific splicing factor. For this purpose, I developed a multi-stage analysis pipeline to integrate the co-expression networks with de novo RNA binding motif discovery at inferred splice sites, resulting in the identification of specific splicing factors and the corresponding cis-regulatory sequences that cause the production of splice variants. This approach resulted in the development of several novel hypotheses about the regulation of minor and major splicing in developing Arabidopsis embryos. In summary, this dissertation provides a comprehensive view of splicing regulation in Arabidopsis and soybean embryo development using computational analysis.
Advisors/Committee Members: Grene, Ruth (committeechair), Collakova, Eva (committeechair), Heath, Lenwood S. (committee member), Holliday, Jason A. (committee member), Li, Song (committee member).
Subjects/Keywords: Alternative splicing; data analysis; bioinformatics; transcriptomics; RNA-Seq; noncoding RNAs; machine learning; computational biology
APA (6th Edition):
Aghamirzaie, D. (2016). Isoform-Specific Expression During Embryo Development in Arabidopsis and Soybean. (Doctoral Dissertation). Virginia Tech. Retrieved from http://hdl.handle.net/10919/73054
Chicago Manual of Style (16th Edition):
Aghamirzaie, Delasa. “Isoform-Specific Expression During Embryo Development in Arabidopsis and Soybean.” 2016. Doctoral Dissertation, Virginia Tech. Accessed February 28, 2021.
http://hdl.handle.net/10919/73054.
MLA Handbook (7th Edition):
Aghamirzaie, Delasa. “Isoform-Specific Expression During Embryo Development in Arabidopsis and Soybean.” 2016. Web. 28 Feb 2021.
Vancouver:
Aghamirzaie D. Isoform-Specific Expression During Embryo Development in Arabidopsis and Soybean. [Internet] [Doctoral dissertation]. Virginia Tech; 2016. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/10919/73054.
Council of Science Editors:
Aghamirzaie D. Isoform-Specific Expression During Embryo Development in Arabidopsis and Soybean. [Doctoral Dissertation]. Virginia Tech; 2016. Available from: http://hdl.handle.net/10919/73054

27.
Dang, Ha Xuan.
Mold Allergomics: Comparative and Machine Learning Approaches.
Degree: PhD, Genetics, Bioinformatics, and Computational Biology, 2014, Virginia Tech
URL: http://hdl.handle.net/10919/64205
▼ Fungi are one of the major organisms that cause allergic disease in humans. A number of proteins from fungi have been found to be allergenic or possess immunostimulatory properties. Identifying and characterizing allergens from fungal genomes will help facilitate our understanding of the mechanism underlying host-pathogen interactions in allergic diseases. Currently, there is a lack of tools that allow us to rapidly and accurately predict allergens from whole genomes. In the context of whole genome annotation, allergens are rare compared to non-allergens and thus the data is considered highly skewed. In order to achieve a confident set of predicted allergens from a genome, false positive rates must be lowered. Current allergen prediction tools often produce many false positives when applied to large-scale data sets such as whole genomes, and thus lower the precision. Moreover, the most accurate tools are relatively slow because they use sequence alignment to construct feature vectors for allergen classifiers. This dissertation presents computational approaches for characterizing the allergen repertoire in fungal genomes as part of the whole genome studies of Alternaria, an important allergenic/opportunistic human pathogenic fungus and necrotrophic plant parasite. In these studies, the genomes of multiple Alternaria species were characterized for the first time. Functional elements (e.g. genes, proteins) were first identified and annotated from these genomes using computational tools. Protein annotation and comparative genomics approaches revealed the link between Alternaria genotypes and its prolific saprophytic lifestyle that provides at least a partial explanation for the development of pathological relationships between Alternaria and humans. A machine learning based tool (Allerdictor) was developed to address the neglected problem of allergen prediction in highly skewed large-scale data sets.
Allerdictor exhibited high precision over high recall at fast speed and thus it is a more practical tool for large-scale allergen annotation compared with existing tools. Allerdictor was then used together with a comparative genomics approach to survey the allergen repertoire of known allergenic fungi. We predicted a number of mold allergens that have not been experimentally characterized. These predicted allergens are potential candidates for further experimental and clinical validation. Our approaches will not only facilitate the study of allergens in the increasing number of sequenced fungal genomes but also will be useful for allergen annotation in other species and rapid prescreening of synthesized sequences for potential allergens.
Advisors/Committee Members: Lawrence, Christopher B. (committeechair), Tyler, Brett M. (committee member), Heath, Lenwood S. (committee member), Murali, T. M. (committee member).
Subjects/Keywords: Fungal genomics; comparative genomics; allergy; allergen prediction

28.
Liu, Mingming.
Predicting the Functional Effects of Human Short Variations Using Hidden Markov Models.
Degree: PhD, Computer Science and Applications, 2015, Virginia Tech
URL: http://hdl.handle.net/10919/73703
▼ With the development of sequencing technologies, more and more sequence variants are available for investigation. Different types of variants in the human genome have been identified, including single nucleotide polymorphisms (SNPs), short insertions and deletions (indels), and large structural variations such as large duplications and deletions. Of great research interest are the functional effects of these variants. Although many programs have been developed to predict the effect of SNPs, few can be used to predict the effect of indels or multiple variants, such as multiple SNPs, multiple indels, or a combination of both. Moreover, fine-grained prediction of the functional outcome of variants is not available. To address these limitations, we developed a prediction framework, HMMvar, to predict the functional effects of coding variants (SNPs or indels), using profile hidden Markov models (HMMs). Based on HMMvar, we proposed HMMvar-multi to explore the joint effects of multiple variants in the same gene. For fine-grained functional outcome prediction, we developed HMMvar-func to computationally define and predict four types of functional outcome of a variant: gain, loss, switch, and conservation of function.
Advisors/Committee Members: Zhang, Liqing (committeechair), Heath, Lenwood S. (committee member), Wu, Xiaowei (committee member), Hu, Jianjun (committee member), Watson, Layne T. (committee member).
Subjects/Keywords: Genetic variation; Indel; SNP; Hidden Markov Model
29.
Song, Qi.
Developing machine learning tools to understand transcriptional regulation in plants.
Degree: PhD, Genetics, Bioinformatics, and Computational Biology, 2019, Virginia Tech
URL: http://hdl.handle.net/10919/93512
▼ Abiotic stresses constitute a major category of stresses that negatively impact plant growth and development. It is important to understand how plants cope with environmental stresses and reprogram gene responses, which in turn confer stress tolerance to plants. Genomics technology has been used in the past decade to generate gene expression data under different abiotic stresses for the model plant, Arabidopsis. Recent genomic technologies, such as DAP-seq, have generated large-scale regulatory maps that provide information regarding which gene has the potential to regulate other genes in the genome. However, this technology does not provide context-specific interactions. It is unknown which transcription factor can regulate which gene under a specific abiotic stress condition. To address this challenge, several computational tools were developed to identify regulatory interactions and co-regulating genes for stress response. In addition, using single cell RNA-seq data generated from the model plant organism Arabidopsis, preliminary analysis was performed to build a model that classifies Arabidopsis root cell types. This analysis is the first step towards the ultimate goal of constructing a cell-type-specific regulatory network for Arabidopsis, which is important for improving current understanding of stress response in plants.
Advisors/Committee Members: Li, Song (committeechair), Grene, Ruth (committee member), Heath, Lenwood S. (committee member), Haak, David C. (committee member), Zhang, Liqing (committee member).
Subjects/Keywords: regulatory network; machine learning; genomics
30.
Elmarakeby, Haitham Abdulrahman.
Deep Learning for Biological Problems.
Degree: PhD, Computer Science and Applications, 2017, Virginia Tech
URL: http://hdl.handle.net/10919/86264
▼ The last decade has witnessed a tremendous increase in the amount of available biological data. Different technologies for measuring the genome, epigenome, transcriptome, proteome, metabolome, and microbiome in different organisms are producing large amounts of high-dimensional data every day. High-dimensional data provides unprecedented challenges and opportunities to gain a better understanding of biological systems. Unlike other data types, biological data imposes more constraints on researchers. Biologists are not only interested in accurate predictive models that capture complex input-output relationships, but they also seek a deep understanding of these models.
In the last few years, deep models have achieved better performance in computational prediction tasks compared to other approaches. Deep models have been extensively used in processing natural data, such as images, text, and recently sound. However, application of deep models in biology is limited. Here, I propose to use deep models for output prediction, dimension reduction, and feature selection of biological data to get better interpretation and understanding of biological systems. I demonstrate the applicability of deep models in a domain that has a high and direct impact on health care.
In this research, novel deep learning models have been introduced to solve pressing biological problems. The research shows that deep models can be used to automatically extract features from raw inputs without the need to manually craft features. Deep models are used to reduce the dimensionality of the input space, which resulted in faster training. Deep models are shown to have better performance and less variant output when compared to other shallow models even when an ensemble of shallow models is used. Deep models are shown to be able to process non-classical inputs such as sequences. Deep models are shown to be able to naturally process input sequences to automatically extract useful features.
Advisors/Committee Members: Heath, Lenwood S. (committeechair), Zhang, Liqing (committee member), Feng, Wu-Chun (committee member), ElHefnawi, Mahmoud M. (committee member), Sheng, Zhi (committee member).
Subjects/Keywords: Machine Learning; Computational Biology; Deep Learning; Cancer; Drug Response