You searched for +publisher:"Rutgers University" +contributor:("Chen, Kevin").
Showing records 1 – 11 of 11 total matches.
No search limiters apply to these results.

Rutgers University
1.
Roy, Rajat Shuvro, 1983-.
Improving genome assembly by identifying reliable sequencing data.
Degree: PhD, Computer Science, 2014, Rutgers University
URL: https://rucore.libraries.rutgers.edu/rutgers-lib/45449/
De novo genome assembly and k-mer frequency counting are two of the classical problems of bioinformatics. k-mer counting helps to identify genomic k-mers from sequenced reads, which may then inform read correction or genome assembly. Genome assembly has two major subproblems: contig construction and scaffolding. A contig is a continuous sub-sequence of the genome assembled from sequencing reads. Scaffolding attempts to construct a linear sequence of contigs (with possible gaps in between) using paired reads (two reads whose distance on the genome is approximately known). In this thesis I will present a new computationally efficient tool for identifying frequent k-mers which are more likely to be genomic, and a set of linear inequalities which can improve scaffolding (which is known to be NP-hard) by identifying reliable paired reads. Identifying reliable k-mers from Whole Genome Amplification (WGA) data is more challenging than from multi-cell data because of the coverage variation introduced by the amplification step (MDA, MALBAC, etc.), which implies that applying a simple k-mer frequency cutoff is unreasonable. We observed that with sufficient coverage, using partial reads (read prefixes) of length approximately twice the k-mer length or less recovers a large proportion of genomic k-mers while keeping the proportion of erroneous k-mers low. We show that using partial reads for assembly and gene prediction recovers a significant proportion of genes and propose to use this approach for rapid pathogen detection in combination with Single Cell Genomics (SCG). Thanks to SCG, it is now possible to isolate a single cell from an environmental sample, extract its DNA and perform genetic sequencing without any need for culturing the cell in the lab. We show that current bioinformatic tools are capable of characterizing a novel organism by producing a draft genome assembly and gene annotation from single cell data of a MAST-4 stramenopile. This demonstrates the potential of SCG for genetic study of the vast majority of environmental organisms that have so far eluded scientists as they cannot be brought into culture, typically a necessity for future studies.
Advisors/Committee Members: Schliep, Alexander (chair), Bhattacharya, Debashish (co-chair), Chen, Kevin (internal member), Farach-Colton, Martin (internal member), Grigoriev, Andrey (outside member).
Subjects/Keywords: Genomes – Analysis; Gene amplification
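The partial-read idea described in this abstract (counting k-mers only from a prefix of each read) can be illustrated with a minimal sketch. This is not the thesis's tool: the function name `count_kmers`, the parameter `prefix_len`, and the toy reads are assumptions made here purely for illustration.

```python
from collections import Counter

def count_kmers(reads, k, prefix_len=None):
    """Count k-mers across reads.

    If prefix_len is given, only the first prefix_len bases of each read
    (a "partial read") contribute k-mers. Hypothetical helper, not the
    tool presented in the thesis.
    """
    counts = Counter()
    for read in reads:
        segment = read if prefix_len is None else read[:prefix_len]
        # Slide a window of width k over the (possibly truncated) read.
        for i in range(len(segment) - k + 1):
            counts[segment[i:i + k]] += 1
    return counts

# Toy reads (made up). With k = 3, a prefix length of 6 is twice the k-mer length.
reads = ["ACGTACGT", "ACGTTTTT", "ACGTAC"]
full_counts = count_kmers(reads, k=3)
prefix_counts = count_kmers(reads, k=3, prefix_len=6)
```

On this toy input, truncating reads to their prefixes lowers the count of k-mers that occur mainly near read ends (here `TTT`), while k-mers near read starts keep their counts. Whether that helps on real data depends on coverage and error profile, as the abstract notes.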

Rutgers University
2.
Sasson, Ariella Syma, 1978-.
From millions to one: theoretical and concrete approaches to De Novo assembly using short read DNA sequences.
Degree: PhD, Computational Biology and Molecular Biophysics, 2010, Rutgers University
URL: http://hdl.rutgers.edu/1782.1/rucore10001600001.ETD.000056766
One of the most significant advances in biology has been the ability to sequence the DNA of organisms. Even in the shadow of the completion of the human genome, intractable regions of the genome remain incomplete. Next generation high-throughput short read sequencing technologies are now available and have the ability to generate millions of short read DNA sequences per run. Although greater coverage depths are possible, de novo sequence assembly with these shorter sequences is significantly more complex than resequencing; handling them presents new computational problems and opportunities. Identifying repetitive regions, coping with sequencing errors, and manipulating the millions of short reads simultaneously are some of the difficulties that must be overcome. As a result of these complexities and working with the short read sequences from the Waksman SOLiD sequencing platform, this work explores the problem of de novo assembly. Initially, we develop tools for filtering short read sequence data based on quality scores and find that this procedure is critical for the success of the subsequent de novo assembly. Next, we analyze the key phenomena responsible for producing contigs that are much shorter than the values provided by theoretical estimates. Finally, we explore two different routes to circumventing the difficulty imposed by short contigs. The first involves utilization of information from multiple orthologous genomes in a comparative assembly. In particular, we developed a pipeline for using the reference genome of a close relative to improve genome assembly. The second approach uses paired read information to build scaffolds that are two orders of magnitude larger than the original contigs. For typical bacterial genomes, fewer than one hundred of these scaffolds are required to cover the entire genome. The combination of short reads from various platforms, assembly, and recovery pipelines brings mid-sized genomes close to completion. As a result, minimal additional work using conventional sequencing technologies is enough to close the remaining small gaps and return a finished single genome. Current advancements in sequencing technologies leave us hopeful that it will be possible to provide fairly complete assemblies for complex genomes via these technological approaches.
Advisors/Committee Members: Sasson, Ariella Syma, 1978- (author), Sengupta, Anirvan (chair), Chen, Kevin (internal member), Schliep, Alexander (internal member), Bhanot, Gyan (internal member), Sidote, David (outside member).
Subjects/Keywords: DNA – Research – Technique

Rutgers University
3.
Tsitron, Julia, 1976-.
Computational analysis of olfaction and artificial nose technologies.
Degree: Computational Biology and Molecular Biophysics, 2013, Rutgers University
URL: https://rucore.libraries.rutgers.edu/rutgers-lib/41937/
Subjects/Keywords: Olfactory sensors; Bayesian statistical decision theory

Rutgers University
4.
White, Amelia, 1977-.
Automated and quantitative phenotyping of C. elegans genetic screens from high-throughput image data.
Degree: Computational Biology and Molecular Biophysics, 2013, Rutgers University
URL: https://rucore.libraries.rutgers.edu/rutgers-lib/41948/
Subjects/Keywords: Computational biology; Diagnostic imaging; Caenorhabditis elegans – Research

Rutgers University
5.
Calviño Torterolo, Martín.
Comparative genomics of the stem transcriptome from grain and sweet sorghum.
Degree: PhD, Plant Biology, 2014, Rutgers University
URL: https://rucore.libraries.rutgers.edu/rutgers-lib/45214/
The current dissertation relates to comparative genomics of grain and sweet sorghum, in particular, to their stem’s transcriptome at the time of flowering, when soluble sugars accumulate more abundantly in the sweet sorghum cultivar Rio than in the grain sorghum cultivar BTx623. The accumulation of soluble sugars in the stem of sorghum is a valuable agronomic trait because their fermentation into ethanol is currently being used as a source of biofuel. High soluble sugar content in stems is a trait also present in the closely related grass sugarcane. Thus, it is reasonable to assume that sweet sorghum and sugarcane may use the same gene products that lead to high soluble sugar content in stems. My dissertation consists of five chapters, the results of which are five publications as first author. In Chapter 1 I summarized the current status of sweet sorghum genomics and highlighted future research directions. My scientific contribution to the field was also mentioned. In Chapters 2 and 3 I described the first characterization of the stem’s transcriptome from grain and sweet sorghum cultivars using sugarcane Affymetrix arrays, and the use of this transcriptome data to develop molecular markers based on the differences in hybridization intensity from grain and sweet sorghum RNAs to the arrays. In Chapter 4, I described the first characterization of the small RNA component of the stem from grain sorghum BTx623 and sweet sorghum Rio cultivars, and from F2 plants derived from their cross that segregated for sugar content and flowering time. I was able to identify the microRNA family miR169, whose expression co-segregated with sugar content in stems. I also discovered nine new microRNAs in the sorghum genome. In Chapter 5 I described the genomic comparison of MIR169 gene clusters among five different grasses and identified five new MIR169 gene copies in the sorghum genome.
Advisors/Committee Members: Messing, Joachim (chair), Maliga, Pal (internal member), Tumer, Nilgun (internal member), Chen, Kevin (outside member).
Subjects/Keywords: Grain – Genome mapping; Sorghum – Genome mapping; Botanical chemistry

Rutgers University
6.
Diao, Liyang, 1986-.
Applications of the mixed linear model in genome-wide association studies and small RNA motif discovery.
Degree: PhD, Computational Biology and Molecular Biophysics, 2014, Rutgers University
URL: https://rucore.libraries.rutgers.edu/rutgers-lib/45229/
If the sheer number of papers published is indicative of anything, it suggests that the age of genome-wide association studies, or GWAS, is here to stay. However, in spite of the influx of data, several issues remain, one of which is the presence of confounding factors caused by relatedness within the study sample. This can cause many false positive results. In recent years, the use of mixed linear models to correct for unknown types of relatedness, i.e. "cryptic relatedness", has been very popular. While this model has been shown to be successful in some cases, here we address the feasibility of performing GWAS in a highly structured population such as Saccharomyces cerevisiae, and find that the inclusion of fixed local ancestry covariates can sometimes lend a study more power. Furthermore, we explore the application of mixed linear models in a different type of biological problem: discovering motifs associated with active microRNAs. While there exist several algorithms for miRNA motif discovery, only a few consider background sequence composition of the 3' UTR binding site in addition to seed sequence motif enrichment, which is known to factor into miRNA binding efficacy. The methods that do account for 3' UTR sequence composition do so by rescoring motif counts based on the background UTR sequence in which the motif appears. Though computationally efficient, these methods are unable to simultaneously compare both gene expression values and UTR sequence, which our method, named MixMir, is able to do, with favorable results. When compared to the simple linear model, as well as existing motif discovery algorithms, MixMir is able to rank true motifs more highly in multiple data sets. Such computational methods are biologically significant because although it is possible to sequence small RNAs in a sample, their expression may not be perfectly correlated with the size of their effect, which is what we observed.
Advisors/Committee Members: Chen, Kevin C. (chair), Mischaikow, Konstantin (internal member), Olson, Wilma (internal member), Xing, Jinchuan (outside member).
Subjects/Keywords: Linear models (Statistics); Genomes – Analysis

Rutgers University
7.
Khan, Faisal M., 1981-.
Semi-supervised transductive regression for survival analysis in medical prognostics.
Degree: PhD, Computer Science, 2016, Rutgers University
URL: https://rucore.libraries.rutgers.edu/rutgers-lib/51331/
The central challenge in predictive modeling for survival analysis in medical prognostics is the management of censored observations in the data. While time-to-event predictions can be modeled as regression problems, traditional regression techniques are challenged by the censored characteristics of the data. In such problems the true target times of a majority of instances are unknown; what is known is a censored target representing some indeterminate time before the true target time. The information for most patients is incomplete and only known “up-to-a-point.” Patients who have experienced the endpoint of interest (cancer recurrence, death, etc.) during an often multi-year study are considered as non-censored or events. They may represent as little as 9% of the available sample. Most of the patients do not experience the endpoint or are lost to follow-up for various reasons (patient moved, died of other causes, etc.). These censored samples often represent most of the available sample. Modeling techniques which can correctly account for censored observations are crucial. Such censored samples can be considered as semi-supervised targets; however, most efforts in semi-supervised regression do not take into account the partial nature of unsupervised information, with samples treated as either fully labeled or unlabeled. This dissertation presents a novel transduction approach for semi-supervised survival analysis. The true target times are approximated from the censored times through transduction to improve predictive performance. The framework can be employed to transform traditional regression methods for survival analysis, or to enhance existing survival analysis algorithms for improved predictive performance. This proposed approach represents one of the first applications of semi-supervised regression to survival analysis and yields significant improvements in predictive performance for multiple applications in prostate and breast cancer prognostics.
Advisors/Committee Members: Kulikowski, Casimir A (chair), Chen, Kevin (internal member), Michmizos, Konstantinos (internal member), Mitsis, Georgios (outside member).
Subjects/Keywords: Survival analysis (Biometry)
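The notion of a censored target described in this abstract can be made concrete with a small data-structure sketch. It only illustrates how censored and non-censored observations differ as regression targets; it does not implement the dissertation's transduction approach. The class `Observation`, the helper names, the time unit, and the toy cohort are all assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Observation:
    time: float   # follow-up time (hypothetical unit: months)
    event: bool   # True = endpoint observed; False = censored

def event_fraction(observations: List[Observation]) -> float:
    """Share of the sample whose endpoint was actually observed."""
    return sum(o.event for o in observations) / len(observations)

def known_targets(observations: List[Observation]) -> List[Optional[float]]:
    """Exact target time for events; None for censored cases, where the
    recorded time is only a lower bound on the true target time."""
    return [o.time if o.event else None for o in observations]

# Toy cohort (made up): one event, three censored patients.
cohort = [
    Observation(36.0, False),
    Observation(14.0, True),
    Observation(60.0, False),
    Observation(22.5, False),
]
```

A standard regression method would only be able to use the entries of `known_targets(cohort)` that are not `None`, which is why the abstract treats censored cases as partially labeled rather than unlabeled.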

Rutgers University
8.
Shanku, Alexander G., 1979-.
Insights Into evolution and adaptation using computational methods and next generation sequencing.
Degree: PhD, Computational Biology and Molecular Biophysics, 2016, Rutgers University
URL: https://rucore.libraries.rutgers.edu/rutgers-lib/50176/
Historically, much of the research in evolutionary biology and population genetics has involved analysis at the level of either a single locus or a small number thereof. However, Next Generation sequencing technology has opened the floodgates with respect to both the sheer volume and quality of sequence data that researchers have long needed to address and answer long-standing questions in their fields. Scientists are now, by and large, no longer hampered in their efforts by technological hurdles to obtain data, but are in fact facing the problem of how best to use the vast amount of data that are accumulating at an ever-increasing rate. This is a good problem to have. The research described in this dissertation is an attempt to derive answers to questions in the fields of population genetics and evolutionary biology that, until recently, have been either intractable or, at best, extremely difficult to address. In the first chapter I provide an introduction and a brief historical look at the research efforts that have preceded my own. In the second chapter I describe how modern sequencing methods and computational analysis can be used to study, analyze, and answer evolutionary questions about the non-model organism, Enallagma hageni, in order to 1) determine this organism's phylogenetic position within Arthropoda, 2) provide answers and insight into the evolutionary history of the protein-encoding genes in the Enallagma transcriptome, and 3) give functional annotation to these expressed proteins. In the third chapter I examine how natural selection acts on the genome and derive a method that can accurately determine the evolutionary cause of nucleotide fixations, having occurred either through positive selection or neutral processes. I then apply the methodology to North American populations of Drosophila melanogaster, providing further evidence as to how adaptive evolution proceeds in a newly established population. This is an important question, for though there have been multiple approaches devised to determine the targets and modes of evolution in the genome, to date there has not emerged a definitive method which can determine both the location and type of a selective process, and as a result, the picture of how and where adaptive evolution proceeds in the genome has remained opaque. In the fourth chapter I examine how levels of natural selection within the genome have the potential to inhibit the ability to accurately learn population demographic history. Using a number of modern algorithms and extensive simulations, I first examine whether or not demographic histories that are learned under simple biological assumptions will yield accurate results when the actual data do not adhere to these assumptions. Further, I go on to examine more complicated models of demographic history, looking specifically at how positive selection biases inference, in which directions these biases occur, and at what levels of selection inference methods fail to be robust. Finally, I…
Advisors/Committee Members: Kern, Andrew D (chair), Chen, Kevin (internal member), Xing, Jinchuan (internal member), Edery, Isaac (outside member).
Subjects/Keywords: Evolution (Biology) – Mathematical models; Genomes – Analysis

Rutgers University
9.
Mahmud, Md Pavel, 1981-.
Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing.
Degree: PhD, Computer Science, 2014, Rutgers University
URL: https://rucore.libraries.rutgers.edu/rutgers-lib/45336/
Since the genomics era started in the ’70s, microarray technologies have been extensively used for biological applications such as gene expression profiling, copy number variation (CNV) or Single Nucleotide Polymorphism (SNP) detection. To analyze microarray data, numerous statistical and algorithmic techniques have been developed over the last two decades; in particular, for detecting CNV from array comparative genomic hybridization (arrayCGH) data, Hidden Markov Models (HMMs) have been successfully used. Still, due to computational reasons, the benefits of using Bayesian HMMs have been overlooked, and their use has been, at best, minimal in practice. The large demand for computational resources has also affected the analysis of high throughput sequencing (HTS) data, which, over the last few years, has started to revolutionize the field of computational biology. For example, the most sensitive tools for mapping HTS data to reference genomes are generally ignored in favor of fast, less accurate ones. In this dissertation, we strive for reduced representations of biological data which enable us to perform efficient computations on large datasets. Since biological datasets often contain repetitive, sometimes redundant, elements, it is a natural idea to identify groups of similar elements and directly perform computations on these groups. Usually, the relevant type of similarity is specific to the type of data and application at hand. Specifically, we make the following four contributions in this thesis. First, we show that, by exploiting repetition in discrete sequences, Markov Chain Monte Carlo (MCMC) simulations of Bayesian HMMs can be accelerated, which can then be applied to the DNA segmentation problem [1]. Second, in the case of Gaussian observations representing copy number ratio data, we show that, through precomputing similar, contiguous observations into blocks, MCMC for Bayesian HMMs can be well-approximated [2]. Third, by representing sequences as multi-dimensional vectors, we introduce a novel nearest-neighbor-based technique for mapping HTS data to a reference genome [3]. Finally, we present a highly efficient clustering approach for HTS data, which allows us to speed up computationally demanding, sensitive tools for mapping HTS data [4].
Advisors/Committee Members: Schliep, Alexander (chair), Chen, Kevin (internal member), Farach-Colton, Martin (internal member), Freudenberg, Jan (outside member).
Subjects/Keywords: Genomes – Analysis; Markov processes – Mathematical models; Bayesian statistical decision theory
APA (6th Edition):
Mahmud, Md Pavel (2014). Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing. (Doctoral Dissertation). Rutgers University. Retrieved from https://rucore.libraries.rutgers.edu/rutgers-lib/45336/
Chicago Manual of Style (16th Edition):
Mahmud, Md Pavel, 1981-. “Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing.” 2014. Doctoral Dissertation, Rutgers University. Accessed January 18, 2021.
https://rucore.libraries.rutgers.edu/rutgers-lib/45336/.
MLA Handbook (7th Edition):
Mahmud, Md Pavel, 1981-. “Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing.” 2014. Web. 18 Jan 2021.
Vancouver:
Mahmud, Md Pavel. Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing. [Internet] [Doctoral dissertation]. Rutgers University; 2014. [cited 2021 Jan 18].
Available from: https://rucore.libraries.rutgers.edu/rutgers-lib/45336/.
Council of Science Editors:
Mahmud, Md Pavel. Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing. [Doctoral Dissertation]. Rutgers University; 2014. Available from: https://rucore.libraries.rutgers.edu/rutgers-lib/45336/

10.
Manhart, Michael.
Biophysics and stochastic processes in molecular evolution.
Degree: PhD, Physics and Astronomy, 2014, Rutgers University
URL: https://rucore.libraries.rutgers.edu/rutgers-lib/45338/
▼ Evolution is the defining feature of living matter. It occurs most fundamentally on the scale of biomolecules such as DNA and proteins, which carry out all the processes of cells. How do the physical properties of these molecules shape the course of evolution? We address this question using a synthesis of biophysical models, theoretical tools from stochastic processes, and high-throughput data. We first review some basic features of population and evolutionary dynamics, focusing especially on fitness landscapes and how they determine accessible pathways of evolution. We then derive a universal scaling law describing time reversibility and steady state of monomorphic populations on arbitrary fitness landscapes. We use this result to study the evolution of transcription factor (TF) binding sites using high-throughput data on TF-DNA interactions and genome-wide site locations. We find that binding sites for a given TF appear to be subjected to universal selection pressures, independent of the properties of their corresponding genes, and their binding energy-dependent fitness is consistent with a simple functional form inspired by a thermodynamic model. We next consider the properties of evolutionary pathways. We develop a general approach for calculating statistical properties of the path ensemble in a stochastic process. We first demonstrate this approach on a series of simple examples, including evolution on a neutral network and two reaction rate problems. We then apply these techniques to a model of how proteins evolve new binding interactions while maintaining folding stability. In particular we show how the structural coupling of protein folding and binding results in protein traits emerging as evolutionary "spandrels": proteins can evolve strong binding interactions that confer no intrinsic fitness advantage but merely serve to stabilize the protein if misfolding is deleterious. These observations may explain the abundance of apparently nonfunctional interactions among proteins observed in high-throughput assays. When there are distinct selection pressures on both folding and binding, evolutionary paths of proteins can be tightly constrained so that folding stability is first gained and then partially lost as the new binding function is developed. This suggests the evolution of many natural proteins is highly predictable at the level of biophysical traits.
Advisors/Committee Members: Morozov, Alexandre V (chair), Sengupta, Anirvan M (internal member), Bhanot, Gyan (internal member), Andrei, Natan (internal member), Chen, Kevin (outside member).
Subjects/Keywords: Molecular evolution; Evolution (Biology); Protein folding
APA (6th Edition):
Manhart, M. (2014). Biophysics and stochastic processes in molecular evolution. (Doctoral Dissertation). Rutgers University. Retrieved from https://rucore.libraries.rutgers.edu/rutgers-lib/45338/
Chicago Manual of Style (16th Edition):
Manhart, Michael. “Biophysics and stochastic processes in molecular evolution.” 2014. Doctoral Dissertation, Rutgers University. Accessed January 18, 2021.
https://rucore.libraries.rutgers.edu/rutgers-lib/45338/.
MLA Handbook (7th Edition):
Manhart, Michael. “Biophysics and stochastic processes in molecular evolution.” 2014. Web. 18 Jan 2021.
Vancouver:
Manhart M. Biophysics and stochastic processes in molecular evolution. [Internet] [Doctoral dissertation]. Rutgers University; 2014. [cited 2021 Jan 18].
Available from: https://rucore.libraries.rutgers.edu/rutgers-lib/45338/.
Council of Science Editors:
Manhart M. Biophysics and stochastic processes in molecular evolution. [Doctoral Dissertation]. Rutgers University; 2014. Available from: https://rucore.libraries.rutgers.edu/rutgers-lib/45338/
11.
Gould, David William, 1986-.
A research summary 1) RNA binding proteins, 2) Selective constraint on copy number variation in human PIWI-interacting RNA loci.
Degree: MS, Computational Biology and Molecular Biophysics, 2013, Rutgers University
URL: http://hdl.rutgers.edu/1782.1/rucore10001600001.ETD.000068863
▼ The overall aim of my research was to understand regulatory elements in various systems (most importantly, in humans) through two research projects: 1) a study of RNA Binding Proteins in S. cerevisiae and 2) a study of piRNA in humans. My first project involved the study of RNA Binding Proteins, which are thought to play a role in post-transcriptional translation in mammals. The algorithms miReduce and PhyloGibbs were used to predict binding sites for these proteins in S. cerevisiae. The putative binding sites found with miReduce and PhyloGibbs warrant more extensive analysis, but further work needs to be done to determine the importance of secondary structure conservation inherent in many functional RNAs. The second project examined the nature of PIWI-interacting RNA (piRNA). piRNAs are small noncoding RNAs found in animals and thought to act as regulatory elements in the germline. This study in particular considers possible forces of selection on piRNA through the analysis of their copy number variation in humans. Three human populations were included in the data used: Europeans, Yorubans, and Chinese/Japanese. Results from our methods support a hypothesis of negative selection on piRNA; they were presented in a publication co-authored by Dr. Kevin Chen and myself [11].
Advisors/Committee Members: Chen, Kevin (chair), Sontag, Eduardo (internal member), Schliep, Alexander (internal member), Buyske, Steve (outside member).
Subjects/Keywords: RNA-protein interactions; Genetic regulation
APA (6th Edition):
Gould, David William (2013). A research summary 1) RNA binding proteins, 2) Selective constraint on copy number variation in human PIWI-interacting RNA loci. (Masters Thesis). Rutgers University. Retrieved from http://hdl.rutgers.edu/1782.1/rucore10001600001.ETD.000068863
Chicago Manual of Style (16th Edition):
Gould, David William, 1986-. “A research summary 1) RNA binding proteins, 2) Selective constraint on copy number variation in human PIWI-interacting RNA loci.” 2013. Masters Thesis, Rutgers University. Accessed January 18, 2021.
http://hdl.rutgers.edu/1782.1/rucore10001600001.ETD.000068863.
MLA Handbook (7th Edition):
Gould, David William, 1986-. “A research summary 1) RNA binding proteins, 2) Selective constraint on copy number variation in human PIWI-interacting RNA loci.” 2013. Web. 18 Jan 2021.
Vancouver:
Gould, David William. A research summary 1) RNA binding proteins, 2) Selective constraint on copy number variation in human PIWI-interacting RNA loci. [Internet] [Masters thesis]. Rutgers University; 2013. [cited 2021 Jan 18].
Available from: http://hdl.rutgers.edu/1782.1/rucore10001600001.ETD.000068863.
Council of Science Editors:
Gould, David William. A research summary 1) RNA binding proteins, 2) Selective constraint on copy number variation in human PIWI-interacting RNA loci. [Masters Thesis]. Rutgers University; 2013. Available from: http://hdl.rutgers.edu/1782.1/rucore10001600001.ETD.000068863