You searched for +publisher:"Rutgers University" +contributor:("Freudenberg, Jan"). One record found.

Rutgers University

1. Mahmud, Md Pavel, 1981-. Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing.

Degree: PhD, Computer Science, 2014, Rutgers University

Since the start of the genomics era in the ’70s, microarray technologies have been used extensively for biological applications such as gene expression profiling and the detection of copy number variation (CNV) or Single Nucleotide Polymorphisms (SNPs). To analyze microarray data, numerous statistical and algorithmic techniques have been developed over the last two decades; in particular, Hidden Markov Models (HMMs) have been used successfully for detecting CNV from array comparative genomic hybridization (arrayCGH) data. Still, for computational reasons, the benefits of Bayesian HMMs have been overlooked, and their use in practice has been minimal at best. The large demand for computational resources has also affected the analysis of high-throughput sequencing (HTS) data, which has started to revolutionize the field of computational biology over the last few years. For example, the most sensitive tools for mapping HTS data to reference genomes are generally ignored in favor of fast, less accurate ones. In this dissertation, we strive for reduced representations of biological data that enable efficient computations on large datasets. Since biological datasets often contain repetitive, sometimes redundant, elements, it is a natural idea to identify groups of similar elements and perform computations directly on these groups. Usually, the relevant type of similarity is specific to the type of data and the application at hand. Specifically, we make the following four contributions in this thesis. First, we show that, by exploiting repetition in discrete sequences, Markov Chain Monte Carlo (MCMC) simulations of Bayesian HMMs can be accelerated and then applied to the DNA segmentation problem [1]. Second, in the case of Gaussian observations representing copy number ratio data, we show that, by precomputing blocks of similar, contiguous observations, MCMC for Bayesian HMMs can be well approximated [2]. Third, by representing sequences as multi-dimensional vectors, we introduce a novel nearest-neighbor-based technique for mapping HTS data to a reference genome [3]. Finally, we present a highly efficient clustering approach for HTS data, which allows us to speed up computationally demanding, sensitive tools for mapping HTS data [4].
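The second contribution, grouping similar, contiguous observations into blocks, can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the greedy merging rule, the `tolerance` parameter, and the names `Block` and `compress_into_blocks` are all assumptions made for illustration. It shows only the general idea of replacing runs of similar values with per-block summary statistics (count, sum, sum of squares), which are the quantities a Gaussian likelihood needs.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Block:
    """Summary of a run of contiguous, similar observations (hypothetical type)."""
    start: int        # index of the first observation in the block
    count: int        # number of observations in the block
    total: float      # sum of the observations
    total_sq: float   # sum of squared observations

    @property
    def mean(self) -> float:
        return self.total / self.count


def compress_into_blocks(observations: Sequence[float], tolerance: float) -> List[Block]:
    """Greedily merge each observation into the current block if it lies
    within `tolerance` of that block's running mean; otherwise start a new block.

    Assumption: this simple rule stands in for whatever similarity
    criterion a real method would use.
    """
    blocks: List[Block] = []
    for i, x in enumerate(observations):
        if blocks and abs(x - blocks[-1].mean) <= tolerance:
            current = blocks[-1]
            current.count += 1
            current.total += x
            current.total_sq += x * x
        else:
            blocks.append(Block(start=i, count=1, total=x, total_sq=x * x))
    return blocks


if __name__ == "__main__":
    # Toy copy-number-ratio-like signal with two level changes.
    signal = [0.0, 0.1, 0.05, 1.0, 1.1, 0.0]
    for block in compress_into_blocks(signal, tolerance=0.2):
        print(block.start, block.count, block.mean)
```

Under this sketch, a sampler would iterate over blocks rather than individual observations, so its cost per sweep would scale with the number of blocks instead of the sequence length.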

Advisors/Committee Members: Schliep, Alexander (chair), Chen, Kevin (internal member), Farach-Colton, Martin (internal member), Freudenberg, Jan (outside member).

Subjects/Keywords: Genomes – Analysis; Markov processes – Mathematical models; Bayesian statistical decision theory

APA (6th Edition):

Mahmud, Md Pavel. (2014). Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing. (Doctoral Dissertation). Rutgers University. Retrieved from https://rucore.libraries.rutgers.edu/rutgers-lib/45336/

Chicago Manual of Style (16th Edition):

Mahmud, Md Pavel, 1981-. “Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing.” 2014. Doctoral Dissertation, Rutgers University. Accessed November 29, 2020. https://rucore.libraries.rutgers.edu/rutgers-lib/45336/.

MLA Handbook (7th Edition):

Mahmud, Md Pavel, 1981-. “Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing.” 2014. Web. 29 Nov 2020.

Vancouver:

Mahmud, Md Pavel. Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing. [Internet] [Doctoral dissertation]. Rutgers University; 2014. [cited 2020 Nov 29]. Available from: https://rucore.libraries.rutgers.edu/rutgers-lib/45336/.

Council of Science Editors:

Mahmud, Md Pavel. Reduced representations for efficient analysis of genomic data: from microarray to high-throughput sequencing. [Doctoral Dissertation]. Rutgers University; 2014. Available from: https://rucore.libraries.rutgers.edu/rutgers-lib/45336/.