You searched for subject:(multi mappable reads). Showing records 1 – 2 of 2 total matches.


Texas A&M University

1. Gujjula, Krishna Reddy. Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline.

Degree: PhD, Industrial Engineering, 2018, Texas A&M University

The research in this dissertation develops a novel methodology for ChIP-Seq dataset analysis. Despite its advances, the standard ChIP-Seq data analysis pipeline, i.e., read mapping followed by peak calling, has the following shortcomings: 1. The majority of a ChIP-Seq dataset consists of background reads, so unnecessary computational effort is spent on mapping reads that play no role in forming the true peaks. 2. Unnecessary computational effort is spent on aligning control reads that do not map to ChIP-enriched genomic regions. 3. Multi-mappable reads are often discarded during read mapping, which reduces the power to identify peaks in repeat elements of the genome. We present Map2Peak, a novel tool aimed at mitigating these drawbacks. Map2Peak takes unmapped ChIP-Seq and control reads as input and outputs peaks at twice the speed of the standard workflow. Map2Peak intertwines partial read mapping and peak calling in a five-phase algorithm. It models the fragment count information obtained during the early stages of ChIP read mapping (Phase 1) as a 2-component Poisson mixture model, and then applies the expectation-maximization algorithm to identify ChIP-enriched regions (Phase 2). The remaining ChIP reads and the majority of control reads are then restricted to map exactly, and only, to the much shorter pseudo-genome composed of the ChIP-enriched regions (Phases 3 and 4). The mapping information is then used to call peaks on the pseudo-genome (Phase 5). Our results show that the peaks called by Map2Peak encompass most of the peaks called by the standard workflow (88%-96%) as well as some novel motif-justifiable peaks that the standard workflow does not detect, and that the majority (90%) of the background reads are discarded. Moreover, Map2Peak implicitly resolves the alignment location for some of the multi-mappable reads, which increases the power to call peaks in repeat elements of the genome.
Map2Peak provides researchers with an ultrafast peak caller that uses the whole ChIP-Seq dataset, without discarding multi-mappable reads, to identify peaks, and that uses control datasets efficiently for peak calling. “Map2Peak” is available at https://kianfar.engr.tamu.edu/map2peak/.

Advisors/Committee Members: Kianfar, Kiavash (advisor), Ding, Yu (committee member), Butenko, Sergiy (committee member), Yu, Peng (committee member).

Subjects/Keywords: ChIP-Seq; Peak-calling; E-M algorithm; Read mapping; Poisson mixture model; multi-mappable reads
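
The abstract's Phase 2 (a 2-component Poisson mixture fitted by expectation-maximization) can be illustrated with a minimal sketch. This is not Map2Peak's implementation: the per-bin fragment counts, the function names, the initialization, and the 0.5 posterior cutoff are all assumptions chosen for illustration, and the record does not say how the tool bins or thresholds its data.

```python
import math


def poisson_logpmf(k, lam):
    """Log of the Poisson probability mass at count k for rate lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def posterior_enriched(k, pi_en, lam_bg, lam_en):
    """Posterior probability that a bin with count k is in the enriched component."""
    log_bg = math.log(1.0 - pi_en) + poisson_logpmf(k, lam_bg)
    log_en = math.log(pi_en) + poisson_logpmf(k, lam_en)
    m = max(log_bg, log_en)  # subtract the max for numerical stability
    return math.exp(log_en - m) / (math.exp(log_bg - m) + math.exp(log_en - m))


def fit_poisson_mixture(counts, n_iter=500, tol=1e-10):
    """EM for a background/enriched Poisson mixture (hypothetical helper).

    Returns (pi_en, lam_bg, lam_en): enriched mixing weight and the two rates.
    """
    n = len(counts)
    mean = sum(counts) / n
    # Assumed initialization: rates placed on either side of the overall mean.
    lam_bg = max(0.5 * mean, 1e-3)
    lam_en = max(2.0 * mean, 1e-2)
    pi_en = 0.5
    for _ in range(n_iter):
        # E-step: responsibility of the enriched component for each bin.
        resp = [posterior_enriched(k, pi_en, lam_bg, lam_en) for k in counts]
        # M-step: weighted updates of the mixing weight and both rates.
        w_en = sum(resp)
        w_bg = n - w_en
        new_pi = min(max(w_en / n, 1e-6), 1.0 - 1e-6)
        new_en = max(sum(r * k for r, k in zip(resp, counts)) / max(w_en, 1e-12), 1e-6)
        new_bg = max(sum((1.0 - r) * k for r, k in zip(resp, counts)) / max(w_bg, 1e-12), 1e-6)
        shift = abs(new_pi - pi_en) + abs(new_en - lam_en) + abs(new_bg - lam_bg)
        pi_en, lam_bg, lam_en = new_pi, new_bg, new_en
        if shift < tol:
            break
    # Keep the labels consistent: the enriched component has the larger rate.
    if lam_bg > lam_en:
        lam_bg, lam_en, pi_en = lam_en, lam_bg, 1.0 - pi_en
    return pi_en, lam_bg, lam_en


# Made-up counts: 50 background-like bins followed by 5 high-coverage bins.
counts = [1, 0, 2, 1, 0, 1, 3, 2, 1, 0] * 5 + [25, 30, 28, 22, 27]
pi_en, lam_bg, lam_en = fit_poisson_mixture(counts)
enriched = [i for i, k in enumerate(counts)
            if posterior_enriched(k, pi_en, lam_bg, lam_en) > 0.5]
```

In this toy setting the bins flagged in `enriched` would play the role of the ChIP-enriched regions that make up the pseudo-genome used in the later phases.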


APA (6th Edition):

Gujjula, K. R. (2018). Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline. (Doctoral Dissertation). Texas A&M University. Retrieved from http://hdl.handle.net/1969.1/173796

Chicago Manual of Style (16th Edition):

Gujjula, Krishna Reddy. “Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline.” 2018. Doctoral Dissertation, Texas A&M University. Accessed November 22, 2019. http://hdl.handle.net/1969.1/173796.

MLA Handbook (7th Edition):

Gujjula, Krishna Reddy. “Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline.” 2018. Web. 22 Nov 2019.

Vancouver:

Gujjula KR. Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline. [Internet] [Doctoral dissertation]. Texas A&M University; 2018. [cited 2019 Nov 22]. Available from: http://hdl.handle.net/1969.1/173796.

Council of Science Editors:

Gujjula KR. Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline. [Doctoral Dissertation]. Texas A&M University; 2018. Available from: http://hdl.handle.net/1969.1/173796


Texas A&M University

2. Gujjula, Krishna Reddy. Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline.

Degree: PhD, Industrial Engineering, 2018, Texas A&M University


Subjects/Keywords: ChIP-Seq; Peak-calling; E-M algorithm; Read mapping; Poisson mixture model; multi-mappable reads


APA (6th Edition):

Gujjula, K. R. (2018). Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline. (Doctoral Dissertation). Texas A&M University. Retrieved from http://hdl.handle.net/1969.1/173701

Chicago Manual of Style (16th Edition):

Gujjula, Krishna Reddy. “Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline.” 2018. Doctoral Dissertation, Texas A&M University. Accessed November 22, 2019. http://hdl.handle.net/1969.1/173701.

MLA Handbook (7th Edition):

Gujjula, Krishna Reddy. “Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline.” 2018. Web. 22 Nov 2019.

Vancouver:

Gujjula KR. Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline. [Internet] [Doctoral dissertation]. Texas A&M University; 2018. [cited 2019 Nov 22]. Available from: http://hdl.handle.net/1969.1/173701.

Council of Science Editors:

Gujjula KR. Map2Peak: A Novel Perspective on ChIP-Seq Data Analysis Pipeline. [Doctoral Dissertation]. Texas A&M University; 2018. Available from: http://hdl.handle.net/1969.1/173701
