You searched for id:"handle:11375/22018". One record found.

McMaster University

1. Islam, A S M Sohidull. Repeats in Strings and Application in Bioinformatics.

Degree: PhD, 2017, McMaster University

A string is a sequence of symbols, usually called letters, drawn from some alphabet. It is one of the most fundamental and important structures in computing, bioinformatics and mathematics. Computer files, the contents of a computer memory, and network and satellite signals are all instances of strings. The genome of every living thing can be represented by a string drawn from the alphabet {a, c, g, t}. Algorithms that process strings have a wide range of applications, such as information retrieval, search engines, data compression, cryptography and bioinformatics.

In a DNA sequence the indeterminate symbol {a, c} is used when it is unclear whether a given nucleotide is a or c. We could then say that {a, c} matches another symbol {c, g}, which in turn matches {g, t}, but {a, c} certainly does not match {g, t}. The processing of indeterminate strings is much more difficult because of this nontransitivity of matching. Thus a combinatorial understanding of indeterminate strings becomes essential to the development of efficient methods for their processing. With indeterminate strings, as with ordinary ones, the main task is the recognition/computation of patterns called regularities. We are particularly interested in regularities called repeats, whether tandem (such as acgacg) or nontandem (such as acgtacg).

In this thesis we focus on newly-discovered regularities in strings, especially the enhanced cover array and the Lyndon array, with attention paid to extending the computations to indeterminate strings. Much of this work is necessarily abstract in nature, because the intention is to produce results that are applicable over a wide range of application areas.

We will focus on finding algorithms to construct different data structures to represent strings, such as cover arrays and Lyndon arrays. The idea of a cover comes from strings which are not truly periodic but "almost" periodic in nature. For example, abaababa is covered by aba but is not periodic.
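
As an illustration only (this is not code from the thesis), the nontransitive matching of indeterminate symbols and the cover example above can be sketched in a few lines of Python. The function names `matches`, `is_cover` and `smallest_period` are hypothetical, and the sketch assumes the common convention that a string is "periodic" when its smallest period is at most half its length.

```python
def matches(x, y):
    """Two indeterminate symbols match when their letter sets share a letter."""
    return bool(set(x) & set(y))


def is_cover(u, s):
    """True if occurrences of u (overlaps allowed) cover every position of s."""
    if not u or len(u) > len(s):
        return False
    covered_up_to = 0  # positions [0, covered_up_to) are covered so far
    for i in range(len(s) - len(u) + 1):
        if i > covered_up_to:
            return False  # position covered_up_to can no longer be covered
        if s.startswith(u, i):
            covered_up_to = i + len(u)
    return covered_up_to == len(s)


def smallest_period(s):
    """Smallest p >= 1 with s[i] == s[i + p] for every valid i (naive scan)."""
    for p in range(1, len(s) + 1):
        if all(s[i] == s[i + p] for i in range(len(s) - p)):
            return p
    return 0  # only reached for the empty string


# Matching is not transitive: {a,c}~{c,g} and {c,g}~{g,t}, but not {a,c}~{g,t}.
print(matches("ac", "cg"), matches("cg", "gt"), matches("ac", "gt"))

# aba covers abaababa, yet the smallest period (5) exceeds half the length (4),
# so under the assumed convention the string is not periodic.
print(is_cover("aba", "abaababa"), smallest_period("abaababa"))
```

These are naive checks chosen for readability; they say nothing about the efficient algorithms the thesis develops.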
Similarly, the Lyndon array describes the string in another unique way and is used in many fields of string algorithms. These data structures will help us in the field of string processing. As one application of these data structures we will work on "Reverse Engineering"; that is, given data structures derived from a string, how can we get the string back? Since DNA, RNA and peptide sequences are effectively "strings" with unique properties, we will adapt our algorithms for regular or indeterminate strings to these sequences. Sequence analysis can be used to assign function to genes and proteins by observing the similarities between the compared sequences. Identifying unusual repetitive patterns will aid in the identification of intrinsic features of the sequence such as active sites, gene structures and regulatory elements. As an application of periodic strings we investigate microsatellites, which are short repetitive DNA patterns where repeated substrings are of length 2 to 5. Microsatellites are used in a wide range of studies due to their small size and…

Advisors/Committee Members: Smyth, William F; Golding, Brian; Computational Engineering and Science.

Subjects/Keywords: Repeats; String; Bioinformatics; Algorithm
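
For readers unfamiliar with two notions named in the abstract, the following Python sketch may help. It is not taken from the thesis. It uses the standard definition of the Lyndon array (entry i is the length of the longest Lyndon word starting at position i) and a simple regular-expression scan for tandem repeats with units of length 2 to 5. The names `is_lyndon`, `lyndon_array` and `find_microsatellites`, and the `min_copies` threshold, are assumptions made for illustration.

```python
import re


def is_lyndon(w):
    """A nonempty word is Lyndon if it is strictly smaller than every proper suffix."""
    return all(w < w[k:] for k in range(1, len(w)))


def lyndon_array(s):
    """Naive Lyndon array: longest Lyndon word starting at each position."""
    n = len(s)
    return [
        max(l for l in range(1, n - i + 1) if is_lyndon(s[i:i + l]))
        for i in range(n)
    ]


def find_microsatellites(dna, min_copies=3):
    """Yield (start, unit, copies) for tandem repeats with unit length 2 to 5.

    min_copies is a hypothetical threshold, not a value given in the abstract.
    """
    # Lazy {2,5}? tries the shortest unit first; \1{k,} demands further copies.
    pattern = re.compile(r"([acgt]{2,5}?)\1{%d,}" % (min_copies - 1))
    for m in pattern.finditer(dna):
        unit = m.group(1)
        yield m.start(), unit, len(m.group(0)) // len(unit)


print(lyndon_array("acgacg"))
print(list(find_microsatellites("ttacacacgg")))
```

The Lyndon array computation here is cubic-time and meant only to make the definition concrete. The repeat scan also reports single-letter runs such as aaaaaa with unit aa, which a real microsatellite tool would likely treat separately.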


APA (6th Edition):

Islam, A. S. M. S. (2017). Repeats in Strings and Application in Bioinformatics. (Doctoral Dissertation). McMaster University. Retrieved from

Chicago Manual of Style (16th Edition):

Islam, A S M Sohidull. “Repeats in Strings and Application in Bioinformatics.” 2017. Doctoral Dissertation, McMaster University. Accessed July 16, 2018.

MLA Handbook (7th Edition):

Islam, A S M Sohidull. “Repeats in Strings and Application in Bioinformatics.” 2017. Web. 16 Jul 2018.

Vancouver:

Islam ASMS. Repeats in Strings and Application in Bioinformatics. [Internet] [Doctoral dissertation]. McMaster University; 2017. [cited 2018 Jul 16]. Available from:

Council of Science Editors:

Islam ASMS. Repeats in Strings and Application in Bioinformatics. [Doctoral Dissertation]. McMaster University; 2017. Available from: