You searched for +publisher:"University of North Carolina" +contributor:("Fant, Andrew"). One record found.

University of North Carolina

1. Fant, Andrew. The Effect of Data Curation on the Accuracy of Quantitative Structure-Activity Relationship Models.

Degree: 2015, University of North Carolina

In the 33 years since the first public release of GenBank, and the 15 years since the publication of the first pilot assembly of the human genome, drug discovery has been awash in a tsunami of data. But it has only been within the past decade that medicinal chemists and chemical biologists have had access to the same sorts of large-scale, public-access databases as bioinformaticians and molecular biologists have had for so long. The release of this data has sparked a renewed interest in computational methods for rational drug design, but questions have arisen recently about the accuracy and quality of this data. The same question has arisen in other scientific disciplines, but it has a particular urgency to practitioners of Quantitative Structure-Activity Relationship (QSAR) modeling. By its nature QSAR modeling depends on both activity data and chemical structures. While activities are usually expressed as numerical scalar values, a form ubiquitous throughout the sciences, chemical structures (especially those that must be interpretable as such by computer software) are stored in a variety of specialized formats which are much less common and mostly ignored outside of cheminformatics and related fields. While previous research has determined that a 5% error rate in data being used for modeling can cause a QSAR model to be non-predictive and useless for its intended purpose, and workflows have been proposed which reduce the effect of inconsistent chemical structure representations on model accuracy, a fundamental question remains: “how accurate are the structure and activity data freely available to researchers?” To this end, we have undertaken two surveys of data quality, one focusing on chemical structure information in Internet resources and a second examining the uncertainty associated with compounds reported in the medicinal chemistry literature as abstracted in ChEMBL.
The results of these studies have informed the creation of an improved workflow for the curation of structure-activity data which is intended to identify problematic data points in raw data extracted from databases so that an expert human curator can examine the underlying literature and resolve discrepancies between reported values. This workflow was in turn applied to the creation of two QSAR models that were used to implement a virtual screen seeking molecules capable of binding to both the serotonergic reuptake transporter and the alpha2a adrenergic receptor. While no suitable compounds were identified in the initial screening process, regions of chemical space that may yield truly novel alpha2a receptor ligands have been identified. These regions can be targeted in future efforts. Basing data curation workflows on manual processes by human curators is not particularly viable, as humans have a tendency to introduce errors by inattention even as they identify and repair other problems. Computers cannot effectively curate data either. While they are highly accurate when programmed properly, they lack human creativity and insight that would allow…

Advisors/Committee Members: Fant, Andrew; Tropsha, Alexander; Elston, Timothy; Lee, Andrew; Rusyn, Ivan; Singleton, Scott.

Subjects/Keywords: Pharmaceutical chemistry; Chemistry; Pharmacology; Eshelman School of Pharmacy; Division of Chemical Biology and Medicinal Chemistry

APA (6th Edition):

Fant, A. (2015). The Effect of Data Curation on the Accuracy of Quantitative Structure-Activity Relationship Models. (Thesis). University of North Carolina. Retrieved from https://cdr.lib.unc.edu/record/uuid:fc71a688-33be-402d-a99d-3d1c75efe86d

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Chicago Manual of Style (16th Edition):

Fant, Andrew. “The Effect of Data Curation on the Accuracy of Quantitative Structure-Activity Relationship Models.” 2015. Thesis, University of North Carolina. Accessed November 24, 2020. https://cdr.lib.unc.edu/record/uuid:fc71a688-33be-402d-a99d-3d1c75efe86d.

MLA Handbook (7th Edition):

Fant, Andrew. “The Effect of Data Curation on the Accuracy of Quantitative Structure-Activity Relationship Models.” 2015. Web. 24 Nov 2020.

Vancouver:

Fant A. The Effect of Data Curation on the Accuracy of Quantitative Structure-Activity Relationship Models. [Internet] [Thesis]. University of North Carolina; 2015. [cited 2020 Nov 24]. Available from: https://cdr.lib.unc.edu/record/uuid:fc71a688-33be-402d-a99d-3d1c75efe86d.

Council of Science Editors:

Fant A. The Effect of Data Curation on the Accuracy of Quantitative Structure-Activity Relationship Models. [Thesis]. University of North Carolina; 2015. Available from: https://cdr.lib.unc.edu/record/uuid:fc71a688-33be-402d-a99d-3d1c75efe86d

