
University of Manchester
1.
Florez Vargas, Oscar Roberto.
DEVELOPMENT OF STRATEGIES FOR ASSESSING REPORTING IN
BIOMEDICAL RESEARCH: MOVING TOWARD ENHANCING
REPRODUCIBILITY.
Degree: 2016, University of Manchester
URL: http://www.manchester.ac.uk/escholar/uk-ac-man-scw:302324
The idea that the same experimental findings can be
reproduced by a variety of independent approaches is one of the
cornerstones of science’s claim to objective truth. However, in
recent years, it has become clear that science is plagued by
findings that cannot be reproduced, which in turn invalidates
research studies and undermines public trust in the research
enterprise. The observed lack of reproducibility may result,
among other things, from a lack of transparency or completeness in
reporting. In particular, omissions in reporting the technical
nature of the experimental method make it difficult to verify the
findings of experimental research in biomedicine. In this context,
the assessment of scientific reports could help to overcome – at
least in part – the ongoing reproducibility crisis. In addressing
this issue, this thesis undertakes the challenge of developing
strategies for evaluating the reporting of biomedical experimental
methods in scientific manuscripts. Considering the complexity of
experimental design, which often involves different technologies and
models, we characterise the problem of methods reporting through
domain-specific checklists. Then, using checklists as a
decision-making tool, supported by miniRECH – a spreadsheet-based
approach that can be used by authors, editors and peer reviewers – we
achieved a reasonable level of consensus on reporting assessments
regardless of the domain-specific expertise of referees. In
addition, using a text-mining system as a screening tool, we created
a framework to guide the automated assessment of the reporting of
bio-experiments. The usefulness of these strategies was
demonstrated in some domain-specific scientific areas, as well as in
mouse models across biomedical research. In conclusion, we suggest
that the strategies developed in this work could be implemented
through the publication process, both as barriers that prevent
incomplete reporting from entering the scientific literature and as
promoters of complete reporting that improve the overall value of
the scientific evidence.
Advisors/Committee Members: Brass, Andrew; Stevens, Robert.
Subjects/Keywords: Reproducibility; Checklists; Text Mining; Biomedicine
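As a loose illustration of the checklist-driven screening idea described in the abstract (it is not the thesis's miniRECH tool nor its text-mining system), the following minimal Python sketch checks a methods-section string against a hypothetical reporting checklist; the checklist items, regular expressions and example text are all invented for illustration.

import re

# Hypothetical domain-specific reporting checklist: item name -> pattern
# that suggests the item has been reported (illustrative only).
CHECKLIST = {
    "anaesthesia reported": r"anaesthe|anesthe",
    "animal strain reported": r"C57BL|BALB/c|\bstrain\b",
    "sample size reported": r"\bn\s*=\s*\d+",
    "statistical test named": r"t-test|ANOVA|chi-square",
}

def screen_methods(text):
    """Return, for each checklist item, whether a matching phrase was found."""
    return {item: bool(re.search(pattern, text, re.IGNORECASE))
            for item, pattern in CHECKLIST.items()}

if __name__ == "__main__":
    methods = ("Mice (C57BL/6, n = 12 per group) were anaesthetised with "
               "isoflurane; group differences were assessed with one-way ANOVA.")
    for item, found in screen_methods(methods).items():
        print(f"{item:28s} {'reported' if found else 'possibly missing'}")

Keyword matching of this kind can only flag possible omissions for a human referee to confirm, which is the screening role the abstract assigns to the automated tool.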

University of Manchester
2.
Sechidis, Konstantinos.
Hypothesis Testing and Feature Selection in
Semi-Supervised Data.
Degree: 2015, University of Manchester
URL: http://www.manchester.ac.uk/escholar/uk-ac-man-scw:277415
A characteristic of most real world problems is
that collecting unlabelled examples is easier and cheaper than
collecting labelled ones. As a result, learning from partially
labelled data is a crucial and demanding area of machine learning,
and extending techniques from fully to partially supervised
scenarios is a challenging problem. Our work focuses on two types
of partially labelled data that can occur in binary problems:
semi-supervised data, where the labelled set contains both positive
and negative examples, and positive-unlabelled data, a more
restricted version of partial supervision where the labelled set
consists of only positive examples. In both settings, it is very
important to explore a large number of features in order to derive
useful and interpretable information about our classification task,
and to select a subset of features that contains most of the useful
information. In this thesis, we address three fundamental and
tightly coupled questions concerning feature selection in partially
labelled data; all three relate to the highly controversial issue
of when additional unlabelled data improves performance in
partially labelled learning environments and when it does not. The
first question is: what are the properties of statistical hypothesis
testing in such data? Second, given the widespread criticism of
significance testing, what can we do in terms of effect size
estimation, that is, quantifying how strong the dependency is
between a feature X and the partially observed label Y? Finally, in
the context of feature selection, how well can features be ranked
by estimated measures, when the population values are unknown? The
answers to these questions provide a comprehensive picture of
feature selection in partially labelled data. Interesting
applications include the estimation of mutual information
quantities, structure learning in Bayesian networks, and the
investigation of how human-provided prior knowledge can overcome
the restrictions of partial labelling. One direct contribution of
our work is to enable valid statistical hypothesis testing and
estimation in positive-unlabelled data. Focusing on a generalised
likelihood ratio test and on estimating mutual information, we
provide five key contributions. (1) We prove that assuming all
unlabelled examples are negative cases is sufficient for
independence testing, but not for power analysis activities. (2) We
suggest a new methodology that compensates for this and enables power
analysis, allowing sample size determination for observing an
effect with a desired power by incorporating the user's prior knowledge
of the prevalence of positive examples. (3) We show a new
capability, supervision determination, which can determine a priori
the number of labelled examples the user must collect before being
able to observe a desired statistical effect. (4) We derive an
estimator of the mutual information in positive-unlabelled data,
and its asymptotic distribution. (5) Finally, we show how to rank
features with and without prior knowledge. Also, we derive
extensions of…
Advisors/Committee Members: Brown, Gavin; Stevens, Robert.
Subjects/Keywords: Machine Learning; Information Theory; Feature Selection; Hypothesis Testing; Semi Supervised; Positive Unlabelled; Mutual Information
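Contribution (1) above, that treating every unlabelled example as a negative is sufficient for independence testing, can be illustrated with a minimal Python sketch on simulated positive-unlabelled data; the data-generating process, prevalence and labelling rate below are arbitrary assumptions, and the snippet is not the thesis's estimator or its power-analysis methodology.

import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)

# Simulated positive-unlabelled data: the true label y is hidden and we
# only observe s = 1 for a random 30% of the positives (the labelled set).
n = 5000
x = rng.integers(0, 2, n)                              # binary feature
y = (rng.random(n) < 0.2 + 0.4 * x).astype(int)        # true label depends on x
s = ((y == 1) & (rng.random(n) < 0.3)).astype(int)     # observed labels

# Treat all unlabelled examples as negatives and run a G-test of independence
# between x and s (a generalised likelihood ratio test for a 2x2 table).
table = np.array([[np.sum((x == i) & (s == j)) for j in (0, 1)] for i in (0, 1)])
g, p, dof, _ = chi2_contingency(table, lambda_="log-likelihood")
print(f"G = {g:.2f}, p = {p:.3g}")

# Plug-in mutual information (in nats) between x and the observed label s.
print(f"I(X; S) ~ {g / (2 * n):.4f} nats")

The test should detect the dependence, but the estimated I(X; S) should understate the dependence between X and the hidden Y, which is why power analysis needs the additional prior knowledge described in contributions (2) and (3).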

University of Manchester
3.
Turner, Emily.
Predictive Variable Selection for Subgroup
Identification.
Degree: 2017, University of Manchester
URL: http://www.manchester.ac.uk/escholar/uk-ac-man-scw:312697
The problem of exploratory subgroup identification
can be broken down into three steps. The first step is to identify
predictive features, the second is to identify the interesting
regions on those features, and the third is to estimate the
properties of the subgroup region, such as subgroup size and the
predicted recovery outcome for individuals belonging to this
subgroup. While most work in this field analyses the full subgroup
identification procedure, we provide an in-depth examination of the
first step, predictive feature identification. A feature is defined
as predictive if it interacts with a treatment to affect the
recovery outcome. We compare three prominent methods for
exploratory subgroup identification: Virtual Twins (Foster et al.
2011), SIDES (Subgroup Identification based on Differential Effect
Search, Lipkovich et al. 2011) and GUIDE (Generalised, Unbiased
Interaction Detection and Estimation, Loh et al. 2015). First, we
provide a theoretical interpretation of the problem of predictive
variable selection and connect it with the three methods. We
believe that bringing different approaches under a common
analytical framework facilitates a clearer comparison of each. We
show that Virtual Twins and SIDES select interesting features in a
theoretically similar way, so that the essential difference between
the two is in the way in which this selection mechanism is
implemented in their respective subgroup identification procedures.
Second, we undertake an experimental analysis of the three. In
order to do this, we apply each method to return a predictive
variable importance measure (PVIM), which we use to rank features
in order of their predictiveness. We then evaluate and compare how
well each method performs at this task. Although each of Virtual
Twins, SIDES and GUIDE either outputs a PVIM or requires minor
adaptations to do so, their strengths and weaknesses as PVIMs had
not been explored prior to this work. We argue that a variable
ranking approach is a particularly good solution to the problem of
subgroup identification. Because clinical trials often lack the
power to identify predictive features with statistical
significance, predictive variable scoring and ranking may be more
appropriate than a full subgroup identification procedure. PVIMs
enable a clinician to visualise the relative importance of each
feature in a straightforward manner and to use clinical expertise
to scrutinise the findings of the algorithm. Our conclusions are
that Virtual Twins performs best in terms of predictive feature
selection, outperforming SIDES and GUIDE on every type of data set.
However, it appears to have weaknesses in distinguishing between
predictive and prognostic biomarkers. Finally, we note that there
is a need to provide common data sets on which new methods can be
evaluated. We show that there is a tendency towards testing new
subgroup identification methods on data sets that demonstrate the
strengths of the algorithm and hide its
weaknesses.
Advisors/Committee Members: Brown, Gavin; Stevens, Robert.
Subjects/Keywords: Subgroup identification; Interaction detection; Recursive partitioning
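To give a flavour of a predictive variable importance measure, here is a simplified, Virtual-Twins-style Python sketch (it is not the exact procedure of Foster et al. 2011, and the simulated trial, model choices and parameters are assumptions for illustration): separate outcome models are fitted to the treated and control arms, individual treatment effects are estimated as the difference of the two predictions, and features are then ranked by their importance in predicting those effects.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Simulated randomised trial: feature 0 modifies the treatment effect
# (predictive), feature 1 only shifts the outcome (prognostic).
n, p = 2000, 5
X = rng.normal(size=(n, p))
treat = rng.integers(0, 2, n)
y = 0.5 * X[:, 1] + treat * (X[:, 0] > 0) + rng.normal(scale=0.5, size=n)

# Step 1: model the outcome separately in each arm and estimate each
# individual's treatment effect as the difference of the two predictions.
f1 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[treat == 1], y[treat == 1])
f0 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[treat == 0], y[treat == 0])
z = f1.predict(X) - f0.predict(X)

# Step 2: regress the estimated effects on the features; the importances of
# this second model act as a predictive variable importance measure (PVIM).
pvim = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, z).feature_importances_
for j, score in enumerate(pvim):
    print(f"feature {j}: importance {score:.3f}")

With this setup, feature 0 should rank highest; how much importance leaks onto the purely prognostic feature 1 illustrates the difficulty, noted in the abstract, of separating predictive from prognostic biomarkers.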