You searched for subject:(Data analysis).
Showing records 1 – 30 of 5821 total matches.
1.
Dhungana, Prakash.
Application for quick reduction of GPS data for urban mobility analysis.
Degree: 2014, Escola Superior Tecnologia e Gestão de Oliveira do Hospital
URL: https://www.rcaap.pt/detail.jsp?id=oai:comum.rcaap.pt:10400.26/17521
A data reduction application has been designed, developed and deployed as an assistive tool for students and researchers. Numerous studies were carried out for this project. During the process, several modifications were made to the architectural design, development and implementation. Ideas for the further development of the application have also been explored. Primarily, open source software for data reduction has been developed to suit the needs of data analysis enthusiasts. Requirements for the project were drawn from a thematic analysis carried out on the data collected via the statement of arts and supervisors. Six evaluation sessions were carried out with various sectors of data reduction techniques to establish the acceptance, suitability, usefulness and efficiency of the product. The results were outstanding and indicate strong prospects of the product for the targeted arena.
info:eu-repo/semantics/acceptedVersion
Subjects/Keywords: Data analysis; Data reduction

University of Illinois – Urbana-Champaign
2.
Xu, Liqi.
New capabilities for large-scale exploratory data analysis.
Degree: PhD, Computer Science, 2020, University of Illinois – Urbana-Champaign
URL: http://hdl.handle.net/2142/107971
The ever-rising diversity of data generated, manipulated, and analyzed every day engenders a variety of data formats, ranging from one fixed dataset to multiple versions of a dataset stored across multiple data sources. This variety of formats has led to substantial challenges in data exploration. Existing systems do not effectively support querying capabilities across these formats: (i) Browsing: When exploring a single dataset, data scientists often need to examine a collection of records that satisfy arbitrary predicates. However, current exploratory data analysis tools mainly focus on visual summarization over browsing. (ii) Versioning: With the proliferation of dataset versions generated during different stages of exploration, exploratory data analysis is no longer just about exploring one static dataset. Instead, data scientists need to keep track of massive numbers of versions, as well as search for versions with specific criteria. (iii) Integrating: Nowadays, datasets are collected and stored at multiple sources (e.g., as part of the IoT). When exploring data, data scientists often need to query and join data across databases at disparate locations.
In this dissertation, we propose systems that enable query capabilities to efficiently and effectively fulfill these new demands in data exploration. (i) For browsing, we develop NEEDLETAIL, a data exploration engine that employs a light-weight indexing structure along with efficient algorithms to retrieve any-k valid records for arbitrary queries as quickly as possible. (ii) For versioning, we implement and open-source ORPHEUSDB, a dataset version control system that can efficiently track and query across dataset versions. Since versioning queries in ORPHEUSDB take advantage of array operators in relational database systems, we also conduct an extensive experimental study on understanding array implementations in modern database systems. (iii) For integrating, we leverage machine learning techniques to optimize federated query processing and eventually improve the interactivity of data exploration across disparate databases.
Advisors/Committee Members: Parameswaran, Aditya (advisor), Parameswaran, Aditya (Committee Chair), Zhai, ChengXiang (committee member), Tao, Xie (committee member), Cole, Richard L. (committee member).
Subjects/Keywords: Exploratory Data Analysis; Data Management

Deakin University
3.
Nguyen, Vu.
Bayesian nonparametric multilevel modelling and applications.
Degree: School of Information Technology, 2015, Deakin University
URL: http://hdl.handle.net/10536/DRO/DU:30079715
Our research aims at contributing to the multilevel modeling in data analytics. We address the task of multilevel clustering, multilevel regression, and classification. We provide state of the art solution for the critical problem.
Advisors/Committee Members: Phung, Dinh, Venkatesh, Svetha.
Subjects/Keywords: data analysis; regression analysis; multilevel data

University of Ottawa
4.
Nhogue Wabo, Blanche Nadege.
Hedge Funds and Survival Analysis.
Degree: 2013, University of Ottawa
URL: http://hdl.handle.net/10393/26257
Using data from Hedge Fund Research, Inc. (HFR), this study adapts and expands on existing methods in survival analysis in an attempt to investigate whether hedge fund mortality can be predicted on the basis of certain hedge fund characteristics. The main idea is to determine the characteristics which contribute the most to the survival and failure probabilities of hedge funds and interpret them. We establish hazard models with time-independent covariates, as well as time-varying covariates, to interpret the selected hedge fund characteristics. Our results show that size, age, performance, strategy, annual audit, fund offshore and fund denomination are the characteristics that best explain hedge fund failure. We find that a 1% increase in performance decreases the hazard by 3.3%, that small hedge funds and those less than 5 years old are the most likely to die, and that the event-driven strategy is the best to use as compared to others. The risk of death is 0.668 times lower for funds that indicated that an annual audit is performed as compared to funds that did not indicate this. The risk of death for offshore hedge funds is 1.059 times higher than for non-offshore hedge funds.
Subjects/Keywords: Survival Analysis; HedgeFunds; Data analysis

UCLA
5.
Wu, Xiaoxu.
An Informative and Predictive Analysis of the San Francisco Police Department Crime Data.
Degree: Statistics, 2016, UCLA
URL: http://www.escholarship.org/uc/item/9113p8tw
It is the responsibility of the San Francisco Police Department to protect the local community from various crimes and to improve the local security environment. With the development of modern statistical tools, we can learn from past data and give suggestions for future strategy. In this thesis, we study the San Francisco Police Department crime dataset from 01/01/2013 through 05/13/2015. We examine the timing and location of different crimes in an informative analysis, and propose visualization methods for related features. We also discuss possibilities of predicting the crime categories given time and location data using the k-nearest-neighbor model and the logistic regression model.
Subjects/Keywords: Statistics; Data analysis

University of Central Florida
6.
Angelopoulou, Anastasia.
A Simulation-Based Task Analysis using Agent-Based, Discrete Event and System Dynamics Simulation.
Degree: 2015, University of Central Florida
URL: https://stars.library.ucf.edu/etd/5146
Recent advances in technology have increased the need for using simulation models to analyze tasks and obtain human performance data. A variety of task analysis approaches and tools have been proposed and developed over the years. Over 100 task analysis methods have been reported in the literature. However, most of the developed methods and tools allow for representation of the static aspects of the tasks performed by expert system-driven human operators, neglecting aspects of the work environment, i.e. physical layout, and dynamic aspects of the task. The use of simulation can help face the new challenges in the field of task analysis as it allows for simulation of the dynamic aspects of the tasks, the humans performing them, and their locations in the environment. Modeling and/or simulation task analysis tools and techniques have been proven to be effective in task analysis, workload, and human reliability assessment. However, most of the existing task analysis simulation models and tools lack features that allow for consideration of errors, workload, level of operator's expertise and skills, among others. In addition, the current task analysis simulation tools require basic training on the tool to allow for modeling the flow of task analysis process and/or error and workload assessment. The modeling process is usually achieved using drag and drop functionality and, in some cases, programming skills. This research focuses on automating the modeling process and simulating individuals (or groups of individuals) performing tasks in a dynamic work environment in any domain. The main objective of this research is to develop a universal tool that allows for modeling and simulation of task analysis models in a short amount of time with limited need for training or knowledge of modeling and simulation theory. A Universal Task Analysis Simulation Modeling (UTASiMo) tool can be used for automatically generating simulation models that analyze the tasks performed by human operators. UTASiMo is a multi-method modeling and simulation tool developed as a combination of agent-based, discrete event, and system dynamics simulation models. A generic multi-method modeling and simulation framework, named 3M&S Framework, as well as the Unified Modeling Language have been used for the design of the conceptual model and the implementation of the simulation tool. UTASiMo-generated models are dynamically created during run-time based on user inputs. The simulation results include estimations of operator workload, task completion time, and probability of human errors based on human operator variability and task structure.
Advisors/Committee Members: Karwowski, Waldemar.
Subjects/Keywords: Categorical Data Analysis
APA (6th Edition):
Angelopoulou, A. (2015). A Simulation-Based Task Analysis using Agent-Based, Discrete Event and System Dynamics Simulation. (Doctoral Dissertation). University of Central Florida. Retrieved from https://stars.library.ucf.edu/etd/5146

NSYSU
7.
Chen, Yu-rong.
Effectively Aggregating Big Data for Visualization.
Degree: Master, Information Management, 2016, NSYSU
URL: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0411116-090441
With the fast development of internet technologies, data is easily generated and collected. Such data can be useful, depending on how enterprises or individuals derive valuable information from it. Before doing more complex analysis, analysts need to understand the data, preferably visually, which leads to the approach of Exploratory Data Analysis (EDA). With EDA, analysts can dig out the patterns or characteristics of the data and then choose an appropriate model for further analysis. The common techniques of EDA include graphing, tabulation, and equation fitting, which help analysts explore the data and identify its regularity. Unfortunately, when the volume of data is huge, traditional EDA methods may suffer from a lack of efficiency.
Our work uses R, chosen for its data exploration features and rich package libraries, to develop EDA software that tries to visualize big data efficiently. By applying data reduction strategies, large volumes of data can be reduced to a meaningful data set with lower complexity and smaller size. Specifically, we apply the strategy of binning to develop data reduction methods. Equal-width is the most common binning method for aggregating continuous variables. Although equal-width has high efficiency, it performs poorly on skewed data distributions. In this thesis, we compare three aggregation approaches, equal-width, equal-depth and MHist, by assessing their time efficiency and accuracy.
Experimental results show that both equal-depth and MHist have much higher accuracy than equal-width, at some cost in efficiency. The MHist method performs well on various data distributions but has the lowest efficiency. The equal-depth method strikes a balance, with reasonable performance in both efficiency and accuracy.
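The contrast between equal-width and equal-depth binning described in this abstract can be illustrated with a minimal sketch. It is written in Python with NumPy rather than the R used in the thesis, it is not the author's implementation, and the function names and the synthetic exponential sample are assumptions made for illustration. MHist is not covered.

```python
import numpy as np

def equal_width_edges(values, n_bins):
    """Bin edges that split the value range into n_bins intervals of equal width."""
    values = np.asarray(values, dtype=float)
    return np.linspace(values.min(), values.max(), n_bins + 1)

def equal_depth_edges(values, n_bins):
    """Bin edges at evenly spaced quantiles, so each bin holds about the same number of points."""
    values = np.asarray(values, dtype=float)
    return np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))

def bin_counts(values, edges):
    """Number of points that fall into each bin defined by the given edges."""
    counts, _ = np.histogram(values, bins=edges)
    return counts

# Hypothetical right-skewed sample (not data from the thesis).
rng = np.random.default_rng(0)
skewed = rng.exponential(scale=1.0, size=10_000)

width_counts = bin_counts(skewed, equal_width_edges(skewed, 10))
depth_counts = bin_counts(skewed, equal_depth_edges(skewed, 10))

print("equal-width counts:", width_counts)
print("equal-depth counts:", depth_counts)
```

On a skewed sample like this one, equal-width bins pile most points into the first few intervals, while equal-depth bins keep the counts roughly even. The quantile computation is also where equal-depth spends extra time compared with equal-width.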
Advisors/Committee Members: Yi-ling Lin (chair), San-Yia Hwang (committee member), Yung-Jan Cho (chair), Chien-Hsiang Lee (chair).
Subjects/Keywords: Data Discretization; Exploratory Data Analysis; Big data; Data Reduction; Data Visualization

University of Alberta
8.
Luo, Dandan.
Models for Univariate and Multivariate Analysis of Longitudinal and Clustered Data.
Degree: PhD, Department of Mathematical and Statistical Sciences, 2012, University of Alberta
URL: https://era.library.ualberta.ca/files/s4655h86x
Longitudinal studies of repeated observations on subjects are commonly undertaken in medical and biological sciences. The responses on a given occasion may be either univariate or multivariate. We concentrate on three topics related to longitudinal and clustered data analysis. The first topic is the development of a class of generalized linear latent variable models. The second involves the modelling of count data with excess zeros. The third is the development of a non-Gaussian linear mixed effects model for multiple outcomes. In addressing the first problem, we propose random mean models to account for correlation among repeated measures. We extend random mean models to include mixed outcomes, renaming them random mean joint models. The difficulty in joint modelling of continuous and discrete outcomes is the lack of a natural multivariate distribution. We overcome the difficulty by introducing two cross-correlated latent processes. We apply the Monte Carlo EM (MCEM) algorithm to find the MLEs of regression coefficients and variance components, by treating the latent variables as missing data. This thesis also proposes regression models for count data with excess zeros. We approach the problem from a perspective different from that of the mixture model framework. By employing the zero truncated distribution and the zero modified distribution, we establish a broad class of distributions to model data with excess zeros. We consider the zero modified Poisson regression model and zero modified binomial regression model for cross-sectional data. We extend the zero modified regression models to models with random effects. We further extend random mean models to model zero-inflated data, and formulate the corresponding zero modified random mean models. For the third problem, a non-Gaussian linear mixed effects model for multiple outcomes is proposed. The methodology is motivated by a glaucoma study. The normality assumption for random effects may be unrealistic, raising concerns about the validity of inferences on fixed effects and random effects if it is violated. To accommodate the skewness of the responses and the associations among multiple characteristics, we propose a mixed effects model in which the non-normal random effects are assumed to follow the log-gamma distribution.
Subjects/Keywords: Longitudinal Data Analysis; Zero-inflated Data; Clustered Data Analysis

California State University – Sacramento
9.
Thasma Lakshmanan Balajibabu, Reka Supraja.
Obtaining insights into how intermittent fasting affects people with type 2 diabetes through interactive visual dashboards.
Degree: MS, Computer Science, 2020, California State University – Sacramento
URL: http://hdl.handle.net/10211.3/216267
Many Americans are prediabetic or diabetic and are trying to control their diet using intermittent fasting. According to several studies, this form of dieting may reverse diabetes. This project therefore analyzes diabetes and intermittent fasting data sets to explore the impact of intermittent fasting on people with diabetes. The analysis takes the form of interactive visual exploratory dashboards, presented as a summary webpage/tool/GUI, for understanding how intermittent fasting and other factors affect diabetes reversal. The visualizations were created with visual data analysis tools such as Tableau and D3.js, integrated using JavaScript and Bootstrap, and hosted on GitHub. To obtain more insights into how intermittent fasting impacts people with diabetes, I present an interactive summary webpage/tool that aims to give people with diabetes basic knowledge and awareness, motivating them to reverse diabetes and to get rid of medicines. The tool can also be used by anyone who wants to explore diabetes statistics and how to reverse diabetes. The tool is interactive and focuses on giving users a glimpse of diabetes reversal through intermittent fasting. For example, if a medical provider wants to educate patients about diabetes reversal options, the tool can deliver a basic idea. Through interactive visualizations (graphs and charts), users can get insights on diabetes such as: Google search statistics on diabetes; average glucose level after a meal; diabetes statistics in the USA from 1980 to 2014 and in California from 2012 to 2018, by categories such as total population, ethnicity, age and income; other characteristics of reversing diabetes; diabetes reversal through intermittent fasting, comparing blood glucose numbers (HbA1c), waist circumference and weight before and after someone starts intermittent fasting; and how average blood glucose, weight and other factors change when someone takes two meals per day (intermittent fasting) versus six meals per day.
Advisors/Committee Members: Baynes, Anna.
Subjects/Keywords: Exploratory data analysis; D3.js; Tableau; Data visualization; Data analysis
APA (6th Edition):
Thasma Lakshmanan Balajibabu, R. S. (2020). Obtaining insights into how intermittent fasting affects people with type 2 diabetes through interactive visual dashboards. (Masters Thesis). California State University – Sacramento. Retrieved from http://hdl.handle.net/10211.3/216267
Chicago Manual of Style (16th Edition):
Thasma Lakshmanan Balajibabu, Reka Supraja. “Obtaining insights into how intermittent fasting affects people with type 2 diabetes through interactive visual dashboards.” 2020. Masters Thesis, California State University – Sacramento. Accessed February 27, 2021.
http://hdl.handle.net/10211.3/216267.
MLA Handbook (7th Edition):
Thasma Lakshmanan Balajibabu, Reka Supraja. “Obtaining insights into how intermittent fasting affects people with type 2 diabetes through interactive visual dashboards.” 2020. Web. 27 Feb 2021.
Vancouver:
Thasma Lakshmanan Balajibabu RS. Obtaining insights into how intermittent fasting affects people with type 2 diabetes through interactive visual dashboards. [Internet] [Masters thesis]. California State University – Sacramento; 2020. [cited 2021 Feb 27].
Available from: http://hdl.handle.net/10211.3/216267.
Council of Science Editors:
Thasma Lakshmanan Balajibabu RS. Obtaining insights into how intermittent fasting affects people with type 2 diabetes through interactive visual dashboards. [Masters Thesis]. California State University – Sacramento; 2020. Available from: http://hdl.handle.net/10211.3/216267

Tampere University
10.
Syrjärinne, Paula.
Urban Traffic Analysis with Bus Location Data.
Degree: 2016, Tampere University
URL: https://trepo.tuni.fi/handle/10024/98613
▼ This thesis presents the use and analysis of data collected from public transport buses in the Tampere region. The data has been analyzed with several different algorithms and from many different perspectives. Some of the analyses measure the service level of public transport, some provide useful additional information to passengers, and some focus on observing the general flow of traffic.
The thesis begins with background information and earlier research on the topic. Different traffic-related sensor networks are reviewed, with a particular focus on vehicular sensor networks. Throughout the thesis, the Tampere bus data is treated as data collected from a mobile vehicular sensor network. The literature on the analysis of vehicular sensor networks is presented with studies grouped by source data into articles dealing with taxi data, bus data, and mobile device data. Each of these involves different research problems.
When taxi data is used, missing observation points are the most common problem, whereas modeling passenger car traffic on the basis of data collected from buses is a typical question in studies that use bus data. When data collected from mobile devices is used, one generally first has to determine whether the device is in a moving vehicle at all.
The Tampere bus data is presented in detail. This data is of comparatively good quality because its update rate is high, unique identifiers are always attached to every observation, and plenty of observations are available from across the whole public transport network. As with any data collected from a real-world source, however, this data also has problems such as inconsistencies, errors, and noise. The expected magnitudes of these errors are covered in the presentation of the data. The preprocessing procedure is also presented, in which the data is cleaned of errors and its size and form are changed to make it easier to use in statistical analysis.
The experimental part of the thesis first examines the use of the data in measuring the performance of public transport. Frequently occurring time-place-line sets were searched for in the data; these reveal where, when, and on which lines buses are regularly late. In addition, route runs were broken down by location and by events (such as stop visits or waiting at traffic lights) in order to find causes for the delays.
The experiments made from the passengers' point of view include, among other things, data-based stop timetables that adapt over time to actual arrival times. In addition to the arrival time, passengers are given an estimate of the uncertainty of the arrival time.
To analyze the general flow of traffic, the concept of street segment profiles is introduced. A profile gives, for each link between stops, the limits of normal travel time at each time of day. Profiles can be used to classify links between stops, for example according to the effects of the morning and afternoon rush hours, and they provide a basis for real-time incident monitoring.…
Subjects/Keywords: data-analyysi; big data; älyliikenne; data analysis; big data; ITS
APA (6th Edition):
Syrjärinne, P. (2016). Urban Traffic Analysis with Bus Location Data. (Doctoral Dissertation). Tampere University. Retrieved from https://trepo.tuni.fi/handle/10024/98613
Chicago Manual of Style (16th Edition):
Syrjärinne, Paula. “Urban Traffic Analysis with Bus Location Data.” 2016. Doctoral Dissertation, Tampere University. Accessed February 27, 2021.
https://trepo.tuni.fi/handle/10024/98613.
MLA Handbook (7th Edition):
Syrjärinne, Paula. “Urban Traffic Analysis with Bus Location Data.” 2016. Web. 27 Feb 2021.
Vancouver:
Syrjärinne P. Urban Traffic Analysis with Bus Location Data. [Internet] [Doctoral dissertation]. Tampere University; 2016. [cited 2021 Feb 27].
Available from: https://trepo.tuni.fi/handle/10024/98613.
Council of Science Editors:
Syrjärinne P. Urban Traffic Analysis with Bus Location Data. [Doctoral Dissertation]. Tampere University; 2016. Available from: https://trepo.tuni.fi/handle/10024/98613

University of KwaZulu-Natal
11.
Vela Vela, Junior.
The employees’ perception on the adoption of big data analytics by selected medical aid organisations in Durban.
Degree: 2017, University of KwaZulu-Natal
URL: http://hdl.handle.net/10413/15171
▼ The growing amount of data available in today’s world has prompted different industries to find ways to extract value from it. Big data analytics is a term used to describe the analysis of enormous amounts of data. Practitioners and researchers are therefore trying to understand the adoption of this new technology by companies, governments, and universities.
Big data analytics has been used by some medical aid companies to improve the quality of the schemes and products provided to clients by collecting and analysing accurate data. However, the rate of acceptance and use of big data analytics by medical aid organisations in South Africa is still unknown. In this dissertation, we discuss employees’ perceptions of the adoption of big data analytics by medical aid organisations in Durban. The benefits and challenges of big data analytics in medical aid organisations are also discussed.
A conceptual framework was developed to structure the problem being investigated in this dissertation. To this end, five perceived factors that might influence employees’ perception of the adoption of big data analytics were examined: perceived performance expectancy, perceived price value, perceived social influence, perceived facilitating conditions, and perceived characteristics of innovation.
Survey research was used as the research strategy, and the study was exploratory in nature; thus, there are no conclusive outcomes in this dissertation. Results show that employees generally have a positive perception of the adoption of big data analytics. Constructs such as perceived performance expectancy, perceived price value, and the perceived characteristics of innovation proved to influence employees’ attitudes towards the adoption of big data analytics.
Advisors/Committee Members: Subramaniam, Prabhakar Rontala. (advisor).
Subjects/Keywords: Big data analysis.; Technology.; Big data.; Adoption.
APA (6th Edition):
Vela Vela, J. (2017). The employees’ perception on the adoption of big data analytics by selected medical aid organisations in Durban. (Thesis). University of KwaZulu-Natal. Retrieved from http://hdl.handle.net/10413/15171
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Vela Vela, Junior. “The employees’ perception on the adoption of big data analytics by selected medical aid organisations in Durban.” 2017. Thesis, University of KwaZulu-Natal. Accessed February 27, 2021.
http://hdl.handle.net/10413/15171.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Vela Vela, Junior. “The employees’ perception on the adoption of big data analytics by selected medical aid organisations in Durban.” 2017. Web. 27 Feb 2021.
Vancouver:
Vela Vela J. The employees’ perception on the adoption of big data analytics by selected medical aid organisations in Durban. [Internet] [Thesis]. University of KwaZulu-Natal; 2017. [cited 2021 Feb 27].
Available from: http://hdl.handle.net/10413/15171.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Vela Vela J. The employees’ perception on the adoption of big data analytics by selected medical aid organisations in Durban. [Thesis]. University of KwaZulu-Natal; 2017. Available from: http://hdl.handle.net/10413/15171
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

University of Otago
12.
Turei Stanton, Worik Macky.
Is Data Snooping responsible for Technical Analysis Rules Success?
Degree: 2013, University of Otago
URL: http://hdl.handle.net/10523/4346
▼ Data Snooping is often suspected when effective technical analysis rules are found or presented. It is difficult to tell if a result is due to data snooping, so evaluating technical analysis rules often boils down to detecting data snooping and if it has invalidated the results. Herein we look at several algorithms designed to increase (risk–adjusted) returns for investors, and several techniques for detecting or compensating for data snooping.
We find no easy answer to detecting data snooping. Many of the methods we look at are useful, but there is no known way to get around sparse data and the unrepeatable nature of investment decisions. We conclude that data snooping bias is a persistent risk and it is unlikely that there is any effective single solution to the problem. The best that we can do is be aware of the risk of data snooping and to report how we have dealt with the risk as part of our analysis.
Advisors/Committee Members: Crack, Timothy (advisor).
Subjects/Keywords: data snooping; data mining; technical analysis
APA (6th Edition):
Turei Stanton, W. M. (2013). Is Data Snooping responsible for Technical Analysis Rules Success? (Masters Thesis). University of Otago. Retrieved from http://hdl.handle.net/10523/4346
Chicago Manual of Style (16th Edition):
Turei Stanton, Worik Macky. “Is Data Snooping responsible for Technical Analysis Rules Success?” 2013. Masters Thesis, University of Otago. Accessed February 27, 2021.
http://hdl.handle.net/10523/4346.
MLA Handbook (7th Edition):
Turei Stanton, Worik Macky. “Is Data Snooping responsible for Technical Analysis Rules Success?” 2013. Web. 27 Feb 2021.
Vancouver:
Turei Stanton WM. Is Data Snooping responsible for Technical Analysis Rules Success? [Internet] [Masters thesis]. University of Otago; 2013. [cited 2021 Feb 27].
Available from: http://hdl.handle.net/10523/4346.
Council of Science Editors:
Turei Stanton WM. Is Data Snooping responsible for Technical Analysis Rules Success? [Masters Thesis]. University of Otago; 2013. Available from: http://hdl.handle.net/10523/4346

University of Houston
13.
-4085-1454.
Design and Implementation of Real-time Student Performance Evaluation and Feedback System.
Degree: MS, Computer Science, 2017, University of Houston
URL: http://hdl.handle.net/10657/4565
▼ Undergraduate education is challenged by high dropout rates and by delayed student graduation due to dropping courses or having to repeat courses due to low academic performance. In this context, an early prediction of student performance may help students to understand where they stand amongst their peers and to change their attitude toward the course they are taking. Moreover, it is important to identify, in time, students who need special attention and to provide appropriate interventions, such as mentoring and conducting review sessions. The goal of this thesis is the design and implementation of a real-time student-performance evaluation and feedback system (RSPEF) to improve graduation rates. RSPEF is an interactive, web-based system consisting of a Predictive Analysis System (PAS) that uses machine-learning techniques to extrapolate past student performance into the future, and an Emergency Warning System (EWS) that identifies poor-performing students in courses. Moreover, a unified representation of student-background and student-performance data is provided in the form of a relational database schema that is suitable for assessing students’ performance across multiple courses, which is critical for the generalizability of the RSPEF system. The system design includes a core machine-learning and data-analysis engine, a relational database that is reusable across courses, and an interactive web-based interface to continuously collect data and create dashboards for users.
Advisors/Committee Members: Eick, Christoph F. (advisor), McNeil, Sara G. (committee member), Shi, Weidong (committee member).
Subjects/Keywords: Educational data mining; Data analysis; Machine learning
APA (6th Edition):
-4085-1454. (2017). Design and Implementation of Real-time Student Performance Evaluation and Feedback System. (Masters Thesis). University of Houston. Retrieved from http://hdl.handle.net/10657/4565
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Chicago Manual of Style (16th Edition):
-4085-1454. “Design and Implementation of Real-time Student Performance Evaluation and Feedback System.” 2017. Masters Thesis, University of Houston. Accessed February 27, 2021.
http://hdl.handle.net/10657/4565.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
MLA Handbook (7th Edition):
-4085-1454. “Design and Implementation of Real-time Student Performance Evaluation and Feedback System.” 2017. Web. 27 Feb 2021.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Vancouver:
-4085-1454. Design and Implementation of Real-time Student Performance Evaluation and Feedback System. [Internet] [Masters thesis]. University of Houston; 2017. [cited 2021 Feb 27].
Available from: http://hdl.handle.net/10657/4565.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Council of Science Editors:
-4085-1454. Design and Implementation of Real-time Student Performance Evaluation and Feedback System. [Masters Thesis]. University of Houston; 2017. Available from: http://hdl.handle.net/10657/4565
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete

University of Connecticut
14.
Ibrahim, Abdelrahman Hosny.
Integrative Analysis of Heterogeneous Genomics Data for Triple Negative Breast Cancer and High Grade Serous Ovarian Cancer.
Degree: MS, Computer Science and Engineering, 2016, University of Connecticut
URL: https://opencommons.uconn.edu/gs_theses/1032
▼ The human body is made up of trillions of cells. Although all the human body cells contain the same DNA sequence inside their nuclei, each one carries out its own function. Normally, human cells grow and divide to form daughter cells as the body needs them. When cells grow old, or lose their ability to function properly, they die (in a very organized way called apoptosis or programmed cell death) and new cells take their role. Cancer is a disease that is caused by uncontrolled division of abnormal cells in some part of the body, breaking the natural process of growing. Old or damaged cells survive when they should die, and new (abnormal) cells form when they are not needed. Some types of cancer form solid tumors, which are masses of tissue. Others, such as leukemias, do not form solid tumors. It is widely believed that cancer is caused by the accumulation of detrimental variation in the genome over the course of a lifetime. Variations can take several forms. Single Nucleotide Polymorphism (SNP) is a mutation in a single base of the DNA. Indels describe insertions or deletions of bases in the genome. Copy Number Variation (CNV) represents multiplied and deleted segments in a genome. Most of the time, one type of mutation is not sufficient to induce cancer formation.
In this study, we have investigated genomic datasets of a phase-1 clinical trial on triple-negative breast cancer and ovarian cancer patients. The goal is to identify genes that drive drug resistance. We have developed data analysis pipelines to obtain genomics variations (somatic mutations and copy number variations) from the Whole Exome Sequencing (WES) raw data of 35 triple-negative breast cancer (TNBC) and ovarian cancer patients. In addition, we have analyzed the gene expression levels and gene fusion from the RNA-Seq raw reads data for a subset of 16 patients. This study is an effort toward optimizing the integrative analysis of genomic datasets under certain limitations. The main limitation is the small number of samples in the clinical trial (as is the case in most clinical trials). Another challenge is to find an abstract way to analyze the raw sequencing data given its large size and heterogeneity. The novelty of our work comes in following a data science approach in answering such research questions. The unbiased and data-driven approach was successful in identifying genes that are most likely related to the drug resistance. Our results will guide clinicians toward having an in-depth study of the driver genes.
Advisors/Committee Members: Reda Ammar, Sheida Nabavi, Sanguthevar Rajasekaran, Yufeng Wu, Reda Ammar, Sheida Nabavi.
Subjects/Keywords: cancer; data science; data analysis; genomics; NGS
APA (6th Edition):
Ibrahim, A. H. (2016). Integrative Analysis of Heterogeneous Genomics Data for Triple Negative Breast Cancer and High Grade Serous Ovarian Cancer. (Masters Thesis). University of Connecticut. Retrieved from https://opencommons.uconn.edu/gs_theses/1032
Chicago Manual of Style (16th Edition):
Ibrahim, Abdelrahman Hosny. “Integrative Analysis of Heterogeneous Genomics Data for Triple Negative Breast Cancer and High Grade Serous Ovarian Cancer.” 2016. Masters Thesis, University of Connecticut. Accessed February 27, 2021.
https://opencommons.uconn.edu/gs_theses/1032.
MLA Handbook (7th Edition):
Ibrahim, Abdelrahman Hosny. “Integrative Analysis of Heterogeneous Genomics Data for Triple Negative Breast Cancer and High Grade Serous Ovarian Cancer.” 2016. Web. 27 Feb 2021.
Vancouver:
Ibrahim AH. Integrative Analysis of Heterogeneous Genomics Data for Triple Negative Breast Cancer and High Grade Serous Ovarian Cancer. [Internet] [Masters thesis]. University of Connecticut; 2016. [cited 2021 Feb 27].
Available from: https://opencommons.uconn.edu/gs_theses/1032.
Council of Science Editors:
Ibrahim AH. Integrative Analysis of Heterogeneous Genomics Data for Triple Negative Breast Cancer and High Grade Serous Ovarian Cancer. [Masters Thesis]. University of Connecticut; 2016. Available from: https://opencommons.uconn.edu/gs_theses/1032

University of Illinois – Chicago
15.
Swanlund, Andrew Peter.
Correcting for Rater Bias in the Presence of Non-Ignorable Missing Ratings.
Degree: 2016, University of Illinois – Chicago
URL: http://hdl.handle.net/10027/21553
▼ This thesis addresses the problem of non-ignorable missing ratings in judge rated data. A Bayesian bivariate probit ordinal missing data model implemented with Markov chain Monte Carlo (MCMC) was applied to simulated and real-world data sets to test the extent to which this proposed approach outperformed existing methods for analyzing judge rated data across a variety of evaluation criteria and data collection scenarios. The MCMC approach was compared to the many-facet Rasch model, generalizability theory (with a linear regression correction for rater effects), and the Rasch rating scale model. The objectives of the research were to test the extent to which the proposed methods could 1) calculate generalizability theory variance components when traditional methods could not be applied, and 2) produce more accurate latent trait measures than existing methods. The study used eight simulated data sets with varying numbers of examinees, raters, items, and distributional properties of examinee ability estimates. In addition, a real-world data set consisting of classroom observations was used to test the applicability of the methods to non-simulated data.
The Bayesian bivariate missing data model produced variance component estimates (and D-study coefficients) that were quite accurate for measurement scenarios with only a single, randomly assigned rater. The MCMC approach yields confidence intervals with better coverage probabilities than traditional approaches, and this finding is consistent when raters are randomly or non-randomly assigned to examinees. This modeling approach more accurately models the uncertainty in examinee scores by taking into better account the error due to rater severity, and non-random assignment of raters.
Advisors/Committee Members: Karabatsos, George (advisor), Smith, Everett V (committee member), Yin, Yue (committee member), Martin, Ryan (committee member), Hedeker, Donald (committee member), Karabatsos, George (chair).
Subjects/Keywords: Psychometrics; Judge-Rated Data; Missing Data Analysis
APA (6th Edition):
Swanlund, A. P. (2016). Correcting for Rater Bias in the Presence of Non-Ignorable Missing Ratings. (Thesis). University of Illinois – Chicago. Retrieved from http://hdl.handle.net/10027/21553
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Swanlund, Andrew Peter. “Correcting for Rater Bias in the Presence of Non-Ignorable Missing Ratings.” 2016. Thesis, University of Illinois – Chicago. Accessed February 27, 2021.
http://hdl.handle.net/10027/21553.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Swanlund, Andrew Peter. “Correcting for Rater Bias in the Presence of Non-Ignorable Missing Ratings.” 2016. Web. 27 Feb 2021.
Vancouver:
Swanlund AP. Correcting for Rater Bias in the Presence of Non-Ignorable Missing Ratings. [Internet] [Thesis]. University of Illinois – Chicago; 2016. [cited 2021 Feb 27].
Available from: http://hdl.handle.net/10027/21553.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Swanlund AP. Correcting for Rater Bias in the Presence of Non-Ignorable Missing Ratings. [Thesis]. University of Illinois – Chicago; 2016. Available from: http://hdl.handle.net/10027/21553
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Brno University of Technology
16.
Habr, Vojtěch.
Návrh manažerského reportingu a vizualizace dat: Proposal of Management Reporting and Data Visualization.
Degree: 2019, Brno University of Technology
URL: http://hdl.handle.net/11012/178792
▼ This bachelor thesis focuses on obtaining statistical data about admissions at the Faculty of Business and Management of Brno University of Technology. The theoretical background of working with data is presented in the first part, the current state is analyzed in the second part, and the third part contains a proposal to solve the problem.
Advisors/Committee Members: Kříž, Jiří (advisor), Luhan, Jan (referee).
Subjects/Keywords: data; analýza; reporting; data; analysis; reporting
APA (6th Edition):
Habr, V. (2019). Návrh manažerského reportingu a vizualizace dat: Proposal of Management Reporting and Data Visualization. (Thesis). Brno University of Technology. Retrieved from http://hdl.handle.net/11012/178792
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Habr, Vojtěch. “Návrh manažerského reportingu a vizualizace dat: Proposal of Management Reporting and Data Visualization.” 2019. Thesis, Brno University of Technology. Accessed February 27, 2021.
http://hdl.handle.net/11012/178792.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Habr, Vojtěch. “Návrh manažerského reportingu a vizualizace dat: Proposal of Management Reporting and Data Visualization.” 2019. Web. 27 Feb 2021.
Vancouver:
Habr V. Návrh manažerského reportingu a vizualizace dat: Proposal of Management Reporting and Data Visualization. [Internet] [Thesis]. Brno University of Technology; 2019. [cited 2021 Feb 27].
Available from: http://hdl.handle.net/11012/178792.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Habr V. Návrh manažerského reportingu a vizualizace dat: Proposal of Management Reporting and Data Visualization. [Thesis]. Brno University of Technology; 2019. Available from: http://hdl.handle.net/11012/178792
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Texas A&M University
17.
Catalena, Kate Ashley.
Mining Student Submission Information to Refine Plagiarism Detection.
Degree: MS, Computer Science, 2020, Texas A&M University
URL: http://hdl.handle.net/1969.1/191583
▼ Plagiarism is becoming an increasingly important issue in introductory programming courses. There are several tools to assist with plagiarism detection, but they are not effective for more basic programming assignments, like those in introductory courses. The proliferation of auto-grading platforms creates an opportunity to capture additional information about how students develop the solutions to their programming assignments. In this research, we identify how to extract information from an online autograding platform, Mimir Classroom, that can be useful in revealing patterns in solution development. We explore how and to what extent this additional information can be used to better support instructors when identifying cases of probable plagiarism.
We have developed a tool that takes the raw student assignment submissions from Mimir, analyzes them, and produces data sets and visualizations that help instructors to refine information extracted by existing plagiarism detection platforms. The instructors can then take this information to further investigate any probable cases of plagiarism that have been found by the tool. Our main goal is to give insight into student behaviors and identify signals that can be effective indicators of plagiarism. Furthermore, the framework can enable the analysis of other aspects of students’ solution development processes that may be useful when reasoning about their learning. As an initial exploration scenario of the framework developed in this work, we have used student code submissions from the CSCE 121: Introduction to Program Design and Concepts course at Texas A&M University. We experimented with the student code submissions from the Fall 2018 and Fall 2019 offerings of the course.
Advisors/Committee Members: Da Silva, Dilma (advisor), Shipman, Frank (committee member), Moore, Michael (committee member), Narayanan, Krishna (committee member).
Subjects/Keywords: Computer Science Education; data mining; data analysis

Rutgers University
18.
Yaros, John Robert, 1982-.
Data mining perspectives on equity similarity prediction.
Degree: PhD, Computer Science, 2014, Rutgers University
URL: https://rucore.libraries.rutgers.edu/rutgers-lib/45580/
▼ Accurate identification of similar companies is invaluable to the financial and investing communities. To perform relative valuation, a key step is identifying a "peer group" containing the most similar companies. To hedge a stock portfolio, best results are often achieved by selling short a hedge portfolio with future time series of returns most similar to the original portfolio - generally those with the most similar companies. To achieve diversification, a common approach is to avoid portfolios containing any stocks that are highly similar to other stocks in the same portfolio. Yet the identification of similar companies is often left to the hands of individual experts who devise sector/industry taxonomies or other structures to represent and quantify similarity. Little attention (at least in the public domain) has been given to the potential that may lie in data-mining techniques. In fact, much existing research considers sector/industry taxonomies to be ground truth and quantifies the results of clustering algorithms by their agreement with the taxonomies. This dissertation takes the alternate view that proper identification of relevant features and proper application of machine learning and data mining techniques can achieve results that rival or even exceed the expert approaches. Two representations of similarity are considered: 1) a pairwise approach, wherein a value is computed to quantify the similarity for each pair of companies, and 2) a partition approach analogous to sector/industry taxonomies, wherein the universe of stocks is split into distinct groups such that the companies within each group are highly related to each other. To generate results for each representation, we consider three main datasets: historical stock-return correlation, equity-analyst coverage, and news article co-occurrences. The latter two have hardly been considered previously. New algorithmic techniques are devised that operate on these datasets. In particular, a hypergraph partitioning algorithm is designed for imbalanced datasets, with implications beyond company similarity prediction, especially in consensus clustering.
Advisors/Committee Members: Imielinski, Tomasz (chair), Muthukrishnan, S. (internal member), Pavlovic, Vladimir (internal member), Tackett, Walter Alden (outside member).
Subjects/Keywords: Data mining – Analysis; Investments – Data processing

Ryerson University
19.
Tsakiltsidis, Sokratis.
Predicting the time-to-deliver of software changes.
Degree: 2016, Ryerson University
URL: https://digital.library.ryerson.ca/islandora/object/RULA%3A5844
▼ In this thesis we examine the application of survival analysis to time-to-deliver data. Successful prediction of the time necessary to deliver a new feature or fix a reported defect can assist in various phases and aspects of software development. We identify and try to overcome limitations when dealing with time-to-event data. Our proposed methodological framework includes the use of survival analysis, the utilization of incomplete information that might be available as censored data, and the incorporation of random effects through mixed-effects models to identify hierarchical/clustered data within our dataset. We explore and experiment with a dataset from a large-scale commercial software system covering a twelve-year period. We show that we can successfully implement survival analysis and that incorporating random effects provides a considerable advantage; however, incorporating censored information did not prove advantageous in this case.
Subjects/Keywords: Numerical analysis – Data processing

Tampere University
20.
Lin, Jake.
Metagenomic tools and applications toward Type 1 Diabetes.
Degree: 2018, Tampere University
URL: https://trepo.tuni.fi/handle/10024/103701
▼ Computational detection of microbial populations in type 1 diabetes stool samples
The onset of human autoimmune diseases is influenced by a wide variety of disturbances in cell function, and the spectrum of autoimmune diseases is very broad. For example, type 1 diabetes typically manifests during childhood. In a person with type 1 diabetes, the immune system destroys insulin-producing pancreatic cells, and the insulin level falls harmfully low. Detailed knowledge of the mechanisms that trigger type 1 diabetes is lacking, but several studies have linked the onset of the disease to various microbial exposures and especially to certain types of viral infections.
With the development of modern sequencing technologies, it is possible to produce large metagenomic datasets that provide detailed information about the genetic material in, for example, water, soil, or stool samples. Metagenomic research also makes it possible to investigate the effect of various microbial and viral exposures, as well as population dynamics, on human autoimmune diseases. In this work, metagenomics is applied to the study of viral and microbial factors in the context of type 1 diabetes. In particular, new methods were developed for the statistical and computational analysis of metagenomic data obtained from stool samples of children and adolescents.
The dissertation work produced the bioinformatics tool Vipie, which enables the analysis and visualization of metagenomic samples in an easy-to-use, web-based computing environment. Vipie and the developed methods are based on open source code, and the methods are currently in active use in several research groups. In addition, the applied part of the dissertation reports new biological findings on the viral and bacterial populations in stool samples of children and adolescents with type 1 diabetes. For example, the gut of people with type 1 diabetes was found to have stronger dependencies between bacteria than that of a healthy person.
Computational detection of environmental microbial populations with assessment in Type 1 Diabetes stool samples
Human autoimmune diseases stem from abnormal responses towards normal cells. They are relatively common and of many types. For example, Type 1 diabetes, which affects children and teenagers, is caused by gradual immune targeting and subsequent destruction of pancreatic beta cells. The exact reasons, or triggers, for the initial targeting of the beta cells are not known, but multiple reports have cited microbes, particularly certain virus infections and gut bacteria population shifts, as possible culprits.
The introduction of metagenomics, together with advances in next generation sequencing technologies, has enabled detection of genetic material directly from environmental samples such as soil, water and human stool. Prior to metagenomics, most bacterial…
Subjects/Keywords: Metagenomics; Sequencing; Data Analysis; T1D

University of Utah
21.
Gueye, Abdou Salam.
Mapping economic and health data: integration and analysis challenges.
Degree: PhD, Biomedical Informatics, 2007, University of Utah
URL: http://content.lib.utah.edu/cdm/singleitem/collection/etd2/id/320/rec/730
▼ This work included four separate studies that each led to a publication. These studies shared a common relationship between economy or lifestyle and health while employing different research methods using data integration. Each study addressed certain aspects of public health informatics. The first study calculated the cost-effectiveness of incorporating a Clinical Decision Support System (CDSS) in a multidimensional rural community intervention aimed at reducing inappropriate antibiotic prescription. The three additional studies integrated different levels of ecological and individual data in order to describe the relationship between economy and health. First, ecological data integration was utilized to create a dataset containing countries’ health and economic indicators in order to develop a prediction model of HIV sero-prevalence across Africa. Second, individual data integration was used to assess the role of a single lifestyle indicator (alcohol dependency) on health outcome. Third, ecological and individual data integration was used to measure the association between the global economy and individual data. The first study demonstrated the cost-effectiveness of using a clinical decision support system in public health interventions. The three additional studies demonstrated, respectively, 1) the possibility of combining multivariate modeling and spatial clustering to predict HIV sero-prevalence in different countries in sub-Saharan Africa; 2) that alcohol dependency at the time of end-stage renal disease onset is a risk factor for renal graft failure and recipient death; and 3) that, beyond 3 years post transplantation, when some recipients lose Medicare benefits, economic downturns might negatively affect kidney graft and recipient survival. These studies provide diverse and rigorous research experiences related to public health, epidemiology, and informatics. Further development of the methods used would help explain the relationship between the macroeconomic situation, population health, and individual health, as it would facilitate data integration across time, geography, and scientific disciplines.
Subjects/Keywords: Automatic Data Processing; Statistical Analysis

Dalhousie University
22.
Ni, Wenjia.
Application of Clustering, Logistic Regression and Decision Tree Induction on EGM Data for Detection and Prediction of At-Risk and Problem Gamblers.
Degree: Master of Electronic Commerce, Faculty of Computer Science, 2014, Dalhousie University
URL: http://hdl.handle.net/10222/53596
▼ The use of data mining techniques for problem gambling behaviour analysis has huge potential to offer players protection and to reduce the risk of gambling-related harms. In this thesis, we apply three data mining models (clustering, logistic regression, and decision tree induction) to one month of EGM player data to separate players into different groups, identify which gambling behaviours are highly associated with gambling addiction, and derive rules for predicting potential at-risk and problem gamblers. We separated all players into four groups (non-problem gambler, low-risk gambler, moderate-risk gambler, and problem gambler) based on their similar behavioural characteristics. Three behavioural indicators and the four best predictive rules are finally obtained to predict at-risk and problem gamblers. It is hoped that this thesis will provide a useful resource for EGM manufacturers to redesign their machines to avoid risky and problem gambling behaviour.
Advisors/Committee Members: Dr. Evangelos E. Milios (graduate-coordinator), Dr. Evangelos E. Milios (thesis-reader), Dr. Qiguang Gao (thesis-reader), Dr. Christian Blouin (thesis-reader), Dr. Vlado Keselj (thesis-supervisor).
Subjects/Keywords: Data Mining; Gambling behaviour analysis

Vanderbilt University
23.
Teng, Zhongwei.
Implementation of Self-report mHealth Application and Data Analysis.
Degree: MS, Electrical Engineering, 2017, Vanderbilt University
URL: http://hdl.handle.net/1803/15163
▼ As a growing industry, mHealth has been adopted in many fields of healthcare, such as diabetes, hypertension, asthma, eating disorders and data collection. To improve the practicality of mHealth applications, authentication schemes, which act as the credential guard protecting access to mobile applications that hold sensitive patient data, need to be evaluated according to the particular characteristics of the mHealth user base. This paper identifies several metrics for evaluating authentication schemes of mHealth applications in terms of security and ease of use. Following these metrics, a QR-code-based scheme is proposed as a secure and convenient alternative authentication method. This paper also evaluates different models of data analysis on self-report data.
Advisors/Committee Members: Richard Alan Peters (committee member), D. Mitchell Wilkes (committee member).
Subjects/Keywords: Data Analysis; Authentication; mHealth

Vanderbilt University
24.
Carroll, Robert James.
Defining Phenotypes, Predicting Drug Response, and Discovering Genetic Associations in the Electronic Health Record with Applications in Rheumatoid Arthritis.
Degree: PhD, Biomedical Informatics, 2014, Vanderbilt University
URL: http://hdl.handle.net/1803/14730
▼ Electronic Health Records (EHRs) allow for the digital capture of patient information and have proven to be a valuable tool for patient treatment. In this dissertation, I explore reuse of EHR data for clinical and genomic research with a focus on rheumatoid arthritis (RA). RA is a chronic autoimmune disorder that primarily affects joints with swelling, stiffness, and pain, and if left untreated can lead to permanent joint damage. Phenome-wide association studies (PheWAS) leverage the breadth of codified diagnostic information about patients in the EHR to find disease associations. A package for the R statistical language is presented here that includes the tools needed to perform EHR-based or observational trial PheWAS, from ICD-9 code translation to association testing and meta-analysis. It includes a versatile plotting system for phenotype-related information following the Manhattan plot paradigm. This methodology is applied in conjunction with genetic risk scores (GRS) to assess pleiotropy and shared genetic risk among phenotypes. Investigations of 99 known risk variants for RA and three formulations of GRS show that the GRS is more specific to RA than the individual single nucleotide polymorphisms, but the GRSs had clinically interesting associations with hypothyroidism. Presented next is the development of an algorithm to retrospectively identify drug response to etanercept in the EHR. Using chart reviews and a variety of input data including billing codes, processed free text, and medication entries, a support vector machine and random forest classifier were created that can discriminate between drug responders and non-responders with an area under the receiver operating characteristic curve of 0.939 and 0.923, respectively. The drug response algorithm was applied to create a case-control cohort. Using these records, the final study identifies phenotypes associated with etanercept response, including fibromyalgia and several axial skeleton disease phenotypes: intervertebral disc disorders, degeneration of intervertebral disc, and spinal stenosis. Taken together, these studies demonstrate that EHR data can be an important tool for clinical and genomic research, and offer particular promise for the study of RA.
Advisors/Committee Members: Tom Lasko (committee member), Hua Xu (committee member), Digna Velez-Edwards (committee member), Jeremy Warner (committee member), Josh Denny (Committee Chair).
Subjects/Keywords: Secondary use; data analysis; informatics

Texas A&M University
25.
Ding, Weihao.
Experiment of Mapper Algorithm on High-Dimensional Data in Microseismic Monitoring.
Degree: MS, Petroleum Engineering, 2017, Texas A&M University
URL: http://hdl.handle.net/1969.1/166077
▼ The objective of this research is to utilize data-driven methods to analyze microseismic monitoring, especially Topological data analysis (TDA), with limited physically based approaches. Python Mapper (PM) is the TDA tool used for this study. Microseismic data has the characteristics of big data. Previous studies suggesting stage-by-stage microseismic analysis also avoid the limitation of current software, which can only process slightly over 10,000 data points. During this study, more TDA packages have been evolving to handle larger and more complex data, such as Betti Mapper by Spark.
PM combines topology principles and machine learning methods into an integrated data analytic implementation. The high dimensionality of microseismic data practically limits what classical statistical analyses can achieve. Machine learning techniques such as dimensionality reduction are required for such datasets. Where PM stands out is its ability to retain the raw features of the data set when a machine-learning algorithm is applied.
The first portion of the study observes the data point relations of microseismic data, both as a whole and stage by stage. Dividing attributes into location and signal data reveals the relations within and between the two data types.
The main discovery from the location data of the network is that high-density areas tend to be earlier events and could indicate where high pressure starts to build up, or the origins of the fracture networks. Origins that are far apart in the beginning grow into each other to result in one (most of the time) or more (rarely more than two) networks. The fracture growth with complex directions of extension can be represented with a much simpler, single-directional network. Signal data reveals location-specific data quality trends. These trends are hardly visible if attributes are investigated in pairs but obvious when mapped altogether. Locational and geological characteristics may be an explanation, but further information is needed to confirm the observations. In fracture growth software, these trends will allow researchers to ignore the location of the wellbore and focus on the actual origins of the fracture network. An override including discontinuity of the network and confidence of stimulated reservoir volume could be manually added to improve the accuracy of the fracture simulation.
A sensitivity analysis of the PM parameters is carried out to test the robustness of the method, and a comparison with clustering of the raw data is used to demonstrate the effectiveness and benefits of TDA. TDA is a great method for data preprocessing and analysis and has virtually infinite possibilities, but should never be the end of a project. The results from PM could be used as input for many other studies.
Advisors/Committee Members: Killough, John (advisor), Gildin, Eduardo (committee member), Barrufet, Maria (committee member).
Subjects/Keywords: Data Analysis; Microseismic; Topological; Mapper

Texas A&M University
26.
Almarzooq, Anas Mohammadali S.
The Implications and Flow Behavior of the Hydraulically Fractured Wells in Shale Gas Formation.
Degree: MS, Petroleum Engineering, 2012, Texas A&M University
URL: http://hdl.handle.net/1969.1/ETD-TAMU-2010-12-8626
Shale gas formations are known to have low permeability, which can be as low as 100 nanodarcies. Without stimulating wells drilled in shale gas formations, it is hard to produce them at an economic rate. One stimulation approach is to drill horizontal wells and hydraulically fracture the formation. Once the formation is fractured, different flow patterns occur. The dominant flow regime observed in shale gas formations is linear flow, or transient drainage from the formation matrix toward the hydraulic fracture. This flow can extend over years of production and can be identified by a half-slope on the log-log plot of gas rate against time. It can be used to evaluate the hydraulic fracture surface area and, eventually, the effectiveness of the completion job. Different models from the literature can be used to evaluate the completion job. One of the models used in this work assumes a rectangular reservoir with a slab-shaped matrix between each pair of hydraulic fractures. This model has at least five flow regions, two of which are discussed here. The first is Region 2, in which bilinear flow occurs as a result of simultaneous drainage from the matrix and the hydraulic fracture. The other is Region 4, which results from transient matrix drainage and can extend over many years. Barnett shale production data are used throughout this work to show sample calculations.
The first part of this work evaluates the field data used in this study, following a systematic procedure explained in Chapter III. This part reviews the historical production, reservoir and fluid data, and well completion records available for the wells being analyzed. It also checks for correlations in the available data and uses the field production data to explain abnormal flow behaviors that might occur. It explains why some wells might not fit into each model. This is followed by a preliminary diagnosis, in which flow regimes are identified, unclear data are filtered, and interference and liquid loading data are pointed out. After completing the data evaluation, this work evaluates and compares the different methods available in the literature in order to decide which method is best suited to analyzing the production data from the Barnett shale. Formation properties and the original gas in place are evaluated and compared across the different methods.
Advisors/Committee Members: Wattenbarger, Robert A. (advisor), Sun, Yuefeng (committee member), Maggard, Bryan (committee member).
Subjects/Keywords: Shale gas; production data analysis
APA (6th Edition):
Almarzooq, A. M. S. (2012). The Implications and Flow Behavior of the Hydraulically Fractured Wells in Shale Gas Formation. (Masters Thesis). Texas A&M University. Retrieved from http://hdl.handle.net/1969.1/ETD-TAMU-2010-12-8626

McMaster University
27.
Chenxi, Yu.
Incorporating Historical Data via Bayesian Analysis Based on The Logit Model.
Degree: MSc, 2018, McMaster University
URL: http://hdl.handle.net/11375/23978
This thesis presents a Bayesian approach to incorporating historical data. In statistical inference, a large sample size is usually required to establish strong evidence. However, in most bioassay experiments, the dataset is of limited size. Here, we propose a method that is able to incorporate control-group data from historical studies. The approach is framed in the context of testing whether an increased dosage of the chemical is associated with an increased probability of the adverse event. To test whether such a relationship exists, the proposed approach compares two logit models via the Bayes factor. In particular, we eliminate the effect of survival time by using the poly-k test. We test the performance of the proposed approach by applying it to six simulated scenarios.
Thesis
Master of Science (MSc)
Advisors/Committee Members: Narayanaswamy, Balakrishnan Jr, Mathematics and Statistics.
Subjects/Keywords: Bayesian analysis; historical data
APA (6th Edition):
Chenxi, Y. (2018). Incorporating Historical Data via Bayesian Analysis Based on The Logit Model. (Masters Thesis). McMaster University. Retrieved from http://hdl.handle.net/11375/23978

Penn State University
28.
Ozdemir, Ali.
Latency Analysis of Data Mining Codes.
Degree: 2014, Penn State University
URL: https://submit-etda.libraries.psu.edu/catalog/22805
According to the requirements posed by recent developments in computing, we need better architectures and software. To achieve these goals, we evaluate these systems by running test jobs and simulations. In this study, we analyze the latency of data mining applications from the NU-MINEBENCH benchmark suite from Northwestern University. We present the results for 32 out-of-order cores in a 4x8 mesh architecture with 32 kB L1 instruction and data caches, and 32 L2 cache banks distributed over the network, with each bank having 512 kB capacity. We run multiple multithreaded benchmarks simultaneously and collect the latencies between L1-L2, L2-Memory Controllers (MCs), Memory, MC-L2 and L2-L1 for 100 million continuous cycles, and the same latencies until the end of the simulation. We present the results in two graphs for each workload. The first one, Type A, shows the latencies for 100 million consecutive cycles after fast-forwarding 1.1 billion cycles. The second graph, Type B, shows the breakdown of the total latency of memory requests among the different levels of the memory hierarchy.
Advisors/Committee Members: Mahmut Taylan Kandemir, Thesis Advisor/Co-Advisor.
Subjects/Keywords: Latency Analysis Data Mining Codes
APA (6th Edition):
Ozdemir, A. (2014). Latency Analysis of Data Mining Codes. (Thesis). Penn State University. Retrieved from https://submit-etda.libraries.psu.edu/catalog/22805
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Penn State University
29.
Hu, Xiaocheng.
DATA MINING ON CORPORATE FILLING BASED ON BAYESIAN LEARNING APPROACH.
Degree: 2015, Penn State University
URL: https://submit-etda.libraries.psu.edu/catalog/25731
Most research on corporate filings focuses mainly on qualitative analysis. This thesis used a quantitative method, a Bayesian learning machine, to analyze the information content of future prediction statements (FPS) in the Management Discussion and Analysis section of 10-Q filings.
The thesis proposed a new approach that combined mathematical methods and text analysis. A naïve Bayesian machine learning approach was used to examine the prediction of the company's future performance.
In conclusion, the average profit tone of FPS is negatively associated with profit predict and negatively associated with other predict, which includes predictions related to employees, regulations, accounting and other matters. The liquidity tone is negatively associated with other predict. The overall tone is negatively associated with other predict.
Advisors/Committee Members: Tao Yao, Thesis Advisor/Co-Advisor.
Subjects/Keywords: data mining; text analysis; Perl
APA (6th Edition):
Hu, X. (2015). DATA MINING ON CORPORATE FILLING BASED ON BAYESIAN LEARNING APPROACH. (Thesis). Penn State University. Retrieved from https://submit-etda.libraries.psu.edu/catalog/25731
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

University of Newcastle
30.
Cheema, Salman Arif.
The aggregate association index and its extensions.
Degree: PhD, 2016, University of Newcastle
URL: http://hdl.handle.net/1959.13/1312984
Research Doctorate - Doctor of Philosophy (PhD)
The analysis of aggregate data from 2x2 contingency tables has a long and interesting history. Traditionally, the approach taken to estimate the unknown cell frequencies (or some function of them) is to use ecological inference (EI). However, EI relies on assumptions that are either untestable or are unrealistic. Rather than adopting strategies to estimate the unknown cells, one may instead focus on understanding the underlying association structure between the variables using the Aggregate Association Index (AAI). Given only the aggregate data, the AAI quantifies how likely it is that an association exists between two nominal dichotomous variables when a test of independence is performed at the α level of significance. Such a test therefore relies on Pearson’s chi-squared statistic and does so in terms of the conditional proportion P₁. Here, P₁ is the proportion of individuals/subjects classified into the first column category of the 2x2 table given that they are classified into the first row category. This thesis discusses and expands upon the AAI which was proposed less than a decade ago. The generalisations and variants of the original AAI that we propose highlight the emerging growth of this index in the context of aggregate data analysis and how the AAI overcomes many of the pitfalls that confront the analyst when performing EI. We generalise the AAI to incorporate various linear transformations related to P₁ and demonstrate the invariance of the index to any linear transformation; for example, such transformations include the independence ratio, Pearson contingency, standardised residual and adjusted residual. We also show how the AAI is linked to one of the most common measures of association used to analyse 2x2 contingency tables – the odds ratio. The link between the AAI and odds ratio is investigated further as we establish the theoretical relationship between the index and the extended hypergeometric distribution.
In doing so, the analyst may consider any a priori association structure using a new variant of the AAI called the Extended Aggregate Association Index (EAAI). Further extensions of the AAI are also made by generalising the index to incorporate the structure of ordered dichotomous variables. This is achieved by examining the features of ordinal log-linear models and how they may be used to analyse aggregate data. Since the underlying statistic that we shall be using is Pearson’s chi-squared statistic, its magnitude (and therefore the magnitude of the AAI) is strongly influenced by the size of the sample being studied. So, this thesis examines the impact of the sample size on the AAI and proposes strategies to minimise the impact of the sample size on the magnitude of the index. We also introduce the pseudo p-value so that the analyst can evaluate the relative significance of the AAI while isolating the impact of the sample size. Another new measure of association for analysing aggregate data is proposed in this thesis…
Advisors/Committee Members: University of Newcastle. Faculty of Science & Information Technology, School of Mathematical and Physical Sciences.
Subjects/Keywords: aggregate; categorical; data analysis
Record Details
Similar Records
Cite
Share »
Record Details
Similar Records
Cite
« Share





❌
APA ·
Chicago ·
MLA ·
Vancouver ·
CSE |
Export
to Zotero / EndNote / Reference
Manager
APA (6th Edition):
Cheema, S. A. (2016). The aggregate association index and its extensions. (Doctoral Dissertation). University of Newcastle. Retrieved from http://hdl.handle.net/1959.13/1312984