You searched for subject:(Apache Hive). Showing records 1 – 3 of 3 total matches.

NSYSU

1. Chuang, Shun-hsien. The study of a big data platform for mining computation and storage : the case study of national health insurance database.

Degree: Master, Computer Science and Engineering, 2015, NSYSU

Our country has a comprehensive national health insurance system, and patients' treatment records are stored in the National Health Insurance Research Database. Big data technology can be applied to these records to extract useful information, which is currently used in preventive healthcare and to improve the effectiveness of treatment and medication. We hope such research can help reduce the cost of national health insurance, enhance the effectiveness and quality of health care, and even prevent disease. However, conventional data processing software is not suited to health insurance data at this scale, so a new approach is needed. In this paper, health insurance data are stored on a big data computing platform, and a variety of analysis tools are used to accelerate data analysis. The purpose of this paper is to determine which analysis tool suits the research topics that medical staff commonly pursue. Several typical computations are used to evaluate the performance of data computing and storage.

Advisors/Committee Members: Yen-Hsia Wen (chair), Jui-Hsiu Tsai (chair), Shi-Huang Chen (chair), Chun-Hung Lin (committee member), Shihn-Sheng Wu (chair).

Subjects/Keywords: Impala; Big Data Computing; Apache Hadoop; Hive on Spark; Apache Hive; National Health Insurance Research Database
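
The abstract above describes timing several typical computations to compare analysis tools. A minimal sketch of such a timing harness follows. It is an illustration only: it uses Python's built-in `sqlite3` as a stand-in engine rather than Hive, Hive on Spark, or Impala, and the `visits` table, its columns, and the two queries are hypothetical, not taken from the thesis or from the National Health Insurance Research Database schema.

```python
import sqlite3
import statistics
import time


def time_query(conn, sql, repeats=5):
    """Run sql `repeats` times and return the median wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetch all rows so the query fully runs
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Hypothetical table standing in for treatment records (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (patient_id INTEGER, cost REAL)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)

# Two illustrative workloads: an aggregation and a range search.
queries = {
    "aggregate": "SELECT patient_id, SUM(cost) FROM visits GROUP BY patient_id",
    "range_scan": "SELECT COUNT(*) FROM visits WHERE cost BETWEEN 100 AND 200",
}
results = {name: time_query(conn, sql) for name, sql in queries.items()}
```

Taking the median over several runs is one way to dampen warm-up and caching effects; comparing real engines would also require identical data and cluster settings for each.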

APA (6th Edition):

Chuang, S. (2015). The study of a big data platform for mining computation and storage : the case study of national health insurance database. (Thesis). NSYSU. Retrieved from http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0624115-131244

Chicago Manual of Style (16th Edition):

Chuang, Shun-hsien. “The study of a big data platform for mining computation and storage : the case study of national health insurance database.” 2015. Thesis, NSYSU. Accessed December 06, 2019. http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0624115-131244.

MLA Handbook (7th Edition):

Chuang, Shun-hsien. “The study of a big data platform for mining computation and storage : the case study of national health insurance database.” 2015. Web. 06 Dec 2019.

Vancouver:

Chuang S. The study of a big data platform for mining computation and storage : the case study of national health insurance database. [Internet] [Thesis]. NSYSU; 2015. [cited 2019 Dec 06]. Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0624115-131244.

Council of Science Editors:

Chuang S. The study of a big data platform for mining computation and storage : the case study of national health insurance database. [Thesis]. NSYSU; 2015. Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0624115-131244

2. Moser, Matjaž. COMPARISON OF MYSQL, NEO4J AND APACHE HIVE DATABASE MANAGEMENT SYSTEMS.

Degree: 2016, Univerza v Mariboru

This work presents a comparison of three database management systems (DBMS). In general, a relational data model is compared with a graph data model. The three systems used are MySQL, Neo4j and Apache Hive. Neo4j is a member of the NoSQL database family and is a well-known graph database. The relational databases used in this work are MySQL and Apache Hive. The latter is not a classic relational system, but since it is modelled as one it can be considered relational. MySQL is a well-known solution that has been on the market for many years and is a standard choice for many data problems. The work makes a detailed comparison of all three systems from different aspects of usage, drawing both on our own experience and on earlier research by other authors and sources. Additionally, some practical information is extracted from our dataset with simple mining techniques, and the result is visualised on an interactive website using modern approaches to data visualisation.

The core of this work is a comparison of query execution times on all three systems. The queries are designed to test as broad a range of operations as possible (computation, joining multiple tables, searching by value, searching by value range, and so on). Based on the query times, we concluded that under the given conditions and with a relatively small amount of data, MySQL is the best choice, followed by Neo4j and lastly Apache Hive. The other aspects of the comparison are hard to judge, since they depend on the individual's judgment and knowledge. We computed the distance between every pair of cuisines based on the ingredients shared in their recipes. For these computations we wrote a script in Python. We presented the results as an interactive web application. The web page shows a world map and, when the user clicks on the map, displays the distance of the other cuisines from the selected one using a color scale. The D3.js library and JavaScript were used to visualise the results. The web page also offers textual explanations for anomalies in the computed results. The application retrieves its data from a Neo4j database in the cloud, for which we used GrapheneDB's free hosting for small graph databases. The application runs on Heroku, a platform as a service (PaaS) that offers free cloud hosting for small applications.

Advisors/Committee Members: Rajkovič, Uroš.

Subjects/Keywords: Neo4j; MySQL; Apache Hadoop/Hive; database comparison; comparison of database management systems; data visualisation
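
The abstract for this record mentions a Python script that computes a distance between every pair of cuisines from the ingredients their recipes share. The exact metric is not stated in the record, so the sketch below assumes Jaccard distance as one plausible choice. The function names and the toy ingredient sets are hypothetical and are not taken from the thesis dataset.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(cuisines):
    """Map each unordered pair of cuisine names to its Jaccard distance."""
    return {
        (x, y): jaccard_distance(cuisines[x], cuisines[y])
        for x, y in combinations(sorted(cuisines), 2)
    }


# Hypothetical toy data (assumption), keyed by cuisine name.
cuisines = {
    "italian": {"tomato", "olive oil", "garlic", "basil"},
    "greek": {"tomato", "olive oil", "feta", "oregano"},
    "japanese": {"rice", "soy sauce", "nori"},
}
distances = pairwise_distances(cuisines)
```

A distance of 0 means identical ingredient sets and 1 means no shared ingredients, which maps naturally onto a color scale like the one the abstract describes.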

APA (6th Edition):

Moser, M. (2016). COMPARISON OF MYSQL, NEO4J AND APACHE HIVE DATABASE MANAGEMENT SYSTEMS. (Masters Thesis). Univerza v Mariboru. Retrieved from https://dk.um.si/IzpisGradiva.php?id=57729 ; https://dk.um.si/Dokument.php?id=87171&dn= ; https://plus.si.cobiss.net/opac7/bib/7575827?lang=sl

Chicago Manual of Style (16th Edition):

Moser, Matjaž. “COMPARISON OF MYSQL, NEO4J AND APACHE HIVE DATABASE MANAGEMENT SYSTEMS.” 2016. Masters Thesis, Univerza v Mariboru. Accessed December 06, 2019. https://dk.um.si/IzpisGradiva.php?id=57729 ; https://dk.um.si/Dokument.php?id=87171&dn= ; https://plus.si.cobiss.net/opac7/bib/7575827?lang=sl.

MLA Handbook (7th Edition):

Moser, Matjaž. “COMPARISON OF MYSQL, NEO4J AND APACHE HIVE DATABASE MANAGEMENT SYSTEMS.” 2016. Web. 06 Dec 2019.

Vancouver:

Moser M. COMPARISON OF MYSQL, NEO4J AND APACHE HIVE DATABASE MANAGEMENT SYSTEMS. [Internet] [Masters thesis]. Univerza v Mariboru; 2016. [cited 2019 Dec 06]. Available from: https://dk.um.si/IzpisGradiva.php?id=57729 ; https://dk.um.si/Dokument.php?id=87171&dn= ; https://plus.si.cobiss.net/opac7/bib/7575827?lang=sl.

Council of Science Editors:

Moser M. COMPARISON OF MYSQL, NEO4J AND APACHE HIVE DATABASE MANAGEMENT SYSTEMS. [Masters Thesis]. Univerza v Mariboru; 2016. Available from: https://dk.um.si/IzpisGradiva.php?id=57729 ; https://dk.um.si/Dokument.php?id=87171&dn= ; https://plus.si.cobiss.net/opac7/bib/7575827?lang=sl


University of North Texas

3. Venumuddala, Ramu Reddy. Distributed Frameworks Towards Building an Open Data Architecture.

Degree: 2015, University of North Texas

Data is everywhere. Advances in digital technology and social media, together with the ease with which application services interact with a variety of systems, are generating tremendous volumes of data. Because these services are so varied, data is no longer restricted to structured formats such as text; it also includes unstructured content such as social media data, videos, and images. Generated data is of no use unless it is stored and analyzed to derive value. Traditional database systems come with limitations on data format, schema, access rates, storage size, and so on. Hadoop is an Apache open source distributed framework that reliably stores huge datasets of differently formatted data on its file system, the Hadoop Distributed File System (HDFS), and processes the data stored on HDFS using the MapReduce programming model. This thesis builds a data architecture using Hadoop and related open source distributed frameworks to support a data flow pipeline on low-cost commodity hardware. The data flow components are data sourcing, storage management on HDFS, and a data access layer. The study also discusses a use case that exercises the architecture components: Sqoop is used to ingest structured data from a database into Hadoop, and Flume is used to ingest semi-structured streaming Twitter JSON data into HDFS for analysis. The data sourced with Sqoop and Flume is analyzed using Hive for SQL-like analytics, and at a higher level of the data access layer, Hadoop is compared with Spark, an in-memory computing system. Significant differences in query execution performance between the Hadoop and Spark frameworks are analyzed. This integration helps ingest huge volumes of varied streaming JSON data to derive better value-based analytics using Hive and Spark.

Advisors/Committee Members: Fu, Song; Caragea, Cornelia; Huang, Yan.

Subjects/Keywords: Hadoop; Hive; MapReduce; Flume; Software architecture; Data structures (Computer science); Apache Hadoop
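
The abstract for this record describes ingesting semi-structured Twitter JSON and running SQL-like analytics over it. As a rough, single-machine illustration of that kind of aggregation, the sketch below counts hashtags in newline-delimited JSON records, much as a GROUP BY query would. It is not the thesis pipeline and uses no Hadoop, Flume, or Hive APIs. The `entities.hashtags[].text` layout and the sample records are assumptions made for the example.

```python
import json
from collections import Counter


def count_hashtags(lines):
    """Count case-insensitive hashtags across newline-delimited JSON tweets."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between records
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed records instead of failing the batch
        if not isinstance(tweet, dict):
            continue  # a valid JSON scalar or list is not a tweet record
        for tag in tweet.get("entities", {}).get("hashtags", []):
            counts[tag["text"].lower()] += 1
    return counts


# Hypothetical sample records (assumption), including one malformed line.
sample = [
    '{"text": "a", "entities": {"hashtags": [{"text": "Hadoop"}, {"text": "Hive"}]}}',
    '{"text": "b", "entities": {"hashtags": [{"text": "hive"}]}}',
    "not json",
    "",
]
hashtag_counts = count_hashtags(sample)
top = hashtag_counts.most_common(1)
```

Skipping malformed lines reflects a common choice for streaming input, where a single bad record should not abort the whole batch.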

APA (6th Edition):

Venumuddala, R. R. (2015). Distributed Frameworks Towards Building an Open Data Architecture. (Thesis). University of North Texas. Retrieved from https://digital.library.unt.edu/ark:/67531/metadc801911/

Chicago Manual of Style (16th Edition):

Venumuddala, Ramu Reddy. “Distributed Frameworks Towards Building an Open Data Architecture.” 2015. Thesis, University of North Texas. Accessed December 06, 2019. https://digital.library.unt.edu/ark:/67531/metadc801911/.

MLA Handbook (7th Edition):

Venumuddala, Ramu Reddy. “Distributed Frameworks Towards Building an Open Data Architecture.” 2015. Web. 06 Dec 2019.

Vancouver:

Venumuddala RR. Distributed Frameworks Towards Building an Open Data Architecture. [Internet] [Thesis]. University of North Texas; 2015. [cited 2019 Dec 06]. Available from: https://digital.library.unt.edu/ark:/67531/metadc801911/.

Council of Science Editors:

Venumuddala RR. Distributed Frameworks Towards Building an Open Data Architecture. [Thesis]. University of North Texas; 2015. Available from: https://digital.library.unt.edu/ark:/67531/metadc801911/
