You searched for subject: (Parallel Distributed Computing).
Showing records 1 – 30 of 150 total matches.

University of Georgia
1.
Agarwal, Abhishek.
Merging parallel simulations.
Degree: 2014, University of Georgia
URL: http://hdl.handle.net/10724/22049
In earlier work, cloning was proposed as a means for efficiently splitting a running simulation midway through its execution into multiple parallel simulations. In simulation cloning, clones are usually able to share computations that occur early in the simulation, but as their states diverge, individual logical processes (LPs) are replicated as necessary so that their computations proceed independently. Over time, the states of the clones (or their constituent LPs) may converge. Traditionally, these converged LPs would continue to execute identical events. We address this inefficiency by merging previously cloned LPs. We show that such merging can further increase efficiency beyond that obtained through cloning alone. We discuss our implementation of merging and illustrate its effectiveness in several example simulation scenarios.
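The merging idea above can be sketched in a few lines of Python; the `LP` record, its hashable `state` field, and the equality test for convergence are hypothetical simplifications, not the author's implementation:

```python
# Sketch: merge logical processes (LPs) that were replicated for
# different clones but whose states have re-converged, so shared
# events run once instead of once per clone. Illustrative only.
from dataclasses import dataclass, field

@dataclass
class LP:
    lp_id: str
    clone_ids: set = field(default_factory=set)  # clones this replica serves
    state: tuple = ()                            # simplified LP state

def merge_converged(lps):
    by_state = {}
    merged = []
    for lp in lps:
        key = (lp.lp_id, lp.state)          # same LP, identical state
        if key in by_state:
            by_state[key].clone_ids |= lp.clone_ids  # one replica serves both
        else:
            by_state[key] = lp
            merged.append(lp)
    return merged

# Replicas of LP "a" for clones 1 and 2 have converged and are merged;
# the divergent replicas of LP "b" remain separate.
lps = [LP("a", {1}, (5,)), LP("a", {2}, (5,)),
       LP("b", {1}, (7,)), LP("b", {2}, (9,))]
result = merge_converged(lps)
```

A real simulator would also have to re-split a merged LP if later inputs caused its clones to diverge again.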
Subjects/Keywords: Parallel and Distributed Computing; Simulations; Cloning; Merging.
APA (6th Edition):
Agarwal, A. (2014). Merging parallel simulations. (Thesis). University of Georgia. Retrieved from http://hdl.handle.net/10724/22049
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Agarwal, Abhishek. “Merging parallel simulations.” 2014. Thesis, University of Georgia. Accessed January 17, 2021.
http://hdl.handle.net/10724/22049.
MLA Handbook (7th Edition):
Agarwal, Abhishek. “Merging parallel simulations.” 2014. Web. 17 Jan 2021.
Vancouver:
Agarwal A. Merging parallel simulations. [Internet] [Thesis]. University of Georgia; 2014. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/10724/22049.
Council of Science Editors:
Agarwal A. Merging parallel simulations. [Thesis]. University of Georgia; 2014. Available from: http://hdl.handle.net/10724/22049

Virginia Tech
2.
Kannan, Vijayasarathy.
A Distributed Approach to EpiFast using Apache Spark.
Degree: MS, Computer Science and Applications, 2015, Virginia Tech
URL: http://hdl.handle.net/10919/55272
EpiFast is a parallel algorithm for large-scale epidemic simulations, based on an interpretation of stochastic disease propagation in a contact network. The original EpiFast implementation is based on a master-slave computation model with a focus on distributed memory using the Message Passing Interface (MPI). However, it suffers from a few shortcomings with respect to the scale of the networks being studied. This thesis addresses these shortcomings and provides two different implementations: Spark-EpiFast, based on the Apache Spark big data processing engine, and Charm-EpiFast, based on the Charm++ parallel programming framework. The study focuses on exploiting features of both systems that we believe could benefit performance and scalability. We present models of EpiFast specific to each system and relate algorithm specifics to several optimization techniques. We also provide a detailed analysis of these optimizations through a range of experiments that consider the scale of the networks and the environment settings we used. Our analysis shows that the Spark-based version is more efficient than the Charm++ and MPI-based counterparts. To the best of our knowledge, ours is one of the earliest efforts to use Apache Spark for epidemic simulations. We believe that our proposed model could act as a reference for similar large-scale epidemiological simulations exploring non-MPI or MapReduce-like approaches.
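The disease-propagation interpretation underlying EpiFast can be sketched as a per-time-step sweep over the contact network; the tiny graph, transmission probability, and function name below are illustrative assumptions, not EpiFast's actual data model:

```python
# One discrete time step of stochastic disease propagation on a contact
# network: each infected node transmits along each contact edge
# independently with probability p. Toy sketch, not the EpiFast code.
import random

def propagate_step(graph, infected, susceptible, p, rng):
    newly_infected = set()
    for u in sorted(infected):
        for v in sorted(graph.get(u, ())):
            # each contact is an independent Bernoulli(p) transmission trial
            if v in susceptible and rng.random() < p:
                newly_infected.add(v)
    return newly_infected

# 4-node contact network; node 0 starts infected.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
new_cases = propagate_step(graph, {0}, {1, 2, 3}, 0.5, random.Random(42))
```

A simulator iterates this step, moving `new_cases` from the susceptible to the infected set; the parallel versions partition the node set across workers.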
Advisors/Committee Members: Marathe, Madhav Vishnu (committeechair), Marathe, Achla (committee member), Vullikanti, Anil Kumar S. (committee member), Chen, Jiangzhuo (committee member).
Subjects/Keywords: computational epidemiology; parallel programming; distributed computing
APA (6th Edition):
Kannan, V. (2015). A Distributed Approach to EpiFast using Apache Spark. (Masters Thesis). Virginia Tech. Retrieved from http://hdl.handle.net/10919/55272
Chicago Manual of Style (16th Edition):
Kannan, Vijayasarathy. “A Distributed Approach to EpiFast using Apache Spark.” 2015. Masters Thesis, Virginia Tech. Accessed January 17, 2021.
http://hdl.handle.net/10919/55272.
MLA Handbook (7th Edition):
Kannan, Vijayasarathy. “A Distributed Approach to EpiFast using Apache Spark.” 2015. Web. 17 Jan 2021.
Vancouver:
Kannan V. A Distributed Approach to EpiFast using Apache Spark. [Internet] [Masters thesis]. Virginia Tech; 2015. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/10919/55272.
Council of Science Editors:
Kannan V. A Distributed Approach to EpiFast using Apache Spark. [Masters Thesis]. Virginia Tech; 2015. Available from: http://hdl.handle.net/10919/55272

University of Tasmania
3.
Atkinson, AK.
Tupleware: a distributed tuple space for the development and execution of array-based applications in a cluster computing environment.
Degree: 2010, University of Tasmania
URL: https://eprints.utas.edu.au/9996/1/Alistair_Atkinson_PhD_Thesis.pdf
This thesis describes Tupleware, an implementation of a distributed tuple space which acts as a scalable and efficient cluster middleware for computationally intensive numerical and scientific applications. Tupleware is based on the Linda coordination language (Gelernter 1985), and incorporates additional techniques such as peer-to-peer communications and exploitation of data locality in order to address problems such as scalability and performance, which are commonly encountered by traditional centralised tuple space implementations.
Tupleware is implemented in such a way that, while processing is taking place, all communication between cluster nodes is decentralised in a peer-to-peer fashion. Communication events are initiated by a node requesting a tuple which is located on a remote node, and in order to make tuple retrieval as efficient as possible, a tuple search algorithm is used to minimise the number of communication instances required to retrieve a remote tuple. This algorithm is based on the locality of a remote tuple and the success of previous remote tuple requests. As Tupleware is targeted at numerical applications which generally involve the partitioning and processing of 1-D or 2-D arrays, a remote tuple can generally be determined to be located on one of a small number of nodes which are processing neighbouring partitions of the array.
Furthermore, unlike some other distributed tuple space implementations, Tupleware does not burden the programmer with any additional complexity due to this distribution. At the application level, the Tupleware middleware behaves exactly like a centralised tuple space, and provides much greater flexibility with regard to where components of a system are executed.
The design and implementation of Tupleware is described and placed in the context of other distributed tuple space implementations, along with the specific requirements of the applications that the system caters for. Finally, Tupleware is evaluated using several numerical and scientific applications, which show it to provide a sufficient level of scalability for a broad range of tasks.
The main contribution of this work is the identification of techniques which enable a tuple space to be efficiently and transparently distributed across the nodes in a cluster. Central to this is the use of an algorithm for tuple retrieval which minimises the number of communication instances which occur during system execution. Distribution transparency is ensured by the provision of a simple interface to the underlying system, so that the distributed tuple space appears to the programmer as a single unified resource.
It is hoped that this research in some way furthers the adoption of the tuple space programming model for distributed computing, by enhancing its ability to
provide improved performance, scalability, flexibility and simplicity for a range of applications not traditionally suited to tuple space based systems.
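For readers unfamiliar with the Linda model the thesis builds on, a centralised tuple space reduces to three operations, sketched below with `None` as the wildcard in match patterns; Tupleware's contribution is distributing this structure across cluster nodes, which this toy does not attempt:

```python
# Minimal Linda-style tuple space: out() deposits a tuple, rd() reads a
# matching tuple, in_() withdraws one. Centralised toy, unlike Tupleware.
class TupleSpace:
    def __init__(self):
        self.tuples = []

    def out(self, tup):
        self.tuples.append(tup)

    def _match(self, tup, pattern):
        # None acts as a wildcard ("formal") field in the pattern
        return len(tup) == len(pattern) and all(
            p is None or p == t for t, p in zip(tup, pattern))

    def rd(self, pattern):
        return next(t for t in self.tuples if self._match(t, pattern))

    def in_(self, pattern):  # "in" is a Python keyword
        t = self.rd(pattern)
        self.tuples.remove(t)
        return t

ts = TupleSpace()
ts.out(("row", 0, [1.0, 2.0]))   # array partitions stored as tuples
ts.out(("row", 1, [3.0, 4.0]))
row = ts.in_(("row", 1, None))   # withdraw partition 1, whatever its data
```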
Subjects/Keywords: Distributed computing; parallel computing; concurrency; high-performance computing; tuple space.
APA (6th Edition):
Atkinson, A. (2010). Tupleware: a distributed tuple space for the development and execution of array-based applications in a cluster computing environment. (Thesis). University of Tasmania. Retrieved from https://eprints.utas.edu.au/9996/1/Alistair_Atkinson_PhD_Thesis.pdf
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Atkinson, AK. “Tupleware: a distributed tuple space for the development and execution of array-based applications in a cluster computing environment.” 2010. Thesis, University of Tasmania. Accessed January 17, 2021.
https://eprints.utas.edu.au/9996/1/Alistair_Atkinson_PhD_Thesis.pdf.
MLA Handbook (7th Edition):
Atkinson, AK. “Tupleware: a distributed tuple space for the development and execution of array-based applications in a cluster computing environment.” 2010. Web. 17 Jan 2021.
Vancouver:
Atkinson A. Tupleware: a distributed tuple space for the development and execution of array-based applications in a cluster computing environment. [Internet] [Thesis]. University of Tasmania; 2010. [cited 2021 Jan 17].
Available from: https://eprints.utas.edu.au/9996/1/Alistair_Atkinson_PhD_Thesis.pdf.
Council of Science Editors:
Atkinson A. Tupleware: a distributed tuple space for the development and execution of array-based applications in a cluster computing environment. [Thesis]. University of Tasmania; 2010. Available from: https://eprints.utas.edu.au/9996/1/Alistair_Atkinson_PhD_Thesis.pdf

North Carolina State University
4.
Lin, Heshan.
High Performance Parallel and Distributed Genomic Sequence Search.
Degree: PhD, Computer Science, 2009, North Carolina State University
URL: http://www.lib.ncsu.edu/resolver/1840.16/3481
Genomic sequence database search identifies similarities between given query sequences and known sequences in a database. It forms a critical class of applications used widely and routinely in computational biology. Due to their wide application in diverse task settings, sequence search tools today are run on several types of parallel systems, including batch jobs on one or more supercomputers and interactive queries through web-based services. Despite successful parallelization of popular sequence search tools such as BLAST, in the past two decades the growth of sequence databases has outpaced that of computing hardware, making scalable and efficient parallel sequence search crucial in helping life scientists deal with the ever-increasing amount of genomic information.
In this thesis, we investigate efficient and scalable parallel and distributed sequence-search solutions by addressing unique problems and challenges in the aforementioned execution settings. Specifically, this thesis research 1) introduces parallel I/O techniques into sequence-search tools and proposes novel computation and I/O co-scheduling algorithms that enable genomic sequence search to scale efficiently on massively parallel computers; 2) presents a semantics-based distributed I/O framework that leverages application-specific metadata to drastically reduce the amount of data transfer and thus enables distributed sequence-search collaboration at the global scale; 3) proposes a novel request-scheduling technique for clustered sequence-search web servers that comprehensively takes into account both data locality and parallel search efficiency to optimize query response time under various server load levels and access scenarios. The efficacy of our proposed solutions has been verified on a broad range of parallel and distributed systems, including petascale supercomputers, the NSF TeraGrid system, and small- to medium-sized clusters. In addition, our optimizations of massively parallel sequence search have been merged into the official release of mpiBLAST-PIO, currently the only supported branch of mpiBLAST, a popular open-source sequence-search tool. mpiBLAST-PIO is able to achieve 93% parallel efficiency across 32,768 cores on the IBM Blue Gene/P supercomputer.
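The data-locality concern in contribution 3) can be illustrated with a small scheduling sketch: keep database fragments on workers that already cache them, then spread the rest. The function and names are hypothetical, not the scheduler used in mpiBLAST-PIO or the thesis:

```python
# Locality-aware assignment of database fragments to workers: cached
# fragments stay put (no re-read from shared storage); the remainder is
# dealt out round-robin. Hypothetical sketch.
def schedule(fragments, workers, cache):
    remaining = set(fragments)
    plan = {w: [] for w in workers}
    for w in workers:                             # first pass: honour caches
        for f in sorted(cache.get(w, set()) & remaining):
            plan[w].append(f)
            remaining.discard(f)
    for i, f in enumerate(sorted(remaining)):     # second pass: spread the rest
        plan[workers[i % len(workers)]].append(f)
    return plan

plan = schedule([0, 1, 2, 3], ["w0", "w1"], {"w0": {2}, "w1": {0, 3}})
```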
Advisors/Committee Members: Xiaosong Ma, Committee Chair (advisor), Steffen Heber, Committee Member (advisor), Frank Mueller, Committee Member (advisor), Nagiza Samatova, Committee Member (advisor), Douglas Reeves, Committee Member (advisor).
Subjects/Keywords: parallel I/O; scheduling; distributed computing; parallel bioinformatics; sequence database search
APA (6th Edition):
Lin, H. (2009). High Performance Parallel and Distributed Genomic Sequence Search. (Doctoral Dissertation). North Carolina State University. Retrieved from http://www.lib.ncsu.edu/resolver/1840.16/3481
Chicago Manual of Style (16th Edition):
Lin, Heshan. “High Performance Parallel and Distributed Genomic Sequence Search.” 2009. Doctoral Dissertation, North Carolina State University. Accessed January 17, 2021.
http://www.lib.ncsu.edu/resolver/1840.16/3481.
MLA Handbook (7th Edition):
Lin, Heshan. “High Performance Parallel and Distributed Genomic Sequence Search.” 2009. Web. 17 Jan 2021.
Vancouver:
Lin H. High Performance Parallel and Distributed Genomic Sequence Search. [Internet] [Doctoral dissertation]. North Carolina State University; 2009. [cited 2021 Jan 17].
Available from: http://www.lib.ncsu.edu/resolver/1840.16/3481.
Council of Science Editors:
Lin H. High Performance Parallel and Distributed Genomic Sequence Search. [Doctoral Dissertation]. North Carolina State University; 2009. Available from: http://www.lib.ncsu.edu/resolver/1840.16/3481

NSYSU
5.
Lin, Chieh-Wei.
Software Implementation of Topology-based Communication Library.
Degree: Master, Electrical Engineering, 2013, NSYSU
URL: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0530113-200455
Along with the rapid development of the Internet and computer technology, distributed and parallel computing is widely applied to enhance application performance through parallel processing on multiple computers. In distributed computing systems, computers exchange information and data via network interconnections. Communication interconnection among parallel programs is an important part of the program design of a distributed parallel system. In our research, we developed a software design for a topology-based communication library. We focused on communication interconnections among parallel programs and developed an easy-to-use and general topology-based communication interconnection design. A parallel program can utilize the library's interconnection input and connection functions to construct the communication channels it requires automatically. This frees a parallel program design from writing complex communication channel construction code and makes parallel and distributed programs easier to design. We designed three kinds of supported topologies: n-dimensional array, n-dimensional mesh, and graph topologies. In this thesis, we carried out a software implementation of the communication library. We also performed parallel program experiments with the implemented library to verify the correctness of the implementation and measure its performance.
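The n-dimensional mesh topology mentioned above determines, for each process, which neighbours it needs channels to; a sketch of that derivation (function and variable names are ours, not the library's):

```python
# Neighbours of a process at `coord` in an n-dimensional mesh (no
# wraparound): one step along each axis in each direction, clipped to
# the grid. A topology-based library would open one channel per result.
def mesh_neighbors(coord, dims):
    out = []
    for axis in range(len(dims)):
        for step in (-1, 1):
            n = list(coord)
            n[axis] += step
            if 0 <= n[axis] < dims[axis]:
                out.append(tuple(n))
    return out

center = mesh_neighbors((1, 1), (3, 3))   # interior process: 4 channels
corner = mesh_neighbors((0, 0), (3, 3))   # corner process: 2 channels
```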
Advisors/Committee Members: Shiann-Rong Kuang (chair), Tsung Lee (committee member), Chih-Chien Chen (chair).
Subjects/Keywords: parallel computing; topology; communication interconnection design; communication library; distributed computing systems
APA (6th Edition):
Lin, C. (2013). Software Implementation of Topology-based Communication Library. (Thesis). NSYSU. Retrieved from http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0530113-200455
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Lin, Chieh-Wei. “Software Implementation of Topology-based Communication Library.” 2013. Thesis, NSYSU. Accessed January 17, 2021.
http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0530113-200455.
MLA Handbook (7th Edition):
Lin, Chieh-Wei. “Software Implementation of Topology-based Communication Library.” 2013. Web. 17 Jan 2021.
Vancouver:
Lin C. Software Implementation of Topology-based Communication Library. [Internet] [Thesis]. NSYSU; 2013. [cited 2021 Jan 17].
Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0530113-200455.
Council of Science Editors:
Lin C. Software Implementation of Topology-based Communication Library. [Thesis]. NSYSU; 2013. Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0530113-200455

NSYSU
6.
Chen, Song-Yi.
Migrating processes in distributed computing systems.
Degree: Master, Computer Science and Engineering, 2013, NSYSU
URL: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0715113-153751
Early on, before cluster computing systems were widely used, process migration (fork [11], thread [6, 40]) took place within a single machine. Cluster computing systems are now very common, and many of them use the lightweight and easy-to-use SLURM [38] system, but SLURM provides no mechanism for migrating a process to another compute node in the cluster; that is, for a running program to create a new program, move it to another compute node for execution, and receive the results back before continuing. In this thesis, we extend SLURM so that the cluster computing system implements this fork-and-exec utility.
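A minimal sketch of the fork-and-exec idea on top of SLURM, assuming the standard `srun` launcher (`-N1 -n1` requests one task on one node; `--nodelist` pins a specific node). Only the command-line builder runs without a cluster; the program and node names are invented for illustration:

```python
# Offload a child program to another compute node by wrapping it in
# srun, blocking until its results come back (the fork-and-exec shape
# described in the abstract). Sketch, not the thesis implementation.
import subprocess

def remote_argv(cmd, node=None):
    argv = ["srun", "-N1", "-n1"]          # one task on one node
    if node:
        argv += ["--nodelist", node]       # pin the target node
    return argv + list(cmd)

def remote_exec(cmd, node=None):
    # Blocks until the remote program finishes, then returns its output,
    # mirroring "results back to the original caller".
    return subprocess.run(remote_argv(cmd, node), capture_output=True, text=True)

argv = remote_argv(["./solver", "--input", "data.bin"], node="node07")
```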
Advisors/Committee Members: Kuang-Chih Huang (chair), Chun-Hung Lin (committee member), Hung-Jen Liao (chair), Shi-Huang Chen (chair), Wei Kuang Lai (chair).
Subjects/Keywords: cluster system; SLURM; Process Migration; parallel computing; distributed computing
APA (6th Edition):
Chen, S. (2013). Migrating processes in distributed computing systems. (Thesis). NSYSU. Retrieved from http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0715113-153751
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Chen, Song-Yi. “Migrating processes in distributed computing systems.” 2013. Thesis, NSYSU. Accessed January 17, 2021.
http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0715113-153751.
MLA Handbook (7th Edition):
Chen, Song-Yi. “Migrating processes in distributed computing systems.” 2013. Web. 17 Jan 2021.
Vancouver:
Chen S. Migrating processes in distributed computing systems. [Internet] [Thesis]. NSYSU; 2013. [cited 2021 Jan 17].
Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0715113-153751.
Council of Science Editors:
Chen S. Migrating processes in distributed computing systems. [Thesis]. NSYSU; 2013. Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0715113-153751

University of Otago
7.
Yu, Byung-Hyun.
Design and Implementation of an Efficient and Scalable Software Distributed Shared Memory System.
Degree: 2012, University of Otago
URL: http://hdl.handle.net/10523/2088
This thesis presents the design and implementation of our novel hybrid software DSM system. We call our system hybrid home-based EAC (HHEAC), since the system implements our novel exclusive access consistency (EAC) model on top of a hybrid of the homeless and home-based protocols. HHEAC guarantees only that shared variables inside a critical section are up to date before they are accessed. Other shared variables outside a critical section are guaranteed to be up to date after the next barrier synchronisation.
Our home-based DSM implementation differs from previous implementations in that a home node does not receive any diffs from non-home nodes until the next barrier synchronisation. It also differs in that during a lock synchronisation the required diffs are prefetched before the critical section, which reduces not only data traffic but also page faults inside the critical section.
We also present a diff integration technique that can further reduce unnecessary data traffic during lock synchronisation. This technique is especially effective in reducing data traffic for migratory applications.
Finally, we develop a home migration technique that solves the wrong-home-assignment problem in the home-based protocol. Our technique is different from others in that an optimum home node is decided before the home node is updated.
To evaluate our system, we performed various experiments with well-known benchmark applications, including a novel parallel neural network application. The performance evaluation shows that HHEAC is more scalable than other DSM systems such as TreadMarks and removes the home assignment problem in the conventional home-based protocol.
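The prefetch-at-lock idea can be illustrated with a toy: pending diffs for the pages a critical section touches are applied at acquire time, so no page faults occur inside the section. Page names, the diff format, and the `acquire` signature are our inventions, not HHEAC's:

```python
# Apply all pending diffs for the pages a critical section will touch
# before entering it (diff prefetching at lock acquire). Toy model:
# memory maps page -> word list, and a diff is (offset, new_value).
def acquire(lock, pages, pending_diffs, memory):
    for page in pages:
        for offset, value in pending_diffs.pop(page, []):
            memory[page][offset] = value

memory = {"p0": [0, 0, 0], "p1": [0, 0, 0]}
pending_diffs = {"p0": [(1, 42)], "p1": [(0, 7), (2, 9)]}
acquire("lock0", ["p0", "p1"], pending_diffs, memory)
```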
Advisors/Committee Members: Cranefield, Stephen (advisor).
Subjects/Keywords: Distributed Shared Memory;
High Performance Computing;
Parallel Computing;
Cache Coherence
APA (6th Edition):
Yu, B. (2012). Design and Implementation of an Efficient and Scalable Software Distributed Shared Memory System. (Doctoral Dissertation). University of Otago. Retrieved from http://hdl.handle.net/10523/2088
Chicago Manual of Style (16th Edition):
Yu, Byung-Hyun. “Design and Implementation of an Efficient and Scalable Software Distributed Shared Memory System.” 2012. Doctoral Dissertation, University of Otago. Accessed January 17, 2021.
http://hdl.handle.net/10523/2088.
MLA Handbook (7th Edition):
Yu, Byung-Hyun. “Design and Implementation of an Efficient and Scalable Software Distributed Shared Memory System.” 2012. Web. 17 Jan 2021.
Vancouver:
Yu B. Design and Implementation of an Efficient and Scalable Software Distributed Shared Memory System. [Internet] [Doctoral dissertation]. University of Otago; 2012. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/10523/2088.
Council of Science Editors:
Yu B. Design and Implementation of an Efficient and Scalable Software Distributed Shared Memory System. [Doctoral Dissertation]. University of Otago; 2012. Available from: http://hdl.handle.net/10523/2088

Northeastern University
8.
Sun, Enqiang.
Cross-platform heterogeneous runtime environment.
Degree: PhD, Department of Electrical and Computer Engineering, 2016, Northeastern University
URL: http://hdl.handle.net/2047/D20213163
Heterogeneous platforms are becoming widely adopted thanks to support from new languages and programming models. Among these, OpenCL is an industry standard for parallel programming on heterogeneous devices. With OpenCL, compute-intensive portions of an application can be offloaded to a variety of processing units within a system. OpenCL is one of the first standards to focus on portability, allowing programs to be written once and run unmodified on multiple heterogeneous devices, regardless of vendor.
While OpenCL has been widely adopted, it still lacks support for automatic workload balancing and data consistency when multiple devices are present in a system. To address this need, we have designed a cross-platform heterogeneous runtime environment that provides a high-level, unified execution model coupled with an intelligent resource-management facility. The main motivation for developing this runtime environment is to give OpenCL programmers a convenient programming paradigm to fully utilize all available devices in a system and to incorporate flexible workload balancing schemes without compromising the user's ability to assign tasks according to data affinity. Our work removes much of the cumbersome initialization of the platform; devices and related OpenCL objects are now hidden under the hood.
Equipped with this new runtime environment and its associated programming interface, the programmer can focus on designing the application and worry less about customization for the target platform. Further, the programmer can now take advantage of multiple devices using a dynamic workload balancing algorithm to reap the benefits of task-level parallelism.
To demonstrate the value of this cross-platform heterogeneous runtime environment, we have evaluated it on both micro-benchmarks and popular OpenCL benchmark applications. With minimal overhead for managing data objects across devices, the experimental results show scalable performance speedup with an increasing number of computing devices, without any changes to the program source code.
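The dynamic workload balancing described above can be sketched as a simulation: each chunk goes to whichever device will be free soonest, given estimated throughputs. Device names and speeds are made-up inputs; a real runtime would dispatch OpenCL kernels instead of recording chunk indices:

```python
# Greedy dynamic dispatch across heterogeneous devices: a min-heap of
# (time_free, device) hands each chunk to the soonest-free device, so a
# faster device naturally absorbs more chunks. Illustrative sketch.
import heapq

def balance(chunks, speeds):
    free_at = [(0.0, dev) for dev in sorted(speeds)]
    heapq.heapify(free_at)
    assignment = {dev: [] for dev in speeds}
    for chunk, work in enumerate(chunks):
        t, dev = heapq.heappop(free_at)
        assignment[dev].append(chunk)
        heapq.heappush(free_at, (t + work / speeds[dev], dev))
    return assignment

# Six equal chunks over a CPU and a GPU twice as fast.
assignment = balance([1.0] * 6, {"cpu": 1.0, "gpu": 2.0})
```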
Subjects/Keywords: parallel computing; runtime; Heterogeneous computing; Parallel programming (Computer science); Parallel processing (Electronic computers); Parallel computers; Electronic data processing; Distributed processing
APA (6th Edition):
Sun, E. (2016). Cross-platform heterogeneous runtime environment. (Doctoral Dissertation). Northeastern University. Retrieved from http://hdl.handle.net/2047/D20213163
Chicago Manual of Style (16th Edition):
Sun, Enqiang. “Cross-platform heterogeneous runtime environment.” 2016. Doctoral Dissertation, Northeastern University. Accessed January 17, 2021.
http://hdl.handle.net/2047/D20213163.
MLA Handbook (7th Edition):
Sun, Enqiang. “Cross-platform heterogeneous runtime environment.” 2016. Web. 17 Jan 2021.
Vancouver:
Sun E. Cross-platform heterogeneous runtime environment. [Internet] [Doctoral dissertation]. Northeastern University; 2016. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/2047/D20213163.
Council of Science Editors:
Sun E. Cross-platform heterogeneous runtime environment. [Doctoral Dissertation]. Northeastern University; 2016. Available from: http://hdl.handle.net/2047/D20213163

Rice University
9.
Peng, Zhimin.
Parallel Sparse Optimization.
Degree: MA, Engineering, 2013, Rice University
URL: http://hdl.handle.net/1911/77447
▼ This thesis proposes parallel and distributed algorithms for solving very large-scale sparse optimization problems on computer clusters and clouds. Many modern application problems from compressive sensing, machine learning, and signal and image processing involve large-scale data and can be modeled as sparse optimization problems. These problems are so large that they can no longer be processed on single workstations running single-threaded computing approaches, so moving to parallel/distributed/cloud computing becomes a viable option. I propose two approaches for solving these problems. The first is a distributed implementation of a class of efficient proximal linear methods for solving convex optimization problems, which takes advantage of the separability of the terms in the objective. The second is a parallel greedy coordinate descent method (GRock), which greedily chooses several entries to update in parallel in each iteration. I establish the convergence of GRock and explain why it often performs exceptionally well for sparse optimization. Extensive numerical results on a computer cluster and Amazon EC2 demonstrate the efficiency and elasticity of my algorithms.
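The GRock idea described above, greedily selecting a few coordinates and updating them together, can be sketched for a LASSO problem. This is an illustrative simplification of standard proximal coordinate descent, not the thesis's cluster implementation; the function names, step rule, and data are all assumptions.

```python
# Sketch of a GRock-style greedy step for min_x 0.5*||Ax - b||^2 + lam*||x||_1.
# Hypothetical simplification: coordinates are scored serially here, and the
# P most promising ones are updated together ("in parallel").
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the l1 term, applied elementwise."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def grock_step(A, b, x, lam, P=2):
    """Update the P coordinates promising the largest moves."""
    r = A @ x - b                       # current residual
    g = A.T @ r                         # full gradient of the smooth part
    L = (A ** 2).sum(axis=0)            # per-coordinate Lipschitz constants
    cand = soft_threshold(x - g / L, lam / L)   # candidate value per coordinate
    scores = np.abs(cand - x)           # greedy merit: size of the move
    picked = np.argsort(scores)[-P:]    # P best coordinates
    x_new = x.copy()
    x_new[picked] = cand[picked]
    return x_new

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 8))
x_true = np.zeros(8); x_true[:2] = [3.0, -2.0]   # sparse ground truth
b = A @ x_true
x = np.zeros(8)
for _ in range(200):
    x = grock_step(A, b, x, lam=0.1, P=2)
```

Low mutual coherence of the columns of A is what lets several coordinates move at once without the updates fighting each other, which is why the method suits sparse problems.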
Advisors/Committee Members: Yin, Wotao (advisor), Zhang, Yin (committee member), Baraniuk, Richard G. (committee member).
Subjects/Keywords: Sparse optimization; Parallel computing; Distributed computing; Prox-linear methods; GRock; Applied Math
APA (6th Edition):
Peng, Z. (2013). Parallel Sparse Optimization. (Masters Thesis). Rice University. Retrieved from http://hdl.handle.net/1911/77447
Chicago Manual of Style (16th Edition):
Peng, Zhimin. “Parallel Sparse Optimization.” 2013. Masters Thesis, Rice University. Accessed January 17, 2021.
http://hdl.handle.net/1911/77447.
MLA Handbook (7th Edition):
Peng, Zhimin. “Parallel Sparse Optimization.” 2013. Web. 17 Jan 2021.
Vancouver:
Peng Z. Parallel Sparse Optimization. [Internet] [Masters thesis]. Rice University; 2013. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/1911/77447.
Council of Science Editors:
Peng Z. Parallel Sparse Optimization. [Masters Thesis]. Rice University; 2013. Available from: http://hdl.handle.net/1911/77447

University of California – San Diego
10.
VASUKI BALASUBRAMANIAM, KARTHIKEYAN.
System and Analysis for Low Latency Video Processing using Microservices.
Degree: Computer Science, 2017, University of California – San Diego
URL: http://www.escholarship.org/uc/item/6c38332p
► The evolution of big data processing and analysis has led to data-parallel frameworks such as Hadoop, MapReduce, Spark, and Hive, which are capable of analyzing…
(more)
▼ The evolution of big data processing and analysis has led to data-parallel frameworks such as Hadoop, MapReduce, Spark, and Hive, which are capable of analyzing large streams of data such as server logs, web transactions, and user reviews. Videos are one of the biggest sources of data and dominate Internet traffic. Video processing at large scale is critical and challenging, as videos possess spatial and temporal features that the existing data-parallel frameworks do not take into account. A broad range of users want to apply sophisticated video processing pipelines such as transcoding, feature extraction, classification, scene-cut detection, and digital compositing to video content. Parallel video processing poses several significant research challenges to existing data processing frameworks. Current systems are capable of processing videos, but with high resource startup times, a small degree of parallelism, low average resource utilization, coarse-grained billing, and high latency. This research proposes a low-latency software runtime for processing a single video efficiently by orchestrating cloud-based microservices. The system leverages the lightweight microservices provided by the Amazon Web Services Lambda framework.
Subjects/Keywords: Computer science; AWS; cloud computing; distributed; lambda; parallel; Video
APA (6th Edition):
VASUKI BALASUBRAMANIAM, K. (2017). System and Analysis for Low Latency Video Processing using Microservices. (Thesis). University of California – San Diego. Retrieved from http://www.escholarship.org/uc/item/6c38332p
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
VASUKI BALASUBRAMANIAM, KARTHIKEYAN. “System and Analysis for Low Latency Video Processing using Microservices.” 2017. Thesis, University of California – San Diego. Accessed January 17, 2021.
http://www.escholarship.org/uc/item/6c38332p.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
VASUKI BALASUBRAMANIAM, KARTHIKEYAN. “System and Analysis for Low Latency Video Processing using Microservices.” 2017. Web. 17 Jan 2021.
Vancouver:
VASUKI BALASUBRAMANIAM K. System and Analysis for Low Latency Video Processing using Microservices. [Internet] [Thesis]. University of California – San Diego; 2017. [cited 2021 Jan 17].
Available from: http://www.escholarship.org/uc/item/6c38332p.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
VASUKI BALASUBRAMANIAM K. System and Analysis for Low Latency Video Processing using Microservices. [Thesis]. University of California – San Diego; 2017. Available from: http://www.escholarship.org/uc/item/6c38332p
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

North Carolina State University
11.
Lim, Min Yeol.
Improving Power and Performance Efficiency in Parallel and Distributed Computing Systems.
Degree: PhD, Computer Science, 2009, North Carolina State University
URL: http://www.lib.ncsu.edu/resolver/1840.16/5662
► For decades, high-performance computing systems have focused on increasing maximum performance at any cost. A consequence of the devotion towards boosting performance significantly increases power…
(more)
▼ For decades, high-performance computing systems have focused on increasing maximum performance at any cost. A consequence of this devotion to boosting performance is significantly increased power consumption. The most powerful supercomputers require up to 10 megawatts of peak power, enough to sustain a city of 40,000. However, some of that power may be wasted with little or no performance gain, because applications do not require peak performance all the time. Therefore, improving power and performance efficiency becomes one of the primary concerns in parallel and distributed computing. Our goal is to build a runtime system that can understand power-performance tradeoffs and balance power consumption and performance penalty adaptively.
In this thesis, we make the following contributions. First, we develop an MPI runtime system that can dynamically balance power and performance tradeoffs in MPI applications. Our system dynamically identifies power-saving opportunities without prior knowledge of system behavior and then determines the best p-state to improve power and performance efficiency. The system is entirely transparent to MPI applications and requires no user intervention. Second, we develop a method for determining the minimum energy consumption in voltage and frequency scaling systems for a given time delay. Our approach helps to better analyze how well a specific DVFS algorithm balances power and performance. Third, we develop a power prediction model that correlates power and performance data on a chip multiprocessor machine. Our model shows that power consumption can be estimated from hardware performance counters with reasonable accuracy in various execution environments. Given the prediction model, one can make a runtime decision balancing power and performance tradeoffs on a chip multiprocessor machine without the delay of actual power measurements. Last, we develop an algorithm to save power by dynamically migrating virtual machines and placing them onto fewer physical machines depending on workloads. Our scheme uses a two-level, adaptive buffering scheme that reserves processing capacity. It adapts the buffer sizes to workloads in order to balance performance violations against energy savings by reducing the energy wasted on the buffers. Our simulation framework justifies our study of the energy benefits and performance effects of the algorithm, along with studies of its sensitivity to various parameters.
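The counter-based power prediction idea mentioned above can be sketched as a linear fit of measured power against hardware event counts. The counters, weights, and idle power below are synthetic stand-ins invented for illustration, not the thesis's measured data or model.

```python
# Illustrative sketch: estimate CPU power from performance counters with a
# linear least-squares fit. All numbers are synthetic assumptions.
import numpy as np

rng = np.random.default_rng(1)
# synthetic normalized counters: instructions retired, cache misses, mem accesses
counters = rng.uniform(0, 1, size=(100, 3))
true_w = np.array([40.0, 15.0, 25.0])                 # hypothetical weights (watts)
power = counters @ true_w + 20.0 + rng.normal(0, 0.5, 100)  # 20 W idle + noise

X = np.hstack([counters, np.ones((100, 1))])          # add intercept column
w, *_ = np.linalg.lstsq(X, power, rcond=None)         # fit weights + idle power

predicted = X @ w
mean_abs_error = np.abs(predicted - power).mean()     # fit quality in watts
```

Once such a model is fitted, a runtime can read the counters and predict power on the fly instead of waiting on a physical power meter, which is the point the abstract makes.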
Advisors/Committee Members: Dr. George N. Rouskas, Committee Member (advisor), Dr. Gregory T. Byrd, Committee Member (advisor), Dr. Xiaosong Ma, Committee Member (advisor), Dr. Robert J. Fowler, Committee Member (advisor), Dr. Vincent W. Freeh, Committee Chair (advisor).
Subjects/Keywords: Virtualization; DVFS; Power aware computing; Parallel and distributed system
APA (6th Edition):
Lim, M. Y. (2009). Improving Power and Performance Efficiency in Parallel and Distributed Computing Systems. (Doctoral Dissertation). North Carolina State University. Retrieved from http://www.lib.ncsu.edu/resolver/1840.16/5662
Chicago Manual of Style (16th Edition):
Lim, Min Yeol. “Improving Power and Performance Efficiency in Parallel and Distributed Computing Systems.” 2009. Doctoral Dissertation, North Carolina State University. Accessed January 17, 2021.
http://www.lib.ncsu.edu/resolver/1840.16/5662.
MLA Handbook (7th Edition):
Lim, Min Yeol. “Improving Power and Performance Efficiency in Parallel and Distributed Computing Systems.” 2009. Web. 17 Jan 2021.
Vancouver:
Lim MY. Improving Power and Performance Efficiency in Parallel and Distributed Computing Systems. [Internet] [Doctoral dissertation]. North Carolina State University; 2009. [cited 2021 Jan 17].
Available from: http://www.lib.ncsu.edu/resolver/1840.16/5662.
Council of Science Editors:
Lim MY. Improving Power and Performance Efficiency in Parallel and Distributed Computing Systems. [Doctoral Dissertation]. North Carolina State University; 2009. Available from: http://www.lib.ncsu.edu/resolver/1840.16/5662

Syracuse University
12.
Abi Saad, Maria.
Facilitating High Performance Code Parallelization.
Degree: PhD, Electrical Engineering and Computer Science, 2017, Syracuse University
URL: https://surface.syr.edu/etd/712
► With the surge of social media on one hand and the ease of obtaining information due to cheap sensing devices and open source APIs…
(more)
▼ With the surge of social media on one hand and the ease of obtaining information from cheap sensing devices and open-source APIs on the other, the amount of data to be processed is vastly increasing as well. In addition, the world of computing has recently been witnessing a growing shift towards massively parallel distributed systems, due to the increasing importance of transforming data into knowledge in today’s data-driven world. At the core of data analysis for all sorts of applications lies pattern matching. Therefore, pattern matching algorithms should be parallelized efficiently in order to cater to this ever-increasing abundance of data. We propose a method that automatically detects a user’s single-threaded function call to search for a pattern using Java’s standard regular expression library, and replaces it with our own data-parallel implementation using Java bytecode injection. Our approach facilitates parallel processing on different platforms consisting of shared-memory systems (using multithreading and NVIDIA GPUs) and distributed systems (using MPI and Hadoop). The major contributions of our implementation consist of reducing execution time while remaining transparent to the user. In the same spirit of facilitating high-performance code parallelization, we also present a tool that automatically generates Spark Java code from minimal user-supplied inputs. Spark has emerged as the tool of choice for efficient big data analysis. However, users still have to learn the complicated Spark API in order to write even a simple application. Our tool is easy to use, interactive, and offers the performance of Spark’s native Java API. To the best of our knowledge, at the time of this writing such a tool has not yet been implemented.
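The chunk-and-scan idea behind such data-parallel pattern matching can be sketched in Python (the thesis works in Java via bytecode injection; this is only an illustration). It assumes no match is longer than the chunk overlap, and CPython threads stand in for the distributed workers, so no real speedup is claimed here.

```python
# Minimal data-parallel pattern search: split the text into overlapping
# chunks, scan each chunk concurrently, and keep only matches that start
# inside a chunk's own (non-overlap) region so nothing is double-counted.
import re
from concurrent.futures import ThreadPoolExecutor

def parallel_findall(pattern, text, workers=4, overlap=32):
    rx = re.compile(pattern)
    size = max(1, len(text) // workers)
    chunks = [(i, text[i:i + size + overlap]) for i in range(0, len(text), size)]

    def scan(args):
        base, chunk = args
        # overlap lets a match run past the region boundary; the start<size
        # filter guarantees each match is reported by exactly one chunk
        return [(base + m.start(), m.group())
                for m in rx.finditer(chunk) if m.start() < size]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(scan, chunks)          # one task per chunk
    return [m for part in results for m in part]  # chunk order = text order

text = "cat dog cat bird cat" * 100
matches = parallel_findall(r"cat", text)  # same matches re.finditer would yield
```

A distributed version would ship the chunks to workers (MPI ranks, Hadoop mappers) instead of threads, but the boundary-overlap bookkeeping is the same.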
Advisors/Committee Members: C.Y. Roger Chen.
Subjects/Keywords: Distributed systems; GPU; Java bytecode injection; Multithreading; Parallel computing; Spark; Engineering
APA (6th Edition):
Abi Saad, M. (2017). Facilitating High Performance Code Parallelization. (Doctoral Dissertation). Syracuse University. Retrieved from https://surface.syr.edu/etd/712
Chicago Manual of Style (16th Edition):
Abi Saad, Maria. “Facilitating High Performance Code Parallelization.” 2017. Doctoral Dissertation, Syracuse University. Accessed January 17, 2021.
https://surface.syr.edu/etd/712.
MLA Handbook (7th Edition):
Abi Saad, Maria. “Facilitating High Performance Code Parallelization.” 2017. Web. 17 Jan 2021.
Vancouver:
Abi Saad M. Facilitating High Performance Code Parallelization. [Internet] [Doctoral dissertation]. Syracuse University; 2017. [cited 2021 Jan 17].
Available from: https://surface.syr.edu/etd/712.
Council of Science Editors:
Abi Saad M. Facilitating High Performance Code Parallelization. [Doctoral Dissertation]. Syracuse University; 2017. Available from: https://surface.syr.edu/etd/712

University of Glasgow
13.
Djemame, Karim.
Distributed simulation of high-level algebraic Petri nets.
Degree: PhD, 1999, University of Glasgow
URL: http://theses.gla.ac.uk/76248/
;
https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.301624
► In the field of Petri nets, simulation is an essential tool to validate and evaluate models. Conventional simulation techniques, designed for their use in sequential…
(more)
▼ In the field of Petri nets, simulation is an essential tool to validate and evaluate models. Conventional simulation techniques, designed for use on sequential computers, are too slow if the system to simulate is large or complex. The aim of this work is to find techniques that accelerate simulations by exploiting the parallelism available in current commercial multicomputers, and to use these techniques to study a class of Petri nets called high-level algebraic nets. These nets exploit the rich theory of algebraic specifications for high-level Petri nets: Petri nets gain a great deal of modelling power by representing dynamically changing items as structured tokens, whereas algebraic specifications have turned out to be an adequate and flexible instrument for handling structured items. In this work we focus on ECATNets (Extended Concurrent Algebraic Term Nets), whose most distinctive feature is a semantics defined in terms of rewriting logic. Nevertheless, ECATNets have two drawbacks: the occultation of the aspect of time, and poor exploitation of the parallelism inherent in the models. Three distributed simulation techniques have been considered: asynchronous conservative, asynchronous optimistic, and synchronous. These algorithms have been implemented in a multicomputer environment: a network of workstations. The influence that factors such as the characteristics of the simulated models, the organisation of the simulators, and the characteristics of the target multicomputer have on the performance of the simulations has been measured and characterised. It is concluded that synchronous distributed simulation techniques are not suitable for the kind of models considered, although they may provide good performance in other environments. Conservative and optimistic distributed simulation techniques perform well, especially if the model to simulate is complex or large, precisely the worst case for traditional sequential simulators.
This way, studies previously considered unrealisable due to their exceedingly high computational cost can be performed in reasonable time. Additionally, the spectrum of uses for multicomputers can be broadened beyond purely numeric applications.
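As background for what is being simulated, the basic Petri net firing rule can be sketched with plain black tokens (ECATNets carry structured algebraic tokens and rewriting-logic semantics; the places, transitions, and markings below are invented for illustration).

```python
# A bare-bones Petri net firing rule: a transition is enabled when every
# input place holds enough tokens; firing consumes input tokens and
# produces output tokens. Markings are dicts from place name to count.
def enabled(marking, pre):
    """True if the transition's input demand `pre` is met by `marking`."""
    return all(marking.get(p, 0) >= n for p, n in pre.items())

def fire(marking, pre, post):
    """Return the new marking after firing (assumes the transition is enabled)."""
    m = dict(marking)
    for p, n in pre.items():
        m[p] -= n                      # consume input tokens
    for p, n in post.items():
        m[p] = m.get(p, 0) + n         # produce output tokens
    return m

m0 = {"p1": 2, "p2": 0}
t_pre, t_post = {"p1": 1}, {"p2": 1}   # transition moves a token p1 -> p2
m1 = fire(m0, t_pre, t_post)
```

A distributed simulator partitions the net among processors and must then synchronise these firings across partitions, which is exactly where the conservative, optimistic, and synchronous strategies studied in the thesis differ.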
Subjects/Keywords: 005; Parallel programming; Distributed computing
APA (6th Edition):
Djemame, K. (1999). Distributed simulation of high-level algebraic Petri nets. (Doctoral Dissertation). University of Glasgow. Retrieved from http://theses.gla.ac.uk/76248/ ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.301624
Chicago Manual of Style (16th Edition):
Djemame, Karim. “Distributed simulation of high-level algebraic Petri nets.” 1999. Doctoral Dissertation, University of Glasgow. Accessed January 17, 2021.
http://theses.gla.ac.uk/76248/ ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.301624.
MLA Handbook (7th Edition):
Djemame, Karim. “Distributed simulation of high-level algebraic Petri nets.” 1999. Web. 17 Jan 2021.
Vancouver:
Djemame K. Distributed simulation of high-level algebraic Petri nets. [Internet] [Doctoral dissertation]. University of Glasgow; 1999. [cited 2021 Jan 17].
Available from: http://theses.gla.ac.uk/76248/ ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.301624.
Council of Science Editors:
Djemame K. Distributed simulation of high-level algebraic Petri nets. [Doctoral Dissertation]. University of Glasgow; 1999. Available from: http://theses.gla.ac.uk/76248/ ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.301624

Heriot-Watt University
14.
Sziveri, Janos.
Parallel computational techniques for explicit finite element analysis.
Degree: PhD, 1997, Heriot-Watt University
URL: http://hdl.handle.net/10399/1254
Subjects/Keywords: 005; Distributed computing; Parallel processing
APA (6th Edition):
Sziveri, J. (1997). Parallel computational techniques for explicit finite element analysis. (Doctoral Dissertation). Heriot-Watt University. Retrieved from http://hdl.handle.net/10399/1254
Chicago Manual of Style (16th Edition):
Sziveri, Janos. “Parallel computational techniques for explicit finite element analysis.” 1997. Doctoral Dissertation, Heriot-Watt University. Accessed January 17, 2021.
http://hdl.handle.net/10399/1254.
MLA Handbook (7th Edition):
Sziveri, Janos. “Parallel computational techniques for explicit finite element analysis.” 1997. Web. 17 Jan 2021.
Vancouver:
Sziveri J. Parallel computational techniques for explicit finite element analysis. [Internet] [Doctoral dissertation]. Heriot-Watt University; 1997. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/10399/1254.
Council of Science Editors:
Sziveri J. Parallel computational techniques for explicit finite element analysis. [Doctoral Dissertation]. Heriot-Watt University; 1997. Available from: http://hdl.handle.net/10399/1254

Brunel University
15.
Suthakar, Uthayanath.
A scalable data store and analytic platform for real-time monitoring of data-intensive scientific infrastructure.
Degree: PhD, 2017, Brunel University
URL: http://bura.brunel.ac.uk/handle/2438/15788
;
https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.764857
► Monitoring data-intensive scientific infrastructures in real-time such as jobs, data transfers, and hardware failures is vital for efficient operation. Due to the high volume and…
(more)
▼ Monitoring data-intensive scientific infrastructure in real time, covering jobs, data transfers, and hardware failures, is vital for efficient operation. Due to the high volume and velocity of the events produced, traditional methods are no longer optimal. Several techniques, as well as enabling architectures, are available to address the Big Data challenge. In this respect, this thesis complements existing survey work by contributing an extensive literature review of both traditional and emerging Big Data architectures. Scalability, low latency, fault tolerance, and intelligence are key challenges for the traditional architecture. However, Big Data technologies and approaches have become increasingly popular for use cases that demand scalable, data-intensive (parallel) processing, fault tolerance via data replication, and support for low-latency computations. In the context of a scalable data store and analytics platform for monitoring data-intensive scientific infrastructure, the Lambda Architecture was adapted and evaluated on the Worldwide LHC Computing Grid, where it proved effective, especially for computationally and data-intensive use cases. In this thesis, an efficient strategy for the collection and storage of large volumes of data for computation is presented. Moving the transformation logic out of the data pipeline and into the analytics layers simplifies the architecture and the overall process: less time is spent, untampered raw data are kept at the storage level for fault tolerance, and the required transformation can be done on demand. An optimised Lambda Architecture (OLA) is presented, which models an efficient way of joining the batch layer and streaming layer with minimal code duplication in order to support scalability, low latency, and fault tolerance. A few models were evaluated: a pure streaming layer, a pure batch layer, and the combination of both.
Experimental results demonstrate that the OLA performed better than both the traditional architecture and the standard Lambda Architecture. The OLA was further enhanced by adding an intelligence layer that predicts data access patterns. This layer actively adapts and updates the model built by the batch layer, which eliminates retraining time while providing a high level of accuracy using Deep Learning techniques. The fundamental contribution to knowledge is a scalable, low-latency, fault-tolerant, intelligent, heterogeneous architecture for monitoring data-intensive scientific infrastructure that can benefit from Big Data technologies and approaches.
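The batch/speed/serving split that the Lambda Architecture prescribes can be sketched in a few lines. The event shapes and counts below are invented for illustration; they are not the thesis's WLCG monitoring data or its optimised OLA.

```python
# Toy Lambda Architecture: a slow, complete batch view over all historical
# events; a fast speed view over events since the last batch run; and a
# serving layer that merges both at query time for fresh totals.
from collections import Counter

def batch_view(master_events):
    """Batch layer: full recomputation over the immutable master dataset."""
    return Counter(e["job"] for e in master_events)

def speed_view(recent_events):
    """Speed layer: incremental view over events not yet absorbed by batch."""
    return Counter(e["job"] for e in recent_events)

def serve(query_job, batch, speed):
    """Serving layer: merge the two views to answer a query."""
    return batch[query_job] + speed[query_job]

historical = [{"job": "transfer"}] * 5 + [{"job": "analysis"}] * 2
recent = [{"job": "transfer"}] * 3

b = batch_view(historical)
s = speed_view(recent)
fresh_total = serve("transfer", b, s)   # 5 historical + 3 recent
```

The thesis's optimisation targets exactly the duplication visible even in this toy: `batch_view` and `speed_view` compute the same logic twice, once per layer.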
Subjects/Keywords: Big data; Data science; Distributed system; Lambda Architecture; Parallel computing
APA (6th Edition):
Suthakar, U. (2017). A scalable data store and analytic platform for real-time monitoring of data-intensive scientific infrastructure. (Doctoral Dissertation). Brunel University. Retrieved from http://bura.brunel.ac.uk/handle/2438/15788 ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.764857
Chicago Manual of Style (16th Edition):
Suthakar, Uthayanath. “A scalable data store and analytic platform for real-time monitoring of data-intensive scientific infrastructure.” 2017. Doctoral Dissertation, Brunel University. Accessed January 17, 2021.
http://bura.brunel.ac.uk/handle/2438/15788 ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.764857.
MLA Handbook (7th Edition):
Suthakar, Uthayanath. “A scalable data store and analytic platform for real-time monitoring of data-intensive scientific infrastructure.” 2017. Web. 17 Jan 2021.
Vancouver:
Suthakar U. A scalable data store and analytic platform for real-time monitoring of data-intensive scientific infrastructure. [Internet] [Doctoral dissertation]. Brunel University; 2017. [cited 2021 Jan 17].
Available from: http://bura.brunel.ac.uk/handle/2438/15788 ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.764857.
Council of Science Editors:
Suthakar U. A scalable data store and analytic platform for real-time monitoring of data-intensive scientific infrastructure. [Doctoral Dissertation]. Brunel University; 2017. Available from: http://bura.brunel.ac.uk/handle/2438/15788 ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.764857
16.
Andersson, Filip.
Scalable applications in a distributed environment.
Degree: 2011, , School of Computing
URL: http://urn.kb.se/resolve?urn=urn:nbn:se:bth-3917
► As the number of simultaneous users of distributed systems increases, scalability is becoming an important factor to consider during software development. Without sufficient scalability,…
(more)
▼ As the number of simultaneous users of distributed systems increases, scalability is becoming an important factor to consider during software development. Without sufficient scalability, systems may struggle to manage high loads and may not be able to support a large number of users. We have determined how scalability can best be implemented and what extra costs this entails. Our research is based both on a literature review, in which we examined what others in the field of computer engineering think about scalability, and on implementing a highly scalable system of our own. In the end we arrived at a number of general pointers that can help developers determine whether they should focus on scalable development, and what they should consider if they choose to do so.
Subjects/Keywords: parallel programming; mpi; distributed computing; Computer Sciences; Datavetenskap (datalogi)
APA (6th Edition):
Andersson, F. (2011). Scalable applications in a distributed environment. (Thesis). , School of Computing. Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:bth-3917
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Andersson, Filip. “Scalable applications in a distributed environment.” 2011. Thesis, , School of Computing. Accessed January 17, 2021.
http://urn.kb.se/resolve?urn=urn:nbn:se:bth-3917.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Andersson, Filip. “Scalable applications in a distributed environment.” 2011. Web. 17 Jan 2021.
Vancouver:
Andersson F. Scalable applications in a distributed environment. [Internet] [Thesis]. , School of Computing; 2011. [cited 2021 Jan 17].
Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:bth-3917.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Andersson F. Scalable applications in a distributed environment. [Thesis]. , School of Computing; 2011. Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:bth-3917
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

University of Washington
17.
Grabaskas, Nathaniel J.
Automated Parallelization to Improve Usability and Efficiency of Distributed Neural Network Training.
Degree: 2018, University of Washington
URL: http://hdl.handle.net/1773/41725
► The recent success of Deep Neural Networks (DNNs) [1] has triggered a race to build larger and larger DNNs [2]; however, a known limitation is…
(more)
▼ The recent success of Deep Neural Networks (DNNs) [1] has triggered a race to build larger and larger DNNs [2]; however, a known limitation is training speed [3]. To address this, distributed neural network training has become an increasingly active area of research [4], [5]. Usability, meaning the complexity a machine learning or data scientist faces when implementing distributed neural network training, is an aspect rarely considered, yet critical. There is strong evidence that growing complexity has a direct impact on the development effort, maintainability, and fault proneness of software [6]–[8]. We investigated whether automation can greatly reduce the implementation complexity of distributing neural network training across multiple devices without loss of computational efficiency compared to manual parallelization. Experiments were conducted using Convolutional Neural Network (CNN) and Multi-Layer Perceptron (MLP) networks performing image classification on the CIFAR-10 and MNIST datasets. The hardware consisted of an embedded, four-node NVIDIA Jetson TX1 cluster. Torch Automatic Distributed Neural Network (TorchAD-NN) reduces the implementation complexity of data-parallel neural network training by more than 90%, while providing components, with near-zero implementation complexity, that easily parallelize all or only selected fully-connected neural layers.
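The synchronous data-parallel pattern that a tool like TorchAD-NN automates can be sketched with a linear model: each simulated worker computes a gradient on its data shard, the gradients are averaged (the all-reduce step), and every replica applies the same update. Everything below is an illustrative assumption, not the TorchAD-NN API.

```python
# Minimal synchronous data-parallel training on a least-squares model.
# With equal shard sizes, the average of per-shard gradients equals the
# full-batch gradient, so all replicas stay in lockstep.
import numpy as np

def local_gradient(w, X, y):
    """Squared-loss gradient on one worker's shard."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(2)
X = rng.standard_normal((400, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

shards = np.array_split(np.arange(400), 4)   # 4 simulated workers
w = np.zeros(3)
for _ in range(300):
    grads = [local_gradient(w, X[idx], y[idx]) for idx in shards]
    w -= 0.05 * np.mean(grads, axis=0)       # "all-reduce" average, then step
```

A real distributed setup replaces the list comprehension with per-device computation and the `np.mean` with a collective all-reduce over the cluster interconnect; the automation challenge the thesis studies is hiding exactly that plumbing.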
Advisors/Committee Members: Fukuda, Munehiro (advisor).
Subjects/Keywords: Automated; Distributed; Neural Networks; Parallel; Computer science; Computing and software systems
APA (6th Edition):
Grabaskas, N. J. (2018). Automated Parallelization to Improve Usability and Efficiency of Distributed Neural Network Training. (Thesis). University of Washington. Retrieved from http://hdl.handle.net/1773/41725
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Grabaskas, Nathaniel J. “Automated Parallelization to Improve Usability and Efficiency of Distributed Neural Network Training.” 2018. Thesis, University of Washington. Accessed January 17, 2021.
http://hdl.handle.net/1773/41725.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Grabaskas, Nathaniel J. “Automated Parallelization to Improve Usability and Efficiency of Distributed Neural Network Training.” 2018. Web. 17 Jan 2021.
Vancouver:
Grabaskas NJ. Automated Parallelization to Improve Usability and Efficiency of Distributed Neural Network Training. [Internet] [Thesis]. University of Washington; 2018. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/1773/41725.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Grabaskas NJ. Automated Parallelization to Improve Usability and Efficiency of Distributed Neural Network Training. [Thesis]. University of Washington; 2018. Available from: http://hdl.handle.net/1773/41725
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Rice University
18.
Chatterjee, Ronnie.
Enabling Distributed Reconfiguration In An Actor Model.
Degree: MS, Engineering, 2017, Rice University
URL: http://hdl.handle.net/1911/105459
► The demand for portable mainstream programming models supporting scalable, reactive and versatile distributed computing is growing dramatically with the proliferation of manycore/heterogeneous processors on…
(more)
▼ The demand for portable mainstream programming models supporting scalable, reactive, and versatile distributed computing is growing dramatically with the proliferation of manycore/heterogeneous processors on portable devices and cloud computing clusters that can be elastically and dynamically allocated. With such changes, distributed software systems and applications are increasingly shifting towards service-oriented architectures (SOA) that consist of dynamically replaceable components connected via loosely coupled, interactive networks that can support more complex coordination and synchronization patterns. In this dissertation, we address the dynamic reconfiguration challenges that arise in distributed implementations of the Selector Model. We focus on the Selector Model (a generalization of the actor model) because of its support for multiple guarded mailboxes, which enables the programmer to easily specify coordination patterns more general than those supported by the actor model. The contributions of this dissertation are demonstrated in two implementations of distributed selectors, one for distributed servers and another for distributed Android devices. Both implementations run on distributed JVMs and feature the automated bootstrap and global termination capabilities introduced in this dissertation. In addition, the distributed Android implementation supports dynamic joining and leaving of devices, which is also part of the dynamic reconfiguration capabilities introduced in this dissertation.
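The abstract's key mechanism, guarded mailboxes, fits in a short sketch. The following is a hypothetical, single-threaded Python illustration (not the Habanero Java API the thesis builds on): a selector buffers messages arriving at disabled mailboxes and processes only from enabled ones, so startup ordering can be expressed without explicit locks.

```python
from collections import deque

class Selector:
    """Minimal sketch of a selector: an actor with multiple guarded
    mailboxes. Messages in disabled mailboxes are buffered until the
    guard is enabled; the plain actor model is the special case of a
    single always-enabled mailbox."""

    def __init__(self, mailbox_names):
        self.mailboxes = {name: deque() for name in mailbox_names}
        self.enabled = {name: True for name in mailbox_names}
        self.processed = []  # record of (mailbox, message) handled

    def send(self, mailbox, msg):
        self.mailboxes[mailbox].append(msg)

    def set_guard(self, mailbox, enabled):
        self.enabled[mailbox] = enabled

    def step(self):
        """Process one message from any enabled, non-empty mailbox;
        return False if nothing is currently eligible."""
        for name, box in self.mailboxes.items():
            if self.enabled[name] and box:
                self.processed.append((name, box.popleft()))
                return True
        return False

# A 'request' mailbox is gated until initialization completes:
s = Selector(["init", "request"])
s.set_guard("request", False)   # buffer requests during startup
s.send("request", "r1")
s.send("init", "boot")
s.step()                        # only the init message is eligible
s.set_guard("request", True)    # startup done: release buffered work
s.step()
```

Here the "bootstrap before serving requests" coordination pattern needs no condition variables; it is expressed purely by toggling a guard.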
Advisors/Committee Members: Sarkar, Vivek (advisor).
Subjects/Keywords: Parallel Computing; Distributed Systems; Habanero Java Actor Model
APA (6th Edition):
Chatterjee, R. (2017). Enabling Distributed Reconfiguration In An Actor Model. (Masters Thesis). Rice University. Retrieved from http://hdl.handle.net/1911/105459
Chicago Manual of Style (16th Edition):
Chatterjee, Ronnie. “Enabling Distributed Reconfiguration In An Actor Model.” 2017. Masters Thesis, Rice University. Accessed January 17, 2021.
http://hdl.handle.net/1911/105459.
MLA Handbook (7th Edition):
Chatterjee, Ronnie. “Enabling Distributed Reconfiguration In An Actor Model.” 2017. Web. 17 Jan 2021.
Vancouver:
Chatterjee R. Enabling Distributed Reconfiguration In An Actor Model. [Internet] [Masters thesis]. Rice University; 2017. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/1911/105459.
Council of Science Editors:
Chatterjee R. Enabling Distributed Reconfiguration In An Actor Model. [Masters Thesis]. Rice University; 2017. Available from: http://hdl.handle.net/1911/105459

Georgia Tech
19.
Zhou, Yang.
Innovative mining, processing, and application of big graphs.
Degree: PhD, Computer Science, 2017, Georgia Tech
URL: http://hdl.handle.net/1853/59173
▼ With continued advances in science and technology, big graph (or network) data, such as the World Wide Web, social networks, academic collaboration networks, transportation networks, telecommunication networks, biological networks, and electrical networks, have grown at an astonishing rate in terms of volume, variety, and velocity. Analyzing such big graph data has huge potential to reveal hidden insights and promote innovation in business, science, and engineering domains. However, there exist a number of challenging bottlenecks in developing advanced graph analytics tools in the Big Data era. This dissertation research focuses on bridging graph mining and graph processing techniques to alleviate such bottlenecks in terms of both effectiveness and efficiency. This dissertation has made original contributions to exploring, understanding, and learning big graph data in graph mining, processing, and application. First, we have developed a suite of novel graph mining algorithms to analyze real-world heterogeneous information networks. Our algorithmic approaches enable new ways to dive into the correlation structure of big graphs and derive new insights about how heterogeneous entities interact with one another and influence the effectiveness and efficiency of graph clustering, graph classification, and graph ranking. Second, we have developed a scalable graph parallel processing framework by exploring parallel processing optimizations at both the access tier and the computation tier. We have designed a suite of hierarchically composable graph parallel abstractions to enable large-scale graphs to be processed efficiently for iterative graph computation applications. Our approach enables hardware-resource-aware graph partitioning such that parallel graph processing workloads can be well balanced in the presence of highly irregular graph structures and the mismatch between graph access and computation workloads. Third, but not least, we have developed innovative domain-specific graph analytics frameworks to understand the hidden patterns in enterprise storage systems and to derive interesting correlations among various enterprise web services. These novel graph algorithms and frameworks provide broader and deeper insights for a better understanding of tradeoffs in enterprise system design and implementation.
Advisors/Committee Members: Liu, Ling (advisor), Lofstead, Jay (committee member), Navathe, Shamkant (committee member), Pu, Calton (committee member), Ramaswamy, Lakshmish (committee member).
Subjects/Keywords: Big data; Data mining; Parallel and distributed computing; Machine learning; Databases
APA (6th Edition):
Zhou, Y. (2017). Innovative mining, processing, and application of big graphs. (Doctoral Dissertation). Georgia Tech. Retrieved from http://hdl.handle.net/1853/59173
Chicago Manual of Style (16th Edition):
Zhou, Yang. “Innovative mining, processing, and application of big graphs.” 2017. Doctoral Dissertation, Georgia Tech. Accessed January 17, 2021.
http://hdl.handle.net/1853/59173.
MLA Handbook (7th Edition):
Zhou, Yang. “Innovative mining, processing, and application of big graphs.” 2017. Web. 17 Jan 2021.
Vancouver:
Zhou Y. Innovative mining, processing, and application of big graphs. [Internet] [Doctoral dissertation]. Georgia Tech; 2017. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/1853/59173.
Council of Science Editors:
Zhou Y. Innovative mining, processing, and application of big graphs. [Doctoral Dissertation]. Georgia Tech; 2017. Available from: http://hdl.handle.net/1853/59173

University of Tennessee – Knoxville
20.
Auel, Eric.
genben: A Framework for Benchmarking Genomic Data Analysis Methods on Scalable Systems.
Degree: MS, Computer Engineering, 2019, University of Tennessee – Knoxville
URL: https://trace.tennessee.edu/utk_gradthes/5569
▼ With an ever-increasing number of human DNA sequencing efforts being conducted, the amount of genetic variation data available for research has grown substantially over the past few decades. This data provides scientists with the ability to study various traits of humans and other species. Several data analysis methods can be applied to this genetic variation data, such as allele counting and principal component analysis (PCA). Software libraries like scikit-allel can be used to easily explore these data sets, as they contain many functions that can be used directly on genetic variation data. However, trade-offs often exist when working with unique data sets and when performing analysis in various hardware environments. Additionally, many parameters can be tweaked when storing this genetic variation data, such as compression ratios, compression algorithms, and block sizes. Having the ability to quantify the performance impact of tweaking these parameters can be extremely useful for software developers, data scientists, and researchers. Algorithms that operate on this data could also be improved in the future, so being able to compare system resource usage before and after such modifications could be extremely insightful in terms of quantifying the overall improvements of new algorithms. This thesis presents genben, a flexible framework that can be used to benchmark various functions involved in analyzing genetic variation data, and it additionally provides several benchmark experiments that demonstrate the ability to test different algorithm implementations, configuration parameters, and hardware configurations utilizing high-performance computing systems.
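As an example of the kind of operation such a benchmark would time, allele counting over a genotype array (the reduction scikit-allel exposes as `GenotypeArray.count_alleles()`) can be sketched in plain Python. The toy data below is invented for illustration.

```python
# Genotype calls for 3 variants x 2 samples, diploid: each entry is an
# allele index (0 = reference, 1 = alternate, -1 = missing call).
genotypes = [
    [[0, 0], [0, 1]],
    [[0, 1], [1, 1]],
    [[0, 0], [-1, -1]],
]

def count_alleles(gt, n_alleles=2):
    """Per-variant allele counts, ignoring missing (-1) calls -- a plain
    Python sketch of the reduction scikit-allel performs; the real
    library vectorizes this over compressed, chunked arrays, which is
    exactly where storage parameters affect performance."""
    counts = []
    for variant in gt:
        row = [0] * n_alleles
        for sample in variant:
            for allele in sample:
                if allele >= 0:
                    row[allele] += 1
        counts.append(row)
    return counts

ac = count_alleles(genotypes)
# Each row is [reference count, alternate count] for one variant.
```

Benchmarking this same reduction under different compression codecs and block sizes is the sort of experiment the framework is designed to automate.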
Advisors/Committee Members: Gregory Peterson, Edmon Begoli, Charles Cao.
Subjects/Keywords: genben; benchmark; distributed systems; framework; PCA; parallel computing
APA (6th Edition):
Auel, E. (2019). genben: A Framework for Benchmarking Genomic Data Analysis Methods on Scalable Systems. (Thesis). University of Tennessee – Knoxville. Retrieved from https://trace.tennessee.edu/utk_gradthes/5569
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Auel, Eric. “genben: A Framework for Benchmarking Genomic Data Analysis Methods on Scalable Systems.” 2019. Thesis, University of Tennessee – Knoxville. Accessed January 17, 2021.
https://trace.tennessee.edu/utk_gradthes/5569.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Auel, Eric. “genben: A Framework for Benchmarking Genomic Data Analysis Methods on Scalable Systems.” 2019. Web. 17 Jan 2021.
Vancouver:
Auel E. genben: A Framework for Benchmarking Genomic Data Analysis Methods on Scalable Systems. [Internet] [Thesis]. University of Tennessee – Knoxville; 2019. [cited 2021 Jan 17].
Available from: https://trace.tennessee.edu/utk_gradthes/5569.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Auel E. genben: A Framework for Benchmarking Genomic Data Analysis Methods on Scalable Systems. [Thesis]. University of Tennessee – Knoxville; 2019. Available from: https://trace.tennessee.edu/utk_gradthes/5569
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

University of California – Berkeley
21.
Zaharia, Matei Alexandru.
An Architecture for Fast and General Data Processing on Large Clusters.
Degree: Computer Science, 2013, University of California – Berkeley
URL: http://www.escholarship.org/uc/item/19k949h3
▼ The past few years have seen a major change in computing systems, as growing data volumes and stalling processor speeds require more and more applications to scale out to distributed systems. Today, myriad data sources, from the Internet to business operations to scientific instruments, produce large and valuable data streams. However, the processing capabilities of single machines have not kept up with the size of the data, making it harder and harder to put to use. As a result, a growing number of organizations, not just web companies but traditional enterprises and research labs, need to scale out their most important computations to clusters of hundreds of machines. At the same time, the speed and sophistication required of data processing have grown. In addition to simple queries, complex algorithms like machine learning and graph analysis are becoming common in many domains. And in addition to batch processing, streaming analysis of new real-time data sources is required to let organizations take timely action. Future computing platforms will need to not only scale out traditional workloads, but support these new applications as well. This dissertation proposes an architecture for cluster computing systems that can tackle emerging data processing workloads while coping with larger and larger scales. Whereas early cluster computing systems, like MapReduce, handled batch processing, our architecture also enables streaming and interactive queries, while keeping the scalability and fault tolerance of previous systems. And whereas most deployed systems only support simple one-pass computations (e.g., aggregation or SQL queries), ours also extends to the multi-pass algorithms required for more complex analytics (e.g., iterative algorithms for machine learning). Finally, unlike the specialized systems proposed for some of these workloads, our architecture allows these computations to be combined, enabling rich new applications that intermix, for example, streaming and batch processing, or SQL and complex analytics. We achieve these results through a simple extension to MapReduce that adds primitives for data sharing, called Resilient Distributed Datasets (RDDs). We show that this is enough to efficiently capture a wide range of workloads. We implement RDDs in the open source Spark system, which we evaluate using both synthetic benchmarks and real user applications. Spark matches or exceeds the performance of specialized systems in many application domains, while offering stronger fault tolerance guarantees and allowing these workloads to be combined. We explore the generality of RDDs from both a theoretical modeling perspective and a practical perspective to see why this extension can capture a wide range of previously disparate workloads.
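The central RDD idea, recovering lost data by replaying a recorded lineage of transformations instead of replicating the data, fits in a short sketch. This is a toy single-machine illustration, not the Spark API:

```python
class MiniRDD:
    """Toy illustration of a Resilient Distributed Dataset: the dataset
    stores its lineage (parent + transformation) rather than eagerly
    materialized results, so any lost partition can be recomputed from
    the base data instead of being replicated."""

    def __init__(self, source=None, parent=None, transform=None):
        self.source = source        # base data, for leaf datasets
        self.parent = parent        # lineage: upstream dataset
        self.transform = transform  # lineage: function applied to parent

    def map(self, fn):
        return MiniRDD(parent=self,
                       transform=lambda xs: [fn(x) for x in xs])

    def filter(self, pred):
        return MiniRDD(parent=self,
                       transform=lambda xs: [x for x in xs if pred(x)])

    def collect(self):
        """Materialize by replaying the lineage from the base data."""
        if self.parent is None:
            return list(self.source)
        return self.transform(self.parent.collect())

base = MiniRDD(source=range(10))
result = base.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
# collect() recomputes from lineage every time -- which is why losing a
# cached copy of `result` is cheap to recover from.
even_squares = result.collect()
```

The real system adds partitioning, caching, and cluster scheduling on top of exactly this lazy, lineage-driven evaluation.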
Subjects/Keywords: Computer science; cluster computing; distributed systems; fault tolerance; parallel computing; programming models
APA (6th Edition):
Zaharia, M. A. (2013). An Architecture for Fast and General Data Processing on Large Clusters. (Thesis). University of California – Berkeley. Retrieved from http://www.escholarship.org/uc/item/19k949h3
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Zaharia, Matei Alexandru. “An Architecture for Fast and General Data Processing on Large Clusters.” 2013. Thesis, University of California – Berkeley. Accessed January 17, 2021.
http://www.escholarship.org/uc/item/19k949h3.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Zaharia, Matei Alexandru. “An Architecture for Fast and General Data Processing on Large Clusters.” 2013. Web. 17 Jan 2021.
Vancouver:
Zaharia MA. An Architecture for Fast and General Data Processing on Large Clusters. [Internet] [Thesis]. University of California – Berkeley; 2013. [cited 2021 Jan 17].
Available from: http://www.escholarship.org/uc/item/19k949h3.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Zaharia MA. An Architecture for Fast and General Data Processing on Large Clusters. [Thesis]. University of California – Berkeley; 2013. Available from: http://www.escholarship.org/uc/item/19k949h3
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
22.
Vora, Keval.
Exploiting Asynchrony for Performance and Fault Tolerance in Distributed Graph Processing.
Degree: Computer Science, 2017, University of California – Riverside
URL: http://www.escholarship.org/uc/item/6x33d81c
▼ While various iterative graph algorithms can be expressed via asynchronous parallelism, the lack of a proper understanding of it limits the performance benefits that can be achieved via informed relaxations. In this thesis, we capture the algorithmic intricacies and execution semantics that enable us to improve asynchronous processing and allow us to reason about the semantics of asynchronous execution while leveraging its benefits. To this end, we specify the asynchronous processing model in a distributed setting by identifying key properties of read-write dependences and ordering of reads that expose the set of legal executions of an asynchronous program. We then develop techniques to exploit the availability of multiple legal executions by choosing faster executions that reduce communication and computation while processing static and dynamic graphs. For static graphs, we first develop a relaxed consistency protocol that allows the use of stale values during processing in order to eliminate long-latency communication operations by up to 58%, accelerating overall processing by a factor of 2. Then, to efficiently handle machine failures, we present a lightweight confined recovery strategy that quickly constructs an alternate execution state that may be different from any previously encountered program state, but is nevertheless a legal state that guarantees correct asynchronous semantics upon resumption of execution. Our confined recovery strategy enables processing to finish 1.5-3.2x faster than the traditional recovery mechanism when failures impact 1-6 machines of a 16-machine cluster. We further design techniques based on computation reordering and incremental computation to amortize the computation and communication costs incurred in processing evolving graphs, accelerating their processing by up to 10x. To process streaming graphs, we develop a dynamic-dependence-based incremental processing technique that identifies the minimal set of computations required to calculate the change in results that reflects the mutation in graph structure. We show that this technique not only produces correct results, but also improves processing by 8.5-23.7x. Finally, we demonstrate the efficacy of asynchrony beyond the distributed setting by leveraging it to design dynamic partitions that eliminate 25-76% of the wasteful disk I/O involved in out-of-core graph processing.
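The "multiple legal executions" idea the thesis exploits can be illustrated with in-place PageRank: each vertex update reads whatever neighbor values are currently available rather than a synchronized snapshot of the previous iteration, and any such interleaving converges to the same fixed point. A minimal single-machine sketch in Python (the function name and graph are invented for illustration, not taken from the thesis):

```python
def async_pagerank(edges, n, d=0.85, sweeps=100):
    """In-place ('asynchronous') PageRank: updates immediately read the
    freshest values written so far. Reordering these per-vertex updates
    yields different but equally legal executions, all converging to
    the same fixed point -- the property that lets a distributed runtime
    tolerate stale reads instead of synchronizing every step."""
    out_deg = [0] * n
    in_nbrs = [[] for _ in range(n)]
    for u, v in edges:
        out_deg[u] += 1
        in_nbrs[v].append(u)
    rank = [1.0 / n] * n
    for _ in range(sweeps):
        for v in range(n):  # each read sees a mix of old and new values
            rank[v] = (1 - d) / n + d * sum(
                rank[u] / out_deg[u] for u in in_nbrs[v])
    return rank

# Small 3-vertex example: 0->1, 1->2, 2->0, 0->2
ranks = async_pagerank([(0, 1), (1, 2), (2, 0), (0, 2)], 3)
```

Because the fixed point is unique, the result matches synchronous iteration; what changes with asynchrony is only how fast (and with how little communication) it is reached.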
Subjects/Keywords: Computer science; Asynchronous Processing; Big Data; Distributed Computing; Graph Processing; Parallel Computing
APA (6th Edition):
Vora, K. (2017). Exploiting Asynchrony for Performance and Fault Tolerance in Distributed Graph Processing. (Thesis). University of California – Riverside. Retrieved from http://www.escholarship.org/uc/item/6x33d81c
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Vora, Keval. “Exploiting Asynchrony for Performance and Fault Tolerance in Distributed Graph Processing.” 2017. Thesis, University of California – Riverside. Accessed January 17, 2021.
http://www.escholarship.org/uc/item/6x33d81c.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Vora, Keval. “Exploiting Asynchrony for Performance and Fault Tolerance in Distributed Graph Processing.” 2017. Web. 17 Jan 2021.
Vancouver:
Vora K. Exploiting Asynchrony for Performance and Fault Tolerance in Distributed Graph Processing. [Internet] [Thesis]. University of California – Riverside; 2017. [cited 2021 Jan 17].
Available from: http://www.escholarship.org/uc/item/6x33d81c.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Vora K. Exploiting Asynchrony for Performance and Fault Tolerance in Distributed Graph Processing. [Thesis]. University of California – Riverside; 2017. Available from: http://www.escholarship.org/uc/item/6x33d81c
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

University of Houston
23.
Rodgers, John Scott.
MPI Based Python Libraries for Data Science Applications.
Degree: MS, Computer Science, 2020, University of Houston
URL: http://hdl.handle.net/10657/6601
▼ Tools commonly leveraged to tackle large-scale data science workflows have traditionally shied away from existing high performance computing paradigms, largely due to their lack of fault tolerance and computation resiliency. However, these concerns are typically of critical importance only for problems tackled by technology companies at the largest scale. For the average data scientist, the benefits of resiliency may not be as important as overall execution performance. To this end, the work of this thesis aims to develop prototypes of tools favored by the data science community that function in a data-parallel environment, taking advantage of functionality commonly used in high performance computing. To achieve this goal, a prototype distributed clone of the Python NumPy library and a select module from the SciPy library were developed, which leverage MPI for inter-process communication and data transfers while abstracting away the complexity of MPI programming from their users. Through various benchmarks, the overhead introduced by the logic necessary for functioning in a data-parallel environment, as well as the scalability of using parallel compute resources for routines commonly used by the emulated libraries, are analyzed. For the distributed NumPy clone, it was found that for routines that could act solely on their local array contents, the impact of the introduced overhead was minimal, while for routines that required global scope over distributed elements, considerable overhead was introduced. In terms of scalability, both the distributed NumPy clone and the select SciPy module, a distributed implementation of K-Means clustering, exhibited reasonably performant results, notably showing sensitivity to local process problem sizes and to operations requiring large amounts of collective communication/synchronization. As this work mainly focused on initial exploration and prototyping of behavior, the results of the benchmarks can be used in future development efforts to target operations for refinement and optimization.
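As an illustration of the ownership logic such a distributed array library must implement before any MPI traffic happens, here is a hypothetical block-distribution helper (`local_slice` is an invented name, not the thesis's actual API): each rank computes which contiguous slice of the global array it owns.

```python
def local_slice(n, size, rank):
    """Block distribution of an n-element axis over `size` ranks: the
    first (n % size) ranks get one extra element, so every element is
    owned by exactly one rank and per-rank loads differ by at most one.
    Routines acting only on local contents need nothing more; routines
    with global scope must communicate across these boundaries."""
    base, extra = divmod(n, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

# 10 elements over 4 ranks -> local sizes 3, 3, 2, 2 with full coverage:
bounds = [local_slice(10, 4, r) for r in range(4)]
```

The distinction the benchmarks draw, cheap local routines versus costly global ones, falls directly out of whether an operation can be answered from one such slice.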
Advisors/Committee Members: Gabriel, Edgar (advisor), Shah, Shishir Kirit (committee member), Huarte-Espinosa, Martin (committee member).
Subjects/Keywords: Parallel Python; Message Passing Interface; High Performance Computing; Computer Science; Distributed Computing
APA (6th Edition):
Rodgers, J. S. (2020). MPI Based Python Libraries for Data Science Applications. (Masters Thesis). University of Houston. Retrieved from http://hdl.handle.net/10657/6601
Chicago Manual of Style (16th Edition):
Rodgers, John Scott. “MPI Based Python Libraries for Data Science Applications.” 2020. Masters Thesis, University of Houston. Accessed January 17, 2021.
http://hdl.handle.net/10657/6601.
MLA Handbook (7th Edition):
Rodgers, John Scott. “MPI Based Python Libraries for Data Science Applications.” 2020. Web. 17 Jan 2021.
Vancouver:
Rodgers JS. MPI Based Python Libraries for Data Science Applications. [Internet] [Masters thesis]. University of Houston; 2020. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/10657/6601.
Council of Science Editors:
Rodgers JS. MPI Based Python Libraries for Data Science Applications. [Masters Thesis]. University of Houston; 2020. Available from: http://hdl.handle.net/10657/6601

University of South Florida
24.
Albassam, Bader.
Enforcing Security Policies On GPU Computing Through The Use Of Aspect-Oriented Programming Techniques.
Degree: 2016, University of South Florida
URL: https://scholarcommons.usf.edu/etd/6165
▼ This thesis presents a new security policy enforcer designed for securing parallel computation on CUDA GPUs. We show how the very features that make a GPGPU desirable have already been utilized in existing exploits, reinforcing the need for security protections on a GPGPU. An aspect weaver was designed for CUDA with the goal of utilizing aspect-oriented programming for security policy enforcement. Empirical testing verified the ability of our aspect weaver to enforce various policies. Furthermore, a performance analysis demonstrated that using this policy enforcer introduces no significant performance impact relative to manual insertion of policy code. Finally, future research goals are presented through a plan of work. We hope that this thesis will provide long-term research goals to guide the field of GPU security.
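The aspect-weaving idea, injecting a policy check around a call site without editing its body, can be approximated in Python with a decorator. This is only an analogy to the thesis's CUDA aspect weaver; the policy and function below are invented for illustration.

```python
import functools

def enforce_policy(check, on_violation):
    """A decorator standing in for an aspect: the policy check is woven
    around the target call rather than written into its body, so the
    same policy can be applied uniformly without touching each call
    site by hand (the manual alternative the thesis compares against)."""
    def aspect(fn):
        @functools.wraps(fn)
        def wrapped(*args, **kwargs):
            if not check(*args, **kwargs):
                return on_violation(*args, **kwargs)
            return fn(*args, **kwargs)
        return wrapped
    return aspect

@enforce_policy(check=lambda n: 0 < n <= 1024,
                on_violation=lambda n: "rejected: block size out of range")
def launch_kernel(threads_per_block):
    # stand-in for a real kernel launch
    return f"launched with {threads_per_block} threads"

launch_kernel(256)    # passes the woven policy check
launch_kernel(4096)   # intercepted before the launch runs
```

An actual weaver performs this wrapping at the source or IR level for CUDA kernels, but the enforcement structure is the same.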
Subjects/Keywords: Programming Languages; Enforceability Theory; Distributed Computing; Parallel Computing; CUDA; Computer Engineering; Computer Sciences
APA (6th Edition):
Albassam, B. (2016). Enforcing Security Policies On GPU Computing Through The Use Of Aspect-Oriented Programming Techniques. (Thesis). University of South Florida. Retrieved from https://scholarcommons.usf.edu/etd/6165
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Albassam, Bader. “Enforcing Security Policies On GPU Computing Through The Use Of Aspect-Oriented Programming Techniques.” 2016. Thesis, University of South Florida. Accessed January 17, 2021.
https://scholarcommons.usf.edu/etd/6165.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Albassam, Bader. “Enforcing Security Policies On GPU Computing Through The Use Of Aspect-Oriented Programming Techniques.” 2016. Web. 17 Jan 2021.
Vancouver:
Albassam B. Enforcing Security Policies On GPU Computing Through The Use Of Aspect-Oriented Programming Techniques. [Internet] [Thesis]. University of South Florida; 2016. [cited 2021 Jan 17].
Available from: https://scholarcommons.usf.edu/etd/6165.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Albassam B. Enforcing Security Policies On GPU Computing Through The Use Of Aspect-Oriented Programming Techniques. [Thesis]. University of South Florida; 2016. Available from: https://scholarcommons.usf.edu/etd/6165
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
25.
Dhanapal, Manoj.
Design and implementation of distributed Galois.
Degree: MSin Computer Sciences, Computer Science, 2013, University of Texas – Austin
URL: http://hdl.handle.net/2152/21643
▼ The Galois system provides a solution to the hard problem of parallelizing irregular algorithms using amorphous data-parallelism. The present system works on the shared-memory programming model, which limits the memory and processing power available to the application. A scalable distributed parallelization tool would give the application access to a very large amount of memory and processing power by interconnecting computers through a network. This thesis presents the design of a distributed-execution programming model for the Galois system. This distributed Galois system is capable of executing irregular graph-based algorithms in a distributed environment. The API and programming model of the new distributed system were designed to mirror those of the existing shared-memory Galois, enabling existing shared-memory applications to run on distributed Galois with minimal porting effort. Finally, two existing test cases were implemented on distributed Galois and shown to scale with an increasing number of hosts and threads.
Advisors/Committee Members: Pingali, Keshav (advisor).
Subjects/Keywords: Galois; Parallel computing; Distributed computing
APA (6th Edition):
Dhanapal, M. (2013). Design and implementation of distributed Galois. (Masters Thesis). University of Texas – Austin. Retrieved from http://hdl.handle.net/2152/21643
Chicago Manual of Style (16th Edition):
Dhanapal, Manoj. “Design and implementation of distributed Galois.” 2013. Masters Thesis, University of Texas – Austin. Accessed January 17, 2021.
http://hdl.handle.net/2152/21643.
MLA Handbook (7th Edition):
Dhanapal, Manoj. “Design and implementation of distributed Galois.” 2013. Web. 17 Jan 2021.
Vancouver:
Dhanapal M. Design and implementation of distributed Galois. [Internet] [Masters thesis]. University of Texas – Austin; 2013. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/2152/21643.
Council of Science Editors:
Dhanapal M. Design and implementation of distributed Galois. [Masters Thesis]. University of Texas – Austin; 2013. Available from: http://hdl.handle.net/2152/21643

University of Dayton
26.
Chen, Chong.
Acceleration of Computer Based Simulation, Image Processing, and Data Analysis Using Computer Clusters with Heterogeneous Accelerators.
Degree: PhD, Electrical Engineering, 2016, University of Dayton
URL: http://rave.ohiolink.edu/etdc/view?acc_num=dayton148036732102682
▼ With the limits to frequency scaling in microprocessors due to power constraints, many-core and multi-core architectures have become the norm over the past decade. The goal of this work is the acceleration of key computer simulation tools, data processing, and data analysis algorithms on multi-core and many-core computer clusters, and the analysis of their accelerated performance. The main contributions of this dissertation are:
1. Acceleration of vector bilateral filtering for hyperspectral imaging with GPGPU: a GPGPU-based acceleration for vector bilateral filtering, called vBF_GPU, was implemented in this dissertation. vBF_GPU uses multiple threads to process each pixel of a hyperspectral image to improve the efficiency of the cache memory. The memory access operations of vBF_GPU were fully optimized to reduce the data transfer cost of the GPGPU program. Experimental results indicate that vBF_GPU can provide up to 19× speedup compared with a multi-core CPU implementation and up to 3× speedup compared with a naive GPGPU implementation of vector bilateral filtering. vBF_GPU can process hyperspectral images with up to 266 spectral bands, and the window size of the bilateral filter is unlimited.
2. Optimized acceleration of the Alternating Least Squares algorithm using a GPGPU cluster: this study presented an optimized implementation of Alternating Least Squares (ALS) to realize a large-scale matrix-factorization-based recommendation system. A GPGPU-optimized implementation was developed to conduct the batch solver in the ALS algorithm, and an equivalent mathematical form of the equations was used to reduce its computational complexity. A distributed version of this implementation was also developed and tested on a cluster of GPGPUs. The experimental results indicate that the application running on a GPGPU can achieve up to 3.8× speedup compared with an 8-core CPU, and the distributed implementation showed excellent scalability on a computer cluster with multiple GPGPU accelerators.
3. Accelerating a preconditioned iterative solver for a very large sparse linear system on clusters with heterogeneous accelerators: this study presents a parallelized preconditioned conjugate gradient solver for large sparse linear systems on clusters with heterogeneous accelerators. The primary accelerator examined is the Intel Xeon Phi with its Many Integrated Core (MIC) architecture. We also realized a highly optimized parallel solver on clusters with NVIDIA GPGPU accelerators and on clusters of Intel Xeon CPUs with the Sandy Bridge architecture. Several approaches are applied to reduce the communication cost between compute nodes in the cluster, and a lightweight load balancer was developed for the Xeon Phi based solver. Our results show that, with the balancer applied, the Xeon Phi based iterative solver is faster than the GPGPU-based solver for one to two compute nodes, particularly when the number of non-zero elements is unevenly distributed. For a larger…
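The preconditioned conjugate gradient (PCG) method at the heart of the third contribution can be sketched serially. The following is a minimal Python/NumPy illustration with a Jacobi (diagonal) preconditioner, not the dissertation's cluster implementation; the function name, test matrix, and parameters are illustrative:

```python
import numpy as np

def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient with a Jacobi (diagonal) preconditioner.

    Solves A x = b for a symmetric positive-definite A. The per-iteration
    matrix-vector product A @ p is the communication-heavy step that
    cluster implementations distribute across compute nodes.
    """
    M_inv = 1.0 / A.diagonal()          # Jacobi preconditioner: inverse diagonal
    x = np.zeros_like(b)
    r = b - A @ x                       # initial residual
    z = M_inv * r                       # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p       # update search direction
        rz = rz_new
    return x

# Example: a small SPD tridiagonal system (1-D Laplacian)
n = 50
A = (np.diag(np.full(n, 2.0))
     + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1))
b = np.ones(n)
x = pcg_jacobi(A, b)
```

In a distributed setting, each node would hold a block of rows of the sparse matrix, and the residual norm and inner products would require a global reduction per iteration, which is where the communication-reduction techniques the abstract mentions come in.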
Advisors/Committee Members: Taha, Tarek (Committee Chair).
Subjects/Keywords: Computer Engineering; parallel computing; distributed computing; GPGPU; Xeon Phi; Preconditioned Iterative Solver; ALS; bilateral filtering
APA (6th Edition):
Chen, C. (2016). Acceleration of Computer Based Simulation, Image Processing,
and Data Analysis Using Computer Clusters with Heterogeneous
Accelerators. (Doctoral Dissertation). University of Dayton. Retrieved from http://rave.ohiolink.edu/etdc/view?acc_num=dayton148036732102682
Chicago Manual of Style (16th Edition):
Chen, Chong. “Acceleration of Computer Based Simulation, Image Processing,
and Data Analysis Using Computer Clusters with Heterogeneous
Accelerators.” 2016. Doctoral Dissertation, University of Dayton. Accessed January 17, 2021.
http://rave.ohiolink.edu/etdc/view?acc_num=dayton148036732102682.
MLA Handbook (7th Edition):
Chen, Chong. “Acceleration of Computer Based Simulation, Image Processing,
and Data Analysis Using Computer Clusters with Heterogeneous
Accelerators.” 2016. Web. 17 Jan 2021.
Vancouver:
Chen C. Acceleration of Computer Based Simulation, Image Processing,
and Data Analysis Using Computer Clusters with Heterogeneous
Accelerators. [Internet] [Doctoral dissertation]. University of Dayton; 2016. [cited 2021 Jan 17].
Available from: http://rave.ohiolink.edu/etdc/view?acc_num=dayton148036732102682.
Council of Science Editors:
Chen C. Acceleration of Computer Based Simulation, Image Processing,
and Data Analysis Using Computer Clusters with Heterogeneous
Accelerators. [Doctoral Dissertation]. University of Dayton; 2016. Available from: http://rave.ohiolink.edu/etdc/view?acc_num=dayton148036732102682

Vanderbilt University
27.
Varshneya, Pooja.
Distributed and Adaptive Parallel Computing for Computational Finance Applications.
Degree: MS, Computer Science, 2010, Vanderbilt University
URL: http://hdl.handle.net/1803/12502
▼ Analysts, scientists, engineers, and multimedia professionals require massive processing power to analyze financial trends, create test simulations, model climate, compile code, render video, decode genomes, and perform other complex tasks. These applications can greatly benefit from adaptive, parallel computing middleware that enables quick parallelization of existing applications and improves application performance by porting them onto parallel computing platforms such as HPC clusters, clouds, and multi-core machines.
This work benchmarks the performance of two currently available parallel computing frameworks, OpenMPI and the Zircon middleware, using computation-intensive financial applications such as binomial option pricing and Heston model calibration for option pricing, and compares and contrasts the pros and cons of the two frameworks.
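Binomial option pricing, one of the benchmark workloads named above, parallelizes naturally because each level of the price lattice is an independent element-wise update. A minimal serial sketch of the Cox-Ross-Rubinstein model follows; the parameters are illustrative, not taken from the thesis:

```python
import math

def binomial_call(S, K, T, r, sigma, n):
    """European call price on an n-step Cox-Ross-Rubinstein lattice.

    Each backward-induction level is an independent element-wise update,
    which is what makes the method easy to parallelize across threads
    or MPI ranks.
    """
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))   # up-move factor
    d = 1.0 / u                           # down-move factor
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    disc = math.exp(-r * dt)              # one-step discount factor
    # Terminal payoffs at expiry, indexed by number of up-moves j
    values = [max(S * u**j * d**(n - j) - K, 0.0) for j in range(n + 1)]
    # Backward induction toward the root of the lattice
    for step in range(n, 0, -1):
        values = [disc * (p * values[j + 1] + (1 - p) * values[j])
                  for j in range(step)]
    return values[0]

# Illustrative at-the-money call; converges toward the Black-Scholes price
price = binomial_call(S=100, K=100, T=1.0, r=0.05, sigma=0.2, n=500)
```

Each inner list comprehension touches only the previous level's values, so a parallel version can split a level across workers with a boundary exchange per step, the kind of pattern the benchmarked middleware is meant to express.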
Advisors/Committee Members: Dr. Aniruddha Gokhale (committee member), Dr. Douglas C. Schmidt (committee chair).
Subjects/Keywords: distributed adaptive computing; MPI; parallel computing; high performance computing
APA (6th Edition):
Varshneya, P. (2010). Distributed and Adaptive Parallel Computing for Computational Finance Applications. (Thesis). Vanderbilt University. Retrieved from http://hdl.handle.net/1803/12502
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Varshneya, Pooja. “Distributed and Adaptive Parallel Computing for Computational Finance Applications.” 2010. Thesis, Vanderbilt University. Accessed January 17, 2021.
http://hdl.handle.net/1803/12502.
MLA Handbook (7th Edition):
Varshneya, Pooja. “Distributed and Adaptive Parallel Computing for Computational Finance Applications.” 2010. Web. 17 Jan 2021.
Vancouver:
Varshneya P. Distributed and Adaptive Parallel Computing for Computational Finance Applications. [Internet] [Thesis]. Vanderbilt University; 2010. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/1803/12502.
Council of Science Editors:
Varshneya P. Distributed and Adaptive Parallel Computing for Computational Finance Applications. [Thesis]. Vanderbilt University; 2010. Available from: http://hdl.handle.net/1803/12502

Virginia Tech
28.
Pulla, Gautam.
High Performance Computing Issues in Large-Scale Molecular Statics Simulations.
Degree: MS, Computer Science, 1999, Virginia Tech
URL: http://hdl.handle.net/10919/33206
▼ Successful application of parallel high performance computing to practical problems requires overcoming several challenges. These range from the need to make sequential and parallel improvements in programs to the implementation of software tools which create an environment that aids sharing of high performance hardware resources and limits losses caused by hardware and software failures. In this thesis we describe our approach to meeting these challenges in the context of a Molecular Statics code. We describe sequential and parallel optimizations made to the code and also a suite of tools constructed to facilitate the execution of the Molecular Statics program on a network of parallel machines with the aim of increasing resource sharing, fault tolerance, and availability.
Advisors/Committee Members: Ribbens, Calvin J. (committee chair), Kafura, Dennis G. (committee member), Farkas, Diana (committee member).
Subjects/Keywords: distributed computing; high performance computing; computational environments; parallel computing
APA (6th Edition):
Pulla, G. (1999). High Performance Computing Issues in Large-Scale Molecular Statics Simulations. (Masters Thesis). Virginia Tech. Retrieved from http://hdl.handle.net/10919/33206
Chicago Manual of Style (16th Edition):
Pulla, Gautam. “High Performance Computing Issues in Large-Scale Molecular Statics Simulations.” 1999. Masters Thesis, Virginia Tech. Accessed January 17, 2021.
http://hdl.handle.net/10919/33206.
MLA Handbook (7th Edition):
Pulla, Gautam. “High Performance Computing Issues in Large-Scale Molecular Statics Simulations.” 1999. Web. 17 Jan 2021.
Vancouver:
Pulla G. High Performance Computing Issues in Large-Scale Molecular Statics Simulations. [Internet] [Masters thesis]. Virginia Tech; 1999. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/10919/33206.
Council of Science Editors:
Pulla G. High Performance Computing Issues in Large-Scale Molecular Statics Simulations. [Masters Thesis]. Virginia Tech; 1999. Available from: http://hdl.handle.net/10919/33206

Leiden University
29.
Mark, van der, P.J.
Code generation for large scale applications.
Degree: 2006, Leiden University
URL: http://hdl.handle.net/1887/4961
▼ Efficient execution of large-scale application codes is a primary requirement in many cases. High efficiency can only be achieved by utilizing architecture-independent efficient algorithms and exploiting specific architecture-dependent characteristics of a given computer architecture. However, platform-specific versions of source code must be avoided to limit development and maintenance complexity. Usually, the problem can be formulated at an abstract level (mathematical equations, English). At that level, the problem is completely known, and there is no reference to the hardware on which the problem will be solved. Unfortunately, the advantages of a high level of abstraction are often overshadowed by a loss of performance compared to handwritten code. Therefore, a problem-specific code generator, called Ctadel, has been developed in order to exploit both architecture-independent and architecture-dependent optimizations. We show how to extend Ctadel with more advanced numerical techniques and interfaces to numerical libraries. A number of numerical models from Hirlam, a numerical weather prediction application used by several meteorological institutes such as the Royal Netherlands Meteorological Institute (KNMI), were specified using our specification language. We compared the performance of the generated program code with hand-written code. In most cases, the code generated by Ctadel performs as well as or even better than the hand-written code.
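The abstraction-to-code pipeline described above can be illustrated with a toy generator that lowers an expression tree to C source. Everything here (the tuple encoding, the function names, the example stencil) is illustrative and not Ctadel's actual design:

```python
# Toy expression-to-C generator (illustrative; not Ctadel's actual design).
# A problem stated abstractly as an expression tree is lowered to a C loop,
# the stage at which architecture-specific optimizations could be applied.

def gen_c(expr):
    """Recursively emit a C expression from a nested-tuple tree."""
    if isinstance(expr, str):            # variable or array reference
        return expr
    if isinstance(expr, (int, float)):   # numeric literal
        return repr(expr)
    op, lhs, rhs = expr                  # binary node: (operator, left, right)
    return f"({gen_c(lhs)} {op} {gen_c(rhs)})"

def gen_loop(target, expr, n="n"):
    """Wrap an element-wise expression in a C for-loop over arrays."""
    body = gen_c(expr)
    return (f"for (int i = 0; i < {n}; i++) {{\n"
            f"    {target}[i] = {body};\n"
            f"}}")

# An abstract update rule, u_new = u + dt * (a * laplacian), as a tree:
tree = ("+", "u[i]", ("*", "dt", ("*", "a", "lap[i]")))
code = gen_loop("u_new", tree)
```

A real generator would additionally choose loop orders, blocking, and library calls per target architecture, which is the point of keeping the specification hardware-independent.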
Advisors/Committee Members: Supervisor: H.A.G. Wijshoff; Co-Supervisor: A.A. Wolters.
Subjects/Keywords: Code generation; Parallel and distributed computing
APA (6th Edition):
Mark, van der, P. J. (2006). Code generation for large scale applications. (Doctoral Dissertation). Leiden University. Retrieved from http://hdl.handle.net/1887/4961
Chicago Manual of Style (16th Edition):
Mark, van der, P J. “Code generation for large scale applications.” 2006. Doctoral Dissertation, Leiden University. Accessed January 17, 2021.
http://hdl.handle.net/1887/4961.
MLA Handbook (7th Edition):
Mark, van der, P J. “Code generation for large scale applications.” 2006. Web. 17 Jan 2021.
Vancouver:
Mark, van der PJ. Code generation for large scale applications. [Internet] [Doctoral dissertation]. Leiden University; 2006. [cited 2021 Jan 17].
Available from: http://hdl.handle.net/1887/4961.
Council of Science Editors:
Mark, van der PJ. Code generation for large scale applications. [Doctoral Dissertation]. Leiden University; 2006. Available from: http://hdl.handle.net/1887/4961

University of California – Berkeley
30.
Chowdhury, N M Mosharaf.
Coflow: A Networking Abstraction for Distributed Data-Parallel Applications.
Degree: Electrical Engineering & Computer Sciences, 2015, University of California – Berkeley
URL: http://www.escholarship.org/uc/item/0z1537b7
▼ Over the past decade, the confluence of an unprecedented growth in data volumes and the rapid rise of cloud computing has fundamentally transformed systems software and corresponding infrastructure. To deal with massive datasets, more and more applications today are scaling out to large datacenters. These distributed data-parallel applications run on tens to thousands of machines in parallel to exploit I/O parallelism, and they enable a wide variety of use cases, including interactive analysis, SQL queries, machine learning, and graph processing. Communication between the distributed computation tasks of these applications often results in massive data transfers over the network. Consequently, concentrated efforts in both industry and academia have gone into building high-capacity, low-latency datacenter networks at scale. At the same time, researchers and practitioners have proposed a wide variety of solutions to minimize flow completion times or to ensure per-flow fairness based on the point-to-point flow abstraction that forms the basis of the TCP/IP stack. We observe that despite rapid innovations in both applications and infrastructure, application- and network-level goals are moving further apart. Data-parallel applications care about all their flows, but today's networks treat each point-to-point flow independently. This fundamental mismatch has resulted in complex point solutions for application developers, a myriad of configuration options for end users, and an overall loss of performance. The key contribution of this dissertation is bridging this gap between application-level performance and network-level optimizations through the coflow abstraction. Each multipoint-to-multipoint coflow represents a collection of flows with a common application-level performance objective, enabling application-aware decision making in the network.
We describe complete solutions including architectures, algorithms, and implementations that apply coflows to multiple scenarios using central coordination, and we demonstrate through large-scale cloud deployments and trace-driven simulations that simply knowing how flows relate to each other is enough for better network scheduling, meeting more deadlines, and providing higher performance isolation than what is otherwise possible using today's application-agnostic solutions. In addition to performance improvements, coflows allow us to consolidate communication optimizations across multiple applications, simplifying software development and relieving end users from parameter tuning. On the theoretical front, we discover and characterize for the first time the concurrent open shop scheduling with coupled resources family of problems. Because any flow is also a coflow with just one flow, coflows and coflow-based solutions presented in this dissertation generalize a large body of work in both the networking and scheduling literatures.
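A well-known heuristic from this line of work orders coflows by their bottleneck port, a shortest-job-first policy at coflow granularity (Smallest-Effective-Bottleneck-First). The sketch below is a simplified toy, with made-up flows and a uniform port bandwidth, not the dissertation's actual scheduler:

```python
def bottleneck_time(coflow, bandwidth=1.0):
    """A coflow is a list of (src_port, dst_port, bytes) flows.

    Its completion is gated by its most-loaded ingress or egress port:
    the time needed to drain all of the coflow's bytes through that
    single port at the given bandwidth.
    """
    load = {}
    for src, dst, size in coflow:
        load[("in", src)] = load.get(("in", src), 0) + size   # egress of sender
        load[("out", dst)] = load.get(("out", dst), 0) + size # ingress of receiver
    return max(load.values()) / bandwidth

def sebf_order(coflows):
    """Smallest-Effective-Bottleneck-First: serve coflows in increasing
    order of bottleneck completion time, analogous to shortest-job-first."""
    return sorted(coflows, key=bottleneck_time)

# Two toy shuffles: A fans out 20 units from port 1; B is a lone 5-unit flow.
shuffle_a = [(1, 2, 10), (1, 3, 10)]
shuffle_b = [(2, 3, 5)]
order = sebf_order([shuffle_a, shuffle_b])
```

The point of the abstraction is visible even at this scale: a per-flow scheduler would interleave all three flows, while the coflow-aware ordering finishes the small shuffle first without delaying the large one's bottleneck port.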
Subjects/Keywords: Computer science; Application-aware networking; Cloud computing; Coflow; Datacenter networks; Distributed data-parallel applications; Scheduling
APA (6th Edition):
Chowdhury, N. M. M. (2015). Coflow: A Networking Abstraction for Distributed Data-Parallel Applications. (Thesis). University of California – Berkeley. Retrieved from http://www.escholarship.org/uc/item/0z1537b7
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Chowdhury, N M Mosharaf. “Coflow: A Networking Abstraction for Distributed Data-Parallel Applications.” 2015. Thesis, University of California – Berkeley. Accessed January 17, 2021.
http://www.escholarship.org/uc/item/0z1537b7.
MLA Handbook (7th Edition):
Chowdhury, N M Mosharaf. “Coflow: A Networking Abstraction for Distributed Data-Parallel Applications.” 2015. Web. 17 Jan 2021.
Vancouver:
Chowdhury NMM. Coflow: A Networking Abstraction for Distributed Data-Parallel Applications. [Internet] [Thesis]. University of California – Berkeley; 2015. [cited 2021 Jan 17].
Available from: http://www.escholarship.org/uc/item/0z1537b7.
Council of Science Editors:
Chowdhury NMM. Coflow: A Networking Abstraction for Distributed Data-Parallel Applications. [Thesis]. University of California – Berkeley; 2015. Available from: http://www.escholarship.org/uc/item/0z1537b7