You searched for subject:(MPI). Showing records 1 – 30 of 374 total matches.

University of Minnesota
1.
Vuggumudi, Viswanadh Kumar Reddy.
A MPI-based Distributed Computation for Supporting Optimization of Urban Designs with QUIC EnvSim.
Degree: MS, Computer Science, 2015, University of Minnesota
URL: http://hdl.handle.net/11299/174731
In the present day of urbanization, the rise in urban infrastructure is causing an increase in air temperatures and pollution concentrations. This leads to an increase in the energy required to cool buildings and to more focused efforts to mitigate pollution. An effective way to mitigate these problems is to design cityscapes carefully, i.e., by placing buildings and vegetation optimally and choosing energy-efficient building materials. Researchers have been building computational models to understand the effects of urban infrastructure on microclimate. Simulating these models is a computationally expensive task. QUIC EnvSim (QES) is a dynamic, scalable and high-performance framework that provides a platform for building and simulating these models. QUIC EnvSim uses Graphics Processing Units (GPUs) to run each individual simulation faster than previous simulation codes. Though each individual simulation takes a short time, it is often necessary to perform large numbers of simulations, and completing them can take a long time. This thesis introduces MPI QUIC, a scalable and extendable framework for running these simulations across a cluster of machines, effectively reducing the time required to run all simulations. Various tests have shown that the framework is capable of running large numbers of simulations in a relatively short amount of time. A test running 65536 simulations was performed. The estimated time for running the test on a single computer is approximately 11.37 days, with each simulation taking approximately 15 seconds to complete. The framework was able to finish running all the simulations in 19 hours, 0 minutes and 25 seconds, showing a tremendous speed up of 92.5%. Thus urban planners can use this framework along with QUIC EnvSim to understand the effects of urban forms on microclimate and make informed design decisions relatively quickly for building environment-friendly urban landscapes.
Besides providing a distributed computational environment, the other goal of the MPI QUIC project is to provide a user-friendly interface for specifying optimization problems. The current work lays the groundwork for its successors to provide a programmable interface with which end users can specify optimization problems. The framework is also designed so that future implementers can incorporate optimization algorithms that optimize on multiple fitness functions.
Subjects/Keywords: Distributed computing; MPI
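The timing figures quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are made up here, and it uses the abstract's rounded value of 15 seconds per simulation, so the derived reduction (roughly 93%) lands close to, but not exactly on, the 92.5% quoted in the abstract.

```python
# Cross-check of the timing figures quoted in the abstract.
# All names are illustrative; the inputs are the abstract's rounded values.
SIMULATIONS = 65536
SECONDS_PER_SIMULATION = 15  # "approximately 15 seconds" per the abstract

# Estimated time on a single computer.
serial_seconds = SIMULATIONS * SECONDS_PER_SIMULATION
serial_days = serial_seconds / 86400  # 86400 seconds in a day

# Reported cluster time: 19 hours, 0 minutes and 25 seconds.
cluster_seconds = 19 * 3600 + 0 * 60 + 25

# Speedup as a ratio, and the fraction of wall-clock time saved.
speedup_factor = serial_seconds / cluster_seconds
time_saved = 1 - cluster_seconds / serial_seconds

print(f"serial estimate: {serial_days:.2f} days")
print(f"speedup factor:  {speedup_factor:.2f}x")
print(f"time saved:      {time_saved:.1%}")
```

Expressed as a ratio, the same figures correspond to a speedup of roughly 14 times over the single-computer estimate.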

University of New South Wales
2.
Wang, Jingjing.
Automating the measurement of the foetal myocardial performance index using ultrasound images.
Degree: Doctoral Dissertation, Graduate School of Biomedical Engineering, 2018, University of New South Wales
URL: http://handle.unsw.edu.au/1959.4/60057 ; https://unsworks.unsw.edu.au/fapi/datastream/unsworks:51304/SOURCE02?view=true
Functional cardiovascular assessment is important in the study of fetal pathology, and is becoming an increasingly popular tool in clinical practice. The myocardial performance index (MPI) is one particularly important index for evaluating global myocardial function, as it combines assessment of both systolic and diastolic function and has been shown to be a sensitive indicator of cardiac dysfunction. Moreover, the MPI can be obtained non-invasively, and it is independent of ventricular geometry and heart rate. However, the current method for MPI calculation, which is obtained by manually annotating the time intervals required for the MPI calculation, can be time-consuming and demonstrates poor inter-operator repeatability; this is evidenced by a broad variation in normal ranges of MPI values reported in the literature. Therefore, there is motivation to develop an automated method for MPI calculation, which offers a convenient and consistent way to estimate the MPI, hence making fetal cardiac functional evaluation based on the MPI easier and more reliable. In this thesis, automated fetal MPI calculation algorithms for both left and right ventricles have been developed by investigating manual measurement as well as exploring morphological characteristics of ultrasound images. The performance of these algorithms has been validated, demonstrating excellent agreement with aggregated manual annotation performed by experts. These algorithms have also been deployed in clinical studies evaluating fetal MPI beat-to-beat variation, tissue Doppler-derived MPI, and developing normal MPI reference ranges.
Subjects/Keywords: Automation; Ultrasound; MPI
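The abstract above refers to "the time intervals required for the MPI calculation" without stating the formula. As the index is conventionally defined in the cardiology literature (it is also known as the Tei index), and assuming the thesis follows that convention, it is the ratio of the two isovolumetric intervals to the ejection time:

```latex
\mathrm{MPI} = \frac{\mathrm{ICT} + \mathrm{IRT}}{\mathrm{ET}}
```

Here ICT is the isovolumetric contraction time, IRT the isovolumetric relaxation time, and ET the ejection time. The symbols are the customary abbreviations and are not taken from the abstract itself.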

University of Utah
3.
Vakkalanka, Sarvani.
Efficient dynamic verification algorithms for MPI applications.
Degree: PhD, School of Computing, 2010, University of Utah
URL: http://content.lib.utah.edu/cdm/singleitem/collection/etd2/id/882/rec/406
The Message Passing Interface (MPI) Application Programming Interface (API) is widely used in almost all high performance applications. Yet, conventional debugging tools for MPI suffer from two serious drawbacks: they cannot prevent the exponentially growing number of redundant schedules from being explored; and they cannot prevent the processes from being locked into a small subset of schedules, unfortunately often reaching the potentially buggy schedules only when programs are ported to new platforms.
Subjects/Keywords: Dynamic verification; MPI; Testing

University of Georgia
4.
Springer, Robert Coleman.
Optimizing time and energy in MPI programs in a power-scalable cluster.
Degree: 2014, University of Georgia
URL: http://hdl.handle.net/10724/23013
Recently, the idea that power is a performance-limiting factor has gained traction in the high-performance computing community. This may be due to the fact that the cost of energy has become increasingly significant, or that the heat produced by higher-energy components tends to reduce their reliability. In addition, cooling and providing ample power to supercomputers is becoming more difficult as their power requirements increase. One way to reduce power (and therefore energy) requirements is to use high-performance cluster nodes that are frequency- and voltage-scalable (e.g., AMD-64 processors). The problem we address in this thesis is: given a power-scalable cluster and an upper limit for energy consumption, choose a schedule (number of nodes and CPU speed per node) that simultaneously (1) satisfies an external upper limit for energy consumption and (2) minimizes execution time. We do this using a novel combination of modeling, execution and profiling. Using our technique, we are able to find a near-optimal schedule in just a handful of partial program executions.
Subjects/Keywords: MPI; power-scalable; clusters; profiling
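The selection problem stated in the abstract above (pick the number of nodes and the CPU speed that minimize execution time under an energy cap) can be illustrated with a brute-force sketch. This is not the thesis's method, which relies on modeling and profiling to avoid exhaustive runs; the `Schedule` type, the `best_schedule` helper, and every number in the candidate table are hypothetical and exist only to make the constraint concrete.

```python
from typing import NamedTuple, Optional, Sequence


class Schedule(NamedTuple):
    """One candidate configuration with its (hypothetical) measured cost."""
    nodes: int
    cpu_ghz: float
    time_s: float
    energy_j: float


def best_schedule(candidates: Sequence[Schedule],
                  energy_cap_j: float) -> Optional[Schedule]:
    """Return the fastest schedule whose energy stays within the cap.

    Returns None when no candidate satisfies the energy limit.
    """
    feasible = [s for s in candidates if s.energy_j <= energy_cap_j]
    return min(feasible, key=lambda s: s.time_s, default=None)


# Made-up measurements, purely for illustration.
candidates = [
    Schedule(nodes=2, cpu_ghz=2.0, time_s=100.0, energy_j=40_000.0),
    Schedule(nodes=4, cpu_ghz=2.0, time_s=60.0, energy_j=52_000.0),
    Schedule(nodes=4, cpu_ghz=1.6, time_s=70.0, energy_j=45_000.0),
    Schedule(nodes=8, cpu_ghz=1.6, time_s=45.0, energy_j=61_000.0),
]

chosen = best_schedule(candidates, energy_cap_j=50_000.0)
print(chosen)
```

With these made-up numbers, the fastest configurations exceed the cap, so the chosen schedule is the slower four-node run at the reduced clock speed. An exhaustive table like this is exactly what a model-driven approach would try not to build.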

Queens University
5.
Zounmevo, Ayi Judicael.
Scalability-Driven Approaches to Key Aspects of the Message Passing Interface for Next Generation Supercomputing.
Degree: Electrical and Computer Engineering, 2014, Queens University
URL: http://hdl.handle.net/1974/12194
The Message Passing Interface (MPI), which dominates the supercomputing programming environment, is used to orchestrate and fulfill communication in High Performance Computing (HPC). How far HPC programs can scale depends in large part on the ability to achieve fast communication, and to overlap communication with computation or communication with communication.
This dissertation proposes a new asynchronous solution to the nonblocking Rendezvous protocol used between pairs of processes to transfer large payloads. On top of enforcing communication/computation overlapping in a comprehensive way, the proposal trumps existing network device-agnostic asynchronous solutions by being memory-scalable and by avoiding brute force strategies.
Achieving overlapping between communication and computation is important, but each communication is also expected to generate minimal latency. In that respect, the processing of the queues meant to hold messages pending reception inside the MPI middleware is expected to be fast. Currently though, that processing slows down as program scales grow. This research presents a novel scalability-driven message queue whose processing skips altogether large portions of queue items that are deterministically guaranteed to lead to unfruitful searches. Because it has little sensitivity to program size, the proposed message queue maintains very good performance, on top of displaying a low and flattening memory footprint growth pattern.
Due to the blocking nature of its required synchronizations, the one-sided communication model of MPI creates both communication/computation and communication/communication serializations. This research fixes these issues and latency-related inefficiencies documented for MPI one-sided communications by proposing completely nonblocking and non-serializing versions of those synchronizations. The improvements, meant for consideration in a future MPI standard, also allow new classes of programs to be more efficiently expressed in MPI.
Finally, a persistent distributed service is designed over MPI to show its impacts at large scales beyond communication-only activities. MPI is analyzed in situations of resource exhaustion, partial failure and heavy use of internal objects for communicating and non-communicating routines. Important scalability issues are revealed and solution approaches are put forth.
Subjects/Keywords: HPC; MPI; Scalability; Supercomputing

University of Manitoba
6.
Singh, Rajendra.
Performance Oriented Partial Checkpoint and Migration of LAM/MPI Applications.
Degree: Computer Science, 2011, University of Manitoba
URL: http://hdl.handle.net/1993/4397
In parallel computing, MPI is heavily used due to its support of popular cluster based parallel machines and the Single Program Multiple Data (SPMD) model. Normally cluster nodes are dedicated to a single parallel job/application but MPI could also be used with nodes that are concurrently shared by multiple users. In this case, nodes could become overloaded with work from other users. Even a few overloaded nodes can result in application slowdown. Thus, it is desirable to relocate affected processes in a running application to lightly loaded nodes by partial checkpointing and migrating of those processes.
In some MPI applications, groups of processes communicate frequently with one another. Such groups must be near one another to ensure communication efficiency. Thus, if any member of a group is to be checkpointed and migrated, all should be. It must therefore be possible to identify such groups.
I have built a prototype, using LAM/MPI, that supports partial checkpoint, migration and restart of MPI processes. To identify process groups for checkpoint and migration, I adapted TEIRESIAS (an algorithm for pattern discovery from bioinformatics) to identify frequent, recurring patterns of communication using data gathered by LAM/MPI. I then created predictors that use the discovered patterns to predict groups of communicating processes that should be checkpointed and migrated together.
I have assessed the effectiveness of my technique using synthetic and real communication data (for a small set of representative applications) to show that my predictors can accurately predict process groups for those applications. Additionally, I have created a simple simulation system to allow me to explore scenarios related to network characteristics and overload conditions under which my system might provide useful speedup.
Not all MPI applications will benefit from my approach (e.g. those with unpredictable communication patterns or large groups of frequently communicating processes). However, my experimental and simulation results suggest that my technique should be effective for a number of common application types, network characteristics and overload conditions. Using partial checkpoint and migration should therefore allow many long running applications to finish faster than if a subset of their processes was left running on overloaded nodes.
Advisors/Committee Members: Graham, Peter (Computer Science) (supervisor); Stacey, Deborah (University of Guelph); McLeod, Robert (Electrical and Computer Engineering); Thulasiraman, Parimala (Computer Science) (examining committee).
Subjects/Keywords: Checkpoint; Migration; Partial; LAM/MPI

University of Florida
7.
Zhao, Zhiyuan.
Dynamics and Magnetization Dynamics of Magnetic Nanoparticles in Applied Magnetic Fields.
Degree: PhD, Chemical Engineering, 2019, University of Florida
URL: https://ufdc.ufl.edu/UFE0055745
Magnetic nanoparticles (MNPs) can align their magnetic dipole with the direction of an externally applied magnetic field through internal rotation of the magnetic dipole, i.e. the Néel relaxation mechanism, or physical rotation of the nanoparticle, i.e. the Brownian relaxation mechanism. Based on this property, MNPs have been exploited to drive magnetic assembly of particles in patterned static magnetic field gradients, to generate heat that can be employed to actuate release of a drug or magnetic hyperthermia in applied uniform alternating magnetic fields (AMFs), and to act as tracers in magnetic particle imaging (MPI) technology. However, a literature review suggests that no prior work has explicitly compared the dynamic capture process and aggregate size of MNPs with Néel and Brownian relaxation mechanisms for the application of magnetic assembly. Some prior computational work has studied the role of inter-particle interactions in heat dissipation of MNPs but has not considered the potential role of field-induced particle aggregation on energy dissipation rate. Additionally, no prior computational work has studied the effects of nature of particle diameter, magnetic anisotropy energy and magnetocrystalline anisotropy symmetry on x-space MPI performance of MNPs undergoing Néel relaxation.
Advisors/Committee Members: Rinaldi, Carlos (committee chair), Narayanan, Ranganathan (committee member), Arnold, David P (committee member).
Subjects/Keywords: magnetism – mpi – nanoparticle – simulation

University of Limerick
8.
Meade, Anne.
Supporting data decomposition in parallel programming.
Degree: 2014, University of Limerick
URL: http://hdl.handle.net/10344/4019
peer-reviewed
Parallelising serial software systems presents many challenges. In particular, the task of decomposing large, data-intensive applications for execution on distributed architectures is described in the literature as error-prone and time-consuming. The Message Passing Interface (MPI) specification is the de facto industry standard to program for such architectures, but requires low level knowledge of data distribution details as programmers must explicitly invoke inter-process communication routines. This research reports the findings from empirical studies conducted in industry, to explore and characterise the challenges associated with performing data decomposition. Findings from these studies culminated in a list of derived requirements for tool support, encompassing automation of grid indexing, generation of data structures and communication calls, and provision of assistance when changing from an implemented decomposition strategy. Additional requirements include the need for a tool to be MPI focused, initially target structured grids and have a low impact on the application code. These requirements were subsequently buttressed to address gaps in the state-of-the-art and provided motivation for the development of a tool named MPIGen.
MPIGen provides an abstraction for MPI, encapsulating the low level details involved in decomposing data and exchanging messages between processors. Users can express the parallel intent of their application through input parameters and then generate code containing wrapper functions that encompass the MPI functionality. The wrapper functions can then be invoked within the serial code resulting in a semi-automated parallelised solution. The programmer is relieved of the burden of deciphering memory locations when exchanging data between processors. The tool was evaluated in two studies involving both students and High Performance Computing (HPC) practitioners as subjects. The findings concluded that MPIGen provides an efficient abstraction for performing data decomposition and that it satisfies the list of empirically derived requirements.
Parallel programming is a difficult skill that software developers need to learn, yet the low level nature of specifications such as MPI is an adverse factor to its adoption. MPIGen makes it easier to adopt this skill-set as it offers effective support to parallel programmers when undertaking decomposition and communication.
Advisors/Committee Members: Collins, J.J., Buckley, Jim, SFI.
Subjects/Keywords: software systems; MPI; parallel programming
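The "grid indexing" burden mentioned in the abstract above can be made concrete with a small sketch of a 1D block decomposition, the kind of bookkeeping an MPI programmer usually writes by hand. This is not MPIGen's generated code or interface, which the abstract does not show; the `block_range` helper and its parameters are hypothetical, and the sketch runs without MPI by simply looping over rank numbers.

```python
def block_range(n_cells: int, n_ranks: int, rank: int) -> tuple[int, int]:
    """Return the half-open [start, stop) slice of a 1D grid owned by `rank`.

    Cells are split as evenly as possible; when n_cells is not divisible
    by n_ranks, the first (n_cells % n_ranks) ranks get one extra cell.
    """
    if not 0 <= rank < n_ranks:
        raise ValueError("rank must be in [0, n_ranks)")
    base, extra = divmod(n_cells, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Decompose a 10-cell grid over 4 ranks (no MPI needed for the index math).
ranges = [block_range(10, 4, rank) for rank in range(4)]
print(ranges)  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```

In a real MPI program each process would call such a helper with its own rank and then exchange boundary cells with its neighbours; hiding that index and communication detail behind wrapper functions is the kind of abstraction the abstract describes.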

Louisiana State University
9.
Rajagopalan, Kaushik Ragavan.
GPU acceleration of the Variational Monte Carlo Method for Many Body Physics.
Degree: MSEE, Electrical and Computer Engineering, 2013, Louisiana State University
URL: etd-04122013-132014 ; https://digitalcommons.lsu.edu/gradschool_theses/3622
High-performance computing is one of the major areas making inroads into the future for large-scale simulation. Applications such as 3D nuclear tests, Molecular Dynamics, and Quantum Monte Carlo simulations are now developed on supercomputers using the latest computing technologies. As per the TOP500 supercomputer ratings, most of today's supercomputers are now heterogeneous, with massively parallel Graphics Processing Units (GPUs) paired with multi-core CPUs to increase the computational capacity. The Variational Monte Carlo (VMC) method is used in Many Body Physics to study the ground state properties of a system. The wavefunction depends on some variational parameters, which contain the physics for a better prediction. In general, the variational parameters are chosen to realize some sort of order or broken symmetry such as superconductivity and magnetism. The variational approach is computationally expensive and requires a large number of Markov chains (MCs) to obtain convergence. The MCs exhibit abundant data parallelism, and parallelizing across CPU clusters will prove to be expensive and does not scale in proportion to the system size. Hence, this method is a suitable candidate for a massively parallel Graphics Processing Unit (GPU). In this research, we discuss the various optimization and parallelization strategies adopted to port the VMC method to an NVIDIA GPU using CUDA. We obtained a speedup of nearly 3.85X compared to the MPI implementation [4] and a speedup of up to 19X compared to an object-oriented C++ code.
Subjects/Keywords: MPI; VMC; GPU; CUDA
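To make the method named in this abstract concrete, the following is a minimal sketch of Variational Monte Carlo on a toy problem. It is not the thesis's many-body code: it assumes a 1D harmonic oscillator (with ħ = m = ω = 1) and the trial wavefunction ψ(x) = exp(-αx²), and the function names, step size, and chain counts are illustrative choices. Each Markov chain only needs its own state, which is the kind of data parallelism the abstract refers to.

```python
import math
import random


def local_energy(x, alpha):
    """Local energy (H psi)/psi for psi = exp(-alpha x^2), H = -1/2 d2/dx2 + x^2/2."""
    return alpha + x * x * (0.5 - 2.0 * alpha * alpha)


def run_chain(alpha, n_steps, step_size, rng):
    """One Metropolis Markov chain sampling |psi|^2; returns the mean local energy."""
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step_size, step_size)
        # Acceptance ratio |psi(x_new)|^2 / |psi(x)|^2
        ratio = math.exp(-2.0 * alpha * (x_new * x_new - x * x))
        if rng.random() < ratio:
            x = x_new
        total += local_energy(x, alpha)
    return total / n_steps


def vmc_energy(alpha, n_chains=8, n_steps=5000, step_size=1.0, seed=0):
    """Average the energy over several chains (run serially here;
    a parallel port would give each chain to a GPU thread or MPI rank)."""
    rng = random.Random(seed)
    energies = [run_chain(alpha, n_steps, step_size, rng) for _ in range(n_chains)]
    return sum(energies) / n_chains
```

For this toy problem the exact ground state corresponds to α = 0.5, where the local energy is constant at 0.5; any other α gives a higher estimate, which is what a variational parameter search would exploit.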

Virginia Tech
10.
Banerjee, Shankha.
MPIOR: A Framework to Analyze File System Performance of MPI Applications.
Degree: MS, Computer Science and Applications, 2012, Virginia Tech
URL: http://hdl.handle.net/10919/31484
▼ MPI I/O replay (MPIOR) is an I/O performance modeling and prediction tool that traces and replays a parallel application to determine its performance under a new I/O subsystem. The trace collector deduces synchronization inter-dependencies between nodes and the I/O demands each node places on the storage subsystem. It uses a novel runtime graph traversal technique to filter and log only those MPI calls that affect I/O, substantially reducing both the number of runs and the size of the trace file. Unlike other such tools, MPIOR collects a valid trace in a single run and does not rely on node sampling or I/O sampling. MPIOR's post-processing engine analyzes the trace files and sets up the re-player. Because trace collection has minimal overhead, MPIOR can be used during production runs rather than only as a debugging tool. The re-player mimics the behavior of the application across a variety of storage systems by mapping multiple processes to multiple threads running on a single node. We show that the average replay error for parallel applications is below 30%.
Advisors/Committee Members: Varadarajan, Srinidhi (committeechair), Tilevich, Eli (committee member), Ribbens, Calvin J. (committee member).
Subjects/Keywords: I/O; replay; MPI; trace
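The replay idea in this abstract, mapping each traced process to a thread on a single node, can be sketched roughly as follows. This is only an illustration under stated assumptions: the trace format (`("write", nbytes)` and `("barrier", None)` records) and the function name `replay` are hypothetical, and an in-memory buffer stands in for the storage system being evaluated. It is not MPIOR's actual trace format or engine.

```python
import io
import threading


def replay(traces):
    """Replay one hypothetical trace per MPI rank, using one thread per rank.

    Each trace is a list of ("write", nbytes) or ("barrier", None) records.
    All traces must contain the same number of barrier records.
    Returns the number of bytes each rank wrote.
    """
    n_ranks = len(traces)
    barrier = threading.Barrier(n_ranks)
    written = [0] * n_ranks

    def worker(rank):
        sink = io.BytesIO()  # stand-in for a file on the storage system under test
        for op, arg in traces[rank]:
            if op == "write":
                written[rank] += sink.write(bytes(arg))
            elif op == "barrier":
                barrier.wait()  # reproduce the recorded synchronization point
            else:
                raise ValueError(f"unknown trace record: {op}")

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return written
```

A real re-player would also time each operation so that replayed and original run times can be compared; that part is omitted here.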

Texas State University – San Marcos
11.
Abdulsalam, Sarah.
Using the Greenup, Powerup And Speedup Metrics To Evaluate Software Energy Efficiency.
Degree: MS, Computer Science, 2016, Texas State University – San Marcos
URL: https://digital.library.txstate.edu/handle/10877/6855
▼ Green computing has made significant progress in the past decades, as evidenced by more energy-efficient hardware (e.g. low-power CPUs, GPUs, SSDs) and better power management and cooling techniques at data centers. However, the energy efficiency of software has not improved much. Most software developers do not know how to reduce the energy consumption of programs because they lack easy-to-use measurement tools, effective evaluation metrics, and in-depth knowledge of the correlations between performance and energy when optimizing software for better efficiency. This thesis proposes the Greenup, Powerup, and Speedup (GPS-UP) metrics to systematically evaluate the energy efficiency of serial and parallel applications. The GPS-UP metrics transform the performance, power, and energy of a program into one of eight categories on the GPS-UP energy efficiency graph. Four of those categories are green (i.e., save energy) and four are red (i.e., waste energy). Using GPS-UP, we study the effect of running code with different programming languages, altering algorithms, using DVFS, changing compiler optimizations, and changing the number of ranks in MPI programs. We show which techniques improve performance more than energy efficiency, which improve energy efficiency more than performance, and which hurt both performance and energy efficiency. In addition to applying the GPS-UP metrics to serial and parallel programs running on a single node, we demonstrate the usability of GPS-UP for MPI programs executed on multiple nodes. We accurately measure the energy consumption of MPI programs. Moreover, we explore the possibility of using machine learning algorithms to build models that predict the optimal number of ranks for MPI programs (for either minimal power waste or acceptable tradeoffs between performance gain and energy penalty).
Advisors/Committee Members: Zong, Ziliang (advisor), Burtscher, Martin (committee member), Qasem, Apan (committee member).
Subjects/Keywords: Energy efficiency; Power; MPI
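The abstract does not define the three metrics, so the sketch below assumes the ratios as they are commonly stated: Speedup as baseline time over optimized time, Greenup as baseline energy over optimized energy, and Powerup as optimized average power over baseline average power. The function name and argument order are illustrative.

```python
def gps_up(t_base, p_base, t_opt, p_opt):
    """Return (speedup, greenup, powerup) for a baseline and an optimized run.

    t_* are run times in seconds and p_* are average powers in watts,
    so the energy of a run is E = t * p (assumed definitions, see lead-in).
    """
    speedup = t_base / t_opt
    greenup = (t_base * p_base) / (t_opt * p_opt)
    powerup = p_opt / p_base  # equivalently speedup / greenup
    return speedup, greenup, powerup
```

For example, a run that halves the time (10 s to 5 s) while raising average power from 100 W to 150 W has Speedup 2, Powerup 1.5, and Greenup 4/3: it is faster and still saves energy, but it gains more in performance than in energy efficiency.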

Universidade do Rio Grande do Sul
12.
Almeida, Alexandre Vinicius.
Uso de auto-tuning para otimização de decomposição de domínios paralela.
Degree: 2011, Universidade do Rio Grande do Sul
URL: http://hdl.handle.net/10183/39121
▼ Developing applications that reach performance levels close to the theoretical levels of a given platform is a task that requires technical knowledge of the hardware environment, since the software must exploit specific details of the platform in question. Because the software is platform-specific, if the platform evolves or changes, the optimizations may not exploit the new architecture efficiently. Auto-tuners are systems that emerged as an automated means of adapting a given piece of software to a target architecture. This adaptation takes place through an empirical search for optimal values of specific application parameters, in order to fit them to the characteristics of the hardware, or through the generation of source code optimized for the platform. This work proposes an auto-tuner module oriented to the parameterized adaptation of a parallel application, which works by varying the dimensions of the two-dimensional domain, the number of processes, and the extent of the overlap regions. For each variation of these factors, the auto-tuner tests the application on the parallel architecture in order to find the parameter combination with the best performance. To make auto-tuning possible, a C++ class named Mesh was developed, based on the MPI standard. The class seeks to abstract the domain decomposition of a parallel application through object orientation, and makes it easy to vary the extent of the overlap regions between subdomains. The experimental results showed that the auto-tuner exploits the performance gain obtained by varying the number of application processes, which is also handled by the auto-tuner module. The parallel architecture used for validation did not prove ideal for optimization through increasing the extent of the overlapping regions between subdomains.
Achieving the peak performance level of a particular platform requires technical knowledge of the hardware environment involved, since the software must exploit specific details inherent to the hardware. Once the software is optimized for a target platform, if the hardware evolves or is changed, the software will probably not be as efficient in the new environment. This performance portability problem is addressed by software auto-tuning, which emerged in the past decade as an automated technique for adapting a particular piece of software to the underlying hardware. The adaptation is performed by an auto-tuner, an entity that empirically adjusts specific application parameters to improve overall application performance, or even generates source code optimized for the target platform. This dissertation proposes an auto-tuner to optimize the domain decomposition of a parallel application that performs stencil computations. The proposed auto-tuner works by parameterized adaptation, varying the dimensions of a 2D domain, the number of parallel processes, and the extent of the overlapping zones between subdomains. For each combination of…
Advisors/Committee Members: Maillard, Nicolas Bruno.
Subjects/Keywords: Mpi; Auto-tuning; Processamento paralelo; Domain decomposition; MPI; Paralelism
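The two ideas in this abstract, decomposing a domain into overlapping subdomains and empirically searching over process counts and overlap extents, can be sketched as below. This is a simplified illustration, not the thesis's C++ Mesh class: it uses a 1D domain instead of a 2D one, and `decompose`, `autotune`, and the caller-supplied `measure` function are hypothetical names. In practice `measure` would run the parallel application and return its measured run time.

```python
import itertools


def decompose(n_cells, n_parts, overlap):
    """Split cells [0, n_cells) into n_parts contiguous subdomains,
    each widened by `overlap` cells on every interior side."""
    bounds = []
    for p in range(n_parts):
        start = p * n_cells // n_parts
        stop = (p + 1) * n_cells // n_parts
        bounds.append((max(0, start - overlap), min(n_cells, stop + overlap)))
    return bounds


def autotune(measure, process_counts, overlaps):
    """Empirical search: try every (processes, overlap) pair and
    keep the one with the lowest measured cost."""
    candidates = itertools.product(process_counts, overlaps)
    return min(candidates, key=lambda cfg: measure(*cfg))
```

With `decompose(10, 2, 1)` the two subdomains are cells 0 to 5 and cells 4 to 9 (as half-open ranges `(0, 6)` and `(4, 10)`), so they share two cells around the cut. An exhaustive search like this is only practical for small parameter spaces.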

Kaunas University of Technology
13.
Šeinauskas, Vytenis.
Lygiagrečių programų efektyvumo tyrimas.
Degree: Master, Informatics, 2008, Kaunas University of Technology
URL: http://vddb.laba.lt/obj/LT-eLABa-0001:E.02~2008~D_20080811_151827-94348
▼ This master's thesis analyzes the efficiency of parallel programs using purpose-built software for studying parallel program performance. The main goal of the work is to develop, study, and apply educational software for analyzing parallel programs. To that end, the capabilities of the developed program were examined, and improvements to the software were planned and carried out. Sample parallel programs were also studied with the developed software to demonstrate ways of investigating parallel program efficiency and the capabilities of the software itself.
Parallel program execution is often used to overcome the constraints of processing speed and memory size when executing complex and time-consuming algorithms. The downside to this approach is the increased overall complexity of programs and their implementations. Parallel execution introduces a new class of software bugs and performance shortcomings that are usually difficult to trace using traditional methods and tools. Hence, new tools and methods need to be introduced that deal specifically with problems encountered in parallel programs. The goal of this project is the development of an MPI-based parallel program performance monitoring tool and research into the ways this tool can be used for measuring, comparing, and improving the performance of target programs.
Advisors/Committee Members: Stulpinas, Raimundas (Master’s degree committee chair), Motiejūnas, Kęstutis (Master’s degree session secretary), Bareiša, Eduardas (Master’s degree committee member), Butleris, Rimantas (Master’s degree committee member), Kazanavičius, Egidijus (Master’s degree committee member), Tomkevičius, Arūnas (Master’s degree committee member), Šeinauskas, Rimantas (Master’s degree committee member), Štuikys, Vytautas (Master’s degree committee member), Paulikas, Kęstutis (Master’s thesis reviewer), Marcinkevičius, Romas (Master’s thesis supervisor).
Subjects/Keywords: Lygiagretus programavimas; MPI; Programinė įranga; Parallel programming; MPI; Software

Universitat Autònoma de Barcelona
14.
Espínola Brítez, Laura María.
Efficient communication management in cloud environments.
Degree: Departament d'Arquitectura de Computadors i Sistemes Operatius, 2018, Universitat Autònoma de Barcelona
URL: http://hdl.handle.net/10803/666690
▼ Scientific applications with High Performance Computing (HPC) requirements are migrating to cloud environments because of the facilities they offer. Cloud computing plays a major role given the compute power it provides while avoiding the cost of maintaining a physical cluster. With features like elasticity and pay-per-use, it helps to reduce researchers' procurement risk.
Most HPC applications are implemented using the Message Passing Interface (MPI), which is a key component in common and distributed computing tasks. However, the major drawback of running this kind of application in cloud environments is the loss of execution performance, caused by the virtualized network, which affects communication latency and bandwidth.
Running scientific applications of this kind in a cloud environment requires low-latency communication mechanisms. Network topology details are not available to users in virtualized environments, which makes it difficult to use the existing optimizations based on network topology information developed for bare-metal clusters. In some cases, cloud providers can migrate virtual machines, which impacts the efficiency of routing optimizations and placement algorithms. Moreover, if resource isolation is not guaranteed, resource sharing can lead to variable bandwidth and unstable performance.
This thesis presents Dynamic MPI Communication Balance and Management (DMCBM), which addresses the communication challenge of HPC applications in the cloud. DMCBM is implemented as middleware between the user's application and the execution environment. It improves message communication latency in cloud-based systems and helps users detect mapping and parallel implementation issues. Our solution dynamically rebalances communication flows at higher levels of the virtualized HPC stack, e.g. over the MPI communications layer, to remove communication hot-spots and congestion in the underlying layers.
DMCBM abstracts the communication state between application processes based on latency measurements. The middleware characterizes the underlying network topology and analyzes the behavior of parallel applications in the cloud. This allows it to detect network congestion and optimize communications by either selecting alternative communication paths between processes or leveraging live migration of virtual machines. These options are analyzed in real time and selected according to the type of congestion (link or destination). DMCBM achieves lower application execution time under congestion, obtaining better performance in clouds.
Finally, experiments that verify the functionality and improvements of DMCBM with MPI applications in public and private clouds are presented. The experiments were done by measuring execution and communication times. The NAS Parallel Benchmarks and NBody, a real application for dynamic particle simulation, are used, obtaining an improvement of up to 10% in execution time and a communication time reduction…
Advisors/Committee Members: Franco Puntes, Daniel (director).
Subjects/Keywords: Cloud computing; Computació de altes prestacions; Computación de altas prestaciones; High performance computing; Comunicacions MPI; Comuncaciones MPI; MPI communications; Tecnologies; 004
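The abstract says congestion is detected from latency measurements and classified as link or destination congestion, but it does not say how. The sketch below shows one plausible heuristic purely for illustration: the function name, the latency-matrix representation, and the threshold `factor` are all assumptions and should not be read as DMCBM's actual algorithm.

```python
def classify_congestion(latency, baseline, factor=2.0):
    """Flag slow process pairs and guess the congestion type per destination.

    latency[i][j] and baseline[i][j] are the measured and expected latencies
    from process i to process j. `factor` is an assumed threshold, not a
    value taken from the thesis.
    Returns a dict mapping destination j -> "destination" or "link".
    """
    n = len(latency)
    result = {}
    for j in range(n):
        senders = [i for i in range(n) if i != j]
        slow = [i for i in senders if latency[i][j] > factor * baseline[i][j]]
        if not slow:
            continue
        # Every sender sees the slowdown: the destination itself looks congested.
        # Only some senders see it: more likely a congested link on those paths.
        result[j] = "destination" if len(slow) == len(senders) else "link"
    return result
```

Under this heuristic, a "link" result would suggest trying an alternative communication path, while a "destination" result would point toward relocating the affected process or virtual machine, matching the two remedies the abstract lists.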

Universidade Federal de Pernambuco
15.
BARROS JÚNIOR, Severino José de.
Um cluster híbrido com módulos de co – processamento em hardware (FPGAS) para processamento de alto desempenho.
Degree: 2014, Universidade Federal de Pernambuco
URL: http://repositorio.ufpe.br/handle/123456789/11839
▼ Organizations that deal with computing systems increasingly seek to improve the performance of their applications. The main characteristic of these applications is massive data processing. The solution used to run these problems is generally based on general-purpose processor architectures, whose main characteristic is a hardware structure based on the Von Neumann paradigm. This paradigm has a deficiency known as the "Von Neumann bottleneck", in which instructions that could be executed simultaneously, because they have no data dependencies, end up being processed sequentially, hurting the potential performance of this class of applications. To increase the parallel processing of their systems, organizations usually adopt a structure based on the association of several PCs, connected by a high-speed network and working together to solve a large problem. This association is called a cluster, in which each member PC, called a node, performs part of the computation of a large problem simultaneously, providing explicit parallelism for the application as a whole. Even with a significant increase in independent processing elements, this growth is insufficient to meet the enormous computational demand of data processing in complex applications. It requires dividing the work into groups of independent instructions distributed among the nodes. This strategy provides parallelism and therefore better performance. However, the performance of each node remains degraded because of the sequential bottleneck present in the processors. To increase the parallelism of operations within each node, hybrid solutions composed of conventional CPUs and coprocessors have been adopted. One such coprocessor is the FPGA (Field Programmable Gate Array), which is usually connected to the PC through the PCIe bus.
The project described in this dissertation proposes a development methodology for this hybrid cluster, in order to increase the performance of scientific applications that require a large amount of data processing. The methodology is presented and two examples are discussed in detail.
Advisors/Committee Members: LIMA, Manoel Eusébio de (advisor).
Subjects/Keywords: HPC; Cluster Híbrido; FPGA; OpenMP; MPI

University of Utah
16.
Chiang, Wei-Fan.
Heuristics for efficient dynamic verification of message passing interface and thread programs.
Degree: MS, Computer Science, 2011, University of Utah
URL: http://content.lib.utah.edu/cdm/singleitem/collection/etd3/id/405/rec/1221
▼ Concurrent programs are extremely important for efficiently programming future HPC systems. Large scientific programs may employ multiple processes or threads and run on HPC systems for days. Reliability is an essential requirement of concurrent programs, so verifying them is increasingly important. Today there are two significant challenges in developing concurrent program verification tools. The first is scalability: since new types of concurrent programs keep being created, verification tools need to scale to handle all of them. The second is providing a formal coverage guarantee: dynamic verification tools always face a huge schedule space. Both capabilities must exist for testing programs that follow multiple concurrency models. Most current dynamic verification tools can explore only thread-level or only process-level schedules. Consequently, they fail to verify hybrid programs. Exploring mixed process- and thread-level schedules is not an ideal solution because the state space grows exponentially at both levels, and it is hard to traverse these mixed schedules systematically. Therefore, our approach is to determinize all concurrent APIs except one, whose schedules are then explored. To improve search efficiency, we propose a random-walk-based heuristic algorithm. We observed many concurrent programs and identified some common structures among them. Based on the existence of these structures, dynamic verification tools can focus on specific regions and bypass regions of less interest. We propose random sampling of executions in the regions of less interest.
Subjects/Keywords: Dynamic verification; Heuristic; MPI programs; Thread programs
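As a rough illustration of sampling a schedule space by random walk, the sketch below picks a random unfinished thread at every step and replays a tiny non-atomic counter increment under the resulting schedule. It is a generic uniform random walk on a toy example, not the heuristic proposed in the thesis, and all function names are hypothetical.

```python
import random


def random_schedule(lengths, rng):
    """Random walk through the schedule space: at each step,
    pick one unfinished thread to execute its next operation."""
    remaining = list(lengths)
    schedule = []
    while any(remaining):
        tid = rng.choice([t for t, left in enumerate(remaining) if left > 0])
        remaining[tid] -= 1
        schedule.append(tid)
    return schedule


def run_counter(schedule):
    """Run two-step increments (read, then write) of a shared counter
    under the given schedule and return the final counter value."""
    counter = 0
    local = {}
    step = {}
    for tid in schedule:
        if step.get(tid, 0) == 0:
            local[tid] = counter      # read the shared counter
            step[tid] = 1
        else:
            counter = local[tid] + 1  # write back the incremented value
            step[tid] = 0
    return counter


def sample_outcomes(n_threads=2, n_samples=200, seed=0):
    """Sample random schedules and collect the distinct final values observed."""
    rng = random.Random(seed)
    return {
        run_counter(random_schedule([2] * n_threads, rng))
        for _ in range(n_samples)
    }
```

With two threads, the schedule `[0, 0, 1, 1]` yields the expected value 2, while `[0, 1, 0, 1]` loses an update and yields 1. Random sampling finds both outcomes quickly here, but unlike systematic exploration it gives no coverage guarantee.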

University of Utah
17.
Vo, Anh.
Scalable formal dynamic verification of MPI programs through distributed causality tracking.
Degree: PhD, Computer Science, 2011, University of Utah
URL: http://content.lib.utah.edu/cdm/singleitem/collection/etd3/id/177/rec/2131
▼ Almost all high performance computing applications are written in MPI, which will continue to be the case for at least the next several years. Given the huge and growing importance of MPI, and the size and sophistication of MPI codes, scalable and incisive MPI debugging tools are essential. Existing MPI debugging tools have, despite their strengths, many glaring deficiencies, especially when it comes to debugging under the presence of nondeterminism-related bugs, which are bugs that do not always show up during testing. These bugs usually become manifest when the systems are ported to different platforms for production runs. This dissertation focuses on the problem of developing scalable dynamic verification tools for MPI programs that can provide a coverage guarantee over the space of MPI nondeterminism. That is, the tools should be able to detect different outcomes of nondeterministic events in an MPI program and enforce all those different outcomes through repeated executions of the program with the same test harness. We propose to achieve the coverage guarantee by introducing efficient distributed causality tracking protocols that are based on the matches-before order. The matches-before order is introduced to address the shortcomings of the Lamport happens-before order [40], which is not sufficient to capture causality for MPI program executions due to the complexity of the MPI semantics. The two protocols we propose are the Lazy Lamport Clocks Protocol (LLCP) and the Lazy Vector Clocks Protocol (LVCP). LLCP provides good scalability with a small possibility of missing potential outcomes of nondeterministic events, while LVCP provides a full coverage guarantee with a scalability tradeoff. In practice, we show through our experiments that LLCP provides the same coverage as LVCP. This thesis makes the following contributions:
• The MPI matches-before order that captures the causality between MPI events in an MPI execution.
• Two distributed causality tracking protocols for MPI programs that rely on the matches-before order.
• A Distributed Analyzer for MPI programs (DAMPI), which implements the two aforementioned protocols to provide scalable and modular dynamic verification for MPI programs.
• Scalability enhancement through algorithmic improvements for ISP, a dynamic verifier for MPI programs.
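The scalability and coverage tradeoff between scalar and vector clocks mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard textbook Lamport and vector clocks, not the LLCP or LVCP protocols and not the matches-before order; the `Process` class and helper names are hypothetical and exist only for this illustration. Two processes send to a third without any causal relation between the sends, which is the situation in which a wildcard receive could match either message.

```python
# Textbook Lamport and vector clocks (an illustration, not LLCP/LVCP).

def vc_leq(a, b):
    """True if vector timestamp a is componentwise <= b (a causally precedes or equals b)."""
    return all(x <= y for x, y in zip(a, b))

def concurrent(a, b):
    """True if neither vector timestamp causally precedes the other."""
    return not vc_leq(a, b) and not vc_leq(b, a)

class Process:
    """Hypothetical process that maintains both clock types side by side."""

    def __init__(self, rank, nprocs):
        self.rank = rank
        self.lamport = 0
        self.vector = [0] * nprocs

    def local_event(self):
        self.lamport += 1
        self.vector[self.rank] += 1
        return self.lamport, tuple(self.vector)

    def send(self):
        # A send is an event; its timestamps are piggybacked on the message.
        return self.local_event()

    def receive(self, msg):
        lam, vec = msg
        self.lamport = max(self.lamport, lam) + 1
        self.vector = [max(a, b) for a, b in zip(self.vector, vec)]
        self.vector[self.rank] += 1
        return self.lamport, tuple(self.vector)

p0, p1, p2 = (Process(r, 3) for r in range(3))
p1.local_event()        # P1 does unrelated work before sending
m0 = p0.send()          # scalar 1, vector (1, 0, 0)
m1 = p1.send()          # scalar 2, vector (0, 2, 0)
r0 = p2.receive(m0)     # P2 happens to receive P0's message first

print(m0[0] < m1[0])              # scalar clocks order the sends, although they are unrelated
print(concurrent(m0[1], m1[1]))   # vector clocks show the sends are concurrent
```

Both lines print `True`: the scalar timestamps are cheap (one integer per message) but suggest an order that does not exist, whereas the vector timestamps cost one entry per process but expose that either send could have matched the receive.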
Subjects/Keywords: Causality tracking; Correctness checking; MPI; Verification

Penn State University
18.
Chirravuri, Sai Krishnan.
RDF3X-MPI: A Partitioned RDF engine for Data-Parallel SPARQL Querying.
Degree: 2014, Penn State University
URL: https://submit-etda.libraries.psu.edu/catalog/22814
▼ The Semantic Web is a collection of technologies that facilitate universal access to linked data. The Resource Description Framework (RDF) model is one such technology that is being developed by the World Wide Web Consortium (W3C). A common representation of RDF data is as a set of triples. Each triple contains three fields: a subject, a predicate, and an object. A collection of triples can also be visualized as a directed graph, with subjects and objects as vertices in the graph, and predicates as edges connecting the vertices. When large collections of triples are aggregated, they form massive RDF graphs. Collections of RDF triple data sets have been growing over the past decade, and publicly-available RDF data sets now have billions of triples. As data sizes continue to grow, the time to process and query large RDF data sets also continues to increase. This work presents RDF3x-MPI, a new scalable, parallel RDF data management and querying system based on the RDF3x data management system. RDF3x (RDF Triple eXpress) is a state-of-the-art RDF engine that is shown to outperform alternatives by one or two orders of magnitude, on several well-known benchmarks and in experimental studies. Our approach leverages all the data storage, indexing, and querying optimizations in RDF3x. We additionally partition input RDF data to support parallel data ingestion, and devise a methodology to execute SPARQL queries in parallel, with minimal inter-processor communication. Using our new approach, we demonstrate a performance improvement of up to 12.9× in query evaluation for the LUBM benchmark, using 32-way MPI task parallelism. This work also presents an in-depth characterization of SPARQL query execution times with RDF3x and RDF3x-MPI on several large-scale benchmark instances.
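The idea of partitioning triples so that queries run in parallel with little communication can be sketched as follows. The abstract does not state which partitioning scheme RDF3x-MPI uses; hashing on the subject is an assumption made here for illustration, and the function names and sample data are hypothetical. With subject hashing, every triple about one subject lands in the same partition, so a "star" query (several predicate/object constraints on one subject) can be answered inside each partition independently and the results simply unioned.

```python
import zlib

def partition_by_subject(triples, nparts):
    """Assign each (subject, predicate, object) triple to a partition by subject hash.

    Subject hashing is an assumed scheme, not necessarily the one used in the thesis.
    crc32 is used because Python's built-in hash() of strings varies between runs.
    """
    parts = [[] for _ in range(nparts)]
    for s, p, o in triples:
        parts[zlib.crc32(s.encode("utf-8")) % nparts].append((s, p, o))
    return parts

def star_query(parts, patterns):
    """Return subjects matching every (predicate, object) pattern.

    Each partition is evaluated on its own; in an MPI setting each rank
    would hold one partition and only the final sets would be gathered.
    """
    results = set()
    for part in parts:
        local = None
        for pred, obj in patterns:
            hits = {s for s, p, o in part if p == pred and o == obj}
            local = hits if local is None else local & hits
        results |= local or set()
    return results

# Hypothetical sample data in the spirit of a university benchmark.
triples = [
    ("alice", "type", "Student"), ("alice", "memberOf", "dept0"),
    ("bob", "type", "Student"), ("bob", "memberOf", "dept1"),
    ("carol", "type", "Professor"), ("carol", "memberOf", "dept0"),
]
parts = partition_by_subject(triples, 4)
answer = star_query(parts, [("type", "Student"), ("memberOf", "dept0")])
print(answer)
```

Queries that join across different subjects would still need data exchange between partitions; the sketch covers only the communication-free case.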
Advisors/Committee Members: Kamesh Madduri, Thesis Advisor/Co-Advisor, Piotr Berman, Thesis Advisor/Co-Advisor.
Subjects/Keywords: RDF3x; MPI; RDF; SPARQL; DISTRIBUTED; IN-MEMORY

Brno University of Technology
19.
Špeťko, Matej.
Efektivní komunikace v multi-GPU systémech: Efficient Communication in Multi-GPU Systems.
Degree: 2020, Brno University of Technology
URL: http://hdl.handle.net/11012/187281
▼ After the introduction of CUDA by Nvidia, GPUs became devices capable of accelerating any general-purpose computation. GPUs are designed as parallel processors that possess huge computational power, and modern supercomputers are often equipped with GPU accelerators. Sometimes the performance or the memory capacity of a single GPU is not enough for a scientific application, and the application needs to be scaled to multiple GPUs. During the computation the GPUs need to exchange partial results, and this communication represents overhead. For this reason it is important to research methods for effective communication between GPUs, which means less CPU involvement, lower latency, and shared system buffers. Both inter-node and intra-node communication are examined. The main focus is on GPUDirect technologies from Nvidia and CUDA-Aware MPI. Subsequently, the k-Wave toolbox for simulating the propagation of acoustic waves is introduced. This application is accelerated using CUDA-Aware MPI.
Advisors/Committee Members: Vaverka, Filip (advisor), Jaroš, Jiří (referee).
Subjects/Keywords: CUDA; MPI; GPUDirect; RDMA; CUDA-Aware MPI; Anselm; HPC; GPGPU; peer-to-peer; k-Wave

Eskisehir Osmangazi University
20.
Şaan, Tahsin Gökalp.
MPI ile paralel proglama (Parallel programming with MPI).
Degree: ESOGÜ, Faculty of Arts and Sciences, Mathematics and Computer Science, 2017, Eskisehir Osmangazi University
URL: http://hdl.handle.net/11684/1620
▼ Parallel programming is the practice of programming multiple units capable of doing a job so that they can work on a problem at the same time. It typically makes it possible to solve tasks that a single computer would struggle with by distributing them across multiple computers or processors. It therefore has a wide range of applications, such as financial modeling, mathematics, and meteorology. In these areas, parallel programming saves time because it speeds up computation in theory, and it makes hard problems feasible to solve. MPI is a computer communication protocol for parallel programs. This thesis covers the basic concepts of parallel programming, a literature review, MPI libraries, and the construction of example applications with these libraries. It also describes how a cluster was built and how the example applications were run on it.
Advisors/Committee Members: Aslan, Ahmet Faruk (advisor).
Subjects/Keywords: Paralel Programlama; MPI; MPICH; Parallel Programming

North-West University
21.
Valentin, Berni.
Determining the contribution of tourism to poverty alleviation in Mozambique : case studies of Praia Bilene and Macanetta / Berni Valentin.
Degree: 2014, North-West University
URL: http://hdl.handle.net/10394/13437
▼ Understanding the role that tourism plays in poverty alleviation globally has been a research focus of many studies in different countries. For an extended period the trickle-down method of wealth distribution, in which riches were believed to find their way down the value chain to the poor through taxes spent on welfare, infrastructure, grants and the like, was globally accepted. In recent years, though, the focus has shifted to tourism as a tool for increasing economic growth and alleviating poverty. It is true that in many cases tourism has made a difference in the lives of the poor, but it is also true that in many instances it has not. This dissertation analysed the perceived contribution made by tourism to poverty alleviation in Mozambique in general, and in Praia de Bilene and the Macanetta peninsula in particular. These are pre-eminently tourist destinations and ideally suited for a study of this nature.
The primary goal of this dissertation was to determine the contribution of tourism to poverty alleviation in Mozambique by assessing Praia Bilene and the Macanetta peninsula. The first objective was to describe and understand the link between tourism and poverty. It was found that the traditional definition of poverty no longer applies to most situations, and that it is better to view poverty as a lack of access rather than a lack of money: access to natural resources, bureaucratic processes, capital markets and entrepreneurship. The review analysed different research methods, looking in depth at livelihood analysis, ST~EP and MPI. The three pathways by which tourism affects the poor, namely the direct, indirect and induced levels, were also explored. The most challenging area is quantifying the impacts of tourism on communities and local livelihoods. The review concluded that measuring the impact of tourism on poverty alleviation is an intricate debate and not easily accomplished.
The second objective was to analyse the current status of the tourism industry in Mozambique. With 48% of sub-Saharan Africa living in poverty, the picture in Mozambique is even drearier, with 54% living under the poverty line and 81% living under the $2 poverty line in the country (OPHI, 2013:1), confirming that it is one of the world’s poorest countries. Mozambique’s profile was analysed in terms of its poverty status, tourism development and growth, and the impacts of tourism on the local communities of Bilene and Macanetta. It was found that several tourism opportunities are scooped up by foreigners and that this causes a major leakage of resources from regions where poverty alleviation through tourism is attempted. At a 7% GDP growth rate Mozambique is making very good progress, but because it starts from so far behind, it is not reducing poverty fast enough.
The third objective was to determine the perceptions of two Mozambican communities on tourism impacts and the impact of tourism on their poverty status by incorporating the multi-dimensional poverty index. A perception analysis was done by means of a structured questionnaire presented to random residents from all…
Subjects/Keywords: Tourism impacts; Poverty alleviation; MPI; Mozambique

Louisiana State University
22.
Bai, Shuju.
A hybrid framework of iterative MapReduce and MPI for molecular dynamics applications.
Degree: PhD, Computer Sciences, 2013, Louisiana State University
URL: etd-07012013-211426 ; https://digitalcommons.lsu.edu/gradschool_dissertations/2662
▼ Developing platforms for large-scale data processing has been of great interest to scientists. Hadoop is a widely used computational platform that provides fault-tolerant distributed data storage through HDFS (Hadoop Distributed File System) and fault-tolerant parallel distributed data processing through the MapReduce framework. Actual computations quite often require multiple MapReduce cycles, which calls for chained MapReduce jobs. However, Hadoop by design is poor at addressing problems with iterative structures. In many iterative problems, some invariant data is required by every MapReduce cycle. The same data is uploaded to the Hadoop file system in every cycle, causing repeated data delivery and unnecessary time spent transferring it. In addition, although Hadoop can process data in parallel, it does not support MPI in computing: in any Map/Reduce task, the computation must be serial. This makes scientific computations wrapped in Map/Reduce tasks inefficient, because the computation cannot be distributed over a Hadoop cluster, especially a Hadoop cluster on a traditional high performance computing cluster. Computational technologies have been extensively investigated for many application domains. Since the advent of Hadoop, scientists have applied the MapReduce framework to biological sciences, chemistry, medical sciences, and other areas to process huge data sets efficiently. In our research, we proposed a hybrid framework of iterative MapReduce and MPI for molecular dynamics applications, and we carried out molecular dynamics simulations with the implemented framework. We improved the capability and performance of Hadoop by adding an MPI module. The MPI module enables Hadoop to monitor and manage the resources of a Hadoop cluster so that computations in Map/Reduce tasks can be performed in parallel. We also applied a local caching mechanism to avoid redundant data delivery and make the computation more efficient. Our hybrid framework inherits the features of Hadoop and improves its computing efficiency. The target application domain of our research is molecular dynamics simulation. However, the potential use of our iterative MapReduce framework with MPI is broad: it can be used by any application that contains single or multiple MapReduce iterations and invokes serial or parallel (MPI) computations in the Map phase or Reduce phase of Hadoop.
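The benefit of caching invariant data across MapReduce cycles can be shown with a small in-memory sketch. It does not use Hadoop or MPI and is not the thesis's framework; `InvariantCache`, `map_reduce`, and the one-dimensional k-means workload are hypothetical stand-ins chosen because k-means is a common iterative MapReduce example (the thesis itself targets molecular dynamics). The input points are the invariant data; only the centroids change between iterations.

```python
from collections import defaultdict

class InvariantCache:
    """Loads invariant input once and serves it to every later iteration."""

    def __init__(self, loader):
        self._loader = loader
        self._data = None
        self.loads = 0  # how many times the (expensive) loader actually ran

    def get(self):
        if self._data is None:
            self._data = self._loader()
            self.loads += 1
        return self._data

def map_reduce(records, mapper, reducer):
    """A single in-memory MapReduce cycle: map, group by key, reduce."""
    groups = defaultdict(list)
    for rec in records:
        for key, value in mapper(rec):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

def make_mapper(centroids):
    """Mapper that emits (index of nearest centroid, point)."""
    def mapper(x):
        nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        yield nearest, x
    return mapper

def mean_reducer(key, values):
    return sum(values) / len(values)

def load_points():
    # Stand-in for an expensive upload of invariant data.
    return [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

cache = InvariantCache(load_points)
centroids = [0.0, 5.0]
for _ in range(5):
    points = cache.get()  # served from the cache after the first cycle
    result = map_reduce(points, make_mapper(centroids), mean_reducer)
    centroids = [result.get(i, c) for i, c in enumerate(centroids)]

print(centroids, cache.loads)
```

After five cycles the centroids settle at `[2.0, 11.0]` and the loader has run once rather than five times, which is the redundancy the abstract describes avoiding.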
Subjects/Keywords: Molecular dynamics; Hadoop; MPI; MapReduce; Framework

University of Georgia
23.
Weatherly, Daniel Brent.
A-MPI : supporting MPI on a nondedicated cluster of workstations.
Degree: 2014, University of Georgia
URL: http://hdl.handle.net/10724/29346
▼ Distributing data is one of the fundamental problems in implementing efficient distributed-memory parallel programs. The problem becomes more difficult in environments where the participating nodes (processors) are not dedicated to a parallel application. Such environments increase the difficulty of the data distribution problem, which is to determine an assignment of data elements to each node that minimizes completion time. We are investigating this problem in the context of explicit message-passing programs. We have designed and implemented an extension to the popular Message Passing Interface (MPI) that efficiently supports adaptive programs by providing the necessary infrastructure to redistribute data dynamically: (1) an efficient memory allocation mechanism, (2) techniques for accurately determining system load and computation time, and (3) a heuristic for determining efficient data distributions, including the removal of nodes whose participation degrades the performance of an application. Performance results show that programs that use A-MPI can produce significant improvements over previous load-balancing systems.
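The data distribution problem stated in the abstract (assign data elements to nodes so that completion time is minimized) can be sketched with a simple proportional split. This is not A-MPI's heuristic, which the abstract does not spell out; the function names are hypothetical, the per-node processing rates are assumed to be already measured, and communication cost and node removal are ignored.

```python
def distribute(n_items, rates):
    """Split n_items across nodes in proportion to their rates (items per second).

    Fractional shares are rounded down and the leftover items go to the
    nodes with the largest fractional remainders.
    """
    total = sum(rates)
    exact = [n_items * r / total for r in rates]
    counts = [int(x) for x in exact]
    leftover = n_items - sum(counts)
    by_remainder = sorted(range(len(rates)),
                          key=lambda i: exact[i] - counts[i], reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts

def completion_time(counts, rates):
    """The job finishes when the slowest node finishes its share."""
    return max((c / r for c, r in zip(counts, rates)), default=0.0)

# One node is twice as fast as the other two, e.g. because they are shared.
rates = [4.0, 2.0, 2.0]
balanced = distribute(100, rates)
equal = [34, 33, 33]

print(balanced, completion_time(balanced, rates))
print(equal, completion_time(equal, rates))
```

With these rates the proportional split is `[50, 25, 25]` and finishes in 12.5 seconds, while the equal split finishes in 16.5 seconds because the slower nodes hold it back. On a nondedicated cluster the rates change over time, which is why the abstract stresses measuring load accurately and redistributing data dynamically.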
Subjects/Keywords: load balancing; MPI; data distribution; parallel programming

The Ohio State University
24.
Li, Mingzhe.
Designing High-Performance Remote Memory Access for MPI and PGAS Models with Modern Networking Technologies on Heterogeneous Clusters.
Degree: PhD, Computer Science and Engineering, 2017, The Ohio State University
URL: http://rave.ohiolink.edu/etdc/view?acc_num=osu1512070491037985
▼ Multi-/Many-core architectures and networking technologies like InfiniBand (IB) are fueling the growth of next-generation ultra-scale systems that have high compute density. The communication requirements of scientific applications are steadily increasing. MPI two-sided programming models have been used by most scientific applications on High-Performance Computing (HPC) systems; however, there is an increased focus on MPI one-sided and Partitioned Global Address Space (PGAS) programming models such as OpenSHMEM. As modern computer hardware architectures keep evolving, it is critical that MPI and PGAS runtimes and scientific applications are designed with high scalability and performance for next-generation systems. This thesis focuses on designing high-performance Remote Memory Access (RMA) for MPI and PGAS models with modern networking technologies on heterogeneous clusters.
High-performance interconnects have been the key drivers for high-performance computing systems. Many new networking technologies have been offered on interconnects to meet the increasing communication requirements of scientific applications. However, MPI and PGAS runtimes have not been designed with such technologies to further boost the performance of scientific applications on multi-petaflop/exascale HPC systems. We present designs at the MPI and PGAS runtime level that take advantage of hardware atomics, User-Mode Memory Registration (UMR), and On-Demand Paging (ODP) of InfiniBand to benefit scientific applications transparently. With our ODP-Aware MPI runtime, the pin-down buffer size of the LAMMPS application has been reduced by 11X. Similarly, we have shown up to 4X performance improvement in point-to-point latency for noncontiguous data movement for 4MB messages.
Most scientific applications have been written with MPI two-sided programming models, which have been shown to be a good fit for regular and iterative applications. However, it can be very difficult to use the MPI two-sided programming model and maintain performance for irregular, data-driven applications. Although MPI RMA and PGAS programming models present an attractive alternative for designing such applications, few studies are yet available to guide scientists in taking advantage of one-sided communication semantics in scientific applications. This thesis also aims at redesigning applications to use one-sided programming models for better performance than the MPI two-sided programming model. With our MPI RMA-based design, the execution time of Graph500 was reduced by 2X, compared to the existing MPI-based design at 4,096 processes.
Many-core architectures (such as Intel Xeon Phi and IBM POWER) are fueling the growth of next-generation ultra-scale systems with high compute density. The latest Intel Xeon Phi Knights Landing (KNL) architecture provides more than 150 hardware threads per node. In order to fully leverage the performance benefits offered by modern HPC systems, it is critical to redesign runtimes on such systems. Furthermore, the scientific…
Advisors/Committee Members: Panda, Dhabaleswar (Advisor).
Subjects/Keywords: Computer Science; MPI; RMA; HPC; OpenSHMEM; InfiniBand

The Ohio State University
25.
Maddipati, Sai Ratna Kiran.
Improving the Parallel Performance of Boltzman-Transport
Equation for Heat Transfer.
Degree: MS, Computer Science and Engineering, 2016, The Ohio State University
URL: http://rave.ohiolink.edu/etdc/view?acc_num=osu1461334523
In a thermodynamically unstable environment, the
Boltzmann Transport Equation (BTE) describes the heat transfer
rate at each location in the environment, the direction of heat
transfer by the particles of the environment, and the final
equilibrium temperature conditions of the environment. Solving the
BTE is computationally very intensive, so efficient
parallelization is needed. This thesis explains a parallel
implementation of the application, along with brief details of
several techniques that have been used by others in past work. The
implementation went through several code iterations of the BTE
application, each with distinct changes, in order to compare and
analyze the performance and identify the reason for each
performance improvement or deterioration. Experimental results are
then presented to show the resulting performance of the
implementation.
Advisors/Committee Members: Sadayappan, P (Advisor).
Subjects/Keywords: Computer Science; Parallel Computing; OpenMP; MPI

Universidade do Rio Grande do Sul
26.
Cera, Marcia Cristina.
Providing adaptability to MPI applications on current parallel architectures.
Degree: 2012, Universidade do Rio Grande do Sul
URL: http://hdl.handle.net/10183/55464
Adaptability is currently a desired feature of parallel applications. For example, the growing number of users competing for resources on parallel architectures causes constant changes in the set of available processors. Adaptive applications can run on a volatile set of processors, which yields better resource utilization. This adaptive behavior is known as malleability. Another example comes from the constant evolution of multi-core architectures, which increase the number of cores on their chips with each new generation. Adaptability is the key to making parallel programs portable from one machine to another: parallel programs can adapt the extraction of parallelism to the degree of parallelism specific to the target architecture. This behavior can be seen as a particular case of evolvability. Accordingly, this thesis focuses on: (i) malleability, to adapt the execution of parallel applications to changes in processor availability; and (ii) evolvability, to adapt the extraction of parallelism to properties of the architecture and of the input data. The remaining question is therefore "How can adaptive applications be provided and supported?". This thesis aims to answer that question based on MPI (Message-Passing Interface), the standard parallel API for HPC in distributed environments. Our work builds on the MPI-2 features that allow processes to be created at run time, which gives MPI applications some flexibility. Malleable MPI applications use dynamic process creation to expand in growth actions (to use extra processors). Shrink actions (to release processors) terminate the MPI processes running on the requested processors while preserving the application data. Note that malleable applications require support from the runtime environment, since they must be notified about processor availability. Evolving MPI applications follow the explicit task parallelism paradigm to allow adaptation at run time. Dynamic process creation is thus used to extract parallelism, that is, to create new MPI tasks on demand. To provide such applications, we defined abstract MPI tasks, implemented synchronization among them through message exchange, and proposed an approach for adjusting the granularity of MPI tasks, aiming at efficiency in distributed environments. The experimental results validated our hypothesis that adaptive applications can be provided using MPI-2 features. In addition, this thesis identified the requirements at the runtime-environment level for supporting them on clusters. The malleable MPI applications improved the resource utilization of clusters, and the explicit-task applications adapted the extraction of parallelism to the target architecture, showing that this paradigm is also efficient in environments…
Advisors/Committee Members: Navaux, Philippe Olivier Alexandre.
Subjects/Keywords: MPI; Adaptability; Parallel processing; High-performance processing; Malleability; Explicit task parallelism

Universidade do Rio Grande do Sul
27.
Ferreira, Manuela Klanovicz.
Mapeamento estático de processos MPI com emparelhamento perfeito de custo máximo em cluster homogêneo de multi-cores.
Degree: 2012, Universidade do Rio Grande do Sul
URL: http://hdl.handle.net/10183/65636
An important factor in achieving high performance in parallel applications is the distribution of processes over the cores of the system, known as process mapping. Even static process mapping is an NP-hard problem, so heuristics are used that depend on the application and on the hardware onto which it will be mapped. Current architectures can have more than one processor per cluster node and more than one core per processor, so static process mapping can consider at least three levels of communication between processes running on a multi-core cluster: intra-chip, intra-node, and inter-node. This work proposes the MapEME heuristic (Mapeamento Estático MPI com Emparelhamento, static MPI mapping using matching), which uses Maximum Weighted Perfect Matching (MWPM; EPCM in the Portuguese abbreviation) to compute the static mapping of parallel MPI processes onto multi-core processors. The results of the mapping generated by MapEME are compared with those of the mapping generated by the Scotch application, which uses Dual Recursive Bipartitioning (DRB), a heuristic already in use for static process mapping. Both heuristics are compared with Exhaustive Search (ES) to verify how close they are to the optimum. The three methods are compared in terms of their complexity and of their execution-time gain relative to the default distribution of the MPICH2 library. The main contribution of this work is to show that the MWPM heuristic provides a gain of up to 40%, comparable to that of the widely used DRB, and has lower complexity when applied to a multi-core cluster in which every two cores share a level-2 cache.
Advisors/Committee Members: Navaux, Philippe Olivier Alexandre.
Subjects/Keywords: MPI; Process mapping; Parallel processing; Multicore; Process communication; Maximum weighted perfect matching
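As a rough illustration of the pairing idea in record 27 above (this is not the MapEME code, whose details the abstract does not give), the sketch below pairs processes so that the total communication volume inside pairs is maximal; each pair could then be placed on two cores that share an L2 cache. It uses a brute-force search with memoization, which is only practical for a handful of processes. The function name `max_weight_perfect_matching` and the communication matrix `comm` are hypothetical, and a real tool would presumably use a polynomial-time matching algorithm instead.

```python
from functools import lru_cache


def max_weight_perfect_matching(comm):
    """Pair up an even number of processes to maximize in-pair communication.

    comm[i][j] is an assumed symmetric communication volume between
    processes i and j. Returns (total_weight, pairs).
    """
    n = len(comm)
    if n % 2:
        raise ValueError("need an even number of processes")

    @lru_cache(maxsize=None)
    def best(mask):
        # Bit i of mask is set while process i is still unpaired.
        if mask == 0:
            return 0, ()
        i = (mask & -mask).bit_length() - 1   # lowest unpaired process
        rest = mask & ~(1 << i)
        best_w, best_pairs = float("-inf"), ()
        candidates = rest
        while candidates:
            j = (candidates & -candidates).bit_length() - 1
            candidates &= candidates - 1      # drop j from the candidates
            w, pairs = best(rest & ~(1 << j))
            w += comm[i][j]
            if w > best_w:
                best_w, best_pairs = w, ((i, j),) + pairs
        return best_w, best_pairs

    return best((1 << n) - 1)


# Illustrative (made-up) communication volumes for four processes.
comm = [
    [0, 10, 1, 2],
    [10, 0, 3, 1],
    [1, 3, 0, 8],
    [2, 1, 8, 0],
]
weight, pairs = max_weight_perfect_matching(comm)
print(weight, pairs)  # prints: 18 ((0, 1), (2, 3))
```

Here processes 0 and 1 exchange the most data, as do 2 and 3, so the matching keeps each of those pairs together; the other two possible pairings score only 2 and 5.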

Universidade do Rio Grande do Sul
28.
Neves, Marcelo Veiga.
Modelagem e dimensionamento do custo de migração de processos em programas MPI.
Degree: 2009, Universidade do Rio Grande do Sul
URL: http://hdl.handle.net/10183/18248
Process migration is important in MPI programs for several reasons, such as process rescheduling, load balancing, and fault tolerance. Regardless of how migration is used, knowing the cost of this operation is a pertinent problem. Whenever migration is used to reduce the execution time of a parallel application, its cost becomes a critical point. Some solutions for process migration in MPI programs are currently available. However, there is still no study that quantifies the migration cost and its impact on the execution of MPI programs. In this context, this work presents a study for modeling and dimensioning the process migration cost in MPI programs. First, we identified, analyzed, evaluated and, when needed, adapted the main solutions currently available for migrating MPI processes. Based on these solutions, we defined cost models that can be used to dynamically estimate migration costs and to guide scheduling decisions. The models were used to predict the migration cost in parallel applications, and the result was compared to the observed migration costs. In this comparison, the predicted values were very close to those observed in the experiment, demonstrating the quality of the predictions of the proposed models. This work also evaluates the impact of a migration on the execution of real parallel applications, in order to verify the viability of applying this approach to improve performance.
Advisors/Committee Members: Maillard, Nicolas Bruno.
Subjects/Keywords: Process migration; MPI; Cost modeling; Dynamic process scheduling; Parallel processing
29.
Rezvoy, Clément.
Large Scale Parallel Inference of Protein and Protein Domain families : Inférence des familles de protéines et de domaines protéiques à grande échelle.
Degree: Docteur es, Informatique, 2011, Lyon, École normale supérieure
URL: http://www.theses.fr/2011ENSL0643
Protein domains are recurring independent segments of proteins. The combinatorial arrangement of domains is at the root of the functional and structural diversity of proteins. Several methods have been developed to infer protein domain decomposition and domain family clustering from sequence information alone. MkDom2 is one of those methods. MkDom2 infers domain families in a greedy fashion: families are inferred one after the other in order to create a delineation of domains on proteins and a clustering of those domains into families. MkDom2 is instrumental in building the ProDom database and is essential for updating it. The exponential growth of the number of sequences to process has rendered MkDom2 obsolete; it would now take several years to compute a new release of ProDom. We present a new algorithm, MPI_MkDom2, which allows several families to be computed at once across a distributed computing platform. MPI_MkDom2 is an asynchronous distributed algorithm that manages load balancing to ensure efficient platform usage; it ensures the creation of a non-overlapping partitioning of the whole protein set. A new proximity measure between domain classifications is defined to assess the effect of the parallel computation on the result. We also propose a second algorithm, MPI_MkDom3, which allows the simultaneous computation of a clustering of protein domains and of full proteins into families sharing the same domain decomposition.
Advisors/Committee Members: Vivien, Frédéric (thesis director), Kahn, Daniel (thesis director).
Subjects/Keywords: Bioinformatics; Protein; Domain; MPI; Distributed computing; Clustering

Universidade do Rio Grande do Sul
30.
Afonso, Fernando Abrahão.
MPI2.NET : criação dinâmica de tarefas com orientação a objetos.
Degree: 2010, Universidade do Rio Grande do Sul
URL: http://hdl.handle.net/10183/26952
Message Passing Interface (MPI) is the de facto standard for the development of parallel, high-performance applications that run on clusters. The standard defines APIs for the programming languages Fortran, C, and C++. On the other hand, object-oriented programming has become the dominant programming paradigm, and programming languages such as Java and C# have become very popular. This can be explained by the abstractions these languages offer to ease programming, which allow a more efficient programming/maintenance cycle. Because of this, several MPI libraries have emerged for these programming languages. Among them, we can highlight the MPI.NET library for the C# programming language, which has the best trade-off between abstraction and performance. In parallel computing, the model used for the development of applications is very important; the Divide and Conquer model is scalable, applicable to many problems, and allows efficient execution of applications whose workload is unknown or irregular. To program using this model, the execution environment must support dynamism, which the MPI.NET library does not provide. From this scenario emerges the main motivation of this work, whose goal is to explore dynamic task creation in the MPI.NET library. In the end, we were able to obtain a library with performance competitive with that of MPI libraries for C++.
Advisors/Committee Members: Maillard, Nicolas Bruno.
Subjects/Keywords: High-performance processing; Dynamic task creation; MPI; High performance computing; Parallel processing; Parallel computing