Advanced search options

Advanced Search Options 🞨

Browse by author name (“Author name starts with…”).

Find ETDs with:

in
/  
in
/  
in
/  
in

Written in Published in Earliest date Latest date

Sorted by

Results per page:

Sorted by: relevance · author · university · dateNew search

You searched for subject:(checkpoint corruption). Showing records 1 – 3 of 3 total matches.

Search Limiters

Last 2 Years | English Only

No search limiters apply to these results.

▼ Search Limiters

1. Calhoun, Jon Cameron. From detection to optimization: impact of soft errors on high-performance computing applications.

Degree: PhD, Computer Science, 2017, University of Illinois – Urbana-Champaign

As high-performance computing (HPC) continues to progress, constraints on HPC system design forces the handling of errors to higher levels in the software stack. Of the types of errors facing HPC, soft errors that silently corrupt system or application state are among the most severe. The behavior of HPC applications in the presence of soft errors is critical to gain insight for effective utilization of HPC systems. The need to understand this behavior can be used in developing algorithm-based error detection guided by application characteristics from fault injection and error propagation studies. Furthermore, the realization that applications are tolerant to small errors allows optimizations such as lossy compression on high-cost data transfers. Lossy compression adds small user controllable amounts of error when compressing data, to reduce data size before expensive data transfers saving time. This dissertation investigates and improves the resiliency of HPC applications to soft errors, and explores lossy compression as a new form of optimization for expensive, time-consuming data transfers. Advisors/Committee Members: Snir, Marc (advisor), Olson, Luke N (Committee Chair), Gropp, William (committee member), Cappello, Franck (committee member).

Subjects/Keywords: High-performance computing; Fault tolerance; Silent data corruption; Soft errors; Error detection; Error recovery; Fault injection; Error propagation; Lossy compression; Checkpoint-restart

…Average corruption in selected variables on rank 3 for Jacobi within 90% confidence… …Average corruption in selected variables on rank 3 for HPCCG within 90% confidence… …3.20 Average corruption in selected variables on rank 3 for CoMD… …Overhead of weak scaling applications instrumented to track corruption propagation… …Weak scaling of applications with and without instrumentation for corruption propagation… 

Record DetailsSimilar RecordsGoogle PlusoneFacebookTwitterCiteULikeMendeleyreddit

APA · Chicago · MLA · Vancouver · CSE | Export to Zotero / EndNote / Reference Manager

APA (6th Edition):

Calhoun, J. C. (2017). From detection to optimization: impact of soft errors on high-performance computing applications. (Doctoral Dissertation). University of Illinois – Urbana-Champaign. Retrieved from http://hdl.handle.net/2142/98379

Chicago Manual of Style (16th Edition):

Calhoun, Jon Cameron. “From detection to optimization: impact of soft errors on high-performance computing applications.” 2017. Doctoral Dissertation, University of Illinois – Urbana-Champaign. Accessed April 20, 2021. http://hdl.handle.net/2142/98379.

MLA Handbook (7th Edition):

Calhoun, Jon Cameron. “From detection to optimization: impact of soft errors on high-performance computing applications.” 2017. Web. 20 Apr 2021.

Vancouver:

Calhoun JC. From detection to optimization: impact of soft errors on high-performance computing applications. [Internet] [Doctoral dissertation]. University of Illinois – Urbana-Champaign; 2017. [cited 2021 Apr 20]. Available from: http://hdl.handle.net/2142/98379.

Council of Science Editors:

Calhoun JC. From detection to optimization: impact of soft errors on high-performance computing applications. [Doctoral Dissertation]. University of Illinois – Urbana-Champaign; 2017. Available from: http://hdl.handle.net/2142/98379

2. Campbell, Keith A. Low-cost error detection through high-level synthesis.

Degree: MS, Electrical & Computer Engineering, 2015, University of Illinois – Urbana-Champaign

System-on-chip design is becoming increasingly complex as technology scaling enables more and more functionality on a chip. This scaling and complexity has resulted in a variety of reliability and validation challenges including logic bugs, hot spots, wear-out, and soft errors. To make matters worse, as we reach the limits of Dennard scaling, efforts to improve system performance and energy efficiency have resulted in the integration of a wide variety of complex hardware accelerators in SoCs. Thus the challenge is to design complex, custom hardware that is efficient, but also correct and reliable. High-level synthesis shows promise to address the problem of complex hardware design by providing a bridge from the high-productivity software domain to the hardware design process. Much research has been done on high-level synthesis efficiency optimizations. This thesis shows that high-level synthesis also has the power to address validation and reliability challenges through two solutions. One solution for circuit reliability is modulo-3 shadow datapaths: performing lightweight shadow computations in modulo-3 space for each main computation. We leverage the binding and scheduling flexibility of high-level synthesis to detect control errors through diverse binding and minimize area cost through intelligent checkpoint scheduling and modulo-3 reducer sharing. We introduce logic and dataflow optimizations to further reduce cost. We evaluated our technique with 12 high-level synthesis benchmarks from the arithmetic-oriented PolyBench benchmark suite using FPGA emulated netlist-level error injection. We observe coverages of 99.1% for stuck-at faults, 99.5% for soft errors, and 99.6% for timing errors with a 25.7% area cost and negligible performance impact. Leveraging a mean error detection latency of 12.75 cycles (4150x faster than end result check) for soft errors, we also explore a rollback recovery method with an additional area cost of 28.0%, observing a 175x increase in reliability against soft errors. Another solution for rapid post-silicon validation of accelerator designs is Hybrid Quick Error Detection (H-QED): inserting signature generation logic in a hardware design to create a heavily compressed signature stream that captures the internal behavior of the design at a fine temporal and spatial granularity for comparison with a reference set of signatures generated by high-level simulation to detect bugs. Using H-QED, we demonstrate an improvement in error detection latency (time elapsed from when a bug is activated to when it manifests as an observable failure) of two orders of magnitude and a threefold improvement in bug coverage compared to traditional post-silicon validation techniques. H-QED also uncovered previously unknown bugs in the CHStone benchmark suite, which is widely used by the HLS community. H-QED incurs less than 10% area overhead for the accelerator it validates with negligible performance impact, and we also introduce techniques to minimize any possible intrusiveness introduced by H-QED. Advisors/Committee Members: Chen, Deming (advisor).

Subjects/Keywords: High-level synthesis; Automation; error detection; scheduling; binding; compiler transformation; compiler optimization; pipelining; modulo arithmetic; logic optimization; state machine; datapath, control logic; shadow logic; low cost; high performance; electrical bugs; Aliasing; stuck-at faults; soft errors; timing errors; checkpointing; rollback; recovery; post-silicon validation; Accelerators; system on a chip; signature generation; execution signatures; execution hashing; logic bugs; nondeterministic bugs; masked errors; circuit reliability; hot spots; wear out; silent data corruption; observability; detection latency; mixed datapath; diversity; checkpoint corruption; error injection; error removal; Quick Error Detection (QED); Hybrid Quick Error Detection (H-QED); hybrid hardware/software; execution tracing; address conversion; undefined behavior; High-Level Synthesis (HLS) engine bugs; detection coverage

…elements, there can be a third “limbo” state known as silent data corruption. In this state, the… …stored in memory indefinitely, there is no limit to how long silent data corruption can last… …For each liveness interval, we attempt to schedule a checkpoint at each use (read)… …be masked or aliased. If the checkpoint cannot be scheduled at a state, we attempt to… …reducers allocated. 3.1.3 Recovery To enable error recovery for soft errors, we use a checkpoint… 

Record DetailsSimilar RecordsGoogle PlusoneFacebookTwitterCiteULikeMendeleyreddit

APA · Chicago · MLA · Vancouver · CSE | Export to Zotero / EndNote / Reference Manager

APA (6th Edition):

Campbell, K. A. (2015). Low-cost error detection through high-level synthesis. (Thesis). University of Illinois – Urbana-Champaign. Retrieved from http://hdl.handle.net/2142/89068

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Chicago Manual of Style (16th Edition):

Campbell, Keith A. “Low-cost error detection through high-level synthesis.” 2015. Thesis, University of Illinois – Urbana-Champaign. Accessed April 20, 2021. http://hdl.handle.net/2142/89068.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

MLA Handbook (7th Edition):

Campbell, Keith A. “Low-cost error detection through high-level synthesis.” 2015. Web. 20 Apr 2021.

Vancouver:

Campbell KA. Low-cost error detection through high-level synthesis. [Internet] [Thesis]. University of Illinois – Urbana-Champaign; 2015. [cited 2021 Apr 20]. Available from: http://hdl.handle.net/2142/89068.

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Council of Science Editors:

Campbell KA. Low-cost error detection through high-level synthesis. [Thesis]. University of Illinois – Urbana-Champaign; 2015. Available from: http://hdl.handle.net/2142/89068

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

3. Campbell, Keith A. Robust and reliable hardware accelerator design through high-level synthesis.

Degree: PhD, Electrical & Computer Engr, 2017, University of Illinois – Urbana-Champaign

System-on-chip design is becoming increasingly complex as technology scaling enables more and more functionality on a chip. This scaling-driven complexity has resulted in a variety of reliability and validation challenges including logic bugs, hot spots, wear-out, and soft errors. To make matters worse, as we reach the limits of Dennard scaling, efforts to improve system performance and energy efficiency have resulted in the integration of a wide variety of complex hardware accelerators in SoCs. Thus the challenge is to design complex, custom hardware that is efficient, but also correct and reliable. High-level synthesis shows promise to address the problem of complex hardware design by providing a bridge from the high-productivity software domain to the hardware design process. Much research has been done on high-level synthesis efficiency optimizations. This dissertation shows that high-level synthesis also has the power to address validation and reliability challenges through three automated solutions targeting three key stages in the hardware design and use cycle: pre-silicon debugging, post-silicon validation, and post-deployment error detection. Our solution for rapid pre-silicon debugging of accelerator designs is hybrid tracing: comparing a datapath-level trace of hardware execution with a reference software implementation at a fine temporal and spatial granularity to detect logic bugs. An integrated backtrace process delivers source-code meaning to the hardware designer, pinpointing the location of bug activation and providing a strong hint for potential bug fixes. Experimental results show that we are able to detect and aid in localization of logic bugs from both C/C++ specifications as well as the high-level synthesis engine itself. A variation of this solution tailored for rapid post-silicon validation of accelerator designs is hybrid hashing: inserting signature generation logic in a hardware design to create a heavily compressed signature stream that captures the internal behavior of the design at a fine temporal and spatial granularity for comparison with a reference set of signatures generated by high-level simulation to detect bugs. Using hybrid hashing, we demonstrate an improvement in error detection latency (time elapsed from when a bug is activated to when it manifests as an observable failure) of two orders of magnitude and a threefold improvement in bug coverage compared to traditional post-silicon validation techniques. Hybrid hashing also uncovered previously unknown bugs in the CHStone benchmark suite, which is widely used by the HLS community. Hybrid hashing incurs less than 10% area overhead for the accelerator it validates with negligible performance impact, and we also introduce techniques to minimize any possible intrusiveness introduced by hybrid hashing. Finally, our solution for post-deployment error detection is modulo-3 shadow datapaths: performing lightweight shadow computations in modulo-3 space for each main computation. We leverage the binding and scheduling… Advisors/Committee Members: Chen, Deming (advisor), Chen, Deming (Committee Chair), Hwu, Wen-Mei W (committee member), Wong, Martin D F (committee member), Kim, Nam Sung (committee member).

Subjects/Keywords: High-level synthesis (HLS); Automation; Error detection; Scheduling; Binding; Compiler transformation; Compiler optimization; Pipelining; Modulo arithmetic; Modulo-3; Logic optimization; State machine; Datapath; Control logic; Shadow datapath; Modulo datapath; Low cost; High performance; Electrical bug; Aliasing; Stuck-at fault; Soft error; Timing error; Checkpointing; Rollback; Recovery; Pre-silicon validation; Post-silicon validation; Pre-silicon debug; Post-silicon debug; Accelerator; System on a chip; Signature generation; Execution signature; Execution hash; Logic bug; Nondeterministic bug; Masked error; Circuit reliability; Hot spot; Wear out; Silent data corruption; Observability; Detection latency; Mixed datapath; Diversity; Checkpoint corruption; Error injection; Error removal; Quick Error Detection (QED); Hybrid Quick Error Detection (H-QED); Instrumentation; Hybrid co-simulation; Hardware/software; Integration testing; Hybrid tracing; Hybrid hashing; Source-code localization; Software debugging tool; Valgrind; Clang sanitizer; Clang static analyzer; Cppcheck; Root cause analysis; Execution tracing; Realtime error detection; Simulation trigger; Nonintrusive; Address conversion; Undefined behavior; High-level synthesis (HLS) bug; Detection coverage; Gate-level architecture; Mersenne modulus; Full adder; Half adder; Quarter adder; Wraparound; Modulo reducer; Modulo adder; Modulo multiplier; Modulo comparator; Cross-layer; Algorithm; Instruction; Architecture; Logic synthesis; Physical design; Algorithm-based fault tolerance (ABFT); Error detection by duplicated instructions (EDDI); Parity; Flip-flop hardening; Layout design through error-aware transistor positioning dual interlocked storage cell (LEAP-DICE); Cost-effective; Place-and-route; Field programmable gate array (FPGA) emulation; Application specific integrated circuit (ASIC); Field programmable gate array (FPGA); Energy; Area; Latency

…languages SDC Silent Data Corruption SEC Statistical Error Compensation SEMU Single Event… …silent data corruption. In this state, the error has changed the internal behavior of the… …limit to how long silent data corruption can last. While unmasked errors are clearly the most… 

Record DetailsSimilar RecordsGoogle PlusoneFacebookTwitterCiteULikeMendeleyreddit

APA · Chicago · MLA · Vancouver · CSE | Export to Zotero / EndNote / Reference Manager

APA (6th Edition):

Campbell, K. A. (2017). Robust and reliable hardware accelerator design through high-level synthesis. (Doctoral Dissertation). University of Illinois – Urbana-Champaign. Retrieved from http://hdl.handle.net/2142/99294

Chicago Manual of Style (16th Edition):

Campbell, Keith A. “Robust and reliable hardware accelerator design through high-level synthesis.” 2017. Doctoral Dissertation, University of Illinois – Urbana-Champaign. Accessed April 20, 2021. http://hdl.handle.net/2142/99294.

MLA Handbook (7th Edition):

Campbell, Keith A. “Robust and reliable hardware accelerator design through high-level synthesis.” 2017. Web. 20 Apr 2021.

Vancouver:

Campbell KA. Robust and reliable hardware accelerator design through high-level synthesis. [Internet] [Doctoral dissertation]. University of Illinois – Urbana-Champaign; 2017. [cited 2021 Apr 20]. Available from: http://hdl.handle.net/2142/99294.

Council of Science Editors:

Campbell KA. Robust and reliable hardware accelerator design through high-level synthesis. [Doctoral Dissertation]. University of Illinois – Urbana-Champaign; 2017. Available from: http://hdl.handle.net/2142/99294

.