Cross-layer Accelerated Self-healing (CLASH)
Cross-Layer Accelerated Self-Healing: Circadian Rhythms for Resilient Electronic Systems
This is a cover page for the HPLP Research - CLASH project, funded by NSF and SRC.
- Xinfei Guo (email@example.com)
- Alec Roelke (firstname.lastname@example.org)
- Mircea Stan (email@example.com)
A major challenge in recent chip design is coping with process, voltage, temperature and aging (PVTA) variations. These variations require increased design margins or alternative circuit designs to minimize their effects, which often degrade power, performance, and area (PPA) metrics. Aging in particular causes a gradual degradation of these metrics while a circuit is under stress and eventually leads to catastrophic failure. Since this degradation does not scale down with technology size, it is a growing concern as its effect on electronic circuits increases. However, many of the mechanisms involved in aging show evidence that a certain amount of recovery can be achieved by removing the conditions that caused the stress in the first place. These mechanisms are not well understood, so their recovery processes cannot yet be exploited effectively.
This project approaches the issue by developing cross-layer models for aging degradation and recovery, from device-level models for mechanisms such as bias temperature instability (BTI) to architecture-level models for power and performance degradation. With these models, CLASH proposes periodic active recovery, an idea inspired by circadian rhythms in biological systems, where an animal regularly undergoes a period of deep rejuvenation (sleep). In current electronic systems, "sleeping" means that the system is merely inactive; by adding environmental conditions such as elevated temperature, recovery can be accelerated and the system can recover beyond what inactivity alone would achieve.
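The difference between passive sleep and accelerated recovery can be illustrated with a deliberately simple sketch. This is a toy model and not one of the project's cross-layer models: the linear stress accumulation, the per-step recovery rates, the arbitrary degradation units, and the function name `residual_degradation` are all assumptions made purely for illustration.

```python
def residual_degradation(cycles, stress_steps, sleep_steps, recovery_rate):
    """Toy model: degradation (arbitrary units) left after repeated
    stress/sleep cycles.

    Assumptions (hypothetical, for illustration only):
    - each stress step adds one unit of degradation;
    - each sleep step removes a fixed fraction `recovery_rate` of the
      degradation present at that time.
    """
    degradation = 0.0
    for _ in range(cycles):
        # Active period: degradation accumulates while under stress.
        degradation += stress_steps * 1.0
        # Sleep period: a fraction of the degradation recovers each step.
        degradation *= (1.0 - recovery_rate) ** sleep_steps
    return degradation


# Same schedule, three hypothetical recovery rates.
no_recovery = residual_degradation(10, 16, 8, recovery_rate=0.0)
passive = residual_degradation(10, 16, 8, recovery_rate=0.02)
accelerated = residual_degradation(10, 16, 8, recovery_rate=0.20)

print(no_recovery, passive, accelerated)
```

Under these assumptions, a higher per-step recovery rate (standing in for accelerated conditions such as elevated temperature) leaves less residual degradation after the same sleep schedule. Real aging and recovery behavior is more complex and is what the project's models aim to capture.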
Models for aging and recovery will be developed for memory (Flash and SRAM), FPGAs, and processors. These models, along with experimental data from production and custom chips, will be used to demonstrate the effects of aging as well as the improvements in lifetime and PPA metrics that can be achieved using periodic active recovery. This project is funded by Intel, AMD, and the National Science Foundation.
- Adapteva Parallella: Applying active accelerated self-healing techniques on the Parallella platform
- [C3] X. Guo, M. Stan, "Work hard, sleep well - Avoid irreversible IC wearout with proactive rejuvenation," Proc. of the ACM/IEEE Asia and South Pacific Design Automation Conference (ASP-DAC), Macau, China, to appear.
- [C2] X. Guo, M. Stan, "MCPENS: Multiple-Critical-Path Embeddable NBTI Sensors for Dynamic Wearout Management," Proc. of 11th IEEE Workshop on Silicon Errors in Logic–System Effects (SELSE-11), pp. 116-121, Austin, TX, April 2015.
- [C1] X. Guo, W. Burleson, M. Stan, "Modeling and Experimental Demonstration of Accelerated Self-Healing Techniques," In Proc. of ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, June 2014.
- [P5] X. Guo, M. Stan, "Towards Wearout-aware and Accelerated Self-healing Digital Systems," In SIGDA PhD Forum, Design Automation Conference (DAC), San Francisco, CA, June 2015.
- [P4] X. Guo, M. Stan, "Towards Aging-aware and Self-healing VLSI Chips and Systems," the 11th Annual University of Virginia Engineering Research Symposium (UVERS), Charlottesville, VA, March 2015.
- [P3] M. Stan, X. Guo, A. Roelke, "Modeling and Experimental Demonstration of Accelerated Self-Healing Techniques in CMOS Circuits," Proc. of the Government Microcircuit Applications & Critical Technology Conference (GOMAC Tech), St. Louis, MO, March 2015.
- [P2] X. Guo, M. Stan, "Exploring Accelerated Self-Healing Techniques for Electronic Chips and Systems," the 10th Annual University of Virginia Engineering Research Symposium (UVERS), Charlottesville, VA, March 2014.
- [P1] X. Guo and M. Stan, "Aging effect in FPGA chips and systems," A. Richard Newton Young Fellow Poster Session, Design Automation Conference (DAC), Austin, TX, June 2013.
This work was supported in part by NSF grant CCF-1255907, SRC task 2410.001 and the Center for Future Architectures Research (C-FAR), one of the six SRC STARnet Centers, sponsored by MARCO and DARPA.