Conference: Memory Errors in Modern Systems

Introduced: 24-11-2014
The HPC (CAP) research group invites you to attend this talk.
Speaker: Vilas Sridharan (AMD)
Date: Fri, 12/Dec/2014, 10:00
Room: C6-E101
ABSTRACT
Several recent publications have shown that hardware faults in the memory subsystem are commonplace. These faults are predicted to become more frequent in future systems that contain orders of magnitude more DRAM and SRAM than found in current memory subsystems. These memory subsystems will need to provide resilience techniques to tolerate these faults when deployed in high-performance computing systems and data centers containing tens of thousands of nodes. In this talk, I will focus on lessons about hardware reliability learned from systems in the field, and use our data to project whether hardware reliability will become a larger problem at exascale.

BIOGRAPHY
Vilas Sridharan works in the RAS (Reliability, Availability, and Serviceability) Architecture and Strategy group at AMD, Inc., where he is responsible for defining the reliability features of AMD server products. He received his Ph.D. and M.S.E. from the Department of Electrical and Computer Engineering at Northeastern University, and his B.S.E. in Computer Engineering from Princeton University in 2000. From 2000 to 2004, he worked in the SPARC server division at Sun Microsystems. His research focuses on system reliability and fault tolerance, specifically on modeling the effects of soft errors on systems and on system recovery techniques.

© FIB, Barcelona School of Informatics