DRAM errors in the wild

Bianca Schroeder; Eduardo Pinheiro; Wolf-Dietrich Weber

Abstract: Errors in dynamic random access memory (DRAM) are a common form of hardware failure in modern compute clusters. Failures are costly both in terms of hardware replacement costs and service disruption. While a large body of work exists on DRAM in laboratory conditions, little has been reported on real DRAM failures in large production clusters. In this paper, we analyze measurements of memory errors in a large fleet of commodity servers over a period of 2.5 years. The collected data covers multiple vendors, DRAM capacities and technologies, and comprises many millions of DIMM days.

Year of publication: 2009

Authors: Bianca Schroeder, Eduardo Pinheiro, Wolf-Dietrich Weber

Keywords: Radiation Effects in Electronics, Semiconductor materials and devices, Distributed systems and fault tolerance

