ORNL-TechInt-Page/nvm.src at master · ORNL-TechInt/ORNL-TechInt-Page · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
%include('html_intro.inc')
    <meta name="keywords" content="projects">
    <meta name="description" content="NCCS TechInt group thrust area">
    <title>Technology Integration Group: Non-Volatile Memory</title>
%include('hdbody.inc')

%include('hdrnav.inc')

<!-- ========================================================== -->
%include('content_intro.inc')
<div class="overview">
In the following projects, we explore non-volatile memory devices from
several perspectives including fault tolerance, memory extension, and
out-of-core data analytics.
</div>
<div class="plinks">
<a href="#SSDlifetime">SSD Lifetime Analysis under Checkpointing and
Memory Extension Workload</a>
<br><a href="#NVMalloc">NVMalloc</a>
| <a href="#ActiveProcessing">Active Processing</a>
| <a href="#Emerging">Emerging Storage Technologies</a>
</div>

%include('content_close.inc')

<!-- ========================================================== -->
%include('content_intro.inc')

    <h2 id="SSDlifetime" class="title">SSD Lifetime Analysis under
    Checkpointing and Memory Extension Workload</h2>

    <p>Future extreme-scale compute systems will take advantage of
    non-volatile memory technology for both memory extension and
    checkpointing. While SSD devices offer desirable properties such
    as lower cost ($/GB) and power consumption than DRAM, they suffer
    from poor random write performance and write endurance problems.
    Specifically the SSD lifetime can be highly influenced by the
    access pattern, the level of over-provisioning, and the choices of
    garbage collection reclaiming policy and wear-leveling techniques
    on the SSDs. In this project, we explore the SSD lifetime and
    performance issues for various scientific applications on
    extreme-scale systems that employ SSDs for both memory extension
    as well as checkpointing. The work involves several areas.

    <ul>
    <li>Characterizing memory accesses and checkpoint I/O of HPC applications
    on the SSDs.

    <li>SSD lifetime analysis for different scientific applications.

    <li>Developing techniques to improve the SSD lifetime in both checkpoint
    and memory extension libraries.

    </ul>

    <p>Contributors: Chao Wang, and Sudharshan Vazhkudai

  <p>&nbsp; <a class="right" href="#">Top</a>
%include('content_close.inc')

<!-- ========================================================== -->
%include('content_intro.inc')
    <h2 id="NVMalloc" class="title"><a href="http://users.nccs.gov/~vazhkuda/NVMalloc.html"
    target="other">NVMalloc</a></h2>


<p>DRAM is a precious resource in extreme-scale machines and is
increasingly becoming scarce, mainly due to the growing number of
cores per node. On future multi-petaflop and exaflop machines, the
memory pressure is likely to be so severe that we need to rethink our
memory usage models. Fortunately, the advent of non-volatile memory
(NVM) offers a unique opportunity in this space. Current NVM offerings
possess several desirable properties, such as low cost and power
efficiency, but also suffer from high latency and lifetime issues. We
need rich techniques to be able to use them alongside DRAM. NVMalloc
is an approach for exploiting NVM as a secondary memory partition so
that applications can explicitly allocate and manipulate memory
regions therein. More specifically, it is a middleware library with a
suite of services that enables applications to access node-local or
distributed NVM storage systems in a seamless fashion. The work
inovolves several areas.


<ul>

  <li>Exploring NVM as a secondary memory partition for HPC systems

  <li>Developing the NVMalloc library with a suite of services that
  enables applications to access NVM systems

  <li>Studying the performance of NVMalloc with various HPC benchmark
  suites

  <li>Developing data management techniques to efficiently use the
  NVM partition and DRAM partition

</ul>

    <p>Contributors: Chao Wang, and Sudharshan Vazhkudai

    <p>Status: On-going

<p>Selected publications
    <ul>

      <li>
      Chao Wang, Sudharshan Vazhkudai, Xiaosong Ma, Fei Meng, Youngjae
      Kim, Christian Engelmann, "NVMalloc: Exposing an Aggregate SSD
      Store as a Memory Partition in Extreme-Scale Machines",
      <i>Proceedings of the 26th IEEE Int'l Parallel and Distributed
      Processing Symposium (IPDPS 2012)</i>, Shanghai, China, May
      21-25, 2012. (118/569 = 20.7%)
      <a href="http://users.nccs.gov/~vazhkuda/NVMalloc.pdf">pdf</a>

    </ul>
  <p>&nbsp; <a class="right" href="#">Top</a>
%include('content_close.inc')

<!-- ========================================================== -->
%include('content_intro.inc')
<h2 id="ActiveProcessing">Active Processing</h2>

<h3 id="ActiveFlash">Active Flash</h3>

Modern scientific discovery is increasingly driven by large-scale
supercomputing simulations, followed by data analysis tasks. These
data analyses are either performed offline, on smaller-scale clusters,
or on the supercomputer itself. Unfortunately, these techniques suffer
from performance and energy inefficiencies due to increased data
movement between the compute and storage subsystems. We propose Active
Flash, an in-situ scientific data analysis approach, wherein data
analysis is conducted on the solid-state device (SSD), where the data
already resides. Active Flash has the potential to enable true
out-of-core data analytics by freeing up both the compute core and the
associated main memory. The work involves several areas.

<ul>
  <li>Developing analytical models for the performance and energy
  feasibility of Active Flash, and comparing those models against traditional
  in-situ and offline data analysis schemes

  <li>Building an Active Flash prototype to demonstrate its viability

  <li>Studying co-scheduling of analyses and internal flash management
  activities on the SSD controller

  <li>Defining efficient representations of data object interfaces for
  Active Flash
</ul>

<p>Contributors: Sudharshan Vazhkudai

<p>Status: On-going

<p>Selected publications
    <ul>

      <li>Devesh Tiwari, Simona Bobila, Sudharshan Vazhkudai, Youngjae
      Kim, Xiaosong Ma, Peter Desnoyers, Yan Solin, "Active Flash:
      Towards Energy-Efficient, In-Situ Data Analytics on
      Extreme-Scale Machines,", <i>(To appear) Proceedings of the
      USENIX Conference on File and Storage Technologies (FAST
      2013)</i>, February, 2013. (24/127 = 18.9%)

      <li>Simona Boboila, Youngjae Kim, Sudharshan Vazhkudai, Peter
      Desnoyers, Galen M. Shipman, "Active Flash: Out-of-core Data
      Analytics on Flash Storage", <i>(Proceedings of the 28th IEEE
      Symposium on Massive Storage Systems and Technologies (MSST
      2012)</i>, Monterey, CA, April 16-20, 2012. (14/57 = 24.5%)

    </ul>

<h3 id="AnalyzeThis">AnalyzeThis</h3>

<p>
The need for novel data analysis is urgent in the face of
a data deluge from modern applications. Traditional approaches
to data analysis incur significant data movement
costs, moving data back and forth between the storage system
and the processor. Emerging Active Flash devices enable
processing on the flash, where the data already resides.
An array of such Active Flash devices allows us to revisit how
analysis workflows interact with storage systems. By seamlessly
blending together the flash storage and data analysis,
we create an analysis workflow-aware storage system, AnalyzeThis.
Our guiding principle is that analysis-awareness be
deeply ingrained in each and every layer of the storage, elevating
data analyses as first-class citizens, and transforming
AnalyzeThis into a potent analytics-aware appliance. We
implement the AnalyzeThis storage system atop an emulation
platform of the Active Flash array. Our results indicate
that AnalyzeThis is viable, expediting workflow execution
and minimizing data movement.

<p>
TechInt contributors: Sudharshan Vazhkudai, Hyogi Sim

<p>
Selected publications:

<ul>
<li>
Hyogi Sim, Youngjae Kim, Sudharshan S. Vazhkudai, Devesh Tiwari, Ali
Anwar, Ali R. Butt, Lavanya Ramakrishnan, "AnalyzeThis: An Analysis
Workflow-Aware Storage System," <i>Proceedings of Supercomputing 2015
(SC15): 28th Int'l Conference on High Performance Computing,
Networking, Storage and Analysis</i>, Austin, TX, November 2015.
<a href="http://users.nccs.gov/~vazhkuda/AnalyzeThis.pdf">pdf</a>
</ul>

<h3 id="AnalyzeThat">AnalyzeThat</h3>

<p>
AnalyzeThat is a programmable shared-memory system for parallel data
processing with PIM (processing-in-memory) devices.

<p>
Processing In Memory (PIM), the concept of integrating processing directly
with memory, has been attracting a lot of attention since PIM can assist in
overcoming the throughput limitation caused by data movement between CPU and
memory.  The challenge, however, is that it requires the programmers to have a
deep understanding of the PIM architecture to maximize the benefits such as
data locality and parallel thread execution on multiple PIM devices.  In this
study, we present AnalyzeThat, a programmable shared-memory system for
parallel data processing with PIM devices.  Thematic to AnalyzeThat is a rich
PIM-Aware Data Structure (PADS), which is an encapsulation that integrally
ties together the data, the analysis tasks and the runtime needed to interface
with the PIM device array.  The PADS abstraction provides (i) a key-value data
container that allows programmers to easily store data on multiple PIMs, (ii)
a suite of parallel operations with which users can easily implement data
analysis applications, and (iii) a runtime, hidden to programmers, which
provides the mechanisms needed to overlay both the data and the tasks on the
PIM device array in an intelligent fashion, based on PIM-specific information
collected from the hardware.  We have developed a PIM emulation framework
called AnalyzeThat.  Our experimental evaluation with representative data
analytics applications suggests that the proposed system can significantly
reduce the PIM programming effort without losing its technology benefits.

<p>
Contributors: Hyogi Sim, Sudharshan Vazhkudai

<p>
Selected publications:

<ul>
<li>
Sangkuen Lee, Hyogi Sim, Youngjae Kim, Sudharshan S. Vazhkudai,
"AnalyzeThat: A Programmable Shared-Memory System for an Array of
Processing-In-Memory Devices",
<i>IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
(CCGRID '17)</i>, Madrid, Spain, May 2017
</ul>

  <p>&nbsp; <a class="right" href="#">Top</a>

%include('content_close.inc')

<!-- ========================================================== -->
%include('content_intro.inc')
<h2 id="Emerging">Emerging Storage Technologies</h2>

<p>
Hard disk drives (HDDs) have been the preferred media for data storage
in high-performance computing. A center-wide file system at ORNL has
deployed 13,440 HDDs for back-end storage systems, running the Lustre
parallel file system. However, there are shortcomings inherent to
HDDs. Alongside improvements in HDD technology, significant advances
have also been made in various forms of solid-state memory such as
NAND flash memory, phase-change memories (PRAM), and spin-transfer
torque memories (STT-RAM). These emerging storage technologies close a
huge gap between DRAM and disk based storage systems. In this project,
we evaluate these emerging memory technologies, study how they are to
coexist with magnetic disks, and address technology challenges towards
employing these memory devices in HPC storage systems. More
specifically we measure the technology's impact on I/O intensive
scientific applications. The work includes several areas.

<ul>
  <li>Benchmarking emerging memory technologies and studying their
      performance and behavior

  <li>Detailed performance modeling of these memory devices for different
      scientific applications

  <li>Identifying technology challenges towards employing these devices in
      HPC storage systems

  <li>Understanding internal memory management algorithms at the device
      controller
</ul>

<p>
Contributors: Sarp Oral, and Sudharshan Vazhkudai

<p>
Status: On-going

<p>
Selected publications:
  <ul>
    <li>Youngjae Kim, Junghee Lee, Sarp Oral, David Dillow, Feiyi
    Wang, Galen M. Shipman, "Coordinating Garbage Collection for
    Arrays of Solid-state Drives", (To appear) <i>IEEE Transactions on
    Computers (IEEE TC)</i>, 2013.

    <li>Ramya Prabhakar, Sudharshan Vazhkudai, Youngjae Kim, Ali Butt,
    Min Li, Mahmut Kandemir, "Provisioning a Multi-Tiered Data Staging
    Area for Extreme-Scale Machines", <i>Proceedings of the 31th Int'l
    Conference on Distributed Computing Systems (ICDCS 2011)</i>,
    Minneapolis, Minnesota, June 20-24, 2011.
    <a href="http://users.nccs.gov/~vazhkuda/ICDCS11.pdf">pdf</a>
  </ul>

  <p>&nbsp; <a class="right" href="#">Top</a>
%include('content_close.inc')
%include('html_close.inc')