%include('html_intro.inc')
<meta name="keywords" content="projects">
<meta name="description" content="NCCS TechInt group thrust area">
<title>Technology Integration Group: Data Management</title>
%include('hdbody.inc')
%include('hdrnav.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<div class="overview">
These projects exemplify the Technology Integration Group's
contributions to data management and analysis, including data
modeling, data capture, publication, search and discovery, analysis,
and visualization.
</div>
<div class="plinks">
<a href="#ADARA">ADARA</a>
| <a href="#Constellation">Constellation</a>
| <a href="#DOI">DOIs for HPC Data</a>
| <a href="#GUIDE">GUIDE</a>
| <a href="#BPIO">Balanced Placement I/O Library</a>
| <a href="#TagIt">TagIt</a>
</div>
%include('content_close.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<div class="logo-flow">
<img class="adara" src="images/ADARA_Logo.png" alt="ADARA Logo"/>
</div>
<h2 class="title" id="ADARA">ADARA</h2>
<p>
In addition to hosting the world's fastest supercomputer, ORNL
operates the world's brightest neutron source, the <a
href="http://neutrons.ornl.gov/" target="other">Spallation Neutron
Source</a> (SNS). Funded by the <a
href="https://science.energy.gov/bes/" target="other">US DOE Office of
Basic Energy Science</a>, this national user facility hosts hundreds
of scientists from around the world, providing a platform for
breakthrough research in materials science, sustainable energy, and
basic science. OLCF personnel have been engaged to help manage and
analyze the large data sets (ranging from hundreds of gigabytes
to over a terabyte) generated by the intense pulses of neutrons.
<p>
OLCF staff and SNS data specialists collaborated to successfully
complete the Accelerating Data Acquisition, Reduction, and Analysis
(ADARA) Lab-Directed Research and Development project to improve the
production and analysis of these data sets. OLCF provided its
expertise in high-performance file systems, parallel processing,
cluster configuration and management, and data management to the
project. As a result of the ADARA project, a new data infrastructure
was created that enhances users’ ability to collect, reduce, and
analyze data as it is taken; create data files immediately after
acquisition, regardless of size; reduce a data set in seconds after
acquisition; and provide the resources for any user to do
post-acquisition reduction, analysis, visualization, and modeling
without requiring users to be on-site at the SNS facility.
<p>
ADARA is currently running on the HYSPEC beam line, providing near
real-time access to result data sets (both raw event data and reduced
data) so that instrument scientists and users obtain live
feedback from their experiments. Moving forward, ADARA will be
deployed in production across additional beam lines at SNS as the
capabilities developed within ADARA continue to be adopted by the SNS
facility.
<p> More complete details on ADARA are available at <a
href="http://www.csm.ornl.gov/newsite/adara.html"
target="other">http://www.csm.ornl.gov/newsite/adara.html</a>.
<p>
OLCF Contributors: Feiyi Wang, Dale Stansberry, and Ross Miller
<p>
Status: Ongoing
<p> <a class="right" href="#">Top</a>
%include('content_close.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<h2 class="title" id="Constellation">Constellation</h2>
<p>
Constellation federates metadata from the OLCF resource fabric (stat
metadata from roughly one billion files on the Spider PFS and HPSS,
metadata from millions of jobs from the scheduler, thousands of users
and groups, publications, and systems) and captures it in a
custom-built in-memory graph. It builds links, or associations, among
resources (vertices in the graph) by correlating the metadata to infer
hidden relationships (e.g., linking data to jobs, or extracting
keywords from publications to connect related publications, jobs, and
data). Graph traversals and high-performance indexes external to the
graph enable searches.
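<p>
The correlation step above can be sketched in a few lines of Python. This is an illustrative sketch only: the record fields, vertex naming scheme, and `MetadataGraph` class are assumptions for the example, not Constellation's actual schema or API.

```python
# Illustrative sketch (not Constellation's actual API): correlate metadata
# records from different sources into a simple in-memory graph.
from collections import defaultdict

class MetadataGraph:
    def __init__(self):
        self.edges = defaultdict(set)  # vertex -> set of linked vertices

    def link(self, a, b):
        self.edges[a].add(b)
        self.edges[b].add(a)

    def neighbors(self, v):
        return sorted(self.edges[v])

# Hypothetical metadata records from the scheduler and the file system.
jobs = [{"id": "job:1001", "user": "user:alice", "outdir": "/proj/a"}]
files = [{"path": "/proj/a/run.h5", "owner": "user:alice"}]

g = MetadataGraph()
for job in jobs:
    g.link(job["id"], job["user"])               # job submitted by user
    for f in files:
        if f["path"].startswith(job["outdir"]):  # infer job -> data link
            g.link(job["id"], f["path"])
        g.link(f["owner"], f["path"])            # ownership link

print(g.neighbors("job:1001"))  # the user and data products tied to the job
```

A traversal from any vertex (a job, a user, a file) then reaches its correlated resources, which is the basis for the discovery services described below.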
<p>
The stat metadata index is built using HBase and Spark queries on PFS
stat/job metadata. We also build a hierarchical index by extracting
metadata from within the datasets themselves, deriving additional
metadata from the base metadata. Based on this graph engine, we can
discover relationships, suggest related data products or papers of
interest, identify popular datasets via PageRank, and let users create
new &ldquo;tags&rdquo; that tie together resources for quick retrieval
and sharing. <a href="http://users.nccs.gov/~vazhkuda/Constellation.pdf"
target="_blank">[ConstellationGraph:BigData16]</a>
<p>
The first workflow to be supported in Constellation is the acquisition
of Digital Object Identifiers for scientific datasets (see "DOIs for
HPC Data" below).
<p>
Contributors: Sudharshan Vazhkudai, Raghul Gunasekaran, Dale
Stansberry, Tom Barron
<p>
Status: Ongoing
<p>
Selected publications:
<ul>
<li> Sudharshan S. Vazhkudai, John Harney, Raghul Gunasekaran, Dale
Stansberry, Seung-Hwan Lim, Tom Barron, Andrew Nash and Arvind
Ramanathan, "Constellation: A Science Graph Network for Scalable Data
and Knowledge Discovery in Extreme-Scale Scientific Collaborations,"
<i>Proceedings of the IEEE Workshop on Big Data Metadata and
Management</i>, Washington D.C., December 2016.
<a href="http://users.nccs.gov/~vazhkuda/Constellation.pdf">pdf</a>
</ul>
<p> <a class="right" href="#">Top</a>
%include('content_close.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<h2 class="title" id="DOI">DOIs for HPC Data</h2>
<p>
Recent directives from federal agencies establish new requirements to
provide access to scientific data arising from taxpayer-funded
research. Providing this data will require new policies and
procedures, including a much-improved mechanism for dataset
identification and tracking.
<p>
To this end, we are exploring the viability of digital object
identifiers (DOIs) as a means to track data products emanating from
scientific simulations. A DOI can be used to help track, identify, and
share the data sets produced by researchers globally.
<p>
The ability to facilitate data-related services via DOIs has benefits for
both the HPC center and the end user. The center could use DOIs in its
interactions with funding agencies by providing improved accounting and
visibility of the user facility's data production. The center can also
benefit directly from new "data strategies," such as data warehousing,
that result from the improved planning information associated with DOIs
and data management planning. From a user standpoint, DOIs facilitate data
sharing, enable publication credit, support data preservation beyond the
lifetime of the project at the center, and aid lineage tracking and the
use of intermediate data products.
<p>
We are creating a workflow through which users can obtain DOIs for
their data products of interest, and we are working with OSTI to build
this infrastructure.
<p>
Contributors: Sudharshan Vazhkudai, Raghul Gunasekaran, Dale
Stansberry, Mitchell Griffith, Tom Barron
<p>
Status: Ongoing
<p> <a class="right" href="#">Top</a>
%include('content_close.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<h2 class="title" id="GUIDE">GUIDE - Data Analytics on Large-Scale Logs to
Optimize HPC Data Center Operations</h2>
<p>
GUIDE is a scalable Splunk-based infrastructure that aggregates and
supports analysis of large volumes of log data, presenting a window
into HPC data center operations. Logs include (i) Spider PFS: I/O
bandwidth data from controllers, Lustre-level profiling data, the file
size distribution of one billion files, and health logs from 2,016
OSTs comprising 20,160 disks; (ii) Titan CPU/GPU RAS data from
18,688 nodes; (iii) Moab job scheduler and node allocation data; (iv)
interconnect congestion data from 9,600 routers; and (v) HPSS
archival storage usage and file size distribution data from 61 million
files. GUIDE provides higher-level services using a variety of
analytics techniques, such as log correlation, data mining, and
statistical and visual analytics. These can be used to identify
hotspots, debug performance bottlenecks, and perform trend analysis
for future resource provisioning.
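<p>
The kind of log correlation GUIDE performs can be illustrated with a toy example. The log formats, thresholds, and job names below are hypothetical, and the actual system runs such queries in Splunk rather than Python; this sketch only shows the idea of joining scheduler records with storage telemetry by time.

```python
# Illustrative sketch (hypothetical log formats, not GUIDE's actual Splunk
# queries): correlate job scheduler records with I/O bandwidth samples to
# flag which jobs ran while the file system was saturated.
io_samples = [  # (epoch seconds, aggregate GB/s) from storage controllers
    (1000, 8.0), (1060, 31.5), (1120, 30.2), (1180, 6.1),
]
jobs = [  # (job id, start epoch, end epoch) from the scheduler log
    ("job.17", 950, 1100), ("job.18", 1150, 1300),
]

HOTSPOT_GBPS = 25.0  # assumed saturation threshold for this example
hot_windows = [t for t, bw in io_samples if bw >= HOTSPOT_GBPS]

suspects = sorted(
    jid for jid, start, end in jobs
    if any(start <= t <= end for t in hot_windows)
)
print(suspects)  # jobs that overlapped a saturated interval
```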
<p>
Contributors: Sudharshan Vazhkudai, Ross Miller, Deryl Steinert, Chris
Zimmer, Feiyi Wang
<p>
Status: Complete
<p> <a class="right" href="#">Top</a>
%include('content_close.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<h2 class="title" id="BPIO">Balanced Placement I/O Library</h2>
<p>
The usage patterns of large-scale scientific applications lead to I/O
resource contention and load imbalance. This project implemented a
dynamic shared library based on BPIO, a method for resolving such
contention, which provides a transparent way to balance resource usage
without source code modification or recompilation.
<p>
The BPIO Runtime Environment can be built as a shared, preloadable library.
It uses the BPIO Library for balanced data placement and function
interposition to take precedence over standard function calls. This
provides end-to-end, per-job load balancing supporting a range of I/O
interfaces, including POSIX and MPI-IO; HDF5 support is under development.
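<p>
The core idea of balanced placement can be sketched as a greedy least-loaded assignment. This is an illustrative model only, not the BPIO library's actual algorithm or API; the function name and sizes are made up for the example.

```python
# Illustrative sketch of balanced placement (not the BPIO library's actual
# algorithm or API): route each new file to the currently least-loaded
# storage target instead of letting targets fill unevenly.
import heapq

def balanced_placement(file_sizes, n_targets):
    """Return a target index per file, always picking the lightest target."""
    heap = [(0, t) for t in range(n_targets)]  # (bytes placed, target id)
    heapq.heapify(heap)
    placement = []
    for size in file_sizes:
        load, target = heapq.heappop(heap)  # least-loaded target so far
        placement.append(target)
        heapq.heappush(heap, (load + size, target))
    return placement

sizes = [100, 50, 75, 25, 60]
print(balanced_placement(sizes, 2))
```

In the real library this decision is made transparently inside interposed I/O calls, so the application never sees it.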
<p>
Aequilibro integrates BPIO with ADIOS, combining BPIO's interconnect level
optimization with the benefits of the ADIOS I/O framework to provide
portable, fast, scalable, easy-to-use, and metadata-rich output and I/O
interfaces that can be changed during runtime.
<p>
TechInt contributors: Sarah Neuwirth, Feiyi Wang, Sarp Oral, Sudharshan Vazhkudai
<p>
Status: Ongoing
<p>
Selected publications:
<ul>
<li>
Sarah Neuwirth, Feiyi Wang, Sarp Oral, Sudharshan S. Vazhkudai, and
Ulrich Bruening, "An I/O Load Balancing Framework for Large-scale
Applications (BPIO 2.0)," (Poster) <i>Proceedings of Supercomputing 2016
(SC16): 29th Int'l Conference on High Performance Computing,
Networking, Storage and Analysis</i>, Salt Lake City, UT, November 2016.
<a href="http://users.nccs.gov/~vazhkuda/sc16-poster.pdf">pdf</a>
</ul>
<p> <a class="right" href="#">Top</a>
%include('content_close.inc')
<!-- ========================================================== -->
%include('content_intro.inc')
<h2 class="title" id="TagIt">TagIt: An integrated search and
discovery service for file systems</h2>
<p>
TagIt is a novel data management
service framework supporting annotation, tagging, indexing, and filtering
operations on files. These services are tightly integrated into a
shared-nothing GlusterFS distributed file system. TagIt manages a scalable
and consistent metadata index database (MySQL) inside the file system,
exploiting readily available resources. It enables advanced tagging that
lets users mark anything from entire collections of files down to specific
portions of a file, and it allows operators to be associated with a tag for
pre-processing, filtering, or automatic metadata extraction; these
operators are seamlessly offloaded to file servers in a load-aware fashion.
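<p>
The tag-operator association can be sketched as follows. All names here (`tag_files`, `run_tag`, the paths) are hypothetical for illustration and are not TagIt's actual interface; in TagIt the operators run inside the file servers, not in client code.

```python
# Illustrative sketch (hypothetical names, not TagIt's actual interface):
# associate an operator with a tag so it is applied to every file the
# tag covers, mimicking tag-driven metadata extraction.
tags = {}       # tag name -> set of file paths
operators = {}  # tag name -> callable applied to each tagged file

def tag_files(tag, paths, operator=None):
    tags.setdefault(tag, set()).update(paths)
    if operator is not None:
        operators[tag] = operator

def run_tag(tag):
    """Apply the tag's operator to each member; return path -> result."""
    op = operators[tag]
    return {path: op(path) for path in sorted(tags[tag])}

# Toy operator: "extract" metadata from the file name itself.
tag_files("climate-runs", ["/data/run01.nc", "/data/run02.nc"],
          operator=lambda p: {"stem": p.rsplit("/", 1)[-1]})
print(run_tag("climate-runs"))
```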
<p>
TechInt contributors: Sudharshan Vazhkudai, Hyogi Sim
<p>
Status: Ongoing
<p>
Selected publications:
<ul>
<li>Hyogi Sim, Youngjae Kim, Sudharshan S. Vazhkudai, Geoffroy R. Vallee,
Seung-Hwan Lim, and Ali R. Butt, "TagIt: An Integrated Search and
Discovery Service for Extreme-Scale File Systems," (Poster)
<em>Proceedings of the USENIX Annual Technical Conference
(ATC)</em>, Denver, CO, June 2016.
</ul>
<p> <a class="right" href="#">Top</a>
%include('content_close.inc')
%include('html_close.inc')