Commit 2c87c8d

committed
document publisher side of the File Engine
1 parent efa3e33 commit 2c87c8d

3 files changed

+77
-13
lines changed

doc/source/Inside_File_engine.rst

Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,4 +5,65 @@
55
Inside the File engine
66
######################
77

8+
One of the important characteristics of the **File engine** is that publishers and subscribers can interact with the
9+
DTL in a completely **desynchronized** way. Publishers can write Variables to files, close the engine, and even
10+
disconnect from the DTL before subscribers start to read the Variables from these files. However, when
11+
subscribers aim to retrieve Variables from the DTL while the publishers are exporting them, they must wait for the data
12+
to be fully written on storage (or more precisely for the simulation of the corresponding I/O operations to be over)
13+
before being able to read it. These two situations imply the implementation of additional checks related to which
14+
actors (i.e., publishers and/or subscribers) are currently using the File engine and of internal synchronization
15+
mechanisms.
16+
17+
On the publisher side
18+
---------------------
19+
20+
The first publisher to call :cpp:func:`begin_transaction` marks that a new :ref:`Concept_Transaction` is in progress
21+
and increments an internal transaction counter. Then, starting with the second transaction on that File engine, the
22+
publishers check whether the previous transaction is over. To this end, the publishers are blocked on a **condition
23+
variable**, until all the **write activities** from that transaction are completed. Once the publishers are unblocked,
24+
they proceed with the current transaction, i.e., handling the calls to :cpp:func:`put` made in the simulator's code.
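
The blocking behavior described above can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: the ``TransactionGate`` class and all its members are hypothetical names, and standard-library threads primitives stand in for the simulated condition variable that DTLMod actually uses.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the publisher-side state of a File engine.
class TransactionGate {
  std::mutex mutex_;
  std::condition_variable writes_completed_;
  bool in_progress_     = false; // is a transaction currently open?
  unsigned int started_ = 0;     // internal transaction counter
  unsigned int pending_ = 0;     // write activities not yet completed

public:
  // Called by every publisher; returns the index of the current transaction.
  unsigned int begin_transaction()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    if (!in_progress_) { // only the first publisher opens the transaction
      in_progress_ = true;
      ++started_;
    }
    if (started_ > 1) // from the second transaction on, wait for earlier writes
      writes_completed_.wait(lock, [this] { return pending_ == 0; });
    return started_;
  }

  // Called when a write activity is started.
  void add_pending_write()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    ++pending_;
  }

  // Called when a write activity completes; wakes up blocked publishers
  // once no write is left.
  void complete_write()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    assert(pending_ > 0);
    if (--pending_ == 0)
      writes_completed_.notify_all();
  }

  // Assumed to be called by the last publisher ending the transaction.
  void end_transaction()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    in_progress_ = false;
  }
};
```

In this sketch, a publisher that calls ``begin_transaction()`` for the second transaction while a write from the first one is still pending stays blocked until ``complete_write()`` brings the pending count back to zero.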
25+
26+
These calls to :cpp:func:`put` are delegated by the :ref:`Concept_Engine` to the selected :ref:`Concept_Transport`. For
27+
each call to :cpp:func:`put`, a publisher simply determines, based on the part of the :ref:`Concept_Variable` it
28+
locally owns and the **selection** made by the user, how many bytes it has to write and in which file (determined by
29+
the name of the :ref:`Concept_Stream`) to write them. With the **default File transport method**, each publisher
30+
creates and writes to a file called ``data.i``, where ``i`` is the unique index of the publisher.
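
As a rough illustration of this bookkeeping, the sketch below computes a byte count and a file name. It rests on several assumptions that the text does not confirm: the ``Block`` structure, both function names, the use of a hyper-rectangle intersection to combine the local part with the selection, and the ``<stream name>/data.i`` path layout are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of a hyper-rectangular block of a Variable.
struct Block {
  std::vector<std::size_t> start; // first element in each dimension
  std::vector<std::size_t> count; // number of elements in each dimension
};

// Bytes covered by the overlap of the locally owned block and the selection
// (assumption: the two are combined by intersection).
std::size_t bytes_to_write(const Block& local, const Block& selection, std::size_t element_size)
{
  assert(local.start.size() == local.count.size());
  assert(selection.start.size() == selection.count.size());
  assert(local.start.size() == selection.start.size());
  std::size_t elements = 1;
  for (std::size_t d = 0; d < local.start.size(); ++d) {
    std::size_t begin = std::max(local.start[d], selection.start[d]);
    std::size_t end   = std::min(local.start[d] + local.count[d], selection.start[d] + selection.count[d]);
    if (end <= begin)
      return 0; // no overlap in this dimension, nothing to write
    elements *= end - begin;
  }
  return elements * element_size;
}

// File written by publisher `index`; the directory layout is an assumption.
std::string publisher_file(const std::string& stream_name, unsigned int index)
{
  return stream_name + "/data." + std::to_string(index);
}
```

For instance, a publisher owning a 4x4 block of 8-byte elements whose selection overlaps only the last two rows would register a write of 2 x 4 x 8 = 64 bytes.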
31+
32+
The most important call is thus the one to :cpp:func:`end_transaction`, where the I/O activities are created and started.
33+
In this function, each publisher goes over all the write operations it registered during the different calls to the
34+
:cpp:func:`put` functions made in this transaction, and creates the corresponding simulated I/O activities by calling
35+
the ``File::write_async()`` function of the
36+
`FSMod file system module <https://github.com/simgrid/file-system-module>`_. These calls are made in **detached** mode,
37+
meaning that the publisher starts an **asynchronous write**, can forget about it, and proceeds with the next Variable.
38+
DTLMod also leverages the **signal/callback** mechanism provided by SimGrid to attach a callback triggered on the
39+
completion of a given asynchronous write activity. This callback does two things: 1) Notify all the actors waiting for
40+
the completion of that activity that it is now completed; and 2) Remove this activity from the list of **pending
41+
activities** maintained by the publisher that created it.
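
The two actions of the completion callback can be sketched without SimGrid or FSMod as shown below. Everything here is hypothetical (``WriteActivity``, ``Publisher``, ``start_detached_write``, ``complete``): the sketch does not reproduce the real ``File::write_async()`` call or SimGrid's signal API, and a simple counter stands in for the notification of waiting actors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a simulated asynchronous write activity.
struct WriteActivity {
  std::string file;
  std::size_t bytes;
  std::function<void(const WriteActivity&)> on_completion;
};

// Hypothetical publisher that tracks the detached writes it started.
class Publisher {
  std::vector<std::shared_ptr<WriteActivity>> pending_;
  unsigned int notifications_ = 0;

public:
  std::shared_ptr<WriteActivity> start_detached_write(const std::string& file, std::size_t bytes)
  {
    auto activity = std::make_shared<WriteActivity>(WriteActivity{file, bytes, nullptr});
    activity->on_completion = [this](const WriteActivity& done) {
      ++notifications_; // 1) stands for notifying the waiting actors
      // 2) remove the completed activity from the pending list
      pending_.erase(std::remove_if(pending_.begin(), pending_.end(),
                                    [&done](const std::shared_ptr<WriteActivity>& a) { return a.get() == &done; }),
                     pending_.end());
    };
    pending_.push_back(activity);
    return activity;
  }

  std::size_t pending_count() const { return pending_.size(); }
  unsigned int notifications() const { return notifications_; }
};

// Stands for the simulation engine signaling that a write is over. The
// activity is taken by value so that it outlives its own callback.
void complete(std::shared_ptr<WriteActivity> activity)
{
  if (activity->on_completion)
    activity->on_completion(*activity);
}
```

The publisher never waits on the returned activity in this sketch, which mirrors the detached mode: the pending list only shrinks when the completion callback fires.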
42+
43+
The last action performed in :cpp:func:`end_transaction` is only done by the **last publisher** to call the function.
44+
This actor marks the :ref:`Concept_Transaction` as over, increments an internal counter of completed transactions, and
45+
most importantly, notifies subscribers to that Stream that this transaction is complete. Note that the fact that a
46+
transaction is complete and the subscribers are notified does not mean that the corresponding I/O activities (i.e.,
47+
writing into files) are complete and that the subscribers can start reading the files to get Variables. However, it
48+
means that all the publishers have determined what and where to write and that the subscribers are
49+
allowed to use this information to create their I/O activities (i.e., reading from files) as explained in the
50+
:ref:`File_sub_side` section.
51+
52+
To determine which publisher is the last to call the :cpp:func:`end_transaction` function, DTLMod relies on a
53+
synchronization barrier for all the publishers using this Engine. This barrier is created in the very first call to
54+
:cpp:func:`end_transaction`.
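
The idea of singling out the last publisher to arrive can be sketched as follows. This is a simplification under stated assumptions: ``ArrivalCounter`` is a hypothetical name, and unlike a real synchronization barrier it does not block the early arrivals, it only reports which caller is the last one.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical helper that identifies the last of N publishers to arrive.
class ArrivalCounter {
  std::mutex mutex_;
  const unsigned int expected_; // number of publishers using the Engine
  unsigned int arrived_ = 0;    // publishers that arrived in this round

public:
  explicit ArrivalCounter(unsigned int expected) : expected_(expected) { assert(expected > 0); }

  // Returns true only for the last of the expected publishers to arrive.
  bool arrive()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (++arrived_ < expected_)
      return false;
    arrived_ = 0; // reset for the next transaction
    return true;
  }
};
```

With three publishers, the first two calls to ``arrive()`` return ``false`` and the third returns ``true``, so only that caller would go on to mark the transaction as over and notify the subscribers.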
55+
56+
The last operation performed on the publisher side of a File Engine is to **close** the engine by calling the
57+
:cpp:func:`close` function. The publishers first have to wait for the completion of the I/O activities started by the
58+
last transaction performed on this engine, i.e., they are blocked on a condition variable and wait to be notified of
59+
the respective completion of these activities. Then the last publisher to call the :cpp:func:`close` function actually
60+
closes the Engine, as well as all the opened files on the simulated file system. Finally, if the
61+
:cpp:func:`set_metadata_export` function has been called for the :ref:`Concept_Stream` that created the Engine, this
62+
publisher exports a summary of all the I/O operations performed during the lifetime of the Engine.
63+
64+
.. _File_sub_side:
65+
66+
On the subscriber side
67+
----------------------
68+
869
TBD

doc/source/app_API.rst

Lines changed: 13 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -39,13 +39,15 @@ Internally, a |Concept_DTL|_ is implemented as a server daemon process that answ
3939
**disconnection** requests from the simulated actors and maintains the set of active connections.
4040

4141
.. |Concept_Stream| replace:: **Stream**
42-
.. _Concept_Stream:
4342

4443
.. |Concept_Streams| replace:: **Streams**
44+
4545
.. _Concept_Streams:
4646

47-
Streams
48-
^^^^^^^
47+
.. _Concept_Stream:
48+
49+
Stream
50+
^^^^^^
4951

5052
The |Concept_Stream|_ abstraction represents a connection between a simulated actor and the DTL, through which data
5153
transport operations occur, and acts as a |Concept_Variable|_ factory. The publishers **define** the variables a
@@ -58,8 +60,8 @@ specific |Concept_Engine|_ to actually handle data transport.
5860
.. |Concept_Engine| replace:: **Engine**
5961
.. _Concept_Engine:
6062

61-
Engines
62-
^^^^^^^
63+
Engine
64+
^^^^^^
6365
The |Concept_Engine|_ abstraction is the base interface through which the |Concept_DTL|_ interacts with the simulated
6466
communication or I/O subsystems in charge of the simulation of data movement or storage. DTLMod exposes two types of
6567
engines: **file-based** engines, that write and read data to and from storage and **staging** engines that stream data
@@ -74,8 +76,8 @@ analysis component to another. The type of |Concept_Engine|_ to use can be speci
7476
.. |Concept_Transport| replace:: **Transport**
7577
.. _Concept_Transport:
7678

77-
Transport methods
78-
^^^^^^^^^^^^^^^^^
79+
Transport method
80+
^^^^^^^^^^^^^^^^
7981

8082
An engine is then associated to a specific |Concept_Transport|_ **method** that further specifies how data is written
8183
to and read from a file system or streamed from one workflow component to another. This separation between
@@ -134,8 +136,8 @@ necessary. The exact redistribution pattern is automatically determined by DTLMo
134136
.. |Concept_Variable| replace:: **Variable**
135137
.. _Concept_Variable:
136138

137-
Variables
138-
^^^^^^^^^^
139+
Variable
140+
^^^^^^^^
139141

140142
At the core of the DTLMod is the data transported from publishers to subscribers. Many in situ processing workflows
141143
involve parallel MPI codes as data producers. These codes manipulate **multidimensional arrays** distributed over
@@ -161,8 +163,8 @@ array. Finally, the tuple stores the **size of the elements** in the array.
161163
.. |Concept_Transaction| replace:: **Transaction**
162164
.. _Concept_Transaction:
163165

164-
Transactions
165-
^^^^^^^^^^^^
166+
Transaction
167+
^^^^^^^^^^^
166168

167169
Simulated actors can publish, or subscribe to, one or more |Concept_Variable|_ variables within a
168170
|Concept_Transaction|_. This logical construct delimits the interactions between an actor and the |Concept_DTL|_ and

doc/source/index.rst

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -44,8 +44,9 @@ effects of resource allocation strategies.
4444
:caption: DTLMod's Internals:
4545

4646
Connection manager <Connection_manager.rst>
47-
.. Design goals <Design_goals.rst>
48-
.. Contributor's documentation <Contributors_Documentation.rst>
47+
Engines <Engines.rst>
48+
   Inside the File engine <Inside_File_engine.rst>
49+
   Inside the Staging engine <Inside_Staging_engine.rst>
4950

5051
.. Cheat Sheet on the sublevels
5152
..
