Chapter/distributed computing completed #16
Merged
Changes from all commits
12 commits
aac2491  added refresher on parallelism slides
0e510ed  added subheadings
f7dcc3d  added distributed computing introduction
e646fd4  Added openMPI section
3610239  Added message passing section
f56b4c6  Added challenges
58f7892  Added outline
7120b06  fixed broken navigation links
c972aa6  updated order of content
a1ba767  updated relevant bits based on pr feedback
fd77aca  Formatting
13da6b1  Warning in challenges
# Challenges

🚧 Under Construction! 🏗️

## Install MPI for the tasks

- `source ~/vf38/HPC_Training/spack/share/spack/setup-env.sh` # configure Spack
- `spack load mpich` # load MPICH through Spack
- `module load gcc` # load GCC
- `cp -r ~/vf38/HPC_Training/MPI_examples ~` # copy the tasks to your home directory

## Task 1: Hello World

1. Go to the `hello` folder
2. Modify the files
3. Compile `mpi_hello_world.c` using the makefile
4. Execute the file

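For orientation, a minimal MPI hello world usually looks something like the sketch below. This is only an illustration; the provided `mpi_hello_world.c` and makefile may be structured differently.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* set up the MPI environment */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I? */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes in total? */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                         /* shut MPI down cleanly */
    return 0;
}
```
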
## Task 2: Ping Pong

1. Go to the `ping_pong` folder
2. Modify the files
3. Compile the files
4. Run (the world size must be two for this exercise)

The output should be similar to the following; it may differ slightly due to process scheduling.



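As a rough guide, the core of a ping-pong exchange between exactly two ranks often looks like the sketch below. This is illustrative only; the provided `ping_pong` files may structure it differently.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, count = 0;
    const int limit = 10;   /* how many times to bounce the counter */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int partner = (rank + 1) % 2;   /* assumes exactly two ranks: 0 pairs with 1 */

    while (count < limit)
    {
        if (rank == count % 2)
        {
            /* my turn: bump the counter and send it over */
            count++;
            MPI_Send(&count, 1, MPI_INT, partner, 0, MPI_COMM_WORLD);
            printf("Rank %d sent count %d to rank %d\n", rank, count, partner);
        }
        else
        {
            /* wait for the counter to come back */
            MPI_Recv(&count, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("Rank %d received count %d from rank %d\n", rank, count, partner);
        }
    }

    MPI_Finalize();
    return 0;
}
```
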
## Task 3: Monte Carlo

- Run `./calcPiSeq 100000000` # make sure you use gcc to compile the serial code
- Modify `calcPiMPI.c`
- Run `calcPiMPI 100000000` with MPI and see the difference. You can change the number of processes; however, please be mindful that you are on the login node!
- Hint: <https://www.mpich.org/static/docs/v3.3/www3/MPI_Reduce.html>

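The hint points at `MPI_Reduce`. A common pattern is for each rank to count its own hits and then sum the counts onto rank 0; the sketch below illustrates that pattern, but the variable names and structure are assumptions rather than the contents of `calcPiMPI.c`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;
    long long n_total = atoll(argv[1]);   /* total number of random points */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    long long n_local = n_total / size;   /* points handled by this rank */
    long long local_hits = 0;

    srand(rank + 1);                      /* different random stream per rank */
    for (long long i = 0; i < n_local; i++)
    {
        double x = (double)rand() / RAND_MAX;
        double y = (double)rand() / RAND_MAX;
        if (x * x + y * y <= 1.0)
            local_hits++;
    }

    long long total_hits = 0;
    /* sum every rank's hit count onto rank 0 */
    MPI_Reduce(&local_hits, &total_hits, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Pi is approximately %f\n", 4.0 * total_hits / (n_local * size));

    MPI_Finalize();
    return 0;
}
```
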
## Task 4: Parallel computing task on compute nodes

- Submit your parallelised Monte Carlo task to the compute nodes with 8 tasks

## Task 5: Trapezoidal Rule Integration

- Run `./seq_trap 10000000000`
- Modify `calcPiMPI.c`
- Run `seq_MPI 100000000000` with MPI and see the difference. You can change the number of processes; however, please be mindful that you are on the login node!

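As with the Monte Carlo task, the usual approach is to give each rank its own sub-interval, compute a local sum, and reduce onto rank 0. A rough sketch follows; the integrand and bounds here are placeholders, not the provided code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/* function to integrate - a placeholder for illustration */
static double f(double x) { return x * x; }

int main(int argc, char *argv[])
{
    int rank, size;
    long long n = atoll(argv[1]);     /* total number of trapezoids */
    double a = 0.0, b = 1.0;          /* assumed integration bounds */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double h = (b - a) / n;           /* width of one trapezoid */
    long long local_n = n / size;     /* trapezoids handled by this rank */
    double local_a = a + rank * local_n * h;
    double local_b = local_a + local_n * h;

    /* trapezoidal rule on this rank's sub-interval */
    double local_sum = (f(local_a) + f(local_b)) / 2.0;
    for (long long i = 1; i < local_n; i++)
        local_sum += f(local_a + i * h);
    local_sum *= h;

    double total = 0.0;
    MPI_Reduce(&local_sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Integral is approximately %f\n", total);

    MPI_Finalize();
    return 0;
}
```
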
## Task 6 (Bonus): Merge Sort

- This task is quite challenging to parallelise yourself
- Please refer to the answer and check whether you can understand it: <https://selkie-macalester.org/csinparallel/modules/MPIProgramming/build/html/mergeSort/mergeSort.html>

Additional resources: <https://selkie-macalester.org/csinparallel/modules/MPIProgramming/build/html/index.html>
# Distributed Computing

- [Refresher on Parallelism](parallel-refresher.md)
- [What is Distributed Computing](distributed-computing.md)
- [OpenMPI](openmpi.md)
- [Message Passing](message-passing.md)
- [Challenges](challenges.md)
# What is Distributed Computing

**Distributed computing is parallel execution on a distributed memory architecture.**

This essentially means it is a form of parallel computing where the processing power is spread across multiple machines in a network rather than being contained within a single system. In this memory architecture, problems are broken down into smaller parts, and each machine is assigned a specific part to work on.



## Distributed Memory Architecture

Let's have a look at the distributed memory architecture in more detail.

- Each processor has its own local memory, with its own address space
- Data is shared via a communications network using a network protocol, e.g. Transmission Control Protocol (TCP), InfiniBand, etc.



## Distributed vs Shared program execution

The following diagram provides another way of looking at the differences between distributed and shared memory architectures and their program execution.



## Advantages of distributed computing

There are a number of benefits to distributed computing; in particular, it addresses some shortcomings of shared memory architecture.

- No contention for shared memory, since each machine has its own memory. Compare this to shared memory architecture, where all the CPUs share the same memory.
- Highly scalable, as we can add more machines and are not limited by RAM.
- Effectively, this results in being able to handle larger-scale problems.

The benefits above do not come without some drawbacks, including network overhead.

## Disadvantages of distributed computing

- Network overload. The network can be overloaded by:
  - Multiple small messages
  - Very large data throughput
  - Multiple all-to-all messages ($N^2$ growth of messages)
- Synchronization failures
  - Deadlock (processes waiting for an input from another process that never comes; see the sketch below)
  - Livelock (even worse, as it is harder to detect: all processes keep shuffling data around but make no progress in the algorithm)
- More complex software architecture design.
- Can also be combined with threading technologies such as OpenMP/pthreads for optimal performance.
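
To make the deadlock point concrete, here is a hedged sketch using MPI (introduced later in this chapter): both processes issue a blocking send before their receive, so for sufficiently large messages neither can make progress.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N (1 << 20)   /* large enough that the blocking send waits for a matching receive */

int main(int argc, char *argv[])
{
    int rank;
    int *send_buf = calloc(N, sizeof(int));
    int *recv_buf = calloc(N, sizeof(int));

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int partner = 1 - rank;   /* assumes exactly two processes */

    /* Both ranks send first and receive second: each blocking send waits for
       the other side to post a receive, which never happens. */
    MPI_Send(send_buf, N, MPI_INT, partner, 0, MPI_COMM_WORLD);
    MPI_Recv(recv_buf, N, MPI_INT, partner, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("Rank %d finished (you will likely never see this)\n", rank);

    MPI_Finalize();
    free(send_buf);
    free(recv_buf);
    return 0;
}
```

Swapping the send/receive order on one of the ranks removes the circular wait.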
# Message Passing

Since each processor has its own local memory with its own address space in distributed computing, we need a way to communicate between the processes and share data. Message passing is the mechanism for exchanging data across processes. Each process can communicate with one or more other processes by sending messages over a network.

MPI (Message Passing Interface) is a communication protocol standard that defines message passing between processors in distributed environments. It is implemented by different groups, with the main goals being high performance, scalability, and portability.

OpenMPI is one implementation of the MPI standard. It consists of a set of headers and library functions that you call from your program, i.e. from C, C++, Fortran, etc.

For C, you will need the header file that declares all the functions (mpi.h) and to link in the relevant library functions. This is all handled by the mpicc program (or by your compiler, if you want to specify all the paths yourself).

In the next chapter we will look at how to implement message passing using OpenMPI.
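
As a taste of what mpicc handles for you, a C program only needs to include the header and link against the MPI library; a bare-bones skeleton might look like this (illustrative only - the next chapter covers the actual routines):

```c
#include <mpi.h>   /* declares every MPI function your program can call */

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);   /* start the MPI runtime */

    /* ... message passing happens here ... */

    MPI_Finalize();           /* shut the MPI runtime down */
    return 0;
}
```

Compiling this with mpicc pulls in the right include path and links the MPI library without you having to specify either.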
# OpenMPI

## Primary MPI Routines

``` C
int MPI_Init(int * argc, char *** argv);
// Initializes the MPI environment.
// argc and argv are the parameters that come from main(argc, argv),
// passed by address. The return value is an error code:
// 0 is OK, non-zero indicates an error.
```

``` C
int MPI_Comm_size(MPI_Comm comm, int * size);
// this function gets the number of MPI processes,
// i.e. the number you enter when you run: mpirun -np <size> myprog.exe
// *size is C syntax indicating that size will be modified to contain
// the value after the function returns. The return value is only used
// for error detection, e.g. printf("MPI size is %d\n", size);
int MPI_Comm_rank(MPI_Comm comm, int * rank);
// this returns the rank of this particular process
// *rank contains the value for that process - the function return value is an error code
```



### Point-to-Point communication

These are blocking functions - they wait until the message is sent or received. Note that the CPU actively polls the network interface while waiting for a message. This is the opposite behaviour to other C functions, e.g. c = getchar() (which causes a context switch and then a sleep in the OS). This is done for speed.

```C
int MPI_Send(void * buf, int count, MPI_Datatype type, int dest, int tag, MPI_Comm comm);
```

Sends a message from the calling process to another process.

INPUT PARAMETERS

- `buf`
  - Initial address of send buffer (choice).
- `count`
  - Number of elements sent (non-negative integer).
- `type`
  - Datatype of each send buffer element (handle).
- `dest`
  - Rank of destination (integer).
- `tag`
  - Message tag (integer).
- `comm`
  - Communicator (handle).

OUTPUT PARAMETER

- `IERROR`
  - Fortran only: Error status (integer).

```c
int MPI_Recv(void * buf, int count, MPI_Datatype type, int source, int tag, MPI_Comm comm, MPI_Status * status);
```

Receives a message from another process.

INPUT PARAMETERS

- `count`
  - Maximum number of elements to receive (integer).
- `type`
  - Datatype of each receive buffer entry (handle).
- `source`
  - Rank of source (integer).
- `tag`
  - Message tag (integer).
- `comm`
  - Communicator (handle).

OUTPUT PARAMETERS

- `buf`
  - Initial address of receive buffer (choice).
- `status`
  - Status object (status).
- `IERROR`
  - Fortran only: Error status (integer).

### Primary MPI Routines closing

In a header file you will find:

``` C
int MPI_Finalize(void);
```

To call it in your C or C++ program:

``` C
#include <mpi.h>

MPI_Finalize();
```

## General overview of an MPI program

``` C
...
int MPI_Init(int * argc, char *** argv);
// -------------------------- Parallel algorithm starts ----------------------
int MPI_Comm_size(MPI_Comm comm, int * size);
int MPI_Comm_rank(MPI_Comm comm, int * rank);
...
int MPI_Send(void * buf, int count, MPI_Datatype type, int dest, int tag, MPI_Comm comm);
int MPI_Recv(void * buf, int count, MPI_Datatype type, int source, int tag, MPI_Comm comm, MPI_Status * status);
...
// -------------------------- Parallel algorithm ends ------------------------
int MPI_Finalize(void);
...
```

Use the man pages to find out more about each routine.

When sending, a process packs up all of its necessary data into a buffer for the receiving process. These buffers are often referred to as envelopes, since the data is packed into a single message before transmission (similar to how letters are packed into envelopes before being taken to the post office).

## Elementary MPI Data types

MPI_Send and MPI_Recv use MPI datatypes as a means to specify the structure of a message at a higher level. The datatypes defined in the table below are simple in nature; for custom data structures you will have to define your own MPI datatype (see the sketch after the table).

| MPI datatype | C equivalent |
|-------------------------|------------------------|
| MPI_SHORT | short int |
| MPI_INT | int |
| MPI_LONG | long int |
| MPI_LONG_LONG | long long int |
| MPI_UNSIGNED_CHAR | unsigned char |
| MPI_UNSIGNED_SHORT | unsigned short int |
| MPI_UNSIGNED | unsigned int |
| MPI_UNSIGNED_LONG | unsigned long int |
| MPI_UNSIGNED_LONG_LONG | unsigned long long int |
| MPI_FLOAT | float |
| MPI_DOUBLE | double |
| MPI_LONG_DOUBLE | long double |
| MPI_BYTE | char |

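For example, a C struct can be described to MPI with `MPI_Type_create_struct` and then used like any elementary datatype. A minimal sketch, where the `particle_t` struct is purely illustrative:

```c
#include <stddef.h>   /* offsetof */
#include <mpi.h>

/* An example struct we want to send as a single message. */
typedef struct {
    int    id;
    double position[3];
} particle_t;

void build_particle_type(MPI_Datatype *new_type)
{
    int          block_lengths[2] = { 1, 3 };
    MPI_Aint     displacements[2] = { offsetof(particle_t, id),
                                      offsetof(particle_t, position) };
    MPI_Datatype types[2]         = { MPI_INT, MPI_DOUBLE };

    /* Describe the layout of particle_t and register it with MPI. */
    MPI_Type_create_struct(2, block_lengths, displacements, types, new_type);
    MPI_Type_commit(new_type);

    /* The new handle can now be used wherever an elementary datatype is used,
       e.g. MPI_Send(&p, 1, *new_type, dest, tag, MPI_COMM_WORLD); */
}
```
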
## Example of a simple program

``` C
/*
    MPI Program, send ranks
*/

#include <stdio.h>
#include <mpi.h>

#define MASTER 0

int main(int argc, char *argv[])
{
    int my_rank;
    /* Also known as world size */
    int num_processes;

    /* Initialize the infrastructure necessary for communication */
    MPI_Init(&argc, &argv);

    /* Identify this process */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* Find out how many total processes are active */
    MPI_Comm_size(MPI_COMM_WORLD, &num_processes);

    printf("Process %d: There is a total of %d processes\n", my_rank, num_processes);

    if (my_rank == MASTER)
    {
        int dest = 1;
        int tag = 0;
        int count = 1;

        MPI_Send(&my_rank, count, MPI_INT, dest, tag, MPI_COMM_WORLD);

        printf("Process %d: Sent my_rank to process %d\n", my_rank, dest);
    }
    else
    {
        /* Note: the master only sends to rank 1, so with more than two
           processes the extra ranks would block here waiting for a message. */
        int tag = 0;
        int count = 1;
        int buffer;

        MPI_Recv(&buffer, count, MPI_INT, MASTER, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        printf("Process %d: Received %d from process %d\n", my_rank, buffer, MASTER);
    }

    /* Tear down the communication infrastructure */
    MPI_Finalize();
    return 0;
}
```

## Compilation and Linking

- Make sure you have the following packages installed and that they are in your $PATH:
  - gcc
  - OpenMPI or MPICH
- To compile and link:
  - `mpicc -Wall -o <program-name.exe> <program-name.c>`
  - `-Wall` enables all the warnings about questionable code.
  - `-o` sets the output executable name. If you omit it, it defaults to a.out.
- To run:
  - `mpirun -np <number-of-processes> <program-name.exe>`
- Behind the scenes:
  - mpicc is just a wrapper around a C compiler. To see what it does, type:
  - `mpicc --showme`

### sbatch to send a job to the compute nodes using SLURM

``` bash
#!/bin/bash
#SBATCH --job-name=Vaccinator
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00

source ~/vf38/HPC_Training/spack/share/spack/setup-env.sh
spack load mpich

mpirun -np 4 ./my-awesome-program
```

<https://docs.massive.org.au/M3/slurm/mpi-jobs.html>

- ntasks controls the number of tasks to be created for the job
- ntasks-per-node controls the maximum number of tasks per allocated node
- cpus-per-task controls the number of CPUs allocated per task

## Measuring performance

- `htop` to check the CPU usage. You need to run this command while the process is running.
  - If you are using SLURM, you will need to use `squeue` or `scontrol` to find the compute node the job is running on and then ssh into it.
- `time` is a shell command to check the overall wall time, i.e.
  - `time mpirun -np 4 myProg.exe`
- You can also use an MPI profiler.

There are some useful commands to check the parallelism of the code.
The command `top` or `htop` looks into a process. As you can see from the image below, it shows the CPU usage.



- The command `time` checks the overall performance of the code.
  - By running this command, you get real time, user time and system time.
  - Real is wall clock time - the time from start to finish of the call, including overhead.
  - User is the amount of CPU time spent outside the kernel within the process.
  - Sys is the amount of CPU time spent in the kernel within the process.
  - User time + Sys time will tell you how much actual CPU time your process used.


Try and come up with new challenges. The old repo is not going to be used anymore, so they are no longer relevant.