4 Chapter 3: Distributed Memory Programming with MPI
In this chapter, we will be focusing on using the OpenMPI library with C to create parallel programs (SPMD) that run on distributed memory systems (multiple processes as opposed to multithreading).
4.1 OpenMPI Cheatsheet
4.1.1 Import
#include <mpi.h>4.1.2 Compilation
mpicc -g -Wall -o path/to/executable path/to/source/code4.1.3 Execution
mpiexec -n <number of processes> path/to/executable4.1.4 Initialization
int MPI_Init(
int* argc_p,
char*** argv_p
);Takes in the arguments given to the main function (the values passed in the CLI when executing). Sincle usually the main function will take no input, you will most likely call it in the following manner
MPI_Init(NULL, NULL);4.1.5 Finalization
int MPI_Finalize(void);Usually called
MPI_Finalize();4.1.6 MPI_COMM_WORLD
The most commonly used communication world in MPI is MPI_COMM_WORLD (all caps).
4.1.7 Number of Processors in Communicator
int MPI_Comm_size(
MPI_Comm comm,
int* comm_sz_p
);You give it an MPI communicator and an integer pointer. It places the number of processes in the passed pointer. Usually used in the following manner
int comm_sz;
MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);The variable commm_sz now stores the number of processes in MPI_COMM_WORLD.
4.1.8 Current Process Rank
The rank acts like the id of the process in its communicator, and it is a value in between 0 and \(p-1\) where \(p\) is the number of process in the communicator.
int MPI_Comm_rank(
MPI_Comm comm,
int* my_rank_p
);Its interface is very similar to MPI_Comm_size. It is usually used in the following manner
int my_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);4.1.9 Datatypes
| MPI Datatype | C Datatype |
|---|---|
MPI_CHAR |
signed char |
MPI_SHORT |
signed short int |
MPI_INT |
signed int |
MPI_LONG |
signed long int |
MPI_LONG_LONG |
signed long long int |
MPI_UNSIGNED_CHAR |
unsigned char |
MPI_UNSIGNED_SHORT |
unsigned short int |
MPI_UNSIGNED |
unsigned int |
MPI_UNSIGNED_LONG |
unsigned long int |
MPI_FLOAT |
float |
MPI_DOUBLE |
double |
MPI_LONG_DOUBLE |
long double |
MPI_BYTE |
|
MPI_PACKED |
4.1.10 Sending Data
int MPI_Send(
void* msg_buffer_pointer,
int msg_size,
MPI_Datatype msg_type
int dest_rank,
int tag,
MPI_Comm communicator
);MPI_Send is used to send a message from one process to another. It takes 6 parameters
- A pointer to the variable containing the message to be sent
- The number of
msg_types to be sent. For example, if you are sending a strings, the message type will beMPI_CHARand the value ofmsg_sizewill bestrlen(s) + 1, where+1is for the\0character in the end. - The type of the data being sent.
- The rank of the destination process
- Tag of the message to be sent
- The communicator; usually
MPI_COMM_WORLD
Example of using it:
MPI_Send(greeting, strlen(greeting)+1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);4.1.11 Receiving Data
int MPI_Recv(
void* msg_buffer_pointer,
int buffer_size,
MPI_Datatype buffer_type
int dest_rank,
int tag,
MPI_Comm communicator,
MPI_Status* status_pointer
);7 parameters. The first 6 are the same as MPI_Send with the only difference being that the second parameter is now the buffer size and can be bigger (but not smaller) than the received message. The last parameter will have its own section. If you are not interested in recieveing the status, just pass MPI_STATUS_IGNORE to it.
Example of using it:
const int MAX_STRING = 100;
char greeting[MAX_STRING];
MPI_Recv(greeting, MAX_STRING, MPI_CHAR, 5, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);4.1.12 The Stauts of MPI_Recv
The last paramater of MPI_Recv is a pointer to a struct that will carry information related to the received message. At least, it includes
status.MPI_SOURCE: the rank of the message senderstatus.MPI_TAG: the tag of the recieved messagestatus.MPI_ERROR
You can also use it to extract the amount of data that has been recieved by passing it to the MPI_Get_count function.
int count;
MPI_Get_count(&status, recv_type, &count);4.1.13 Wildcards
You could pass MPI_ANY_SOURCE to the source parameter of MPI_Recv to receive a message from any process.
You could pass MPI_ANY_TAG to the tag parameter of MPI_Recv to receive a message from any process.
There is no wildcard for the communicator.
4.1.14 General Structure of an MPI program
#include <mpi.h>
int main(){
MPI_Init(NULL, NULL);
int comm_sz;
int my_rank;
MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
/*
The actual algorithm code
Uses MPI_Send and MPI_Recv
*/
MPI_Finalize();
return 0;
}4.1.15 MPI Collective Communication
- All the processes in the communicator must call the same collective function
- The arguments passed from the different processes must be compatible
- The
output_data_pparameter, which is used in most collective communication functions, is only used on thedest_process. All others must pass an argument that can beNULLto it. - Collective communications don’t use tags. They are mached based on the communicator and the order of the calls.
4.1.16 MPI_Reduce
This is a collective communication operation.
It performs reduction operation: combines data from all processes and performs an operation that joins them together into a single value. The result is stored in the output_data_p parameter of the dest_process.
int MPI_Reduce(
void* input_data_p,
void* output_data_p,
int count,
MPI_Datatype datatype,
MPI_Op op,
int root,
MPI_Comm communicator
);Sample usage:
int local_int = my_rank;
int global_sum;
MPI_Reduce(&local_int, &global_sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);4.1.17 MPI_Op
Available operations in MPI are
| Operation | Description |
|---|---|
MPI_MAX |
Maximum value |
MPI_MIN |
Minimum value |
MPI_SUM |
Sum of values |
MPI_PROD |
Product of values |
MPI_LAND |
Logical AND |
MPI_BAND |
Bitwise AND |
MPI_LOR |
Logical OR |
MPI_BOR |
Bitwise OR |
MPI_LXOR |
Logical XOR |
MPI_BXOR |
Bitwise XOR |
MPI_MAXLOC |
Maximum value and location |
MPI_MINLOC |
Minimum value and location |
4.1.18 MPI_Allreduce
Same like reduce but stores the result in all processes. Same parameters except for the root.
int MPI_Allreduce(
void* input_data_p,
void* output_data_p,
int count,
MPI_Datatype datatype,
MPI_Op op,
MPI_Comm communicator
);4.1.19 Broadcast
A collective communication operation that sends data from one process to all other processes in the communicator.
The images below show how a broadcast operation (global sum) works using the tree-like structure and the butterfly structure.


4.1.20 MPI_Bcast
Sends data from one process to all other processes in the communicator.
int MPI_Bcast(
void* data_p,
int count,
MPI_Datatype datatype,
int source_proc,
MPI_Comm communicator
);4.1.21 MPI_Scatter
Reads an entire vector from the source process and sends a portion of it to each process in the communicator.
int MPI_Scatter(
void* send_data_p,
int send_count,
MPI_Datatype send_datatype,
void* recv_data_p,
int recv_count,
MPI_Datatype recv_datatype,
int src_proc,
MPI_Comm comm
);send_count is the amount of data elements going to each process not the total amount of data being sent.
4.1.22 MPI_Gather
Reads a portion of a vector from each process in the communicator and sends it to the source process where it is concatenated together.
int MPI_Gather(
void* send_data_p,
int send_count,
MPI_Datatype send_datatype,
void* recv_data_p,
int recv_count,
MPI_Datatype recv_datatype,
int dest_proc,
MPI_Comm comm
);recv_count is the amount of data elements coming from each process not the total amount of data being received.
4.1.23 MPI_Allgather
Same as MPI_Gather but the result is stored in all processes in the communicator. Once again, recv_count is the amount of data elements coming from each process not the total and there is no dest_proc parameter.
int MPI_Allgather(
void* send_data_p,
int send_count,
MPI_Datatype send_datatype,
void* recv_data_p,
int recv_count,
MPI_Datatype recv_datatype,
MPI_Comm comm
);4.1.24 MPI_Barrier
Ensures that no process will return from calling it until every process in the communicator has started calling it.
int MPI_Barrier(MPI_Comm comm);4.1.25 MPI_Ssend
Alternative to MPI_Send that is synchronous. Guranteed to block until the message receive starts.
4.1.26 MPI_Sendrecv
Carries out a blocking send and a receive in a single call.
The destination and source processes can be the same or different, and the send and receive buffers can be the same or different.
int MPI_Sendrecv(
void* send_data_p,
int send_count,
MPI_Datatype send_datatype,
int dest_proc,
int send_tag,
void* recv_data_p,
int recv_count,
MPI_Datatype recv_datatype,
int source_proc,
int recv_tag,
MPI_Comm comm,
MPI_Status* status_p
);4.1.27 Tips
- All MPI functions start with
MPI_and only the first follwing character is capitalized - Constants are all caps
These notes skip MPI derived datatypes and performace evaluation.