Simulating PPFL

This package lets users simulate PPFL on either a single machine or a cluster.

Note

Running (either training or simulating) PPFL on multiple heterogeneous machines is described in Training PPFL.

We describe how to simulate PPFL with a given model and datasets. For simulation, we assume that test_data is available to validate the training.

Serial run

Serial runs begin simply by calling the following API function.

Some remarks are made as follows:

Parallel run with MPI

We can parallelize the PPFL simulation by using MPI through the mpi4py package. The following two API functions need to be called for parallelization.

The server and the clients are started by calling run_server and run_client, respectively, each of which takes the MPI communicator (e.g., MPI.COMM_WORLD in this example) as an argument.

Note

We assume that MPI process 0 runs the server, and the other processes run clients.
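The rank convention in this note can be sketched as a small dispatch helper. This is only an illustration under stated assumptions: dispatch_by_rank, start_server, and start_client are hypothetical names that stand in for code calling run_server and run_client with their real arguments, which are not shown here. The only mpi4py call the helper relies on is Get_rank() on the communicator.

```python
from typing import Any, Callable


def dispatch_by_rank(
    comm: Any,
    start_server: Callable[[Any], None],
    start_client: Callable[[Any], None],
) -> str:
    """Start the server on MPI rank 0 and a client on every other rank.

    `comm` is expected to behave like an mpi4py communicator
    (e.g., MPI.COMM_WORLD). `start_server` and `start_client` are
    hypothetical wrappers; in a real script they would call
    run_server and run_client with the communicator and the
    remaining arguments those functions require.
    """
    rank = comm.Get_rank()
    if rank == 0:
        # Rank 0 is reserved for the server.
        start_server(comm)
        return "server"
    # All remaining ranks run clients.
    start_client(comm)
    return "client"
```

In a script launched with mpiexec, the helper would typically be called once per process, with MPI.COMM_WORLD (imported from mpi4py) passed as comm.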

Note

mpiexec may need an additional argument to enable CUDA support: --mca opal_cuda_support 1