Parallel Groups

When systems are added to a ParallelGroup, they will be executed in parallel, assuming that the ParallelGroup is given an MPI communicator of sufficient size. Adding subsystems to a ParallelGroup is no different than adding them to a normal Group. For example:

from openmdao.api import Problem, IndepVarComp, ParallelGroup, ExecComp

prob = Problem()
model = prob.model

model.add_subsystem('p1', IndepVarComp('x', 1.0))
model.add_subsystem('p2', IndepVarComp('x', 1.0))

parallel = model.add_subsystem('parallel', ParallelGroup())
parallel.add_subsystem('c1', ExecComp(['y=-2.0*x']))
parallel.add_subsystem('c2', ExecComp(['y=5.0*x']))

model.add_subsystem('c3', ExecComp(['y=3.0*x1+7.0*x2']))

model.connect("parallel.c1.y", "c3.x1")
model.connect("parallel.c2.y", "c3.x2")

model.connect("p1.x", "parallel.c1.x")
model.connect("p2.x", "parallel.c2.x")

prob.setup(check=False, mode='fwd')
prob.set_solver_print(level=0)
prob.run_model()

print(prob['c3.y'])
(rank 0) [ 29.]
(rank 1) [ 29.]

In this example, components c1 and c2 will be executed in parallel, provided that the ParallelGroup is given two MPI processes. If the Python file containing our example were named my_par_model.py, we could run it under MPI with two processes using the following command:

mpirun -n 2 python my_par_model.py

Note

This will only work if you’ve installed the mpi4py and petsc4py Python packages, which OpenMDAO does not install by default.

In the previous example, both components in the ParallelGroup required just a single MPI process, but what happens if we want to add subsystems to a ParallelGroup that have other processor requirements? In OpenMDAO, we control process allocation behavior by setting the min_procs, max_procs, and/or proc_weight args when we call the add_subsystem function to add a particular subsystem to a ParallelGroup.

Group.add_subsystem(name, subsys, promotes=None, promotes_inputs=None, promotes_outputs=None, min_procs=1, max_procs=None, proc_weight=1.0)

Add a subsystem.

Parameters:
name : str

Name of the subsystem being added.

subsys : <System>

An instantiated, but not-yet-set-up, System object.

promotes : iter of (str or tuple), optional

A list of variable names specifying which subsystem variables to ‘promote’ up to this group. If an entry is a tuple of the form (old_name, new_name), this will rename the variable in the parent group.

promotes_inputs : iter of (str or tuple), optional

A list of input variable names specifying which subsystem input variables to ‘promote’ up to this group. If an entry is a tuple of the form (old_name, new_name), this will rename the variable in the parent group.

promotes_outputs : iter of (str or tuple), optional

A list of output variable names specifying which subsystem output variables to ‘promote’ up to this group. If an entry is a tuple of the form (old_name, new_name), this will rename the variable in the parent group.

min_procs : int

Minimum number of MPI processes usable by the subsystem. Defaults to 1.

max_procs : int or None

Maximum number of MPI processes usable by the subsystem. A value of None (the default) indicates there is no maximum limit.

proc_weight : float

Weight given to the subsystem when allocating available MPI processes to all subsystems. Default is 1.0.

Returns:
<System>

The subsystem that was passed in. This is returned so that users can instantiate and add a subsystem in a single call and still get a reference back to it.

If you use both min_procs/max_procs and proc_weight, the resulting process allocation can be harder to predict, so you may want to stick to one or the other. When the number of processes is greater than or equal to the number of subsystems, the allocation algorithm first assigns each subsystem its min_procs. It then distributes any remaining processes to subsystems based on their weights, taking care not to exceed their specified max_procs, if any.
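The allocation described above can be illustrated with a short pure-Python sketch. The function name and data layout here are hypothetical, chosen for illustration only; this is not OpenMDAO's actual implementation:

```python
def allocate_procs(subsystems, nprocs):
    """Sketch of weight-based process allocation (hypothetical helper,
    not OpenMDAO's internal code).

    subsystems: list of (name, min_procs, max_procs, proc_weight) tuples,
    where max_procs may be None to mean "no limit".
    Assumes nprocs >= len(subsystems).
    Returns a dict mapping subsystem name -> allocated process count.
    """
    # Start by giving every subsystem its min_procs.
    alloc = {name: min_p for name, min_p, _, _ in subsystems}
    remaining = nprocs - sum(alloc.values())

    # Hand out the remaining processes one at a time, favoring the
    # subsystem furthest below its weighted share, while respecting
    # any max_procs cap.
    total_weight = sum(w for _, _, _, w in subsystems)
    while remaining > 0:
        best = None
        for name, _, max_p, weight in subsystems:
            if max_p is not None and alloc[name] >= max_p:
                continue  # this subsystem is at its cap
            deficit = weight / total_weight * nprocs - alloc[name]
            if best is None or deficit > best[1]:
                best = (name, deficit)
        if best is None:
            break  # every subsystem is capped; leftover procs stay idle
        alloc[best[0]] += 1
        remaining -= 1
    return alloc

# Two equally weighted subsystems on 4 processes get 2 each.
print(allocate_procs([('c1', 1, None, 1.0), ('c2', 1, None, 1.0)], 4))
# -> {'c1': 2, 'c2': 2}
```

Note how a max_procs cap on one subsystem redirects the surplus to the others, which matches the behavior described above.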

If the number of processes is less than the number of subsystems, then each subsystem, starting with the one with the highest proc_weight, is assigned to the least-loaded process. In this case, an exception is raised if any subsystem has a min_procs value greater than one.
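This second case amounts to a greedy packing pass. A minimal sketch, again with a hypothetical helper name rather than OpenMDAO's internals:

```python
def pack_subsystems(subsystems, nprocs):
    """Sketch of the fewer-processes-than-subsystems case (hypothetical
    helper, not OpenMDAO's internal code).

    subsystems: list of (name, min_procs, proc_weight) tuples.
    Returns a list of nprocs lists, each holding the names of the
    subsystems assigned to that process.
    """
    # A min_procs > 1 cannot be honored when subsystems must share ranks.
    for name, min_p, _ in subsystems:
        if min_p > 1:
            raise RuntimeError(
                f"subsystem '{name}' requires min_procs={min_p}, "
                "but processes must be shared")

    procs = [[] for _ in range(nprocs)]
    loads = [0.0] * nprocs
    # Assign heaviest subsystems first, each to the least-loaded process.
    for name, _, weight in sorted(subsystems, key=lambda s: -s[2]):
        idx = loads.index(min(loads))
        procs[idx].append(name)
        loads[idx] += weight
    return procs

# Three subsystems on 2 processes: the heaviest gets its own rank.
print(pack_subsystems([('a', 1, 3.0), ('b', 1, 1.0), ('c', 1, 1.0)], 2))
# -> [['a'], ['b', 'c']]
```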