This feature requires MPI and may not run on Colab.
When systems are added to a
ParallelGroup, they will be executed in parallel, assuming that the
ParallelGroup is given an MPI communicator of sufficient size. Adding subsystems to a ParallelGroup is no different than adding them to a normal Group. For example:
    %%px

    import openmdao.api as om

    prob = om.Problem()
    model = prob.model

    model.set_input_defaults('x', 1.)

    parallel = model.add_subsystem('parallel', om.ParallelGroup(),
                                   promotes_inputs=[('c1.x', 'x'), ('c2.x', 'x'),
                                                    ('c3.x', 'x'), ('c4.x', 'x')])
    parallel.add_subsystem('c1', om.ExecComp(['y=-2.0*x']))
    parallel.add_subsystem('c2', om.ExecComp(['y=5.0*x']))
    parallel.add_subsystem('c3', om.ExecComp(['y=-3.0*x']))
    parallel.add_subsystem('c4', om.ExecComp(['y=4.0*x']))

    model.add_subsystem('c5', om.ExecComp(['y=3.0*x1 + 7.0*x2 - 2.0*x3 + x4']))

    model.connect("parallel.c1.y", "c5.x1")
    model.connect("parallel.c2.y", "c5.x2")
    model.connect("parallel.c3.y", "c5.x3")
    model.connect("parallel.c4.y", "c5.x4")

    prob.setup(check=False, mode='fwd')
    prob.run_model()

    print(prob['c5.y'])
In this example, components c1 through c4 will be executed in parallel, provided that the
ParallelGroup is given four MPI processes. If the name of the Python file containing our example were
my_par_model.py, we could run it under MPI and give it four processes using the following command:
mpirun -n 4 python my_par_model.py
This will only work if you’ve installed the mpi4py and petsc4py python packages, which are not installed by default in OpenMDAO.
In the previous example, all four components in the
ParallelGroup required just a single MPI process, but
what happens if we want to add subsystems that have other processor
requirements to a ParallelGroup?
In OpenMDAO, we control process allocation behavior by setting the
min_procs, max_procs, and/or proc_weight args when we call the
add_subsystem function to add a particular subsystem to a ParallelGroup.
- Group.add_subsystem(name, subsys, promotes=None, promotes_inputs=None, promotes_outputs=None, min_procs=1, max_procs=None, proc_weight=1.0, proc_group=None)
Add a subsystem.
Parameters:
- name : str
Name of the subsystem being added.
- subsys : System
An instantiated, but not-yet-set up system object.
- promotes : iter of (str or tuple), optional
A list of variable names specifying which subsystem variables to ‘promote’ up to this group. If an entry is a tuple of the form (old_name, new_name), this will rename the variable in the parent group.
- promotes_inputs : iter of (str or tuple), optional
A list of input variable names specifying which subsystem input variables to ‘promote’ up to this group. If an entry is a tuple of the form (old_name, new_name), this will rename the variable in the parent group.
- promotes_outputs : iter of (str or tuple), optional
A list of output variable names specifying which subsystem output variables to ‘promote’ up to this group. If an entry is a tuple of the form (old_name, new_name), this will rename the variable in the parent group.
- min_procs : int
Minimum number of MPI processes usable by the subsystem. Defaults to 1.
- max_procs : int or None
Maximum number of MPI processes usable by the subsystem. A value of None (the default) indicates there is no maximum limit.
- proc_weight : float
Weight given to the subsystem when allocating available MPI processes to all subsystems. Default is 1.0.
- proc_group : str or None
Name of a processor group such that any system with that processor group name within the same parent group will be allocated on the same MPI process(es). If this is not None, then any other systems sharing the same proc_group must have identical values of min_procs, max_procs, and proc_weight or an exception will be raised.
Returns:
- System
The subsystem that was passed in. This is returned to enable users to instantiate and add a subsystem at the same time, and get the reference back.
If you use both min_procs/max_procs and
proc_weight, it can become less obvious what the
resulting process allocation will be, so you may want to stick to just using one or the other.
When the number of processes is greater than or equal to the number of
subsystems, the allocation algorithm starts by assigning each subsystem its
min_procs. It then distributes
any remaining processes to subsystems based on their weights, being careful not to exceed their
max_procs, if any.
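To make the interplay of minimums, maximums, and weights concrete, here is a minimal pure-Python sketch of that kind of allocation. It is not OpenMDAO's actual implementation: the function name allocate_procs, the (min_procs, max_procs, proc_weight) tuple layout, and the greedy rule of giving each spare process to the subsystem with the fewest processes per unit of weight are all assumptions made for illustration.

```python
def allocate_procs(nprocs, specs):
    """Illustrative sketch only (hypothetical helper, not OpenMDAO code).

    specs is a list of (min_procs, max_procs, proc_weight) tuples, one per
    subsystem; max_procs may be None for "no limit". Weights are assumed
    to be positive. Returns the number of processes given to each subsystem.
    """
    if nprocs < len(specs):
        raise ValueError("this sketch only covers nprocs >= number of subsystems")

    # Step 1: every subsystem starts with its minimum.
    counts = [min_procs for min_procs, _, _ in specs]
    if sum(counts) > nprocs:
        raise ValueError("not enough processes to satisfy every min_procs")

    # Step 2: hand out the remaining processes one at a time.
    remaining = nprocs - sum(counts)
    while remaining > 0:
        # Subsystems that have not yet reached their max_procs.
        open_idx = [i for i, (_, max_procs, _) in enumerate(specs)
                    if max_procs is None or counts[i] < max_procs]
        if not open_idx:
            break  # everyone is capped; leftover processes go unused here
        # Assumed rule: favor the subsystem with the fewest procs per unit weight.
        target = min(open_idx, key=lambda i: counts[i] / specs[i][2])
        counts[target] += 1
        remaining -= 1
    return counts


# Two subsystems with weights 1 and 3 sharing 8 processes.
print(allocate_procs(8, [(1, None, 1.0), (1, None, 3.0)]))  # [2, 6]
```

In this sketch a max_procs cap simply removes a subsystem from consideration once it is full, so its share flows to the remaining subsystems in proportion to their weights.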
If the number of processes is less than the number of subsystems, then each subsystem, one at a
time, starting with the one with the highest
proc_weight, is allocated to the least-loaded process.
An exception will be raised if any of the subsystems in this case have a
min_procs value greater than one.
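The case with fewer processes than subsystems can be sketched in the same spirit. Again, this is an illustrative approximation rather than OpenMDAO's code: the name assign_to_procs, the (min_procs, proc_weight) tuple layout, and the choice to measure a process's load as the sum of the weights already placed on it are assumptions.

```python
def assign_to_procs(nprocs, specs):
    """Illustrative sketch only (hypothetical helper, not OpenMDAO code).

    specs is a list of (min_procs, proc_weight) tuples, one per subsystem.
    Returns, for each subsystem, the rank of the process it is placed on.
    """
    # A subsystem that needs more than one process cannot share a process.
    for min_procs, _ in specs:
        if min_procs > 1:
            raise ValueError("min_procs > 1 is not allowed when processes "
                             "are fewer than subsystems")

    loads = [0.0] * nprocs            # assumed load metric: sum of weights
    assignment = [None] * len(specs)

    # Visit subsystems from the highest proc_weight to the lowest.
    order = sorted(range(len(specs)), key=lambda i: specs[i][1], reverse=True)
    for i in order:
        rank = loads.index(min(loads))  # least-loaded process so far
        assignment[i] = rank
        loads[rank] += specs[i][1]
    return assignment


# Four subsystems with weights 1, 5, 2, 2 placed on two processes.
print(assign_to_procs(2, [(1, 1.0), (1, 5.0), (1, 2.0), (1, 2.0)]))  # [1, 0, 1, 1]
```

Here the heaviest subsystem gets a process to itself and the three lighter ones share the other, which leaves both processes with a total weight of 5.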