DOEDriver

DOEDriver facilitates performing a design of experiments (DOE) with your OpenMDAO model. It will run your model multiple times with different values for the design variables depending on the selected input generator. A number of generators are available, each with its own parameters that can be specified when it is instantiated:

  • UniformGenerator

  • FullFactorialGenerator

  • PlackettBurmanGenerator

  • BoxBehnkenGenerator

  • LatinHypercubeGenerator

  • CSVGenerator

  • ListGenerator

See the source documentation of these generators for details.

Note

FullFactorialGenerator, PlackettBurmanGenerator, BoxBehnkenGenerator and LatinHypercubeGenerator are provided via the pyDOE3 package, which is an updated version of pyDOE. See the original pyDOE page for information on those algorithms.

The generator instance may be supplied as an argument to the DOEDriver or as an option.

DOEDriver Options

| Option | Default | Acceptable Values | Acceptable Types | Description |
| --- | --- | --- | --- | --- |
| debug_print | [] | ['desvars', 'nl_cons', 'ln_cons', 'objs', 'totals'] | ['list'] | List of what type of Driver variables to print at each iteration. |
| generator | DOEGenerator | N/A | ['DOEGenerator'] | The case generator. If default, no cases are generated. |
| invalid_desvar_behavior | warn | ['warn', 'raise', 'ignore'] | N/A | Behavior of driver if the initial value of a design variable exceeds its bounds. The default value may be set using the `OPENMDAO_INVALID_DESVAR_BEHAVIOR` environment variable to one of the valid options. |
| procs_per_model | 1 | N/A | ['int'] | Number of processors to give each model under MPI. |
| run_parallel | False | [True, False] | ['bool'] | Set to True to execute cases in parallel. |

DOEDriver Constructor

The call signature for the DOEDriver constructor is:

DOEDriver.__init__(generator=None, **kwargs)

Construct a DOEDriver.

Simple Example

UniformGenerator implements the simplest method: it generates a requested number of samples, each drawn at random from a uniform distribution across the valid range of each design variable. This example demonstrates its use with a model built on the Paraboloid Component. An SqliteRecorder is used to capture the cases that were generated. We can see that the model was evaluated at random values of x and y between -10 and 10, per the lower and upper bounds of those design variables.

import openmdao.api as om
from openmdao.test_suite.components.paraboloid import Paraboloid

prob = om.Problem()
model = prob.model

model.add_subsystem('comp', Paraboloid(), promotes=['*'])

model.add_design_var('x', lower=-10, upper=10)
model.add_design_var('y', lower=-10, upper=10)
model.add_objective('f_xy')

prob.driver = om.DOEDriver(om.UniformGenerator(num_samples=5))
prob.driver.add_recorder(om.SqliteRecorder("cases.sql"))

prob.setup()

prob.set_val('x', 0.0)
prob.set_val('y', 0.0)

prob.run_driver()
prob.cleanup()

cr = om.CaseReader(prob.get_outputs_dir() / "cases.sql")
cases = cr.list_cases('driver')

driver
rank0:DOEDriver_Uniform|0
rank0:DOEDriver_Uniform|1
rank0:DOEDriver_Uniform|2
rank0:DOEDriver_Uniform|3
rank0:DOEDriver_Uniform|4

values = []
for case in cases:
    outputs = cr.get_case(case).outputs
    values.append((outputs['x'], outputs['y'], outputs['f_xy']))

print("\n".join(["x: %5.2f, y: %5.2f, f_xy: %6.2f" % xyf for xyf in values]))
x: -3.75, y:  3.19, f_xy:  82.25
x:  3.63, y: -6.11, f_xy: -20.34
x: -6.92, y:  3.78, f_xy: 129.72
x: -0.79, y:  7.45, f_xy: 136.66
x:  6.52, y: -1.37, f_xy:   7.42
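
The sampling idea described above can be sketched in plain Python. This is a minimal illustration and not the UniformGenerator implementation: the helper `uniform_samples` is hypothetical, and the `paraboloid` expression is assumed to be the one behind the f_xy values printed above.

```python
import random


def uniform_samples(bounds, num_samples, seed=None):
    """Draw num_samples cases, each value uniform within its variable's bounds.

    `bounds` maps a design variable name to a (lower, upper) tuple.
    Hypothetical helper for illustration; not part of OpenMDAO.
    """
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lower, upper) for name, (lower, upper) in bounds.items()}
        for _ in range(num_samples)
    ]


def paraboloid(x, y):
    """Assumed model function: f(x, y) = (x - 3)^2 + x*y + (y + 4)^2 - 3."""
    return (x - 3.0) ** 2 + x * y + (y + 4.0) ** 2 - 3.0


# Same bounds and sample count as the example above; the seed only makes
# the sketch repeatable.
samples = uniform_samples({'x': (-10.0, 10.0), 'y': (-10.0, 10.0)},
                          num_samples=5, seed=0)

for case in samples:
    print("x: %5.2f, y: %5.2f, f_xy: %6.2f"
          % (case['x'], case['y'], paraboloid(case['x'], case['y'])))
```

Because the samples are random, the values differ from run to run unless a seed is fixed, which is also why the numbers recorded above will not be reproduced exactly.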

Running a DOE in Parallel

In a parallel processing environment, it is possible for DOEDriver to run cases concurrently. This is done by setting the run_parallel option to True as shown in the following example and running your script using MPI.

Here we are using the FullFactorialGenerator with 3 levels to generate inputs for our Paraboloid model. With two inputs, \(3^2=9\) cases have been generated. In this case we are running on four processors and have specified options['run_parallel']=True to run cases on all available processors. The cases have therefore been split with 3 cases run on the first processor and 2 cases on each of the other processors.
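
The case count and the 3/2/2/2 split described above can be sketched in plain Python, without MPI. The helper `full_factorial` is hypothetical, and the round-robin assignment (case i goes to rank i mod 4) is an assumption: it is consistent with the per-rank output recorded later in this section, but it is not taken from DOEDriver's source.

```python
from itertools import product


def full_factorial(bounds, levels):
    """Return every combination of `levels` evenly spaced values per variable.

    `bounds` maps a design variable name to (lower, upper); levels must be >= 2.
    Hypothetical helper for illustration; not the pyDOE3 implementation.
    """
    names = list(bounds)
    grids = []
    for name in names:
        lower, upper = bounds[name]
        step = (upper - lower) / (levels - 1)
        grids.append([lower + i * step for i in range(levels)])
    # Reversing makes the first variable vary fastest, which is the ordering
    # seen in the recorded cases.
    return [dict(zip(names, reversed(combo)))
            for combo in product(*reversed(grids))]


cases = full_factorial({'x': (0.0, 1.0), 'y': (0.0, 1.0)}, levels=3)
print(len(cases), "cases")

# Assumed round-robin distribution of the cases over four ranks.
num_procs = 4
per_rank = {rank: cases[rank::num_procs] for rank in range(num_procs)}
for rank, assigned in per_rank.items():
    print(rank, [(c['x'], c['y']) for c in assigned])
```

With nine cases and four ranks, rank 0 receives cases 0, 4 and 8, and each of the other ranks receives two.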

Note that, when running in parallel, the SqliteRecorder will generate a separate case file for each processor on which cases are recorded. The case files will have a suffix indicating the recording rank and a message will be displayed indicating the file name, as seen in the example.

Note

This feature requires MPI and may not run on Colab or Binder.

%%px

import openmdao.api as om
from openmdao.test_suite.components.paraboloid import Paraboloid

prob = om.Problem()

prob.model.add_subsystem('comp', Paraboloid(), promotes=['x', 'y', 'f_xy'])
prob.model.add_design_var('x', lower=0.0, upper=1.0)
prob.model.add_design_var('y', lower=0.0, upper=1.0)
prob.model.add_objective('f_xy')

prob.driver = om.DOEDriver(om.FullFactorialGenerator(levels=3))
prob.driver.options['run_parallel'] = True
prob.driver.options['procs_per_model'] = 1

prob.driver.add_recorder(om.SqliteRecorder("cases.sql"))

prob.setup()
prob.run_driver()
prob.cleanup()
[stdout:3] Note: SqliteRecorder is running on multiple processors. Cases from rank 3 are being written to /home/runner/work/OpenMDAO/OpenMDAO/openmdao/docs/problem_out/cases.sql_3.
[stdout:1] Note: SqliteRecorder is running on multiple processors. Cases from rank 1 are being written to /home/runner/work/OpenMDAO/OpenMDAO/openmdao/docs/problem_out/cases.sql_1.
[stdout:2] Note: SqliteRecorder is running on multiple processors. Cases from rank 2 are being written to /home/runner/work/OpenMDAO/OpenMDAO/openmdao/docs/problem_out/cases.sql_2.
[stdout:0] Note: SqliteRecorder is running on multiple processors. Cases from rank 0 are being written to /home/runner/work/OpenMDAO/OpenMDAO/openmdao/docs/problem_out/cases.sql_0.
Note: Metadata is being recorded separately as /home/runner/work/OpenMDAO/OpenMDAO/openmdao/docs/problem_out/cases.sql_meta.
%%px

# check recorded cases from each case file
from mpi4py import MPI
rank = MPI.COMM_WORLD.rank

filename = "cases.sql_%d" % rank

cr = om.CaseReader(prob.get_outputs_dir() / filename)
cases = cr.list_cases('driver', out_stream=None)
print(cases)
[stdout:1] ['rank0:DOEDriver_FullFactorial|0', 'rank0:DOEDriver_FullFactorial|1']
[stdout:2] ['rank0:DOEDriver_FullFactorial|0', 'rank0:DOEDriver_FullFactorial|1']
[stdout:0] ['rank0:DOEDriver_FullFactorial|0', 'rank0:DOEDriver_FullFactorial|1', 'rank0:DOEDriver_FullFactorial|2']
[stdout:3] ['rank0:DOEDriver_FullFactorial|0', 'rank0:DOEDriver_FullFactorial|1']
%%px

values = []
for case in cases:
    outputs = cr.get_case(case).outputs
    values.append((outputs['x'], outputs['y'], outputs['f_xy']))

print("\n"+"\n".join(["x: %5.2f, y: %5.2f, f_xy: %6.2f" % xyf for xyf in values]))
[stdout:0] 
x:  0.00, y:  0.00, f_xy:  22.00
x:  0.50, y:  0.50, f_xy:  23.75
x:  1.00, y:  1.00, f_xy:  27.00
[stdout:2] 
x:  1.00, y:  0.00, f_xy:  17.00
x:  0.00, y:  1.00, f_xy:  31.00
[stdout:1] 
x:  0.50, y:  0.00, f_xy:  19.25
x:  1.00, y:  0.50, f_xy:  21.75
[stdout:3] 
x:  0.00, y:  0.50, f_xy:  26.25
x:  0.50, y:  1.00, f_xy:  28.75

Running a DOE in Parallel with a Parallel Model

If the model that is being subjected to the DOE is also parallel, then the total number of processors should reflect the model size as well as the desired concurrency.

To illustrate this, we will perform a DOE on a model based on the ParallelGroup example:

%%px

class FanInGrouped(om.Group):
    """
    Topology where two components in a Group feed a single component
    outside of that Group.
    """

    def __init__(self):
        super().__init__()

        self.set_input_defaults('x1', 1.0)
        self.set_input_defaults('x2', 1.0)

        self.sub = self.add_subsystem('sub', om.ParallelGroup(),
                                      promotes_inputs=['x1', 'x2'])

        self.sub.add_subsystem('c1', om.ExecComp(['y=-2.0*x']),
                               promotes_inputs=[('x', 'x1')])
        self.sub.add_subsystem('c2', om.ExecComp(['y=5.0*x']),
                               promotes_inputs=[('x', 'x2')])

        self.add_subsystem('c3', om.ExecComp(['y=3.0*x1+7.0*x2']))

        self.connect("sub.c1.y", "c3.x1")
        self.connect("sub.c2.y", "c3.x2")

In this case, the model itself requires two processors, so in order to run cases concurrently we need to allocate at least four processors in total. We can allocate as many processors as we have available; however, the number of processors must be a multiple of the number of processors per model, which is 2 here. Regardless of how many processors we allocate, we need to tell the DOEDriver that the model needs 2 processors, which is done by specifying options['procs_per_model']=2. From this, the driver determines how many models it can run in parallel, which with four processors is also 2.

The SqliteRecorder will record cases on the first two processors, which serve as the “root” processors for the parallel cases.
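
The processor arithmetic above can be written out as a small sketch. The function `doe_layout` is hypothetical and not part of OpenMDAO; raising `ValueError` for a processor count that is not a multiple of `procs_per_model` is an illustrative choice, not a description of how the driver itself reacts.

```python
def doe_layout(comm_size, procs_per_model):
    """Return (num_models, recording_ranks) for a parallel DOE.

    num_models is how many model instances can run concurrently;
    recording_ranks are the global ranks assumed to write case files
    (the first num_models ranks, as described in the text).
    """
    if comm_size % procs_per_model != 0:
        raise ValueError(
            "comm_size (%d) must be a multiple of procs_per_model (%d)"
            % (comm_size, procs_per_model)
        )
    num_models = comm_size // procs_per_model
    recording_ranks = list(range(num_models))
    return num_models, recording_ranks


# Four processors with a two-processor model: two concurrent models,
# with case files written by ranks 0 and 1.
print(doe_layout(4, 2))

# Eight processors would allow four concurrent models.
print(doe_layout(8, 2))
```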

%%px

import openmdao.api as om

prob = om.Problem(FanInGrouped())

prob.model.add_design_var('x1', lower=0.0, upper=1.0)
prob.model.add_design_var('x2', lower=0.0, upper=1.0)
prob.model.add_objective('c3.y')

prob.driver = om.DOEDriver(om.FullFactorialGenerator(levels=3))
prob.driver.add_recorder(om.SqliteRecorder("cases.sql"))

# the FanInGrouped model uses 2 processes, so we can run
# two instances of the model at a time, each using 2 of our 4 procs
prob.driver.options['run_parallel'] = True
prob.driver.options['procs_per_model'] = procs_per_model = 2

prob.setup()
prob.run_driver()
prob.cleanup()

# a separate case file will be written by rank 0 of each parallel model
# (the first two global ranks)
rank = prob.comm.rank

num_models = prob.comm.size // procs_per_model

if rank < num_models:
    filename = "cases.sql_%d" % rank

    cr = om.CaseReader(prob.get_outputs_dir() / filename)
    cases = cr.list_cases('driver', out_stream=None)

    values = []
    for case in cases:
        outputs = cr.get_case(case).outputs
        values.append((outputs['x1'], outputs['x2'], outputs['c3.y']))

    print("\n"+"\n".join(["x1: %5.2f, x2: %5.2f, c3.y: %6.2f" % (x1, x2, y) for x1, x2, y in values]))
[stdout:1] Note: SqliteRecorder is running on multiple processors. Cases from rank 1 are being written to /home/runner/work/OpenMDAO/OpenMDAO/openmdao/docs/problem2_out/cases.sql_1.

x1:  0.50, x2:  0.00, c3.y:  -3.00
x1:  0.00, x2:  0.50, c3.y:  17.50
x1:  1.00, x2:  0.50, c3.y:  11.50
x1:  0.50, x2:  1.00, c3.y:  32.00
[stdout:0] Note: SqliteRecorder is running on multiple processors. Cases from rank 0 are being written to /home/runner/work/OpenMDAO/OpenMDAO/openmdao/docs/problem2_out/cases.sql_0.
Note: Metadata is being recorded separately as /home/runner/work/OpenMDAO/OpenMDAO/openmdao/docs/problem2_out/cases.sql_meta.

x1:  0.00, x2:  0.00, c3.y:   0.00
x1:  1.00, x2:  0.00, c3.y:  -6.00
x1:  0.50, x2:  0.50, c3.y:  14.50
x1:  0.00, x2:  1.00, c3.y:  35.00
x1:  1.00, x2:  1.00, c3.y:  29.00

Using Prepared Cases

If you have a previously generated set of cases that you want to run using DOEDriver, there are a couple of ways to do that. The first is to provide those inputs via an external file in the CSV (comma separated values) format. The file should be organized with one column per design variable, with the first row containing the names of the design variables. The following example demonstrates how to use such a file to run a DOE using the CSVGenerator:

import openmdao.api as om
from openmdao.test_suite.components.paraboloid import Paraboloid

prob = om.Problem()
model = prob.model

model.add_subsystem('comp', Paraboloid(), promotes=['x', 'y', 'f_xy'])

model.add_design_var('x', lower=0.0, upper=1.0)
model.add_design_var('y', lower=0.0, upper=1.0)
model.add_objective('f_xy')

prob.setup()

prob.set_val('x', 0.0)
prob.set_val('y', 0.0)

# this file contains design variable inputs in CSV format
with open('saved_cases.csv', 'r') as f:
    print(f.read())
 x ,   y
0.0,  0.0
0.5,  0.0
1.0,  0.0
0.0,  0.5
0.5,  0.5
1.0,  0.5
0.0,  1.0
0.5,  1.0
1.0,  1.0

# run problem with DOEDriver using the CSV file
prob.driver = om.DOEDriver(om.CSVGenerator('saved_cases.csv'))
prob.driver.add_recorder(om.SqliteRecorder("cases.sql"))

prob.run_driver()
prob.cleanup()

cr = om.CaseReader(prob.get_outputs_dir() / "cases.sql")
cases = cr.list_cases('driver', out_stream=None)

values = []
for case in cases:
    outputs = cr.get_case(case).outputs
    values.append((outputs['x'], outputs['y'], outputs['f_xy']))

print("\n".join(["x: %5.2f, y: %5.2f, f_xy: %6.2f" % xyf for xyf in values]))
x:  0.00, y:  0.00, f_xy:  22.00
x:  0.50, y:  0.00, f_xy:  19.25
x:  1.00, y:  0.00, f_xy:  17.00
x:  0.00, y:  0.50, f_xy:  26.25
x:  0.50, y:  0.50, f_xy:  23.75
x:  1.00, y:  0.50, f_xy:  21.75
x:  0.00, y:  1.00, f_xy:  31.00
x:  0.50, y:  1.00, f_xy:  28.75
x:  1.00, y:  1.00, f_xy:  27.00

The second method is to provide the data directly as a list of cases, where each case is a collection of name/value pairs for the design variables. You might use this method if you want to generate the cases programmatically via another algorithm or if the data is available in some format other than a CSV file and you can reformat it into this simple list structure. The DOEGenerator you would use in this case is the ListGenerator, but if you pass a list directly to the DOEDriver it will construct the ListGenerator for you. In the following example, a set of cases has been pre-generated and saved in JSON (JavaScript Object Notation) format. The data is decoded and provided to the DOEDriver as a list:

import json

# load design variable inputs from JSON file and decode into list
with open('cases.json', 'r') as f:
    json_data = f.read()

print(json_data)
[[["x", [0.0]], ["y", [0.0]]],
 [["x", [0.5]], ["y", [0.0]]],
 [["x", [1.0]], ["y", [0.0]]],
 [["x", [0.0]], ["y", [0.5]]],
 [["x", [0.5]], ["y", [0.5]]],
 [["x", [1.0]], ["y", [0.5]]],
 [["x", [0.0]], ["y", [1.0]]],
 [["x", [0.5]], ["y", [1.0]]],
 [["x", [1.0]], ["y", [1.0]]]]

# create DOEDriver using provided list of cases
case_list = json.loads(json_data)
prob.driver = om.DOEDriver(case_list)

prob.driver.add_recorder(om.SqliteRecorder("cases.sql"))

prob.run_driver()
prob.cleanup()

# check the recorded cases
cr = om.CaseReader(prob.get_outputs_dir() / "cases.sql")
cases = cr.list_cases('driver', out_stream=None)

values = []
for case in cases:
    outputs = cr.get_case(case).outputs
    values.append((outputs['x'], outputs['y'], outputs['f_xy']))

print("\n".join(["x: %5.2f, y: %5.2f, f_xy: %6.2f" % xyf for xyf in values]))
/usr/share/miniconda/envs/test/lib/python3.11/site-packages/openmdao/recorders/sqlite_recorder.py:230: UserWarning:The existing case recorder file, /home/runner/work/OpenMDAO/OpenMDAO/openmdao/docs/openmdao_book/features/building_blocks/drivers/problem2_out/cases.sql, is being overwritten.
x:  0.00, y:  0.00, f_xy:  22.00
x:  0.50, y:  0.00, f_xy:  19.25
x:  1.00, y:  0.00, f_xy:  17.00
x:  0.00, y:  0.50, f_xy:  26.25
x:  0.50, y:  0.50, f_xy:  23.75
x:  1.00, y:  0.50, f_xy:  21.75
x:  0.00, y:  1.00, f_xy:  31.00
x:  0.50, y:  1.00, f_xy:  28.75
x:  1.00, y:  1.00, f_xy:  27.00
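
The two prepared-case layouts carry the same information, so one can be converted into the other. The sketch below uses only the standard library; the helper `csv_to_case_list` is hypothetical, and stripping whitespace from the header names is an assumption made for this illustration, not a statement about how CSVGenerator parses files.

```python
import csv
import io
import json

# A shortened copy of the CSV layout shown earlier: one column per design
# variable, with the variable names in the first row.
csv_text = """ x ,   y
0.0,  0.0
0.5,  0.0
1.0,  0.0
"""


def csv_to_case_list(text):
    """Convert CSV text into a list of cases of [name, [value]] pairs."""
    reader = csv.reader(io.StringIO(text))
    names = [name.strip() for name in next(reader)]
    case_list = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        case_list.append([[name, [float(value)]]
                          for name, value in zip(names, row)])
    return case_list


case_list = csv_to_case_list(csv_text)

# Serialize in the same nested-list form as the JSON file above.
print(json.dumps(case_list))
```

A list built this way has the same shape as the one decoded from the JSON file in the example above.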

Note

When using pre-generated cases via CSVGenerator or ListGenerator, the declared bounds on a design variable are not enforced, as they are with the algorithmic generators.
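
Since the bounds are not enforced for prepared cases, it may be worth checking them before handing the list to the driver. The following is a minimal sketch under that assumption; the helper `out_of_bounds` and the sample data are hypothetical and not part of OpenMDAO.

```python
def out_of_bounds(case_list, bounds):
    """Return (case_index, name, value) for every value outside its bounds.

    `case_list` uses the list-of-cases layout shown above, where each case
    is a list of [name, [values]] pairs; `bounds` maps name -> (lower, upper).
    """
    violations = []
    for index, case in enumerate(case_list):
        for name, values in case:
            lower, upper = bounds[name]
            for value in values:
                if not lower <= value <= upper:
                    violations.append((index, name, value))
    return violations


bounds = {'x': (0.0, 1.0), 'y': (0.0, 1.0)}

# The second case deliberately violates both bounds.
case_list = [
    [['x', [0.5]], ['y', [0.0]]],
    [['x', [1.5]], ['y', [-0.2]]],
]

print(out_of_bounds(case_list, bounds))
```

An empty result means every prepared value lies within the declared bounds.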