I'm currently working on a project to run a parallelized OpenMDAO optimization problem. The problem I would like to run is the vehicle complex tutorial, http://openmdao.org/releases/0.1.5/docs/user-guide/example.html. I will be running it on a cluster of 4 Raspberry Pi machines. The Pis are talking to each other properly and each can run OpenMDAO correctly, but I'm not sure how to allocate resources to them through the resource allocators. I have looked at the API for ClusterAllocator, but I'm not sure how to use it properly. So, I have one main question:
What are the steps required to set up the resource allocator and use it to allocate tasks to different machines?
Any inputs are greatly appreciated. Thank you very much.
asked 14 Nov '14, 00:29
Wow, you either have the best timing ever or the worst! I'm not sure which. First, have you seen this recent post I made about upcoming changes to OpenMDAO? If not, please take a moment to read it. It has some information that is relevant to you.
Second, just the other day we were talking about totally rewriting this car optimization tutorial. The way it is written right now is not very clear, and it could be done much better. So, as a warning, I'm not sure it's the best example of how to do vehicle design with OpenMDAO. We have recently done some other work with time integration and discrete optimization that I wanted to port over to this problem to make it much more interesting and useful. The update to this example is going to happen in December, when we get an intern to work on it.
Given the above two points, I'm not confident it's worth the effort to write a Resource Allocation Manager for your application, because it could be obsolete soon. However, our new MPI capabilities won't be ready for you to test for at least a month. So, in the event that you don't want to wait, I need a bit more information before I can point you in the right direction.
How did you link up the Pis? Are they set up as an MPI cluster, or just as a loose network of computers? Is there some kind of queuing system that needs to be dealt with?
answered 14 Nov '14, 08:03
CADRE actually has some opportunity for parallelization as you described, where different parts would run on different Pis. But getting that to work isn't going to happen in any current release of OpenMDAO. You'll need the MPI capabilities that we're just developing now. We'll be very happy to have you as an early beta tester, but we need a bit more time.
The CADRE problem has 6 independent orbits that it analyzes, which could all be run in parallel. You could easily modify it to have only 4 orbits to match your current hardware. I've never set up a Pi cluster before, but you'll need it to look like a regular cluster interface where you can issue an mpirun command with 4 processors. The CADRE problem would be a decent fit, since each of the parallel sub-problems is fairly large, so the MPI communication overhead should be low relative to the compute time. Since you only have Ethernet connections between the Pis, the MPI overhead will be a bit heavier. Regardless, this is a really neat idea. Once we have our parallel capability up and running, I think we'll be excited to see if you can get this working!
Running things in parallel with CaseIteratorDriver, using the RAM, is possible right now. What you would want to do is pick one of the design variables and remove it from the optimizer. Then pick 4 values for it, and have the CaseIteratorDriver run 4 separate optimizations, one with each of those values. If you're going to take this route, I suggest you get it running in serial first. Second, set it to run in parallel on your local multi-core machine. Lastly, use the RAM to run on the Pis. I think the RAM will be fairly easy to set up. You should be able to use an existing RAM class we have, ClusterAllocator, to allow your OpenMDAO to delegate to the Pis. You could run from one of the Pis, or even from your laptop, and just allocate to the Pis.
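To make the serial-first, then-parallel idea concrete, here is a minimal sketch that uses only the Python 3 standard library. It does not use OpenMDAO, CaseIteratorDriver, or the RAM at all. The toy objective, the golden-section search, and every name in it (objective, optimize_for, run_serial, run_parallel, VALUES) are hypothetical stand-ins for the real vehicle model and optimizer, so treat it as an illustration of the workflow, not of the OpenMDAO API.

```python
import math
from concurrent.futures import ProcessPoolExecutor

# Assumption: four hand-picked values for the design variable that was
# removed from the optimizer (one value per machine or core).
VALUES = [1.0, 2.0, 3.0, 4.0]

INV_PHI = (math.sqrt(5) - 1) / 2  # 1 / golden ratio


def objective(x, fixed):
    """Toy stand-in for the real model: minimum at x == fixed."""
    return (x - fixed) ** 2 + 0.5 * fixed


def golden_section(f, lo, hi, tol=1e-6):
    """Minimize a unimodal 1-D function f on [lo, hi]."""
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - INV_PHI * (b - a)
        d = a + INV_PHI * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2


def optimize_for(fixed):
    """One independent optimization with the removed variable held fixed."""
    x_best = golden_section(lambda x: objective(x, fixed), -10.0, 10.0)
    return fixed, x_best, objective(x_best, fixed)


def run_serial(values):
    """Step 1: run every case one after another."""
    return [optimize_for(v) for v in values]


def run_parallel(values, workers=4):
    """Step 2: same cases, spread over local cores (one process each)."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(optimize_for, values))


if __name__ == "__main__":
    # Start with the serial run; swap in run_parallel(VALUES) once it works.
    for fixed, x_best, f_best in run_serial(VALUES):
        print("fixed=%.1f  x*=%.4f  f*=%.4f" % (fixed, x_best, f_best))
```

Because each case is independent, run_serial and run_parallel should return the same results, which is a useful check before moving the same cases onto the Pis.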
answered 15 Nov '14, 07:53
Got it, that sounds good. I'll go ahead with your suggestions: run the serial version first, then run in parallel on my multicore PC, and finally run on the Pis using the RAM. I just have a couple of questions on CADRE and the RAM delegation.
First, I was looking at the docs for CADRE, and it looks like I need a SNOPT license to run the full CADRE optimization with all the design points. Is that actually necessary, or can I run the optimizations you mentioned above without a SNOPT license? If I don't need it, can I simply run the lines given at http://openmdao-plugins.github.io/CADRE/full.html with the following changes?
Second, to use the ClusterAllocator, I've made the following test.py script. Is this the proper way to create a ClusterAllocator and assign machines to it? Currently, it reports making a connection to pi04, stalls, and then reports that no hosts could be connected. Is there anything you think I'm missing or could improve here?
Thanks a bunch!
answered 15 Nov '14, 15:03