Operations Research Models and Methods
 
Computation Section
Subunit Dynamic Programming Solver

 

The Dynamic Programming Solver add-in solves several kinds of problems involving state-based systems: Deterministic Dynamic Programming (DDP), Stochastic Dynamic Programming (MDP), and Discrete Time Markov Chains (DTMC). Continuous Time Markov Chains (CTMC) are analyzed with the Markov Analysis add-in. We use the acronym MDP, for Markov Decision Process, to refer to stochastic dynamic programming because MDP problems are modeled and solved with that method.

 

The Dynamic Programming Solver (DP Solver) add-in provides the menu shown on the left. It can be called to build models directly, as shown on these pages. The DP Models add-in also uses the DP Solver add-in to find solutions; in that case the data for the solver is loaded automatically and is ready for solution.

 

 

Selecting the Build Model command from the menu presents the dialog on the left. Choose the appropriate model type at the top and the structure of the model at the bottom.

The data for the problem is described entirely on the worksheet created by the DP Solver add-in. The MDP and DTMC models require transition probabilities and costs, and the structure specifies how these are stored on the worksheet. Three data structures are available. The Transition List option stores all transitions in a list whose length equals the number of transitions. The primary limit on the size of this representation is the number of rows of the Excel worksheet, which is very large, especially in Excel 2007; size limits have not been tested for Excel 2007.
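The sketch below suggests what one row of a transition list holds. The field names are illustrative assumptions, not necessarily the add-in's exact column headings.

from dataclasses import dataclass

@dataclass
class Transition:
    state: int        # current state index
    action: int       # action taken in that state (a single action for a DTMC)
    next_state: int   # state reached when the transition occurs
    prob: float       # transition probability
    cost: float       # cost incurred by the transition

# Two states and one action stored as a flat list: the list grows by one row
# per transition, so the worksheet row count is the size limit.
transitions = [
    Transition(0, 0, 0, 0.8, 10.0),
    Transition(0, 0, 1, 0.2, 25.0),
    Transition(1, 0, 0, 0.5,  5.0),
    Transition(1, 0, 1, 0.5,  0.0),
]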

The Probability/Cost Matrix option stores transition probabilities and transition costs in matrices. For an MDP the number of rows in the model equals the number of states multiplied by the number of actions, and the number of columns equals twice the number of states. Practical MDP models may have many states, so the number of rows and columns allowed by Excel is limiting. Since a worksheet has far more rows than columns, the number of columns is the primary limitation. Excel 2003 allows 256 columns, so models are limited to about 100 states. Excel 2007 allows much larger models.

The Probability/No Cost Matrix option represents transition costs by a list rather than a matrix. In Excel 2003 models are limited to about 200 states. Excel 2007 allows larger models because its worksheets have more columns.
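The rough sketch below illustrates the dimension arithmetic for the two matrix structures. The arrangement is an assumption for illustration, not the add-in's exact worksheet format, and the single cost entry per state-action row in the No Cost variant is likewise assumed.

import numpy as np

S, A = 3, 2                   # states, actions
P = np.zeros((S * A, S))      # P[s*A + a, s2] = Pr{next state s2 | state s, action a}
C = np.zeros((S * A, S))      # Probability/Cost Matrix: a cost column per next state
cost_list = np.zeros(S * A)   # Probability/No Cost Matrix: costs kept in a list instead

# The matrix form needs 2*S data columns; Excel 2003's 256-column limit then
# caps models near 100 states once label columns are counted, while Excel 2007
# allows far more columns.
print(P.shape, C.shape, cost_list.shape)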

 

[Figure: DP value dialog]

The dialog on the left appears when you click the Solve button on the Solver worksheet. We use it here to illustrate the options available in the DP Solver add-in; each option is discussed on the following pages. You may not be acquainted with all of the methods, so each page provides some introduction.

The first field accepts the number of iterations of the chosen solution method. The model represents a finite horizon problem when the Fixed Horizon button is checked, or an infinite horizon problem when it is not. When the Fix Policy button is unchecked, the algorithm searches for the optimum policy; when it is checked, the algorithm evaluates a given policy.
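The sketch below shows what one iteration of a cost-minimizing model might compute, reusing the hypothetical P, C, S, A layout sketched above; it illustrates the recursion rather than the add-in's own code. With Fixed Horizon checked, running N such iterations produces optimal N-stage values by backward recursion; with Fix Policy checked, the minimum over actions is replaced by the single action the given policy prescribes.

import numpy as np

def iterate(values, P, C, S, A, policy=None):
    """One backward-recursion step: returns new state values and chosen actions."""
    new_values = np.empty(S)
    chosen = np.empty(S, dtype=int)
    for s in range(S):
        acts = list(range(A)) if policy is None else [policy[s]]
        # expected transition cost plus expected value of the next state
        q = [np.dot(P[s * A + a], C[s * A + a] + values) for a in acts]
        k = int(np.argmin(q))
        new_values[s], chosen[s] = q[k], acts[k]
    return new_values, chosen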

Two solution methods are available: the policy method and the value method. The criterion depends on whether or not the problem is discounted. For a discounted problem a nonzero interest rate is provided, and the criterion is the discounted present worth. When the interest rate is zero, the criterion is either the total value (the discounted value with a zero interest rate) or the average value per step.
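The following is a hedged sketch of the policy method for the discounted criterion, again assuming the hypothetical P, C, S, A layout above; it is an illustration, not the add-in's implementation. Each pass evaluates the current policy exactly by solving linear equations and then improves it, whereas the value method simply repeats a step like iterate() above.

import numpy as np

def policy_iteration(P, C, S, A, interest_rate, max_passes=50):
    beta = 1.0 / (1.0 + interest_rate)          # discount factor from the interest rate
    policy = np.zeros(S, dtype=int)
    for _ in range(max_passes):
        # evaluation: solve v = c_pi + beta * P_pi v for the current policy
        rows = [s * A + policy[s] for s in range(S)]
        P_pi = P[rows]
        c_pi = np.array([np.dot(P[r], C[r]) for r in rows])
        v = np.linalg.solve(np.eye(S) - beta * P_pi, c_pi)
        # improvement: choose the best action against the evaluated values
        new_policy = np.array([
            int(np.argmin([np.dot(P[s * A + a], C[s * A + a] + beta * v)
                           for a in range(A)]))
            for s in range(S)])
        if np.array_equal(new_policy, policy):
            break                               # the policy cannot be improved
        policy = new_policy
    return policy, v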

There are three options for showing intermediate results: all iterations, only the last iteration or no iterations.

The iterations may begin from initial values or may continue from a previously computed solution. The error correction method is used to speed convergence of the values for an infinite horizon problem. The Stop Error field holds the error estimate at which the value iterations terminate.
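The snippet below sketches a stopping test of the kind controlled by the Stop Error field: value iterations for an infinite horizon problem stop once an error estimate built from the successive value changes falls below the tolerance. The span of the changes is one common estimate; the add-in's exact formula may differ.

import numpy as np

def converged(old_values, new_values, stop_error):
    change = np.asarray(new_values) - np.asarray(old_values)
    span = change.max() - change.min()   # spread of the per-state value changes
    return span <= stop_error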

 

In this section we illustrate the DP Solver add-in with a small problem on the Example page. Small problems can be entered directly into the worksheet created by the add-in, and most of the computations are performed by equations directly on the worksheet. The DP Solver dialogs are explained on the Build Models page, and the different solution methods are explained on the remaining pages of this section.

A good book on dynamic programming would be helpful for the beginning student. I have used several, including books by Ron Howard (Dynamic Programming and Markov Processes, M.I.T. Press, 1960), Sheldon Ross (Introduction to Stochastic Dynamic Programming, Academic Press, 1983), and Dimitri Bertsekas (Dynamic Programming and Optimal Control, Vol. 2, Athena Scientific, 2007).

 

  

Operations Research Models and Methods
Internet
by Paul A. Jensen
Copyright 2004 - All rights reserved
