eginning of the next stage (possibly according to a probability distribution). The fortune seeker's decision as to his next destination led him from his current state to the next state on his journey.

Viewed as a network:
• The network would consist of columns of nodes, with each column corresponding to a stage, so that the flow from a node can go only to a node in the next column to the right.
• The links from a node to nodes in the next column correspond to the possible policy decisions on which state to go to next.
• The value assigned to each link usually can be interpreted as the immediate contribution to the objective function from making that policy decision. In most cases, the objective corresponds to finding either the shortest or the longest path through the network.

4. The solution procedure is designed to find an optimal policy for the overall problem, i.e., a prescription of the optimal policy decision at each stage for each of the possible states. For the stagecoach problem, the solution procedure constructed a table for each stage n that prescribed the optimal decision x_n* for each possible state s_n.

Thus, in addition to identifying these optimal solutions (optimal routes) for the overall problem, the results show the fortune seeker how he should proceed if he gets detoured to a state that is not on an optimal route. For any problem, dynamic programming provides this kind of policy prescription of what to do under every possible circumstance (which is why the actual decision made upon reaching a particular state at a given stage is referred to as a policy decision). Providing this additional information beyond simply specifying an optimal solution (an optimal sequence of decisions) can be helpful in a variety of ways, including sensitivity analysis.

5. Given the current state, an optimal policy for the remaining stages is independent of the policy decisions adopted in previous stages. Therefore, the optimal immediate decision depends on only the current state and not on how you got there. This is the principle of optimality for dynamic programming.

Given the state in which the fortune seeker is currently located, the optimal life insurance policy (and its associated route) from this point onward is independent of how he got there. For dynamic programming problems in general, knowledge of the current state of the system conveys all the information about its previous behavior necessary for determining the optimal policy henceforth. Any problem lacking this property cannot be formulated as a dynamic programming problem.

6. The solution procedure begins by finding the optimal policy for the last stage (a backward-recursion sketch of this procedure is given below). The optimal policy for the last stage prescribes the optimal
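The backward, table-building procedure described in points 4–6 can be sketched in a few lines of code. The sketch below is illustrative only: the state names and link costs are assumed stagecoach-style data, not values taken from these slides. Working backwards from the last stage, it records for every state s a cost-to-go f* and an optimal immediate decision x*, which together play the role of the per-stage policy tables.

```python
# Backward recursion for a stagecoach-style shortest-path problem (illustrative sketch).
# Link costs are assumed data: cost[state][next_state] = immediate contribution of that decision.
cost = {
    "A": {"B": 2, "C": 4, "D": 3},   # stage 1 decisions
    "B": {"E": 7, "F": 4, "G": 6},   # stage 2 decisions
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},           # stage 3 decisions
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},                   # stage 4 decisions
    "I": {"J": 4},
}
stages = [["A"], ["B", "C", "D"], ["E", "F", "G"], ["H", "I"]]  # states grouped by stage

f = {"J": 0}    # cost-to-go of the terminal state
policy = {}     # policy[s] = optimal immediate decision x* when in state s

# Point 6: start with the last stage and move backwards one stage at a time.
for states in reversed(stages):
    for s in states:
        # Point 5 (principle of optimality): the best decision from s depends only on s,
        # through the cost-to-go values f already computed for the next stage's states.
        x_star = min(cost[s], key=lambda x: cost[s][x] + f[x])
        policy[s] = x_star
        f[s] = cost[s][x_star] + f[x_star]

# Point 4: the policy prescribes a best decision from *every* state, not only the states
# on an optimal route, so a detoured traveller can still read off the best continuation.
print("minimum total cost:", f["A"])
print("policy table:", policy)

route, s = ["A"], "A"
while s in policy:               # recover one optimal route by following the policy
    s = policy[s]
    route.append(s)
print(" -> ".join(route))
```

For these assumed costs the sketch reports a minimum total cost of 11 and one optimal route A -> C -> E -> H -> J, while the policy table also tells the traveller what to do from any off-route state such as B or G.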