Sun Yat-Sen Management Review  1993/9
Vol. 1, No. 1, pp. 153-175
Graduate School of Resources Management, National Defense Management College
This study examines the optimal inspection-repair-replacement policy for a discrete-time partially observable Markov decision process (POMDP) over an infinite horizon. The state space is finite, and the action space consists of "no action, inspection at the beginning of the period, and instantaneous repair and replacement at the beginning of the period." Inspection incurs an additional cost but reveals the exact state of the system. Unlike replacement, repair does not guarantee that the system returns to an as-good-as-new state. First, we construct the recursion that maximizes the expected total discounted reward. Key results are derived under two partial orders, stochastic dominance and the monotone likelihood ratio, together with the condition of total positivity of order two. These results show that the optimal policy has a structure that partitions the space of state probability vectors into at most five regions. Next, we present comparative results for models with two alternative action spaces: "no action, instantaneous inspection and repair at the end of the period, instantaneous replacement at the beginning of the period" and "no action, inspection at the beginning of the period, instantaneous repair and replacement at the end of the period." Finally, several related studies are summarized for further consideration.
Keywords: partially observable Markov decision processes (POMDPs), stochastic dominance, monotone likelihood ratio, totally positive of order two, inspection-repair-replacement policy.
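The ordering conditions named in the abstract can be illustrated numerically. The sketch below is not taken from the paper: the three-state deterioration matrix `P`, the belief vectors `b1` and `b2`, and the helper names `is_tp2`, `mlr_leq`, and `predict` are all hypothetical, chosen only to show what "totally positive of order two" and "monotone likelihood ratio" mean for state probability vectors. It checks that every 2x2 minor of `P` is nonnegative, that `b1` precedes `b2` in the monotone likelihood ratio order, and that this order survives one unobserved transition, which is a standard property of totally positive of order two matrices.

```python
from itertools import combinations


def is_tp2(P, tol=1e-12):
    """Return True if every 2x2 minor of matrix P is nonnegative (TP2)."""
    n_rows, n_cols = len(P), len(P[0])
    for i, k in combinations(range(n_rows), 2):      # row pairs with i < k
        for j, l in combinations(range(n_cols), 2):  # column pairs with j < l
            if P[i][j] * P[k][l] - P[i][l] * P[k][j] < -tol:
                return False
    return True


def mlr_leq(x, y, tol=1e-12):
    """Return True if x <= y in the monotone likelihood ratio order.

    Equivalent to y[j] / x[j] being nondecreasing in j, written without
    division as x[i] * y[j] >= x[j] * y[i] for all i < j.
    """
    return all(
        x[i] * y[j] - x[j] * y[i] >= -tol
        for i, j in combinations(range(len(x)), 2)
    )


def predict(belief, P):
    """Propagate a belief one period with no observation: belief times P."""
    return [
        sum(belief[i] * P[i][j] for i in range(len(belief)))
        for j in range(len(P[0]))
    ]


# Hypothetical deterioration chain: state 0 = as good as new, state 2 = failed.
P = [
    [0.7, 0.2, 0.1],
    [0.0, 0.6, 0.4],
    [0.0, 0.0, 1.0],
]

# Hypothetical beliefs: b2 puts more weight on the worse states than b1.
b1 = [0.6, 0.3, 0.1]
b2 = [0.2, 0.3, 0.5]

tp2_ok = is_tp2(P)
ordered_before = mlr_leq(b1, b2)
ordered_after = mlr_leq(predict(b1, P), predict(b2, P))
```

Under these assumed numbers, `tp2_ok`, `ordered_before`, and `ordered_after` all evaluate to `True`. Order preservation of this kind is what allows monotonicity arguments on the space of state probability vectors; the paper's own conditions and proofs may be stated differently.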