中国大学MOOC: Dynamic programming has two optimization strategies: one is ______________, and the other is policy iteration.
Subjective question