3.2 Goals and Rewards

风油精

2020-03-04 12:42:42

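The more I read about RL, the more it seems to be talking about life (well, life as seen by a 22-year-old). All that an independent will can really take hold of is one action after another in the present moment.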

Newcomers to reinforcement learning are sometimes surprised that the rewards -- which define the goal of learning -- are computed in the environment rather than in the agent. Certainly most ultimate goals for animals are recognized by computations occurring inside their bodies: by sensors for recognizing food, hunger, pain and pleasure. Nevertheless, as we discussed in the previous section, one can redraw the agent-environment interface in such a way that these parts of the body are considered to be outside of the agent. ... For our purpose, it is convenient to place the boundary of the learning agent not at the limit of its physical body, but at the limit of its control.

Quoted from page 57
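
To make the boundary in the quote concrete for myself, here is a minimal sketch in Python. Everything in it is my own assumption for illustration, not something from the book: the five-state corridor, the class names `Environment` and `Agent`, and the reward of 1.0 at the goal. The only point is where the reward is computed: inside `Environment.step`, which the agent cannot modify. The agent controls nothing but its choice of action.

```python
import random


class Environment:
    """Hypothetical 1-D corridor with states 0..size-1; the goal is the last state."""

    def __init__(self, size=5):
        self.size = size
        self.state = 0

    def step(self, action):
        # action is -1 (move left) or +1 (move right); walls clamp the position.
        self.state = max(0, min(self.size - 1, self.state + action))
        done = self.state == self.size - 1
        # The reward is computed here, on the environment side of the boundary.
        reward = 1.0 if done else 0.0
        return self.state, reward, done


class Agent:
    """Random policy: the agent only chooses actions and never computes a reward."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice([-1, 1])


def run_episode(env, agent, max_steps=1000):
    """Run one episode and return the total reward handed over by the environment."""
    state, total = env.state, 0.0
    for _ in range(max_steps):
        state, reward, done = env.step(agent.act(state))
        total += reward
        if done:
            break
    return total
```

Calling `run_episode(Environment(), Agent())` returns the sum of rewards the environment produced. Changing what counts as the goal means editing `Environment`, never `Agent`, which is how I read "the limit of its control".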

