(2.2.3.1) Exploration-exploitation tradeoff
This problem is called the 'exploration-exploitation tradeoff' in the field of reinforcement learning. If you always choose the option that looks best based on your experience so far, you cannot discover better options. This is a lack of exploration. (*1)

On the other hand, if you keep searching for better options and choose only the options you have not tried yet, your experience goes unused. This is a lack of exploitation.
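
The two extremes can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: the three options and their payout probabilities are hypothetical, the exploitation-only policy starts with all estimates at zero and breaks ties by taking the first option, and the exploration-only policy always takes the least-tried option.

```python
import random

# Hypothetical payout probabilities for three options (illustrative values only).
PAYOUT_PROBS = [0.3, 0.5, 0.8]


def pull(arm, rng):
    """Return a reward of 1 with the option's payout probability, else 0."""
    return 1 if rng.random() < PAYOUT_PROBS[arm] else 0


def exploit_only(counts, means):
    """Always pick the option with the best observed mean (first one wins ties)."""
    return max(range(len(means)), key=lambda a: means[a])


def explore_only(counts, means):
    """Always pick the least-tried option, ignoring observed rewards."""
    return min(range(len(counts)), key=lambda a: counts[a])


def run(policy, steps=300, seed=0):
    """Play `steps` rounds with `policy`; return how often each option was chosen."""
    rng = random.Random(seed)
    counts = [0] * len(PAYOUT_PROBS)
    means = [0.0] * len(PAYOUT_PROBS)
    for _ in range(steps):
        arm = policy(counts, means)
        reward = pull(arm, rng)
        counts[arm] += 1
        # Incremental update of the observed mean reward for the chosen option.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts


def expected_reward_per_pull(counts):
    """Average true payout probability over the choices that were made."""
    return sum(c * p for c, p in zip(counts, PAYOUT_PROBS)) / sum(counts)


exploit_counts = run(exploit_only)
explore_counts = run(explore_only)
print("exploit only:", exploit_counts, expected_reward_per_pull(exploit_counts))
print("explore only:", explore_counts, expected_reward_per_pull(explore_counts))
```

Under these assumptions, the exploitation-only policy never leaves the first option it tried, so it earns about 0.3 per choice, while the exploration-only policy spreads its choices evenly and earns about 0.53 per choice. Both fall short of the 0.8 that the best option would give.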

Since exploration and exploitation are in a trade-off relationship, both must be pursued in a balanced way rather than favoring one side. So how can we make well-balanced choices?

---

Footnote *1:

The tradeoff has been discussed in detail in the field of reinforcement learning.
However, its origin is unclear, and the concept is used in a wide range of domains.
Box, G.E., 1954. The exploration and exploitation of response surfaces: some general considerations and examples. Biometrics, 10(1), pp.16-60.
March, J.G., 1991. Exploration and exploitation in organizational learning. Organization Science, 2(1), pp.71-87.
