Define state-value and (true) state-value of an MDP.
Define Q-value and (true) Q-value of an MDP.

Discounting stems from the common intuition that a reward now is better than the same reward ...
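
To make the discounting intuition concrete, here is a minimal sketch that compares the same reward received immediately and two steps later. It assumes the standard exponential form, where a reward received t steps in the future is weighted by gamma**t. The names `discounted_return` and `gamma`, and the value 0.9, are illustrative choices and are not taken from the notes.

```python
def discounted_return(rewards, gamma):
    """Return sum of gamma**t * rewards[t] for a finite reward sequence.

    Assumes exponential discounting with 0 <= gamma <= 1 (an assumption,
    not something stated in the notes).
    """
    total = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# The same reward of 1.0, received now versus two steps later.
now = discounted_return([1.0, 0.0, 0.0], gamma=0.9)
later = discounted_return([0.0, 0.0, 1.0], gamma=0.9)

print(now, later)
```

With any gamma strictly below 1, `now` is larger than `later`, which is the sense in which an immediate reward is preferred to the same reward received later.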