In reinforcement learning, the central trade-off is between exploration and exploitation. We discuss two ways of handling it: epsilon-greedy action selection and exploration functions. Regret is used to quantify and compare these approaches. Features shared across states allow us to generalize what is learned in one state to others. We also cover the credit assignment problem and reward shaping.
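As a rough illustration of the two exploration strategies named above, here is a minimal sketch in Python. It assumes Q-values are stored in a plain dict keyed by action, and the helper names (`epsilon_greedy`, `exploration_value`, `select_with_exploration_function`) are hypothetical. The exploration function uses one common form, f(q, n) = q + k / (n + 1), where the `+ 1` is an assumption added here to avoid dividing by zero for unvisited actions; other forms are possible.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else a greedy one.

    q_values: dict mapping action -> estimated Q-value (assumed representation).
    """
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)          # explore
    return max(actions, key=q_values.get)   # exploit


def exploration_value(q, n, k=1.0):
    """Optimistic value f(q, n) = q + k / (n + 1); the bonus shrinks with visits n."""
    return q + k / (n + 1)


def select_with_exploration_function(q_values, visit_counts, k=1.0):
    """Pick the action maximizing the optimistic value instead of the raw Q-value."""
    return max(
        q_values,
        key=lambda a: exploration_value(q_values[a], visit_counts.get(a, 0), k),
    )


if __name__ == "__main__":
    q = {"left": 1.0, "right": 2.0}
    counts = {"left": 0, "right": 9}
    print(epsilon_greedy(q, epsilon=0.1))
    print(select_with_exploration_function(q, counts, k=5.0))
```

With `epsilon=0` the first function is purely greedy. In the second, a rarely tried action can win over one with a higher Q-value until its visit count grows, so exploration fades on its own rather than continuing at a fixed rate.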