Tag: Optimal Policy Value Estimation