COMA (Counterfactual Multi-Agent Policy Gradient)