Naive Neural Network policy for Reinforcement Learning