Gradient-based reinforcement learning techniques for underwater robotics behavior learning