Policy gradient based Reinforcement Learning for real autonomous underwater cable tracking