In recent months, researchers at OpenAI have focused on developing artificial intelligence (AI) that learns more efficiently. Their machine learning agents can, in a sense, train themselves, thanks to the reinforcement learning methods in OpenAI Baselines. Now a new algorithm lets their AI learn from its own mistakes, much as human beings do.
The development comes from a new open-source algorithm called Hindsight Experience Replay (HER), which OpenAI released earlier this week. As its name suggests, HER helps an AI agent "look back" on a task in hindsight after attempting it: the agent reframes failures as successes, according to OpenAI's blog.
Think back to when you learned to ride a bike. On your first few tries, you failed to balance properly. Even so, those attempts taught you what not to do and what to avoid when balancing on a bike. Every failure brought you closer to your goal, because that is how human beings learn.
With HER, OpenAI wants its AI agents to learn the same way. The method also serves as an alternative to the usual reward scheme in reinforcement learning. To learn on its own, an AI has to work with a reward system: in the simplest (sparse) scheme, the AI gets an algorithmic "cookie" only if it reaches its goal, and nothing otherwise. Other (shaped) schemes hand out cookies depending on how close the AI gets to its goal.
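The core trick can be sketched in a few lines. The following is a minimal, illustrative Python sketch of HER-style goal relabeling, not OpenAI's actual implementation: the function names (`her_relabel`, `reward_fn`) and the toy one-dimensional environment are assumptions for the example. The idea is that each transition is stored twice, once rewarded against the goal the agent was aiming for, and once against the state it actually reached, so a failed episode still yields useful "successes."

```python
def her_relabel(episode, reward_fn):
    """Illustrative sketch of HER-style goal relabeling.

    episode: list of (state, action, next_state, goal) tuples.
    reward_fn(next_state, goal): reward for reaching next_state
    when the target is goal.

    Each transition is stored twice: once with the original goal,
    and once with the goal replaced by the state the agent actually
    reached at the end of the episode, so a "failed" trajectory
    becomes a success for that substitute goal.
    """
    buffer = []
    achieved = episode[-1][2]  # final state actually reached
    for state, action, next_state, goal in episode:
        # original transition, rewarded against the intended goal
        buffer.append((state, action, next_state, goal,
                       reward_fn(next_state, goal)))
        # hindsight transition, rewarded against the achieved state
        buffer.append((state, action, next_state, achieved,
                       reward_fn(next_state, achieved)))
    return buffer

# Toy example: states are integers, sparse reward of 1 only when
# the goal is hit exactly. The agent aimed for 5 but only reached 2.
reward = lambda s, g: 1 if s == g else 0
episode = [(0, +1, 1, 5), (1, +1, 2, 5)]
relabeled = her_relabel(episode, reward)
```

Under the original goal of 5, every transition earns reward 0; but the hindsight copy of the final transition, relabeled with the achieved state 2, earns reward 1, giving the learner a nonzero training signal from a failed episode.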
Submitted by: Arnfried Walbrecht