Many in the tech industry have criticized Tesla CEO Elon Musk for his vocal warnings about the reckless development of artificial intelligence. Advocating careful and measured methods, Musk helped create OpenAI, which is meant to be a shining example of how AI can be developed responsibly. Recently, the non-profit presented a method for making machine learning more efficient while keeping it safe.
In a recent blog post, OpenAI explains how a baseline implementation of an algorithm known as Actor Critic using Kronecker-factored Trust Region (ACKTR) helps AIs learn faster. The algorithm was the work of researchers from the University of Toronto (UofT) and New York University (NYU).
“For machine learning algorithms, two costs are important to consider: sample complexity and computational complexity. Sample complexity refers to the number of timesteps of interaction between the agent and its environment, and computational complexity refers to the amount of numerical operations that must be performed,” the post reads.
“ACKTR has better sample complexity than first-order methods such as A2C because it takes a step in the natural gradient direction, rather than the gradient direction (or a rescaled version as in ADAM). The natural gradient gives us the direction in parameter space that achieves the largest (instantaneous) improvement in the objective per unit of change in the output distribution of the network, as measured using the KL-divergence. By limiting the KL divergence, we ensure that the new policy does not behave radically differently than the old one, which could cause a collapse in performance.”
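To make the idea in that passage concrete, here is a minimal sketch of a natural gradient step with a KL limit for a toy policy that picks one of a few actions. It is not OpenAI's ACKTR code and it skips the Kronecker-factored approximation entirely: it assumes a simple softmax policy over fixed action rewards, where the natural gradient direction can be written down exactly. The function names and the `max_kl` parameter are made up for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into action probabilities."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def kl(p, q):
    """KL divergence KL(p || q) between two action distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def natural_gradient_step(logits, rewards, max_kl=0.01):
    """One natural gradient update on expected reward, sized by a KL budget.

    Assumption: a softmax policy over logits, whose Fisher matrix is
    F = diag(p) - p p^T. For this F, the centred rewards d = r - mean(r)
    satisfy F d = gradient, so d is a natural gradient direction.
    """
    p = softmax(logits)
    mean_r = sum(pi * ri for pi, ri in zip(p, rewards))
    direction = [ri - mean_r for ri in rewards]
    # d^T F d reduces to the variance of the rewards under p.
    quad = sum(pi * di * di for pi, di in zip(p, direction))
    if quad == 0.0:
        return list(logits)  # all actions look equally good; nothing to do
    # Choose the step so the second-order KL estimate,
    # 0.5 * step^2 * d^T F d, equals max_kl.
    step = math.sqrt(2.0 * max_kl / quad)
    return [l + step * d for l, d in zip(logits, direction)]

if __name__ == "__main__":
    old_logits = [0.0, 0.0, 0.0]
    rewards = [1.0, 0.0, 0.0]  # hypothetical: only the first action pays off
    new_logits = natural_gradient_step(old_logits, rewards, max_kl=0.01)
    print("old policy:", softmax(old_logits))
    print("new policy:", softmax(new_logits))
    print("KL(old || new):", kl(softmax(old_logits), softmax(new_logits)))
```

In this toy setup the new policy shifts probability toward the rewarding action, while the measured KL divergence stays close to the chosen budget. That is the "does not behave radically differently" guarantee the post describes, in miniature.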
As a result, AIs trained with the new method score higher on certain response tests, Future reports. Basically, OpenAI is offering companies a way to develop AI without putting everyone at risk or impeding progress.