The new guidelines signal a pivotal shift in OpenAI’s strategy, prioritizing stronger safeguards against potential AI risks.
OpenAI, the artificial intelligence (AI) research organization, has revealed a significant development in its approach to AI safety. In a set of guidelines released on Monday, the company disclosed that its board now has the authority to halt the release of AI models, even if company leadership deems them safe. The move marks a notable shift in the company’s governance, empowering its directors to enforce stronger safeguards against potential AI risks.
The guidelines were introduced to address extreme risks associated with the deployment of the company’s most powerful AI systems. They arrive after a period of internal turmoil in which CEO Sam Altman was briefly ousted by the board, sparking debate about the balance of power within the organization.

The company’s newly established “preparedness” team will continually evaluate its AI systems across four risk categories: cybersecurity, along with chemical, nuclear, and biological threats. The company is particularly vigilant about “catastrophic” risks, defined as those that could cause substantial economic damage or significant harm or loss of life.
Aleksander Madry, who leads the preparedness group, stated that his team will provide monthly reports to a newly formed internal safety advisory group. This advisory group will review the reports and offer recommendations to the CEO and the company’s board, who will then decide whether to release new AI systems. Under the guidelines, however, the board retains the authority to override those decisions.
The “preparedness” team, introduced in October, is one of three groups dedicated to AI safety within the organization. The other two assess current products such as GPT-4 (“safety systems”) and study hypothetical, extremely powerful AI systems that may emerge in the future (“superalignment”).

The company emphasized that the preparedness team will continuously evaluate its most advanced, unreleased AI models, rate them by perceived risk, and implement changes to mitigate potential dangers. Under the new guidelines, only models rated “medium” or “low” risk will be deployed. The guidelines formalize existing processes and have been developed over the past few months with input from within the organization. The company stressed that AI is not something that merely happens to society but something actively shaped by those developing it.
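To make the deployment rule concrete, here is a minimal, purely illustrative sketch in Python of how a “deploy only if every category is rated ‘medium’ or lower, subject to a board veto” gate could be modeled. The risk-level ordering, the category names, and the function names are assumptions for illustration only; OpenAI’s guidelines describe an organizational policy, not code.

```python
from enum import IntEnum

class RiskLevel(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3  # assumed: any rating above MEDIUM blocks deployment in this sketch

def can_deploy(ratings: dict[str, RiskLevel]) -> bool:
    # Deployable only if every evaluated category is rated MEDIUM or lower.
    return all(level <= RiskLevel.MEDIUM for level in ratings.values())

def release_decision(ratings: dict[str, RiskLevel], board_veto: bool = False) -> bool:
    # Leadership may approve a release, but the board retains an override.
    return can_deploy(ratings) and not board_veto

# Hypothetical model rated across the four categories named in the article.
ratings = {
    "cybersecurity": RiskLevel.MEDIUM,
    "chemical": RiskLevel.LOW,
    "nuclear": RiskLevel.LOW,
    "biological": RiskLevel.LOW,
}
print(release_decision(ratings))                   # True: all categories at or below MEDIUM
print(release_decision(ratings, board_veto=True))  # False: board override blocks release
```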