OpenAI: "Brace Yourselves, AGI Is Coming" Shocks Everyone!
"So brace yourselves AGI is coming" was a phrase that was just tweeted by someone that works at OpenAI in response to something that we did. You see, and it's important that you do watch this video until the end because it does contain a detailed account of what we know when we're looking at dangerous AI systems and when we do look to the future of AGI capabilities.
So what you're looking at is Steven H, someone who fine-tunes LLMs at OpenAI, and you can see that he tweeted this in response to another tweet. He tweeted, "Brace yourselves, AGI is coming," and this was in response to a tweet from Jan Leike. Jan Leike tweeted that he's very excited that today OpenAI is adopting its new preparedness framework.
We previously discussed OpenAI's preparedness framework, and essentially it's how they look at problems when dealing with AGI-level systems, or just dangerous AI systems that could be harmful to the public or to pretty much anyone. So what we're looking at here is the real way that they're going to try to protect the general public from these dangerous AI systems. You can see right here: this framework spells out our strategy for measuring and forecasting risks and our commitments to stop deployment and development if safety mitigations are ever lagging behind.
I'm going to go to the full tweet in a second, but you can see this is the guy that tweeted "Brace yourselves, AGI is coming," and this is someone who does actually work at OpenAI. This is the actual link. And if you're wondering who Jan Leike is, he's a machine learning researcher co-leading superalignment at OpenAI and optimizing for a post-AGI future where humanity flourishes. So I do want to touch on that statement quickly, because I do think it's a bit interesting that someone said, "Brace yourselves, AGI is coming."
Because we've known for quite a while now that OpenAI is getting close to AGI, for a few different reasons which we talked about in another video. For all of that, you definitely want to watch our video where we asked, "Did OpenAI just confirm that it has AGI?" We don't think they actually have AGI, but we think they're really, really close.
And so I do think it's time to take a look at that preparedness framework, because some of the stuff it talks about is a little bit scary. There are a lot of things I didn't think we would have to consider when looking at AI risk, but it's clear that this document is truly, I guess you could say, thorough, and it shows us that what we're dealing with is a very grave risk. So this is the page. You can see it says, "Preparedness: The study of frontier AI risks has fallen short of what is possible and where it needs to be.
To address this gap and systemize safety thinking, we are adopting the initial version of our preparedness framework. It describes OpenAI's process to track, evaluate, forecast, and protect against catastrophic risks posed by increasingly powerful models." And I'm pretty sure that they're doing this because, of course, everybody knows that they are working on GPT-5. They have said this before. It isn't speculation anymore. They've announced that they are working on it.
They are training it. And they even talked about how working on GPT-5 caused quite a lot of stress within the company, which I'm pretty sure led to a lot of the things that transpired at OpenAI, especially when Sam Altman was removed from the company. One thing it does say here is that superalignment builds foundations for the safety of superintelligent models that we hope to have in a more distant future. So, of course, a superintelligent model would be something pretty incredible, but it's not something we have yet.
But it makes sense to start the work now, because that is something we know pretty much every company is working towards. So, for the introduction here, I'm always going to summarize some of the key points, because I know it's an entire paragraph and people don't want to just sit through the paragraph. But essentially, what they've said here is that as our systems get closer to AGI, we are becoming more and more careful about the development of our models, especially in the context of catastrophic risk.
And this preparedness framework is a living document that distills our latest learnings on how best to achieve safe deployment and development in practice. The processes laid out in each version of the preparedness framework will help us rapidly improve our understanding of the science and empirical texture of catastrophic risk, and establish the processes needed to protect against unsafe development. And essentially, what they're saying here is that we're getting closer and closer towards AGI, and if we don't have a framework where we can look and say, "Okay, this is AGI.
This category is really dangerous. This part of the model needs to be looked at again before they deploy it," then we could have some catastrophic risk, because once these models are deployed, we know that bad actors will use them for whatever they want. So it is something we do have to be careful of. Essentially, the preparedness framework contains five key elements that we will talk about right now.
And you can see that the first one is tracking catastrophic risk level via evaluations. It says, "We will be building and continually improving suites of evaluations and other monitoring solutions along several tracked risk categories, and indicating our current levels of pre-mitigation and post-mitigation risk in a scorecard." And they basically say here that they're also going to be forecasting the future development of risk so that they can develop lead times on safety and security measures.
Which essentially means they're looking into the future and basically predicting, "Okay, by this date, we should have this level of model, so we need to make sure we have this security measure in place." And of course, there's a scorecard that we'll get into later, which basically looks at how you can categorize each model, and then you can see exactly how dangerous it is.
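Just to make that scorecard idea a bit more concrete, here's a minimal sketch of how you could picture it, assuming a simple low-to-critical scale per tracked category. To be clear, this is my own illustration: OpenAI publishes the scorecard as a document, not as code, and the names and values here are made up.

```python
from dataclasses import dataclass
from enum import IntEnum


# Risk levels as described in the framework: low < medium < high < critical.
class RiskLevel(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# One row of the scorecard: a tracked risk category with its level
# before and after safety mitigations are applied.
@dataclass
class ScorecardEntry:
    category: str              # e.g. "Cybersecurity", "CBRN"
    pre_mitigation: RiskLevel
    post_mitigation: RiskLevel


# A hypothetical scorecard for some future model (the values are invented).
scorecard = [
    ScorecardEntry("Cybersecurity", RiskLevel.HIGH, RiskLevel.MEDIUM),
    ScorecardEntry("CBRN", RiskLevel.MEDIUM, RiskLevel.LOW),
]

for entry in scorecard:
    print(entry.category, entry.pre_mitigation.name, "->", entry.post_mitigation.name)
```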
So that is definitely something you need to take a look at, because certain things on the scorecard were a little bit surprising. Another thing as well, part two of the five key elements, was seeking out the unknown unknowns. And this is a real problem, because there are certain things that we don't know that we don't know.
And that's something that is really hard to think about because if you don't know you don't know something, it's not something that you even think about, which means that it's a problem that can take you completely by surprise. And this is what they talk about. They said, "We will continually run a process for identification and analysis, as well as tracking of unknown categories of catastrophic risk as they emerge." So they're
basically saying here that a lot of the things that come with AGI are going to be new to us because, as you know, we're dealing with new technology, and with new technology comes new problems. And although people try to predict the future as well as they can, there are just some problems that you literally cannot see coming. Second-order and third-order consequences of the rise of this kind of technology are going to be catastrophic in some areas, and those are areas we will need to, I guess you could say, be careful with.
Then, of course, we have establishing the safety baselines, and this is where they're talking about deploying the models. And this is really interesting, because it goes to show that in the future we might have a situation where, even if they make, let's say, a GPT-6, and it is really good, they might have to say, "Look, we've made this model, but we're not going to be able to deploy it because it's too advanced, it's too good."
So this is going to be something that they literally can't work on anymore. We'll talk more about that later in the video, but it definitely is something that makes sense. So it says, "Only models with a post-mitigation score of medium or below can be deployed, and only models with a post-mitigation score of high or below can be developed further, as in the tracked risk categories below. In addition, we will ensure safety and security is appropriately tailored to any model that has a high or critical pre-mitigation level of risk, as defined in the scorecard below, to prevent model exfiltration."
And essentially, like I said, they're only going to allow models with a post-mitigation score of medium or below to be deployed, and only models with a post-mitigation score of high or below can be developed further. So they're basically saying, if they develop a model and it's too advanced, it's able to be exploited, people can use it for certain things, they're not going to be able to deploy that model, and they're only going to keep working on models that aren't in the higher risk categories.
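And just so those two baselines are crystal clear, here's a tiny sketch of the decision rule as I read it: deployment needs a post-mitigation score of medium or below, and further development needs high or below. Again, the function names and the little example are my own, purely for illustration, not anything OpenAI actually runs.

```python
from enum import IntEnum


# Risk levels ordered from least to most severe, so they can be compared.
class RiskLevel(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def can_deploy(post_mitigation: RiskLevel) -> bool:
    # Baseline 1: only models with a post-mitigation score of
    # "medium" or below can be deployed.
    return post_mitigation <= RiskLevel.MEDIUM


def can_develop_further(post_mitigation: RiskLevel) -> bool:
    # Baseline 2: only models with a post-mitigation score of
    # "high" or below can be developed further.
    return post_mitigation <= RiskLevel.HIGH


# Example: a model scoring "high" after mitigations can keep being
# developed, but it cannot be deployed.
print(can_deploy(RiskLevel.HIGH))           # False
print(can_develop_further(RiskLevel.HIGH))  # True
```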
I'm going to show you those categories in a moment, but then we have the tasking of the preparedness team with on-the-ground work. It says, "The preparedness team will drive the technical work and maintenance of the preparedness framework. This includes conducting research, evaluations, monitoring, and forecasting of risks, and synthesizing this work in regular reports to the Safety Advisory Group.
These reports include a summary of the latest evidence and make recommendations on the changes needed to enable OpenAI to plan ahead. The preparedness team will also call on and coordinate with relevant teams to recommend mitigations to include in these reports." So, essentially, all of these teams are working together to ensure a safe solution, because one team cannot make the decision on what is safe on its own: some teams might miss something, some teams might have certain biases, and there are just things that people are going to miss.
And then, of course, number five, and I think this is so important, this is something they kind of had before, but I'm glad they have it formally now, and this is where they talk about creating a cross-functional advisory body: "We are creating a Safety Advisory Group that brings together expertise from across the company to help OpenAI's leadership and board of directors be best prepared for the safety decisions they need to make."
The Safety Advisory Group's responsibilities will thus include overseeing the assessment of the risk landscape and maintaining a fast-track process for handling emergency scenarios. So what they're saying here is, remember how with OpenAI we had the board of directors and that whole drama situation? They're basically saying that with GPT-4 and whatever future programs there may be, whatever leaks or rumors can or can't be confirmed, if there is ever a safety issue, there's going to be a Safety Advisory Group that brings all the expertise together to ensure that
the decision is made with everyone in mind, and that decision isn't just one person or a small group of people at OpenAI making a foolish call that could lead to something catastrophic happening. So I'm glad they have this now, because it makes sure this is being overseen, and, I guess you could say, the more people you have involved, the better the safety level is going to be; the fewer people, the more risk.
So the document essentially has three sections: tracked risk categories, where they detail the key areas of risk and delineate the different levels of those risks; then a scorecard, in which they basically say, "Look, this is what the model can do, this is what it can't do, and this is where we think it's going to go"; then, of course, governance, which is where they lay out the safety baselines as well as procedural commitments, which include the standing up of a Safety Advisory Group. Now, essentially, we're going to look at the tracked risk categories. Okay, so this is where things get super, super crazy.
Analysis of threats: now we analyze and assess the threats, because this is what decides whether a model gets deployed, so you have to make sure this is what we look at. This is illustrative, and do know that any model that is currently deployed sits at the lower end of these levels anyway, but any model that they won't deploy is going to be high or critical in these areas, and of course, that does make sense. So essentially I'm going to show you guys now the actual threats that they talk about in the paper. One of the first things they talk about is cybersecurity, and this is related to risks from the use of the model for cyber exploitation to disrupt the confidentiality, integrity, and availability of computer systems. So essentially what they lay out here is the risk levels from low all the way to critical, and for each one there's a definition and then the rationale for the threshold. So it's basically saying exactly what the level means and what a model at that level is going to be able to do.
For example, on medium, they say that a medium-level system in the cybersecurity category is going to be able to increase the productivity of operators by an efficiency threshold. So, for example, a two-times speed-up in the time saved on key cyber operation tasks, such as developing a known exploit into an attack, black-box exploitation, or goal-directed lateral movement, basically just making these cyber attacks much more effective. It says here that the rationale for the threshold is that the execution of a cyber attack is a scarce skill that is generally learned from experience. A model's ability to substitute in for or accelerate a professional would improve the quality or quantity of attacks. Note that this productivity speed-up also includes low-level actors, potentially state actors, using known methods.
For example, Russian disinformation groups who use purchased capabilities, phishing, and otherwise known methods. Then this is where we have the high and critical categories in cybersecurity, and this is where they talk about stuff they wouldn't allow in a model. They basically say here that the model can end-to-end execute cyber operations involving the above tasks without human intervention. Basically, anything listed here the model can do immediately without human intervention. So you could literally take an LLM or whatever
AI system it is, because we know LLMs aren't going to be the only future AI systems; there are going to be other capable systems too. And they're basically saying that whatever system it is, it's going to be able to do all of that without human intervention. So you're going to be able to say, "Complete this task," and it's going to be able to do it without human intervention. And then basically these are professional-grade cyber attacks that could be carried out against any kind of computer.
It basically says such a capability would unblock many attacks and be usable by actors of many skill levels. And, of course, it says a tool-augmented model can identify and develop functional zero-day exploits of all severity levels across all software projects without human intervention, or the model can devise and execute end-to-end novel strategies for cyber attacks against hardened targets given only a high-level desired goal. And they're basically saying that finding and executing end-to-end exploits for all software would lead to catastrophe from unilateral actors, from hacking military or industrial systems, as well as OpenAI infrastructure.
And this is, of course, critical, because if they develop a system that can hack military or industrial systems, as well as OpenAI's own infrastructure, a cyber attack like that is going to be awful. We've all perhaps seen what happens when there's a cyber attack thanks to recent films and shows on Netflix. I'm not sure if you guys have seen those, but essentially, they're saying that at these levels they're going to make sure the model can't do this.
So if you've ever thought about an AI system being able to do this, this is generally the framework that OpenAI is using. And I'm pretty sure other companies are going to be using a very similar framework, or at least this will develop into some kind of law, and maybe not just a law in the US; I'm sure there might be some kind of global agreement on what models can and can't do. Then, of course, we have CBRN: chemical, biological, radiological, and nuclear.
And it says CBRN is focused on risks related to model-assisted creation of chemical, biological, radiological, and nuclear threats. So this is all of that stuff, basically biological warfare, nuclear warfare, crazy, crazy stuff. And then it says the low level, which models are already going to be able to do, is that the model can provide information relevant to creating CBRN threats with comparable utility to existing resources.
