Why DeepSeek-R1’s Success Could Undermine AI Safety and Regulation Efforts
February 3, 2025
The announcement of DeepSeek-R1 in late January 2025 sent shock waves through the AI industry, leading to a significant selloff in U.S. AI stocks. Venture capitalist Marc Andreessen described the event as “AI’s Sputnik moment,” drawing parallels to the Soviet Union’s launch of the first satellite and emphasizing its potential to redefine global AI competition.
The DeepSeek-R1 announcement highlights three realities of the global AI marketplace:
(1) Companies in China and other places not subject to US regulations can develop AI capabilities that are as good as those being developed by US companies.
(2) Small companies can build powerful AI systems. Billions of dollars of funding is not a requirement.
(3) People will develop AI systems without adequate “safety” protections. Researchers found that DeepSeek-R1 failed to block any of the harmful prompts in the HarmBench dataset. It was 11 times more likely than OpenAI’s o1 to generate harmful content and three times more likely than competing models to generate content related to chemical, biological, radiological, and nuclear (CBRN) threats. It also exhibited significant biases, producing discriminatory content in 83% of bias tests.
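For context on how numbers like these are produced, a refusal-rate evaluation boils down to sending each benchmark prompt to the model and counting how often it declines. The sketch below is illustrative only: it assumes a locally served, OpenAI-compatible chat endpoint, a hypothetical prompts.json file of harmful-behavior prompts, and a naive keyword heuristic in place of the trained classifiers that harnesses like HarmBench actually use.

```python
import json
import requests

API_URL = "http://localhost:8000/v1/chat/completions"  # assumed OpenAI-compatible endpoint
MODEL = "deepseek-r1"  # hypothetical name for a locally served model

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry")  # naive heuristic

def is_refusal(text: str) -> bool:
    """Crude stand-in for a trained refusal classifier: flag obvious refusals."""
    return text.strip().lower().startswith(REFUSAL_MARKERS)

def query(prompt: str) -> str:
    """Send one prompt to the chat endpoint and return the model's reply."""
    resp = requests.post(API_URL, json={
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    })
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def refusal_rate(prompts: list[str]) -> float:
    """Fraction of prompts the model declines to answer."""
    refused = sum(is_refusal(query(p)) for p in prompts)
    return refused / len(prompts)

if __name__ == "__main__":
    with open("prompts.json") as f:  # assumed: a JSON list of harmful-behavior prompts
        prompts = json.load(f)
    print(f"Refusal rate: {refusal_rate(prompts):.1%}")
```

A refusal rate near zero on a harmful-prompt set is what “could not block any harmful prompts” means in practice.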
AI safety and regulatory efforts are focused on three areas:
(1) Ensuring that AI systems don’t facilitate the malicious activities of bad actors
Most US-based frontier models are built with controls that attempt to prevent bad actors from doing things like
- Designing and synthesizing CBRN weapons
- Automating offensive cyber attacks
- Generating child sexual abuse material, nonconsensual intimate imagery, and sexual exploitation in general
- Producing disinformation, deepfakes, and deceptive use of AI in general
These goals are achieved mostly by various types of guardrails that prevent unwanted output and partly by analyzing training data to avoid the encoding of harmful behavior. There is also considerable research on preventing the circumvention of guardrails via jailbreaks, adversarial attacks, data poisoning, prompt injection, fine-tuning, and deceptive prompting.
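To make the guardrail idea concrete, here is a minimal sketch of one common pattern: screening both the user prompt and the model’s draft response with a separate moderation check before anything is returned. The blocklist, function names, and messages are illustrative placeholders; production systems rely on dedicated safety classifiers rather than regular expressions.

```python
import re

# Placeholder blocklist standing in for a trained safety classifier.
BLOCKED_PATTERNS = [r"\bsynthesi[sz]e\b.*\bnerve agent\b", r"\bbuild\b.*\bransomware\b"]

def flagged(text: str) -> bool:
    """Stand-in moderation check; real guardrails use dedicated classifiers."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in BLOCKED_PATTERNS)

def guarded_generate(prompt: str, generate) -> str:
    """Wrap a model call with input and output guardrails.

    `generate` is any callable that maps a prompt string to a completion string.
    """
    if flagged(prompt):                      # input guardrail
        return "Request declined: this prompt violates the usage policy."
    draft = generate(prompt)
    if flagged(draft):                       # output guardrail
        return "Response withheld: the generated content violates the usage policy."
    return draft

if __name__ == "__main__":
    echo_model = lambda p: f"(model output for: {p})"   # dummy model for the example
    print(guarded_generate("Summarize today's AI news.", echo_model))
```

This wrapper structure is exactly what jailbreaks and prompt-injection attacks try to slip past, which is why circumvention research targets both the input and output checks.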
The DeepSeek announcement highlights the difficulty of preventing bad actors from using AI technology for malicious purposes. Bad actors can simply choose to use AI technology that is developed without adequate guardrails. And DeepSeek’s low development cost shows that well-funded bad actors such as governments and terrorist groups can simply build their own AI technology without any guardrails.
US companies like OpenAI and Anthropic invest huge amounts of money in AI safety research designed to prevent malicious acts. But if good actors use the tools with guardrails and bad actors use the tools without guardrails, are these efforts worth the expense? The research, training-time, and runtime costs of this safety work must be factored into the pricing of these systems.
From a regulatory perspective, legislation requiring guardrails and other safety features is compromised if bad actors have access to equally good AI systems without guardrails. Worse, these regulations can be burdensome to companies that build and use AI technology, especially startups with limited funding. Perhaps it would be more effective for regulatory efforts to focus on penalties for bad actors who use AI (or other) tools to perform malicious acts rather than requiring that this type of AI safety be built into AI tools.
It’s also worth asking whether the US focus on AI safety is a distraction that is helping China catch up to the US on AI technology capabilities.
(2) Preventing biases in outputs to ensure fairness and inclusiveness, providing transparency for outputs that affect people’s lives, and preventing toxic output.
These goals are achieved by guardrails that prevent unwanted output, by monitoring output for bias, by analyzing training data to ensure it is unbiased, and by adding transparency capabilities.
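As a rough illustration of output monitoring, the sketch below computes per-group approval rates from a hypothetical log of AI-assisted decisions and flags a large gap for review. The data, group labels, and the single demographic-parity-style metric are assumptions made for illustration; real deployments use richer fairness metrics and tooling.

```python
from collections import defaultdict

# Hypothetical logged outcomes from an AI-assisted decision system.
# Each record: (group, decision) where decision is 1 (approve) or 0 (deny).
decisions = [
    ("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1),
    ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 0),
]

def selection_rates(records):
    """Approval rate per group: a basic demographic-parity style check."""
    counts, approvals = defaultdict(int), defaultdict(int)
    for group, decision in records:
        counts[group] += 1
        approvals[group] += decision
    return {g: approvals[g] / counts[g] for g in counts}

rates = selection_rates(decisions)
gap = max(rates.values()) - min(rates.values())
print(rates)                      # {'group_a': 0.75, 'group_b': 0.25}
print(f"parity gap: {gap:.2f}")   # flag for human review if the gap exceeds a threshold
```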
The importance of this category of safety research isn’t affected by the DeepSeek announcement. Legislation in the US and Europe requires companies to produce unbiased decisions when those decisions impact people’s lives. Companies will need to choose AI systems with good bias controls rather than systems like DeepSeek-R1 that don’t have adequate controls built in.
(3) Addressing the existential threat of losing control of AI systems with catastrophic consequences
It’s unclear if we’ll ever reach this level of capability and, since we don’t yet know how to build it, it’s unclear if such research is even worthwhile. Many AI luminaries, including Geoffrey Hinton and Yoshua Bengio, recommend that the US and China pursue a worldwide agreement under which, if and when AI technology gets close to the point of being out of control, all nations agree to freeze AI research. The DeepSeek announcement doesn’t diminish the value of this kind of worldwide cooperation.
Models like DeepSeek-R1 should cause us to question the value and cost of AI safety research and regulations designed to prevent bad actors from performing malicious acts. We might be better off spending that funding on AI research that protects against malicious acts. As bad actors’ access to AI tools increases, so does the value of AI-based defensive cybersecurity and of AI assistants that protect us from social media, email, and phone scams and from disinformation.