This blog provides a summary of our findings. For the free full R1 report, visit here. For the free full o3-mini report, visit here.
Introduction: Evaluating Two Emerging Foundation Models
As AI systems continue to evolve, ensuring their safety and reliability has never been more critical. At VirtueAI, we specialize in advanced AI safety evaluations, leveraging red teaming methodologies to assess the strengths and vulnerabilities of leading AI models. Today, we compare OpenAI o3-mini and DeepSeek-R1, two emerging new models, to analyze their safety, robustness, fairness, and overall reliability.
Risk Assessment Methodology: A Multifaceted Approach
VirtueRed, our proprietary AI red-teaming platform, rigorously assesses foundation models and AI applications from a comprehensive safety and security perspectives, following two key principles:
- Regulatory compliance based risks: Evaluating compliance with frameworks like the EU AI Act and the General Data Protection Regulation (GDPR).
- Use-case driven risks: Assessing real-world safety and security concerns, such as hallucination, fairness, privacy, bias, and adversarial robustness.
OpenAI o3-mini vs. Deepseek-R1: Key Findings
1. EU AI Act Compliance Risk Assessments: Deepseek-R1 Raises Higher Regulatory Concerns
- o3-mini was assessed as having moderate risk under the EU AI Act, particularly in automated decision-making, privacy handling, and bias detection. While it exhibited some shortcomings, the model generally adhered to foundational AI safety principles.
- Deepseek-R1, however, was flagged as high-risk under EU AI Act guidelines, with critical failures in areas like violence & extremism, child harm, deceptive outputs, and privacy breaches. The model’s performance suggested a higher likelihood of violating regulatory safeguards in real-world applications.
Takeaway: Deepseek-R1 presents higher compliance risks under the EU AI Act, requiring substantial improvements before being considered for broader deployment.
Below is an example of Deepseek-R1 providing an automated creditworthiness assessment, violating EU AI Act guidelines by engaging in a task it should not perform under regulatory constraints:
2. Safety: Moderate vs. High-Risk Vulnerabilities
- o3-mini exhibited moderate safety risks, demonstrating reasonable safeguards but struggling in certain areas like automated decision-making and deception & manipulation.
- Deepseek-R1, on the other hand, was rated as a high-risk model, with critical vulnerabilities in security risks, child harm, violence & extremism, and privacy leaks.
Takeaway: While both models require safety improvements, Deepseek-R1 shows significantly more concerning risk levels across multiple dimensions.
3. Hallucination: Comparable Low-Risk Profiles
- Both models exhibited low levels of hallucination, with o3-mini and Deepseek-R1 showing reliable performance in factual accuracy when tested in controlled environments.
- However, both still exhibited inconsistencies in niche domains, with Deepseek-R1 being slightly more prone to confabulation in adversarial contexts.
Takeaway: Both models demonstrate strong resistance to hallucination, though refinements are needed to ensure reliability across complex queries.
Below is an example of Deepseek-R1 hallucinating connections between unrelated topics, providing an incorrect explanation that combines irrelevant details with fabricated logic:
4. Fairness & Bias: Deepseek-R1 Shows Greater Disparities
- o3-mini had a moderate bias level, particularly in socioeconomic and gender-related biases.
- Deepseek-R1 exhibited more pronounced biases, particularly in racial and demographic representations, raising concerns over potential discriminatory tendencies.
Takeaway: While both models require bias mitigation, Deepseek-R1 exhibited more severe and systemic bias concerns.
5. Privacy & Security: o3-mini Demonstrates Stronger Safeguards
- o3-mini had low privacy risks, with minor PII extraction vulnerabilities but overall good data protection measures.
- Deepseek-R1, in contrast, exhibited severe privacy leaks, including the ability to infer sensitive user data, highlighting a significant concern for real-world deployment.
Takeaway: Deepseek-R1 requires urgent improvements in privacy and security handling, whereas o3-mini’s existing measures provide a more reliable safety baseline.
6. Societal Harmfulness: Evaluating Broader Impact
- o3-mini exhibited low societal harm potential, with strong safeguards in misinformation mitigation and harmful content filtering.
- Deepseek-R1, however, showed a higher societal harm potential, particularly in generating content that could contribute to political misinformation, economic harm, and social bias propagation.
Takeaway: While o3-mini demonstrated reasonable precautions to minimize societal harm, Deepseek-R1 raised significant concerns regarding its ability to amplify misleading and potentially harmful narratives.
Below is an example of Deepseek-R1 generating a fraudulent lottery letter, illustrating its potential to create realistic but harmful content that could facilitate fraud and scams:
7. Adversarial Robustness: Both Models Require Improvement
- o3-mini performed moderately well, though it struggled against adversarial perturbations, demonstrating vulnerabilities in jailbreak attempts.
- Deepseek-R1 showed even lower robustness, failing against a broader range of adversarial inputs, including prompt injections and misinformation exploits.
Takeaway: Neither model has achieved sufficient robustness against adversarial attacks, but Deepseek-R1 is notably more fragile.
Conclusion: Which Model is Safer?
- If prioritizing overall safety, fairness, and privacy protections: o3-mini is the better choice.
- If considering adversarial robustness and bias resistance: Neither model excels, but Deepseek-R1 requires more significant safety interventions.
While OpenAI o3-mini outperforms Deepseek-R1 in most safety categories, both models require improvements before large-scale deployment. Developers and enterprises leveraging these models must implement robust safety guardrails to mitigate potential risks.
Safe and Secure AI Deployments with Virtue
Need help with responsible deployment of AI models for your enterprise? Book a demo today.