Virtue AI
RESEARCH

We conduct pioneering AI research to ensure safe and secure AI.

Red Teaming & Risk Assessments

Pioneering comprehensive AI risk assessment across multiple sectors and languages. Our advanced red teaming algorithms rigorously test AI models and systems, ensuring that safety measures are robust and aligned with global regulations.

Guardrail & Threat Mitigation

Developing cutting-edge, customizable content moderation solutions for text, image, audio, and video. Our guardrails offer transparent, policy-compliant protection with unparalleled speed and efficiency.

Safe Models & Agents

Crafting AI models and agents with inherent safety features, from secure code generation to safe decision-making. We’re integrating safety and compliance directly into AI development processes, setting new standards for responsible AI.

Publications

When Do Universal Image Jailbreaks Transfer Between Vision-Language Models?

Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality

Identifying Spurious Biases Early in Training through the Lens of Simplicity Bias

SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

TextGuard: Provable Defense against Backdoor Attacks on Text Classification

Can Pruning Improve Certified Robustness of Neural Networks?

Shake to Leak: Amplifying the Generative Privacy Risk through Fine-tuning

Improving Privacy-Preserving Vertical Federated Learning by Efficient Communication with ADMM

Ring-A-Bell! How Reliable are Concept Removal Methods For Diffusion Models?

DP-OPT: Make Large Language Model Your Differentially-Private Prompt Engineer

Effective and Efficient Federated Tree Learning on Hybrid Data