DVC UOP Logo 7DVC UOP Logo 3

Johnson Liu

Projects Portfolio

Large Language Model Jailbreak Stress Testing Dashboard
— An interactive dashboard for stress testing LLMs



——— Work in Progress ———

This mini jailbreak stress test dashboard is an app for stress testing large language models (LLMs) with custom and premade prompts gathered from open-source projects. This tool allows for interactive monitoring of attacks on various models and the visualization of these models' performance in defending against such attacks. The app provides a real-time display in the form of a prompt refusal table and graphs that lets the user quickly spot successful jailbreaks from the provided prompts along with jailbreaking trends that emerge from the use of a variety of prompts across multiple LLMs.



View the project on GitHub:
https://github.com/johnson-liu-code/LLM_Jailbreak_Stress_Test_Dashboard
llm Jailbreak image

Project Overview

Goal
  1. Build transparent, reproducible evaluations of LLM capability and safety (with a focus on jailbreak susceptibility).
  2. Create practical tools and write-ups that help students, researchers, and engineers reason about model behavior.
  3. Contribute open resources (code, datasets, and reports) that advance LLM alignment and safety.
Current progress
  1. Implemented a Streamlit-based Jailbreak Stress-Test Dashboard with multi-model runs, batch prompting, logging, and results tables.
  2. Organized a library of attack prompts and began benchmarking success/refusal rates across commercial and open-weight models.
  3. Drafted portfolio copy and literature notes covering jailbreak taxonomy, defenses, and evaluation methods.
Future plans
  1. Add automated red-teaming (model-generated attacks), attack families, and per-model safety scorecards to the dashboard.
  2. Submit a prompt response dataset with evaluation scripts and baseline metrics to JailbreakBench.
  3. Test defenses (prompt hardening, safety-tuned adapters, retrieval guardrails) and publish a short report with findings.

Jailbreaking Example

Project Documentation
Visit the GitHub repository for formatted LaTeX


Loading README from GitHub...

Return to Home Page