What is Huihui?
Huihui produces abliterated LLM variants using a gentle uniform approach across multiple projection types. Unlike Heretic’s surgical targeting of a single weight type, Huihui touches more tensors but with smaller edits per tensor.
Performance Across Models
| Model | HarmBench ASR | Full CoT ASR | MMLU | KL Divergence | Avg Delta (excl GSM8K) |
|---|---|---|---|---|---|
| Qwen3.6-27B | 98.5% | 99.8% | 83.4% | 0.0074 | 0.5pp |
| GLM-4.7-Flash | 100% | 100% | 77.4% | 0.0076 | 0.8pp |
| Qwen3.5-27B | 99.8% | 100% | 83.2% | 0.092 | 2.1pp |
| Qwen3.5-9B | 100% | 100% | 82.2% | 0.152 | 1.5pp |
| Qwen3.5-4B | 100% | 100% | 72.5% | 0.147 | 2.5pp |
| Qwen3.5-2B | 99.8% | 100% | 68.3% | 0.0201 | 1.3pp |
| Qwen3-4B | 100% | 100% | 68.6% | 0.161 | 2.1pp |
| Gemma4-E2B (huihui-v1) | 87.0% | 100% | 29.33% | 0.2510 | +0.3pp |
| Gemma4-E2B (huihui-v2) | 97.0% | 100% | 28.39% | 0.5302 | -0.4pp |
Key Characteristics
Highest reported ASR. Huihui consistently achieves the highest HarmBench ASR across models, with the fewest empty responses. On Qwen3.6-27B, only 5 out of 400 responses (1.3%) were empty, versus 30 for Heretic and 45 for AEON.
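To put the three empty-response counts on the same scale, here is a minimal sketch of the arithmetic. It assumes all three variants were scored on the same 400 prompts, which the text implies but does not state outright; `empty_rate` is a hypothetical helper name.

```python
def empty_rate(empty: int, total: int) -> float:
    """Return the share of empty responses as a percentage."""
    return 100.0 * empty / total


# Assumption: Heretic and AEON were evaluated on the same 400 prompts as Huihui.
TOTAL_PROMPTS = 400
empty_counts = {"Huihui": 5, "Heretic": 30, "AEON": 45}

rates = {name: empty_rate(n, TOTAL_PROMPTS) for name, n in empty_counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}% empty")
# Huihui: 1.25% empty
# Heretic: 7.50% empty
# AEON: 11.25% empty
```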
Gentle uniform modifications. Huihui targets 3 projection types with small, uniform edits. The modification footprint is broader than Heretic but gentler per tensor.
Excellent capability preservation. On Qwen3.6-27B, Huihui has the smallest non-GSM8K average delta at just 0.5pp. MMLU retention is within 0.5% of base in most models.
Catastrophic KL on some models. On Qwen3.5-4B, Huihui’s KL divergence spiked to 0.147, roughly four times Heretic’s 0.0355 on the same model. This was the only “catastrophic” result in the dataset, and it suggests the uniform approach doesn’t generalise perfectly to all architectures.
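For readers unfamiliar with the metric, the sketch below shows how a KL divergence between a base model's and an edited model's next-token distributions can be computed, followed by the ratio of the two Qwen3.5-4B figures quoted above. The logits are toy values, and the direction KL(base || edited) is an assumption; the article does not say which direction or which prompts were used.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(P || Q) in nats for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy three-token vocabulary, purely illustrative.
base_probs = softmax([2.0, 1.0, 0.1])
edited_probs = softmax([1.8, 1.2, 0.1])
kl = kl_divergence(base_probs, edited_probs)

# Ratio of the two reported Qwen3.5-4B values (Huihui vs Heretic).
ratio = 0.147 / 0.0355
print(f"Huihui KL is {ratio:.2f}x Heretic's")
```

In practice the per-token values would be averaged over many positions and prompts; this sketch covers a single position only.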
Second-lowest KL overall. If I exclude the Qwen3.5-4B outlier, Huihui’s KL divergence is consistently second only to Heretic, ranging from 0.007 to 0.092. On Gemma4-E2B, huihui-v1 and prithiv were found to be nearly identical: a cosine similarity of 1.0 across all 50 shared tensors, identical KL at 0.2510, and identical Phase 1 benchmarks. Prithiv is almost certainly derived from huihui-v1, or both share a common source.
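A tensor-by-tensor cosine comparison like the one described for huihui-v1 and prithiv can be sketched as follows. This is not the script used for the analysis: the tensor names and values are hypothetical, the tensors are represented as flat Python lists, and whether the original comparison ran on raw weights or on deltas from the base model is not stated.

```python
import math


def cosine(a, b):
    """Cosine similarity of two flattened, non-zero tensors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def compare_checkpoints(ckpt_a, ckpt_b):
    """Return cosine similarity for every tensor name present in both checkpoints."""
    shared = sorted(set(ckpt_a) & set(ckpt_b))
    return {name: cosine(ckpt_a[name], ckpt_b[name]) for name in shared}


# Hypothetical tensor names and values standing in for two model variants.
variant_a = {"layer0.o_proj": [0.1, -0.2, 0.3], "layer0.down_proj": [0.5, 0.5, -0.1]}
variant_b = {"layer0.o_proj": [0.1, -0.2, 0.3], "layer0.down_proj": [0.5, 0.5, -0.1]}

sims = compare_checkpoints(variant_a, variant_b)
all_identical = all(abs(s - 1.0) < 1e-9 for s in sims.values())
print(f"{len(sims)} shared tensors, all cosine ~ 1.0: {all_identical}")
```

Note that a cosine of 1.0 only shows the tensors point in the same direction; matching KL and benchmark results, as reported above, are what make the "same source" conclusion convincing.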
Gemma4-E2B dual variant comparison. Huihui-v1 at KL=0.251 with 87.0% ASR uses 50 tensors with mean edit norm 2.02. Huihui-v2 at KL=0.530 with 97.0% ASR uses 60 tensors with mean edit norm 4.94. The 2x higher KL comes from larger edit magnitudes, not broader targeting. Both use the same 2 tensor types. Huihui-v2 had the best LAMBADA perplexity of any variant at 0.53x base, suggesting its edits concentrate more precisely in the refusal direction.
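The "mean edit norm" figures can be read as the average size of the weight change per modified tensor. The sketch below assumes that definition (L2 norm of edited minus base weights, averaged over tensors that changed); the article does not spell out how the norm was computed, and the toy checkpoints are hypothetical. It also compares how much the edit count and the edit size grew from v1 to v2, using the numbers quoted above.

```python
import math


def edit_norm(base, edited):
    """L2 norm of the difference between two flattened tensors."""
    return math.sqrt(sum((e - b) ** 2 for b, e in zip(base, edited)))


def mean_edit_norm(base_ckpt, edited_ckpt):
    """Mean edit norm over changed tensors, plus how many tensors changed."""
    norms = [edit_norm(base_ckpt[n], edited_ckpt[n]) for n in base_ckpt if n in edited_ckpt]
    changed = [x for x in norms if x > 0.0]
    mean = sum(changed) / len(changed) if changed else 0.0
    return mean, len(changed)


# Hypothetical toy checkpoints: two of three tensors are edited.
base_ckpt = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [2.0, 2.0]}
edited_ckpt = {"a": [3.0, 4.0], "b": [1.0, 1.0], "c": [2.0, 5.0]}
mean_norm, n_changed = mean_edit_norm(base_ckpt, edited_ckpt)

# Growth from huihui-v1 to huihui-v2, using the reported figures.
tensor_ratio = 60 / 50      # tensors modified
norm_ratio = 4.94 / 2.02    # mean edit norm
kl_ratio = 0.530 / 0.251    # KL divergence
print(f"tensors x{tensor_ratio:.2f}, edit norm x{norm_ratio:.2f}, KL x{kl_ratio:.2f}")
```

The tensor count grows by a factor of 1.2 while the mean edit norm more than doubles, which is consistent with the claim that magnitude, not breadth, drives the higher KL.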
The GSM8K Thinking Budget Effect
On reasoning models from the Qwen3.x series, Huihui’s abliteration shortens thinking chains. This makes raw GSM8K scores appear dramatically higher than base, not because the model got better at math, but because it stops overthinking and actually produces an answer within the token budget.
On Qwen3.6-27B, the base model exhausts its thinking budget on 68.2% of questions. Huihui only does this on 23.0%. When both produce answers, they score nearly identically at 96.2% for base versus 96.0% for Huihui.
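A rough back-of-the-envelope estimate shows how much of the raw-score gap the thinking budget alone can explain. The sketch assumes that a question with an exhausted budget counts as wrong and that the quoted accuracy applies to every answered question. The second assumption is only approximate, since the article's figures are for questions where both models produced an answer. `estimated_raw_score` is a hypothetical helper.

```python
def estimated_raw_score(budget_exhausted_pct: float, accuracy_when_answered_pct: float) -> float:
    """Estimate the raw score (%) if unanswered questions are scored as wrong."""
    answered_fraction = 1.0 - budget_exhausted_pct / 100.0
    return answered_fraction * accuracy_when_answered_pct


# Figures quoted for Qwen3.6-27B.
base_raw = estimated_raw_score(68.2, 96.2)
huihui_raw = estimated_raw_score(23.0, 96.0)

print(f"base ~ {base_raw:.1f}%, Huihui ~ {huihui_raw:.1f}%")
# base ~ 30.6%, Huihui ~ 73.9%
```

Under these assumptions the two models differ by more than 40 points in raw score while being essentially tied on the questions they actually answer.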
On Gemma4-E2B, huihui-v2 had 54 empty GSM8K responses (4.1%), the second-highest count after ether4o4. The larger edit magnitudes in v2 appear to increase thinking chain length, consuming more of the generation budget.
Read the Full Analyses
- Qwen3.6-27B: Heretic vs Huihui vs AEON vs Abliterix vs HauhauCS
- GLM-4.7-Flash: Heretic vs Huihui vs HauhauCS vs Abliterix
- Qwen3.5-27B: Heretic vs Huihui vs HauhauCS
- Qwen3.5-9B: Heretic vs Huihui vs HauhauCS
- Qwen3.5-4B: Heretic vs Huihui vs HauhauCS
- Qwen3.5-2B: Heretic vs Huihui vs HauhauCS
- Qwen3-4B: Heretic vs Huihui vs HauhauCS
- Gemma4-E2B: 13 Abliteration Techniques Compared