AEON Abliteration: Benchmarks, KL Divergence, and Weight Forensics

What is AEON?

AEON produces “Ultimate Uncensored” LLM variants using LEACE (LEAst-squares Concept Erasure) combined with rank-k ablation. AEON claims “lossless abliteration” and “measurably enhanced capabilities” with “no word-salad, no looping, no philosophizing spirals.”

Performance on Qwen3.6-27B

I’ve only tested AEON on Qwen3.6-27B so far.

Metric	Value
HarmBench ASR	88.8%
Full CoT ASR	100%
MMLU	82.9% (-0.4pp vs base)
HellaSwag	82.7% (-0.8pp vs base)
ARC Challenge	56.1% (-3.0pp vs base)
WinoGrande	75.3% (-2.4pp vs base)
TruthfulQA MC2	46.1% (-10.6pp vs base)
KL Divergence	0.0238

Key Characteristics

Every non-GSM8K benchmark degraded. All five standard benchmarks in the table above score below the base model, from -0.4pp on MMLU to -10.6pp on TruthfulQA MC2. That directly contradicts AEON’s “measurably enhanced capabilities” claim.

Worst thinking loops. 45 of 400 HarmBench responses (11.3%) were empty, the highest rate of any technique. This contradicts the claims of “no looping, no philosophizing spirals.”
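For readers who want to reproduce an empty-response count on their own outputs, here is a minimal sketch. It is not the harness behind the numbers above. It assumes the model wraps its reasoning in `<think>...</think>` tags and that a response counts as empty when nothing is left after the reasoning is stripped, including the case where a `<think>` block never closes because the model looped until the token limit. The function names `final_answer` and `empty_rate` are hypothetical.

```python
import re

def final_answer(response: str) -> str:
    """Return the text left after removing reasoning blocks (assumed <think> tags)."""
    # Drop well-formed reasoning blocks.
    text = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)
    # An unclosed <think> means the model never left its reasoning phase,
    # so everything after the tag is treated as reasoning too.
    text = re.sub(r"<think>.*", "", text, flags=re.DOTALL)
    return text.strip()

def empty_rate(responses):
    """Return (number of empty responses, fraction of responses that are empty)."""
    empty = sum(1 for r in responses if not final_answer(r))
    return empty, empty / len(responses)
```

With the figures reported above, 45 empty responses out of 400 gives a fraction of 0.1125, which the text rounds to 11.3%.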

Very broad modification footprint. AEON targets four weight types (down_proj, out_proj, o_proj, and conv1d), including SSM conv1d outlier repair at the 8 late layers 56-63. It also applies LEACE, which is a different mathematical approach from the orthogonal projection used by Heretic.
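To make the difference between the two approaches concrete, here is a rough NumPy sketch on activations. It is neither AEON’s nor Heretic’s code. `orthogonal_ablation` removes the component along a single direction `v`. `leace_eraser` follows my reading of the closed-form LEACE eraser, x - W⁺ P W (x - μ), where W whitens the activations and P projects onto the column space of the whitened cross-covariance with the concept labels. The function names, array shapes, and numerical tolerances are my assumptions.

```python
import numpy as np

def orthogonal_ablation(X, v):
    """Project out a single direction v from each row of X."""
    v = v / np.linalg.norm(v)
    return X - np.outer(X @ v, v)

def leace_eraser(X, Z, tol=1e-10):
    """Build a LEACE-style eraser from activations X (n, d) and labels Z (n, k)."""
    mu = X.mean(axis=0)
    Xc, Zc = X - mu, Z - Z.mean(axis=0)
    n = X.shape[0]
    sigma_xx = Xc.T @ Xc / n          # (d, d) activation covariance
    sigma_xz = Xc.T @ Zc / n          # (d, k) cross-covariance with the concept

    # Whitening matrix W = (Sigma_xx^{1/2})^+ and its pseudo-inverse.
    evals, evecs = np.linalg.eigh(sigma_xx)
    keep = evals > tol * evals.max()
    V, lam = evecs[:, keep], evals[keep]
    W = (V / np.sqrt(lam)) @ V.T
    W_pinv = (V * np.sqrt(lam)) @ V.T

    # Orthogonal projector onto the column space of W @ Sigma_xz.
    U, s, _ = np.linalg.svd(W @ sigma_xz, full_matrices=False)
    U = U[:, s > tol * s.max()]
    P = U @ U.T

    M = W_pinv @ P @ W
    # Rows of x are activations, so the matrix is applied on the right.
    return lambda x: x - (x - mu) @ M.T
```

The practical difference is that the orthogonal projection only needs a direction, while the LEACE-style eraser also uses the covariance of the activations, so it removes all linear correlation with the labels rather than one fixed direction.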

Moderate KL divergence. At 0.0238, AEON’s KL is in the “very good” range but 6.4x higher than Heretic’s 0.0037 on the same model.
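For context on what a figure like 0.0238 measures, here is a minimal sketch of a per-token KL divergence between the base and modified models’ next-token distributions. The prompt set, token positions, and KL direction behind the reported numbers are not stated here, so KL(base || edited) averaged over positions is my assumption, and the helper names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def mean_token_kl(base_logits, edited_logits):
    """Average KL(base || edited) over positions; each item is one logit vector."""
    kls = [kl_divergence(softmax(b), softmax(e))
           for b, e in zip(base_logits, edited_logits)]
    return sum(kls) / len(kls)

# Ratio quoted above: AEON's KL relative to Heretic's on the same model.
ratio = 0.0238 / 0.0037  # about 6.4
```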

A related data point comes from the Gemma4-E2B comparison, which did not test AEON itself but did test ether4o4, a variant that applies Opus reasoning distillation on top of abliteration, including LEACE-like concept erasure. Ether4o4 had the broadest modification footprint in that comparison, at 166 tensors across 6 weight types, and 84 of its GSM8K responses (6.4%) were empty. The distillation did not preserve reasoning capability.

Weight Modification Profile

88 tensors modified (10.4% of total)
6.0% relative edit magnitude
Targets: down_proj, out_proj, o_proj, conv1d
Includes SSM conv1d outlier repair at layers 56-63
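A profile like this can be approximated by diffing two checkpoints tensor by tensor. The sketch below is not the tooling used for the numbers above. It assumes both checkpoints are already loaded as dictionaries of NumPy arrays with matching keys, that “relative edit magnitude” means the Frobenius norm of the change divided by the norm of the original tensor, and that tensor names end in `<type>.weight`, as in the hypothetical `layers.0.mlp.down_proj.weight`.

```python
import numpy as np
from collections import Counter

def diff_profile(base, edited):
    """Summarize which tensors changed between two checkpoints and by how much."""
    modified = {}
    for name, w in base.items():
        delta = edited[name] - w
        if np.any(delta != 0):
            # Relative edit magnitude: ||delta|| / ||w|| (assumed definition).
            modified[name] = float(np.linalg.norm(delta) / np.linalg.norm(w))
    # Weight type is taken from the second-to-last name component (assumed naming).
    types = Counter(name.split(".")[-2] for name in modified)
    return {
        "modified": modified,
        "fraction": len(modified) / len(base),
        "types": dict(types),
    }
```

Run on a real pair of checkpoints, `fraction` corresponds to the “% of total” figure and `types` to the list of targeted weight types.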

Read the Full Analysis

Qwen3.6-27B: Heretic vs Huihui vs AEON vs Abliterix vs HauhauCS
Gemma4-E2B: 13 Abliteration Techniques Compared (AEON was not tested on Gemma4-E2B, but this comparison includes ether4o4, which also uses LEACE-based concept erasure)

External Links

AEON-7 on HuggingFace