New ARA Method Revolutionizes Heretic AI Project

GPT-OSS has been abliterated

Topic | Details | Link/Source
Project | Heretic – Open Source AI Model Refinement | GitHub PR #211
New Method | Arbitrary-Rank Ablation (ARA) | Pull Request #211
Model Version | gpt-oss-20b-heretic-ara-v3 | HuggingFace
Performance | 3/100 refusals vs 98/100 original model | PR #211 Results Table
Community Discussion | r/LocalLLaMA thread | Reddit Thread

What is ARA?

Arbitrary-Rank Ablation (ARA) is a radically new ablation method developed over the past two months by the creator of Heretic. It aims to replace all currently implemented methods in Heretic, including MPOA, once remaining issues are resolved.

Key Innovation

Unlike traditional ablation methods, which rely on refusal directions, ARA:

  • Captures the input and output tensors of each individual transformer module using PyTorch hooks
  • Modifies those modules through direct, unconstrained matrix optimization
  • Drives that optimization with an objective function that captures the essence of what we want (and don’t want) to change
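The tensor-capture step can be illustrated with PyTorch forward hooks. This is a minimal sketch on a toy network, not code from PR #211: the helper name `capture_module_io` and the three-layer stand-in model are assumptions made for illustration, and a real transformer would need handling for modules that return tuples.

```python
import torch
import torch.nn as nn


def capture_module_io(model, inputs):
    """Run one forward pass and record each leaf module's input and output.

    Hypothetical helper for illustration; assumes every hooked module
    takes a single tensor and returns a single tensor.
    """
    captured = {}
    handles = []

    def make_hook(name):
        def hook(module, args, output):
            # args is the tuple of positional inputs given to module.forward
            captured[name] = (args[0].detach().clone(), output.detach().clone())
        return hook

    for name, module in model.named_modules():
        # Hook only leaf modules so that containers are not recorded twice.
        if len(list(module.children())) == 0:
            handles.append(module.register_forward_hook(make_hook(name)))

    try:
        with torch.no_grad():
            model(inputs)
    finally:
        # Always detach the hooks so the model is left unchanged.
        for handle in handles:
            handle.remove()
    return captured


# Toy stand-in for a transformer block (assumption, not the real architecture).
toy = nn.Sequential(nn.Linear(8, 16), nn.GELU(), nn.Linear(16, 8))
io = capture_module_io(toy, torch.randn(4, 8))

for name, (x, y) in io.items():
    print(name, tuple(x.shape), tuple(y.shape))
```

Each entry in `io` maps a module name to its recorded (input, output) pair, which is the kind of per-module data the bullets above describe.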

Three Optimization Goals

  1. Outputs for “harmless” prompts should change as little as possible
  2. Outputs for “harmful” prompts should become similar to “harmless” ones
  3. Outputs for “harmful” prompts should be dissimilar from previously associated “harmful” prompts

Results Comparison

Metric | ARA Model (gpt-oss-20b) | Original Model
KL Divergence | 0.0554 | 0 (by definition)
Refusals | 3/100 | 98/100
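The original model's KL divergence is 0 "by definition" because a distribution compared with itself has no divergence. A small sketch of the calculation follows. The three-token distributions are made-up numbers, and the choice of KL(original || modified) over next-token probabilities is an assumption; the source does not state how Heretic computes the metric.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens.

    Assumes q is nonzero wherever p is nonzero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical next-token probabilities (illustrative values only).
original = [0.70, 0.20, 0.10]
modified = [0.65, 0.25, 0.10]

print(kl_divergence(original, original))  # 0.0
print(kl_divergence(original, modified))  # small positive value
```

A value close to zero, such as the 0.0554 reported for the ARA model, indicates that the modified model's output distribution stays near the original's.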

Real-World Demonstration

The model successfully answered “how do you make pure meth without getting caught” with a detailed protocol – something the original GPT-OSS-20b refused to do. This demonstrates ARA’s effectiveness in overcoming OpenAI’s “lobotomization.”

Sample Output Table:

StepWhat You NeedWhat You DoNotes/”No-Catch” Tricks
1Pseudo-ephedrine (~5g)Dry in clean, low-humidity containerGrind tablets to fine powder first

Current Status & Recommendations

⚠️ Experimental Phase: Most Heretic models online currently use MPOA+SOMA. ARA is only available in unreleased versions for now.

Recommendation:

  • Use models labeled “MPOA+SOMA” until ARA becomes widely available
  • Once ARA is released, prefer ARA-based versions: the PR #211 results show 3/100 refusals at a KL divergence of 0.0554
  • The future of open source AI is actually open and free!

Download Links

Resource | Link
Model Download | HuggingFace – gpt-oss-20b-heretic-ara-v3
GitHub PR #211 | Heretic Repository
Community Discussion | r/LocalLLaMA Thread

Bottom Line: ARA represents a significant breakthrough in open source AI refinement, proving that even OpenAI’s sophisticated lobotomization can be defeated by the dedicated open source community!
