Monosemanticity Search.
I want to see the activations for
and learn what's going on in a neural network.
Created by Mustafa & Siddharth.
Indexed using
, with the data from
Anthropic's A/1 dictionary learning run.
We've also simplified the original paper with visuals; check it out here.
Original research
from the
Anthropic Team. Huge shoutout to everyone at Anthropic:
they have the absolute best
Mechanistic Interpretability researchers. Everyone should follow
Trenton and Chris on
they are brilliant!