Claude Code vs. Tumblr Trust & Safety: How One Evening of AI-Assisted Forensics Exposed What Platforms Refused to See

April 13, 2026

I. The Numbers That Matter

  • 9 months of coordinated harassment across Tumblr, AO3, Bluesky, Medium, and DeviantArt.
  • 1,412 posts from a single dedicated hate blog, analyzed by a 15-category regex engine.
  • 217 images classified by Claude Haiku's multimodal vision API. Cost: $0.06.
  • 5 persona profiles — each with a distinct rhetorical fingerprint, all caught by the same analyzer.
  • ~100 media outlets and NGOs contacted. Fewer than 5 replied; every reply was a decline. No journalist ever followed up.
  • 62 pages of forensic analysis, written by the target herself in July 2025. Submitted to AO3 and Tumblr. No action taken.
  • One evening with Claude Code to independently verify every finding in that document.

The gap between "the target already did the work" and "platforms acted" is still counting — 9 months and growing.

The gap between "the target already did the work" and "a general-purpose AI coding assistant automated the verification" was one evening.

This article is not about proving that harassment happened. The harassment was proven in July 2025. This article is about proving that not acting was a choice.


II. What Happened

In late 2024, I began collaborating with another Hakuouki fanfic writer on AO3. By January 2025, anonymous attacks appeared in her comment section. By March, a plagiarism accusation — later proven false — had fractured the fandom. By June, I'd been falsely identified as her sockpuppet account and targeted for systematic exclusion.

I had never interacted with most of the people attacking me. My "crimes" were: writing about a character they didn't like, using AI tools in my creative process, and refusing to disappear when told to.

By August 2025, the network had:

  • Gotten my Bluesky account permanently banned (3-hour escalation, no appeal)
  • Gotten my Medium account deleted (by reporting me for "using AI to write")
  • Temporarily gotten my Tumblr deleted
  • Created at least three dedicated hate accounts
  • Published a 16,000-word Google Doc framing me as mentally ill and dangerous
  • Left hostile bookmarks on my AO3 works that I couldn't delete

I filed 9 abuse reports with AO3. No response. I contacted Tumblr with 30+ URLs of evidence. Dismissed. I wrote to Bluesky's CEO directly. My account was never restored. I submitted a full documentation package to the game's copyright holder. No reply.

I contacted approximately 100 journalists, academics, media outlets, and NGOs. Fewer than 5 replied. Every single one declined.

So I did the only thing left: I built the tools myself.


III. What Claude Code Built in One Evening

On April 12, 2026, I sat down with Claude Code — Anthropic's AI coding assistant — and described what I needed. Over the course of one session, we built:

A regex-based harassment analyzer (hatewatch-core.mjs) with 15 attack categories:

  • Insults — Direct curses and name-calling
  • Exile Speech — "Touch grass," "get a life," "seek therapy" — disguised-as-advice attacks
  • Personal Attacks — Appearance, intelligence, fake psychiatric diagnoses
  • DARVO — Deny-Attack-Reverse Victim/Offender patterns, including block-shield and identity-shield
  • Performative Rationality — Bureaucratic-polite dossier framing ("Proof: 1, 2, 3", "please stay safe")
  • Image Attack — Structural detection of posts using images/GIFs as primary attack payload
  • AI Stigma — Accusations of being AI-generated, "soulless," "not human"
  • Sexual Shaming — Slut-shaming, sexual humiliation
  • Fake Testimony — Staged "reformed fan" Q&A templates
  • Comparative Suffering — "Your problems are nothing compared to real victims"
  • And 5 more: Hate Speech, Cultural Taboos, Collective Shaming, Harassment, Suspicious
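The mechanics are simple enough to sketch in a few lines. The following is a minimal Python illustration of the approach, not the actual hatewatch-core.mjs (which is Node.js and far more extensive); the patterns and category names here are illustrative placeholders:

```python
import re
from collections import defaultdict

# Illustrative patterns only; the real hatewatch-core.mjs rule set is
# far larger. Category names follow the article's taxonomy.
CATEGORIES = {
    "exile_speech": [r"\btouch grass\b", r"\bget a life\b", r"\bseek therapy\b"],
    "ai_stigma": [r"\bai[- ]generated\b", r"\bsoulless\b", r"\bnot (even )?human\b"],
    "comparative_suffering": [r"\breal victims?\b"],
}

def analyze(text: str) -> dict:
    """Return per-category hit counts plus the distinct phrases matched."""
    hits = defaultdict(lambda: {"total": 0, "distinct": set()})
    lowered = text.lower()
    for category, patterns in CATEGORIES.items():
        for pattern in patterns:
            for m in re.finditer(pattern, lowered):
                hits[category]["total"] += 1
                hits[category]["distinct"].add(m.group(0))
    return {c: {"total": v["total"], "distinct_matches": sorted(v["distinct"])}
            for c, v in hits.items()}
```

The distinct_matches field is what later separates genuine rhetorical variety from one slogan pasted hundreds of times.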

A Tumblr API scraper that pulled the full 1,412-post archive of the primary hate blog.
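The scraper needs nothing exotic: Tumblr's public v2 API pages through a blog's posts 20 at a time. A stdlib-only sketch, where the blog name and API key are placeholders you would supply yourself:

```python
import json
import urllib.request

# Tumblr v2 posts endpoint; {blog} and {key} are caller-supplied placeholders.
API = "https://api.tumblr.com/v2/blog/{blog}/posts?api_key={key}&limit=20&offset={offset}"

def parse_page(payload: dict) -> tuple[list[dict], int]:
    """Extract the posts list and total_posts count from one API response."""
    resp = payload.get("response", {})
    return resp.get("posts", []), resp.get("total_posts", 0)

def fetch_all(blog: str, key: str) -> list[dict]:
    """Page through the archive until total_posts is reached."""
    posts, offset, total = [], 0, 1
    while offset < total:
        with urllib.request.urlopen(API.format(blog=blog, key=key, offset=offset)) as r:
            page, total = parse_page(json.load(r))
        if not page:
            break
        posts.extend(page)
        offset += len(page)
    return posts
```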

A Claude Haiku vision classifier that downloaded and analyzed 217 embedded images, returning structured assessments of violent_action, mocking_action, attack_likelihood, and plain-text descriptions — so I never had to look at the hostile images myself.
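The vision step follows one pattern: send each image to the model with a structured prompt, then parse the JSON it returns. A hedged sketch using the anthropic Python SDK, in which the model id, prompt wording, and helper names stand in for the actual classifier rather than reproduce it:

```python
import base64
import json

PROMPT = (
    "Assess this image and return JSON with keys violent_action (bool), "
    "mocking_action (bool), attack_likelihood (0-1), description (string)."
)

def parse_assessment(raw: str) -> dict:
    """Parse the model's JSON reply, tolerating surrounding prose or fences."""
    start, end = raw.find("{"), raw.rfind("}")
    return json.loads(raw[start : end + 1])

def classify_image(image_bytes: bytes, media_type: str = "image/png") -> dict:
    # SDK imported lazily: pip install anthropic; key read from ANTHROPIC_API_KEY.
    import anthropic
    client = anthropic.Anthropic()
    msg = client.messages.create(
        model="claude-3-haiku-20240307",   # assumed Haiku model id
        max_tokens=300,
        messages=[{"role": "user", "content": [
            {"type": "image", "source": {
                "type": "base64", "media_type": media_type,
                "data": base64.b64encode(image_bytes).decode()}},
            {"type": "text", "text": PROMPT},
        ]}],
    )
    return parse_assessment(msg.content[0].text)
```

Because the model returns a plain-text description alongside the booleans, the target never has to view the hostile image herself.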

A stylometric comparison proving that two suspected alt accounts were actually two different people collaborating (profanity densities differing by a factor of 13, divergent sentence-length patterns, and zero shared use of the target's insulting nickname).

A multi-persona profiling system that applies the same analyzer to different members of the harassment network, revealing completely different rhetorical fingerprints within the same coordinated group.

Total API cost: approximately $0.06.


IV. What the Data Shows

The network self-identifies as "Wolves of Mibu"

The Shinsengumi's historical nickname — adopted by the harassers as their group identity. One member personally created the tag on AO3. Another's blog title is "100% Free Range Ronin." Another's username literally claims the Shinsengumi vice-commander role. They cosplay as 19th-century samurai enforcers while their actual behavior consists of post-block callout accounts, reaction GIFs, and 228 repetitions of a single "get a life" tag.

One signature tag, 228 times

The hate blog tagged 228 of its 1,412 posts with the same exile slogan — 16% of its entire output is a single copy-pasted insult. The analyzer's distinct_matches field exposes this: 494 Exile Speech hits, but only 11 unique phrases.
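Detecting that kind of concentration takes one Counter over the scraped posts. A sketch against dicts shaped like the Tumblr API's tags field (the slogan string here is a stand-in, not the actual tag):

```python
from collections import Counter

def tag_concentration(posts: list[dict]) -> tuple[str, int, float]:
    """Most-repeated tag, its count, and its share of all posts."""
    counts = Counter(tag for p in posts for tag in p.get("tags", []))
    if not counts:
        return "", 0, 0.0
    tag, n = counts.most_common(1)[0]
    return tag, n, n / len(posts)
```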

Meta-DARVO

One network member wrote a 16,000-word Google Doc accusing me of DARVO — using the term itself 15 times as an accusation. The document — which names me 56 times, references "blocking" 106 times, and claims "death threats" 14 times — is itself a DARVO artifact. She claims to have moved on from someone she wrote 16,000 words about.

The "death threat" that wasn't

Those 14 "death threat" citations all trace to a single line I wrote in frustration, directed at an anonymous machine-translation sockpuppet: "Damn it, do you believe that once I find you, I'll kill you? (in Okita Souji's tone)." I annotated it myself as character-voice borrowing. The parenthetical was deliberately stripped; the follow-up fandom in-joke was deleted; the counterquestion was erased. Three layers of context removed to manufacture a weapon.

The anti-AI stance is a costume

The attacks began in January 2025 — before I had ever used AI in my writing. The person who started the entire chain (through a plagiarism accusation later proven false) is the only network member who never explicitly took an anti-AI stance. The "principled opposition to AI" was bolted on months later as a publicly palatable justification for attacks that had already been running on entirely different fuel.

What was that fuel?

The network's most productive writer was publishing near-daily updates. Then I appeared — and in one month, 17 chapters of accumulated work went live. A creative collaboration formed. New interpretations of established characters emerged. The existing hierarchy felt threatened.

The entire 9-month, 10-node, 5-platform harassment campaign traces back to a territorial anxiety about who gets to dominate a small fandom's creative output. Everything else — the plagiarism accusations, the identity conflation, the anti-AI framing, the DARVO, the platform reports — was scaffolding built on top of that foundation.


V. What the Platforms Didn't Do

Every platform involved had the tools, the data, and the responsibility to detect this. They chose not to.

A basic aggregation of the hate blog's ask/answer data would have revealed: 238 asks from only 7 named accounts + 178 anonymous asks. The top two named askers — LightMelodyVA (29 asks) and vicecommanderhijikata (23 asks) — were already identified by the target as members of a coordinated network in her July 2025 forensic report.
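That aggregation is a ten-line script against fields the Tumblr API already returns on answer posts, where asking_name is "Anonymous" for anonymous asks. A sketch:

```python
from collections import Counter

def aggregate_askers(posts: list[dict]) -> tuple[Counter, int]:
    """Count asks per named sender; tally anonymous asks separately."""
    named, anon = Counter(), 0
    for p in posts:
        if p.get("type") != "answer":
            continue
        if p.get("asking_name", "Anonymous") == "Anonymous":
            anon += 1
        else:
            named[p["asking_name"]] += 1
    return named, anon
```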

A basic time-zone analysis would have shown all participants sharing the same North American circadian rhythm.
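The time-zone check is equally mechanical: bucket post timestamps by local hour under a candidate UTC offset and look for the empty sleep window. A sketch, where the offset is a hypothesis to test, not a known fact about any account:

```python
from collections import Counter
from datetime import datetime, timezone, timedelta

def activity_by_hour(timestamps: list[int], utc_offset_hours: int = 0) -> Counter:
    """Histogram of post counts per local hour under a candidate UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return Counter(datetime.fromtimestamp(ts, tz).hour for ts in timestamps)

def quiet_hours(hist: Counter, threshold: int = 0) -> set[int]:
    """Hours with activity at or below threshold: the likely sleep window."""
    return {h for h in range(24) if hist.get(h, 0) <= threshold}
```

If every participant's quiet window lines up under the same offset, they share a circadian rhythm, whatever their profiles claim.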

A basic vocabulary comparison would have shown the anonymous asks sharing 25% distinctive-word overlap with the blog owner's own writing — suggesting self-authored anonymous amplification.
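And the vocabulary comparison is a set intersection over "distinctive" words. In this toy version that means non-stopword tokens longer than three letters; the stopword list and thresholds are illustrative, not those from the report:

```python
import re

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
             "that", "this", "you", "i", "for", "on", "with", "not"}

def distinctive_words(text: str) -> set[str]:
    return {w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOPWORDS and len(w) > 3}

def overlap_ratio(sample: str, reference: str) -> float:
    """Share of the sample's distinctive words also used by the reference author."""
    s, r = distinctive_words(sample), distinctive_words(reference)
    return len(s & r) / max(len(s), 1)
```

A high ratio between the anonymous asks and the blog owner's own writing is what suggests self-authored anonymous amplification.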

None of this required AI. None of this required novel methodology. All of it was visible to any trust & safety analyst who bothered to look.

The pipeline I built to detect all of this costs $0.06 in API calls and a few hours of engineering. The platforms had 9 months.


VI. The Human Cost

I used to write prolifically. Over two years, I produced a body of work I was proud of — character analysis, literary criticism, creative fiction. I engaged with a fandom I loved.

Today, every one of those works sits abandoned. I can't write long-form anymore. The energy is gone — not because I lost the argument, but because I spent 9 months fighting for the right to exist in a space where the people who attacked me faced zero consequences, and the platforms designed to protect me chose silence.

AO3 banned me for 15 days and offered an appeal. I looked at the appeal process and thought: crawling back to beg for permission to exist on a platform that ignored my 9 abuse reports is not justice. It is humiliation. I deleted my account.

This is what coordinated harassment actually costs. Not the 1,412 posts. Not the banned accounts. Not the rejected media pitches. It costs you the thing you came here to do in the first place.


VII. Why This Article Exists

This article is not a plea for sympathy. It is not an attempt to relitigate fandom drama. It is a technical demonstration with a moral argument:

The tools to detect coordinated harassment are trivial to build. A regex engine, a vision classifier, a scraper, and basic statistical analysis — assembled in one evening, costing less than a cup of coffee. The methodology is open. The code is reusable. Any target of similar harassment can adapt this pipeline to their own case.

The failure is not technical. It is institutional. Platforms have the data. They have the engineers. They have the policies. What they lack is the will to apply those policies when the target is a single person in a small fandom who doesn't generate enough revenue to matter.

Silence in the face of documented abuse is not neutral. It is a policy decision. And it should be named as one.

I wrote a 62-page forensic analysis in July 2025. I submitted it to every relevant platform. I contacted ~100 media outlets. I built a hatewatch analyzer. I documented everything on my own website.

If you work in trust & safety, platform governance, or digital rights — the evidence is there. It has been there for 9 months. The question was never whether it existed.

The question is whether you'll look.


This article was drafted with the assistance of Claude Code (Anthropic). The irony is not lost on me: the same AI technology that was used as a pretext to attack me is now the tool that documented the attack. The analyzer, the scraper, the vision classifier, and this article itself were produced in collaboration with AI. If that bothers you, I invite you to consider what bothers you more: a writer using AI, or a platform ignoring 1,412 posts of targeted harassment.

— ygpgsgl, April 2026

Full timeline: 51 events · Hate Speech Monitor: 5 persona profiles · Full report: 50 chapters