The Problem: Spam Invasion
My blog was drowning in spam. Not just any spam—cryptocurrency referral link spam. Comment after comment from outlook.com email addresses promoting Binance and Gate.io with identical templated messages:
“Your article helped me a lot, is there any more related content? Thanks!”
“Thanks for sharing. I read many of your blog posts, cool, your blog is very good.”
“I don’t think the title of your article matches the content lol. Just kidding, mainly because I had some doubts after reading the article.”
Every. Single. Comment. Identical. Robotic. Infuriating.

Why Existing Solutions Didn’t Work
I looked at Akismet and other popular spam filters. They’re good, but they cost money—and they don’t learn from your specific spam patterns. My blog had unique spam that wasn’t hitting their radar. Plus, I’m cheap. I wanted to build my own.
The Solution: EAI Anti-BS Filter
I decided to create a custom WordPress plugin that would:
- Detect spam using pattern matching and fingerprints
- “Roast” the spam instead of deleting it—replace it with funny, sanitized content
- Learn continuously from every spam comment caught
- Allow manual training so I could teach it new spam patterns
- Be free (no subscription required)

Phase 1: The Core Plugin Structure
I started with a modular plugin architecture:
eai-anti-bs-filter/
├── eai-anti-bs-filter.php (main file)
├── includes/
│ ├── logger.php (logging spam data)
│ ├── settings.php (admin settings)
│ ├── roast-manager.php (roast personas & phrases)
│ ├── fingerprint-matcher.php (learned patterns)
│ ├── sanitizer.php (the core spam handler)
│ ├── manual-actions.php (admin interface)
│ └── dashboard.php (stats & monitoring)
└── logs/
    └── spam_log.json (spam database)

This structure made it easy to:
- Add features independently
- Debug specific functionality
- Train the plugin on new spam patterns
- Keep admin interfaces clean
Phase 2: Initial Testing
I started testing with a simple detection strategy:
Quick Pattern Checks:
- ✅ Outlook.com emails (100% of my spam used this)
- ✅ Binance/Gate.io domain mentions
- ✅ Referral link parameters (ref=, register?)
- ✅ Generic spam phrases (“your article helped me”, “cool blog”, etc.)
- ✅ Too many links in a single comment
Results: Caught about 85% of spam automatically.
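A minimal sketch of how quick checks like these might look in PHP. This is not the plugin’s actual code: the function name `eai_quick_checks`, the phrase list, and the link limit are assumptions made for illustration, and the array keys follow the field names WordPress uses for comment data.

```php
<?php
// Hypothetical helper: returns the names of the quick checks a comment trips.
// The phrase list and the default link limit are illustrative assumptions.
function eai_quick_checks(array $commentdata, int $max_links = 2): array
{
    $email   = strtolower($commentdata['comment_author_email'] ?? '');
    $content = strtolower($commentdata['comment_content'] ?? '');
    $hits    = [];

    // Sender domain shared by the observed spam.
    if (preg_match('/@outlook\.com$/', $email)) {
        $hits[] = 'outlook_email';
    }
    // Mentions of the promoted exchanges.
    if (preg_match('/binance|gate\.io/', $content)) {
        $hits[] = 'exchange_mention';
    }
    // Referral-style URL fragments.
    if (preg_match('/ref=|register\?/', $content)) {
        $hits[] = 'referral_param';
    }
    // Templated praise phrases.
    foreach (['your article helped me', 'cool blog'] as $phrase) {
        if (strpos($content, $phrase) !== false) {
            $hits[] = 'generic_phrase';
            break;
        }
    }
    // More links than the allowed maximum.
    if (preg_match_all('#https?://#', $content) > $max_links) {
        $hits[] = 'too_many_links';
    }
    return $hits;
}
```

A function like this could be called from a `preprocess_comment` filter, with the returned list deciding whether the comment goes on to the sanitizer. Returning the names of the tripped checks, and not just true or false, also gives the logger something useful to record.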
Phase 3: The Learning System
Instead of just blocking spam, I implemented a fingerprint matching system that learned from every caught comment:
- Extract domains mentioned in spam
- Save suspicious email domains (@outlook.com)
- Build keyword database (binance, crypto, usdt, telegram, etc.)
- Track patterns across multiple spam comments
- Continuously retrain from the spam & moderation queue
The fingerprint system raised the catch rate from about 85% to roughly 95% of incoming spam.
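The learning steps above can be sketched as a simple signal counter. This is an illustration under assumptions, not the plugin’s real matcher: `eai_learn`, `eai_score`, and the `email:` / `domain:` / `keyword:` key format are hypothetical names, and the scoring rule (sum of learned counts) is just one simple option.

```php
<?php
// Hypothetical fingerprint store: counts how often each signal appeared in caught spam.
function eai_learn(array $fingerprints, string $email, string $content, array $keywords): array
{
    $content = strtolower($content);
    $signals = [];

    // Email domain of the sender, e.g. "outlook.com".
    $at = strrpos($email, '@');
    if ($at !== false) {
        $signals[] = 'email:' . strtolower(substr($email, $at + 1));
    }
    // Hostnames of any URLs mentioned in the comment.
    if (preg_match_all('#https?://([a-z0-9.-]+)#', $content, $m)) {
        foreach (array_unique($m[1]) as $host) {
            $signals[] = 'domain:' . $host;
        }
    }
    // Known spam keywords present in the text.
    foreach ($keywords as $word) {
        if (strpos($content, $word) !== false) {
            $signals[] = 'keyword:' . $word;
        }
    }
    // Increment the counter for every signal seen in this comment.
    foreach ($signals as $signal) {
        $fingerprints[$signal] = ($fingerprints[$signal] ?? 0) + 1;
    }
    return $fingerprints;
}

// Scores a new comment by summing the learned counts of the signals it shares.
function eai_score(array $fingerprints, string $email, string $content, array $keywords): int
{
    $seen = eai_learn([], $email, $content, $keywords);
    return array_sum(array_intersect_key($fingerprints, $seen));
}
```

With a store like this, a comment could be flagged once its score passes a threshold. The threshold would need tuning against real traffic, since a common sender domain such as outlook.com on its own also matches plenty of legitimate commenters.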
Phase 4: Manual Roasting
But some spam still got through. So I added a manual roast button in the WordPress admin interface:
- Every comment gets a “🔥 Roast This” button
- Click it to:
  - Save the original spam data
  - Replace with a random roast persona and phrase
  - Automatically train from this example
  - Approve or hold for moderation (configurable)
- Bulk roasting for multiple comments at once
Now when new spam patterns appeared, I could manually train the plugin in seconds.
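Here is a rough sketch of what the roast step might do with a single comment. Everything in it is hypothetical: the function name `eai_roast`, the persona format, and the returned array shape are assumptions. It is written as a pure function so it runs without WordPress. In a real plugin the backup would presumably be saved with `update_comment_meta()` and the roasted fields written with `wp_update_comment()`.

```php
<?php
// Hypothetical roast step: returns the replacement comment plus a JSON backup
// of the original fields. Persona names and phrases are supplied by the caller.
function eai_roast(array $comment, array $personas, bool $auto_approve): array
{
    // Keep the original fields so the change can be undone later.
    $backup = json_encode([
        'author'  => $comment['comment_author'],
        'email'   => $comment['comment_author_email'],
        'content' => $comment['comment_content'],
    ]);

    // Pick a random persona, then a random phrase from that persona.
    $name   = array_rand($personas);
    $phrase = $personas[$name][array_rand($personas[$name])];

    $roasted = array_merge($comment, [
        'comment_author'       => $name,
        'comment_author_email' => '',
        'comment_content'      => $phrase,
        // WordPress stores approval as '1' (approved) or '0' (held for moderation).
        'comment_approved'     => $auto_approve ? '1' : '0',
    ]);

    return ['roasted' => $roasted, 'backup' => $backup];
}
```

Building the backup before touching any field is the important design choice here: it means every roast can be reversed, whether it was triggered by the button or by automatic detection.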
The Issue: Unintended Consequences
Everything was working perfectly until… I accidentally sanitized all the pingbacks.
What Went Wrong
The sanitizer’s type check had a logic error:
php
// WRONG - This returns early only when comment_type is non-empty,
// so anything that arrives with an empty type is still sanitized
if (($commentdata['comment_type'] ?? '') !== '') {
    return $commentdata;
}

The pingbacks and trackbacks reached this check with an empty comment_type, so they got caught and roasted instead of being skipped.
Casualties: Multiple legitimate pingbacks turned into roasted comments.
The Panic
My heart sank. All those legitimate external sites linking to my blog, and I’d replaced their pingbacks with silly roasted messages. But wait…
I had saved the original data.
The Recovery: Built-in Safety Net
Because the plugin stored every original comment’s data in WordPress comment metadata, I could recover everything:
php
$original = get_comment_meta($comment_id, 'eai_original_spam', true);
// { "author": "...", "email": "...", "content": "..." }

I quickly added a Pingback Recovery Tool right into the admin:
- Go to Tools → “Recover Pingbacks”
- Click “Restore Pingbacks Now”
- All sanitized comments restored to original state
- Metadata cleaned up
Recovery: 100% successful.
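The restore step could look roughly like the sketch below. It assumes the backup is stored as a JSON string with the `author`, `email`, and `content` keys shown above. The function name `eai_restore` is hypothetical, and the sketch works on plain arrays so it can run outside WordPress.

```php
<?php
// Hypothetical restore step: rebuilds the original comment fields from the backup.
// Returns null when the backup is missing or malformed, so the caller can skip it.
function eai_restore(array $comment, $backup_json): ?array
{
    $original = is_string($backup_json) ? json_decode($backup_json, true) : null;

    // Refuse to restore unless all three expected keys are present.
    if (!is_array($original)
        || !isset($original['author'], $original['email'], $original['content'])) {
        return null;
    }

    // Only the three backed-up fields are overwritten; everything else is kept.
    return array_merge($comment, [
        'comment_author'       => $original['author'],
        'comment_author_email' => $original['email'],
        'comment_content'      => $original['content'],
    ]);
}
```

Note that this sketch leaves `comment_approved` untouched, so a restored comment keeps whatever moderation status it had at the time. Whether restored items should be re-approved or sent back to the moderation queue is a separate decision worth making explicitly.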
The Fix
I updated the detection logic:
php
// CORRECT - Skip if type is NOT "comment"
if (($commentdata['comment_type'] ?? '') !== 'comment') {
    return $commentdata;
}

Now pingbacks, trackbacks, and other non-comment types are properly skipped and never touched by the sanitizer.
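To see what the corrected rule does for each type, the check can be pulled into a small helper and run against a list of values. The helper name `eai_should_inspect` and the list of types are assumptions for illustration.

```php
<?php
// Mirrors the corrected check: only entries typed exactly "comment" are inspected.
function eai_should_inspect(array $commentdata): bool
{
    return ($commentdata['comment_type'] ?? '') === 'comment';
}

// Print the decision for a handful of comment types, including an empty one.
foreach (['comment', 'pingback', 'trackback', 'review', ''] as $type) {
    $decision = eai_should_inspect(['comment_type' => $type]) ? 'inspect' : 'skip';
    printf("%-12s %s\n", var_export($type, true), $decision);
}
```

One thing this makes visible: an empty or missing type is skipped as well. As far as I know, WordPress versions before 5.5 stored regular comments with an empty comment_type, so on an older site, or for older rows in the database, a rule like this might need to treat an empty type as a regular comment too.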
Lessons Learned
1. Always Save Original Data
When you transform user-generated content, always keep a backup in metadata. It saved me here and provides audit trails for learning algorithms.
2. Test Edge Cases
I tested normal comments but didn’t think about pingbacks/trackbacks until they were already roasted. Test edge cases early:
- Pingbacks
- Trackbacks
- Comments from logged-in users
- Comments with multiple links
- Different comment types
This is similar to how I approach building other custom solutions—whether it’s learning game development with AI or any automated system. Build in feedback loops, test boundaries, and plan for recovery before you need it.
3. Type Checking is Critical
WordPress has multiple comment types: comment, pingback, and trackback in core, plus custom types that plugins add (reviews, for example). Always handle them explicitly.
4. Multi-Layer Detection Works
The plugin survived this mistake because:
- Quick pattern checks caught obvious spam first
- Fingerprint learning adapted over time
- Manual roasting allowed custom training
- Original data preservation enabled recovery
No single detection method is perfect. Layered approaches are more robust.
5. Document Your Architecture
Creating modular, documented code made this recovery possible. I could quickly find the issue and fix it without breaking everything else.
Current Status
The plugin is now:
- ✅ Detecting spam at multiple stages (pre-insert, post-insert, transitions)
- ✅ Learning from every caught comment
- ✅ Recovering from mistakes with the built-in restore tool
- ✅ Allowing manual training via admin interface
- ✅ Protecting legitimate comments including pingbacks
- ✅ Providing audit trails via metadata and logging
Quick update:
I had to deactivate the plugin because it still flags legitimate pingbacks as spam. After I rolled back the changes, previously sanitized comments reverted to their original spam content and were auto-approved, so the number of visible spam comments went up instead of down.
Spam Stats
Before the plugin: 50+ spam comments per week
With EAI Anti-BS Filter: 1-2 spam comments per week (auto-roasted)
~98% reduction in manual spam moderation work.
The Future
What’s next for the plugin:
- 🎯 IP reputation tracking (block habitual spammers)
- 🎯 Language detection (catch non-English spam)
- 🎯 Image spam detection (if images appear in comments)
- 🎯 Sentiment analysis (positive but spammy comments)
- 🎯 Export fingerprints (share learned patterns across blogs)
- 🎯 Stats dashboard (see what’s being caught)
Key Takeaways
Building a custom spam filter taught me:
- Off-the-shelf solutions don’t always fit custom needs
- Data preservation enables recovery and learning
- Layered detection beats single-method approaches
- Mistakes happen—plan for recovery
- Testing edge cases prevents expensive bugs
- Documentation matters—especially when debugging at 2 AM
If you’re fighting spam on your blog, consider building a custom solution tailored to your specific spam. It’s easier than you think, and you’ll learn a ton in the process.
Have you built custom WordPress solutions? Share your story in the comments below! (Don’t worry—if it’s spam, the EAI Anti-BS Bot will roast it. 🔥)
Resources
- GitHub: EAI Anti-BS Filter
- WordPress Plugin Development Handbook
- Comment Hooks & Filters Reference
- WordPress Meta Data API


