By Kyle Napierkowski — Oct 4, 2022

Let's Solve Elon's Twitter Bot Problem

In case you live under a rock, or rather, are smart enough to disconnect yourself from social media: Elon Musk gave in today and agreed to buy Twitter ($TWTR) at the original agreement's purchase price of $54.20/share. After months of insulting the executive team on Twitter, accusations of fraud, and even lawsuits, he has thrown in the towel.

So now that Elon's really buying Twitter, warts and all, we were wondering – what's he actually going to do about the bot problem?

If our twitter bid succeeds, we will defeat the spam bots or die trying!
— Elon Musk (@elonmusk) April 21, 2022

The Bot Problem

Zooming out a bit, Twitter has more than "a" bot problem. It has many bot problems.

Some of these Twitter may want to solve: for example, visible problems that degrade the experience of power users, like scammers and impersonators. Other problems, like sockpuppet accounts and troll farms, are bad in the broadest sense, but a hard-to-detect user that boosts Twitter's MAU numbers may attract a little less attention from the executive team.

Let's zoom back in, and just pick this one: the impersonator accounts.

What does an impersonator account look like?

Impersonator accounts are a specific "bot" problem that we'll focus on for this project. Impersonators masquerade as popular Twitter users, following their followers and trying to scam them.

For example, #FinTwit personalities usually have a problem with these impersonators, who try to scam followers into crypto schemes. I've heard of other niches where this occurs too – astrology is one that comes to mind, where the impersonator pretends to be the astrologer and scams their followers into setting up a fake astrology reading session for $$.

Here's one example – an impersonator of Michael Green, Chief Strategist at Simplify Asset Management. The real account:

The fake account:

The fake looks pretty obvious when you look at the two side by side, but the scam works by being "good enough" when they message a user and start a discussion. Like most spammy scams, only a tiny fraction of people needs to fall for it for the fraud to pay off.

The thesis: this can be stopped

Twitter has thousands of engineers, including a lot of data staff. So presumably if they wanted to stop this problem, they already would have. There's probably some nuance we're missing.

But that's no fun.

Let's go ahead and try to solve this problem ourselves. Can we build a model that reliably detects impersonator accounts, catching nearly all of them without many false positives?
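
To make "nearly all of them" and "without many false positives" a bit more concrete: these two goals roughly correspond to recall and precision. Here's a minimal sketch of how we might score a detector against hand-labeled accounts. The labels and predictions below are made up purely for illustration, and the function name is our own, not part of any library.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = impersonator."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # impersonators we caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # real accounts flagged by mistake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # impersonators we missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical data: 4 impersonators and 6 legitimate accounts.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical model output: catches 3 of 4, wrongly flags 1 real account.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

High recall means we catch nearly all impersonators; high precision means few legitimate accounts get flagged along the way. A useful detector needs both.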

Our Gameplan

We're doing this on the fly here, so we may hit a dead end or take some twists and turns. Let's lay out a high-level gameplan, though. Here's what this series will look like, if all goes according to plan (which data projects never do):

Collecting a dataset
Engineering features for the model
Building a baseline model
Improving on our model
Presenting results

Interested in following along? Subscribe to our blog for updates on this project as they come!

Coming Up Next: Collecting a dataset
