Last week I wrote an analysis of the polling data YouGov has compiled on Donald Trump’s individual tweets. Since then, the thing that has kept nagging at me is this: Trump’s most popular tweets (most of which were uncontroversial, ceremonial messages) don’t sound like the kind of thing we most associate with the way Trump uses his Twitter account, while his least popular tweets sound very much like the kind of stuff the president would say in real life.
So I decided to take a second dive into the data and see if this vague sense could be firmed up empirically. And I found that the theory basically holds up: Trump-ier tweets tend to be less popular than tweets that sound like they’re staff-written.
The Weird Math That Lets Me Do This Analysis (Feel Free To Skip This Section)
If you want to get straight to the fun stuff, skip this section.
To understand how I did this, you need to rewind a couple of years. In 2016, savvy political observers noted that Trump’s more “Trump-ish” tweets (think of the insults he threw at his primary and general election opponents) were posted from an Android phone, while other less Trump-y tweets came from his iPhone or other sources.
Trump hasn’t been tweeting from his Android recently. But if we assume that he was the author of the Android Tweets and not the Other Tweets, then we can grab data from those old tweets and use them to guess who is authoring his more recent tweets. This isn’t a perfect method—Trump’s style might change over time, as do the topics of his tweets. But it’s a practical assumption that at least lets us ballpark the “Trumpiness” of any given tweet.
So I used a bag-of-words approach on these tweets, which means I compared how often specific words showed up in Trump’s Android tweets versus his tweets from other sources. I could then use those frequencies to get a sense of whether Trump or his staff authored some newer tweet. I used a random forest (for the real stats nerds, more info here) to process those word frequencies and come up with a probability that Trump wrote a specific tweet. Note that this method borrows heavily from the Atlantic’s Andrew McGill, who created the excellent Trump or Not Bot using Naïve Bayes. My method worked reasonably well on past data: it correctly guessed whether a tweet came from Trump’s Android (a proxy for being written by Trump) or somewhere else over 80 percent of the time.
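For readers who want to see what “bag of words” means in practice, here is a minimal sketch of the feature-extraction step. It is an illustration, not the code behind this analysis: the tokenizer, the function names, and the two example tweets are all made up, and the random forest itself is left out. The point is just that each tweet becomes a row of word counts, which is the kind of input a classifier would then learn from.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a tweet and split it into word-like tokens (keeps # and @ prefixes)."""
    return re.findall(r"[#@]?\w+", text.lower())

def build_vocabulary(tweets):
    """Map every distinct token in the training tweets to a column index."""
    vocab = sorted({tok for tweet in tweets for tok in tokenize(tweet)})
    return {tok: i for i, tok in enumerate(vocab)}

def vectorize(tweet, vocab):
    """Turn one tweet into a list of per-word counts; words outside the vocabulary are ignored."""
    counts = Counter(tokenize(tweet))
    return [counts.get(tok, 0) for tok in vocab]

# Made-up stand-ins for labeled older tweets (1 = Android, 0 = other source).
training = [
    ("Fake news media is so dishonest. Sad!", 1),
    ("Join us tonight in Ohio! #MAGA", 0),
]
vocab = build_vocabulary(text for text, _ in training)
X = [vectorize(text, vocab) for text, _ in training]  # one count row per tweet
y = [label for _, label in training]                  # source labels for a classifier

# A newer tweet is encoded with the same vocabulary before being scored.
new_row = vectorize("Sad! Sad news.", vocab)
```

The rows in `X` and the labels in `y` are what a classifier would be trained on, and `new_row` is what it would be asked to score.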
So I applied the classifier to Trump’s newer tweets and connected it to YouGov data. Reminder: YouGov polled a representative sample of Americans on each of Trump’s tweets, allowing them to rate it “Great,” “Good,” “OK,” “Bad,” or “Terrible.” They then translated that to an overall score ranging from -200 (everyone hated the tweet) to 200 (everyone loved it). You can find more information here, but the basic idea is that positive scores mean a tweet was popular and negative scores signal that people didn’t like it.
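To make the scale concrete, here is a rough sketch of how five response options could be collapsed into one number between -200 and 200. YouGov’s exact formula isn’t spelled out in this post, so the weights below are an assumption: they are evenly spaced so that the endpoints match the scale described above, with “OK” as the neutral point. The sketch also ignores any survey weighting YouGov applies.

```python
# Assumed weights (hypothetical): evenly spaced so that all-"Terrible" gives -200
# and all-"Great" gives 200. YouGov's actual method may differ.
WEIGHTS = {"Great": 200, "Good": 100, "OK": 0, "Bad": -100, "Terrible": -200}

def popularity_score(responses):
    """Average the assumed weight of every response to get a score in [-200, 200].

    `responses` maps a rating label to the number of people who chose it.
    """
    total = sum(responses.values())
    if total == 0:
        raise ValueError("no responses to score")
    return sum(WEIGHTS[label] * n for label, n in responses.items()) / total

# Made-up response counts for one tweet.
example = {"Great": 10, "Good": 20, "OK": 30, "Bad": 25, "Terrible": 15}
score = popularity_score(example)
```

Under these assumed weights, a tweet that splits evenly between “Great” and “Terrible” lands at zero, the same as one that everybody rated “OK.”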
What Makes a Tweet Trumpy?
The best way to get a handle on this question is to look at some examples. The classifier thought each of these tweets had an 85 percent or higher chance of being written by Trump rather than his staff:
With all of the Fake News coming out of NBC and the Networks, at what point is it appropriate to challenge their License? Bad for country!
— Donald J. Trump (@realDonaldTrump)
October 11, 2017
Remember, Republicans are 5-0 in Congressional Races this year. The media refuses to mention this. I said Gillespie and Moore would lose (for very different reasons), and they did. I also predicted “I” would win. Republicans will do well in 2018, very well! @foxandfriends
— Donald J. Trump (@realDonaldTrump)
December 18, 2017
And the classifier thought these tweets were probably written by his staff:
The classifier gets at some of the conventional wisdom about Trump’s tweets. Trump is more likely to use certain catch phrases, bash the media, mention Fox News, etc., and his staff is more likely to talk about procedural issues or repeat talking points in a way that just doesn’t quite sound like Trump. My classifier also found that the hashtag symbol (the # used to denote different topics on Twitter) and the number of spaces used (one instead of two) are both useful for differentiating between Trump and staff tweets (the Atlantic found the same thing).
There’s more to the classifier, but you get the basic idea—Trump likes certain words, topics, and punctuation marks. And we can use that data to guess if Trump wrote a specific tweet.
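The hashtag and spacing signals can be pulled out of a tweet with a few lines of code. This is a hedged sketch rather than the features my classifier actually used: the function name and the choice to count spaces only after sentence-ending punctuation are assumptions made for illustration.

```python
import re

def style_features(tweet):
    """Return a few non-word signals for one tweet (illustrative feature set)."""
    return {
        # How many hashtag symbols appear anywhere in the tweet.
        "hashtags": tweet.count("#"),
        # Sentence breaks followed by exactly two spaces before the next character.
        "double_space_breaks": len(re.findall(r"[.!?] {2}(?=\S)", tweet)),
        # Sentence breaks followed by exactly one space before the next character.
        "single_space_breaks": len(re.findall(r"[.!?] (?=\S)", tweet)),
    }

# Made-up tweet with one double-spaced break, one single-spaced break, one hashtag.
features = style_features("Great rally.  Thank you! #MAGA")
```

Counts like these can sit alongside the word counts as extra columns, which lets a classifier pick up on punctuation habits as well as vocabulary.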
People Actually Preferred the Staff-written Tweets to the Trumpy Tweets
The first step was to make my classifier give an up-or-down guess of whether Trump or his staff wrote a specific tweet. Here’s how people felt about the tweets that seemed to be authored by Trump:
This histogram has a wide spread, but there’s a big peak to the left of zero. Basically that means that Trump seems to have authored some tweets that people like, but also has written a bunch of poorly-rated tweets.
The staff-written tweets were noticeably less negative:
There’s a wide range, but it seems like many of them fall near or slightly to the right of the neutral point on YouGov’s popularity scale.
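The up-or-down split behind these two histograms can be sketched as follows. The 50 percent cutoff is an assumption (it is the natural default, but any threshold could be used), and the probability and score pairs are invented purely to show the mechanics. They are not real results.

```python
from statistics import mean, median

def split_by_author(tweets, cutoff=0.5):
    """Group popularity scores by the classifier's up-or-down guess.

    `tweets` is a list of (probability Trump wrote it, popularity score) pairs.
    """
    groups = {"trump": [], "staff": []}
    for prob_trump, score in tweets:
        groups["trump" if prob_trump >= cutoff else "staff"].append(score)
    return groups

# Made-up (probability, score) pairs for illustration only.
sample = [(0.92, -60), (0.81, -35), (0.67, 10), (0.30, 5), (0.12, 25)]
groups = split_by_author(sample)

# One simple way to compare the two piles: their typical scores.
summary = {name: (mean(scores), median(scores)) for name, scores in groups.items()}
```

Each list in `groups` is what would feed one histogram, and `summary` gives a quick numeric comparison of the two.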
A similar pattern showed up when we looked at the probabilities instead of simple Trump-or-staff projections.
This graphic shows the relationship between the probability that Trump wrote a tweet and the overall popularity score it got based on YouGov’s polling (positive means popular, negative means unpopular).
There’s a big glob of points in the lower right hand corner of the graphic. That means that when my classifier really believed that Trump wrote a tweet, the American people often disliked it. There’s a lot of noise here, but that cluster of low-rated, Trump-y tweets stands out.
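One common way to put a number on a noisy pattern like this is a Pearson correlation between the classifier’s probability and the popularity score. I am not reporting a coefficient here, so treat the following as a sketch of how the calculation works. The function is a from-scratch implementation and the inputs are made up.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of co-deviations, and the root sum of squared deviations for each list.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / (spread_x * spread_y)

# Made-up probabilities and scores, not real data.
probabilities = [0.10, 0.35, 0.60, 0.85, 0.95]
scores = [30, 20, -10, -40, -70]
r = pearson(probabilities, scores)
```

A value near -1 would mean that Trump-ier tweets are reliably less popular, while a value near 0 would mean the probability tells you little about the score.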
We can also break this down by partisanship and what you see is that even Republicans generally felt a little bit cooler towards more Trump-ish tweets than the staff-ier ones.
To be sure, Republicans rate most of Trump’s tweets (including the ones in his voice) favorably: nearly every tweet gets a rating above 50. But even within that context you can see a cluster of tweets in the lower right-hand side of the graphic, meaning that the more Trump-y (to the right) a tweet is, the less well-rated it is even by his Republican admirers.
Democrats, on the other hand, really didn’t like the Trump-iest tweets.
If the classifier was pretty sure that Trump had written a tweet then it often got a low rating. In fact, there’s a cluster of points with a greater than 50 percent chance that Trump wrote them and a -100 rating (equivalent to literally every Democrat saying the tweet was “Bad”).
Some Basic Takeaways
First, the American people don’t love Trump’s Trump-iest tweets. Random forests are hard to interpret, but people don’t seem to like the thing my classifier picks up on. That’s not to say that Trump’s Twitter style is necessarily a net negative (it has other macro advantages, such as helping him control news cycles). But we shouldn’t pretend that everything he does on social media plays well with the American people.
Second, the “Trump-iness” of the tweet isn’t the only thing that matters. The scatter plots shown above all had some sort of pattern, but the pattern wasn’t overwhelmingly strong. So if we’re trying to figure out whether the public will like a specific tweet, we need to expand beyond tone and maybe think about aspects such as topic or timing.
Third, the partisan breakdown shows that Democrats really dislike tweets that sound like Trump (or at least tweets that my classifier thinks sound like Trump), but Republicans are somewhat cooler towards those sorts of tweets, too. Again, tone only explains so much here (there’s a real correlation between Republican and Democratic assessments of tweets), but that pattern seems to be real.