Obama VS Trump Analysis

1st Apr 2018

Project Background

The objective of this article is to analyze the tweets of Pres. Obama and Pres. Trump. I am interested in the similarities and differences between the 44th and 45th POTUS 🇺🇸.

I will be using natural language processing (NLP) libraries (NLTK, TextBlob) to conduct text and sentiment analysis.

What is natural language processing?

Natural language processing is the ability of computers to understand human languages.

Useful applications/services built using NLP:

  • Chatbots
  • LawTech to automate contract review
  • HRTech to analyze resumes and conduct background checks

I like to read a summary and grasp the main points. Reading a 200-page document is not enjoyable for me. NLP can help me extract keywords and give me an overview of the document.

Some notes:

  • Throughout this article, I will always present the numbers and figures for Pres. Obama first, then Pres. Trump.
  • I was unable to collect all of their tweets, and I could not determine which tweets are missing.

Here is the workflow for this data science project:

Data mining/collection 👉 Data wrangling/cleansing 👉 Data science

Here’s how I split my effort over the 3 weekends it took to complete this case study: data mining (45%), data wrangling (45%) and actual data science (10%). You can refer to this to see how I cleaned the data and the problems I faced when filling in the missing tweets.

Pres. Obama tweets frequency:

According to Twitter, Pres. Obama has a total of 15.5k tweets. I managed to mine 13.2k of them. I checked the months that appear empty in this heatmap; unfortunately, they do not account for the missing tweets.

Obama's tweet frequency
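
Checking a heatmap for empty months can also be done in code. The snippet below is a minimal sketch, not the code behind the heatmap: the function names and the sample timestamps are made up for illustration, and it assumes each tweet's creation time is already available as a `datetime`.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(timestamps):
    """Count tweets per (year, month) cell, like one cell of the heatmap."""
    return Counter((ts.year, ts.month) for ts in timestamps)


def empty_months(counts):
    """List every (year, month) between the first and last active month
    that has no tweets at all."""
    if not counts:
        return []
    (year, month), last = min(counts), max(counts)
    missing = []
    while (year, month) <= last:
        if (year, month) not in counts:
            missing.append((year, month))
        # Step to the next calendar month.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return missing


# Made-up timestamps for illustration only, not real tweet data.
sample = [
    datetime(2017, 11, 3, 14, 5),
    datetime(2017, 11, 20, 9, 30),
    datetime(2018, 2, 1, 18, 45),
]
counts = monthly_counts(sample)
gaps = empty_months(counts)
print(counts)
print(gaps)
```

A month that appears in `gaps` is only a candidate for missing data; the account may simply have been quiet during that period.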

Pres. Trump tweets frequency:

According to Twitter, Pres. Trump has a total of 37.2k tweets. I managed to mine 33.9k of them.

Trump's tweet frequency

Tweet frequency comparison:

It is obvious that Pres. Trump tweets more than Pres. Obama: 37.2k tweets versus 15.5k, according to Twitter's totals.

Pres. Obama metrics overview:

Metrics overview refers to statistics on replies, retweets and likes.

Obama's tweet metrics
Obama's tweet metrics table

Pres. Trump metrics overview:

Trump's tweet metrics
Trump's tweet metrics table

Metrics overview comparison:

In terms of average/mean, Pres. Trump has a stronger social media presence as compared to Pres. Obama.

However, in terms of maximum, Pres. Obama's figures (1,711,231 retweets; 4,591,647 likes) are higher than Pres. Trump's (265,230 retweets; 610,866 likes).
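
One account can lead on the average while the other leads on the maximum, because a single viral tweet raises the maximum far more than it raises the mean. The sketch below illustrates this with made-up retweet counts; the numbers and the `summarize` helper are hypothetical and are not taken from the collected data.

```python
from statistics import mean


def summarize(values):
    """Return the mean and maximum of a list of per-tweet counts."""
    return {"mean": mean(values), "max": max(values)}


# Made-up retweet counts for two hypothetical accounts.
steady_account = [100, 120, 110, 130]   # consistently moderate engagement
viral_account = [10, 20, 15, 395]       # low engagement plus one viral tweet

steady = summarize(steady_account)
viral = summarize(viral_account)
print(steady)
print(viral)
```

Here the steady account has the higher mean, while the viral account has the higher maximum.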

Pres. Obama sentiment overview:

Sentiment analysis consists of polarity and subjectivity.

  1. Polarity ranges from -1 (extremely negative) to 1 (extremely positive).
  2. Subjectivity ranges from 0 (extremely objective) to 1 (extremely subjective).

Obama's sentiment
Obama's sentiment table

Pres. Trump sentiment overview:

Trump's sentiment
Trump's sentiment table

Sentiment overview comparison:

In terms of average, Pres. Obama is slightly less positive and less subjective than Pres. Trump.

Pres. Trump has a higher variance as compared to Pres. Obama.

Based on the distribution of subjectivity, I think that Pres. Obama’s Twitter account is more professional (objective), while Pres. Trump’s account is more personal (subjective).

Pres. Obama top 100 mentions:

Obama's top 100 mentions

Pres. Trump top 100 mentions:

Trump's top 100 mentions

Pres. Obama top 100 hashtags:

Obama's top 100 hashtags

Pres. Trump top 100 hashtags:

Trump's top 100 hashtags
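
Top mentions and hashtags like the ones above can be counted with a regular expression and a `Counter`. This is a minimal sketch under my own assumptions: the patterns, the `top_tokens` helper and the sample tweets are hypothetical, and the patterns are a simplification of Twitter's actual rules for handles and hashtags.

```python
import re
from collections import Counter

# "@" or "#" followed by word characters, but not when glued to a
# preceding word character (this skips e-mail addresses).
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")


def top_tokens(tweets, pattern, n=100):
    """Return the n most common matches of pattern, case-insensitively."""
    counts = Counter()
    for text in tweets:
        counts.update(match.lower() for match in pattern.findall(text))
    return counts.most_common(n)


# Made-up tweets for illustration only.
sample = [
    "Thanks @Alice for joining today. #Demo",
    "Watch live with @alice at 2pm #demo #Launch",
    "Questions? Email someone@example.com",
]
top_mentions = top_tokens(sample, MENTION_RE)
top_hashtags = top_tokens(sample, HASHTAG_RE)
print(top_mentions)
print(top_hashtags)
```

Lowercasing before counting merges variants such as `@Alice` and `@alice`, which would otherwise be ranked separately.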

Pres. Obama top 100 keywords:

All the keywords are cleaned. While some of them are not perfect, they are good enough for this project.

Obama's top 100 keywords

Here are their original spellings (original word, cleaned word):

[('your', 'ymy')]; [('congress', 'congres')]; [("n't #", 'nt')]; [('//wh.gov/live #', 'whgovlive')];

Pres. Trump top 100 keywords:

Trump's top 100 keywords

Here are their original spellings (original word, cleaned word):

[('was', 'wa'), ('wa', 'wa'), ('w/a', 'wa')]; [('via', 'vium')];

Pres. Obama top 100 Noun-phrases (NP):

Obama's top 100 NP

Here are their original spellings (original word, cleaned word):

[('vp biden', 'vpbiden')]; [("'s plan", 'splan'), ("'s plans", 'splan'), ('’ s plan', 'splan'), ('’ s plans', 'splan')];

Pres. Trump top 100 Noun-phrases (NP):

Trump's top 100 NP

Here are their original spellings (original word, cleaned word):

[('russia', 'russium')];

Pres. Obama tweet frequency (Day of week VS Year):

Obama's day of week frequency

Pres. Trump tweet frequency (Day of week VS Year):

Trump's day of week frequency

Pres. Obama tweet frequency (Day of month VS Year):

Obama's day of month frequency

Pres. Trump tweet frequency (Day of month VS Year):

Trump's day of month frequency

Pres. Obama tweet frequency (Hour (GMT) VS Year):

Obama's hour frequency

Pres. Trump tweet frequency (Hour (GMT) VS Year):

Trump's hour frequency
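
The day-of-week, day-of-month and hour breakdowns all follow the same pattern: group tweets by year and by one component of the timestamp. The sketch below shows that grouping. It assumes timestamps are timezone-aware and already in GMT/UTC; the `frequency_table` helper and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def frequency_table(timestamps, key):
    """Count tweets per (year, key(timestamp)) cell."""
    return Counter((ts.year, key(ts)) for ts in timestamps)


# Made-up timestamps in UTC, for illustration only.
sample = [
    datetime(2018, 3, 5, 13, 0, tzinfo=timezone.utc),    # a Monday
    datetime(2018, 3, 12, 13, 30, tzinfo=timezone.utc),  # a Monday
    datetime(2018, 3, 7, 2, 15, tzinfo=timezone.utc),    # a Wednesday
]

by_weekday = frequency_table(sample, lambda ts: WEEKDAYS[ts.weekday()])
by_day_of_month = frequency_table(sample, lambda ts: ts.day)
by_hour = frequency_table(sample, lambda ts: ts.hour)
print(by_weekday)
print(by_day_of_month)
print(by_hour)
```

Each resulting counter maps a (year, value) pair to a tweet count, which is the shape a year-by-value heatmap needs.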

Final thoughts

Based on the results, I would say that Pres. Obama's account gives me a professional vibe, and I would tend to take his tweets more seriously than Pres. Trump's. However, Pres. Trump's account allows me to understand him as an individual better than Pres. Obama's account does.

Since these 2 accounts are so different in terms of vibe (professional vs personal), I don't have a table to summarize the similarities and differences between them.