
Digging into Tweets:
Mining Twitter Comments and Gauging Influence

May 14, 2020

The business mindset about social media

A trend has emerged in which the two giants of the social media sphere have been typecast, at least in people's minds, into different silos.


Most popular brands have “Fan Pages” and “Twitter Handles” catering to audiences on both mediums. But with two-way exchange, experience, and user behavior, brands (and celebrities) have found Twitter to be the ‘short and sweet’ spot for reaching their target audience. Be it campaigns, grievance redressal, contests, or promotions, the preference has tilted toward Twitter. Well, this isn’t a critique of social media, so let me get back to the point.

The social media situation of one of our clients

One of the world leaders in logistics, with over four decades in the industry, headquartered in North America and operating in 150+ countries, approached Decision Foundry with a gamut of requests such as site auditing, dashboard creation, and mining of Twitter feeds. The client also had multiple Twitter handles: a main handle, country-specific handles, and a help handle.

The Twitter handles had existed since early 2010, running regular campaigns, tweeting about new developments, news, and updates on upcoming seasons, and running various contests. But no one had ever crunched the numbers. The client had never done a deep dive into the analytics of its social media efforts, and its Twitter efforts in particular.

Starting with exploration

The engagement started in an exploratory mode, to identify how far insights could be mined from the Twitter data. The exercise went through multiple iterations before arriving at a final consolidated outcome.

Prior to this engagement, we had worked successfully on various text mining projects, some of which were presented as case studies to a wider audience (you can view the event here). We had worked for apparel brands, large B2B IT consulting companies, and now a top logistics company.

For this engagement, we were provided with multiple datasheets of 20k rows each (20k because the client's tool could pull at most 20k rows at a time), with columns such as the tweet, the author of the tweet, the timestamp, the follower count, etc. The tweets had been extracted to gauge user behavior around a particular brand of services that the client was running. The data wasn’t very rich in information, but it was enough to draw out some actionable insights.
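
As a rough illustration of the preparation step this implies, the sketch below merges several capped exports into one de-duplicated, time-ordered list. It is a minimal Python sketch, not the tooling used on the project. The column names (`tweet`, `author`, `timestamp`, `followers`), the timestamp format, the helper name `merge_exports`, and the sample handles are all assumptions, since the actual export layout is not described here.

```python
import csv
import io
from datetime import datetime

# Assumed timestamp layout; the real export format is not known.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

def merge_exports(files):
    """Merge several CSV exports, dropping rows repeated across pulls."""
    seen = set()
    merged = []
    for f in files:
        for row in csv.DictReader(f):
            # Treat the same author, time, and text as the same tweet.
            key = (row["author"], row["timestamp"], row["tweet"])
            if key in seen:
                continue
            seen.add(key)
            merged.append({
                "tweet": row["tweet"],
                "author": row["author"],
                "timestamp": datetime.strptime(row["timestamp"], TIME_FORMAT),
                "followers": int(row["followers"]),
            })
    # Oldest tweet first, so later trend analysis can walk the list in order.
    merged.sort(key=lambda r: r["timestamp"])
    return merged

# Two made-up pulls that overlap by one row.
pull_1 = io.StringIO(
    "tweet,author,timestamp,followers\n"
    "Great service #BrandX,@ann,2020-01-02 09:00:00,120\n"
    "Package late #BrandX,@bob,2020-01-01 18:30:00,45\n"
)
pull_2 = io.StringIO(
    "tweet,author,timestamp,followers\n"
    "Package late #BrandX,@bob,2020-01-01 18:30:00,45\n"
    "RT @ann: Great service #BrandX,@cy,2020-01-02 10:15:00,900\n"
)

rows = merge_exports([pull_1, pull_2])
print(len(rows), "unique tweets; earliest by", rows[0]["author"])
```

De-duplicating on author, timestamp, and text is a simplification; an export that carries a tweet ID would make a more reliable key.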

Digging deeper

Looking at the variables available, we shortlisted four main approaches to bring deeper insights:

  • Sentiment Analysis : I like calling this the classical analysis, as it is what first comes to mind when we think about social media and is what most people do. For the sentiment analysis we mainly relied on R, where we have built code that assigns part-of-speech (POS) tags to each of the tweets. These tweets are then mined using proprietary code that we have developed in VBA, which parses the data based on the POS tags and identifies the sentiments. Our algorithm worked quite well, with nearly 85% accuracy and strong precision and recall.
  • Hashtag Behavior : Taking the analysis to the next level, we moved on to understand the trends, what types of hashtags worked, and how hashtags were associated with one another. Were there certain hashtag pairs, not necessarily designed to go together, that people were using, e.g.: #BrandX, #Sucks? The hashtag behavior analysis focused on trending and on the ‘lifespan’ of the hashtags (when was a hashtag introduced, when did it peak, when did it fade, did it resurge, and if so, when and why?). We also studied hashtag associations in R by mining the hashtags with association rules. This uncovered some startling hashtags appearing together that carried a negative association (a small but significant share). It called for quick action to counter this and keep it from going viral and affecting the sentiments of many Twitter users.
  • Influencer Analysis : With a lot of quality information, this analysis can be a goldmine for success on Twitter. In our case the only way we could measure influence was through retweets (not bad, though). Influencer analysis means understanding who, apart from the known Twitter handles that support the brand, has the ability to influence their followers through a positive or negative outlook towards the brand, and how that information can be used to our advantage. There are generally two major groups of such users: those who tweet quite often but have a relatively small following, and those who tweet less often but have a stronger following and a much greater reach. Our analysis unearthed some excellent examples of Twitter users who related to specific campaigns and retweeted about the brand. The client now has a shortlist of these users, and there are plans to approach them and sign them up as brand ambassadors.
  • Campaign Success Parameters : The client was running various campaigns with celebrity-level sports stars and had therefore made a sizable advertising investment. It was important to measure the success of these campaigns to understand which ones were performing better, and what kinds of campaigns, celebrities, and verticals people relate to. For example, NFL players, NASCAR champs, and golf aces had been signed on for the campaigns, but how successful was each?

    We delved deep again into tweet hashtags, looking at how many unique users the campaigns ended up engaging (only retweets were available here as an engagement measure), and for how many days the impact of a campaign lingered after it was over. This also showed how well the brand kept users in conversation when no campaign was running. It turned out that while campaign days resulted in as many as 300 tweets a day, non-campaign days were as low as one tweet a day. This has implications for the optimal time gap between campaigns. It can also help determine the cost and frequency of expensive campaigns needed to reach the point where people start conversing about the brand on their own and the conversation does not lose steam.
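
Two of the measurements described above, hashtag associations and the after-effect of a campaign, can be sketched roughly as follows. The original work was done in R and VBA; this is only an illustrative Python sketch under stated assumptions, not the production code. For hashtag pairs it computes the standard association-rule metrics (support, confidence, and lift) for two-item rules only. The function names `hashtag_pair_stats` and `campaign_summary`, the `#GolfAce` hashtag, the handles, and the sample tweets are all hypothetical.

```python
import re
from collections import Counter
from datetime import date
from itertools import combinations

HASHTAG = re.compile(r"#\w+")

def hashtag_pair_stats(tweets, min_count=2):
    """Support, confidence, and lift for hashtag pairs that co-occur."""
    n = len(tweets)
    tag_counts = Counter()
    pair_counts = Counter()
    for text in tweets:
        # Count each hashtag once per tweet, ignoring case.
        tags = sorted({t.lower() for t in HASHTAG.findall(text)})
        tag_counts.update(tags)
        pair_counts.update(combinations(tags, 2))
    stats = {}
    for (a, b), both in pair_counts.items():
        if both < min_count:
            continue
        support = both / n
        stats[(a, b)] = {
            "support": support,                  # share of tweets with both tags
            "confidence": both / tag_counts[a],  # share of tweets with a that also have b
            "lift": support / ((tag_counts[a] / n) * (tag_counts[b] / n)),
        }
    return stats

def campaign_summary(rows, hashtag, campaign_end):
    """rows holds (day, author, text) tuples; summarise one campaign hashtag."""
    tag = hashtag.lower()
    per_day = Counter()
    users = set()
    for day, author, text in rows:
        if tag in (t.lower() for t in HASHTAG.findall(text)):
            per_day[day] += 1
            users.add(author)
    tail = [d for d in per_day if d > campaign_end]
    return {
        "unique_users": len(users),
        "peak_day_tweets": max(per_day.values(), default=0),
        # Days between the campaign end and the last tweet still using the tag.
        "tail_days": (max(tail) - campaign_end).days if tail else 0,
    }

tweets = [
    "Love the new tracking #BrandX",
    "Lost my parcel again #BrandX #Sucks",
    "#BrandX #Sucks two days late",
    "Race day! #NASCAR",
]
pairs = hashtag_pair_stats(tweets)

rows = [
    (date(2020, 1, 1), "@ann", "RT #GolfAce giveaway"),
    (date(2020, 1, 1), "@bob", "RT #GolfAce giveaway"),
    (date(2020, 1, 2), "@ann", "Still talking about #GolfAce"),
    (date(2020, 1, 6), "@cy", "Missed the #GolfAce contest"),
    (date(2020, 1, 6), "@cy", "Unrelated #BrandX"),
]
summary = campaign_summary(rows, "#GolfAce", campaign_end=date(2020, 1, 2))

print(pairs)
print(summary)
```

A lift above 1 means the two hashtags appear together more often than their individual frequencies would suggest, which is the kind of signal that would flag an unwanted pairing such as #BrandX with #Sucks. The `tail_days` figure is one simple way to express how long a campaign keeps generating conversation after it ends.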

To conclude

What began as an exploratory text mining exercise on the Twitter feeds turned into a robust set of outcomes and recommendations for the client, giving them well-defined, actionable steps toward a stronger brand presence on Twitter. They have also made Twitter analytics a regular part of their intelligence activities.

