One of the most popular use cases for Big Data technology is sentiment analysis, the process of combing social media to determine how the general public feels about a product, company, political candidate, or any other topic.
If this sounds like it shouldn’t be that complicated, you’re correct; it can be done without a huge amount of resources. There are basically two ways to search Twitter: querying their database directly (which is rate-limited), or using their streaming API to store tweets in your own database. We’ll use the second method because once the tweets are stored on your server you can search them as often as you want.
Have your team follow these steps:
- Create an account on Twitter so you can log in to the Twitter API (REST-based) and authenticate using OAuth.
- Install MongoDB on a server. Responses from the Twitter API are in JSON format, and MongoDB is ideal for storing them.
- Have a Java coder implement Twitter4J to interface with the Twitter API, start collecting tweets, and store them to MongoDB.
- Query MongoDB to get results for a given search term.
- Count and sort the words in the results, excluding articles like “a, an, the” and other common words.
- Load the output into a visualization tool such as wordle.net to see the results.
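The counting and sorting step in the list above could look something like the following minimal Java sketch. It assumes the tweet texts have already been pulled out of MongoDB into a list of strings, so no Twitter4J or MongoDB calls appear here. The class name `WordFrequency`, the method `topWords`, and the short stop-word list are placeholders for illustration, not part of the original write-up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class WordFrequency {
    // Hypothetical, deliberately tiny stop-word list; extend it for real use.
    static final Set<String> STOP_WORDS =
            Set.of("a", "an", "the", "and", "or", "of", "to", "in", "is", "it", "rt");

    // Returns up to `limit` words, most frequent first; ties are broken alphabetically.
    static List<Map.Entry<String, Integer>> topWords(List<String> tweets, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String tweet : tweets) {
            // Lowercase, then split on anything that is not a letter, digit, or apostrophe.
            String[] tokens = tweet.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}']+");
            for (String token : tokens) {
                if (token.isEmpty() || STOP_WORDS.contains(token)) {
                    continue; // skip empty fragments and stop words
                }
                counts.merge(token, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort((x, y) -> {
            int byCount = Integer.compare(y.getValue(), x.getValue());
            return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
        });
        return sorted.subList(0, Math.min(limit, sorted.size()));
    }

    public static void main(String[] args) {
        // Made-up sample tweets, standing in for the results of a MongoDB query.
        List<String> tweets = List.of(
                "I love the new phone!",
                "The phone is great",
                "RT: a great phone");
        for (Map.Entry<String, Integer> entry : topWords(tweets, 5)) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

The word-and-count pairs this prints are one possible input for a word-cloud tool. Real tweets also contain URLs, hashtags, and mentions, which this simple tokenizer breaks into fragments, so you would probably want to strip those first.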
You can find all the gory details, including Java code examples, here.
If you implement these steps, please comment. We’d all love to hear from you.