Information about the major categories of Big Data technologies.
Let us explore some Big Data technologies that form the basis of the Big Data world today.
If you read our article on Understanding the Big Data Phenomenon, you will know that Big Data is helping companies overcome previously insurmountable barriers and move towards greater business growth. Big Data is not just a collection of extremely large data sets, but also the tools, techniques, and frameworks used to analyze them.
Big Data forms and types
Before we venture into frameworks, we need to understand what types of data need to be captured, stored and analyzed. There are three main types of data:
Structured: Traditional, relational data
Semi-structured: XML data
Unstructured: Word documents, PDFs, plain text, media, logs, etc.
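To make the three types concrete, here is a minimal sketch in Python using only the standard library. The sample records (orders, customer names, the log line) are invented for illustration, and a CSV string stands in for a relational table.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Structured: every row follows the same fixed schema, as in a relational table.
# (Hypothetical sample data; CSV is used here as a stand-in for a table.)
structured = "order_id,customer,amount\n1001,alice,25.50\n1002,bob,12.00\n"
rows = list(csv.DictReader(io.StringIO(structured)))
total = sum(float(row["amount"]) for row in rows)

# Semi-structured: XML is self-describing, but fields may differ per record.
# Only the second order carries a <gift> element.
semi_structured = (
    "<orders>"
    "<order id='1001'><item>book</item></order>"
    "<order id='1002'><item>pen</item><gift>yes</gift></order>"
    "</orders>"
)
root = ET.fromstring(semi_structured)
items = [order.findtext("item") for order in root.findall("order")]
gift_ids = [order.get("id") for order in root.findall("order")
            if order.find("gift") is not None]

# Unstructured: free text has no schema, so any structure must be inferred.
log_line = "2024-01-05 user alice complained the checkout page was slow"
words = re.findall(r"[a-z]+", log_line.lower())

print(total)     # 37.5
print(items)     # ['book', 'pen']
print(gift_ids)  # ['1002']
print(words)
```

The further the data moves from a fixed schema, the more work the reading code has to do to recover meaning from it.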
Big Data is a combination of all the above types of data. Some examples of Big Data include:
Social Media Data: Facebook posts and tweets contain the preferences and opinions of millions of users around the globe.
Search Engine Data: All the keywords your customers may have searched for on your social media pages, on your website, or about you on Google. This data can give profound insights into what your customers want from you.
Financial data: This includes purchase decisions made by your customers when they check out, save a cart, add to a wishlist, start a free trial, subscribe, etc. It also includes your own spending and operating cost data, such as salaries, purchases, energy, etc.
Product data: This includes every single detail of your products including make and model, features, barcode details, cost history, etc.
Big Data Technologies
To harness the power of Big Data, we need an infrastructure that is scalable, secure, and able to process all the types of data discussed above. The technologies that provide it fall into two main categories: operational and analytical.
Operational Big Data technologies provide real-time capabilities for capturing, storing and analyzing data. For example, a gaming website may want to study the current moves made by players and suggest new moves. Or a retail site may want to give real-time “suggested products” to returning customers based on past purchases. Operational technologies are used to support such functionality.
These technologies take advantage of cloud computing architecture to run massive computations inexpensively and efficiently. They handle large operational workloads faster without a matching increase in cost. They surface insights and patterns in real-time data with minimal coding and support. Some examples include NoSQL databases like MongoDB and Couch.
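The retail example above (real-time suggestions based on past purchases) can be sketched roughly as follows. This is a toy, in-memory illustration and not how any particular NoSQL database works; the class name `SuggestionStore`, its methods, and the sample purchases are all hypothetical.

```python
from collections import Counter, defaultdict


class SuggestionStore:
    """Toy in-memory stand-in for an operational data store (hypothetical)."""

    def __init__(self):
        self.purchases = defaultdict(list)       # customer -> products bought
        self.bought_with = defaultdict(Counter)  # product -> co-purchase counts

    def record_purchase(self, customer, product):
        # Update co-purchase counts as each purchase arrives,
        # so suggestions can be served without a batch job.
        for earlier in self.purchases[customer]:
            if earlier != product:
                self.bought_with[earlier][product] += 1
                self.bought_with[product][earlier] += 1
        self.purchases[customer].append(product)

    def suggest(self, customer, limit=3):
        # Score products that other customers bought alongside
        # the ones this customer already owns.
        owned = set(self.purchases[customer])
        scores = Counter()
        for product in owned:
            for other, count in self.bought_with[product].items():
                if other not in owned:
                    scores[other] += count
        return [product for product, _ in scores.most_common(limit)]


store = SuggestionStore()
events = [
    ("ann", "laptop"), ("ann", "mouse"),
    ("ben", "laptop"), ("ben", "mouse"), ("ben", "bag"),
    ("cat", "laptop"),
]
for customer, product in events:
    store.record_purchase(customer, product)

suggestions = store.suggest("cat")
print(suggestions)  # ['mouse', 'bag']
```

The key design point is that the counts are updated on every write, which is what lets the read path answer immediately for a returning customer.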
While the end users of operational technologies are the customers themselves, data scientists use analytical technologies for retrospective analysis. These systems provide sophisticated analysis techniques that go beyond the scope of NoSQL. For example, companies may want to find out which discount schemes worked best over the past year. Or they may want to analyze the factors that improved employee efficiency. Such studies can be done using analytical Big Data technologies.
These technologies use Massively Parallel Processing (MPP) databases to crunch numbers faster by spreading the work across many parallel processes. They also use MapReduce to select subsets of data, group them and analyze them further. Some examples include Hadoop, Hive, and Pig.
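The MapReduce idea of selecting, grouping and then analyzing data can be illustrated with a small single-process sketch in plain Python. It mimics the map, shuffle and reduce phases only; a real framework such as Hadoop distributes these phases across many machines. The function names and the sales records are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records: (discount_scheme, order_value)
sales = [
    ("10_PERCENT_OFF", 90.0),
    ("FREE_SHIPPING", 40.0),
    ("10_PERCENT_OFF", 45.0),
    ("BUY_ONE_GET_ONE", 60.0),
    ("FREE_SHIPPING", 55.0),
]


def map_phase(record):
    # Map: emit a (key, value) pair for each input record.
    scheme, value = record
    yield scheme, value


def shuffle(pairs):
    # Shuffle: group all values that share the same key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: collapse each group into a single result.
    return key, sum(values)


mapped = [pair for record in sales for pair in map_phase(record)]
grouped = shuffle(mapped)
revenue = dict(reduce_phase(key, values) for key, values in grouped.items())
best = max(revenue, key=revenue.get)

print(revenue)
print(best)  # 10_PERCENT_OFF
```

Because each map call and each reduce call depends only on its own input, the phases can be run in parallel on separate nodes, which is what makes the model suit very large data sets.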
It is important to note that operational and analytical technologies are not competing but complementary. It is not a question of either-or. They serve different requirements, and most companies use a combination of both types to create a Big Data solution that meets their business goals.