01 - Data Mining - Overview

Introduction To Data Mining

Data mining is the process of turning raw data into useful information. Any numbers, text, facts, web pages or documents that a computer can process count as data, and mining is the act of extracting something useful from them. As the name indicates, then, data mining means extracting useful information from large volumes of data. It has been a buzzword for the last few years, and businesses are trying to stand out from the crowd by making the best use of it.

Why Data Mining?

The simple answer to “Why is data mining required?” is that data, which is at the core of any business, is everywhere. We live in a world where almost everything is being converted to data. Every click, tap, swipe, like, tweet, share and phone call generates data, and the amount being collected and stored is exploding. Just consider how much data a telecom service provider or a bank has to handle. In short, the data explosion is one of the reasons that data mining is necessary.

Secondly, technology is now so advanced that it is easy and cheap to collect, store and retrieve large volumes of data. Data storage costs have declined dramatically, which has made it practical to keep very large data sets (so-called big data). The processing power of computers also continues to grow rapidly. Together, these advances let organizations gather large amounts of data from many different sources easily and at low cost.

Thirdly, competition demands that information be available at your fingertips in the blink of an eye. Your business might be storing terabytes of data in its databases, at a considerable cost in effort, time and money. In addition to the data available within an organization, the Internet is also a great data source. But data in its raw form is often not useful on its own. So, in today’s competitive business world, there should be processes in place to extract useful information from raw data, information that can help you make critical decisions and develop new strategies.

What is Data Mining?

Data mining is the process of digging through large volumes of data and extracting previously unidentified and potentially useful information. In other words, data mining produces information that ordinary queries and reports do not normally uncover. By finding useful patterns and trends in different aspects of the company, businesses can develop new strategies that help them gain a competitive advantage.

Data mining can also help predict behaviors and future trends, which lets businesses become more proactive and make more accurate, information-driven decisions. In short, data mining makes the whole process of information management faster, easier and more efficient, and it helps answer business questions more accurately.

What is NOT Data Mining?

Many people still think that data mining is simple query processing. In fact, it is much more than that. Retrieving a matching record from a huge database with a simple or complex query is not data mining. On the other hand, finding out which names are most popular in certain US locations does call for data mining techniques, because the answer is a pattern summarized across many records rather than a record that is already stored.
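The difference can be illustrated with a small Python sketch. The records, the function names and the choice of US states as "locations" are all made up for illustration, and counting frequencies is only the simplest possible form of pattern finding, not a full data mining method. The first function fetches records that are already there; the second derives a summary that no single record contains.

```python
from collections import Counter

# Hypothetical sample data: (first_name, state) pairs.
records = [
    ("Emma", "CA"), ("Liam", "CA"), ("Emma", "CA"),
    ("Noah", "TX"), ("Emma", "TX"), ("Noah", "TX"),
]

def lookup(records, name, state):
    """Plain query: return the stored records that match exactly."""
    return [r for r in records if r == (name, state)]

def most_popular_by_state(records):
    """Summarize: find the most frequent name in each state."""
    counts = {}
    for name, state in records:
        counts.setdefault(state, Counter())[name] += 1
    # most_common(1) returns a list with one (name, count) pair.
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}

print(lookup(records, "Emma", "CA"))
print(most_popular_by_state(records))
```

With this sample data, the summary reports Emma as the most frequent name for CA and Noah for TX, a result that comes from looking across all records at once.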

Applications of Data Mining

Data mining helps businesses identify important facts, trends, patterns, relationships and exceptions that would normally go unnoticed or stay hidden. For this reason, data mining techniques are applied in a wide range of industries, including healthcare, insurance, finance, retail and manufacturing.

Retailers use data mining techniques to spot sales trends. By analyzing the purchase patterns of customers, retailers can design smarter marketing promotions and campaigns, which in turn increase sales. With market segmentation, retailers can identify groups of customers who purchase the same products. By analyzing the interests and demographics of those customers, they can introduce new products at the right time. Data mining can also be used to predict which customers are most likely to start purchasing from your competitors.
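As a rough illustration of analyzing purchase patterns, the Python sketch below counts how often pairs of products appear in the same basket. The baskets, the function name `frequent_pairs` and the `min_support` cutoff are assumptions made for this example; it shows only the pair-counting idea behind market basket analysis, not a complete algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of products per purchase.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def frequent_pairs(baskets, min_support):
    """Return product pairs bought together in at least
    `min_support` (a fraction between 0 and 1) of all baskets."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort so each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1
    total = len(baskets)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }

print(frequent_pairs(baskets, min_support=0.5))
```

In this sample, bread and butter appear together in three of the four baskets, so they are the only pair that passes a 0.5 cutoff. A retailer could use a pattern like this when planning a joint promotion.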

Fraud detection is a major headache for finance and insurance companies. Studies suggest that customer demographics can help predict the likelihood of fraud, and data mining is now used to identify the transactions that are most likely to be fraudulent. In the healthcare industry, data mining techniques are mainly used to support more accurate disease diagnosis and more effective treatment. They are also helpful in predicting health insurance fraud, healthcare cost and the length of stay (LOS) in hospital.
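One simple way to picture the fraud detection use case is to flag transactions that sit far from a customer's usual spending. The Python sketch below does this with the median and the median absolute deviation (MAD). The amounts, the function name `flag_unusual` and the threshold of 5 are illustrative assumptions; real fraud systems combine many more signals, and an unusual amount is only a hint, not proof of fraud.

```python
from statistics import median

def flag_unusual(amounts, threshold=5.0):
    """Return the amounts whose distance from the median is more than
    `threshold` times the median absolute deviation (MAD)."""
    center = median(amounts)
    mad = median([abs(a - center) for a in amounts])
    if mad == 0:
        # No spread to compare against, so this sketch flags nothing.
        return []
    return [a for a in amounts if abs(a - center) / mad > threshold]

# Hypothetical transaction amounts for one customer.
history = [20, 25, 22, 19, 24, 21, 500]
print(flag_unusual(history))
```

Here only the 500 transaction is flagged. The median and MAD are used instead of the mean and standard deviation because a single extreme value distorts them far less.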


Data mining makes it possible to analyze data from different perspectives and summarize it into useful information. This information can be used in different ways: to cut costs, increase profit, extend the reach of the business and so on. In short, data mining provides you with potentially useful information that helps you stand out from the crowd.
