DEFINING DATA MINING

We define Data Mining as the process of analysing data by means of intelligent algorithms. These algorithms automatically recognise interesting patterns in large data sets. We call a data set ‘large’ if it contains at least several thousand records.

The human brain cannot analyse large data sets because its capacity to calculate and memorise is limited. For example, even a relatively small data set of 10,000 records (lines) and 50 variables (columns) is already too complex for a person to analyse. Suppose each variable has only two possible values: the brain would then have to consider 10,000 x 50 x 2 = 1 million combinations! That means 1 million calculations, followed by evaluating and interpreting all of these data values. And we are not finished yet: we still need to extract interesting patterns from this mass of calculations by means of reasoning. The example illustrates that a human brain cannot perform this type of analysis.
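The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch: the function and variable names are made up for illustration, and the second figure (the number of distinct value patterns a single record could show) does not appear in the text above; it is added as an assumption about what else could be counted.

```python
# Figures taken from the example in the text.
records = 10_000           # lines in the data set
variables = 50             # columns in the data set
values_per_variable = 2    # each variable takes one of two values


def count_checks(records: int, variables: int, values_per_variable: int) -> int:
    """Number of record x variable x value combinations, as counted in the text."""
    return records * variables * values_per_variable


def count_patterns(variables: int, values_per_variable: int) -> int:
    """Illustrative extra (not from the text): distinct value patterns
    one record can show across all variables."""
    return values_per_variable ** variables


total = count_checks(records, variables, values_per_variable)
patterns = count_patterns(variables, values_per_variable)

print(f"{total:,}")      # 1,000,000
print(f"{patterns:,}")   # 1,125,899,906,842,624
```

The first number matches the 1 million from the example. The second suggests that the workload grows much faster once whole patterns across variables are considered rather than single values.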

If the way a human brain reasons is captured in a computer algorithm, the computer incorporates human reasoning and intelligence. Moreover, the computer is far less limited in its capacity to calculate and memorise. This is what we call Data Mining.

For more practical information, please visit Data Mining Business Cases.

 

Check out the basic methods of Data Mining.