What Is Data Mining?

Data Mining is the art and science of discovering and exploiting new, useful, and profitable relationships in data.

In any business, we study what happens in order to improve. We study our potential customers. We study our actual customers. We study what we do and how we do it.

Our objective is to find patterns of behavior (predictable outcomes) and turn them into profitable opportunities.

In the past, this study of patterns was severely limited by the human effort involved and by the expense of gathering the necessary data.

In the last few years, the balance has changed. More data is readily available to most businesses than ever before, and more computing power is available to process it. Automated techniques that find patterns in that data with limited human intervention have also evolved and matured.

Data Mining Techniques

Most data mining techniques fall into one of two related categories: model building and clustering.

Model Building seeks to create a predictive model related to a business question. For example, we could try to model how likely each customer is to be interested in a particular offer, and how much profit we would expect to earn if that customer accepted it. If we succeed, we can make a rational decision about which customers to approach, based upon (a) the cost of making the offer, (b) the estimated probability of acceptance, and (c) the estimated profit if that customer accepts the offer.
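One simple way to combine factors (a), (b), and (c) is to approach a customer only when the expected profit exceeds the cost of making the offer. The sketch below is illustrative only: the Customer class, the function names, and all of the figures are hypothetical, and it assumes the offer cost is paid whether or not the customer accepts.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    p_accept: float          # (b) estimated probability of acceptance
    profit_if_accept: float  # (c) estimated profit if the offer is accepted


def expected_net_value(customer: Customer, offer_cost: float) -> float:
    """Expected profit from making the offer, minus (a) the cost of making it."""
    return customer.p_accept * customer.profit_if_accept - offer_cost


def customers_to_approach(customers, offer_cost):
    """Keep only the customers whose expected net value is positive."""
    return [c for c in customers if expected_net_value(c, offer_cost) > 0]


# Hypothetical model output for three customers.
customers = [
    Customer("A", p_accept=0.10, profit_if_accept=200.0),
    Customer("B", p_accept=0.02, profit_if_accept=150.0),
    Customer("C", p_accept=0.30, profit_if_accept=40.0),
]

selected = customers_to_approach(customers, offer_cost=5.0)
print([c.name for c in selected])
```

With these made-up numbers, customer B is skipped: a 2% chance of a 150 profit is worth about 3 on average, which is less than the 5 it costs to make the offer.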

Depending upon the techniques chosen, a model may be either opaque (it works, but we aren't exactly sure how or why) or transparent (we understand exactly how the model arrives at any prediction). Either may be acceptable, depending upon the application. An opaque model that predicts production defect rates is perfectly acceptable if our interest is limited to production planning, but we would certainly prefer a transparent model if we were interested in increasing productivity, because we would need to see which factors drive the defects. (See also Market Basket Analysis).

Clustering attempts to segment a population into groups whose members have (as far as we are concerned) similar characteristics and are therefore expected to behave in a similar manner. Unlike model building, there is typically no specific outcome or attribute that must be predicted. The objective is often simply to group similar things together so that we can reason about them more easily.
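As a rough illustration of grouping similar things together, here is a minimal sketch of k-means, one common clustering technique (the text above does not prescribe any particular method). The customer data, the choice of two groups, and the starting centroids are all invented for the example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign(points, centroids):
    """Put each point in the group of its nearest centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        groups[nearest].append(p)
    return groups


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids to group means."""
    centroids = list(centroids)
    for _ in range(iterations):
        groups = assign(points, centroids)
        centroids = [
            # Mean of each coordinate; an empty group keeps its old centroid.
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids, assign(points, centroids)


# Hypothetical customers: (visits per month, average spend per visit).
points = [(1, 20), (2, 25), (1, 30), (9, 200), (10, 220), (8, 210)]

# Starting centroids are picked by hand here so the run is repeatable.
centroids, groups = k_means(points, centroids=[points[0], points[3]])
print(groups)
```

Note that no outcome is being predicted: the algorithm only looks at how close the customers are to one another. In practice the attributes would usually be rescaled first, since otherwise the one with the largest numeric range (here, spend) dominates the distance.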

For marketing purposes, some form of clustering is essential. It's difficult to find a single marketing message that appeals to all customers, or a single advertising medium that will reach all our customers at a reasonable price.

Clustering can also yield opportunities: once we find a group of similar customers, there may be a way to extend or combine a product specifically for that group.

Within the model building and clustering areas there are many available techniques. Which technique to apply will depend upon the specific business objectives, as well as on the availability and structure of the data.


Albion Research Ltd. is based in Ottawa, Canada.

© Albion Research Ltd. 2017