Data Mining: How to Get Started

The first stage is to critically examine your data and your business objectives. Albion Research Ltd can help you determine the feasibility of a data mining project, develop a proof of concept, and, by working with your staff, integrate data mining into your day-to-day operations.

A data mining project is normally broken into three phases:

  1. Targeting Study
  2. Proof of Concept
  3. Deployment

We can help you in all three of these phases.

1. Targeting Study

The objective of the targeting study is to identify and prioritize potential data mining projects.

During this phase, potential users of the new information are interviewed to determine both the types of patterns that would be of interest and the likely benefits of finding such patterns.

The available data is assessed for quality, quantity, and scope.
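As a rough illustration of what a first-pass data assessment might look like, the sketch below counts rows (quantity), lists the available columns (scope), and measures the share of missing values per column (one simple quality indicator). The function name `profile_rows`, the column names, and the sample data are all hypothetical, invented for this example; a real assessment would look at far more than missing values.

```python
import csv
import io


def profile_rows(rows):
    """Summarise row count, available columns, and missing-value rate per column."""
    rows = list(rows)
    columns = list(rows[0].keys()) if rows else []
    missing = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)
            # Treat absent or blank fields as missing.
            if value is None or value.strip() == "":
                missing[col] += 1
    return {
        "row_count": len(rows),
        "columns": columns,
        "missing_rate": {
            col: (missing[col] / len(rows) if rows else 0.0) for col in columns
        },
    }


# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = """customer_id,age,region
1,34,North
2,,South
3,51,
4,29,North
"""

report = profile_rows(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(report["row_count"])
print(report["columns"])
print(report["missing_rate"])
```

A report like this gives an early, inexpensive signal of whether a data source is complete enough to be worth mining.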

An assessment is also made as to which data mining techniques are likely to be appropriate to each project type: this gives an early indication of the costs involved and the technical risks associated with the project.

At the end of this phase you should have a technical and financial report which identifies and assesses potential data mining projects based on:

  • their technical feasibility,
  • their likely costs and potential financial returns,
  • the technical risks involved.
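One simple way to turn these three criteria into a ranking is sketched below. It assumes that technical risk can be summarised as a single probability of success and that projects are ordered by risk-adjusted net return; both are simplifying assumptions made for illustration, not a method prescribed here. The project names and figures are invented.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    cost: float                 # estimated cost of the project
    expected_return: float      # estimated financial return if it succeeds
    success_probability: float  # rough stand-in for technical risk, 0..1

    def risk_adjusted_net_return(self) -> float:
        # Assumed scoring rule: return weighted by chance of success, minus cost.
        return self.success_probability * self.expected_return - self.cost


def prioritize(candidates):
    """Order candidate projects from most to least attractive."""
    return sorted(
        candidates, key=lambda c: c.risk_adjusted_net_return(), reverse=True
    )


# Hypothetical candidates, invented for illustration only.
candidates = [
    Candidate("churn prediction", cost=40_000, expected_return=200_000,
              success_probability=0.5),
    Candidate("basket analysis", cost=10_000, expected_return=60_000,
              success_probability=0.9),
    Candidate("fraud detection", cost=80_000, expected_return=300_000,
              success_probability=0.25),
]

ranked = prioritize(candidates)
for candidate in ranked:
    print(candidate.name, round(candidate.risk_adjusted_net_return()))
```

In practice the scoring rule would be agreed with the people who interviewed the potential users, since they are best placed to judge how reliable the cost and return estimates are.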

2. Proof of Concept

The proof of concept phase has three objectives:

  • estimating the expected actual return on investment for data mining,
  • eliminating any technical risks,
  • and defining any full-scale data mining process which might form a permanent part of your business.

The project chosen for implementation is one with low investment, high return, and limited technical risk, as identified during the Targeting Study.

The initial stages of the proof of concept involve refining the target objectives and cleaning, preparing, and normalizing the data. One or more appropriate data mining techniques are then selected and implemented, using either off-the-shelf or custom code. Finally, the results are evaluated against either selected test data or live data.
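The sequence of normalizing, modelling, and evaluating against test data can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: it uses min-max scaling as the normalization step, a nearest-centroid classifier as a stand-in for whichever technique is actually selected, and synthetic data generated purely for the example. All function names are hypothetical.

```python
import random


def fit_min_max(rows):
    """Learn per-column minimum and maximum from the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, lows, highs):
    """Rescale each column using the ranges learned from the training rows."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def train_centroids(features, labels):
    """Compute the mean feature vector for each class label."""
    sums, counts = {}, {}
    for row, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Synthetic, well-separated data invented for illustration only.
rng = random.Random(0)
records = [([rng.uniform(0, 10), rng.uniform(0, 100)], "low") for _ in range(20)]
records += [([rng.uniform(90, 100), rng.uniform(900, 1000)], "high") for _ in range(20)]
rng.shuffle(records)
train, test = records[:30], records[30:]  # hold back 10 records as test data

# Scaling ranges come from the training data only, so the test data stays unseen.
lows, highs = fit_min_max([features for features, _ in train])
train_x = apply_min_max([features for features, _ in train], lows, highs)
test_x = apply_min_max([features for features, _ in test], lows, highs)

centroids = train_centroids(train_x, [label for _, label in train])
correct = sum(predict(centroids, x) == label for x, (_, label) in zip(test_x, test))
accuracy = correct / len(test)
print(f"holdout accuracy: {accuracy:.2f}")
```

Holding back test data that the model never sees during training is what makes the evaluation meaningful; real data will be far less cleanly separated than this synthetic set.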

At the end of the phase you should have:

  • a working prototype system;
  • a predictive model or set of clusters developed using that system;
  • an evaluation of the model or clusters against test data;
  • accurate predictions of the financial and operational performance of a deployed system;
  • a plan for integrating the prototype system into your operations on a regular basis.

Outside technical help may be used during proof of concept to supplement in-house expertise. This can provide a useful early opportunity to train your staff on the ideas and concepts involved, as well as providing guidance on potential techniques and pitfalls to avoid.

3. Deployment

Once the proof of concept phase is completed, it is possible to decide how to deploy the data mining system. The objective is not just to make a one-off improvement in your organization's operations, but to implement a repeatable process.

  • Data extraction and cleaning procedures are developed. These may need to automate procedures which, for the purposes of the study, were carried out manually.
  • Procedures are developed to either renew or confirm the validity of the models developed during data mining. Things change, and what works today cannot be expected to work indefinitely.
  • Development work may be moved from desktop PCs to production servers, reflecting the amount of data involved and how the developed models are to be fitted into the production environment.
  • Staff are trained to understand the processes involved (and their limitations).
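A procedure for confirming that a deployed model is still valid can be as simple as comparing its recent accuracy with the accuracy measured during the proof of concept. The sketch below assumes that approach; the function name `needs_renewal`, the tolerance of 0.05, and the minimum sample size of 30 are arbitrary choices for illustration, not recommended values.

```python
def needs_renewal(baseline_accuracy, recent_outcomes, tolerance=0.05, min_samples=30):
    """Decide whether a deployed model should be rebuilt.

    recent_outcomes is a list of booleans, True where a recent prediction
    matched what actually happened. Returns None when there are too few
    outcomes to judge, otherwise True if accuracy has dropped more than
    `tolerance` below the baseline.
    """
    if len(recent_outcomes) < min_samples:
        return None
    recent_accuracy = sum(recent_outcomes) / len(recent_outcomes)
    return recent_accuracy < baseline_accuracy - tolerance


# Hypothetical monitoring data, invented for illustration only.
still_good = [True] * 44 + [False] * 6    # 88% correct
degraded = [True] * 40 + [False] * 10     # 80% correct
too_few = [True] * 5

print(needs_renewal(0.90, still_good))
print(needs_renewal(0.90, degraded))
print(needs_renewal(0.90, too_few))
```

Running a check like this on a regular schedule turns the observation that "things change" into a routine, auditable step rather than something noticed only after results have deteriorated.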

By the end of deployment you will have incorporated a new feedback loop into your organization: one which identifies new opportunities, assesses their merits, and develops and implements processes to exploit them.

© 1993-2024 Albion Research Ltd.