At some point Data Mining began to be called Data Science and marketers became data scientists.
Unfortunately, except when it comes to testing, data mining is rarely that scientific… The real revolutions have been in the amount of data that can be collected and in the computing power that can be used to train neural networks to predict variables.
Case studies and practical guidance. Good introductory text. A personal favorite.
Good algorithm descriptions. Covers the major areas in reasonable technical detail, with several alternative algorithms presented for classification, prediction, association rule induction and cluster analysis.
Strictly a specialist book. The title says it all.
If you are working in artificial intelligence or data mining you are often in an unusual position: you are automatically generating one or more hypotheses based upon a sample of data, then testing the resulting hypotheses to see if they are true.
Most statistics books adequately cover hypothesis testing. They cover the basic use of the null hypothesis (is this hypothesis really needed?), tests for normal distributions, and so on. (Basic Business Statistics is definitely one of the better books if you need detailed coverage of these areas.)
They do not, unfortunately, cover the material you really need for assessing the performance of programs that automatically generate their own hypotheses or interact in some way with their environment. This book does.
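To make the problem concrete, here is a minimal sketch of my own (it is not taken from the book) of what goes wrong when a program both generates and tests its hypotheses on the same sample. The data is pure noise, the candidate "hypotheses" are trivial one-feature rules, and every name and parameter (`generate_and_test`, `n_rows`, `n_features`) is an assumption made up for illustration. The best rule found on the training half will look at least as good as a coin toss there by construction, while its accuracy on the held-out half is an honest estimate and will typically hover around chance.

```python
import random


def accuracy(rule, rows):
    """Fraction of rows where the rule's prediction matches the label."""
    feature, polarity = rule
    hits = sum(
        1 for features, label in rows
        if (features[feature] == label) == polarity
    )
    return hits / len(rows)


def make_rows(n_rows, n_features, rng):
    """Pure noise: random binary features and an unrelated random binary label."""
    return [
        ([rng.randint(0, 1) for _ in range(n_features)], rng.randint(0, 1))
        for _ in range(n_rows)
    ]


def generate_and_test(n_rows=200, n_features=50, seed=0):
    """Pick the best rule on one half of the data, then score it on the other half."""
    rng = random.Random(seed)
    rows = make_rows(n_rows, n_features, rng)
    train, holdout = rows[: n_rows // 2], rows[n_rows // 2:]

    # Candidate hypotheses: "label equals feature j" (True)
    # or "label differs from feature j" (False).
    candidates = [
        (j, polarity)
        for j in range(n_features)
        for polarity in (True, False)
    ]

    # Hypothesis generation: keep whichever rule fits the training half best.
    best = max(candidates, key=lambda rule: accuracy(rule, train))

    # Hypothesis testing: only the holdout score is an unbiased estimate.
    return best, accuracy(best, train), accuracy(best, holdout)


if __name__ == "__main__":
    rule, train_acc, holdout_acc = generate_and_test()
    print(f"selected rule: feature {rule[0]}, equals-label={rule[1]}")
    print(f"accuracy on the sample used to pick it: {train_acc:.2f}")
    print(f"accuracy on held-out data:              {holdout_acc:.2f}")
```

Because the labels are random, any gap between the two printed accuracies is produced by the selection step alone, which is the effect a standard single-hypothesis significance test does not account for.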
A practical and technical introduction to algorithms for data mining. Includes Java implementations of some of the major algorithms.
CRM (Customer Relationship Management) is a major application area for data mining. Some interesting chapters on the business applications and cost justifications. Good book if you are trying to figure out how data mining might fit into your business.
Introduction to the methodology, techniques, and applications of data mining from a management perspective. The chapter on why data mining projects fail is well worth reading.
Aimed at the executive (read: non-specialist) level. Good introduction to some of the things you can do with web log data. The key message here is that it helps a lot if you know more about your visitors than just which pages they clicked on.
Good introduction to machine learning, although I found the latter part of the book (from which it gets its title) a bit disappointing. The book's real strength is in placing existing machine learning methods in a good technical (and philosophical) perspective.