These are some of the books on probability and statistics that we've found interesting or useful.
If you are working in artificial intelligence or data mining you are often in an unusual position: you are automatically generating one or more hypotheses based upon a sample of data, then testing the resulting hypotheses to see if they are true.
Most statistics books adequately cover hypothesis testing. They cover the basic use of the null hypothesis (is this hypothesis really needed?), tests for normal distributions, and so on. (Basic Business Statistics is definitely one of the better books if you need detailed coverage of these areas.)
They do not, unfortunately, cover the material you really need for assessing the performance of programs which automatically generate their own hypotheses or interact in some way with their environment. This book does.
I liked this book so much that, after reading my library's copy, I went out and bought it!
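To see why automatically generated hypotheses need special care, here is a minimal sketch of the problem. It is not taken from the book; the function name, parameters and numbers are all made up for illustration. It generates many random "rules" on pure coin-flip data, keeps the one that looks best on a small sample, and then checks that rule on fresh data.

```python
import random


def selection_bias_demo(n_rules=200, n_train=50, n_test=5000, seed=0):
    """Pick the best of many random rules on a small sample, then retest it.

    Hypothetical illustration: the labels are coin flips, so no rule can
    truly beat 50% accuracy. Any apparent skill comes from selection alone.
    """
    rng = random.Random(seed)
    train_labels = [rng.randint(0, 1) for _ in range(n_train)]
    test_labels = [rng.randint(0, 1) for _ in range(n_test)]

    best_train_acc = -1.0
    best_test_acc = 0.0
    for _ in range(n_rules):
        # A "hypothesis" here is just a random guess for every example.
        train_guess = [rng.randint(0, 1) for _ in range(n_train)]
        test_guess = [rng.randint(0, 1) for _ in range(n_test)]

        train_acc = sum(g == y for g, y in zip(train_guess, train_labels)) / n_train
        if train_acc > best_train_acc:
            # Keep the rule that looks best on the small sample, and record
            # how that same rule does on data it was not selected on.
            best_train_acc = train_acc
            best_test_acc = sum(g == y for g, y in zip(test_guess, test_labels)) / n_test

    return best_train_acc, best_test_acc


if __name__ == "__main__":
    train_acc, test_acc = selection_bias_demo()
    print(f"accuracy on the sample used for selection: {train_acc:.2f}")
    print(f"accuracy on fresh data:                    {test_acc:.2f}")
```

The selected rule looks clearly better than chance on the sample it was chosen from, while on fresh data it should fall back to roughly 50%. A standard single-hypothesis test applied to the first number would be misleading.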
Haigh offers a clear introduction for the general reader to the world of probability, covering most of the situations in which we encounter probability in everyday life. It thus includes chapters on the optimal strategies for casino gambling, dice games, lottery betting, TV game shows, horse racing, and the like.
Want to know the optimal strategy for The Weakest Link or Who Wants to be a Millionaire? Want to know the chances of a goal in the last minute of a soccer match? Will a professional foul (which saves a goal but gets a team member sent off) save the match? Need to know how much of your gambling money you should stake on a favorable bet?
Read this book to find out. (And look here).
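As a rough illustration of the last question above (how much to stake on a favorable bet), one well-known answer is the Kelly criterion. Whether this matches the book's own treatment is not stated here, so take the sketch below as an assumption; the function name and the example numbers are made up. For a bet that pays net odds b to 1 and wins with probability p, the Kelly stake is the fraction (b*p - (1 - p)) / b of your bankroll.

```python
def kelly_fraction(p_win, net_odds):
    """Fraction of bankroll to stake under the Kelly criterion.

    p_win    -- your estimate of the probability of winning (0..1)
    net_odds -- profit per unit staked if the bet wins (e.g. 1.0 for even money)

    Returns 0.0 when the bet is not favorable, i.e. never bet on a negative edge.
    """
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be between 0 and 1")
    if net_odds <= 0:
        raise ValueError("net_odds must be positive")

    edge = net_odds * p_win - (1.0 - p_win)  # expected profit per unit staked
    return max(0.0, edge / net_odds)


if __name__ == "__main__":
    # Hypothetical example: an even-money bet you expect to win 55% of the time.
    print(f"stake {kelly_fraction(0.55, 1.0):.1%} of your bankroll")
```

In that example the suggested stake is about 10% of the bankroll. The answer is only as good as the estimate of p, which is why many gamblers who use this rule stake some fraction of the Kelly amount.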
If you've ever dreamed of creating a computer system to beat the bookies or the stock market (and who hasn't?) then Steven Skiena's book is for you. Skiena describes his own (and his team's) efforts to create an automated system which would place winning bets on jai alai, a Basque game which is also played in parts of France and some cities in North America. The book describes the game itself (a game with similarities to tennis, squash and rugby fives), the convoluted methods by which tournaments are held (a sort of round robin) and the pari-mutuel gambling system which is used to place bets on the outcome.
To create the perfect system, Skiena and his team needed to model the tournament structure, the effects of player skill on match outcomes, and (since the odds are offered using a pari-mutuel system) the betting habits of the general public. They then needed to identify bets which would (on average) be profitable.
The team succeeded: the program they developed ultimately returned about 500% on its initial gambling stake in a single year. (The bad news, as Skiena points out, is that it would not be possible to use such a system to bet large amounts of money on jai alai, since such bets would significantly depress the odds available.)
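The point about large bets depressing the odds follows from how a pari-mutuel pool works: winners share the pool, so your own money dilutes your own payout. The sketch below uses a deliberately simplified single-pool model with a flat house take. It is not Skiena's model, and the pool sizes, take and win probability are made-up numbers.

```python
def payout_per_unit(pool_total, pool_on_outcome, our_bet, take=0.2):
    """Gross return per unit staked if the outcome wins.

    Simplified pari-mutuel model (an assumption): the house keeps `take` of
    the whole pool and the rest is shared pro rata among winning tickets.
    Our own bet is added both to the pool and to the winning side.
    """
    return (1.0 - take) * (pool_total + our_bet) / (pool_on_outcome + our_bet)


def expected_profit(p_win, pool_total, pool_on_outcome, our_bet, take=0.2):
    """Expected profit of staking `our_bet` on an outcome we win with p_win."""
    payout = payout_per_unit(pool_total, pool_on_outcome, our_bet, take)
    return p_win * our_bet * payout - our_bet


if __name__ == "__main__":
    # Hypothetical pool: 1000 units bet in total, 100 of them on our outcome,
    # which we believe wins 20% of the time.
    for bet in (1, 10, 50, 100):
        profit = expected_profit(0.2, 1000, 100, bet)
        print(f"bet {bet:>3}: expected profit {profit:+.2f}")
```

With these numbers a 1-unit bet has a clearly positive expectation, but a 100-unit bet pushes the payout down from about 7.9 to 4.4 per unit and the expected profit becomes negative (-12). A profitable small bet does not scale.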
A must read for anyone seriously interested in "beating the system".
There can't be many books about statistics which are really enjoyable to read. The Cartoon Guide covers the statistics you've probably learned and forgotten (or not quite understood) in an interesting and entertaining way. You won't find great depth here, nor elaborate proofs. But you will find most topics explained clearly, which is what most people need most of the time.
The Chernoff faces on the back cover are really neat too!
If you prefer a more textbook-like or more complete approach, you may want to consider Basic Business Statistics, which covers the same material in a more formal manner.
(Note that these comments are based upon the Sixth Edition. The current edition is at least the Seventh.)
This is a good up-to-date textbook on statistics. It has a clear layout, a good index, and good coverage of the major statistical tests in use today. There is a good practical emphasis (statistics is essentially a practical art), with examples and review questions drawn from engineering and business processes.
For data mining you should also consider Cohen. He isn't as clear a writer, but he does cover material you won't find in this (or any other) textbook.