Synopsis: An explanation of common data science tasks like text analysis and regressions, coupled with an intro to the R programming language.
Difficulty: Beginner
There must be a hundred resources online called “Introduction to Data Science”, but let’s face it: most of them are crap. I included Dr. Stanton’s name in the title to make it easy for you to find the right book.
If you’ll recall from my recent review of “Modeling Techniques in Predictive Analytics”, Dr. Miller does a good job explaining the most common models. He even includes a complete set of R code at the end of each chapter. The problems with that text are that a) it doesn’t really explain the process of developing that code and b) it’s expensive. “Introduction to Data Science” solves both of those problems beautifully: R is given as much coverage as the statistical models, and the price, free, is hard to beat.
Is it perfect? No. I hate the front cover (a petty complaint, I know), and occasionally Dr. Stanton follows what seems like a less-than-logical progression through the book. Some chapters seem a little… random. And finally, if you aren’t careful to get the 3rd edition, the authentication for Twitter in one of the examples won’t work. Still, I didn’t find this enough of a problem to knock his work. Compared to other resources available for newbies, this material hits a sweet spot that is hard to find.
More advanced readers will want to avoid this book (indeed, several have given it bad reviews online for being “too basic”), but for someone who is starting out, it is just right (cue Goldilocks endorsement). You get coverage of the basic models along with a tutorial on how to construct those models within R. For an “Introduction”, this is exactly the skill set you should be looking for.
So what’s the bottom line? This is hands-down the one book I would recommend to those just beginning a journey down this career path. You can always get more in-depth material later.
Download on Apple’s iBooks store or go here.