J**W
Excellent Book, For What It Is
I'm a Python software developer with an interest in applied statistics. This is an excellent book on data analysis, but for review purposes, it's worth initially pointing out what this book is not.
It is not a comprehensive survey of the open source tools that are available, and it does not contain many examples of working code to implement the techniques the author talks about, though there are some. For this reason, I'd strike the "with Open Source Tools" from the title in evaluating whether you want to purchase the book.
The author greatly favors mathematical notation over code examples in describing the data analysis techniques he presents. While this is not a bad thing per se, you'll have to struggle to comprehend the content if you're a programmer without an academic familiarity with math, or if you've been away from mathematics for a long time.
As other reviewers have pointed out, the organization of the content is somewhat disjointed. Going from chapter to chapter, there is little in the way of causality, and the early chapters are pretty math-heavy. The reader is advised to consult appendices at the back of the book to refresh themselves on the basics, if required.
Wait! I didn't say you shouldn't buy it.
Despite a few shortcomings, this book does offer a good introduction and overview of several basic techniques. It's an excellent survey of the current data analysis landscape for anyone who's not familiar with it. If a topic seems irrelevant to you, it's pretty easy to skip that chapter and move forward.
On top of that, the author's writing style and ways of explaining relatively esoteric concepts are generally very good. As with many good books, you get the sense the author is a co-worker, trying to explain something to you in terms you can understand. It's very example-based, even if those examples don't always involve code.
All in all, to get the most out of this book, the best approach is careful and methodical study. The author covers many topics quickly, and none in depth, so if one chapter interests you, I'd plan on consulting other resources on particular topics. Luckily, the author does offer several "Further Reading" recommendations for each topic.
Most books containing information on these techniques are far harder to read, and they generally cost at least twice as much. Highly recommended. Thanks for this one, Philipp.
J**S
Stunning! And unexpected
I bought this book hoping for a reference on open source tools. But the open source tools are a minor aspect of this book. The core is about data analysis--and it is fantastic. I should have known this from the title I suppose: the "data analysis" is in big font with a colorful background, and "with open source tools" is in small font--and it is literally about the same ratio within the book. Each chapter has a small section that works one example with an open source tool. And there is a chapter at the end about the array of open source tools available.
But the data analysis aspects of the book--most outstanding. I have a master's in computer science, and do data and analytics for a living, so I have many books on the topic. Some have more of a theoretical and rigorous foundation, some have more of a hands-on slant. I was expecting this book to be the latter, but it is quite the former.
Yet it is still very practical. It is not a "theory" work as such, just a rigorous book useful in practice (there is a big difference!). Throughout the book the author points out the value of solving the problem at hand, rather than being excessively precise--which is the bigger risk in this domain. Examples would be: using visuals to get a feel for data but not trying to use visuals to give precise answers (which they fundamentally cannot), and using techniques that get "close enough" such as perturbation.
And it is extremely well written. The writing is in reasonably simple English, relative to the topic, yet not insulting or goofy the way the "Dummies" series can be, for example. It is easy to read yet content rich--a fantastic combination.
A**.
It has its flaws, but in general a great overview
I've read some of the other reviews, and I do agree with most of the criticisms. There are quite a few errors in formulas and in the text, and it would've been really nice if the source code and data files were provided on a CD or were available on a website.
That being said, the book addresses a lot of different topics, ranging from introductory, freshman-level statistics to more advanced data mining and machine learning techniques, and passing through notions of design. It doesn't go in depth into each of them, but offers a fairly good overview, and references in case you're interested. Furthermore, the author gives some useful hints on how to think outside the box and how to apply these techniques to business.
Being a physics grad student, I found many of the topics pretty basic, but even so, I've learned a lot. Overall, a great introduction; I really hope the flaws are corrected in a future second edition.