Practical Statistics for Data Scientists

50+ Essential Concepts Using R and Python

By: Peter Bruce, Andrew Bruce, Peter Gedeck

Publisher: O'Reilly Media
Published: 2020
Language: Unknown
Format: BOOK
Pages: N/A
ISBN: 9781492072942

About This Book

Statistical methods are a key part of data science, yet few data scientists have formal statistical training. Courses and books on basic statistics rarely cover the topic from a data science perspective. The second edition of this practical guide, now including examples in Python as well as R, explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.

Many data scientists use statistical methods but lack a deeper statistical perspective. If you're familiar with the R or Python programming languages, and have had some exposure to statistics but want to learn more, this quick reference bridges the gap in an accessible, readable format.

With this updated edition, you'll dive into:

  • Exploratory data analysis
  • Data and sampling distributions
  • Statistical experiments and significance testing
  • Regression and prediction
  • Classification
  • Statistical machine learning
  • Unsupervised learning

AI Overview

Title: "Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python" by Peter Bruce, Andrew Bruce, and Peter Gedeck

Overview: The book is a practical, hands-on guide to statistical concepts for data scientists and machine learning engineers. It focuses on applying statistical methods in data science, emphasizing practical implementation over theoretical proofs.

Key Themes:

  1. Practical Application: The book is structured to cover key statistical concepts in a concise manner, making it easy to use as a quick reference or memory refresher.
  2. Programming Languages: The second edition includes Python code in addition to R code, catering to a broader audience of programmers and data scientists.
  3. Exploratory Data Analysis: The book covers how to perform exploratory data analysis, alongside related topics such as sampling, regression, and outlier detection.
  4. Machine Learning: It introduces both supervised and unsupervised machine learning methods, giving an overview of the statistical techniques used in data science.

Plot Summary:

The book has no narrative plot; instead, it takes a structured approach to teaching statistical concepts. Each topic is presented through:

  • Statistical Concepts: Basic ideas and key terms essential for data science.
  • Implementation Considerations: Important considerations for implementing statistical methods.
  • Code Snippets: R and Python code snippets to illustrate practical applications.

Critical Reception:

Positive Reviews:

  • Goodreads: The book has received positive reviews, with many readers appreciating its concise and practical approach to teaching statistics for data science. It is noted for not overwhelming readers with detailed proofs, making it accessible to both beginners and experts.
  • How to Learn Machine Learning: The book is praised for delivering what it promises – a practical guide to statistics for data science. It is recommended for programmers and data scientists looking to enhance their statistical knowledge.

Negative Reviews:

  • Goodreads: Some reviewers felt that the book is more suited for those new to data science or programming, as it may not offer much new information for experienced data scientists. It is also noted that the title might be misleading, suggesting it covers more advanced topics than it actually does.

Editions:

The second edition adds Python code alongside the R code, making the book usable by readers from either programming background.

In summary, "Practical Statistics for Data Scientists" is a valuable resource for those looking to apply statistical concepts in data science. It provides a practical, hands-on approach with code snippets, making it an excellent reference manual for data scientists and machine learning engineers. However, it may not offer significant new insights for those already well-versed in the field.