Data Science Book Hub

License: MIT

Welcome to the Data Science Book Hub, a curated collection of open-source books in the Python and R data science ecosystems. My aim is for it to serve as a comprehensive resource for data scientists, analysts, and enthusiasts.

List of Books

Hosted on GitHub

| User | Repository |
| --- | --- |
| jakevdp | PythonDataScienceHandbook |
| amueller | introduction_to_ml_with_python |
| wesm | pydata-book |
| mattharrison | ml_pocket_reference |
| tylerjrichards | Streamlit-for-Data-Science |
| PacktPublishing | Interpretable-Machine-Learning-with-Python |
| stefmolin | Hands-On-Data-Analysis-with-Pandas |
| AllenDowney | ThinkBayes2 |
| maitbayev | the-elements-of-statistical-learning |
| thomasnield | oreilly_getting_started_with_sql |
| jeroenjanssens | data-science-at-the-command-line |
| PacktPublishing | Algorithmic-Short-Selling-with-Python-Published-by-Packt |
| empathy87 | The-Elements-of-Statistical-Learning-Python-Notebooks |
| jorditorresBCN | python-deep-learning |
| jorditorresBCN | deep-learning-with-python-notebooks |
| Dany503 | Algoritmos-Geneticos-en-Python-Un-Enfoque-Practico |
| handcraftsman | GeneticAlgorithmsWithPython |
| blueberrymusic | Deep-Learning-A-Visual-Approach |
| pamoroso | free-python-books |
| PacktPublishing | MicroPython-Projects |
| jcrodriguez1989 | EstadisticaParaCienciasSocialesConR |
| RohanAlexander | telling_stories |
| cdr-book | cdr-book.github.io |
| mml-book | mml-book.github.io |
| d2l-ai | d2l-en |
| rasbt | machine-learning-book |
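
Each entry above pairs a GitHub user with a repository name, so the repository address can be rebuilt from the two columns. A minimal Python sketch, assuming the standard `https://github.com/<user>/<repository>` pattern applies to every entry; the helper name `repo_url` is hypothetical and not part of this project:

```python
def repo_url(user: str, repository: str) -> str:
    """Build the GitHub URL for a (user, repository) pair from the list."""
    # Assumption: each entry lives at github.com/<user>/<repository>.
    return f"https://github.com/{user}/{repository}"


# Three entries copied from the list above.
entries = [
    ("jakevdp", "PythonDataScienceHandbook"),
    ("wesm", "pydata-book"),
    ("d2l-ai", "d2l-en"),
]

urls = [repo_url(user, repository) for user, repository in entries]
for url in urls:
    print(url)
```

The same helper can be applied to any other row by passing its user and repository values.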

Available Online

Advanced R (Hadley Wickham) Covers the inner workings of the R language, including expressions, environments, and object-oriented programming. It suits programmers who want to deepen their R skills.

An Introduction to Data Science (Rafael Irizarry) An accessible introduction to data science that emphasizes statistical and computational tools, with topics ranging from probability and statistical inference to machine learning and data visualization.

Data Visualization (Kieran Healy) Offers practical strategies for visualizing qualitative and quantitative data, aiming to help the reader design meaningful and interpretable visualizations.

Data Visualization: A Practical Introduction (Kieran Healy) Guides readers through the essential tools and principles of effective data visualization, using real-world examples and concise explanations.

Fundamentals of Data Visualization (Claus O. Wilke) Explores the theory and practice of visualizing data, with a focus on creating clear, accurate, and insightful graphical representations.

ggplot2 (Hadley Wickham) A deep dive into the capabilities and features of the ggplot2 package for R, aimed at helping users master graphical presentation in R.

Hands-On Data Visualization (Jack Dougherty and Ilya Ilyankou) A practical guide to creating interactive data visualizations with web-based tools, well suited to readers who want to tell stronger stories with data.

Improving Your Statistical Inferences (Daniel Lakens) Aims to strengthen readers’ understanding of statistical concepts and methods, including hypothesis testing, p-values, confidence intervals, and power analysis, in order to improve the quality of statistical inference in research.

Introduction to Data Science (Rafael A. Irizarry) A comprehensive overview of data science that covers essential tools and methods and introduces newcomers to the fundamental concepts and their applications in various fields.

Introducción a la Ciencia de Datos (Rafael A. Irizarry) This book began as the notes used to teach the HarvardX Data Science Series classes. The R Markdown code used to generate the book is available on GitHub.

Introduction to Modern Statistics (Mine Çetinkaya-Rundel and Johanna Hardin) A textbook that takes a modern approach to statistics, with an emphasis on data analysis and computational practice.

Learning Statistics with jamovi (David Foxcroft) Introduces statistical concepts using the jamovi software, making it suitable for readers who want an intuitive, user-friendly alternative to R for statistical analysis.

Learning Statistics with R (Dani Navarro) Aimed at beginners and intermediate users who want to understand statistical concepts while applying them directly in R, covering both descriptive and inferential statistics.

Mastering Shiny (Hadley Wickham) Focuses on building interactive web applications with Shiny in R, for both novice and experienced R users who want to deliver their analyses as web applications.

Modern Statistics with R (Måns Thulin) A comprehensive guide to understanding and applying statistical methods using R, suitable for readers who want to integrate statistical analysis into their data science projects.

R for Data Science (Hadley Wickham and Garrett Grolemund) Emphasizes the practical skills needed for data analysis with R and the tidyverse, guiding the reader through data import, tidying, transformation, visualization, and modeling.

R Graphics Cookbook (Winston Chang) Provides practical solutions to common plotting tasks and problems in R, using ggplot2 and other graphics packages.

R in Action, 2nd Edition (Robert Kabacoff) A thorough overview of statistical analysis and application development in R, from simple summaries to complex data mining techniques.

R Programming for Data Science (Roger D. Peng) A solid introduction to R programming with a focus on applications in data science, covering both basic and advanced topics.

The Art of Data Science (Roger D. Peng and Elizabeth Matsui) Focuses on the process of data analysis, from formulating research questions to extracting insights from data, and aims to help readers develop a thoughtful approach that goes beyond technical skills.

Contributing

Your contributions are what make the Data Science Book Hub a dynamic and community-driven resource. If you have suggestions for adding new books or improving the existing content, your insights are incredibly valuable to me and the broader data science community.

How to Contribute

Contribution Process

  1. :mailbox: Open an Issue: Start by opening an issue in this GitHub repository. Describe the contribution you want to make, whether it’s adding a new book, improving an existing one, or providing additional study resources.

  2. :fork_and_knife: Fork and Edit: Fork this repository, make your changes, and then submit a pull request with your contributions.

  3. :mag: Review: I will review your submission, and if everything is in order, your contributions will be merged into the project.

  4. :trophy: Credit: All contributors will be duly credited for their work. I believe in recognizing the efforts of community members.

I welcome contributions from everyone, regardless of experience level. Every bit of information helps, and collective knowledge makes this resource better for everyone.

Contact

For more information, suggestions, or questions, you can contact me via LinkedIn, GitHub, or X.