Pandas for Everyone: Python Data Analysis

(PYTHON-PANDAS.AP1) / ISBN: 978-1-64459-413-1
This course includes
Lessons
TestPrep
Hands-On Labs
AI Tutor (Add-on)

About This Course

Pandas is an open-source Python library for data analysis. The Pandas for Everyone: Python Data Analysis course focuses on loading data into Python with the Pandas library. It contains interactive lessons with knowledge checks, quizzes, and hands-on labs that build a deeper understanding of concepts such as Pandas DataFrame and data structure basics, plotting basics, tidy data, data assembly, data normalization, linear regression, and survival models.

Lessons

47+ Lessons | 100+ Exercises | 90+ Quizzes | 109+ Flashcards | 109+ Glossary Terms

TestPrep

50+ Pre-Assessment Questions | 50+ Post-Assessment Questions

Hands-On Labs

30+ LiveLabs | 20+ Video Tutorials | 43+ Minutes

1

Preface

  • Breakdown of the Course
  • How to Read This Course
  • Setup
2

Pandas DataFrame Basics

  • Introduction
  • Load Your First Data Set
  • Look at Columns, Rows, and Cells
  • Grouped and Aggregated Calculations
  • Basic Plot
  • Conclusion
3

Pandas Data Structures Basics

  • Create Your Own Data
  • The Series
  • The DataFrame
  • Making Changes to Series and DataFrames
  • Exporting and Importing Data
  • Conclusion
4

Plotting Basics

  • Why Visualize Data?
  • Matplotlib Basics
  • Statistical Graphics Using matplotlib
  • Seaborn
  • Pandas Plotting Method
  • Conclusion
5

Tidy Data

  • Columns Contain Values, Not Variables
  • Columns Contain Multiple Variables
  • Variables in Both Rows and Columns
  • Conclusion
6

Apply Functions

  • Primer on Functions
  • Apply (Basics)
  • Vectorized Functions
  • Lambda Functions (Anonymous Functions)
  • Conclusion
7

Data Assembly

  • Combine Data Sets
  • Concatenation
  • Observational Units Across Multiple Tables
  • Merge Multiple Data Sets
  • Conclusion
8

Data Normalization

  • Multiple Observational Units in a Table (Normalization)
  • Conclusion
9

Groupby Operations: Split-Apply-Combine

  • Aggregate
  • Transform
  • Filter
  • The pandas.core.groupby.DataFrameGroupBy object
  • Working With a MultiIndex
  • Conclusion
10

Missing Data

  • What Is a NaN Value?
  • Where Do Missing Values Come From?
  • Working With Missing Data
  • Pandas Built-In NA Missing
  • Conclusion
11

Data Types

  • Data Types
  • Converting Types
  • Categorical Data
  • Conclusion
12

Strings and Text Data

  • Introduction
  • Strings
  • String Methods
  • More String Methods
  • String Formatting (F-Strings)
  • Regular Expressions (RegEx)
  • The regex Library
  • Conclusion
13

Dates and Times

  • Python's datetime Object
  • Converting to datetime
  • Loading Data That Include Dates
  • Extracting Date Components
  • Date Calculations and Timedeltas
  • Datetime Methods
  • Getting Stock Data
  • Subsetting Data Based on Dates
  • Date Ranges
  • Shifting Values
  • Resampling
  • Time Zones
  • Arrow for Better Dates and Times
  • Conclusion
14

Linear Regression (Continuous Outcome Variable)

  • Simple Linear Regression
  • Multiple Regression
  • Models with Categorical Variables
  • One-Hot Encoding in scikit-learn with Transformer Pipelines
  • Conclusion
15

Generalized Linear Models

  • About This Lesson
  • Logistic Regression (Binary Outcome Variable)
  • Poisson Regression (Count Outcome Variable)
  • More Generalized Linear Models
  • Conclusion
16

Survival Analysis

  • Survival Data
  • Kaplan Meier Curves
  • Cox Proportional Hazard Model
  • Conclusion
17

Model Diagnostics

  • Residuals
  • Comparing Multiple Models
  • k-Fold Cross-Validation
  • Conclusion
18

Regularization

  • Why Regularize?
  • LASSO Regression
  • Ridge Regression
  • Elastic Net
  • Cross-Validation
  • Conclusion
19

Clustering

  • k-Means
  • Hierarchical Clustering
  • Conclusion
20

Life Outside of Pandas

  • The (Scientific) Computing Stack
  • Performance
  • Dask
  • Siuba
  • Ibis
  • Polars
  • PyJanitor
  • Pandera
  • Machine Learning
  • Publishing
  • Dashboards
  • Conclusion
21

It’s Dangerous To Go Alone!

  • Local Meetups
  • Conferences
  • The Carpentries
  • Podcasts
  • Other Resources
  • Conclusion

Appendix A: Concept Maps

Appendix B: Installation and Setup

  • B.1 Install Python
  • B.2 Install Python Packages
  • B.3 Download Book Data

Appendix C: Command Line

  • C.1 Installation
  • C.2 Basics

Appendix D: Project Templates

Appendix E: Using Python

  • E.1 Command Line and Text Editor
  • E.2 Python and IPython
  • E.3 Jupyter
  • E.4 Integrated Development Environments (IDEs)

Appendix F: Working Directories

Appendix G: Environments

  • G.1 Conda Environments
  • G.2 Pyenv + Pipenv

Appendix H: Install Packages

  • H.1 Updating Packages

Appendix I: Importing Libraries

Appendix J: Code Style

  • J.1 Line Breaks in Code

Appendix K: Containers: Lists, Tuples, and Dictionaries

  • K.1 Lists
  • K.2 Tuples
  • K.3 Dictionaries

Appendix L: Slice Values

Appendix M: Loops

Appendix N: Comprehensions

Appendix O: Functions

  • O.1 Default Parameters
  • O.2 Arbitrary Parameters

Appendix P: Ranges and Generators

Appendix Q: Multiple Assignment

Appendix R: Numpy ndarray

Appendix S: Classes

Appendix T: SettingWithCopyWarning

  • T.1 Modifying a Subset of Data
  • T.2 Replacing a Value
  • T.3 More Resources

Appendix U: Method Chaining

Appendix V: Timing Code

Appendix W: String Formatting

  • W.1 C-Style
  • W.2 String Formatting: .format() Method
  • W.3 Formatting Numbers

Appendix X: Conditionals (if-elif-else)

Appendix Y: New York ACS Logistic Regression Example

Appendix Z: Replicating Results in R

  • Z.1 Linear Regression
  • Z.2 Logistic Regression
  • Z.3 Poisson Regression

Hands-On Labs

1

Pandas DataFrame Basics

  • Performing Grouped and Aggregated Calculations Using the .groupby() Method
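
For readers who want a feel for this lab's topic, here is a minimal sketch of a grouped calculation with .groupby(). The data and column names are made up for illustration and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical data; the course's own data sets are not shown on this page.
df = pd.DataFrame({
    "continent": ["Asia", "Asia", "Europe", "Europe"],
    "year": [2002, 2007, 2002, 2007],
    "life_exp": [70.0, 72.0, 77.0, 79.0],
})

# Split the rows by continent, then average life_exp within each group.
mean_life_exp = df.groupby("continent")["life_exp"].mean()
print(mean_life_exp)
```

The result is a Series indexed by the grouping column, with one aggregated value per group.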
2

Pandas Data Structures Basics

  • Creating a DataFrame and Making Changes to it
3

Plotting Basics

  • Creating a Scatter Plot Using Multivariate Data
  • Creating a Density Plot Using Bivariate Data
4

Tidy Data

  • Using Functions and Methods to Process and Tidy Data
5

Apply Functions

  • Performing Calculations Across DataFrames
  • Vectorizing Functions
6

Data Assembly

  • Performing Concatenation Using the concat() Function
  • Merging Multiple Data Sets Using the .merge() Function
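
As a rough illustration of the two operations named in these labs, the sketch below stacks two small tables with pd.concat() and then joins a third with .merge(). All table and column names are hypothetical.

```python
import pandas as pd

# Hypothetical tables sharing an "id" column.
people = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
more_people = pd.DataFrame({"id": [3], "name": ["c"]})

# Concatenate row-wise; ignore_index=True renumbers the index from 0.
stacked = pd.concat([people, more_people], ignore_index=True)

scores = pd.DataFrame({"id": [1, 2, 4], "score": [90, 85, 70]})

# Left join: keep every row of `stacked`; ids without a score get NaN.
merged = stacked.merge(scores, on="id", how="left")
print(merged)
```

A left join is only one choice here; the `how` argument also accepts "inner", "outer", and "right", depending on which rows should survive the merge.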
7

Data Normalization

  • Understanding Multiple Observational Units in a Data Set
8

Groupby Operations: Split-Apply-Combine

  • Performing Data Summarization Using Group-by Operations
  • Performing Boolean Subsetting on the Data
  • Performing Operations on Grouped Objects
9

Missing Data

  • Finding and Cleaning Missing Data
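
One possible way to find and clean missing values in pandas is sketched below. The data is invented for illustration, and the fill values are arbitrary assumptions rather than a recommendation from the course.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", None, "z"]})

# Find: count the missing values in each column.
missing_per_column = df.isna().sum()

# Clean, option 1: fill each column with a chosen replacement value.
filled = df.fillna({"a": 0.0, "b": "unknown"})

# Clean, option 2: drop any row that contains a missing value.
dropped = df.dropna()
print(missing_per_column)
```

Whether to fill or drop depends on the analysis; dropping rows is simpler but discards data.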
10

Data Types

  • Performing Data Type Conversion
11

Strings and Text Data

  • Finding and Substituting a Pattern
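
A small sketch of finding and substituting a pattern with pandas string methods follows. The phone-number strings and the regular expressions are assumptions chosen for illustration only.

```python
import pandas as pd

# Hypothetical text data: phone numbers with inconsistent separators.
phones = pd.Series(["555-123-4567", "no number", "555 987 6543"])

# Find: flag entries that contain a 3-3-4 digit pattern.
has_phone = phones.str.contains(r"\d{3}[-\s]\d{3}[-\s]\d{4}")

# Substitute: rewrite matches with dashes, using capture groups.
normalized = phones.str.replace(
    r"(\d{3})[-\s](\d{3})[-\s](\d{4})", r"\1-\2-\3", regex=True
)
print(normalized.tolist())
```

Entries that do not match the pattern pass through .str.replace() unchanged.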
12

Dates and Times

  • Converting an Object Type into a datetime Type
  • Extracting Date Components from the Data
  • Getting Stock Data and Subsetting it Based on Dates
  • Resampling Dates Using the .resample() Method
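
The date-handling steps listed in these labs might look roughly like the sketch below: converting strings to datetimes, extracting a component, subsetting by date, and resampling. The prices are invented stand-ins for stock data, not values from the course.

```python
import pandas as pd

# Hypothetical daily closing prices; dates start as plain strings (object dtype).
df = pd.DataFrame({
    "date": ["2023-01-02", "2023-01-03", "2023-01-09", "2023-01-10"],
    "close": [10.0, 12.0, 20.0, 22.0],
})

# Convert the object column into a datetime column.
df["date"] = pd.to_datetime(df["date"])

# Extract a date component through the .dt accessor.
df["year"] = df["date"].dt.year

# Subset rows on or after a given date.
recent = df[df["date"] >= "2023-01-09"]

# Resample to weekly frequency and average the closing price per week.
weekly = df.set_index("date")["close"].resample("W").mean()
print(weekly)
```

With the "W" frequency, each bin is labeled by the Sunday that ends the week, so the four rows collapse into two weekly averages.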
13

Linear Regression (Continuous Outcome Variable)

  • Performing Linear Regression
  • Performing Multiple Regression
14

Generalized Linear Models

  • Performing Logistic Regression
  • Performing Poisson Regression Using the poisson() Function
15

Survival Analysis

  • Performing Survival Analysis Using the KaplanMeierFitter() Function
16

Model Diagnostics

  • Comparing Models Using Cross-Validation
17

Regularization

  • Performing L1 Regularization Using the Lasso() Function
  • Performing L2 Regularization Using the Ridge() Function
18

Clustering

  • Performing k-Means Clustering
  • Using Hierarchical Clustering Algorithms

Pandas for Everyone: Python Data Analysis

$279.99
