Learning Pandas pdf

图书网, October 8, 2018, 11:58:43

Learning Pandas Book Description

If you are a Python programmer who wants to get started with performing data analysis using pandas and Python, this is the book for you. Some experience with statistical analysis would be helpful but is not mandatory.

This learner’s guide will help you understand how to use the features of pandas for interactive data manipulation and analysis.

This book is your ideal guide to learning about pandas, all the way from installing it to creating one- and two-dimensional indexed data structures, indexing and slicing-and-dicing that data to derive results, loading data from local and Internet-based resources, and finally creating effective visualizations to form quick insights. You start with an overview of pandas and NumPy and then dive into the details of pandas, covering pandas’ Series and DataFrame objects, before ending with a quick review of using pandas for several problems in finance.

With the knowledge you gain from this book, you will be able to quickly begin your journey into the exciting world of data science and analysis.
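As a rough illustration of the one- and two-dimensional indexed structures and the indexing and slicing mentioned in the description above, here is a minimal sketch. It is not taken from the book; the labels, column names, and values are made up for the example.

```python
import pandas as pd

# One-dimensional indexed structure: a Series with string labels
# (labels and values are hypothetical).
prices = pd.Series([10.0, 12.5, 11.0], index=["a", "b", "c"])

# Two-dimensional indexed structure: a DataFrame with the same row labels.
df = pd.DataFrame(
    {"price": [10.0, 12.5, 11.0], "volume": [100, 80, 120]},
    index=["a", "b", "c"],
)

by_label = df.loc["b", "price"]     # lookup by row label and column name
by_position = df.iloc[0, 1]         # lookup by integer position (row 0, column 1)
label_slice = prices.loc["a":"b"]   # label-based slices include both endpoints
expensive = df[df["price"] > 10.5]  # Boolean selection keeps rows where the mask is True
```

Note that label-based slicing with `.loc` includes the end label, unlike ordinary Python slicing by position.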

Learning Pandas Table of Contents

Learning pandas

Credits

About the Author

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

1. A Tour of pandas

pandas and why it is important

pandas and IPython Notebooks

Referencing pandas in the application

Primary pandas objects

The pandas Series object

The pandas DataFrame object

Loading data from files and the Web

Loading CSV data from files

Loading data from the Web

Simplicity of visualization of pandas data

Summary

2. Installing pandas

Getting Anaconda

Installing Anaconda

Installing Anaconda on Linux

Installing Anaconda on Mac OS X

Installing Anaconda on Windows

Ensuring pandas is up to date

Running a small pandas sample in IPython

Starting the IPython Notebook server

Installing and running IPython Notebooks

Using Wakari for pandas

Summary

3. NumPy for pandas

Installing and importing NumPy

Benefits and characteristics of NumPy arrays

Creating NumPy arrays and performing basic array operations

Selecting array elements

Logical operations on arrays

Slicing arrays

Reshaping arrays

Combining arrays

Splitting arrays

Useful numerical methods of NumPy arrays

Summary

4. The pandas Series Object

The Series object

Importing pandas

Creating Series

Size, shape, uniqueness, and counts of values

Peeking at data with heads, tails, and take

Looking up values in Series

Alignment via index labels

Arithmetic operations

The special case of Not-A-Number (NaN)

Boolean selection

Reindexing a Series

Modifying a Series in-place

Slicing a Series

Summary

5. The pandas DataFrame Object

Creating DataFrame from scratch

Example data

S&P 500

Monthly stock historical prices

Selecting columns of a DataFrame

Selecting rows and values of a DataFrame using the index

Slicing using the [] operator

Selecting rows by index label and location: .loc[] and .iloc[]

Selecting rows by index label and/or location: .ix[]

Scalar lookup by label or location using .at[] and .iat[]

Selecting rows of a DataFrame by Boolean selection

Modifying the structure and content of DataFrame

Renaming columns

Adding and inserting columns

Replacing the contents of a column

Deleting columns in a DataFrame

Adding rows to a DataFrame

Appending rows with .append()

Concatenating DataFrame objects with pd.concat()

Summarized data and descriptive statistics

Summary

6. Accessing Data

Setting up the IPython notebook

CSV and Text/Tabular format

The sample CSV data set

Reading a CSV file into a DataFrame

Specifying the index column when reading a CSV file

Data type inference and specification

Specifying column names

Specifying specific columns to load

Saving DataFrame to a CSV file

General field-delimited data

Handling noise rows in field-delimited data

Reading and writing data in an Excel format

Reading and writing JSON files

Reading HTML data from the Web

Reading and writing HDF5 format files

Accessing data on the web and in the cloud

Reading and writing from/to SQL databases

Reading data from remote data services

Reading stock data from Yahoo! and Google Finance

Adding rows (and columns) via setting with enlargement

Removing rows from a DataFrame

Removing rows using .drop()

Removing rows using Boolean selection

Removing rows using a slice

Changing scalar values in a DataFrame

Arithmetic on a DataFrame

Resetting and reindexing

Hierarchical indexing

Retrieving data from Yahoo! Finance Options

Reading economic data from the Federal Reserve Bank of St. Louis

Accessing Kenneth French’s data

Reading from the World Bank

Summary

7. Tidying Up Your Data

What is tidying your data?

Setting up the IPython notebook

Working with missing data

Determining NaN values in Series and DataFrame objects

Selecting out or dropping missing data

How pandas handles NaN values in mathematical operations

Filling in missing data

Forward and backward filling of missing values

Filling using index labels

Interpolation of missing values

Handling duplicate data

Transforming Data

Mapping

Replacing values

Applying functions to transform data

Summary

8. Combining and Reshaping Data

Setting up the IPython notebook

Concatenating data

Merging and joining data

An overview of merges

Specifying the join semantics of a merge operation

Pivoting

Stacking and unstacking

Stacking using nonhierarchical indexes

Unstacking using hierarchical indexes

Melting

Performance benefits of stacked data

Summary

9. Grouping and Aggregating Data

Setting up the IPython notebook

The split, apply, and combine (SAC) pattern

Split

Data for the examples

Grouping by a single column’s values

Accessing the results of grouping

Grouping using index levels

Apply

Applying aggregation functions to groups

The transformation of group data

An overview of transformation

Practical examples of transformation

Filtering groups

Discretization and Binning

Summary

10. Time-series Data

Setting up the IPython notebook

Representation of dates, time, and intervals

The datetime, day, and time objects

Timestamp objects

Timedelta

Introducing time-series data

DatetimeIndex

Creating time-series data with specific frequencies

Calculating new dates using offsets

Date offsets

Anchored offsets

Representing durations of time using Period objects

The Period object

PeriodIndex

Handling holidays using calendars

Normalizing timestamps using time zones

Manipulating time-series data

Shifting and lagging

Frequency conversion

Up and down resampling

Time-series moving-window operations

Summary

11. Visualization

Setting up the IPython notebook

Plotting basics with pandas

Creating time-series charts with .plot()

Adorning and styling your time-series plot

Adding a title and changing axes labels

Specifying the legend content and position

Specifying line colors, styles, thickness, and markers

Specifying tick mark locations and tick labels

Formatting axes tick date labels using formatters

Common plots used in statistical analyses

Bar plots

Histograms

Box and whisker charts

Area plots

Scatter plots

Density plot

The scatter plot matrix

Heatmaps

Multiple plots in a single chart

Summary

12. Applications to Finance

Setting up the IPython notebook

Obtaining and organizing stock data from Yahoo!

Plotting time-series prices

Plotting volume-series data

Calculating the simple daily percentage change

Calculating simple daily cumulative returns

Resampling data from daily to monthly returns

Analyzing distribution of returns

Performing a moving-average calculation

The comparison of average daily returns across stocks

The correlation of stocks based on the daily percentage change of the closing price

Volatility calculation

Determining risk relative to expected returns

Summary

Index

Learning Pandas Selected Excerpt

pandas and IPython Notebooks

A popular way to use pandas is through IPython Notebooks. IPython Notebooks provide a web-based interactive computational environment, allowing the combination of code, text, mathematics, plots, and rich media into a web-based document. IPython Notebooks run in a browser and contain Python code that is run in a local or server-side Python session, with which the notebooks communicate using WebSockets. Notebooks can also contain markup code and rich media content, and can be converted to other formats such as PDF, HTML, and slide shows.
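To make the "combination of code and text in one document" concrete: a notebook file (`.ipynb`, today usually called a Jupyter notebook) is stored as JSON with a list of cells. The sketch below is not from the book. It builds a tiny two-cell notebook by hand using only the standard library, assuming the version 4 notebook format, and the cell contents are made up.

```python
import json

# A minimal notebook document, assuming notebook format version 4.
# The cell contents are hypothetical and serve only as an illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            # Text cell written in Markdown markup.
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# A tiny example\n", "Some explanatory text."],
        },
        {
            # Code cell; outputs stay empty until a Python session runs it.
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["import pandas as pd\n", "pd.Series([1, 2, 3]).sum()"],
        },
    ],
}

# Round-trip through JSON text, which is what would be written to an .ipynb file.
serialized = json.dumps(notebook, indent=1)
restored = json.loads(serialized)
cell_types = [cell["cell_type"] for cell in restored["cells"]]
```

In practice, notebooks are created and edited through the browser interface rather than by hand; the sketch only shows why text and code can live side by side in one file.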
