Publisher: Packt Publishing - ebooks Account
Publication date: 2014-9-29
Pages: 448
Price: USD 29.99
Binding: Paperback
ISBN: 9781783980246
Synopsis
Data's value has grown exponentially in the past decade, with 'Big Data' today being one of the biggest buzzwords in business and IT, and data scientist hailed as 'the sexiest job of the 21st century'. Practical Data Science Cookbook helps you see beyond the hype and get past the theory by providing you with a hands-on exploration of data science. With a comprehensive range of recipes designed to help you learn fundamental data science tasks, you'll uncover practical steps to help you produce powerful insights into Big Data using R and Python.
Use this valuable data science book to discover tricks and techniques to get to grips with your data. Learn effective data visualization with an automobile fuel efficiency data project, analyze football statistics, learn how to create data simulations, and get to grips with stock market data to learn data modelling. Find out how to produce sharp insights into social media data by following data science tutorials that demonstrate the best ways to tackle Twitter data, and uncover recipes that will help you dive in and explore Big Data through movie recommendation databases.
Practical Data Science Cookbook is your essential companion to the real-world challenges of working with data, created to give you a deeper insight into a world of Big Data that promises to keep growing.
About the authors
Tony Ojeda
Tony Ojeda is an accomplished data scientist and entrepreneur, with expertise in business process optimization and over a decade of experience creating and implementing innovative data products and solutions. He has a Master's degree in Finance from Florida International University and an MBA with concentrations in Strategy and Entrepreneurship from DePaul University. He is the founder of District Data Labs, a cofounder of Data Community DC, and is actively involved in promoting data science education through both organizations.
Sean Patrick Murphy
Sean Patrick Murphy spent 15 years as a senior scientist at The Johns Hopkins University Applied Physics Laboratory, where he focused on machine learning, modeling and simulation, signal processing, and high-performance computing in the Cloud. He now acts as an advisor and data consultant for companies in SF, NY, and DC. He graduated from The Johns Hopkins University and earned his MBA from the University of Oxford. He currently co-organizes the Data Innovation DC meetup and cofounded the Data Science MD meetup. He is also a board member and cofounder of Data Community DC.
Benjamin Bengfort
Benjamin Bengfort is an experienced data scientist and Python developer who has worked in the military, industry, and academia for the past 8 years. He is currently pursuing his PhD in Computer Science at the University of Maryland, College Park, doing research in Metacognition and Natural Language Processing. He holds a Master's degree in Computer Science from North Dakota State University, where he taught undergraduate Computer Science courses. He is also an adjunct faculty member at Georgetown University, where he teaches Data Science and Analytics. Benjamin has been involved in two data science start-ups in the DC region, leveraging large-scale machine learning and Big Data techniques across a variety of applications. He has a deep appreciation for the combination of models and data for entrepreneurial effect, and he is currently building one of these start-ups into a more mature organization.
Abhijit Dasgupta
Abhijit Dasgupta is a data consultant working in the greater DC-Maryland-Virginia area, with several years of experience in biomedical consulting, business analytics, bioinformatics, and bioengineering consulting. He has a PhD in Biostatistics from the University of Washington and over 40 collaborative peer-reviewed manuscripts, with strong interests in bridging the statistics/machine-learning divide. He is always on the lookout for interesting and challenging projects, and is an enthusiastic speaker and discussant on new and better ways to look at and analyze data. He is a member of Data Community DC and a founding member and co-organizer of Statistical Programming DC (formerly, R Users DC).
Table of contents
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Preparing Your Data Science Environment
Introduction
Understanding the data science pipeline
How to do it...
How it works...
Installing R on Windows, Mac OS X, and Linux
Getting ready
How to do it...
How it works...
See also
Installing libraries in R and RStudio
Getting ready
How to do it...
How it works...
There's more...
See also
Installing Python on Linux and Mac OS X
Getting ready
How to do it...
How it works...
There's more...
See also
Installing Python on Windows
How to do it...
How it works...
See also
Installing the Python data stack on Mac OS X and Linux
Getting ready
How to do it...
How it works...
There's more...
See also
Installing extra Python packages
Getting ready
How to do it...
How it works...
There's more...
See also
Installing and using virtualenv
Getting ready
How to do it...
How it works...
There's more...
See also
2. Driving Visual Analysis with Automobile Data (R)
Introduction
Acquiring automobile fuel efficiency data
Getting ready
How to do it...
How it works…
Preparing R for your first project
Getting ready
How to do it...
How it works...
See also
Importing automobile fuel efficiency data into R
Getting ready
How to do it...
How it works...
There's more...
See also
Exploring and describing fuel efficiency data
Getting ready
How to do it...
How it works...
There's more...
Analyzing automobile fuel efficiency over time
Getting ready
How to do it...
How it works...
See also
Investigating the makes and models of automobiles
Getting ready
How to do it...
How it works...
There's more...
See also
3. Simulating American Football Data (R)
Introduction
Requirements
Acquiring and cleaning football data
Getting ready
How to do it…
How it works…
See also
Analyzing and understanding football data
Getting ready
How to do it…
How it works…
There's more…
See also
Constructing indexes to measure offensive and defensive strength
Getting ready
How to do it…
How it works…
See also
Simulating a single game with outcomes decided by calculations
Getting ready
How to do it…
How it works…
Simulating multiple games with outcomes decided by calculations
Getting ready
How to do it…
How it works…
There's more…
4. Modeling Stock Market Data (R)
Introduction
Requirements
Acquiring stock market data
How to do it...
Summarizing the data
Getting ready
How to do it...
How it works...
There's more...
Cleaning and exploring the data
Getting ready
How to do it...
How it works...
See also
Generating relative valuations
Getting ready
How to do it...
How it works...
Screening stocks and analyzing historical prices
Getting ready
How to do it...
How it works...
5. Visually Exploring Employment Data (R)
Introduction
Preparing for analysis
Getting ready
How to do it…
How it works…
See also
Importing employment data into R
Getting ready
How to do it…
How it works…
There's more…
See also
Exploring the employment data
Getting ready
How to do it…
How it works…
See also
Obtaining and merging additional data
Getting ready
How to do it…
How it works…
Adding geographical information
Getting ready
How to do it…
How it works…
See also
Extracting state- and county-level wage and employment information
Getting ready
How to do it…
How it works…
See also
Visualizing geographical distributions of pay
Getting ready
How to do it…
How it works…
See also
Exploring where the jobs are, by industry
How to do it…
How it works…
There's more…
See also
Animating maps for a geospatial time series
Getting ready
How to do it…
How it works…
There is more…
Benchmarking performance for some common tasks
Getting ready
How to do it…
How it works…
There's more…
See also
6. Creating Application-oriented Analyses Using Tax Data (Python)
Introduction
An introduction to application-oriented approaches
Preparing for the analysis of top incomes
Getting ready
How to do it...
How it works...
Importing and exploring the world's top incomes dataset
Getting ready
How to do it...
How it works...
There's more...
See also
Analyzing and visualizing the top income data of the US
Getting ready
How to do it...
How it works...
Furthering the analysis of the top income groups of the US
Getting ready
How to do it...
How it works...
Reporting with Jinja2
Getting ready
How to do it...
How it works...
There's more...
See also
7. Driving Visual Analyses with Automobile Data (Python)
Introduction
Getting started with IPython
Getting ready
How to do it…
How it works…
See also
Exploring IPython Notebook
Getting ready
How to do it…
How it works…
There's more…
See also
Preparing to analyze automobile fuel efficiencies
Getting ready
How to do it…
How it works…
There's more…
See also
Exploring and describing fuel efficiency data with Python
Getting ready
How to do it…
How it works…
There's more...
See also
Analyzing automobile fuel efficiency over time with Python
Getting ready
How to do it…
How it works…
There's more…
See also
Investigating the makes and models of automobiles with Python
Getting ready
How to do it…
How it works…
See also
8. Working with Social Graphs (Python)
Introduction
Understanding graphs and networks
Preparing to work with social networks in Python
Getting ready
How to do it...
How it works...
There's more...
Importing networks
Getting ready
How to do it...
How it works...
Exploring subgraphs within a heroic network
Getting ready
How to do it…
How it works...
There's more...
Finding strong ties
Getting ready
How to do it...
How it works...
There's more...
Finding key players
Getting ready
How to do it...
How it works...
There's more…
The betweenness centrality
The closeness centrality
The eigenvector centrality
Deciding on centrality algorithm
Exploring the characteristics of entire networks
Getting ready
How to do it...
How it works...
Clustering and community detection in social networks
Getting ready
How to do it...
How it works...
There's more...
Visualizing graphs
Getting ready
How to do it...
How it works...
9. Recommending Movies at Scale (Python)
Introduction
Modeling preference expressions
How to do it…
How it works…
Understanding the data
Getting ready
How to do it…
How it works…
There's more…
Ingesting the movie review data
Getting ready
How to do it…
How it works…
Finding the highest-scoring movies
Getting ready
How to do it…
How it works…
There's more…
See also
Improving the movie-rating system
Getting ready
How to do it…
How it works…
There's more…
See also
Measuring the distance between users in the preference space
Getting ready
How to do it…
How it works…
There's more…
See also
Computing the correlation between users
Getting ready
How to do it…
How it works…
There's more…
Finding the best critic for a user
Getting ready
How to do it…
How it works…
Predicting movie ratings for users
Getting ready
How to do it…
How it works…
Collaboratively filtering item by item
Getting ready
How to do it…
How it works…
Building a nonnegative matrix factorization model
How to do it…
How it works…
See also
Loading the entire dataset into the memory
Getting ready
How to do it…
How it works…
There's more…
Dumping the SVD-based model to the disk
How to do it…
How it works…
Training the SVD-based model
How to do it…
How it works…
There's more…
Testing the SVD-based model
How to do it…
How it works…
There's more…
10. Harvesting and Geolocating Twitter Data (Python)
Introduction
Creating a Twitter application
Getting ready
How to do it...
How it works...
See also
Understanding the Twitter API v1.1
Getting ready
How to do it...
How it works...
There's more...
See also
Determining your Twitter followers and friends
Getting ready
How to do it...
How it works...
There's more...
See also
Pulling Twitter user profiles
Getting ready
How to do it...
How it works...
There's more...
See also
Making requests without running afoul of Twitter's rate limits
Getting ready
How to do it...
How it works...
Storing JSON data to the disk
Getting ready
How to do it...
How it works...
Setting up MongoDB for storing Twitter data
Getting ready
How to do it...
How it works...
There's more...
See also
Storing user profiles in MongoDB using PyMongo
Getting ready
How to do it...
How it works...
Exploring the geographic information available in profiles
Getting ready
How to do it...
How it works...
There's more...
See also
Plotting geospatial data in Python
Getting ready
How to do it...
How it works...
There's more...
See also
11. Optimizing Numerical Code with NumPy and SciPy (Python)
Introduction
Understanding the optimization process
How to do it…
How it works…
There's more…
Identifying common performance bottlenecks in code
How to do it…
How it works…
Reading through the code
Getting ready
How to do it…
How it works…
See also
Profiling Python code with the Unix time function
Getting ready
How to do it…
How it works…
See also
Profiling Python code using built-in Python functions
Getting ready
How to do it…
How it works…
See also
Profiling Python code using IPython's %timeit function
How to do it…
How it works…
Profiling Python code using line_profiler
Getting ready
How to do it…
How it works…
There's more…
See also
Plucking the low-hanging (optimization) fruit
Getting ready
How to do it…
How it works…
Testing the performance benefits of NumPy
Getting ready
How to do it…
How it works…
There's more…
See also
Rewriting simple functions with NumPy
Getting ready
How to do it…
How it works…
Optimizing the innermost loop with NumPy
Getting ready
How to do it…
How it works…
There's more…
Index
Other editions of this book (2 in total)
- 人民邮电出版社 (Posts & Telecom Press), 2016; rated 6.2; read by 14 people
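0 helpful · 阿道克 · 2014-12-06 21:16:20
Case-based teaching, neither too easy nor too hard; pitched at the level of upper-year undergraduates.
0 helpful · oilbeater · 2015-03-29 22:48:31
Disclosure: I took part in translating the last quarter of the book. Strengths: the examples are vivid and very hands-on. Weaknesses: the theory is on the weak side, the difficulty is low, and there is a lot of filler. The book introduces many Python and R tools and libraries, which can help beginners quickly assemble their own toolchain and find examples of real applications; that is probably its biggest contribution.
0 helpful · 雯雯老师 · 2016-02-17 11:59:38
The case studies are too long-winded, but the difficulty is moderate; suitable for self-study by diligent beginners. @jessiejcjsjz
0 helpful · tadisi · 2017-01-02 17:40:41
Very well suited as an introduction to data science, and a good supplement to introductory R and Python material; you learn data analysis methods and ways of thinking by following real projects.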