Chapter 1, Importing Data for Analysis, will cover how to read data from a variety of sources, including CSV files, web pages, and linked semantic web data.
Chapter 2, Cleaning and Validating Data, will present strategies and implementations for normalizing dates, fixing spelling, and working with large datasets. Getting data into a useable shape is an important, but often overlooked, stage of data analysis.
Chapter 3, Managing Complexity with Concurrent Programming, will cover Clojure's concurrency features and how we can use them to simplify our programs.
Chapter 4, Improving Performance with Parallel Programming, will cover using Clojure's parallel processing capabilities to speed up data processing.
Chapter 5, Distributed Data Processing with Cascalog, will cover using Cascalog as a wrapper over Hadoop and the Cascading library to process large amounts of data distributed over multiple computers. The final recipe in this chapter will use Pallet to run a simple analysis on Amazon's EC2 service.
Chapter 6, Working with Incanter Datasets, will cover the basics of working with Incanter datasets. Datasets are the core data structure used by Incanter, and understanding them is necessary to use Incanter effectively.
Chapter 7, Preparing for and Performing Statistical Data Analysis with Incanter, will cover a variety of statistical processes and tests used in data analysis. Some of these are quite simple, such as generating summary statistics. Others are more complex, such as performing linear regressions and auditing data with Benford's Law.
Chapter 8, Working with Mathematica and R, will cover setting up Clojure to communicate with Mathematica or R. These are powerful data analysis systems, and sometimes we might want to use them. This chapter will show us how to get these systems to work together, as well as some tasks we can do once they are communicating.
Chapter 9, Clustering, Classifying, and Working with Weka, will cover more advanced machine learning techniques. In this chapter, we'll primarily use the Weka machine learning library. Some recipes will discuss how to use it and the data structures it's built on, while other recipes will demonstrate machine learning algorithms.
Chapter 10, Graphing in Incanter, will show how to generate graphs and other visualizations in Incanter. These can be important for exploring and learning about your data and also for publishing and presenting your results.
Chapter 11, Creating Charts for the Web, will show how to set up a simple web application to present findings from data analysis. It will include a number of recipes that leverage the powerful D3 visualization library.
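0 helpful M. Tong 2013-07-14 01:47:40
As a cookbook, it is reasonably up to standard.
0 helpful 散关清渭 2015-02-23 19:47:20
I read some of it but did not finish; to be precise, I could not keep reading. The book mainly covers how to use Clojure to operate some common data statistics tools. Perhaps because I do not work in data analysis, I do not see what the point of writing this is. It feels like an operating manual rather than what I think of as a book... Hence, a negative review!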