Author: [US] Philipp K. Janert
Publisher: O'Reilly Media
Subtitle: A hands-on guide for programmers and data scientists
Publication date: 2010-11-25
Pages: 540
Price: USD 39.99
Binding: Paperback
ISBN: 9780596802356
Real World Data Analysis shows you how to think about data and the results you want to achieve with it. Author Philipp Janert teaches you how to effectively approach data analysis problems, and how to extract all the available information from your data. Many people can apply a data analysis formula. This book shows you how to look at the results and know whether they're meaningful.
These days it seems like everyone is collecting data. But all of that data is just raw information -- to make that information meaningful, it has to be organized, filtered, and analyzed. Anyone can apply data analysis tools and get results, but without the right approach those results may be useless.
In Real World Data Analysis, author Philipp Janert teaches you how to think about data: how to effectively approach data analysis problems, and how to extract all of the available information from your data. Janert covers univariate data, data in multiple dimensions, time series data, graphical techniques, data mining, machine learning, and many other topics. He also reveals how seat-of-the-pants knowledge can lead you to the best approach right from the start, and how to assess results to determine if they're meaningful.
About the author of Data Analysis with Open Source Tools
After previous careers in physics and software development, Philipp K. Janert currently provides consulting services for data analysis, algorithm development, and mathematical modeling. He has worked for small start-ups and in large corporate environments, both in the U.S. and overseas. He prefers simple solutions that work to complicated ones that don't, and thinks that purpose is more important than process. Philipp is the author of "Gnuplot in Action - Understanding Data with Graphs" (Manning Publications), and has written for the O'Reilly Network, IBM developerWorks, and IEEE Software. He is named inventor on a handful of patents, and is an occasional contributor to CPAN. He holds a Ph.D. in theoretical physics from the University of Washington. Visit his company website at www.principal-value.com.
Excerpts from the book
If a distribution is symmetric and well behaved, then mean and median will be quite close together, and there is little difference in using either. Once the distribution becomes skewed, however, the basic assumption that underlies the mean as a measure of the location of the distribution is no longer fulfilled, and so you are better off using the median. (This is why the average wage is usually given in official publications as the median family income, not the mean; the latter would be significantly distorted by the few households with extremely high incomes.)
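The effect described in this excerpt can be seen in a minimal sketch. The income figures below are made up purely for illustration and do not come from the book; the helper name `summarize` is likewise hypothetical.

```python
import statistics

# Hypothetical household incomes (made-up numbers, for illustration only).
symmetric = [48_000, 50_000, 52_000, 49_000, 51_000]
# Same households plus one with an extremely high income.
skewed = symmetric + [5_000_000]


def summarize(values):
    """Return (mean, median) of a list of numbers."""
    return statistics.mean(values), statistics.median(values)


sym_mean, sym_median = summarize(symmetric)
skew_mean, skew_median = summarize(skewed)

# For the symmetric data the two measures agree.
print(f"symmetric: mean={sym_mean:.0f}, median={sym_median:.0f}")
# A single outlier drags the mean far away, while the median barely moves.
print(f"skewed:    mean={skew_mean:.0f}, median={skew_median:.0f}")
```

In this toy data set one outlier moves the mean from 50,000 to 875,000, while the median only shifts from 50,000 to 50,500, which is the distortion the excerpt warns about.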
In an unnormalized histogram, the value plotted for each bin is the absolute count of events in that bin.
In a normalized histogram, we divide each count by the total number of points in the data set, so that the value for each bin becomes the fraction of points in that bin.
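The two kinds of histogram in this excerpt can be sketched as follows. This is only an illustration under stated assumptions: the data values, the bin width, and the function names `histogram` and `normalize` are invented here and are not taken from the book.

```python
from collections import Counter


def histogram(data, bin_width):
    """Unnormalized histogram: absolute count of events per bin.

    Each bin is identified by its left edge (an assumption of this sketch).
    """
    counts = Counter((x // bin_width) * bin_width for x in data)
    return dict(sorted(counts.items()))


def normalize(counts):
    """Normalized histogram: each count divided by the total number of points."""
    total = sum(counts.values())
    return {edge: n / total for edge, n in counts.items()}


# Made-up data points, for illustration only.
data = [1, 2, 2, 3, 5, 6, 6, 7, 8, 11]

counts = histogram(data, bin_width=5)
fractions = normalize(counts)

print(counts)     # absolute counts per bin
print(fractions)  # fraction of all points per bin
```

Because every point falls into exactly one bin, the fractions of a histogram normalized this way add up to 1, which makes histograms of data sets with different sizes directly comparable.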
Don’t let “data” get in the way of ethical decisions. The most important things in life can’t be measured. It is a fallacy to believe that, just because something can’t be measured, it doesn’t matter or doesn’t even exist. And a pretty tragic fallacy...
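The English title is Data Analysis with Open Source Tools; the Chinese subtitle renders it literally as "data analysis based on open source tools". To borrow the old quip about the "Holy Roman Empire", this book has neither Data, nor Analysis, nor any Open Source Tools. It sprawls across everything: differential equations, probability distributions, Monte Carlo, dimensional an...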
0 helpful · [deleted account] · 2012-12-24 17:39:44
The author keeps placing emphasis on insights instead of numbers while working with data. The ultimate goal of data analysis is to understand how the system works, not to show off how proficient you are at math. That's the true spirit of professionalism. Some annoying jargon is well explained in a plain manner. Little sections on R.
0 helpful · 蝉 · 2014-01-10 14:06:30
(none)
0 helpful · 小风 · 2012-06-21 16:43:56
Suitable for newcomers to data analysis.
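1 helpful · 阿道克 · 2012-11-22 22:28:57
A quite good introductory statistics book that keeps pace with current changes in data science. The drawback is that no data sources are provided, so the examples cannot be worked through, which greatly reduces its value.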
0 helpful · all the fish · 2014-10-19 21:39:32
some names