In God we trust, all others bring data! –William Edwards Deming, Management Guru
The oft-quoted phrase above summarises the importance of data if we are to be objective and credible in what we say and propose. Data must form the bedrock of all our decision-making endeavours if the decisions are to have a ring of truth to them. Data, specifically numerical data, lends itself beautifully to analysis through the universal language of science, mathematics.
Statistics, the field of study that involves the collection and analysis of numerical facts or data of any kind, is one such tool. Statistics are amazingly persuasive, so much so that someone has said, “There are three kinds of lies: lies, damned lies, and statistics.” However, only people who do not understand statistics can be lied to with them. Now, to study statistics, two things must be available: a good textbook and a nice set of data. I must recommend a statistics book at this point, “Statistics in Plain English” by Timothy C. Urdan. The author of this relatively small book must be complimented for making a dry subject interesting and comprehensible enough to browse and study casually. Highly recommended.
It is no coincidence that the flow of the forthcoming analysis mirrors the flow of this book. The second requirement, a nice, interesting and large set of data, was fulfilled when I serendipitously stumbled upon an open SQL server connection containing an office's arrival and departure time logs for the past several years (don’t ask). The data set is pretty large, with more than a hundred thousand records, so it lends itself to experiments with random sampling, and it is simple enough to be easily understood.
So, let us start this journey of discovering and learning the basic tenets of descriptive and inferential statistics. Another resource, albeit a bit advanced, is one of the sites of the excellent Stack Exchange network: http://stats.stackexchange.com.
So hang on and hold tight. Here we go!