Top scope of statistics

This article is about the top scope of statistics and how statistical methods are applied in different fields.


Overview


Statistical thinking is the study of methods and patterns in data. The main objective of this article is to work through some practical examples of why it is important to know the different kinds of statistical tools, such as basic probability and statistics, regression analysis, and so on. To understand these examples, we first need a clear idea of what the term “statistics” and the words related to it mean. First, we look at the history of statistical thinking and how it came about. Second, we explain how the concepts of random variables and probability fit under the name of statistics. Lastly, we discuss a quick example of the two concepts, random variables and probabilities, in action, and then one more example for each concept. In my next article, I will start discussing what each of these concepts means in detail.


History of Statistical Thinking and its foundation


The word “statistics” originally referred to the systematic study of the state: the German term Statistik is usually credited to Gottfried Achenwall in 1749, and the word entered English through Sir John Sinclair in the 1790s. Statistical thinking matured during the Industrial Revolution and has remained one of the most effective ways of reasoning in modern times, playing a key role in the economic progress and social advancement that the industrial era brought. Well into the period from the 1600s to the 1800s, however, measurement still relied on simple instruments and a great deal of human judgment in the calculations. It took many years before instruments such as telescopes and precision balances came into routine use, and that period can be taken as the background of statistics. With the help of such devices, people could measure distances to celestial bodies and estimate quantities, such as the volume of the Earth’s oceans, that could never be observed directly.

The mathematical groundwork came earlier. Isaac Newton’s Philosophiæ Naturalis Principia Mathematica, published in 1687, covered mathematics, astronomy, and natural philosophy, and established the law of universal gravitation; from that point on, mathematics became the backbone of calculus-based physics. In the nineteenth century, William Rowan Hamilton reformulated mechanics in what is now called Hamiltonian form, and scholars such as Laplace and Gauss applied probability to astronomical observations and to the theory of measurement error. As time went by, the number of mathematical tools grew rapidly; it became possible to predict an ever wider range of physical events, and quantum physics eventually grew out of this tradition. From the very beginning, scientists have used numbers to predict future events, and those discoveries were later used to estimate the probability of natural calamities.

Among the earliest formal statistical tests were Karl Pearson’s chi-squared test, published in 1900, and the t-test, introduced in 1908 by William Sealy Gosset, writing under the pseudonym “Student”.

In such a test, the decision is based on the p-value, not on the test statistic itself: the null hypothesis is rejected when the p-value falls at or below a chosen significance level, conventionally 0.05, and the conclusion becomes more reliable as the sample size grows.
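
As a minimal sketch of how such a test is carried out in practice (the use of Python and scipy.stats here is my own illustration, not something from the original text), two samples are drawn, the t statistic and p-value are computed, and the p-value is compared with the 0.05 level:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    # Two independent samples; group_b is shifted slightly to the right.
    group_a = rng.normal(loc=0.0, scale=1.0, size=50)
    group_b = rng.normal(loc=0.5, scale=1.0, size=50)

    # Student's two-sample t-test.
    t_stat, p_value = stats.ttest_ind(group_a, group_b)

    # The decision rests on the p-value, not on the t statistic itself.
    alpha = 0.05
    print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
    if p_value <= alpha:
        print("Reject the null hypothesis of equal means at the 5% level.")
    else:
        print("Fail to reject the null hypothesis at the 5% level.")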

The concept of independence is much older than statistics as a discipline, but it became central to applied work once data could be processed by machine. Two samples, or two random variables, are independent when knowing the values of one tells us nothing about the values of the other; if the individual draws are also independent of each other, we speak of an independent random sample. In the 1960s, when computer programs were still being developed, it became routine to compute the correlation between samples, and for genuinely independent samples that estimated correlation should be close to zero, with any departure from zero attributable to sampling error. When the samples are not independent, the sampling distribution depends on all of the samples jointly rather than factorizing into the product of the individual distributions, and treating dependent samples as independent is itself a source of error.
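
A minimal numerical sketch of what independence looks like (Python with numpy, my own illustration rather than anything from the text): two samples drawn independently of one another show a correlation close to zero, while a variable built from one of them does not.

    import numpy as np

    rng = np.random.default_rng(42)

    # Two samples drawn independently; neither influences the other.
    x = rng.normal(size=10_000)
    y = rng.normal(size=10_000)

    # The sample correlation should be close to zero, apart from sampling error.
    r_independent = np.corrcoef(x, y)[0, 1]

    # For contrast, a variable built from x is strongly correlated with it.
    z = 0.8 * x + 0.2 * rng.normal(size=10_000)
    r_dependent = np.corrcoef(x, z)[0, 1]

    print(f"correlation of independent samples: {r_independent:.3f}")
    print(f"correlation of dependent samples:   {r_dependent:.3f}")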

Econometrics expanded rapidly from the 1960s onward, and quantitative finance followed in the decades after. At the time, computers were becoming more powerful, yet the general econometric models of that era did not demand much physical computing power: estimating them required neither a lot of computers nor any special software. Today, by contrast, quantitative finance has become almost impossible without heavy computing resources.

Later work on quantitative methods follows the same pattern as the earlier methods, namely the specification of the model. The difference is that we then have to find a mathematical representation of the model and perform an empirical test of it.
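
As a hedged sketch of that workflow (the linear model and the use of scipy.stats.linregress are my own illustration): specify a model, write it as an equation, fit it to data, and test it empirically.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)

    # Step 1: specify the model, here a simple linear relationship y = a + b*x + noise.
    x = rng.uniform(0, 10, size=200)
    y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=200)

    # Step 2: the mathematical representation y = a + b*x is fitted by least squares.
    result = stats.linregress(x, y)

    # Step 3: the empirical test asks whether the estimated slope differs from zero.
    print(f"intercept a = {result.intercept:.2f}, slope b = {result.slope:.2f}")
    print(f"p-value for the slope: {result.pvalue:.4g}")
    if result.pvalue < 0.05:
        print("The data support a nonzero slope at the 5% level.")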

Another thing that mathematical modeling adds is the importance of probabilistic statements in real life. A statement such as “I am happy because I saw you” only holds if there is evidence that someone actually saw you; it is made conditional on an event having occurred. A statement like “I am happy, whatever happens”, by contrast, is meant to be true regardless of which events occur. Saying that you are happy because you received a certain reward or status works the same way: the statement is tied to an event, which makes reasoning about it similar to reasoning about the probability of that event.
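
To make the link to events concrete, here is a small simulation (my own hedged illustration in Python; the scenario and the numbers are hypothetical): the statement is treated as an event, and its probability is estimated conditional on the evidence event having occurred.

    import numpy as np

    rng = np.random.default_rng(7)
    n = 100_000

    # Hypothetical events: B = "someone actually saw you", A = "you are happy".
    # In this made-up model, being seen raises the chance of being happy.
    saw_you = rng.random(n) < 0.30
    happy = np.where(saw_you, rng.random(n) < 0.80, rng.random(n) < 0.50)

    # Unconditional probability of the statement "I am happy".
    p_happy = happy.mean()

    # Conditional probability P(A | B) = P(A and B) / P(B).
    p_happy_given_seen = (happy & saw_you).mean() / saw_you.mean()

    print(f"P(happy)           = {p_happy:.3f}")
    print(f"P(happy | saw you) = {p_happy_given_seen:.3f}")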

Problems related to statistical thinking

Just like the definitions above, these ideas bring certain problems with them. Two major problems have arisen from statistical thinking:

Data misinterpretation is a common problem. People often confuse frequency distributions with probability distributions, even though the two terms mean different things. A probability distribution assigns probabilities to all the possible outcomes over a given space, and those probabilities sum (or integrate) to one. A frequency distribution, by contrast, records how often each outcome actually occurred in the observed data, so it is essentially a weighted set of counts of random events. When we talk about probability distributions, we are mostly concerned with the theoretical model; when we talk about frequency distributions, we are mostly concerned with what was observed. To address this confusion, we convert observed frequencies into estimated probabilities by dividing each count by the total number of observations; a random variable then ties the two views together, because its distribution is exactly the collection of probabilities we are trying to estimate from the observed frequencies.
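
A short sketch of turning a frequency distribution into estimated probabilities so it can be compared with a theoretical probability distribution (Python with numpy, my own illustration; the fair-die example is hypothetical):

    import numpy as np

    rng = np.random.default_rng(1)

    # Observed data: 600 rolls of a fair six-sided die.
    rolls = rng.integers(1, 7, size=600)

    # Frequency distribution: how often each face occurred.
    faces, counts = np.unique(rolls, return_counts=True)

    # Normalizing the frequencies turns them into estimated probabilities.
    relative_freq = counts / counts.sum()

    # Theoretical probability distribution of a fair die.
    theoretical = np.full(len(faces), 1 / 6)

    for face, freq, prob in zip(faces, relative_freq, theoretical):
        print(f"face {face}: observed {freq:.3f} vs theoretical {prob:.3f}")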