The Inspection Paradox 9/15/2015

Many people know the so-called friendship paradox of social networks, which was first described in a 1991 study, when social networks were still offline. The paradox holds for modern online networks as well. It is in fact a special case of the well-known inspection paradox.

The friendship paradox states the following: if we pick any Facebook user and then choose one of his or her friends at random, there is about an 80% probability that the friend has more friends than the user.

In fact, there is nothing unusual about this. The essence of the paradox is that users with more friends appear in more friend lists, so they are more likely to end up in the sample. According to the Stanford Large Network Dataset Collection, in a sample of about 4,000 Facebook users the average user has 42 friends, while the friends of those users have 91 friends on average.
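The effect can be reproduced on a very small network. The sketch below is only an illustration: the five users and their links are made up (they are not the Stanford data), and the function names are hypothetical. It compares the average number of friends per user with the average number of friends per friend, where each user is counted once for every friend list he or she appears in.

```python
# Hypothetical toy network (undirected): each user maps to a list of friends.
# The names and links are invented for illustration; this is not the SNAP data.
network = {
    "ann": ["bob", "cat", "dan", "eve"],
    "bob": ["ann", "cat"],
    "cat": ["ann", "bob"],
    "dan": ["ann"],
    "eve": ["ann"],
}


def mean_friends_per_user(net):
    """Average number of friends, counting every user exactly once."""
    counts = [len(friend_list) for friend_list in net.values()]
    return sum(counts) / len(counts)


def mean_friends_per_friend(net):
    """Average number of friends, counting a user once per friend list
    in which that user appears (so popular users are counted more often)."""
    counts = [
        len(net[friend])
        for friend_list in net.values()
        for friend in friend_list
    ]
    return sum(counts) / len(counts)


print(mean_friends_per_user(network))    # 2.0
print(mean_friends_per_friend(network))  # 2.6
```

"ann" has four friends and therefore shows up in four friend lists, which is what pulls the second average above the first.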

The inspection paradox turns up quite often, and it can lead to misconceptions and mistakes. Here is a well-known example: randomly selected university students are asked about the size of their group, and the arithmetic mean of their answers is 56 people, while the university administration says that groups have only 31 students on average. Surprisingly, both the administration and the students are telling the truth. People from larger groups are more likely to be sampled simply because there are more of them.
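A short calculation shows how both averages can be correct at once. The group sizes below are hypothetical and chosen only to keep the arithmetic easy; they are not the figures from the example. The administration averages over groups, while a survey of students averages over students, so a group of size s contributes s answers of "s".

```python
# Hypothetical group sizes, chosen for easy arithmetic (not real data).
group_sizes = [10, 10, 10, 90]


def mean_per_group(sizes):
    """Average seen by the administration: every group counts once."""
    return sum(sizes) / len(sizes)


def mean_per_student(sizes):
    """Average seen by students: a group of size s is reported s times,
    so the sum of answers is sum(s * s) over sum(s) respondents."""
    return sum(s * s for s in sizes) / sum(sizes)


print(mean_per_group(group_sizes))    # 30.0
print(mean_per_student(group_sizes))  # 70.0
```

Here 90 of the 120 students sit in the one large group, so their answer dominates the survey even though three of the four groups are small.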

This mistake may seem trivial, but it causes misunderstanding in many situations. The American computer science professor Allen Downey gives an example. He observed Red Line trains in Boston between 17:00 and 18:00 and found that the average interval between trains was 7.8 min, so the average waiting time should have been about 3.9 min. From the passengers' point of view, however, the average interval was 8.8 min and the average waiting time 4.4 min, a difference of about 13%.

The explanation is simple. During longer intervals more people gather on the platform to wait for the next train, while during shorter intervals platforms and trains are less crowded. A randomly chosen passenger is therefore more likely to be someone who waited through a long interval. The result: most of the passengers interviewed complained about overcrowded trains and long intervals between trains, whereas company officials said that according to their data everything was within the normal range.
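The same reasoning can be put into numbers. The sketch below uses made-up intervals, not the Red Line measurements, and assumes that passengers reach the platform at a steady rate, so the chance of arriving during a given interval is proportional to its length. Under that assumption the average interval experienced by passengers is the sum of squared intervals divided by the sum of intervals, and the expected wait is half of the interval in each case.

```python
# Hypothetical intervals between trains, in minutes (not the Red Line data).
intervals = [5, 5, 10, 20]


def mean_interval_by_schedule(gaps):
    """Average interval as the operator sees it: every gap counts once."""
    return sum(gaps) / len(gaps)


def mean_interval_by_passenger(gaps):
    """Average interval as passengers see it, assuming steady arrivals:
    a gap of length g is experienced by a number of passengers
    proportional to g, so gaps are weighted by their own length."""
    return sum(g * g for g in gaps) / sum(gaps)


schedule_gap = mean_interval_by_schedule(intervals)
passenger_gap = mean_interval_by_passenger(intervals)

print(schedule_gap, schedule_gap / 2)    # 10.0 5.0
print(passenger_gap, passenger_gap / 2)  # 13.75 6.875
```

Half of the total time in this example falls inside the single 20-minute interval, so half of the passengers experience it, although it is only one interval out of four in the operator's records.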

