Swiss Passenger Frequency

First we load the Effects class and wrap the passenger frequencies of 724 Swiss railway stations in it.

>>> from effectus import Effects
>>> from effectus.data import swiss_rail_passengers
>>> pfreq = Effects(swiss_rail_passengers())

Overview

Is a Pareto distribution present? If so, which one?

>>> pfreq
<pareto present [0.785]: 1/20 causes => 3/5 effects [total ∆: 1.3 % points]>

5 percent of Swiss railway stations account for 60 percent of all boarding and alighting passengers.

These 5 percent of railway stations are the vital few, the ones that are decisive.

Conversely, we can determine the useful many:

>>> pfreq.the_rule()
'Rule 60/5 [total ∆: 1.3 % points]'

60 percent of Swiss railway stations make up only 5 percent of all passengers.
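
Both statements rest on the same idea: sort the stations by passenger count, take a share of them from one end, and see what share of the total they carry. The following is a minimal sketch of that idea in plain Python, not effectus's own implementation; the function `effects_share` and the station counts are hypothetical and only serve as illustration.

```python
def effects_share(values, causes_share, from_top=True):
    """Share of the total carried by a given share of the causes.

    from_top=True takes the largest values (the vital few),
    from_top=False takes the smallest values (the useful many).
    """
    ordered = sorted(values, reverse=from_top)
    count = round(len(ordered) * causes_share)
    return sum(ordered[:count]) / sum(ordered)


# Hypothetical passenger counts for 20 stations (not the Swiss data).
stations = [9000, 2500, 1200, 800, 500, 400, 300, 250, 200, 150,
            120, 100, 90, 80, 70, 60, 50, 50, 40, 40]

top = effects_share(stations, 0.05)                      # busiest 5 percent
bottom = effects_share(stations, 0.60, from_top=False)   # quietest 60 percent

print(f"busiest 5% of stations carry {top:.2%} of passengers")
print(f"quietest 60% of stations carry {bottom:.2%} of passengers")
```

In this toy data set the single busiest station carries more than half of all passengers, while the twelve quietest stations together carry only a few percent.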

Note

In this example, the figure of 60 percent appears both as the effects share of the vital few and as the causes share of the useful many. This is a coincidence.

Data Scalpel

The 60 percent is only the best estimate. The actual relation closest to it (not the one for the_rule()) is available through:

>>> pfreq.actual
{'causes': Fraction(35, 724), 'effects': Fraction(215520, 365867)}

Some methods of Effects accept a limit parameter. If it is not set, the method falls back to the value from actual:

>>> pfreq.attain_causes()
0.589

35 of the 724 railway stations actually account for 58.9 percent of the passengers.
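
The figures can be checked by hand from the two fractions reported by actual. The sketch below uses the standard library's fractions module instead of quicktions. The last step is an assumption: it reads the "total ∆" in the overview as the sum of the absolute deviations of causes and effects from the rounded 1/20 and 3/5, which the source does not confirm.

```python
from fractions import Fraction

# Values copied from pfreq.actual in the example above.
causes = Fraction(35, 724)
effects = Fraction(215520, 365867)

causes_pct = round(float(causes) * 100, 1)    # share of stations, in percent
effects_pct = round(float(effects) * 100, 1)  # share of passengers, in percent

# Assumption: "total ∆" is the summed distance to the rounded 1/20 => 3/5 rule.
delta = abs(Fraction(1, 20) - causes) + abs(Fraction(3, 5) - effects)
delta_points = round(float(delta) * 100, 1)

print(causes_pct, effects_pct, delta_points)
```

Under that reading the result agrees with the 1.3 percentage points shown in the overview.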

The same can be determined for any other share:

>>> from quicktions import Fraction
>>> pfreq.attain_causes(Fraction(36, 724)) # A Fraction is used here because it represents
...                                        # the share exactly, unlike a floating-point number.
0.594

How do I know which railway stations account for 58.9 percent of passengers?

>>> pfreq.separate_causes()
(19800.0, 1, 1)

All stations with 19800 or more boarding or alighting passengers.
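
The threshold idea can be sketched in a few lines of plain Python: the cut-off is the smallest value that still belongs to the chosen number of largest values, and every station at or above it falls into the vital few. This is an illustration only, not how effectus computes it; the helper `threshold_for_top` and the station names and counts are hypothetical.

```python
def threshold_for_top(values, count):
    """Smallest value among the `count` largest values."""
    ordered = sorted(values, reverse=True)
    return ordered[count - 1]


# Hypothetical stations and passenger counts (not the Swiss data).
stations = {"A": 9000, "B": 2500, "C": 1200, "D": 800, "E": 500}

threshold = threshold_for_top(stations.values(), 2)
vital_few = [name for name, n in stations.items() if n >= threshold]
useful_many = [name for name, n in stations.items() if n < threshold]

print(threshold, vital_few, useful_many)
```

Every station lands in exactly one of the two groups, so their sizes add up to the total number of stations. Ties at the threshold would enlarge the vital few beyond the requested count, which this sketch does not handle specially.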

Let’s take a closer look at this group:

>>> pfreq.few
<[5% => 60%] count: 35, mean: 61577.143, stdev: 75413.826>

The values of the vital few are readily available:

>>> few = pfreq.few.values; len(few)
35

Which is nothing but a shortcut for:

>>> few = pfreq.retrieve_effects(); len(few)
35

The same for the useful many:

>>> pfreq.most
<[95% => 40%] count: 689, mean: 2182.104, stdev: 3503.793>

This is simply the counterpart to the vital few:

>>> most = pfreq.retrieve_effects(counterpart=True)
>>> assert len(most)+len(few) == len(pfreq.all.values)

The groups in overview:

>>> pfreq.groups()
  Causes    Effects    Count       Mean      Stdev    Stdev/Mean    ∆ Mean
--------  ---------  -------  ---------  ---------  ------------  --------
    100%       100%      724   5053.412  21187.637           4.2
      5%        60%       35  61577.143  75413.826           1.2    +1119%
     95%        40%      689   2182.104   3503.793           1.6      -57%

Findings

60 percent of Swiss rail passengers get on or off at stations that, on average, handle about 12 times as many passengers as the average of all stations.

Conversely, 40 percent of passengers get on or off at stations that, on average, handle less than half as many passengers as the average of all stations.

If we remove the 35 stations of the vital few, the standard deviation of the remaining 689 stations is 83 percent lower than that of all 724 stations.

Although the standard deviation of the vital few (the second row) increases in absolute terms, one must keep in mind that the yardstick is the mean, which has increased considerably more than the standard deviation.
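
These findings follow directly from the figures in the groups() table. The sketch below recomputes them in plain Python from the printed values; the dictionary layout and variable names are chosen here for illustration and are not part of effectus.

```python
# Means and standard deviations copied from the groups() table.
groups = {
    "all":  {"mean": 5053.412,  "stdev": 21187.637},
    "few":  {"mean": 61577.143, "stdev": 75413.826},
    "most": {"mean": 2182.104,  "stdev": 3503.793},
}

# How the group means compare with the mean of all stations.
few_vs_all = groups["few"]["mean"] / groups["all"]["mean"]
most_vs_all = groups["most"]["mean"] / groups["all"]["mean"]

# Relative drop in standard deviation once the vital few are removed.
stdev_drop = 1 - groups["most"]["stdev"] / groups["all"]["stdev"]

# Standard deviation measured against the mean (the Stdev/Mean column).
relative_spread = {name: g["stdev"] / g["mean"] for name, g in groups.items()}

print(f"vital few mean vs. overall mean:   {few_vs_all:.1f}x")
print(f"useful many mean vs. overall mean: {most_vs_all:.2f}x")
print(f"drop in standard deviation:        {stdev_drop:.0%}")
for name, spread in relative_spread.items():
    print(f"stdev/mean for {name}: {spread:.1f}")
```

Measured against its own mean, the spread of the vital few is the smallest of the three groups, even though its absolute standard deviation is the largest.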

The Art of Differentiation

How can considering each group in its own right yield better results?

According to Wickham Skinner, a system that tries to accomplish two different things cannot optimise for one of them: if it did, it would no longer be able to accomplish the other.

Picture someone trying to eat a boiled egg with a tablespoon and, with the same spoon, trying to dig a pit. The tablespoon is too big and too small at the same time.

The trick is to use a shovel for the pit and a teaspoon for the boiled egg.

Although this is obvious for the boiled egg and the pit, it is not for many things in the real world: many carriers of value show no difference in their outward appearance.

And even if they did, where should one separate them?

effectus tells you.