World Oil

For example, you could compare the world’s oil reserves against its consumption in 2015.

Oh, and let’s add co2() emissions for reference:

>>> from effectus import Effects
>>> from effectus.data import oil_reserves, oil_consumption, co2
>>> patterns = (oil_reserves, oil_consumption, co2)

Firstly, is there a pareto distribution present?

>>> bucket = [Effects(pattern()) for pattern in patterns]
>>> [pattern.pareto for pattern in bucket]
[True, True, True]

All three have a pareto distribution present.

Secondly, let’s check how they compare against each other.

>>> [pattern.ratio for pattern in bucket]
[0.753, 0.797, 0.785]

The oil_reserves() show the strongest pareto distribution (least entropy), followed by the co2() emissions. The oil_consumption() shows the weakest of the three.


  1. Which share of countries accounts for 75% of oil_consumption()?
  2. How much of the co2() emissions do the five biggest emitting countries account for?
  3. How many countries are required to attain 75% of oil_reserves()?
  4. How many years do 10% of the oil_reserves() last for 10% of oil_consumption() (note: reserves are in million barrels per year, consumption in thousand barrels per day)?
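
For question 4 the units have to be aligned before dividing. The following is a minimal sketch of that conversion only. It treats the reserves figure as a stock in million barrels (an assumption) and consumption as a rate in thousand barrels per day; the name `years_of_supply` and the sample figures are hypothetical and not part of effectus.

```python
def years_of_supply(reserves_million_bbl, consumption_thousand_bbl_per_day,
                    days_per_year=365):
    """Return how many years a stock of reserves lasts at a constant daily rate."""
    # Bring both quantities to plain barrels before dividing.
    reserves_bbl = reserves_million_bbl * 1_000_000
    yearly_consumption_bbl = consumption_thousand_bbl_per_day * 1_000 * days_per_year
    return reserves_bbl / yearly_consumption_bbl

# Made-up figures: 1,000 million barrels used up at 100 thousand barrels per day.
print(round(years_of_supply(1_000, 100), 1))  # 27.4
```

To answer the question, feed in the 10% slices you obtained from oil_reserves() and oil_consumption() instead of the made-up figures.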

Euro Coins & Bank Notes

>>> from effectus.data import euro_coins, euro_banknotes
>>> coins = Effects(euro_coins())
>>> coins
<pareto present [0.919]: 1/4 causes => 2/3 effects [total ∆: 2.4 % points]>

A quarter of all euro_coins() in circulation makes up two thirds of the value they represent. How much is this?

>>> Effects(euro_coins()).summary['effects']*sum(euro_coins())/10**3

Almost 18 billion Euros.


  1. Is a pareto distribution present for euro_banknotes()?
  2. If so, which distribution is stronger, that of the euro_coins() or that of the euro_banknotes()?
  3. How many Euros do the largest 10% of euro_banknotes() represent?
  4. How many Euros do the largest 50% of euro_coins() represent?

Swiss Railway Passengers

How much of total passenger traffic do the least frequented 10% of stations have?

>>> from effectus.data import swiss_rail_passengers
>>> passengers = Effects(swiss_rail_passengers())
>>> passengers.interval_causes(0.9, 1.0)

Only 0.2 percent. How many of the least frequented stations are required to make up 10 percent of passenger traffic?

>>> least = passengers.interval_effects(0.9, 1.0)
>>> least*len(swiss_rail_passengers())

522 of 724 Swiss railway stations make up only 10 percent of passenger traffic.
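
The idea behind this count can be sketched in plain Python: sort the values in ascending order and add them up until the requested share of the total is reached. This is only an illustration on made-up numbers, not the way effectus computes its results; `count_smallest_for_share` is a hypothetical helper.

```python
def count_smallest_for_share(values, share):
    """Count how many of the smallest values are needed to reach `share` of the total."""
    total = sum(values)
    running = 0
    for count, value in enumerate(sorted(values), start=1):
        running += value
        if running >= share * total:
            return count
    return len(values)

# Made-up passenger counts for ten stations.
stations = [5, 10, 15, 20, 50, 100, 200, 650, 2000, 7000]
needed = count_smallest_for_share(stations, 0.10)
print(needed, 'of', len(stations), 'stations')  # 8 of 10 stations
```

Even in this toy data set, most stations together contribute only a tenth of the traffic.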

Let’s check for rule 50/5:

>>> passengers.attain_effects(0.05, ascending=True)

Rule 50/5 is present.


  1. Suppose Swiss railways classifies railway stations into A (0 to 60 percent of passenger volume), B (60 to 80 percent) and C (80 to 100 percent of volume). Which share of causes would each class represent?
  2. Following your results from 1., how many stations fall into the categories A, B and C?
  3. Determine the corresponding threshold values of passenger frequency enabling you to allocate each single station to class A, B or C.
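
An A/B/C classification of this kind can be sketched without effectus on made-up numbers. The sketch below ranks values from largest to smallest and assigns each one to the class in which the cumulative share before adding it falls. That boundary convention is an assumption, and effectus may draw the line differently; `abc_classes` is a hypothetical helper.

```python
def abc_classes(values, cuts=(0.6, 0.8)):
    """Group values into classes A, B and C by cumulative share of the total."""
    total = sum(values)
    labels = 'ABC'
    classes = {label: [] for label in labels}
    running = 0
    for value in sorted(values, reverse=True):
        share_before = running / total
        # Number of cut points already passed selects the class (0 -> A, 1 -> B, 2 -> C).
        index = sum(share_before >= cut for cut in cuts)
        classes[labels[index]].append(value)
        running += value
    return classes

# Made-up passenger counts for ten stations.
classes = abc_classes([450, 200, 100, 90, 60, 40, 30, 15, 10, 5])
print({label: len(members) for label, members in classes.items()})
# The smallest member of a class serves as its threshold value.
print(min(classes['A']), min(classes['B']))
```

With this toy data, two stations land in class A, two in B and six in C, and a station needs at least 200 passengers for class A and at least 90 for class B.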


Exoplanets

Exoplanets are planets outside our solar system.

>>> from effectus.data import exoplanets
>>> from effectus import Effects
>>> from math import pi
>>> eplanets = {name: {'mass': values['mass'], 'volume': 4/3*pi*values['radius']**3 } for name, values in exoplanets().items()}
>>> volumes = Effects([values['volume'] for values in eplanets.values()])
>>> masses = Effects([values['mass'] for values in eplanets.values()])
>>> [volumes, masses]
[<pareto present [0.923]: 1/5 causes => 3/5 effects [total ∆: 2.2 % points]>,
 <pareto present [0.84]: 1/5 causes => 4/5 effects [total ∆: 1.5 % points]>]

For the 630 exoplanets, a pareto distribution is present in both mass and volume. Let’s find out which causes represent the effect classes A (0 to 60 percent), B (60 to 80 percent) and C (80 to 100 percent).

Let’s start with class A:

>>> volumes.interval_effects(0, 0.6)

For all three classes we could do something like:

>>> from effectus.intervals import create_bounds
>>> from collections import defaultdict
>>> volumes_intervals = defaultdict(dict)
>>> for lower, upper in create_bounds([0.6, 0.8]):
...     volumes_intervals['{}..{}'.format(lower, upper)]['causes'] = volumes.interval_effects(lower, upper)
>>> [(interval, volumes_intervals[interval]['causes']) for interval in volumes_intervals]
[('0..0.6', 0.2),
 ('0.6..0.8', 0.16099999999999998),
 ('0.8..1', 0.45599999999999996)]

We see that the last 20 percent of the total volume of our exoplanets can be traced back to almost 46 percent of them.
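
Judging from the interval labels in the output above, create_bounds([0.6, 0.8]) yields the pairs (0, 0.6), (0.6, 0.8) and (0.8, 1). A stand-in with that behaviour could look like the following sketch. `bounds_from_cuts` is a hypothetical name, and this is not effectus’s own implementation.

```python
def bounds_from_cuts(cuts):
    """Turn inner cut points into consecutive (lower, upper) pairs spanning 0 to 1."""
    edges = [0, *sorted(cuts), 1]
    # Pair every edge with its successor: (0, c1), (c1, c2), ..., (cn, 1).
    return list(zip(edges, edges[1:]))

print(bounds_from_cuts([0.6, 0.8]))  # [(0, 0.6), (0.6, 0.8), (0.8, 1)]
```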

Now, let’s retrieve the keys of the causes falling into a specific effects interval:

>>> from effectus.intervals import keys_in_effects_interval
>>> for lower, upper in create_bounds([0.6, 0.8]):
...     volumes_intervals['{}..{}'.format(lower, upper)]['planets'] = set(keys_in_effects_interval(eplanets, 'volume', lower, upper))

volumes_intervals['0..0.6']['causes'] carries the causes for the interval 0 through 0.6 of effects. volumes_intervals['0..0.6']['planets'] holds the names of the planets that fall into that interval of effects.

Now, let’s do the same for the mass of the exoplanets:

>>> masses_intervals = defaultdict(dict)
>>> for lower, upper in create_bounds([0.6, 0.8]):
...     masses_intervals['{}..{}'.format(lower, upper)]['causes'] = masses.interval_effects(lower, upper)
>>> for lower, upper in create_bounds([0.6, 0.8]):
...     masses_intervals['{}..{}'.format(lower, upper)]['planets'] = set(keys_in_effects_interval(eplanets, 'mass', lower, upper))
>>> [(interval, masses_intervals[interval]['causes']) for interval in masses_intervals]
[('0..0.6', 0.07), ('0.6..0.8', 0.127), ('0.8..1', 0.803)]

While 20 percent of exoplanets are required to make up 60 percent of their total volume (see above), only 7 percent are required to make up 60 percent of their total mass.

One might be interested in which exoplanets fall into the same interval. For class A (0 to 60 percent) this would be just the 7 percent. But for classes B and C the story would be a little more complicated.

>>> b_planets = masses_intervals['0.6..0.8']['planets'].intersection(volumes_intervals['0.6..0.8']['planets'])
>>> c_planets = masses_intervals['0.8..1']['planets'].intersection(volumes_intervals['0.8..1']['planets'])

As this is a tedious process, there is a convenience function that does just that:

>>> from effectus.intervals import effects_intersections
>>> planet_intervals = effects_intersections(eplanets, 'mass', 'volume', [0.6, 0.8])

However, if you want those planets that account for 80 to 100 percent of their total volume but not for 60 to 80 percent of their total mass, you have to do it on your own:

>>> exclusion = volumes_intervals['0.8..1']['planets'].difference(masses_intervals['0.6..0.8']['planets'])
>>> [len(planets) for planets in
... [volumes_intervals['0.8..1']['planets'],
...  masses_intervals['0.6..0.8']['planets'], exclusion] ]
[401, 80, 361]

While 401 exoplanets make up the interval of 80 to 100 percent of total volume, only 80 make up the interval of 60 to 80 percent of total mass. Half of the latter (40 exoplanets) also fall into the group making up 80 to 100 percent of total volume. If we subtract them from the first group (the 401 exoplanets), we are left with 361.
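
The relation between these three numbers is plain set arithmetic: the size of a difference equals the size of the first set minus the size of the intersection. A small sketch with made-up planet names (the sets below are hypothetical and not taken from exoplanets()):

```python
# Hypothetical members of two effects intervals.
volume_c = {'p1', 'p2', 'p3', 'p4', 'p5'}
mass_b = {'p4', 'p5', 'p6'}

both = volume_c & mass_b            # planets in both intervals
only_volume_c = volume_c - mass_b   # planets in volume_c but not in mass_b

# |A - B| == |A| - |A & B|, the same relation as 361 == 401 - 40 above.
assert len(only_volume_c) == len(volume_c) - len(both)
print(sorted(both), sorted(only_volume_c))
```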


  1. How many exoplanets fall into the interval of 30 to 60 percent of total mass?
  2. Which exoplanets are these, by name?
  3. Find out the exoplanets that fall into the interval of 70 to 80 percent of total mass but not into the interval of 10 to 90 percent of volume.

Fibonacci Sequences

In a fibonacci sequence the next number is the sum of the preceding two. It has been observed that natural growth processes follow the pattern of fibonacci sequences.

For your convenience, use the following function to generate fibonacci series:

>>> from functools import lru_cache
>>> def fib_list(count):
...     """Returns list of `count` fibonacci numbers."""
...     @lru_cache(maxsize=None)
...     def fib(n):
...         if n < 2:
...             return n
...         return fib(n-1) + fib(n-2)
...     return [fib(i) for i in range(count)]
>>> fib_list(10)
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
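
If you prefer to do without recursion and caching, an iterative variant produces the same list. This is a sketch; `fib_list_iterative` is a hypothetical name chosen here.

```python
def fib_list_iterative(count):
    """Return a list of the first `count` fibonacci numbers, starting at 0."""
    numbers = []
    current, following = 0, 1
    for _ in range(count):
        numbers.append(current)
        # Shift the window: the next number is the sum of the preceding two.
        current, following = following, current + following
    return numbers

print(fib_list_iterative(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```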


  1. Check whether a pareto distribution can be observed for fibonacci series. If so, does it hold for all series or only from/to a specific series length?