How to Calculate Deciles in Python (With Examples)

In statistics, deciles are numbers that split a dataset into ten groups of equal frequency.

The first decile is the value below which 10% of all data values lie. The second decile is the value below which 20% of all data values lie, and so on.

We can use the following syntax to calculate the deciles for a dataset in Python, where var is an array of data values:

import numpy as np

np.percentile(var, np.arange(0, 100, 10))
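As a quick check of what this call requests, the sketch below prints the percentile levels produced by np.arange(0, 100, 10) and compares np.percentile with np.quantile, which takes the same levels as fractions. The array x is made-up toy data for illustration only, not data from this tutorial.

```python
import numpy as np

# hypothetical toy data, used only for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])

# percentile levels requested by the syntax above
levels = np.arange(0, 100, 10)
print(levels)  # [ 0 10 20 30 40 50 60 70 80 90]

# np.quantile expects fractions in [0, 1] instead of percentages
by_percentile = np.percentile(x, levels)
by_quantile = np.quantile(x, levels / 100)
print(np.allclose(by_percentile, by_quantile))  # True
```

The levels start at 0 and stop at 90, so np.percentile returns ten values: the minimum followed by the nine deciles.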

The following example shows how to use this function in practice.

Example: Calculate Deciles in Python

The following code shows how to create a fake dataset with 20 values and then calculate the values for the deciles of the dataset:

import numpy as np

#create data
data = np.array([56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
                 89, 90, 91, 92, 93, 93, 94, 95, 97, 99])

#calculate deciles of data
np.percentile(data, np.arange(0, 100, 10))

array([56. , 63.4, 67.8, 76.5, 83.6, 88.5, 90.4, 92.3, 93.2, 95.2])
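To see where a value such as 63.4 comes from, the sketch below reproduces the 10th percentile by hand. It assumes NumPy's default linear interpolation method, and the variable names are illustrative.

```python
import numpy as np

data = np.array([56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
                 89, 90, 91, 92, 93, 93, 94, 95, 97, 99])

sorted_data = np.sort(data)
n = len(sorted_data)

# fractional position of the 10th percentile in the sorted data
position = 0.10 * (n - 1)          # about 1.9
lower = int(np.floor(position))    # index 1 -> value 58
fraction = position - lower        # about 0.9

# interpolate between the two neighbouring values (58 and 64)
step = sorted_data[lower + 1] - sorted_data[lower]
first_decile = sorted_data[lower] + fraction * step

print(round(float(first_decile), 1))             # 63.4
print(round(float(np.percentile(data, 10)), 1))  # 63.4
```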

The way to interpret the deciles is as follows:

  • 10% of all data values lie below 63.4.
  • 20% of all data values lie below 67.8.
  • 30% of all data values lie below 76.5.
  • 40% of all data values lie below 83.6.
  • 50% of all data values lie below 88.5.
  • 60% of all data values lie below 90.4.
  • 70% of all data values lie below 92.3.
  • 80% of all data values lie below 93.2.
  • 90% of all data values lie below 95.2.
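This interpretation can be checked directly by counting the share of values that fall below each number in the output. The following is a small sketch on the same data; the name share_below is illustrative.

```python
import numpy as np

data = np.array([56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
                 89, 90, 91, 92, 93, 93, 94, 95, 97, 99])

# same call as in the example: 0th, 10th, ..., 90th percentiles
deciles = np.percentile(data, np.arange(0, 100, 10))

# share of data values strictly below each returned value
share_below = [float(np.mean(data < d)) for d in deciles]
print(share_below)
# [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
```

The shares come out exact here because the dataset has 20 values and no ties sit on a cut point. With other sample sizes or many tied values, they may only approximate these fractions.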

Note that the first value in the output (56) is the 0th percentile, which is simply the minimum value in the dataset.
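If only the nine deciles are wanted, one option is to start the percentile levels at 10 rather than 0. The sketch below shows this variation and confirms that the 0th percentile equals the minimum; the names with_min and nine_deciles are illustrative.

```python
import numpy as np

data = np.array([56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
                 89, 90, 91, 92, 93, 93, 94, 95, 97, 99])

# levels 0, 10, ..., 90: the minimum plus the nine deciles
with_min = np.percentile(data, np.arange(0, 100, 10))

# levels 10, 20, ..., 90: the nine deciles only
nine_deciles = np.percentile(data, np.arange(10, 100, 10))

print(with_min[0] == data.min())                # True
print(len(nine_deciles))                        # 9
print(np.allclose(nine_deciles, with_min[1:]))  # True
```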

Example: Place Values into Deciles in Python

To place each data value into a decile, we can use the pandas qcut() function.

Here’s how to use this function for the dataset we created in the previous example:

import pandas as pd

#create data frame
df = pd.DataFrame({'values': [56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
                              89, 90, 91, 92, 93, 93, 94, 95, 97, 99]})

#calculate decile of each value in data frame
df['Decile'] = pd.qcut(df['values'], 10, labels=False)

#display data frame
df

	values	Decile
0	56	0
1	58	0
2	64	1
3	67	1
4	68	2
5	73	2
6	78	3
7	83	3
8	84	4
9	88	4
10	89	5
11	90	5
12	91	6
13	92	6
14	93	7
15	93	7
16	94	8
17	95	8
18	97	9
19	99	9

The way to interpret the output is as follows:

  • The data value 56 falls between the 0th and 10th percentiles, so it is placed in decile 0.
  • The data value 58 falls between the 0th and 10th percentiles, so it is placed in decile 0.
  • The data value 64 falls between the 10th and 20th percentiles, so it is placed in decile 1.
  • The data value 67 falls between the 10th and 20th percentiles, so it is placed in decile 1.
  • The data value 68 falls between the 20th and 30th percentiles, so it is placed in decile 2.

And so on.
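To see which cut points qcut used, the sketch below asks for the bin edges with retbins=True and compares them with percentiles computed by NumPy. It also shows one possible way to get deciles numbered 1 to 10 by adding 1 to the integer codes. The column name Decile_1_to_10 and the variable names are illustrative, not part of the example above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'values': [56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
                              89, 90, 91, 92, 93, 93, 94, 95, 97, 99]})

# retbins=True also returns the bin edges that qcut computed
codes, edges = pd.qcut(df['values'], 10, labels=False, retbins=True)

# the 11 edges should match the 0th, 10th, ..., 100th percentiles
expected = np.percentile(df['values'], np.arange(0, 101, 10))
print(np.allclose(edges, expected))  # True

# shift the 0-based codes so that deciles run from 1 to 10
df['Decile_1_to_10'] = codes + 1
print(df['Decile_1_to_10'].tolist())
```

Each bin is closed on the right, so a value that equals a cut point is placed in the lower of the two deciles.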

Additional Resources

How to Calculate Percentiles in Python
How to Calculate The Interquartile Range in Python
