How to Perform a Box-Cox Transformation in Python

Example: Box-Cox Transformation in Python

Suppose we generate a random set of 1,000 values that come from an exponential distribution:

#load necessary packages
import numpy as np 
from scipy.stats import boxcox 
import seaborn as sns 

#make this example reproducible
np.random.seed(0)

#generate dataset
data = np.random.exponential(size=1000)

#plot the distribution of data values
sns.distplot(data, hist=False, kde=True)

We can see that the distribution does not appear to be normal.

We can use the boxcox() function to find an optimal value of lambda that produces a more normal distribution:

#perform Box-Cox transformation on original data
transformed_data, best_lambda = boxcox(data) 

#plot the distribution of the transformed data values
sns.distplot(transformed_data, hist=False, kde=True)

We can see that the transformed data follows much more of a normal distribution.

We can also find the exact lambda value used to perform the Box-Cox transformation:

#display optimal lambda value
print(best_lambda)

0.2420131978174143

The optimal lambda was found to be roughly 0.242.

Thus, each data value was transformed using the following equation:

New = (old^0.242 – 1) / 0.242

We can confirm this by looking at the values from the original data compared to the transformed data:

#view first five values of original dataset
data[0:5]

array([0.79587451, 1.25593076, 0.92322315, 0.78720115, 0.55104849])

#view first five values of transformed dataset
transformed_data[0:5]

array([-0.22212062,  0.23427768, -0.07911706, -0.23247555, -0.55495228])

The first value in the original dataset was 0.79587. Thus, we applied the following formula to transform this value:

New = (.79587^0.242 – 1) / 0.242 = -0.222

We can confirm that the first value in the transformed dataset is indeed -0.222.

Additional Resources

How to Create & Interpret a Q-Q Plot in Python
How to Perform a Shapiro-Wilk Test for Normality in Python

Subscribe

- Never miss a story with notifications

- Gain full access to our premium content

- Browse free from up to 5 devices at once

Highlights of the 2023 Union Budget: Announcements for 15 Key Sectors

Gold Prices May Rise as Import Duty on Gold raised by 5%

Relief to MSMEs as Mandatory GST Registration waived for online sellers

GST Council Meet Highlights, Full List of Items to get Costlier

Highlights of the 2023 Union Budget: Announcements for 15 Key Sectors

Gold Prices May Rise as Import Duty on Gold raised by 5%

Relief to MSMEs as Mandatory GST Registration waived for online sellers

GST Council Meet Highlights, Full List of Items to get Costlier

Learn About Opening an Automobile Repair Shop in India

Unlocking the Power: Embracing the Benefits of Tax-Free Investing

Income Splitting in Canada for 2023

Can I Deduct Home Office Expenses on my Tax Return 2023?

Canadian Tax – Personal Tax Deadline 2022

Example: Box-Cox Transformation in Python

Additional Resources

Learn About Opening an Automobile Repair Shop in India

Unlocking the Power: Embracing the Benefits of Tax-Free Investing

Income Splitting in Canada for 2023

Can I Deduct Home Office Expenses on my Tax Return 2023?

ABOUT US

Latest

Learn About Opening an Automobile Repair Shop in India

Unlocking the Power: Embracing the Benefits of Tax-Free Investing

Income Splitting in Canada for 2023

Popular

How to Create a Stem-and-Leaf Plot in SPSS

How to Create a Correlation Matrix in SPSS

Excel: How to Use IF Function with Multiple Conditions

Sitemap