Thursday, June 19, 2025

How to Perform Multivariate Normality Tests in Python

When we want to test whether a single variable is normally distributed, we can create a Q-Q plot to visualize the distribution or perform a formal statistical test such as the Anderson-Darling test or the Jarque-Bera test.
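
As a rough illustration of what such a univariate test computes, here is a minimal from-scratch sketch of the Jarque-Bera statistic using only the standard library. The function name jarque_bera, the sample sizes, and the seed are assumptions made for this example; in practice a library routine such as scipy.stats.jarque_bera would normally be used instead.

```python
import math
import random


def jarque_bera(values):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(values)
    mean = sum(values) / n
    # central moments (dividing by n, as in the test's definition)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # under H0 the statistic is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exactly exp(-x / 2)
    p_value = math.exp(-jb / 2)
    return jb, p_value


rng = random.Random(0)  # seed chosen only to make the illustration repeatable
normal_sample = [rng.gauss(0, 1) for _ in range(500)]
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]

for name, sample in [("normal", normal_sample), ("skewed", skewed_sample)]:
    jb, p = jarque_bera(sample)
    print(f"{name}: JB = {jb:.3f}, p-value = {p:.4f}")
```

A small p-value, as the skewed sample should produce, is evidence against normality for that single variable.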

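However, when we want to test whether several variables are jointly normally distributed, we must perform a multivariate normality test. Checking each variable separately is not enough, because variables that are each normal on their own need not be jointly normal.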

This tutorial explains how to perform the Henze-Zirkler multivariate normality test for a given dataset in Python.

Related: If we’d like to identify outliers in a multivariate setting, we can use the Mahalanobis distance.
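
A minimal sketch of that idea, assuming NumPy is available: compute the squared Mahalanobis distance of each row from the sample mean and flag rows beyond a chi-square cutoff. The planted outlier, the seed, and the 0.999 cutoff level are all assumptions made for this illustration, not values from this tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for a repeatable illustration

# 200 ordinary observations of two variables plus one planted outlier
X = rng.normal(size=(200, 2))
X = np.vstack([X, [10.0, -10.0]])

# squared Mahalanobis distance of every row from the sample mean
diff = X - X.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

# for 2 variables the chi-square(2) quantile has the closed form -2*ln(1 - q);
# q = 0.999 is an assumed cutoff level
cutoff = -2.0 * np.log(1.0 - 0.999)
outliers = np.flatnonzero(d2 > cutoff)
print(outliers)
```

The planted row (index 200) should appear among the flagged indices. With more than two variables the cutoff would come from a chi-square quantile function instead of the closed form used here.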

Example: Henze-Zirkler Multivariate Normality Test in Python

The Henze-Zirkler multivariate normality test determines whether a group of variables follows a multivariate normal distribution. The null and alternative hypotheses for the test are as follows:

H0 (null): The variables follow a multivariate normal distribution.

Ha (alternative): The variables do not follow a multivariate normal distribution.
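
For reference, the statistic behind the test is usually written as shown here. This is a sketch drawn from the general literature on the Henze-Zirkler test rather than from this tutorial, and a given library's implementation may differ in detail. The notation is introduced for this example: n is the number of observations, p the number of variables, S the sample covariance matrix, D_i the squared Mahalanobis distance of observation i from the sample mean, and D_ij the squared Mahalanobis distance between observations i and j.

```latex
\beta = \frac{1}{\sqrt{2}} \left( \frac{n(2p+1)}{4} \right)^{1/(p+4)}

HZ = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{n} e^{-\frac{\beta^{2}}{2} D_{ij}}
     - 2 \left( 1+\beta^{2} \right)^{-p/2} \sum_{i=1}^{n} e^{-\frac{\beta^{2}}{2(1+\beta^{2})} D_{i}}
     + n \left( 1+2\beta^{2} \right)^{-p/2}

D_{ij} = (x_i - x_j)^{\top} S^{-1} (x_i - x_j), \qquad
D_{i} = (x_i - \bar{x})^{\top} S^{-1} (x_i - \bar{x})
```

Large values of HZ count as evidence against H0, and the p-value is typically obtained from a lognormal approximation to the distribution of the statistic.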

To perform this test in Python we can use the multivariate_normality() function from the pingouin library.

First, we need to install pingouin:

pip install pingouin

Next, we can import the multivariate_normality() function and use it to perform the Henze-Zirkler test on a given dataset:

# import necessary packages
from pingouin import multivariate_normality
import pandas as pd
import numpy as np

# create a dataset with three variables x1, x2, and x3
df = pd.DataFrame({'x1': np.random.normal(size=50),
                   'x2': np.random.normal(size=50),
                   'x3': np.random.normal(size=50)})

# perform the Henze-Zirkler multivariate normality test
multivariate_normality(df, alpha=.05)

HZResults(hz=0.5956866563391165, pval=0.6461804077893423, normal=True)

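The results of the test are as follows (the exact values will differ from run to run, because the data are randomly generated without a fixed seed):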

  • H-Z Test Statistic: 0.59569
  • p-value: 0.64618

Since the p-value of the test is not less than our specified alpha value of .05, we fail to reject the null hypothesis. We do not have sufficient evidence that the dataset departs from a multivariate normal distribution, so it is reasonable to treat it as multivariate normal.

Related: Learn how the Henze-Zirkler test is used in real-life medical applications in this research paper.
