Using Predictive Power Score to Pinpoint Non-linear Correlations

guest_blog 31 Oct, 2022
11 min read

!pip3 install ppscore

 

Calculating the Predictive Power Score

Predictive Power Score VS Correlation

import pandas as pd
import numpy as np
import ppscore as pps
df = pd.DataFrame()
df

df["x"] = np.random.uniform(-2, 2, 10000)
df.head()
df["error"] = np.random.uniform(-0.5, 0.5, 10000)
df.head()

df["y"] = df["x"] * df["x"] + df["error"]
df.head()
df["x"].corr(df["y"])
-0.0115046561021449
df.corr()
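
The near-zero value is expected: Pearson correlation measures only linear association, and when x is spread symmetrically around zero the covariance between x and x squared cancels out. A minimal pure-Python sketch shows the cancellation without random noise. The helper name `pearson` and the five-point grid are illustrative assumptions, not part of the original notebook.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances are left unnormalized; the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Symmetric grid around zero (illustrative data, not the article's sample).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys_quadratic = [x * x for x in xs]    # perfectly determined by x, but not linearly
ys_linear = [2 * x + 1 for x in xs]   # perfectly linear in x

r_quadratic = pearson(xs, ys_quadratic)
r_linear = pearson(xs, ys_linear)
print(r_quadratic, r_linear)
```

Both targets are fully determined by x, yet only the linear one registers as correlated, which is the gap a predictive score is meant to close.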
pps.score(df, "x", "y")
{'x': 'x',
 'y': 'y',
 'ppscore': 0.675090383548477,
 'case': 'regression',
 'is_valid_score': True,
 'metric': 'mean absolute error',
 'baseline_score': 1.025540102508908,
 'model_score': 0.33320784136182485,
 'model': DecisionTreeRegressor()}

pps.score(df, "y", "x")
{'x': 'y',
 'y': 'x',
 'ppscore': 0,
 'case': 'regression',
 'is_valid_score': True,
 'metric': 'mean absolute error',
 'baseline_score': 1.0083196087945172,
 'model_score': 1.1336852173737795,
 'model': DecisionTreeRegressor()}
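
The two dictionaries above are consistent with the regression score being one minus the ratio of the model's mean absolute error to the baseline's, floored at zero when the model does no better than the baseline. The sketch below recomputes both results from the reported `baseline_score` and `model_score`. The function name `pps_from_mae` is hypothetical, and this is a reading of the printed output rather than ppscore's internal code.

```python
def pps_from_mae(baseline_mae, model_mae):
    """Return a score in [0, 1]; 0 when the model is no better than the baseline."""
    if model_mae >= baseline_mae:
        return 0.0
    return 1.0 - model_mae / baseline_mae

# Values copied from the pps.score outputs above.
x_predicts_y = pps_from_mae(1.025540102508908, 0.33320784136182485)
y_predicts_x = pps_from_mae(1.0083196087945172, 1.1336852173737795)

print(round(x_predicts_y, 6))  # 0.67509
print(y_predicts_x)            # 0.0
```

The asymmetry follows from the data: knowing x pins down y up to the noise term, but a given y corresponds to two x values of opposite sign, so a model predicting x from y cannot beat the naive baseline.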
pps.predictors(df, "y")

pps.predictors(df, "x")

pps.matrix(df)

Analyzing & visualizing results

import seaborn as sns
predictors_df = pps.predictors(df, y="y")
sns.barplot(data=predictors_df, x="x", y="ppscore")

matrix_df = pps.matrix(df)[['x', 'y', 'ppscore']].pivot(columns='x', index='y', values='ppscore')
matrix_df

sns.heatmap(matrix_df, vmin=0, vmax=1, cmap="Blues", linewidths=0.5, annot=True)

Example with Categorical Features

Disclosure

Limitations

 

Conclusions

References
