Guest Blog — December 25, 2020

Using Predictive Power Score to Pinpoint Non-linear Correlations

[Figure: correlation power plot. Image by Author]
!pip3 install ppscore

 

Calculating the Predictive Power Score

[Figure: Predictive Power Score vs. correlation. Image from GitHub]
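# pandas for the DataFrame, NumPy for the random data, ppscore for the PPS
import pandas as pd
import numpy as np
import ppscore as pps

# Start from an empty DataFrame; the columns are added in the next steps
df = pd.DataFrame()
df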
# x: 10,000 values drawn uniformly from [-2, 2)
df["x"] = np.random.uniform(-2, 2, 10000)
df.head()

# error: uniform noise in [-0.5, 0.5)
df["error"] = np.random.uniform(-0.5, 0.5, 10000)
df.head()

# y: a quadratic function of x plus the noise
df["y"] = df["x"] * df["x"] + df["error"]
df.head()

# Pearson correlation between x and y
df["x"].corr(df["y"])
-0.0115046561021449
# Pearson correlation matrix for all columns
df.corr()
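To see why the coefficient lands near zero here, the sketch below recomputes Pearson's r by hand on freshly generated data of the same shape. It is illustrative only: the seed and the names `rng`, `cov_xy`, and `r_numpy` are assumptions, not part of the original notebook, so the exact value will differ from the one shown above.

```python
import numpy as np

# Assumption: a fixed seed, chosen arbitrarily so the sketch is reproducible
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 10_000)
y = x * x + rng.uniform(-0.5, 0.5, 10_000)

# Pearson's r = cov(x, y) / (std(x) * std(y)), using population statistics
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
r = cov_xy / (x.std() * y.std())

# Cross-check against NumPy's built-in correlation matrix
r_numpy = np.corrcoef(x, y)[0, 1]
print(r, r_numpy)
```

Because x is symmetric around zero, positive and negative values of x contribute to the covariance with x squared in opposite directions and largely cancel, even though y depends strongly on x.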
# PPS of x as a predictor of y
pps.score(df, "x", "y")
{'x': 'x',
 'y': 'y',
 'ppscore': 0.675090383548477,
 'case': 'regression',
 'is_valid_score': True,
 'metric': 'mean absolute error',
 'baseline_score': 1.025540102508908,
 'model_score': 0.33320784136182485,
 'model': DecisionTreeRegressor()}
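The fields in this dictionary are enough to recompute the score by hand. As a minimal sketch, assuming the regression formula described in the ppscore README (one minus the ratio of the model's mean absolute error to that of a naive baseline, floored at zero), the reported numbers can be plugged in directly:

```python
# Values copied from the pps.score(df, "x", "y") output above
baseline_score = 1.025540102508908   # mean absolute error of the naive baseline
model_score = 0.33320784136182485    # mean absolute error of the decision tree

# Assumed formula: 1 - MAE_model / MAE_baseline, never below 0
ppscore = max(0.0, 1 - model_score / baseline_score)
print(ppscore)
```

The result agrees with the reported `ppscore` of roughly 0.675.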
# PPS in the reverse direction: y as a predictor of x
pps.score(df, "y", "x")
{'x': 'y',
 'y': 'x',
 'ppscore': 0,
 'case': 'regression',
 'is_valid_score': True,
 'metric': 'mean absolute error',
 'baseline_score': 1.0083196087945172,
 'model_score': 1.1336852173737795,
 'model': DecisionTreeRegressor()}
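The two results above show the asymmetry of the score: x predicts y well, but y cannot recover the sign of x, so the tree does worse than the baseline and the score is reported as 0. The following is a rough sketch of that calculation using scikit-learn directly. It assumes, per the ppscore README, a cross-validated decision tree, mean absolute error, and a median baseline. `pps_regression_sketch` is a hypothetical helper, and the real library may differ in details such as sampling and preprocessing, so the numbers will not match exactly.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeRegressor


def pps_regression_sketch(feature, target, folds=4):
    """Hypothetical helper: an unofficial approximation of a regression PPS."""
    X = np.asarray(feature).reshape(-1, 1)
    y = np.asarray(target)
    # Out-of-fold predictions from a decision tree
    preds = cross_val_predict(DecisionTreeRegressor(random_state=0), X, y, cv=folds)
    model_mae = mean_absolute_error(y, preds)
    # Naive baseline: always predict the median of the target
    baseline_mae = mean_absolute_error(y, np.full_like(y, np.median(y)))
    # A model that is worse than the baseline gets a score of 0
    return max(0.0, 1 - model_mae / baseline_mae)


# Assumption: a fixed seed, chosen arbitrarily so the sketch is reproducible
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 10_000)
y = x * x + rng.uniform(-0.5, 0.5, 10_000)

score_xy = pps_regression_sketch(x, y)  # x as a predictor of y
score_yx = pps_regression_sketch(y, x)  # y as a predictor of x
print(score_xy, score_yx)
```

The first score should be well above zero and the second at or near zero, mirroring the asymmetry in the two dictionaries.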
# PPS of every column as a predictor of y
pps.predictors(df, "y")

# PPS of every column as a predictor of x
pps.predictors(df, "x")

# PPS for every ordered pair of columns
pps.matrix(df)

Analyzing & visualizing results

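import seaborn as sns

# PPS of every column as a predictor of y, returned as a DataFrame
predictors_df = pps.predictors(df, y="y")
# One bar per predictor, with bar height equal to its PPS
sns.barplot(data=predictors_df, x="x", y="ppscore")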

[Figure: Predictive Power Score bar plot]

# Reshape the long-format PPS results into a matrix: features as columns, targets as rows
matrix_df = pps.matrix(df)[['x', 'y', 'ppscore']].pivot(columns='x', index='y', values='ppscore')
matrix_df

# Heatmap of the PPS matrix, with the color scale fixed to the 0 to 1 range
sns.heatmap(matrix_df, vmin=0, vmax=1, cmap="Blues", linewidths=0.5, annot=True)

[Figure: Predictive Power Score heatmap]

Example with Categorical Features

[Figure: Predictive Power Score with a categorical variable]

Disclosure

Limitations

 

Conclusions

References

About the Author
