PLSRegression VIP score calculation

I was looking for an implementation of VIP (Variable Importance in the Projection) scoring for PLS models as described in this publication.

The algorithm from that paper is implemented in this MATLAB code, which is available under a BSD License. Below is a Python implementation of the same code, which computes the VIP scores from a fitted scikit-learn PLSRegression model.

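import numpy as np

def vip(x, y, model):
    # y is unused; x only supplies the number of predictors p.
    t = model.x_scores_    # X scores, shape (m, h)
    w = model.x_weights_   # X weights, shape (p, h)
    q = model.y_loadings_  # Y loadings, shape (n_targets, h)

    m, p = x.shape
    _, h = t.shape

    vips = np.zeros((p,))

    # Sum of squares explained by each of the h components (1-D, length h).
    s = np.diag(t.T @ t @ q.T @ q)
    total_s = np.sum(s)

    for i in range(p):
        # Squared, column-normalized weight of variable i on each component.
        weight = np.array([(w[i, j] / np.linalg.norm(w[:, j]))**2 for j in range(h)])
        vips[i] = np.sqrt(p * (s @ weight) / total_s)

    return vips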
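
Written out, the function above appears to compute the following for predictor j, with p predictors and h components. The symbols T, W, Q and s_a are notation chosen here for x_scores_, x_weights_, y_loadings_ and the per-component sum of squares; they are not taken from the paper.

```latex
\mathrm{VIP}_j
  = \sqrt{\, p \,
      \frac{\sum_{a=1}^{h} s_a \left( \dfrac{w_{ja}}{\lVert w_a \rVert} \right)^{2}}
           {\sum_{a=1}^{h} s_a} },
\qquad
s_a = \left[\, T^{\top} T \, Q^{\top} Q \,\right]_{aa}
```

When the score columns are mutually orthogonal, s_a reduces to (t_a^T t_a)(q_a^T q_a), the sum of squares attributed to component a.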

Firstly, is this something that would be considered for addition to scikit-learn, and if so, where would be a good place to add it? There was a discussion on the mailing list from 2015 about adding this, but it doesn’t look like it got any further.

Issue Analytics

  • State: open
  • Created 7 years ago
  • Reactions: 2
  • Comments: 12 (3 by maintainers)

Top GitHub Comments

6 reactions
gilliM commented, Nov 17, 2017

First, np.linalg.norm(w[:,j]) cannot be zero because it’s always 1! Weights are normalized (step 1 of NIPALS)

The function’s arguments can be simplified:

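import numpy as np

def vip(model):
  t = model.x_scores_    # X scores, shape (n_samples, h)
  w = model.x_weights_   # X weights, shape (p, h)
  q = model.y_loadings_  # Y loadings, shape (n_targets, h)
  p, h = w.shape
  vips = np.zeros((p,))
  # Sum of squares explained by each component (1-D, length h).
  s = np.diag(t.T @ t @ q.T @ q)
  total_s = np.sum(s)
  for i in range(p):
    # Squared, column-normalized weight of variable i on each component.
    weight = np.array([(w[i, j] / np.linalg.norm(w[:, j]))**2 for j in range(h)])
    vips[i] = np.sqrt(p * (s @ weight) / total_s)
  return vips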
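
The per-variable loop can probably be replaced by a single matrix-vector product. What follows is a minimal sketch under a few assumptions, not a tested scikit-learn addition. The name `vip_vectorized` is hypothetical. The `SimpleNamespace` object only mimics the three fitted attributes, so the example runs without scikit-learn. The random arrays are illustrative data, not results from the thread.

```python
import numpy as np
from types import SimpleNamespace


def vip_vectorized(model):
    """VIP scores from x_scores_, x_weights_ and y_loadings_ without a Python loop."""
    t = model.x_scores_    # (n_samples, h)
    w = model.x_weights_   # (p, h)
    q = model.y_loadings_  # (n_targets, h)
    p = w.shape[0]
    # Per-component sum of squares, same expression as the loop version.
    s = np.diag(t.T @ t @ q.T @ q)
    # Normalize each weight column; a no-op if the columns already have unit norm.
    w_norm = w / np.linalg.norm(w, axis=0)
    return np.sqrt(p * ((w_norm ** 2) @ s) / s.sum())


# Stand-in for a fitted model (assumption: orthonormal score columns).
rng = np.random.default_rng(0)
t, _ = np.linalg.qr(rng.normal(size=(20, 3)))
w = rng.normal(size=(5, 3))
q = rng.normal(size=(1, 3))
fake_model = SimpleNamespace(x_scores_=t, x_weights_=w, y_loadings_=q)

scores = vip_vectorized(fake_model)
```

With a fitted estimator the call would simply be `vip_vectorized(model)`. A useful sanity check is that the squared VIP scores sum to the number of predictors p, since each normalized weight column has unit length.
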

0 reactions
adrinjalali commented, Jan 15, 2019

@melvinev it’ll happen once somebody takes the issue and submits a PR for it 😃
