Feature Weighting

Algorithms for assessing the quality of features.

Classifier-derived methods

See classification.

Iterative RELIEF (I-RELIEF)

class mlpy.Irelief(T=1000, sigma=1.0, theta=0.001)

Iterative RELIEF for Feature Weighting.

Example:

>>> import numpy as np
>>> import mlpy
>>> x = np.array([[1.1, 2.1, 3.1, -1.0],  # first sample
...               [1.2, 2.2, 3.2, 1.0],   # second sample
...               [1.3, 2.3, 3.3, -1.0]]) # third sample
>>> y = np.array([1, 2, 1])               # classes
>>> myir = mlpy.Irelief()                 # initialize irelief class
>>> myir.weights(x, y)                    # compute feature weights
array([ 0.,  0.,  0.,  1.])

Initialize the Irelief class.

Input

  • T - [integer] (>0) max loops
  • sigma - [float] (>0.0) kernel width
  • theta - [float] (>0.0) convergence parameter

weights(x, y)

Return feature weights.

Input

  • x - [2D numpy array float] (sample x feature) training data
  • y - [1D numpy array integer] (two classes) classes

Output

  • fw - [1D numpy array float] feature weights
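
To give a feel for what a RELIEF-style feature weight measures, here is a minimal sketch of the classic, non-iterative RELIEF update that I-RELIEF builds on. It is not mlpy's implementation and it omits the kernel width and the iteration entirely; the function name `relief_weights`, the L1 distance and the toy data are assumptions made for illustration.

```python
def relief_weights(x, y):
    """Classic RELIEF weights with L1 distances (illustrative sketch only)."""
    n_samples, n_feats = len(x), len(x[0])
    weights = [0.0] * n_feats

    def l1(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n_samples):
        hits = [j for j in range(n_samples) if j != i and y[j] == y[i]]
        misses = [j for j in range(n_samples) if y[j] != y[i]]
        if not hits or not misses:
            continue  # no neighbour of one kind: this sample contributes nothing
        hit = x[min(hits, key=lambda j: l1(x[i], x[j]))]      # nearest same-class sample
        miss = x[min(misses, key=lambda j: l1(x[i], x[j]))]   # nearest other-class sample
        for f in range(n_feats):
            # Reward features that differ from the nearest miss,
            # penalise features that differ from the nearest hit.
            diff = abs(x[i][f] - miss[f]) - abs(x[i][f] - hit[f])
            weights[f] += diff / n_samples
    return weights


# Toy data: feature 0 separates the classes, feature 1 does not.
x = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]
y = [1, 1, 2, 2]
fw = relief_weights(x, y)
print(fw)
```

On this toy data the separating feature receives the larger weight, which is the behaviour a feature weighting method is meant to expose.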

exception mlpy.SigmaError

Sigma Error

Sigma parameter is too small.
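
One way to react to this exception is to retry with a larger kernel width. The helper below is a hypothetical sketch, not part of mlpy: `weights_with_larger_sigma`, its arguments and the candidate widths are all assumptions, and the two stand-in classes exist only so that the example runs without mlpy installed.

```python
def weights_with_larger_sigma(make_weighter, x, y, sigma_error,
                              sigmas=(0.1, 1.0, 10.0)):
    """Try each kernel width in turn until weights(x, y) succeeds."""
    last_error = ValueError("no sigma values supplied")
    for sigma in sigmas:
        try:
            return sigma, make_weighter(sigma).weights(x, y)
        except sigma_error as err:
            last_error = err  # width rejected: try the next, larger one
    raise last_error


# Stand-ins so the sketch runs without mlpy installed (both hypothetical).
class FakeSigmaError(Exception):
    pass


class FakeWeighter:
    def __init__(self, sigma):
        self.sigma = sigma

    def weights(self, x, y):
        if self.sigma < 1.0:
            raise FakeSigmaError("sigma parameter is too small")
        return [0.0] * len(x[0])


sigma_used, fw = weights_with_larger_sigma(
    FakeWeighter, [[1.0, 2.0]], [1], FakeSigmaError)
print(sigma_used, fw)
```

With mlpy available, the same helper could presumably be called with `lambda s: mlpy.Irelief(sigma=s)` as `make_weighter` and `mlpy.SigmaError` as `sigma_error`.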

Feature Weighting/Selection (Yijun Sun, 2008)

A feature weighting/selection algorithm described in [Sun08].

class mlpy.FSSun(T=1000, sigma=1.0, theta=0.001, lmbd=1.0, eps=0.001, alpha0=1.0, c=0.01, rho=0.5, debug=False)

Sun Algorithm for feature weighting/selection

Initialize the FSSun class

Parameters:
T : int (> 0)

max loops

sigma : float (> 0.0)

kernel width

theta : float (> 0.0)

convergence parameter

lmbd : float

regularization parameter

eps : float (0 < eps << 1)

termination tolerance for steepest descent method

alpha0 : float (> 0.0)

initial step length (usually 1.0) for line search

c : float (0 < c < 1/2)

constant for line search

rho : float (0 < rho < 1)

alpha coefficient for line search

New in version 2.1.0.
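
The roles of eps, alpha0, c and rho can be illustrated with a generic steepest descent using a backtracking line search. This is a textbook sketch and not the FSSun code: the function names, the sufficient-decrease (Armijo) test, the use of the gradient norm as the quantity compared with eps, and the quadratic test function are all assumptions.

```python
import math


def backtracking_step(f, grad, x, alpha0=1.0, c=0.01, rho=0.5, max_tries=50):
    """Shrink the step along -grad until the sufficient-decrease test holds."""
    g = grad(x)
    fx = f(x)
    g_norm_sq = sum(gi * gi for gi in g)

    def step(alpha):
        return [xi - alpha * gi for xi, gi in zip(x, g)]

    alpha = alpha0            # initial step length
    x_new = step(alpha)
    tries = 0
    # c sets how much decrease is demanded; rho shrinks alpha on each failure.
    while f(x_new) > fx - c * alpha * g_norm_sq and tries < max_tries:
        alpha *= rho
        x_new = step(alpha)
        tries += 1
    return alpha, x_new


def steepest_descent(f, grad, x0, eps=0.001, alpha0=1.0, c=0.01, rho=0.5,
                     max_loops=1000):
    """Repeat line-searched gradient steps until the gradient norm drops below eps."""
    x = list(x0)
    for _ in range(max_loops):
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < eps:
            break
        _, x = backtracking_step(f, grad, x, alpha0, c, rho)
    return x


# Toy quadratic with its minimum at (3, -1).
def f(v):
    return (v[0] - 3.0) ** 2 + 2.0 * (v[1] + 1.0) ** 2


def grad(v):
    return [2.0 * (v[0] - 3.0), 4.0 * (v[1] + 1.0)]


x_min = steepest_descent(f, grad, [0.0, 0.0])
print(x_min)
```

A smaller rho shrinks the step faster after a failed trial, while a larger c demands more decrease before a step is accepted.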

weights(x, y)

Compute the feature weights

Parameters:
x : 2d ndarray float (samples x feats)

training data

y : 1d ndarray integer (-1 or 1)

classes

Returns:
fw : 1d ndarray float

feature weights

Attributes:
FSSun.loops : int

number of loops

Raises:
ValueError

if classes are not -1 or 1

SigmaError

if sigma parameter is too small
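
Because weights() raises ValueError for labels other than -1 and 1, label vectors such as [1, 2, 1] need to be recoded first. A small helper for that might look like the sketch below; `to_plus_minus_one` is a hypothetical name, and mapping the smaller label to -1 is an arbitrary choice.

```python
def to_plus_minus_one(labels):
    """Map a two-class label sequence onto -1 and 1."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("expected exactly two classes, got %d" % len(classes))
    mapping = {classes[0]: -1, classes[1]: 1}  # smaller label becomes -1
    return [mapping[label] for label in labels]


y = to_plus_minus_one([1, 2, 1])
print(y)
```

The resulting list can be wrapped in a NumPy integer array before it is passed on as y.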

Discrete Wavelet Transform based (DWT)

class mlpy.Dwt(specdiff='rpv')

Discrete Wavelet Transform (DWT).

Example:

>>> import numpy as np
>>> import mlpy
>>> xtr = np.array([[1.0, 2.0, 3.1, 1.0],  # first sample
...                 [1.0, 2.0, 3.0, 2.0],  # second sample
...                 [1.0, 2.0, 3.1, 1.0]]) # third sample
>>> ytr = np.array([1, -1, 1])             # classes
>>> mydwt = mlpy.Dwt()                     # initialize dwt class
>>> mydwt.weights(xtr, ytr)                # compute weights on training data
array([ -2.22044605e-14,  -2.22044605e-14,   6.34755463e+00,  -3.00000000e+02])

Initialize the Dwt class.

Input

  • specdiff - [string] spectral difference method ('rpv', 'arpv', 'crpv')

weights(x, y)

Return ABSOLUTE feature weights.

Parameters:
x : 2d ndarray float (samples x feats)

training data

y : 1d ndarray integer (-1 or 1)

classes

Returns:
fw : 1d ndarray float

feature weights
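
For readers unfamiliar with the transform the class is named after, one level of a discrete wavelet transform can be sketched as follows. The Haar wavelet is an assumption suggested by the title of [Subramani06]; the source does not state which wavelet mlpy.Dwt uses or how the specdiff methods are computed, and `haar_step` is a hypothetical name.

```python
import math


def haar_step(signal):
    """One level of an orthonormal Haar DWT: pairwise averages and differences."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / s for a, b in pairs]  # low-pass (smooth) coefficients
    detail = [(a - b) / s for a, b in pairs]  # high-pass (detail) coefficients
    return approx, detail


sample = [1.0, 2.0, 3.1, 1.0]  # first sample from the example above
approx, detail = haar_step(sample)

# The transform is orthonormal, so the total power (sum of squares) is preserved.
power_in = sum(v * v for v in sample)
power_out = sum(v * v for v in approx) + sum(v * v for v in detail)
print(approx, detail, power_in, power_out)
```

The power carried by the coefficients is the kind of quantity a wavelet power spectrum is built from.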

[Sun07] Yijun Sun. Iterative RELIEF for Feature Weighting: Algorithms, Theories, and Applications. IEEE Trans. Pattern Anal. Mach. Intell. 29(6): 1035-1051, 2007.
[Sun08] Yijun Sun, S. Todorovic, and S. Goodison. A Feature Selection Algorithm Capable of Handling Extremely Large Data Dimensionality. In Proc. 8th SIAM International Conference on Data Mining (SDM08), pp. 530-540, April 2008.
[Subramani06] P. Subramani, R. Sahu, and S. Verma. Feature selection using Haar wavelet power spectrum. BMC Bioinformatics 7:432, 2006.