RangeFilter Class

Filters a dataview on a column of type Single, Double or Key (contiguous). Keeps the rows whose values fall within the specified min/max range. NaNs are always filtered out. If the input is a Key type, min and max are interpreted as fractions (0 to 1) of the number of values.
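The behavior described above can be illustrated without the library. The following is a minimal plain-Python sketch, not the nimbusml implementation: the helper name `range_filter_rows` is hypothetical, both bounds are assumed to be inclusive, and Key-typed columns are not modeled.

```python
import math


def range_filter_rows(rows, column, lo, hi):
    """Keep rows whose `column` value lies in [lo, hi]; rows with NaN are dropped.

    Hypothetical helper for illustration only; both bounds are assumed inclusive.
    """
    kept = []
    for row in rows:
        value = row[column]
        # Drop NaN explicitly, mirroring the documented rule that
        # NaNs are always filtered out.
        if math.isnan(value):
            continue
        if lo <= value <= hi:
            kept.append(row)
    return kept


rows = [
    {"age": 26.0, "parity": 6.0},
    {"age": 42.0, "parity": 1.0},
    {"age": float("nan"), "parity": 2.0},
    {"age": 30.0, "parity": 4.0},
]

filtered = range_filter_rows(rows, "age", 20, 30)
print([row["age"] for row in filtered])
```

Whole rows are removed, not just the values in the filtered column, so the other columns of a dropped row disappear with it.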

Inheritance

RangeFilter inherits from:

  • nimbusml.internal.core.preprocessing.filter._rangefilter.RangeFilter

  • nimbusml.base_transform.BaseTransform

  • sklearn.base.TransformerMixin

Constructor

RangeFilter(min=-1, max=None, complement=False, include_min=True, include_max=None, columns=None, **params)

Parameters

columns

A string giving the name of the column to perform the transformation on.

  • Input column type: numeric.

  • Output column type: numeric.

The << operator can be used to set this value (see Column Operator).

For example:

  • RangeFilter(columns='age')

  • RangeFilter() << {'age'}

For more details see Columns.

min

Minimum value (0 to 1 for key types).

max

Maximum value (0 to 1 for key types).

complement

If true, keep the values that fall outside the range.

include_min

If true, include in the range the values that are equal to min.

include_max

If true, include in the range the values that are equal to max.

params

Additional arguments sent to the compute engine.
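To make the interaction of complement, include_min and include_max concrete, here is a minimal plain-Python sketch of the per-value decision those parameters describe. It is an illustration under assumptions, not the library's code: `keep_value` is a hypothetical helper, a bound of None is assumed to mean "unbounded on that side", and the behavior of the default include_max=None is not modeled (the sketch requires an explicit boolean).

```python
import math


def keep_value(value, lo=None, hi=None, complement=False,
               include_min=True, include_max=True):
    """Return True if `value` would be kept (hypothetical helper, illustration only)."""
    # NaNs are always filtered out, even when complement is set.
    if math.isnan(value):
        return False
    # Assumption: a bound of None means the range is open-ended on that side.
    above = True if lo is None else (value >= lo if include_min else value > lo)
    below = True if hi is None else (value <= hi if include_max else value < hi)
    inside = above and below
    # complement keeps the values that fall outside the range.
    return (not inside) if complement else inside


ages = [19.0, 20.0, 25.0, 30.0, 31.0, float("nan")]

closed = [a for a in ages if keep_value(a, 20, 30)]
strict = [a for a in ages
          if keep_value(a, 20, 30, include_min=False, include_max=False)]
outside = [a for a in ages if keep_value(a, 20, 30, complement=True)]

print(closed)
print(strict)
print(outside)
```

With both bounds included, the values equal to 20 and 30 are kept; excluding both bounds drops them; with complement set, only the values outside the range survive, and the NaN is dropped in all three cases.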

Examples


   ###############################################################################
   # RangeFilter
   import numpy as np
   from nimbusml import FileDataStream
   from nimbusml.datasets import get_dataset
   from nimbusml.preprocessing.filter import RangeFilter

   # data input (as a FileDataStream)
   path = get_dataset('infert').as_filepath()
   data = FileDataStream.read_csv(path, numeric_dtype=np.float32)
   print(data.head())
   #    age  case education  induced  parity  pooled.stratum  row_num  ...
   # 0  26.0   1.0    0-5yrs      1.0     6.0             3.0      1.0  ...
   # 1  42.0   1.0    0-5yrs      1.0     1.0             1.0      2.0  ...
   # 2  39.0   1.0    0-5yrs      2.0     6.0             4.0      3.0  ...
   # 3  34.0   1.0    0-5yrs      2.0     4.0             2.0      4.0  ...
   # 4  35.0   1.0   6-11yrs      1.0     3.0            32.0      5.0  ...

   # transform usage
   xf = RangeFilter(min=20, max=30, columns='age')

   # fit and transform, rows with age outside the range will be deleted
   features = xf.fit_transform(data)
   print(features.head())
   #    age  case education  id  induced  parity  pooled.stratum  ...
   # 0  26.0     1    0-5yrs   1        1       6               3 ...
   # 1  23.0     1   6-11yrs   7        0       1               6 ...
   # 2  21.0     1   6-11yrs   9        0       1               5 ...
   # 3  28.0     1   6-11yrs  10        0       2              19 ...
   # 4  29.0     1   6-11yrs  11        1       2              20 ...

Methods

get_params

Get the parameters for this operator.

get_params(deep=False)

Parameters

deep
default value: False