RangeFilter Class
Filters a dataview on a column of type Single, Double or Key (contiguous). Keeps the rows whose values fall within the specified min/max range. NaNs are always filtered out. If the input is a Key type, min and max are interpreted as fractions (0 to 1) of the number of key values.
- Inheritance
nimbusml.internal.core.preprocessing.filter._rangefilter.RangeFilter -> RangeFilter
nimbusml.base_transform.BaseTransform -> RangeFilter
sklearn.base.TransformerMixin -> RangeFilter
Constructor
RangeFilter(min=-1, max=None, complement=False, include_min=True, include_max=None, columns=None, **params)
Parameters
- columns
A string giving the name of the column to perform the transformation on.
Input column type: numeric.
Output column type: numeric.
The << operator can also be used to set this value (see Column Operator).
For example:
RangeFilter(columns='age')
RangeFilter() << {'age'}
For more details see Columns.
- min
Minimum value (0 to 1 for key types).
- max
Maximum value (0 to 1 for key types).
- complement
If true, keep the values that fall outside the range.
- include_min
If true, include in the range the values that are equal to min.
- include_max
If true, include in the range the values that are equal to max.
- params
Additional arguments sent to the compute engine.
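The way min, max, include_min, include_max and complement combine can be sketched in plain Python. This is only an illustration of the parameter descriptions above, not the nimbusml implementation: the helper `keep_row` and its argument names (`lo`, `hi`, `include_lo`, `include_hi`) are hypothetical, the bounds are passed explicitly rather than relying on the constructor defaults, and it assumes that NaNs are dropped even when complement is set.

```python
import math


def keep_row(value, lo=None, hi=None, complement=False,
             include_lo=True, include_hi=False):
    """Hypothetical helper: True if a row with this value would be kept."""
    # NaNs are always filtered out (assumed to hold with complement as well).
    if math.isnan(value):
        return False
    # A bound of None means "no limit on this side".
    above_lo = lo is None or (value >= lo if include_lo else value > lo)
    below_hi = hi is None or (value <= hi if include_hi else value < hi)
    inside = above_lo and below_hi
    # complement=True keeps the values that fall outside the range.
    return not inside if complement else inside


ages = [26.0, 42.0, 39.0, float("nan"), 20.0, 30.0]
kept = [a for a in ages if keep_row(a, lo=20, hi=30)]
dropped_range = [a for a in ages if keep_row(a, lo=20, hi=30, complement=True)]
print(kept)           # [26.0, 20.0]
print(dropped_range)  # [42.0, 39.0, 30.0]
```

In this sketch the value 30.0 is excluded from the range because `include_hi` is False, so it shows up in the complement instead; passing `include_hi=True` would move it back into the range.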
Examples
###############################################################################
# RangeFilter
import numpy as np
from nimbusml import FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.preprocessing.filter import RangeFilter
# data input (as a FileDataStream)
path = get_dataset('infert').as_filepath()
data = FileDataStream.read_csv(path, numeric_dtype=np.float32)
print(data.head())
# age case education induced parity pooled.stratum row_num ...
# 0 26.0 1.0 0-5yrs 1.0 6.0 3.0 1.0 ...
# 1 42.0 1.0 0-5yrs 1.0 1.0 1.0 2.0 ...
# 2 39.0 1.0 0-5yrs 2.0 6.0 4.0 3.0 ...
# 3 34.0 1.0 0-5yrs 2.0 4.0 2.0 4.0 ...
# 4 35.0 1.0 6-11yrs 1.0 3.0 32.0 5.0 ...
# transform usage
xf = RangeFilter(min=20, max=30, columns='age')
# fit and transform, rows with age outside the range will be deleted
features = xf.fit_transform(data)
print(features.head())
# age case education id induced parity pooled.stratum ...
# 0 26.0 1 0-5yrs 1 1 6 3 ...
# 1 23.0 1 6-11yrs 7 0 1 6 ...
# 2 21.0 1 6-11yrs 9 0 1 5 ...
# 3 28.0 1 6-11yrs 10 0 2 19 ...
# 4 29.0 1 6-11yrs 11 1 2 20 ...
Methods
get_params | Get the parameters for this operator.
get_params
Get the parameters for this operator.
get_params(deep=False)
Parameters
- deep