FeatureHasher Class

Definition

public class FeatureHasher : Microsoft.Spark.ML.Feature.FeatureBase<Microsoft.Spark.ML.Feature.FeatureHasher>
type FeatureHasher = class
    inherit FeatureBase<FeatureHasher>
Public Class FeatureHasher
Inherits FeatureBase(Of FeatureHasher)
Inheritance

Methods

Clear(Param)

Clears any value that was previously set for this Microsoft.Spark.ML.Feature.Param. The value is reset to the default value.

(Inherited from FeatureBase<T>)
ExplainParam(Param)

Returns a description of how a specific Microsoft.Spark.ML.Feature.Param works and is currently set.

(Inherited from FeatureBase<T>)
ExplainParams()

Returns a description of how all of the Microsoft.Spark.ML.Feature.Params that apply to this object work and how they are currently set.

(Inherited from FeatureBase<T>)
GetCategoricalCols()

Gets a list of the columns which have been specified as categorical columns.

GetInputCols()

Gets the columns that the FeatureHasher should read from and convert into hashes. These are set by SetInputCols.

GetNumFeatures()

Gets the number of features that should be used. Because a simple modulo maps each hash value to a column index, it is advisable to use a power of two as the numFeatures parameter; otherwise the features will not be mapped evenly to the columns.

GetOutputCol()

Gets the name of the column the output data will be written to. This is set by SetOutputCol.

GetParam(String)

Retrieves a Microsoft.Spark.ML.Feature.Param so that it can be used to set the value of the Microsoft.Spark.ML.Feature.Param on the object.

(Inherited from FeatureBase<T>)
Load(String)

Loads the FeatureHasher that was previously saved using Save.

Save(String)

Saves the object so that it can be loaded later using Load. Saved objects can be shared with Scala: an object saved here can be loaded in Scala, and an object saved in Scala can be loaded here.

(Inherited from FeatureBase<T>)
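
The Save and Load entries above can be sketched as a round trip. This is a minimal sketch, not a definitive implementation: it assumes a running Spark session with the Microsoft.Spark package available, that Load is exposed as a static method on FeatureHasher, and that the target path does not already exist. The temporary path and the numFeatures value are hypothetical.

```csharp
using System;
using System.IO;
using Microsoft.Spark.ML.Feature;
using Microsoft.Spark.Sql;

class FeatureHasherSaveLoadSketch
{
    static void Main()
    {
        // A Spark session is needed so that the JVM side of the object exists.
        SparkSession spark = SparkSession.Builder()
            .AppName("FeatureHasherSaveLoadSketch")
            .GetOrCreate();

        FeatureHasher hasher = new FeatureHasher();
        hasher.SetNumFeatures(1024); // hypothetical value, a power of two

        // Hypothetical location; a fresh directory name avoids clashing
        // with an earlier save.
        string path = Path.Combine(
            Path.GetTempPath(), "feature-hasher-" + Guid.NewGuid());

        hasher.Save(path);

        // Assumption: Load is a static method that returns a FeatureHasher.
        FeatureHasher restored = FeatureHasher.Load(path);

        // The restored object should report the same setting as the original.
        Console.WriteLine(restored.GetNumFeatures());

        spark.Stop();
    }
}
```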
Set(Param, Object)

Sets the value of a specific Microsoft.Spark.ML.Feature.Param.

(Inherited from FeatureBase<T>)
SetCategoricalCols(IEnumerable<String>)

Marks columns as categorical columns.

SetInputCols(IEnumerable<String>)

Sets the columns that the FeatureHasher should read from and convert into hashes.

SetNumFeatures(Int32)

Sets the number of features that should be used. Because a simple modulo maps each hash value to a column index, it is advisable to use a power of two as the numFeatures parameter; otherwise the features will not be mapped evenly to the columns.
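
The power-of-two advice above comes down to simple arithmetic, which the following sketch illustrates. It does not use the hash function that FeatureHasher uses; it assumes a toy 8-bit hash space (256 possible values) so that the effect is easy to see, and the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class ModuloBucketDemo
{
    // Counts how many of the hashSpace possible hash values land on each
    // index when the index is computed as hash % numFeatures.
    public static int[] CountPerIndex(int hashSpace, int numFeatures)
    {
        var counts = new int[numFeatures];
        for (int hash = 0; hash < hashSpace; hash++)
        {
            counts[hash % numFeatures]++;
        }
        return counts;
    }

    public static void Main()
    {
        const int hashSpace = 256; // toy 8-bit hash space (assumption)

        foreach (int numFeatures in new[] { 8, 6 })
        {
            int[] counts = CountPerIndex(hashSpace, numFeatures);
            Console.WriteLine(
                $"numFeatures={numFeatures}: min={counts.Min()}, max={counts.Max()}");
        }
    }
}
```

With 8 indices every index receives exactly 32 hash values. With 6 indices, 256 does not divide evenly, so four indices receive 43 values and two receive 42.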

SetOutputCol(String)

Sets the name of the new column in the DataFrame created by Transform.

ToString()

Returns the JVM toString value rather than the .NET ToString default.

(Inherited from FeatureBase<T>)
Transform(DataFrame)

Transforms the input DataFrame. It is recommended that you validate that the transform will succeed by calling TransformSchema.

TransformSchema(StructType)

Checks transform validity and derives the output schema from the input schema.

This checks for validity of interactions between parameters during Transform and raises an exception if any parameter value is invalid.

A typical implementation first verifies the schema change and parameter validity, including complex parameter interaction checks.
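
The setters, TransformSchema, and Transform described above might be combined as follows. This is a minimal sketch under stated assumptions: it needs a running Spark session with the Microsoft.Spark package, and the column names, sample rows, and numFeatures value are hypothetical. Treating the numeric storeId column as categorical is an illustrative choice, not something the reference prescribes.

```csharp
using System.Collections.Generic;
using Microsoft.Spark.ML.Feature;
using Microsoft.Spark.Sql;
using Microsoft.Spark.Sql.Types;

class FeatureHasherUsageSketch
{
    static void Main()
    {
        SparkSession spark = SparkSession.Builder()
            .AppName("FeatureHasherUsageSketch")
            .GetOrCreate();

        // Hypothetical input: one numeric, one numeric code, one string column.
        var schema = new StructType(new List<StructField>
        {
            new StructField("price", new DoubleType()),
            new StructField("storeId", new IntegerType()),
            new StructField("city", new StringType())
        });

        DataFrame input = spark.CreateDataFrame(
            new List<GenericRow>
            {
                new GenericRow(new object[] { 2.5, 1, "Oslo" }),
                new GenericRow(new object[] { 3.0, 2, "Lima" })
            },
            schema);

        FeatureHasher hasher = new FeatureHasher();
        hasher.SetInputCols(new List<string> { "price", "storeId", "city" });
        hasher.SetCategoricalCols(new List<string> { "storeId" });
        hasher.SetNumFeatures(1024); // a power of two, as advised
        hasher.SetOutputCol("features");

        // Validate first: this throws if the parameters and the input
        // schema do not fit together.
        StructType outputSchema = hasher.TransformSchema(input.Schema());

        // Adds the "features" column to the DataFrame.
        DataFrame output = hasher.Transform(input);
        output.Show();

        spark.Stop();
    }
}
```

Calling TransformSchema before Transform surfaces configuration mistakes, such as a misspelled input column, without running the transformation itself.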

Uid()

The UID that was used to create the object. If no UID is passed in when creating the object, a random UID is generated.

(Inherited from FeatureBase<T>)
