• A preprocessing layer which encodes integer features.

    This layer provides options for condensing data into a categorical encoding when the total number of tokens is known in advance. It accepts integer values as inputs, and it outputs a dense representation of those inputs.

    Arguments:

    numTokens: The total number of tokens the layer should support. All inputs to the layer must be integers in the range 0 <= value < numTokens, or an error will be thrown.
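
    The range constraint on numTokens can be illustrated with a small standalone check. This is only a sketch of the documented rule: assertValidTokens is a hypothetical helper and not part of the library, and the error type the layer actually throws is not specified here.

```typescript
// Hypothetical helper (not a library function): enforces the documented
// constraint that every input is an integer with 0 <= value < numTokens.
function assertValidTokens(values: number[], numTokens: number): void {
  for (const v of values) {
    if (!Number.isInteger(v) || v < 0 || v >= numTokens) {
      throw new RangeError(
        `Token ${v} is outside the supported range 0 <= value < ${numTokens}`
      );
    }
  }
}

// With numTokens = 4, the valid tokens are 0, 1, 2 and 3.
assertValidTokens([0, 1, 3], 4); // passes silently
```

    With numTokens set to 4, an input containing 4 or -1 would be rejected, since the upper bound is exclusive.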

    outputMode: Specification for the output of the layer. Defaults to multiHot. Values can be oneHot, multiHot or count, configuring the layer as follows:

    oneHot: Encodes each individual element in the input into an array of numTokens size, containing a 1 at the element index. If the last dimension has size 1, the layer encodes along that dimension; otherwise, it appends a new dimension for the encoded output.

    multiHot: Encodes each sample in the input into a single array of numTokens size, containing a 1 for each vocabulary term present in the sample. Treats the last dimension as the sample dimension: if the input shape is (..., sampleLength), the output shape will be (..., numTokens).

    count: Like multiHot, but the int array contains a count of the number of times the token at that index appeared in the sample.
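
    The difference between the three output modes may be easier to see on a concrete sample. The sketch below re-implements the described behavior on plain arrays for a single 1D sample; the function names oneHotEncode, multiHotEncode and countEncode are hypothetical, and this is not the layer's actual implementation.

```typescript
// Illustrative plain-array versions of the three output modes.
// All names here are hypothetical; inputs are assumed to be valid tokens.

// oneHot: one row per element, with a 1 at the element's index.
function oneHotEncode(sample: number[], numTokens: number): number[][] {
  return sample.map((token) => {
    const row = new Array<number>(numTokens).fill(0);
    row[token] = 1;
    return row;
  });
}

// multiHot: a single array with a 1 for every token present in the sample.
function multiHotEncode(sample: number[], numTokens: number): number[] {
  const out = new Array<number>(numTokens).fill(0);
  for (const token of sample) {
    out[token] = 1;
  }
  return out;
}

// count: like multiHot, but each slot holds the number of occurrences.
function countEncode(sample: number[], numTokens: number): number[] {
  const out = new Array<number>(numTokens).fill(0);
  for (const token of sample) {
    out[token] += 1;
  }
  return out;
}

const sample = [0, 1, 1, 3];
console.log(oneHotEncode(sample, 4));   // [[1,0,0,0],[0,1,0,0],[0,1,0,0],[0,0,0,1]]
console.log(multiHotEncode(sample, 4)); // [1, 1, 0, 1]
console.log(countEncode(sample, 4));    // [1, 2, 0, 1]
```

    Token 1 appears twice in the sample, which multiHot records as a single 1 and count records as 2; token 2 never appears, so its slot stays 0 in both.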

    For all output modes, currently only output up to rank 2 is supported.

    Call arguments:

    inputs: A 1D or 2D tensor of integer inputs.

    countWeights: A tensor in the same shape as inputs indicating the weight for each sample value when summing up in count mode. Not used in multiHot or oneHot modes.
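
    The role of countWeights in count mode can be sketched on a 2D batch of plain arrays. The helper weightedCount below is hypothetical and only mirrors the description above: each occurrence of a token adds its weight, rather than 1, to that token's slot. It assumes countWeights, when given, has the same shape as inputs.

```typescript
// Illustrative only (hypothetical helper, not the layer's implementation):
// count-mode encoding of a 2D batch, optionally weighted per input value.
function weightedCount(
  inputs: number[][],
  numTokens: number,
  countWeights?: number[][]
): number[][] {
  return inputs.map((sample, i) => {
    const out = new Array<number>(numTokens).fill(0);
    sample.forEach((token, j) => {
      // Without weights, every occurrence contributes 1.
      const weight = countWeights ? countWeights[i][j] : 1;
      out[token] += weight;
    });
    return out;
  });
}

const inputs = [[0, 1, 1], [2, 2, 3]];
const weights = [[0.5, 1, 2], [1, 1, 4]];

console.log(weightedCount(inputs, 4));          // [[1,2,0,0],[0,0,2,1]]
console.log(weightedCount(inputs, 4, weights)); // [[0.5,3,0,0],[0,0,2,4]]
```

    The input shape (2, 3) becomes an output of shape (2, 4): the last dimension is replaced by numTokens, which keeps the result at rank 2.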

    Parameters

    • args: CategoryEncodingArgs

    Returns CategoryEncoding

