Descriptor for feature extractor.
Top-level feature function for extractor.
Descriptor for feature function.
Feature function type.
Feature function name.
Default argument for feature function.
Named parameters for feature descriptor.
Nested sub-feature function descriptors.
A Sentence contains the raw text contents of a sentence, as well as an analysis.
Identifier for sentence.
Raw text contents of the sentence.
Tokenization of the sentence.
Task input descriptor.
Name of input resource.
Name of stage responsible for creating this resource.
File format for resource.
Record format for resource.
Is this resource multi-file?
An input can consist of multiple file sets.
File pattern for file set.
File format for file set.
Record format for file set.
Task output descriptor.
Name of output resource.
File format for output resource.
Record format for output resource.
Number of shards in output. A nonzero value means the output is sharded. A value of -1 means the output is sharded but the number of shards is unknown. The files are then named 'base-*-of-*'.
Base file name for output resource. If the task component does not set it, the workflow engine sets it to a default value.
Optional extension added to the file name.
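The sharding rule for output resources can be sketched as follows. This is a minimal illustration, not the workflow engine's code: the function name `shard_file_names`, the five-digit zero padding, and appending the extension verbatim are all assumptions. The source states only the 'base-*-of-*' naming pattern.

```python
def shard_file_names(base, num_shards, extension=""):
    """Return file names (or a glob pattern) for an output resource.

    Hypothetical helper. The zero-padding width and the way the
    extension is appended are assumptions, not documented behavior.
    """
    if num_shards == 0:
        # Zero shards: the output is a single, unsharded file.
        return [base + extension]
    if num_shards == -1:
        # Sharded, but the shard count is unknown: only a pattern is available.
        return [f"{base}-*-of-*{extension}"]
    if num_shards < -1:
        raise ValueError("num_shards must be -1, 0, or a positive integer")
    # Known shard count: one file name per shard.
    return [
        f"{base}-{i:05d}-of-{num_shards:05d}{extension}"
        for i in range(num_shards)
    ]
```

For example, `shard_file_names("out", 2, ".rec")` yields `out-00000-of-00002.rec` and `out-00001-of-00002.rec` under these assumptions.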
A task specification describes the execution parameters of a task.
Name of task.
Workflow task type.
Task inputs.
Task outputs.
Task parameters.
A sentence token marks a span of bytes in the sentence text as a token or word.
Token word form.
Start position of token in text.
End position of token in text. This is the index of the last byte, not one past the last byte. If the token came from the lexer, the span excludes any trailing HTML tags.
Head of this token in the dependency tree: the id of the token which has an arc going to this one. If this token is the root of the sentence, the head is set to -1.
Part-of-speech tag for token.
Coarse-grained word category for token.
Label for dependency relation between this token and its head.
Break level for the token, indicating how it was separated from the previous token in the text.
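The token conventions are easy to get wrong in two places: the end position is an inclusive byte index, and a head of -1 marks the root. The sketch below illustrates both. The `Token` dataclass and its field names are hypothetical stand-ins for the generated message class, and it assumes that a token's id is its index in the sentence's token list.

```python
from dataclasses import dataclass


@dataclass
class Token:
    # Hypothetical stand-in for the token message; field names are assumed.
    word: str
    start: int      # byte index of the first byte of the token
    end: int        # byte index of the LAST byte (inclusive)
    head: int = -1  # index of the head token, or -1 for the root


def token_bytes(text: bytes, token: Token) -> bytes:
    """Slice the token's bytes; +1 because `end` is inclusive."""
    return text[token.start:token.end + 1]


def root_indices(tokens):
    """Return the indices of tokens whose head is -1 (sentence roots)."""
    return [i for i, t in enumerate(tokens) if t.head == -1]


# Offsets are in bytes, so the two-byte 'é' shifts later positions by one.
text = "Héllo world".encode("utf-8")
tokens = [
    Token("Héllo", start=0, end=5, head=-1),
    Token("world", start=7, end=11, head=0),
]
```

Slicing the encoded bytes rather than the decoded string matters whenever the text contains multi-byte characters, as the first token here shows.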
No separation between tokens.
Tokens separated by space.
Tokens separated by line break.
Tokens separated by sentence break.
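As a rough illustration of how the four break levels might be used, the sketch below rebuilds approximate text from token words and their break levels. The enum member names, their numeric values, and the separator chosen for each level are assumptions made for this example; the source describes only the meaning of each level.

```python
from enum import Enum


class BreakLevel(Enum):
    # Hypothetical names and values; only the four meanings come from the source.
    NO_BREAK = 0        # no separation between tokens
    SPACE_BREAK = 1     # tokens separated by space
    LINE_BREAK = 2      # tokens separated by line break
    SENTENCE_BREAK = 3  # tokens separated by sentence break


# Assumed separators; the original whitespace is not recoverable exactly.
SEPARATORS = {
    BreakLevel.NO_BREAK: "",
    BreakLevel.SPACE_BREAK: " ",
    BreakLevel.LINE_BREAK: "\n",
    BreakLevel.SENTENCE_BREAK: "\n\n",
}


def join_tokens(words, breaks):
    """Join words, placing before each word the separator for its break level.

    The break level of the first word is ignored because no token precedes it.
    """
    parts = []
    for i, (word, level) in enumerate(zip(words, breaks)):
        if i > 0:
            parts.append(SEPARATORS[level])
        parts.append(word)
    return "".join(parts)
```

With words `["Hello", ",", "world"]` and break levels space, none, space, this produces `Hello, world`.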