A coreference chain. These fields are not *really* optional. CoreNLP will crash without them.
the second element of position
A protobuf that allows passing in a document with basic dependencies to be converted to enhanced dependencies
The expected value of this is a regex which matches relative pronouns
A dependency graph representation.
A document; that is, the equivalent of an Annotation.
A peculiar field, for the corner case when a Document is serialized without any sentences. Otherwise
This field is for entity mentions across the document.
used to differentiate between null and empty list
XML information
Coref mentions for the entire document
A representation of an entity in a relation. This corresponds to the EntityMention, and more broadly the ExtractionObject classes.
inherited from ExtractionObject
Implicit uint32 sentence @see implicit in sentence
A protobuf for calling the java constituency parser evaluator from elsewhere
Repeated so you can send in k-best parses, if your parser handles that. Note that this already includes a score field.
A version of ParseTree with a flattened structure so that deep trees don't exceed the protobuf stack depth
An enumeration for the valid languages allowed in CoreNLP
A map from integers to strings. Used, minimally, in the CoNLLU featurizer
A map from strings to strings. Used, minimally, in the CoNLLU featurizer
An NER mention in the text
The seven informative Natural Logic relations
A Natural Logic operator
A syntactic parse tree, with scores.
The polarity of a word, according to Natural Logic
A quotation marker in text
A representation of a relation, mirroring RelationMention
inherited from ExtractionObject
Implicit uint32 sentence @see implicit in sentence
An OpenIE relation triple. Created by the openie annotator.
The surface form of the subject
The surface form of the relation (required)
The surface form of the object
The [optional] confidence of the extraction
The tokens comprising the subject of the triple
The tokens comprising the relation of the triple
The tokens comprising the object of the triple
The dependency graph fragment for this triple
If true, this expresses an implicit tmod relation
If true, this relation string is missing a 'be' prefix
If true, this relation string is missing a 'be' suffix
If true, this relation string is missing an 'of' suffix
A message for requesting a semgrex. Each sentence stores information about the tokens making up the corresponding graph. An alternative would have been to use the existing Document or Sentence classes, but it would then be ambiguous which dependency object to use.
The response from running a semgrex. If you pass in M semgrex expressions and N dependency graphs, this returns MxN nested results. Each SemgrexResult can match multiple times in one graph.
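To make the MxN shape concrete, here is a small sketch that uses plain Python lists in place of the real response message. The layout is an assumption made for illustration: the outer list is indexed by graph, the inner list by semgrex expression, and each cell holds zero or more matches. The dictionary keys are hypothetical.

```python
def match_counts(response):
    """Return a grid of match counts, one row per graph, one column per pattern.

    `response[g][p]` is assumed to be the list of matches of pattern p
    in graph g. This layout is a stand-in, not the actual message API.
    """
    return [[len(matches) for matches in per_graph] for per_graph in response]


# N = 2 graphs, M = 3 patterns. Pattern 2 matches twice in graph 0.
response = [
    [[{"match_index": 3}], [], [{"match_index": 1}, {"match_index": 5}]],
    [[], [], []],
]
counts = match_counts(response)
```

Every graph and pattern pair gets a cell even when nothing matched, so an empty list means "no match" rather than "not evaluated".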
The serialized version of a CoreMap representing a sentence.
The OpenIE triples in the sentence
The KBP triples in this sentence
The entailed sentences, by natural logic
The entailed clauses, by natural logic
Only needed if we're only saving the sentence.
Fields set by other annotators in CoreNLP
Useful when storing sentences (e.g. ForEach)
date of section
section index for this sentence's section
name of section
author of section
doc id
is this sentence in an xml quote in a post
check if there are entity mentions
check if there are KBP triples
check if there are OpenIE triples
quote stuff
the quote annotator can sometimes add merged sentences
speaker stuff
The speaker speaking this sentence
The type of speaker speaking this sentence
An entailed sentence fragment. Created by the openie annotator.
An enumeration of valid sentiment values for the sentiment classifier.
A Span of text
A Timex object, representing a temporal expression (TIMe EXpression). These fields are not *really* optional. CoreNLP will crash without them.
The serialized version of a Token (a CoreLabel).
Fields set by the default annotators [new CoreNLP(new Properties())]
the word's gloss (post-tokenization)
The word's part of speech tag
The word's 'value' (e.g., parse tree node)
The word's 'category' (e.g., parse tree node)
The whitespace/xml before the token
The whitespace/xml after the token
The original text for this token
The word's NER tag
The word's coarse NER tag
The word's fine-grained NER tag
listing of probs
The word's normalized NER tag
The word's lemma
The character offset begin, in the document
The character offset end, in the document
The utterance tag used in dcoref
The speaker speaking this word
The type of speaker speaking this word
The begin index of, e.g., a span
The end index of, e.g., a span
The begin index of the token
The end index of the token
The time this word refers to
Used by clean xml annotator
Used by clean xml annotator
The [primary] cluster id for this token
A temporary annotation which is occasionally left in
optional string projectedCategory = 25; // The syntactic category of the maximal constituent headed by the word. Not used anywhere, so deleted.
The index of the head word of this word.
If this is an operator, which one is it and what is its scope (as per Natural Logic)?
The polarity of this word, according to Natural Logic
The polarity of this word, either "up", "down", or "flat"
The span of a leaf node of a tree
The final sentiment of the sentence
The index of the quotation this token refers to
The coarse POS tag (used to store the UPOS tag)
Fields set by other annotators in CoreNLP
gender annotation (machine reading)
true case type of token
true case gloss of token
Chinese character info
Arabic character info
Section info
French tokens have parents
mention index info
mwt stuff
number info
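The relationship between a token's original text, its preceding whitespace, and its character offsets can be sketched as follows. This is a minimal illustration with a hypothetical `TokenSketch` dataclass rather than the real Token message, and it assumes that the end offset is exclusive, so that slicing the document text with the two offsets yields the token's original text.

```python
from dataclasses import dataclass


@dataclass
class TokenSketch:
    """Hypothetical stand-in for a serialized token, not the real message."""
    original_text: str
    before: str       # whitespace preceding the token
    begin_char: int   # character offset begin, in the document
    end_char: int     # character offset end (assumed exclusive)


def whitespace_tokens(text):
    """Split on whitespace while recording offsets into the original text."""
    tokens = []
    pos = 0
    for piece in text.split():
        begin = text.index(piece, pos)
        end = begin + len(piece)
        tokens.append(TokenSketch(piece, text[pos:begin], begin, end))
        pos = end
    return tokens


def rebuild(tokens, trailing=""):
    """Reassemble the document text from each token's whitespace and text."""
    return "".join(t.before + t.original_text for t in tokens) + trailing


text = "Stanford  CoreNLP rocks"
tokens = whitespace_tokens(text)
# Each offset pair slices the token's original text back out of the document.
slices_agree = all(text[t.begin_char:t.end_char] == t.original_text for t in tokens)
```

Keeping the surrounding whitespace on each token is what allows the original text to be rebuilt exactly from the tokens alone.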
The index of a token in a document, including the sentence index and the offset.
It's possible to send in a whole document, but we only care about the Sentences and Tokens
The result will be a nested structure: a repeated PatternMatch, one for each pattern. Each PatternMatch has a repeated Match, which tells you which sentence matched and where.
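A consumer will often want this nested result as flat rows. The sketch below uses plain dictionaries in place of the real messages. The keys `matches`, `sentence`, `begin`, and `end` are hypothetical names chosen for illustration, not the actual field names.

```python
def flatten_matches(pattern_matches):
    """Flatten nested pattern results into (pattern, sentence, begin, end) rows.

    `pattern_matches` is assumed to hold one entry per pattern, each with a
    "matches" list saying which sentence matched and where.
    """
    rows = []
    for pattern_index, pattern_match in enumerate(pattern_matches):
        for match in pattern_match["matches"]:
            rows.append(
                (pattern_index, match["sentence"], match["begin"], match["end"])
            )
    return rows


# Two patterns: the first matches in sentences 0 and 2, the second never matches.
response = [
    {"matches": [{"sentence": 0, "begin": 1, "end": 3},
                 {"sentence": 2, "begin": 0, "end": 2}]},
    {"matches": []},
]
rows = flatten_matches(response)
```

The pattern index is carried into every row because, once flattened, the position in the outer list is no longer available to identify which pattern produced a match.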