Proto commits in stanfordnlp/stanza

The Protocol Buffers files changed in these 23 commits:

2024-02-24

Commit:	8a6c543
Author:	John Bauer	2024-02-16 00:26:02 -0800
Committer:	John Bauer	2024-02-23 23:41:20 -0800

Update constituency evaluation to accommodate the per-tree f1 (this will be sent back via the proto in the next version of CoreNLP after 4.5.6)

The documentation is generated from this commit.
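For readers unfamiliar with the term, a per-tree F1 in commit 8a6c543 above can be pictured as a bracket F1 computed separately for each gold/predicted tree pair rather than pooled over the whole treebank. The sketch below is only an illustration under that assumption: the function names and the (label, start, end) bracket representation are hypothetical, and the actual CoreNLP evaluation may count brackets differently.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: a bracket is (label, start, end).
Bracket = Tuple[str, int, int]


def bracket_f1(gold: Iterable[Bracket], predicted: Iterable[Bracket]) -> float:
    """F1 of labeled brackets for a single tree.

    Simplification: brackets are compared as sets, so repeated
    identical brackets in one tree are counted once.
    """
    gold_set, pred_set = set(gold), set(predicted)
    matched = len(gold_set & pred_set)
    if matched == 0:
        return 0.0
    precision = matched / len(pred_set)
    recall = matched / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def per_tree_f1(gold_trees: List[List[Bracket]],
                predicted_trees: List[List[Bracket]]) -> List[float]:
    """One F1 score per gold/predicted tree pair, in input order."""
    return [bracket_f1(g, p) for g, p in zip(gold_trees, predicted_trees)]


gold = [("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3)]
pred = [("S", 0, 3), ("NP", 0, 2), ("VP", 1, 3)]
scores = per_tree_f1([gold], [pred])
```

Here two of the three brackets match, so precision and recall are both 2/3 for the single tree.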

2023-10-17

Commit:	9c53cfa
Author:	John Bauer	2023-10-16 07:38:24 -0700
Committer:	John Bauer	2023-10-16 18:22:41 -0700

Clean up some whitespace in the protobuf definitions

Commit:	7efd4fb
Author:	John Bauer	2023-10-15 20:12:00 -0700
Committer:	John Bauer	2023-10-16 18:22:41 -0700

Add the fields needed to send empty nodes in protobufs as part of the enhanced dependencies from UD

2023-04-11

Commit:	69a1c60
Author:	John Bauer	2023-04-10 17:01:41 -0700
Committer:	John Bauer	2023-04-10 23:19:54 -0700

Ssurgeon now passes the misc column for MWT as well as words to CoreNLP. Only does anything useful if there is a new CoreNLP release, but in the meantime it doesn't crash or throw away the whitespace markings, at least. Adds a commented out test of the mwt_misc column

2023-03-07

Commit:	fd8b524
Author:	John Bauer	2023-02-27 23:28:11 -0800
Committer:	John Bauer	2023-03-07 00:09:47 -0800

Copy two functionalities from an updated CoreNLP proto: send named edges back as part of a semgrex search & add an Ssurgeon request/response

2023-01-11

Commit:	0987794
Author:	John Bauer	2023-01-09 15:48:32 -0800
Committer:	John Bauer	2023-01-11 08:01:20 -0800

Add an interface for the CoreNLP conversion from English constituencies to dependencies. Only works for English. Not currently unit tested (obviously tested during development) because it requires a new CoreNLP release first. Return the doc after processing, which makes it more pipelineable

2022-10-22

Commit:	17b8d03
Author:	John Bauer	2022-10-22 11:17:40 -0700
Committer:	John Bauer	2022-10-22 13:47:22 -0700

Update corenlp.proto with definitions that will connect to the Morphology annotator

2022-08-03

Commit:	f660b95
Author:	John Bauer	2022-08-03 01:00:27 -0700

Add the graphIndex and semgrexIndex from CoreNLP 4.5.0 to make the semgrex interface a bit more readable (hopefully)

2022-04-23

Commit:	fc68b55
Author:	John Bauer	2022-01-10 15:28:58 -0800
Committer:	John Bauer	2022-04-22 21:36:46 -0700

Add tsurgeon interface to the python/corenlp interface. Includes a context manager for tsurgeon. Add a unit test for tsurgeon

Commit:	f15c38b
Author:	John Bauer	2022-01-10 13:19:18 -0800
Committer:	John Bauer	2022-04-22 21:36:46 -0700

Add the kbestF1 field to the parser eval

2022-01-13

Commit:	07a63bf
Author:	John Bauer	2022-01-10 15:28:58 -0800
Committer:	John Bauer	2022-01-13 00:42:34 -0800

Add tsurgeon interface to the python/corenlp interface. Includes a context manager for tsurgeon. Add a unit test for tsurgeon

2022-01-10

Commit:	2134ae8
Author:	John Bauer	2022-01-10 13:19:18 -0800
Committer:	John Bauer	2022-01-10 13:16:36 -0800

Add the kbestF1 field to the parser eval

2021-09-24

Commit:	9031802
Author:	John Bauer	2021-06-18 12:37:14 -0700
Committer:	John Bauer	2021-09-24 16:40:05 -0700

Constituency parser based on word embeddings to create Trees out of a sequence of words.
This is a squash of what was originally a long list of changes. See 2f7db846e14ce73ca95416172b1ba5ba512821f5 for the original sequence.
Primary methods are either top-down or in-order transition sequences, as per this paper: In-Order Transition-based Constituent Parsing, Jiangming Liu and Yue Zhang.
Parser eval interface which calls the CoreNLP parser eval.
Model is based on LSTMs.
Includes a treebank evaluation request to CoreNLP via a protobuf.
Has options to use a variety of small modifications to the models.
Constraints on the transitions hopefully prevent the parser from getting stuck.
Allow either adadelta or sgd as optimizer.
Allow choice of relu or tanh for nonlinearity.
Includes a bunch of tests.
Move the constituency tests into their own directory.
Defaults are set to reasonable values for the WSJ PTB.
Lots of effort put into bulk operations instead of doing single transitions at a time.
Saves the optimizer state when saving a model. Makes the model much larger, but allows for restarting training from the same optimizer.
Also, a mode to remove the optimizer from a model (which shrinks it).
Uses a mechanism similar to the original implementation to avoid too many "unary" transitions, e.g. an open immediately followed by a close. However, some training trees have too many unary transitions for the original limit=3 to be sufficient.
Charlm integration, including batching, although that didn't seem to help.
Also has some doc on things which didn't help.

2021-04-18

Commit:	450041b
Author:	John Bauer	2020-08-12 16:30:52 -0700
Committer:	John Bauer	2021-04-18 16:31:20 -0700

Add protobuf for an enhancement request

2021-03-30

Commit:	2c528e8
Author:	John Bauer	2021-03-30 13:10:52 -0700

Updated proto file, including a tokensregex interface

2020-07-29

Commit:	48ee26d
Author:	John Bauer	2020-07-14 16:04:51 -0700
Committer:	John Bauer	2020-07-28 18:11:56 -0700

Python interface to the semgrex processor

2020-07-14

Commit:	8a90cf2
Author:	John Bauer	2020-07-14 16:03:17 -0700

Transfer a couple proto updates from corenlp

2020-04-16

Commit:	08c5de0
Author:	John Bauer	2020-04-16 14:16:53 -0700

Update a couple more uint32->int32 to be compatible with the CoreNLP use of these fields, where -1 can represent null
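The reason for the uint32->int32 change can be illustrated without protobuf itself: an unsigned 32-bit field has no way to carry -1, so a "null" sentinel of -1 either fails to encode or comes back as a large positive number. The following is a minimal sketch using Python's struct module as a stand-in for a fixed-width integer field; the helper names and the NOT_PRESENT constant are assumptions made for illustration and are not part of stanza or CoreNLP.

```python
import struct

# Sentinel assumed for illustration: -1 meaning "no value".
NOT_PRESENT = -1


def fits_uint32(value: int) -> bool:
    """True if value can be stored in an unsigned 32-bit integer."""
    try:
        struct.pack("<I", value)
        return True
    except struct.error:
        return False


def fits_int32(value: int) -> bool:
    """True if value can be stored in a signed 32-bit integer."""
    try:
        struct.pack("<i", value)
        return True
    except struct.error:
        return False


def reinterpret_as_uint32(value: int) -> int:
    """Read the bits of a signed 32-bit value as if they were unsigned."""
    return struct.unpack("<I", struct.pack("<i", value))[0]


signed_ok = fits_int32(NOT_PRESENT)
unsigned_ok = fits_uint32(NOT_PRESENT)
wrapped = reinterpret_as_uint32(NOT_PRESENT)
```

With a signed field the sentinel survives as -1; read as unsigned, the same bits become 4294967295, which a consumer could mistake for a real index.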

2020-04-06

Commit:	43ec748
Author:	John Bauer	2020-04-06 10:17:34 -0700

Update CoreNLP.proto to use int32 for some annotators which use that to signify 'not present'

2020-03-13

Commit:	1ab2a2e
Author:	John Bauer	2020-03-12 17:42:41 -0700

Update corenlp protocol buffer to the candidate version for corenlp 4.0.0

2018-04-11

Commit:	f15677b
Author:	Sina	2018-04-11 13:22:55 +0100

Update to 3.9.1

2017-06-10

Commit:	d05093e
Author:	Arun Tejasvi Chaganty	2017-06-09 19:41:40 -0700

Updated tests and MANIFEST

2017-04-28

Commit:	6755b5f
Author:	Arun Tejasvi Chaganty	2017-04-28 00:53:35 -0700

Initialized with protobuf and tests