Proto commits in ahmetaa/zemberek-nlp

These 40 commits changed the Protocol Buffers files:

Commit:d51279c
Author:Balkı

Bug Fix

The documentation is generated from this commit.

Commit:6cb7bbb
Author:Balkı

GRPC Go Client

Commit:e9d0a0b
Author:Balkı

GRPC Go Client

Commit:1a77aea
Author:Balkı

GRPC Language ID - Get Scores

Commit:f3457d8
Author:Mehmet

Update proto-related dependencies. Remove lite runtime from lexicon proto temporarily; apparently we need a different protoc for this now.

Commit:5244100
Author:aaa

Add `doNotSplitInDoubleQuotes` to preprocessing grpc service

Commit:cb0bd18
Author:ahmetaa

pre 0.16.0 work. remove stale php stuff. refactor python grpc stuff.

Commit:23db1ba
Author:aaa

Proto name changes

Commit:1f70700
Author:ahmetaa

grpc work. change simple_analysis proto to morphology proto. Add ZemberekGrpcConfiguation class. Only contains normalization related stuff for now. Remove initialization methods from normalization service.

Commit:089cdbc
Author:ahmetaa

more grpc work. remove analysis.proto. TODO: simple_analysis-> morholigical_analysis. normalization now applies noisy text normalization. later add spell checking service separately. Not sure about the current initialization mechanism.

Commit:48bcf67
Author:aaa

Add "Informal" Root attribute.

Commit:3d13b50
Author:ahmetaa

more grpc work.

Commit:e258ba8
Author:aaa

grpc exp continued.

Commit:ad7f139
Author:aaa

grpc exp

Commit:ead746d
Author:ahmetaa

grpc experiment

Commit:375662b
Author:ahmetaa

grpc experiment

Commit:475392d
Author:ahmetaa

Add optional token boundary information to Grpc preprocessing service. Add console application for grpc server.

Commit:41e441d
Author:aaa

Merge remote-tracking branch 'origin/master'

Commit:38d8df7
Author:aaa

Remove Root attribute `Special`. It is not used anymore.

Commit:1db7ecb
Author:Mehmet

Add simple normalization grpc service.

Commit:96ee5ce
Author:Mehmet

Add sentence extractor service. A few cosmetic fixes.

Commit:49aaaec
Author:Mehmet

Add basic preprocessing service with tokenize method. Fix a div by zero issue in langid (needs to be checked if fix is accurate)

Commit:cbbbbf3
Author:aaa

some changes in fasttext. change language_id proto file.

Commit:60054aa
Author:Mehmet

Separate Language Id Service.

Commit:e26740f
Author:ahmetaa

Add simple language detection to grpc module.

Commit:cc81cfd
Author:aaa

Add person-names dictionary after cleaning up. Improve disambiguation further by adding more data and shuffling training data after each iteration. Make some methods static in TurkishDictionaryLoader. Change name of the morpheme from "AfterDoing" to "AfterDoingSo" to make it more compatible. Remove "Plural" RootAttribute and use "ImplicitPlural". Change dictionaries and proto files accordingly. Add missing values to proto file.

Commit:ca8d2df
Author:mdakin

Add more fields to response in grpc AnalysisServer. Use strings instead of enums for now.

Commit:482775d
Author:mdakin

Add id to morphemedata.

Commit:0042b3e
Author:mdakin

Start adding real analysis result to grpc calls.

Commit:84c558d
Author:Mehmet

Try to add common protos to core, unfortunately does not work yet. For some reason proto includes fails, this is a problem with maven of course.

Commit:9292c9b
Author:mdakin

Simplify main pom, fix grpc proto

Commit:7232638
Author:mdakin

Remove submitted generated proto, various fixes to pom files, a skeleton grpc module.

Commit:23f51a5
Author:ahmetaa

Add PronunciationGuesser. It is used for guessing Turkish abbreviation pronunciations. Add PronunciationGuessed RootAttribute. Add a settable `strict` parameter to Syllable extractor.

Commit:8c0d183
Author:ahmetaa

sync proto. add some stuff back to TurkishAlphabet to unbreak stuff.

Commit:e4daa3c
Author:Ahmet

Add `referenceItem` mechanism to Serialization. Fix dictionary.

Commit:02a6462
Author:Mehmet

Synchronize proto enums with changes, add index to proto. Now can serialize / deserialize with no errors.

Commit:23a9b83
Author:Mehmet

Add a simple serializer - deserializer. Doesn't work 100% correctly yet. Performance is better but not great. Dictionary load time: 240ms RootLexicon creation time: 130ms.

Commit:be83d7d
Author:Mehmet

lite is ok for us (smaller generated code)

Commit:c939509
Author:Mehmet

Fixes to proto file and maven configs.

Commit:0f9289d
Author:Mehmet
Committer:Mehmet

Initial version of lexicon protos. Does not work yet.