The Protocol Buffers files changed in the following 75 commits:
Commit: | bf03463 | |
---|---|---|
Author: | Shashank Mittal | |
Committer: | GitHub |
[GSOC] `hyperopt` suggestion service logic update (#2412) * resolved merge conflicts Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * fix Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * DISTRIBUTION_UNKNOWN enum set to 0 in gRPC api Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * convert parameter method fix Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> validation fix add e2e tests for hyperopt added e2e test to workflow * convert feasibleSpace func updated Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * renamed DISTRIBUTION_UNKNOWN to DISTRIBUTION_UNSPECIFIED Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * fix Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * added more test cases for hyperopt distributions Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * added support for NORMAL and LOG_NORMAL in hyperopt suggestion service Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * added e2e tests for NORMAL and LOG_NORMAL Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> sigma calculation fixed fix parse new arguments to mnist.py * hyperopt-suggestion example update Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * updated logic for log distributions Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * updated logic for log distributions Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * e2e test fixed Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * added support for parameter distributions for Parameter type INT Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * unit test fixed Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * Update pkg/suggestion/v1beta1/hyperopt/base_service.py Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com> Signed-off-by: Shashank Mittal 
<shashank.mittal.mec22@itbhu.ac.in> * comment fixed Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * added unit tests for INT parameter type Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * completed param unit test cases Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * handled default case for normal distributions when min or max are not specified Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * fixed validation logic for min and max Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * removed unnecessary test params Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * fixes Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * added comments Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * fix Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * set default distribution as uniform Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * line omit Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * removed empty spaces from yaml files Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> --------- Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
The documentation is generated from this commit.
Commit: | 2f5bda2 | |
---|---|---|
Author: | Shashank Mittal | |
Committer: | GitHub |
[GSOC] added Unknown distribution and convertDistribution in suggestion client (#2403) * added Unknown distribution and convertDistribution in suggestion client added unit tests Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> * removed custom compare func Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in> --------- Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in>
Commit: | ffc0058 | |
---|---|---|
Author: | Shashank Mittal | |
Committer: | GitHub |
added `Distribution` field to feasibleSpace in `api.proto` (#2397) Signed-off-by: Shashank Mittal <shashank.mittal.mec22@itbhu.ac.in>
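The three most recent commits above (#2397, #2403, #2412) all concern a distribution setting on the feasible space: the enum's zero value is reserved for the unspecified case, and an unspecified distribution falls back to uniform. The following is only a minimal sketch of that convention, not the project's code. The class name, function names, numeric values other than 0, and the `UNIFORM` member name are assumptions made for illustration; only the zero-valued `DISTRIBUTION_UNSPECIFIED`, `NORMAL`, `LOG_NORMAL`, and the uniform default come from the commit messages.

```python
from enum import IntEnum


class Distribution(IntEnum):
    # Hypothetical mirror of a proto enum. Only the role of the zero value and
    # the names UNSPECIFIED, NORMAL and LOG_NORMAL appear in the commit
    # messages; the other name and all non-zero numbers are assumed.
    DISTRIBUTION_UNSPECIFIED = 0
    UNIFORM = 1
    NORMAL = 2
    LOG_NORMAL = 3


def convert_distribution(name):
    """Map a user-facing name to the enum (hypothetical helper).

    Missing or unrecognized names map to the zero value instead of raising.
    """
    if not name:
        return Distribution.DISTRIBUTION_UNSPECIFIED
    key = name.strip().upper().replace("-", "_")
    return Distribution.__members__.get(
        key, Distribution.DISTRIBUTION_UNSPECIFIED
    )


def effective_distribution(dist):
    """Treat an unspecified distribution as uniform (hypothetical helper)."""
    if dist == Distribution.DISTRIBUTION_UNSPECIFIED:
        return Distribution.UNIFORM
    return dist
```

Reserving 0 for the unspecified case matters because proto3 treats 0 as the default for an enum field that was never set, so a reader can tell "not provided" apart from a deliberate choice and apply the fallback in one place.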
Commit: | 1c45521 | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | GitHub |
Cherry pick of #2350 #2355 #2357 #2344 #2358 #2360 into release-0.17 branch (#2362) * Fix TestReconcileBatchJob (#2350) * update Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * update Signed-off-by: forsaken628 <forsaken628@gmail.com> * update Signed-off-by: forsaken628 <forsaken628@gmail.com> * update Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * cleanup Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * update Signed-off-by: forsaken628 <forsaken628@gmail.com> * use gomock Signed-off-by: forsaken628 <forsaken628@gmail.com> --------- Signed-off-by: forsaken628 <forsaken628@gmail.com> Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> * Use cache-dependency-path in actions/setup-go for CI workflow (#2355) Signed-off-by: forsaken628 <forsaken628@gmail.com> Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> * Replace already closed github.com/golang/mock with go.uber.org/mock (#2357) * replace gomock Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * revert Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> --------- Signed-off-by: forsaken628 <forsaken628@gmail.com> Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> * Replace gRPC code generation tool from Znly/protoc to Buf (#2344) * Replace gRPC code generation tool from Znly/protoc to Buf Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * del build.sh Signed-off-by: forsaken628 <forsaken628@gmail.com> * cleanup Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix test Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 
<forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * refine Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * rm outter yaml Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> --------- Signed-off-by: forsaken628 <forsaken628@gmail.com> Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> * Upgrade the protobuf version to >=4.21.12,<5 (#2358) Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com> Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> * [SDK] Fix empty list for env variables and numpy version (#2360) * [SDK] Fix empty list for env variables Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> * Fix numpy version in tests Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> --------- Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> --------- Signed-off-by: forsaken628 <forsaken628@gmail.com> Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com> Co-authored-by: coldWater <forsaken628@gmail.com> Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
Commit: | 0d190b9 | |
---|---|---|
Author: | coldWater | |
Committer: | GitHub |
Replace gRPC code generation tool from Znly/protoc to Buf (#2344) * Replace gRPC code generation tool from Znly/protoc to Buf Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * del build.sh Signed-off-by: forsaken628 <forsaken628@gmail.com> * cleanup Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix test Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * refine Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> * rm outter yaml Signed-off-by: forsaken628 <forsaken628@gmail.com> * fix Signed-off-by: forsaken628 <forsaken628@gmail.com> --------- Signed-off-by: forsaken628 <forsaken628@gmail.com>
Commit: | 4a2db41 | |
---|---|---|
Author: | Johnu George | |
Committer: | GitHub |
Remove deprecated variable from GRPC definitions (#1994) * Update training operator image in CI * Remove deprecated GRPC var * Remove deprecated GRPC var * Remove deprecated GRPC var * Support for k8s v1.25 in CI * Revert "Support for k8s v1.25 in CI" This reverts commit 16e6fe4b16820aea30e266a5c69560a92cfb851c.
Commit: | 04ac975 | |
---|---|---|
Author: | a9p | |
Committer: | GitHub |
Population based training (#1833) * docs: update new algorithm service details * feat: trial augmentation strategy * feat: pbt suggestion service * feat: PbtTemplate and associated test image * feat: introduce annotation field to trial specifications * feat: trial assignment changes to support annotations from suggestion - Add new Annotation types to suggestion_types.go - Add Annotation object and update Trial parser in trial.py * feat: update pbt suggestion to use new Annotation api - Suggestion uses exact match to track spawned trials - Trials that get transmitted, but not created (or added to experiment) are added back to the respawn pool (population_size consistency) * chore: gofmt and black run across PBT changes * feedback: remove tf summary export, change default print unit, reduce range to be percentage compatible. * feedback: move PBT template to example. * feedback: changes to inject_webhook and utils. - Rename mutateVolume to mutateMetricsCollectorVolume - Add addContainerVolumeMount - Add getPrimaryContainerIndex * feedback: change suggestion mutation mount variable name and add to consts * feedback: Add trial_names to GetSuggestionsReply and change suggestion path to <experiment>/<trial> * feedback: removed unnecessary checks and moved to async pbt implementation * feedback: update trial name override location and change annotations override to labels. * feedback: add pbt to github workflow * feedback: move labels to ParameterAssignments in GetSuggestionsReply and cleanup pbt.yaml. * feedback: remove operator changes * feedback: GHA updates * feedback: new formatting changes * feedback: add suggestion-pbt to gh-actions build-load.sh. * fix: missing pbt->simple-pbt name changes, add simple-pbt to update-images.sh update yaml function (causing failing gha). * feedback: add pointer to website from main readme for pbt
Commit: | ab2f596 | |
---|---|---|
Author: | Yuki Iwai | |
Committer: | GitHub |
Include MetricsUnavailable condition to Complete in Trial (#1877) * include MetricsUnavailable condition to Complete in Trial It is not easy for users to find why Trial failed when training code output incorrect format logs since the trial-controller sets Succeeded condition with False to Trial if there are unavailable metrics in Katib DB as described in https://github.com/kubeflow/katib/issues/1343. So we also include MetricsUnavailable condition to Complete in Trial. * add gh-actions tasks to verify generated codes * fix gh-actions workflow * when the number of Failed Trials reaches maxTrialCount, experiment-controller sets Failed to Experiment status * fix e2e test * To avoid being set Failed in Experiment status when and is equal to 0, we need to add condition,
Commit: | 10051dc | |
---|---|---|
Author: | Yuki Iwai | |
Committer: | GitHub |
Implement validation for early stopping (#1709) * implement validation for early stopping * fix some documents * fix error messages * implement gRPC API to verify parameters for early stopping * review: use early_stopping as gRPC API Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> * review: fix error description Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com> * review: remove t.Run * review: remove condition to verify algorithmName for early stopping * remove description about updating gRPC API docs in kubeflow website Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Commit: | 16e0574 | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | GitHub |
Modify gRPC API with Current Request Number (#1728) * Modify API to current_request_number * Changes after review * Add request_number deprecated API * Fix test
Commit: | 698a9c6 | |
---|---|---|
Author: | Johnu George | |
Committer: | GitHub |
CherryPick: Reconcile semantics for Suggestion Algorithms (#1633) (#1644) * Reuse suggestions * Fix tests
Commit: | fe5963f | |
---|---|---|
Author: | Johnu George | |
Committer: | GitHub |
Reconcile semantics for Suggestion Algorithms (#1633) * Reuse suggestions * Fix tests
Commit: | ce12a89 | |
---|---|---|
Author: | Johnu George |
Reuse suggestions
Commit: | b65e4c3 | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | GitHub |
Fix gRPC manager build script (#1492) * Remove legacy gRPC REST * Remove gRPC Swagger and Makefile * Trigger CI
Commit: | 91e4996 | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | GitHub |
Remove v1alpha3 version (#1396) * Remove v1alpha3 files * Modify SDK * Change dict() to object
Commit: | 60f6c20 | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | GitHub |
Early Stopping Implementation (#1344) * Init commit * Experiment API changes * Add APIs * Remove old es apis * Remove logging * Add Early Stopping implementation * Show metrics for early stopped trials * Clean const * Remove pid file with completed line from metrics collector * Add unit test * Add EarlyStopping as unique service * Add early stopped Trials to completed * Fix few comments * Generate clients and SDK * Fix tests * Fix goptuna test * Remove cluster role from test * Update observation for early stopped Trials * Fix pv name in e2e test * Add Katib config for Early Stopping * Fix comment * Remove unused gRPC Experiment spec * Remove legacy test files * Remove labels and conditions from example * Modify API to be consistent with Algorithm * Ignore no such file error * Add comments to proto * Add median stop implementation * Fix few comments * Fix unit tests * Add o-type flag to tfevent metrics collector * Fix es unit test * Fix hyperband suggestion
Commit: | 27658a7 | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | GitHub |
GRPC: Rename Manager to DBManager service (#1279) * Rename Manager to DBManager in gRPC * Update git ignore
Commit: | 2179c16 | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | GitHub |
Rename algorithm_setting to algorithm_settings in manager (#1204)
Commit: | 6e7a1aa | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | GitHub |
Katib v1beta1 version (#1197) * Add v1beta1 version * Swagger for v1alpha3 and v1beta1 versions Fix format in bash scripts * Change make build to make buildv1alpha3 * Add folder path to python test * Fix folder in python test * Add goptuna and darts suggestions to check-katib-ready * Disable custom metrics collector e2e in v1beta1
Commit: | 92759fd | |
---|---|---|
Author: | Sarah Maddox | |
Committer: | Kubernetes Prow Robot |
Added version number and TODO descriptions to API proto. (#1017)
Commit: | 983583e | |
---|---|---|
Author: | Johnu George | |
Committer: | Kubernetes Prow Robot |
Delete v1alpha2 files (#953)
Commit: | 9d7164a | |
---|---|---|
Author: | Hougang Liu | |
Committer: | Kubernetes Prow Robot |
Remove unsed katib-manager-rest (#876)
Commit: | 1267a90 | |
---|---|---|
Author: | Hougang Liu | |
Committer: | Kubernetes Prow Robot |
Remove used manager message definition (#837)
Commit: | 69904e9 | |
---|---|---|
Author: | Hougang Liu | |
Committer: | Kubernetes Prow Robot |
Remove metrics in DB when delete trial (#830)
Commit: | d39865b | |
---|---|---|
Author: | Ce Gao | |
Committer: | Kubernetes Prow Robot |
feat: Remove useless APIs (#818) * feat: Remove useless APIs Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Remove Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix Signed-off-by: Ce Gao <gaoce@caicloud.io>
Commit: | f255c29 | |
---|---|---|
Author: | Johnu George | |
Committer: | Kubernetes Prow Robot |
Removing suggestions from manager interface (#772) * Removing suggestions from manager interface * Removing long running services * Increasing timeout to 60 sec
Commit: | 7df0955 | |
---|---|---|
Author: | Johnu George | |
Committer: | Kubernetes Prow Robot |
Adding status conditions for Suggestion CRD (#770) * Adding status conditions for Suggestion CRD * Fix tests
Commit: | 17d36c0 | |
---|---|---|
Author: | Ce Gao | |
Committer: | Kubernetes Prow Robot |
feat(GRPC): Replace trial with assignment (#767) * feat: Update API Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Replace trial with assignment Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test Signed-off-by: Ce Gao <gaoce@caicloud.io>
Commit: | 20f9b40 | |
---|---|---|
Author: | Ce Gao | |
Committer: | Kubernetes Prow Robot |
feat: Support HyperOpt (#753) * feat: Support HyperOpt Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Add cmd Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Add hyperopt example Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Update API Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add build Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: chmod Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove useless tag Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove SJTUG mirror Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add version in requirements Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Address comments Signed-off-by: Ce Gao <gaoce@caicloud.io>
Commit: | 6467615 | |
---|---|---|
Author: | Ce Gao | |
Committer: | Kubernetes Prow Robot |
feat(GRPC): Update API for Suggestion (#743) * feat: Fix the API Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Code generate Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove the implementation Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix CI Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix import Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test case Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove manager test Signed-off-by: Ce Gao <gaoce@caicloud.io>
Commit: | daacf9f | |
---|---|---|
Author: | Johnu George | |
Committer: | Kubernetes Prow Robot |
Katib v1alpha3 api implementation (#739) * v1alpha3 api implementation * fix jsonnet params * Adding v1alpha3 examples * Adding UI to builds
Commit: | ab0b4a7 | |
---|---|---|
Author: | Johnu George | |
Committer: | Kubernetes Prow Robot |
Refactor directory structure (#737) * Refactor directory structure * Fix tests
Commit: | b068809 | |
---|---|---|
Author: | Johnu George | |
Committer: | Kubernetes Prow Robot |
Delete v1alpha1 api (#734) * Delete v1alpha1 api * Removing modelstore
Commit: | 17dbca3 | |
---|---|---|
Author: | Hougang Liu | |
Committer: | Kubernetes Prow Robot |
Implement GetExperimentInDB (#558) * Implement GetExperimentInDB * Parse ErrNoRows error * Fix pod ready condition in test script * Add PreCheckRegisterExperiment
Commit: | 4c378bc | |
---|---|---|
Author: | Johnu George | |
Committer: | Kubernetes Prow Robot |
Added metric name to GetObservationLogRequest (#559) * Adding metric name to GetObservationLogRequest * regenerate mockdb
Commit: | 928c66a | |
---|---|---|
Author: | Hougang Liu | |
Committer: | Kubernetes Prow Robot |
Update trial status DB operation (#537)
Commit: | a8086c0 | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | Kubernetes Prow Robot |
Register Trial in DB (#530) * Register Trial in DB * Fix errors * Change Spec and Status for Trial and Experiment * Fix unit test * Fix condition in Register Experiment and Trial * Fix ut in manager * Fix Status in Experiment Config * Fix Experiment Spec in py test * Add trial status * Fix ut with trial status
Commit: | e4891e4 | |
---|---|---|
Author: | Guang Ya Liu | |
Committer: | Kubernetes Prow Robot |
Dep ensure to sync up vendor. (#535) * Enable remove un-used package. * Run dep ensure.
Commit: | f626d4b | |
---|---|---|
Author: | Ce Gao | |
Committer: | Kubernetes Prow Robot |
chore: Remove dep ensure in CI (#525) * chore: Move unit test before image building in v1alpha2 Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Move python suggestions build to dep Signed-off-by: Ce Gao <gaoce@caicloud.io> * chore: Move more layers to dep Signed-off-by: Ce Gao <gaoce@caicloud.io> * chore: Add dep and remove depensure in CI Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Run setup cluster later Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Address comments Signed-off-by: Ce Gao <gaoce@caicloud.io>
Commit: | a38806c | |
---|---|---|
Author: | Richard Liu | |
Committer: | Kubernetes Prow Robot |
Add metrics collector spec and objective spec to Trial (#489) * Add metrics collector spec to Trial spec * Fix e2e test * Move ObjectiveSpec definition to Trial CRD * Move common types * Move common types to its own package * Add metrics collector spec to DB
Commit: | 823fa9f | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | Kubernetes Prow Robot |
Get experiment config from the instance (#474) * Get experiment config from instance * Add parsing * Move getExperiment to util * Change objectmeta.name to name
Commit: | fd4c21c | |
---|---|---|
Author: | Richard Liu | |
Committer: | Kubernetes Prow Robot |
Add metrics collector spec to v1alpha2 API (#481) * Add metrics collector spec to v1alpha2 API * Delete metricsCollectorType * Fix * Fix unit test
Commit: | 2b55c69 | |
---|---|---|
Author: | Hougang Liu | |
Committer: | Kubernetes Prow Robot |
share one grpc-health-probe (#477)
Commit: | b886768 | |
---|---|---|
Author: | oshima | |
Committer: | Kubernetes Prow Robot |
v1alpha2 api server implementation (#456) * add v1-alpha2 api server implementation Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add filter argument to GetTrialList Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * rename filter to filter_by_name Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * revert filter_by_name to filter Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 7ef5594 | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | Kubernetes Prow Robot |
Update API for NAS in v1alpha2 (#450) * Update API for NAS in v1alpha2 * Fix name * Fix name in input size
Commit: | 3d4cd04 | |
---|---|---|
Author: | Johnu George | |
Committer: | Kubernetes Prow Robot |
Code restructuring to support V1alpha1 and V1alpha2 API (#448) * Code restructuring to support V1alpha1 and V1alpha2 API * Adding comments * Test package changes * Moving requirements file * Fix the package location * Renaming studyjobcontroller to katib-controller
Commit: | 1316bad | |
---|---|---|
Author: | oshima | |
Committer: | Kubernetes Prow Robot |
add v1alpha2 grpc api (#427) * add v1alpha2 grpc api Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * update gRPC API Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add v1alpha2 DB IF Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix typo, add doc and add todo for nasconfig Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * apply comments Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix typo Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * update proto Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * update Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 8f89ad4 | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | Kubernetes Prow Robot |
Add validation for NAS job in Katib controller (#398) * Initial commit * Add validation for NAS config * Fix validation * Add algorithmType in NasConfig validation * Add Discrete ParameterType to validation * Move validation to webhook Change GetJobType function Make a list with NAS algorithms * Add ValidateSuggestionParameters function in Katib API * Fix api * Add ValidateSuggestionParameters to Suggestion service * Change isValid to int32 * Create Validation function in NAS RL Suggestion service * Fix small problems * Reduce code inside Validation function * Add empty ValidateSuggestionParameters function in each HP service written in GO * Fix logging * Add ValidateSuggestionParameters to mock * Handle Unvailable error
Commit: | 4d031e7 | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | Kubernetes Prow Robot |
Add create time to Trial API (#410) * Add create time to Trial API * Add Trial create time information * Fix UT for db
Commit: | f11c13e | |
---|---|---|
Author: | Andrey Velichkevich | |
Committer: | Kubernetes Prow Robot |
Extend Katib API for NAS jobs (#327) * Add fields to studyjob structure * Change nasjob yaml file * Change parameter type * Add Parameter Type=range * Change API * Change input size * Reset API structure * Change StudyJob API structure * Remove Range parameter * Fix api.proto * Fix gopkg.toml * Remove old nasjob file * Fix nasjob.yaml * Add custom suggestion * Add blank NAS suggestion Change Katib API to process yaml file for NAS * Add correct YAML file for NAS example * Fix newline * Change StudyID to 1 * Add jobType parameter in Parsing * Remove changes in manager * Add NasConfig inside Yaml file * Fix name in nasConfig * Fix get StudyConfig in NAS * Add JobType in all services * Add job_type in bayesian_service * Add pointers in NasConfig structure * Fix Pointer in API * Add consts for jobType Remove return from populateCommonConfigFields * Move const jobType to const file * Remove Range parameter * Modify YAML file for NAS jobs * Add getStudyJobType function in GRPC server * Add blank GetStudyJobType func in manager * Fix metrics collector * Remove jobType from getStudy * Remove getStudyJobType from manager * Add NAS RL yaml deployment * Change worker to GPU * Clean nasrl suggestion * Add -u inside training-container * Fix namespace in worker template
Commit: | f78a108 | |
---|---|---|
Author: | Hougang Liu | |
Committer: | Kubernetes Prow Robot |
delete obsolete data in db (#315) * delete obsolete data in db * add delete study test * make sure trials and workers deleted when study deleted in ut test
Commit: | fae6aa5 | |
---|---|---|
Author: | Hougang Liu | |
Committer: | Kubernetes Prow Robot |
add bestTrialId to statusJob status (#312) * add bestTrialId to statusJob status * generate mock and add bestworkerid
Commit: | f24889c | |
---|---|---|
Author: | oshima | |
Committer: | Kubernetes Prow Robot |
Add api doc (#303) * add api doc Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix typo Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add instructions for update api files and docs Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix typo Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 36d8d25 | |
---|---|---|
Author: | Koichiro Den | |
Committer: | Kubernetes Prow Robot |
Implement gRPC Health Checking Protocol + add readiness/liveness probes to vizier-core (#270) * Ensure vizier-core never been stuck too long waiting for DB conn Signed-off-by: Koichiro Den <den@valinux.co.jp> * Add standard Health gRPC service Signed-off-by: Koichiro Den <den@valinux.co.jp> * Change db.New to return error instead of exit(1) with log.Fatal Signed-off-by: Koichiro Den <den@valinux.co.jp> * Add SelectOne() to VizierDBInterface Signed-off-by: Koichiro Den <den@valinux.co.jp> * Rename import for later convenience Signed-off-by: Koichiro Den <den@valinux.co.jp> * Implement and register Health Server for Katib manager Signed-off-by: Koichiro Den <den@valinux.co.jp> * Add readiness/liveness probes to vizier-core Signed-off-by: Koichiro Den <den@valinux.co.jp> * Update test codebase Fixes: 61ac5607353 ("Add SelectOne() to VizierDBInterface") Signed-off-by: Koichiro Den <den@valinux.co.jp>
Commit: | 0bc5182 | |
---|---|---|
Author: | oshima | |
Committer: | k8s-ci-robot |
New UI for Katib (#208) * add ui Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add ui Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * update test and doc Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * remove modelDB Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * refactor Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add loading img Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * Add loading image Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * refactor Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add root redirection Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add latestLog flag to GetWorkerFullInfo Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 81f2b74 | |
---|---|---|
Author: | Mayank Juneja | |
Committer: | k8s-ci-robot |
Add REST API using grpc gateway (#142) * dep ensure * add grpc-gateway via dep * update protobuf via dep ensure * update compiled go code, add reverse proxy * add REST entrypoint for manager * update API build script * use build script to generate code * remove binary file * update build, deploy scripts for REST API * change name * add manifests for core-rest * remove deploy * add comments * remove vendor * use Gopkg files from master * update Gopkg files * update Gopkg files * update proto files and protobufs * update build scripts and tests * copy vendor for tests * uncomment deploy * update image name * ignore vizier-core-rest for port forwarding * update build script * update manifests * Add docs for REST API * core review changes * remove service account
Commit: | 4085701 | |
---|---|---|
Author: | oshima | |
Committer: | k8s-ci-robot |
StudyJobController: Update worker status and fix status bug (#159) * mark complete after metrics reported Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * update worker status Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix save model bug Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * save models after completed Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 8f85e81 | |
---|---|---|
Author: | oshima | |
Committer: | k8s-ci-robot |
refactor studyjob CRD controller (#152) * refactor studyjob CRD controller Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix type Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * update mocks Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * update deploy and build script Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * Avoid duplication of suggestion request Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add RawTemplate for WorkerSpec Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 3c0499d | |
---|---|---|
Author: | oshima | |
Committer: | k8s-ci-robot |
Delete vendor dir (#153) * delete vendor directory Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * update .gitignore Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * update tests Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | e0bd5ee | |
---|---|---|
Author: | YujiOshima | |
Committer: | YujiOshima |
allow same study name on multiple jobs Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 5ea9c3b | |
---|---|---|
Author: | YujiOshima | |
Committer: | YujiOshima |
WorkerSpec contain only path for template, add comment Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 15e6bfc | |
---|---|---|
Author: | YujiOshima | |
Committer: | YujiOshima |
add StudyJobController CRD and Controller Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | cd0d27d | |
---|---|---|
Author: | YujiOshima | |
Committer: | YujiOshima |
update vendoring pkgs Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 8f22850 | |
---|---|---|
Author: | Vinay Kakade | |
Committer: | k8s-ci-robot |
Fix indent to spaces (#121)
Commit: | 14963b0 | |
---|---|---|
Author: | oshima | |
Committer: | k8s-ci-robot |
API: Add WorkerStatus to GetMetrics and remove unused items (#110) * update API Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add status to GetMetrics and delete unused item in API Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 0a95175 | |
---|---|---|
Author: | oshima | |
Committer: | k8s-ci-robot |
Refine API (#74) * Refine API Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add cli Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add grid demo Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add GKEDemo Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add mnist-models.yaml Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix docs Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * improve GKEdemo docs Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add more docs Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * enable get worker from trialid and add getParameterList from studyid Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix doc Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix test Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 71a2bd3 | |
---|---|---|
Author: | oshima | |
Committer: | k8s-ci-robot |
Cobra cli (#69) * add cobra to cli Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add doc Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix comment Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 3157a7a | |
---|---|---|
Author: | Ce Gao | |
Committer: | k8s-ci-robot |
*: Refactor the structure (#65) * cmd: Add CLI Signed-off-by: Ce Gao <gaoce@caicloud.io> * scripts: Move the scripts to the directory Signed-off-by: Ce Gao <gaoce@caicloud.io> * manager: Refactor Signed-off-by: Ce Gao <gaoce@caicloud.io> * mock: Refactor Signed-off-by: Ce Gao <gaoce@caicloud.io> * earlystopping: Refactor Signed-off-by: Ce Gao <gaoce@caicloud.io> * build.sh: Fix Signed-off-by: Ce Gao <gaoce@caicloud.io> * kubernetes: Remove Signed-off-by: Ce Gao <gaoce@caicloud.io> * suggestion: Refactor Signed-off-by: Ce Gao <gaoce@caicloud.io> * examples: Rename conf to examples Signed-off-by: Ce Gao <gaoce@caicloud.io> * api: Refactor Signed-off-by: Ce Gao <gaoce@caicloud.io> * *: Fix Signed-off-by: Ce Gao <gaoce@caicloud.io> * build.sh: Remove comments Signed-off-by: Ce Gao <gaoce@caicloud.io>
Commit: | 5b0929b | |
---|---|---|
Author: | oshima | |
Committer: | k8s-ci-robot |
refactor Model API (#51) * refactor Model API Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * fix ModelStore IF name Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 3ca2df0 | |
---|---|---|
Author: | oshima | |
Committer: | k8s-ci-robot |
Add Model Management API (#48) * update vendor Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add ModelStore API Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * refactor Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * refactor Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 5733bd8 | |
---|---|---|
Author: | oshima | |
Committer: | k8s-ci-robot |
add early stopping service (#41) * add early stopping service Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * delete debug message Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com> * add medianstopping to test script Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 830e2a7 | |
---|---|---|
Author: | oshima | |
Committer: | Ce Gao |
add test script and argo files (#22) Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | 293c6ba | |
---|---|---|
Author: | oshima | |
Committer: | Ce Gao |
update packages (#19) Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
Commit: | f38278f | |
---|---|---|
Author: | Ce Gao | |
Committer: | GitHub |
vendor: Add (#12) * dep: Add config Signed-off-by: Ce Gao <gaoce@caicloud.io> * vendor: Add Signed-off-by: Ce Gao <gaoce@caicloud.io> * dep: Remove useless constraint Signed-off-by: Ce Gao <gaoce@caicloud.io>
Commit: | 279b47e | |
---|---|---|
Author: | oshima | |
Committer: | Ce Gao |
add katib code (#4) Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>