Proto commits in cogment/cogment-verse

The Protocol Buffers files changed in these 46 commits:

Commit:ebff5bf
Author:William Duguay
Committer:GitHub

Bump and clean dependencies (#187) * update requirements * update requirements and test experiments * remove supersuit installation instructions * correct PettingZoo references * gitlab test * full gymnasium support * self review * linters

The documentation is generated from this commit.

Commit:38f35d0
Author:wduguay-air

merge 142-bump-dependencies

Commit:ce70763
Author:wduguay-air

merge main

Commit:f68ca7f
Author:wduguay-air

full gymnasium support

The documentation is generated from this commit.

Commit:bd95827
Author:Clodéric Mars
Committer:GitHub

Centralize a bunch of boilerplate code in shared [environment / actor / sample producer] session (#178) * Introduce the session helper in the behavior cloning code * Transparent addition of the session helpers as mixin * Classic pettingzoo environment using the helper * tutorials * updated tutorial doc * simple_dqn * td3 * sac * appo * ppo_atari * isaac gym and overcooked_ai * linters * test hydra config composition before full runs * linters * Naming + Documentation * Fix tutorials * remove web proto files * revert gitignore --------- Co-authored-by: wduguay-air <william@ai-r.com>

Commit:b8eaba2
Author:William Duguay
Committer:GitHub

Add Typescript and Rebuild web app (#185) * add typescript and rebuild web app * remove untracked files accidentaly tracked * gitignore * removed tracked files to untrack * update gitignore

Commit:607a35b
Author:wduguay-air

fix multi env spec support

Commit:7651033
Author:wduguay-air

fix web app

Commit:dd604d1
Author:Clodéric Mars
Committer:GitHub

Extract environments frontend from the main SDK (#174)

Commit:91c4380
Author:wduguay-air

base integration of hf hub actor

Commit:33acefa
Author:Clodéric Mars
Committer:GitHub

Update readme & license (#169) * Update the doc and citation file * Update license

Commit:922556f
Author:Clodéric Mars
Committer:Clodéric Mars

WIP

Commit:64d6e6d
Author:William Duguay
Committer:GitHub

Migrate to Model Registry V2 (#144) * space serialization * web app * linters * linters * linters * gitignore * model_registry_v2 and smoke tests * remove smoke tests * remove changes from Smoke test PR * remove python-version * self-review * self-review * self-review * linters * linters * use pysdk model_registry_v2 * remove old model registry * fix config * fix config * self review * adapt new actors * lintersd * fix pong impl * soft-actor critic test config * rename model_version to model_iteration * renaming * track latest model changes * address PR comments * latest model * fix model tracking. Bump cogment. Fix dependencies with SuperSuit

Commit:f03b619
Author:Luong-Ha Nguyen
Committer:GitHub

Multi-agent RL for Petting Zoo (#135) * Single python entry point & hydra based configuration * Introducing DQN * Add support for petting zoo classic environment - action mask - turn based play - connect four UI * Self play and HILL DQN training for connect four * Update readme with missing dependencies * Fix issue in the mountain car bc experiment conf * Fix bug for linux * Using cogment 2.5.0 * add debugger for docs in the next branch (#82) * add debugger for docs in the next branch * correct the typo * Pytorch multiproc fix (#81) * Fix config bug * Fix pytorch multiproccessing bug * Move torch specific multiprocessing config to torch imports * Add space to pass license test Co-authored-by: saikrishnagv_1996 <saikrishnagv1996@gmail.com> * fix SimpleQueue issue (#83) * fix SimpleQueue issue * add a2c run config and reference in experiment * ToDos * black formatting * fix override_run in simple_a2c/cartpole * fix log_metric bug (#86) * Upgrading Cogment and Gym (#87) * Dev ppo (#91) * add ppo for continuous actions * fix lint * remove debug folder * fix lint * fix lint * refactor gym env wrapper * latest code * multi envs with a single agent for ppo * remove debugger * latest code * clean code * fixed lint * remove time profiler results * uncomment the render option for gym * revised version according to the reviewers feedback * fixe bugs in gyms (#98) * remove duplicates in gym adapter * Hot fix (#100) * remove duplicates in gym adapter * remove duplicates in gym adapter * fix lint * TD3 (#94) * TD3 * add exploration noise, black format * remove unused imports * new test * hacky fix to sample_space * add random action * current * fix shape bug in critic loss, add extra model params * current * converrging rewards afrer 400K time steps * black formatting * fix pylint issues * chill out pylint * disable pylint in td3.py * Isaac gym integration (#84) * isaac adapter, configs * add isaacgym requirements * arrange imports * add isaacgym installation instructions in 
README * fix import issue * edit requirements, readme, mlflow port * fixes * hydra config * add extra instructions for isaac gym * add full readme instructions * black format * Update config.yaml * Update import_class.py * Update environment.py * remove unused imports * disable pylint * pylint * pylint test * fixed some issues for isaac gym * black format * fixed some bugs * fix vugs (#102) * fix vugs * fix gym adapter Co-authored-by: Luong-Ha Nguyen <ha@Luong-Has-MBP.air> * black formatting * fix pylint * pylint * handle both discrete and continous actions * no cuda imposition in simple_a2c * config.environment_specs * pylint fix in TD3 * modified random action size * put replaybuffer back * fixed lint Co-authored-by: Luong-Ha Nguyen <ha@Luong-Has-MacBook-Pro.local> Co-authored-by: Luong-Ha Nguyen <ha@Luong-Has-MBP.air> Co-authored-by: Luong-Ha Nguyen <luongha.nguyen@gmail.com> * add soft actor critic * add sac actor * add SAC optimizer * add working pipeline * fixed sac policy formulations * fixed bugs in SAC actor * fixed license * set separate learning rate for policy and value networks * add new set of hyperparameters for SAC * Adapt PPO to petting zoo atari * Add training part for petting zoo pong * Add UI test * Resolve conflits * Add selfplay petting zoo * Add working UI for pong * Add hill ppo for petting zoo * Add control keyboards for petting zoo pong * Add feedback icons * Add colored buttons for UI * Refactor UI for human feedback * Add data buffer for human data and update README for petting zoo * Fix typos in README * Fix lint * Add player name for human feedback UI * Add neuralfeedback and feedback frequency to UI * Defaulting training device to CPU * Take all observations of an episode for training (pz pong) * Fix typo * Fix bugs in loglikehood * Add dqn for petting zoo * Fit UI to screen size and Modify data collection for ppo * simple PPO * hyperparams * Add new sets of hyperparameters for pong * Add model registry for videos * Modify 
hyperparameter values * Adapt to new main branch * Remove component folders * Adapt PPO to gym spaces * Adapt petting zoo to cogverse * Remove merged file * Fix lint * Fix license * fix lint * fix black * delete trained model for pong * Modify unitest * Fix serialization format * Speccify the serialization format for unitest * Modify the code based on the review * Fix typos in readme * change run command to console * Fix typo developpment setup * Add hydra in requirements * Remove duplicated hydra * Remove redundant comments * Add evaluator class and handling random seed * Remove unsed import * fix: modify config experiment for ppo * refactor: modify the config file for improving unittest * refactor: unittest config file --------- Co-authored-by: Clodéric Mars <cloderic@ai-r.com> Co-authored-by: Josh <josh@ai-r.com> Co-authored-by: joshair <109359509+joshair@users.noreply.github.com> Co-authored-by: saikrishnagv_1996 <saikrishnagv1996@gmail.com> Co-authored-by: vabdollahi <vahid@ai-r.com> Co-authored-by: Luong-Ha Nguyen <luong.ha.nguyen@notostechnologies.com> Co-authored-by: Luong-Ha Nguyen <ha@Luong-Has-MacBook-Pro.local> Co-authored-by: Luong-Ha Nguyen <ha@Luong-Has-MBP.air>

Commit:f338c29
Author:wduguay-air
Committer:GitHub

Support Gym MultiDiscrete MultiBinary (#130) * space serialization * web app * linters * linters * linters * js space serialization and inspector

Commit:24d80ee
Author:wduguay-air
Committer:GitHub

117 Add int32 to serialization data types (#118) * fix * fix * test * address review comments * linter * rebuild web app * address review comments --------- Co-authored-by: Clodéric Mars <cloderic@ai-r.com>

Commit:a1483cf
Author:Clodéric Mars
Committer:Clodéric Mars

Directly using and serializing gym.spaces and their value (#116) * Directly using and serializing gym.spaces and their value * Introducing debug inspector of received observation on the web side * Take into account review

Commit:cc4343b
Author:Clodéric Mars
Committer:Clodéric Mars

[BREAKING] Single python entry point & hydra based configuration

Commit:a2311bf
Author:Clodéric Mars
Committer:Clodéric Mars

Introduce a lobby + fix multiuser joining + instruction for deploying using a tunnelling system (#109) * Fix missing constructor parameters * Introduce a trial lobby * Fix copyright notices * Adding documentation for ngrok based tunnel * Take into account review

Commit:6a166a1
Author:Clodéric Mars
Committer:GitHub

Introduce a lobby + fix multiuser joining + instruction for deploying using a tunnelling system (#109) * Fix missing constructor parameters * Introduce a trial lobby * Fix copyright notices * Adding documentation for ngrok based tunnel * Take into account review

Commit:4212224
Author:Clodéric Mars
Committer:Clodéric Mars

Add support for petting zoo classic environment - action mask - turn based play - connect four UI

Commit:68953eb
Author:Clodéric Mars
Committer:Clodéric Mars

Introducing DQN

Commit:d949a1c
Author:Clodéric Mars
Committer:Clodéric Mars

Single python entry point & hydra based configuration

Commit:46e1a9d
Author:Clodéric Mars
Committer:GitHub

Add continuous action space web client for lunar lander (#65) * Introduce `Space` definitions, use in environment specs * Streamline the human player implementation in the web client * Implementing visual Joystick and DPad for lunar lander - Would be easy to add support for other games * Add some styling powered by tailwind * Fix lint issues * Fix remaining formatting issues * Update environment_adapter.py * flatten dimensions * black * Revert "black" This reverts commit 5be497969f72f9c75e13d1dff82babcf51cec5d3. * format agent adapter * Fix format and improve readme Co-authored-by: saikrishnagv_1996 <saikrishnagv1996@gmail.com>

Commit:2fd1f99
Author:saikrishna-1996

no contrainer hive

Commit:6eb691d
Author:Clodéric Mars
Committer:GitHub

Migration to cogment 2.2 - No container (#61)

Commit:c23958a
Author:saikrishnagv_1996
Committer:GitHub

Dev sb3 (#54) * Add hugging face - stable baselines 3

Commit:7cc93e8
Author:saikrishnagv_1996

add data proto

Commit:eec3326
Author:Sagar Kurandwad
Committer:GitHub

Selfplay RL (#45) * Defining multiple environment implementations * add procgen dependency * basic procgen wrapper * Example run config for procgen * add documentation * remove unnecessary whitespace * procgen webclient support * add controls for remaining procgen environments * fix control description * added basic adapter for pybullet driving environment * resolving merge conflicts * added pybullet, resolved import error * required changes to data.proto and training_run.py * solved parameters problem, error while setting joint control in car.py * check git creds * fixed parallel runs issue * selfplayRL * training run changes * reun params * data.proto update * run_params * class name changes * sample_producer * selfplay_agent * refactor * add selfplay_td3 to env * bug fixes * refactoring * change num_players * env cahnges: * pybullet env integration * env switch turns b/w bob and alice; end of trial flag * switch turns bob and alice * switch turns bob and alice * adding turns b/w agents * adding turns b/w agents * turn based agent actions * cleanup * sample producer * get SARSD for both agents * sample producer * debugging * alie rewards * agent implementation * model implementation * add action_dims * replaybuffer * training * learning * agent training and test * rebase * change mlflow port * environment mode * adding exception * updae bob and alice order in trials * hyperparameters and port updates * readme update * cleanup * cleanup * cleanup * cleanup * Licenses * cleanup and pylynt fixes * pylynt changes * pylint changes * license checker * license checker * pylint changes Co-authored-by: Clodéric Mars <cloderic@ai-r.com> Co-authored-by: Jonathan Fisher <jonathan@ai-r.com> Co-authored-by: Kharyal <chaitanyajee@gmail.com> Co-authored-by: saikrishna-1996 <saikrishnagv1996@gmail.com>

Commit:2f754c6
Author:Jonathan Fisher

rework muzero networks

Commit:e0a13df
Author:Jonathan Fisher
Committer:GitHub

Muzero (#32) * Basic MuZero implementation * Update PyTorch version * Improve test coverage

Commit:f54979a
Author:Clodéric Mars
Committer:GitHub

Introduce a `play` run to execute (and observe) a few trials using any actor implementation (#41) * Extract environment spec configuration to its own message * Using environment params in all run configs * Introduce a 'play' run and a 'random' agent implementations * Factorizing the environment specs * Moving the number of player in an environment to the specs * Fix docker compose * Further linter fix * Make sure client containers are removed after each call * Make better usage of model and version user data when saving/loading the models * Add the notion of role to the human actor * Fix python base test * Add documentation

Commit:5a42655
Author:air-sara
Committer:GitHub

2.0 (#38) Update to Cogment 2.0

Commit:df637e6
Author:Jonathan Fisher
Committer:GitHub

Behavior Cloning Tutorial + Development Mode (#36) * simple BC functional * Lander, cartpole, & mountaincar working * add doc for simple BC * fix lint issue * Change the default mlflow port to 3000 macOS Monterey nos uses port 5000 by default for its AirPlay server * Introduce dev version of the web_client - with sync with local files - + proper production version * Minor refactors * Tutorial steps * Update and unify the poetry installations * Introduce a 'development' mode * Add instructions to build the client when needed * Fix HILL environment rendering * Update simple_bc.md * docker tzdata noninteractive fix * Update README.md * Making the dev mode more resilient * Fix the retrieval of the atari roms * Pinning grpcwebproxy version * Make sure that environment and tf_agents are interrupted properly Co-authored-by: Clodéric Mars <cloderic@ai-r.com> Co-authored-by: saikrishnagv_1996 <sai@ai-r.com> Co-authored-by: saikrishnagv_1996 <saikrishnagv1996@gmail.com>

Commit:0dbf54c
Author:Air-sara
Committer:Air-sara

2.0!

Commit:bb0591c
Author:saikrishnagv_1996
Committer:GitHub

Merge branch 'main' into dev-hive

Commit:7eefa64
Author:saikrishnagv_1996

current version

Commit:c070a1c
Author:Jonathan Fisher
Committer:GitHub

Add procgen environments (#28) * Defining multiple environment implementations * add procgen dependency * basic procgen wrapper * Example run config for procgen * add documentation * procgen webclient support Co-authored-by: Clodéric Mars <cloderic@ai-r.com> * add link to details * fix typo * add link to README * Run black linter * fix linter complaints Co-authored-by: Clodéric Mars <cloderic@ai-r.com>

Commit:cc57725
Author:saikrishnagv_1996

update atari config, data proto, fixes in td3, training run

Commit:96f7f2b
Author:Clodéric Mars
Committer:Clodéric Mars

Defining multiple environment implementations

Commit:be12619
Author:Sagar Kurandwad
Committer:GitHub

Reinforce Refactor (#18) * tune hyperparameters * remove probability regularizer * refactor learning * refactor learning * refactor init_model and remove epsilon and epsilon schedule * refactor reinforce and agent adapter * cleanup * cleanup * cleanup * cleanup * refactoring * cleanup * removing double initialization of model * rollout cleanup * cleanup * rollout cleanup * sample producer cleanup * sample producer cleanup * sample producer cleanup * sample producer cleanup * remove torch * cleaning load * removing lr schedule * remove legal_moves * replaybuffer cleanup * remove model_params * replay buffer cleanup * removing third party * cleanup * reinforce docs * remove equation * remove equation * remove equation * apche liceses * new run results * Update REINFORCE.md * Update index.md Co-authored-by: Clodéric Mars <cloderic@ai-r.com>

Commit:0486fef
Author:Vincent ROBERT

Support pipe world in the cogver environment

Commit:e8a5474
Author:Clodéric Mars
Committer:GitHub

Upgrade to cogment-py-sdk 1.3.1 (#21)

Commit:542a736
Author:Vincent ROBERT

Inital work to conect with web-client

Commit:5dd665d
Author:Clodéric Mars
Committer:GitHub

Introduce very simple A2C implementation (#1) * Add support for providing seed to the environment * Remove unused parts of the configuration * Use the same MLFLOW_TRACKING_URI inside and outside the docker container * Add support for multiple run config message types * Add ability to log full dict or protobuf messages in xp tracker * Improve manual interruption handling * Introduce simple A2C implementation

Commit:52f2bc2
Author:Clodéric Mars

Initial implementation including several agents and environments