Proto commits in awslabs/sagemaker-debugger

The Protocol Buffers files changed in these 12 commits:

Commit:6cb0d55
Author:Allen Liu
Committer:GitHub

Changes to support tensorflow 2.12 (#652)
* changes to support tensorflow 2.12
* format change
* upgrade protobuf version for tf 212

The documentation is generated from this commit.

Commit:534c44e
Author:Danny Key
Committer:GitHub

Fix dt uint64 missing proto (#617)
* Add missing SMDebug tensorflow protobuf types
* Bumping version to 1.0.21 for TF 2.10

Commit:245784b
Author:Rahul Huilgol
Committer:GitHub

Publishing changes from 0.4 into master (#76)
* Update version
* Bump up version to 0.4.10, skipping 0.4.9 due to confusion of reverting release
* Get integration tests working again without patch hacks
* Release 0.4.11
* Update version
* Pass iteration_number to metrics.log_metric as keyword argument. Fix bug where it was being passed to the timestamp positional argument. (#62) Overriding CI fail due to urgency and clear fix. (cherry picked from commit 41fdc8088b70769a592c13931a06e765615485b6)
* Update version (0.4.12 -> 0.4.13)
* Update NOTICE (cherry picked from commit 647f0005d3df452192d7565da76b7a0185bb0d6d)
* Create THIRD-PARTY (cherry picked from commit 5bfdbf439f966a86ea5904d527d9e39597fdfa97)
* Update THIRD-PARTY (cherry picked from commit e18a8bffdbbdfaeb5f38a16ba2e048b2ec60161c)
* Mention modifications of original .proto files from TB.
* Undo version change
* precommit run

Commit:b20b2e4
Author:Denis Davydenko
Committer:Denis Davydenko

Mention modifications of original .proto files from TB.

Commit:9092ad5
Author:Jared T Nielsen
Committer:GitHub

Rename package 'tornasole' to 'smdebug' (#382)
* Change setup.py
* Rename tornasole folder to smdebug
* Replace 'tornasole/' with 'smdebug/'
* Replace 'tornasole;' with 'smdebug;'
* Replace 'tornasole.' with 'smdebug.'
* Replace 'from tornasole import' with 'from smdebug import'
* Replace 'tornasole;' with 'smdebug;'
* Replace 'tornasole/' with 'smdebug/'
* Add isort
* Replace, without the file exclusions
* Disable tree depth tests since they're timing out on CI. Probably a bug in PR #340

Commit:3377070
Author:Rahul Huilgol
Committer:GitHub

Remove graph export support in pytorch, and fix subtle bugs in tensorboard dir assignment (#370)
* Remove graph export support in pytorch, and fix some expands of user provided tensorboard directory
* Removed print
* check none
* fix test Signed-off-by: Rahul Huilgol <huilgolr@amazon.com>
* fix json load of tensorboard configs
* Address comments Signed-off-by: Rahul Huilgol <huilgolr@amazon.com>

Commit:0dceec1
Author:Rahul Huilgol
Committer:GitHub

Add graph export support for MXNet and Pytorch (#247)
* Save histograms for weights and gradients
* Use standard TF summary function
* undo line break changes
* fix cases when bool tensor was being passed to add_histogram, and fix tests
* Fix region bug and update tb_writer construction
* Include summaries if any write_histogram was set to True
* Refactor writers in core
* set default step to 0
* Use new writer in hook
* Cherry picking change of refactor writers
* set default step to 0
* remove histogram related stuff
* rename IndexUtil
* Fix imports
* remove import of re
* Fix import of summary proto
* Fix step usage in writers
* Fix step usage by event file writer
* Remove directory in tensorboard directory, and add collection name as prefix for summaries created
* Fix import errors
* Fix resnet example which did not have str2bool args
* Fix core test
* Fix core test
* Indentation and move some code to a new function
* Merged Vikas' branch on tb data read
* Add untested support to read tensorboard data
* Write mode and mode_step for summaries, and fix the error of multiple global steps being assigned to same train step
* remove unnecessary file
* remove test script
* Remove changes to imagenet script
* working scalars
* Change path of tornasole event files
* Have new index file per mode for tensorboard events
* Move tensor values to different file
* move to outside tensors folder
* Change frequencies for tf examples
* Introduce CollectionKeys
* Merging export as json
* Make histogram a reduction config property, and add save_raw_tensor field to reduction config. Verified the usage for tensorflow. Also some cleanup with respect to save config in save manager
* Fix bug in loading collections
* Fix writing tensorboard data in global mode
* Add graph support to pytorch models. Copied some new protos, and a couple of files from torch.tensorboard.
* Working graph export for mxnet
* Save graph correctly for mxnet
* undo utils change worker pid
* fix import
* fix import
* do not flush index writer
* remove data files
* Fix save config issue
* make save_histogram a property of collection
* Fix save config bugs, and add scalar support to TF
* Skip summaries whose tensors are unreachable in graph, and avoid adding histogram when original collection is not included
* Move histogram creation to writer instead of event_file_writer, refactor should_save_collection in save manager, add save_scalar methods to MXNet and Pytorch
* WIP tensor scalar support
* undo add of data
* remove test
* use correct writer
* Make saving scalars work, and added type checks
* Writing scalars and tensors supported. tested in tensorboard. need to test through trials
* WIP testing steps
* remove save scalar and tensor for now because of step number issues. work on trial loading tensorboard data and come back to this
* Working reads in non index mode
* Tensorboard reads working with indexing
* cleanup index file location function
* Make pytorch tests working
* Reduce length of test_estimator_modes, and add tf tensorboard test
* Add basic scalar summary test
* Untested completed reads of tensorboard data
* Add more tensorboard tests for trial
* fix test when reading event files for tensorboard from s3
* Fixed a reduction test
* Fix reduction test in TF
* Fix merge of a test
* fix logger import, and default save/reduction config in save manager
* Fix reduction save_raw_tensor in TF
* Some cleanup of prepare and collection includes
* fix tf tests
* Fix all tests
* Add tensorboard index test
* Fix tensorboard test wrt optimizer_variables
* not save histogram for strings
* remove when nan support
* add hash
* Fix collection checks in xgboost
* add xgboost tests
* Typo
* Update hook.py (#243)
* reduce length of test and add / to prefix
* WIP move to tornasole hist summaries for TF
* Change collections_to_save_for_step, make TF use custom histograms, refactor to _save_tensor method for all frameworks
* rename to save_for_tensor
* undo some files
* undo some files
* Update tests.sh
* remove pytorch graph support
* remove mxnet graph support
* Revert "remove mxnet graph support" This reverts commit 56754da7b44ce7276cf6c9830fd7b0308061ef55.
* Revert "remove pytorch graph support" This reverts commit d5c49def8fb369f95282b384dc0bc8a9928ae941.
* remove old files
* fix export of models
* Create __init__.py

Commit:cb9c20e
Author:Rahul Huilgol
Committer:GitHub

Refactor Core's Writers (#190)
* Cherry picking change of refactor writers
* set default step to 0
* remove histogram related stuff
* rename IndexUtil
* Fix imports
* remove import of re
* Fix step usage by event file writer
* Fix import errors
* Fix core test
* undo utils change worker pid
* fix import
* fix import
* do not flush index writer
* review comments

Commit:494498e
Author:Rahul Huilgol
Committer:GitHub

One Repo to rule them all (#72) * WIP, added tf and core * WIP * add code from all repos, and fix imports * fix more imports, add tests * add docs, examples * fix imports in examples * fix setup.py and CI * fix test invoker * Reload a step directory when it was last seen as empty (#117) * fix imports * fix new imports * unskip test * Add setup.py * undo end of training merge * remove import * Add training end code * add frameworks * fix function used * update setup to use append * fixing small errors (#74) * testing * testing * testing * testing * testing * testing * testing * trigger ci * trigger ci * trigger ci * trigger ci * testing * testing * testing * testing * testing * testing * testing * testing * testing * testing * testing * testing * testing * testing * testing * testing * testing * testing * testing * testing * uploading test reports to s3 * uploading test reports to s3 * uploading test reports to s3 * uploading test reports to s3 * changes * changes * docs * Add subpackages in core * docs and examples * provides trials and rules as part of main namescope * move rules and trials outside * fix training end tests, and update setup.py * new readme for whole repo * fix setup.py * update packages * make the mxnet tests faster * reduce lenght of integration tests * add script to build binaries * update argument * change num steps and frequency * delete path * add boto3 * fix training end tests * changes * move exceptions to its own module * fix links * update version string in setup.py * uncommented test * making the pytorch stuff up to date (#79) * making the pytorch stuff up to date * reverting util.py * fixing the hook imports * fixing test imports * fix increment of step * training_has_ended fix for pytorch (#80) * making the pytorch stuff up to date * Revert "making the pytorch stuff up to date" This reverts commit f87f9560b5351f135553072c495f2123964b9f3c. * changing to training_has_ended

Commit:171ace9
Author:Andrea Olgiati

s/tornasole_numpy/tornasole_core/

Commit:a5580e8
Author:Andrea Olgiati

First version

Commit:5c77ccd
Author:Andrea Olgiati

Import