The following commits touched the Protocol Buffers files (only the 100 most recent relevant commits are shown):
Commit: | 3699274 | |
---|---|---|
Author: | Chunli Fu | |
Committer: | Facebook GitHub Bot |
[DPER3] AOT integration Summary: Integrate the AOT flow with the model exporter. Test Plan: buck test dper3/dper3_backend/delivery/tests:dper3_model_export_test; for the replayer test, see D23407733. Reviewed By: ipiszy Differential Revision: D23313689 fbshipit-source-id: 39ae8d578ed28ddd6510db959b65974a5ff62888
The documentation is generated from this commit.
Commit: | f96b913 | |
---|---|---|
Author: | Jordan Fix | |
Committer: | Facebook GitHub Bot |
[caffe2.proto] Add AOTConfig (#44020) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44020 Pull Request resolved: https://github.com/pytorch/glow/pull/4853 Add AOT config Reviewed By: yinghai Differential Revision: D23414435 fbshipit-source-id: 3c48acf29889fcf63def37a48de382e675e0e1f3
Commit: | cb1ac94 | |
---|---|---|
Author: | Chunli Fu | |
Committer: | Facebook GitHub Bot |
[blob reorder] Separate user embeddings and ad embeddings in large model loading script Summary: Put user embeddings before ads embeddings in blobReorder, for flash verification reasons. Test Plan: ``` buck run mode/opt-clang -c python.package_style=inplace sigrid/predictor/scripts:enable_large_model_loading -- --model_path_src="/home/$USER/models/" --model_path_dst="/home/$USER/models_modified/" --model_file_name="182560549_0.predictor" ``` See https://www.internalfb.com/intern/anp/view/?id=320921 to check blobsOrder. Reviewed By: yinghai Differential Revision: D22964332 fbshipit-source-id: 78b4861476a3c889a5ff62492939f717c307a8d2
Commit: | befb227 | |
---|---|---|
Author: | Edward Yang | |
Committer: | Facebook GitHub Bot |
Fix a number of deprecation warnings (#40179) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40179 - Pass -Wno-psabi to suppress GCC's "The ABI for passing parameters with 64-byte alignment has changed in GCC 4.6" warning - Fix use of the deprecated data() accessor (and a minor optimization: hoist the accessor out of the loop) - Undeprecate NetDef.num_workers; no one is serious about fixing these - Suppress warnings about deprecated pthreadpool types Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D22234138 Pulled By: ezyang fbshipit-source-id: 6a1601b6d7551a7e6487a44ae65b19acdcb7b849
Commit: | d87750c | |
---|---|---|
Author: | Jordan Fix | |
Committer: | Facebook GitHub Bot |
[caffe2.proto] Add backend_option to PartitionInfo Summary: Att Test Plan: Updated C2 importer test in stack. Reviewed By: yinghai, bangshengtang Differential Revision: D20527162 fbshipit-source-id: cf3d59089b651565db74f2a52af01f26fdfcbca6
Commit: | 4ae74b3 | |
---|---|---|
Author: | Chunli Fu | |
Committer: | Facebook GitHub Bot |
[DPER3][Shape Inference] Initial Shape Inference in DPER3 frontend (#33607) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33607 Differential Revision: D20025048 fbshipit-source-id: 8b3a3bcfeb450de4d38c555bf2bb116ddedad3ec
Commit: | 04f88a3 | |
---|---|---|
Author: | Yinghai Lu | |
Committer: | Facebook Github Bot |
Add partition info message to NetDef (#33616) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33616 Att. We start by assigning `node_name` in the DeviceOption of each op in the net. Then, for each unique node_name, we have a PartitionInfo describing the partition, including the logical devices it can be assigned to; the link is established through partition names. Test Plan: unittests Canaries: AF: https://our.intern.facebook.com/intern/ads/canary/424817103900710410 AI: https://our.intern.facebook.com/intern/ads/canary/424737510862189908 Reviewed By: ipiszy, bangshengtang, jfix71 Differential Revision: D20015493 fbshipit-source-id: 0bb0f30cfc3892f7b8709d87b8bc1fbab2f2c46d
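As a rough illustration of the linkage this commit describes (a hedged sketch, not code from the PR; it assumes the `node_name` and `partition_info` fields as they appear in caffe2.proto of this era):

```python
from caffe2.proto import caffe2_pb2

# Hedged sketch: each op names its partition via DeviceOption.node_name,
# and a NetDef-level PartitionInfo entry with the same name lists the
# logical devices that partition may be assigned to.
net = caffe2_pb2.NetDef(name="example_net")

op = net.op.add()
op.type = "Relu"
op.device_option.node_name = "partition_0"

part = net.partition_info.add()
part.name = "partition_0"
part.device_id.extend([0, 1])  # candidate logical devices
```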
Commit: | f326045 | |
---|---|---|
Author: | Brian Wignall | |
Committer: | Facebook Github Bot |
Fix typos, via a Levenshtein-type corrector (#31523) Summary: Should be non-semantic. Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking. Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 . Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523 Differential Revision: D19216749 Pulled By: mrshenli fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea
Commit: | 7f5f2e8 | |
---|---|---|
Author: | Shunting Zhang | |
Committer: | Facebook Github Bot |
add ZERO_COLLISION_HASH to caffe2 data types (#30912) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30912 Add a new data type, ZERO_COLLISION_HASH. Test Plan: ci Reviewed By: boryiingsu Differential Revision: D18843626 fbshipit-source-id: b2d8280f13c78b4a656cf95822198df59de7b64c
Commit: | bb7befb | |
---|---|---|
Author: | Chunli Fu | |
Committer: | Facebook Github Bot |
Support loading by blob in predictor Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30805 Reviewed By: ipiszy Differential Revision: D18827383 fbshipit-source-id: b97f958768618ca29a02b057667a9b4ee313ad3c
Commit: | e7fe64f | |
---|---|---|
Author: | Brian Wignall | |
Committer: | Facebook Github Bot |
Fix typos (#30606) Summary: Should be non-semantic. Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30606 Differential Revision: D18763028 Pulled By: mrshenli fbshipit-source-id: 896515a2156d062653408852e6c04b429fc5955c
Commit: | 0c18de2 | |
---|---|---|
Author: | Chunli Fu | |
Committer: | Facebook Github Bot |
Add inferBoundShapeOp Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30101 Reviewed By: ipiszy Differential Revision: D18387803 fbshipit-source-id: 5edb6b949257370b62fa6da477bd6ed2f16a9bd1
Commit: | 7807d44 | |
---|---|---|
Author: | Chunli Fu | |
Committer: | Facebook Github Bot |
Add TensorShapeAndType (#29848) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29848 design doc: https://docs.google.com/document/d/15luH8R7a0WMiZzoKxu6cI0a1XDW4C0vyaW3-XQ_3G30/edit#heading=h.cyvbc4wtxkn7 Test Plan: buck build Reviewed By: ipiszy Differential Revision: D18513718 fbshipit-source-id: c3e3b30b58360b898528422ba9618b1dd3beb0a8
Commit: | fbc3c14 | |
---|---|---|
Author: | Wenqi Cao | |
Committer: | Facebook Github Bot |
adding OpProfile proto into ProfDAGProtos to support storing operation cost (#26677) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26677 This diff adds an OpProfile proto to ProfDAGProtos to support storing operation cost. During performance estimation, idx, net_name, type, and exec_time are stored in this proto. Test Plan: ``` buck test caffe2/caffe2/fb/net_transforms/tests/:stats_collector_test buck test caffe2/caffe2/fb/net_transforms/tests/:perf_estimator_test buck run caffe2/caffe2/fb/distribute/snntest/cogwheel/:cogwheel_snntest_offline_training_simple_online_training ``` Reviewed By: heslami Differential Revision: D17533791 fbshipit-source-id: a339c8eadcac891aa631daaf64522b69876b5045
Commit: | 77c08aa | |
---|---|---|
Author: | Michael Suo | |
Committer: | Facebook Github Bot |
serialize modules as classes Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23098 Test Plan: Imported from OSS Differential Revision: D16383328 Pulled By: suo fbshipit-source-id: 36389b8e45c3febb7f224cd9c630fe643fa90bef
Commit: | 9223fa1 | |
---|---|---|
Author: | Supriya Rao | |
Committer: | Facebook Github Bot |
Add support to serialize qtensor in JIT. (#23356) Summary: Adds qtensor specific fields to the proto file so that they get serialized into the model.json Pull Request resolved: https://github.com/pytorch/pytorch/pull/23356 ghstack-source-id: 87263428 Differential Revision: D16473237 fbshipit-source-id: bf5b51d0863d036d30a1644a3c3b74516468224b
Commit: | 2c2a913 | |
---|---|---|
Author: | James Reed | |
Committer: | Facebook Github Bot |
Preserve SourceRanges across serialization (#22179) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22179 ghimport-source-id: 9879551127da09d78ca348b9e436db5a09a92a38 Test Plan: Imported from OSS Differential Revision: D15981423 Pulled By: jamesr66a fbshipit-source-id: a2506f5a2f05916b6e8226841b0229110e758671
Commit: | cd28ff5 | |
---|---|---|
Author: | davidriazati | |
Committer: | Facebook Github Bot |
Add support for __getstate__/__setstate__ on module (#20242) Summary: Adds support for `__getstate__` and `__setstate__` on modules that are called as part of export (`torch.save()`) and import (`torch.jit.load`). * `__getstate__` and `__setstate__` must be TorchScript functions with the signatures `() -> T` and `(T) -> None` respectively * The results of `__getstate__` are stored using the pickler in `states.pkl` with one for each module in definition order (`__getstate__` returns `None` by default if an implementation is not provided) * This prevents sharing between `__getstate__` and attributes, but this should be fine since their use is mostly unrelated (attributes are for storing values to be used in script methods, `__getstate__` for running arbitrary computations during import) Follow up * Somehow replacing `__getstate__`/`__setstate__` with a `ScriptMethodStub` makes `MyScriptModule().__getstate__()` call `ScriptModule.__getstate__()` when used in Python. This should be fixed so semantics in Python are preserved, but it doesn't affect the typical usage. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20242 Pulled By: driazati Differential Revision: D15287161 fbshipit-source-id: b3f5f33ab74a21a89e6d15460af63aff75cab2d8
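For orientation, here is a minimal sketch of the pattern this commit enables, written against a recent torch.jit (the module name and state layout are illustrative, and decoration details have shifted across PyTorch versions):

```python
import io
from typing import Tuple

import torch

class Scaler(torch.nn.Module):
    def __init__(self, scale: float):
        super().__init__()
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.scale

    # TorchScript-compatible () -> T, invoked during torch.jit.save().
    @torch.jit.export
    def __getstate__(self) -> Tuple[float, bool]:
        return (self.scale, self.training)

    # TorchScript-compatible (T) -> None, invoked during torch.jit.load().
    @torch.jit.export
    def __setstate__(self, state: Tuple[float, bool]) -> None:
        self.scale = state[0]
        self.training = state[1]

buf = io.BytesIO()
torch.jit.save(torch.jit.script(Scaler(2.0)), buf)
buf.seek(0)
restored = torch.jit.load(buf)
assert restored(torch.ones(3)).tolist() == [2.0, 2.0, 2.0]
```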
Commit: | c129ab0 | |
---|---|---|
Author: | Rui Zhu | |
Committer: | Facebook Github Bot |
Change onnxifi workflow to support multi-group quantized & Add multi quantization info to caffe2.proto (#20439) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20439 This is the QTensorProto workflow for multi-group quantization on the C2 side. Nothing DNNLOWP-tensor-related is included in this PR, so once we finish the Glow side, we should be able to test this PR using resnet50. Reviewed By: yinghai Differential Revision: D15096919 fbshipit-source-id: 741eecd59eb79d24d9fe2b035f6246d42422d25c
Commit: | e8fb5f3 | |
---|---|---|
Author: | davidriazati | |
Committer: | Facebook Github Bot |
Bump torch proto version (#20444) Summary: Tagging along to changes in #20191 which added more support for types in the pickler Pull Request resolved: https://github.com/pytorch/pytorch/pull/20444 Pulled By: driazati Differential Revision: D15321463 fbshipit-source-id: 985061bf5070a7d7bad58ea8db11d531f3d13e74
Commit: | a25b795 | |
---|---|---|
Author: | Michael Suo | |
Committer: | Facebook Github Bot |
use fully qualified name for ScriptClasses (#19239) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19239 ghimport-source-id: 830aad6dc11d2a7247760a9c7c9fc8556f70a706 Differential Revision: D14928293 Reviewed By: eellison Pulled By: suo fbshipit-source-id: d2efa5d7f7397526083278d6650b9cee8d967b1a
Commit: | 30da6c7 | |
---|---|---|
Author: | zrphercule | |
Committer: | Facebook Github Bot |
Add qtensors in caffe2 protobuf argument (#18486) Summary: We are about to merge onnxifi quantization support soon. Before that, I would like to merge this diff separately to make sure it doesn't break anything. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18486 Reviewed By: bddppq, houseroad Differential Revision: D14626419 Pulled By: yinghai fbshipit-source-id: 504c1eae60be1e629203267b59defb8b69d82c0a
Commit: | 3d44305 | |
---|---|---|
Author: | David Riazati | |
Committer: | Facebook Github Bot |
Attribute serialization (#17423) Summary: Allows serialization/loading of attributes (`IValue`s of any type). * metadata (attribute name, type) is stored in the `model.json` * The binary format is a subset of the `pickle` module that supports the operations necessary for `IValue`s * Attributes are serialized in the order they are defined on a module to a list in a single `attributes` file, with submodule attributes coming first. This order directly matches the order attributes are listed in `model.json` * This can be inspected in Python with `pickle.load()` or with `pickletools` (PyTorch need not be installed for this to work) * A class is used to store a tensor's index into the tensor table of the model, so to unpickle the file you have to use a custom Unpickler:

```python
import pickle

class TensorID(object):
    def __setstate__(self, id):
        self.id = id

class JitUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if module == '__main__' and name == 'TensorID':
            return TensorID

JitUnpickler(open("my_model/attributes.pkl", "rb")).load()
```

* pickle format: https://svn.python.org/projects/python/trunk/Lib/pickletools.py * It currently does not guarantee that anything saved out with `pickle` directly (i.e., if you edit `attributes` with `pickle` yourself) instead of our tools will be imported correctly. Also fixes #17683 and #16367. Followup Work: * document format / choice of pickle: #17951 * create an example * list specializations * int size specializations, large binputs * do a first pass over attributes to output only necessary `BINPUT` ops * attribute reassignment (e.g. `self.my_attribute = new_value`) * `tensor.save("some_checkpoint.pkl")` support with tensors embedded in Pickle file Pull Request resolved: https://github.com/pytorch/pytorch/pull/17423 Differential Revision: D14470965 Pulled By: driazati fbshipit-source-id: 6a21a9939efdbe59b4bc57fd31d6d630bab5297e
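As the message notes, the same file can be inspected without PyTorch at all; a minimal sketch using only the standard library (the path is the hypothetical one from the commit message):

```python
import pickletools

# Dump the raw pickle opcode stream of the attributes table; this needs
# only the Python standard library, not torch.
with open("my_model/attributes.pkl", "rb") as f:
    pickletools.dis(f.read())
```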
Commit: | 18f721f | |
---|---|---|
Author: | Michael Suo | |
Committer: | Facebook Github Bot |
support serialization of classes (#17856) Summary: Add support for saving/loading TorchScript modules that depend on user-defined classes. We track class dependencies the same way we track tensor constants, then write them all out such that we can just compile them in order before compiling the module hierarchy. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17856 Reviewed By: shannonzhu Differential Revision: D14461599 Pulled By: suo fbshipit-source-id: 7115f87e069fd00dc8381d7de9997864fef7ea9f
Commit: | c3f5ba9 | |
---|---|---|
Author: | Pritam Damania | |
Committer: | Facebook Github Bot |
PyTorch model metadata. (#16275) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16275 Adding a generic string `metadata` field as part of the model to capture additional metadata with the model. Reviewed By: dzhulgakov Differential Revision: D13579029 fbshipit-source-id: 7456ef2edbe73bb70bbb31889cecd94e0db329a2
Commit: | 9811a42 | |
---|---|---|
Author: | Alex Şuhan | |
Committer: | Facebook Github Bot |
Add XLA / TPU device type, backend type and type id (#16763) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16763 Replicate the easy bits in https://github.com/pytorch/pytorch/pull/15153 with TPU / XLA instead of MSNPU. Also don't initialize the storage for XLA tensors for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16585 Reviewed By: ezyang Differential Revision: D13912118 Pulled By: gchanan fbshipit-source-id: 4889177e2478768fb281ed075b71146d1d850bd9
Commit: | 7e642df | |
---|---|---|
Author: | Roy Li | |
Committer: | Facebook Github Bot |
Introduce backend extensions (overriding operators on custom backends) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15153 Reviewed By: gchanan Differential Revision: D13445571 fbshipit-source-id: 62e2ebe0a6e81c4983b47cddb57ee5eb78e96708
Commit: | d408324 | |
---|---|---|
Author: | Sebastian Messmer | |
Committer: | Facebook Github Bot |
Move files to/from c10/core and c10/util (#15316) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15316 This starts cleaning up the files in c10 according to the module structure we decided on. Move to c10/util: - Half.h, Half-inl.h, Half.cpp, bitcasts.h Move to c10/core: - Device.h, Device.cpp - DeviceType.h, DeviceType.cpp i-am-not-moving-c2-to-c10 Reviewed By: dzhulgakov Differential Revision: D13498493 fbshipit-source-id: dfcf1c490474a12ab950c72ca686b8ad86428f63
Commit: | cd3c4a2 | |
---|---|---|
Author: | Dong Li | |
Committer: | Facebook Github Bot |
keep extra_info of each op in ProfDagStats (#15244) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15244 This diff keeps track of the extra_info attached to each operator. When getPerOpStas() is called, it attaches the extra_info to the resulting ProfDagStats protobuf. The Facebook Net transform attaches a global_op_id, defined as a tuple of (orig_net_name, original_op_index), to each operator; the global_op_id is encoded as extra_info in each operator. Reviewed By: aazzolini Differential Revision: D13016289 fbshipit-source-id: 3e2719ec7ed0ebe47740b77581c565ff7e79b102
Commit: | 90aa21e | |
---|---|---|
Author: | Pritam Damania | |
Committer: | Facebook Github Bot |
Metadata for input/output formats in model file proto. (#15252) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15252 We would like to extend the model file format to include strongly typed, semantic information about the model inputs and outputs. The goal is for a user to be able to consider a model file like a function with a well-defined API describing what the inputs and outputs would be. Reviewed By: dzhulgakov Differential Revision: D13009915 fbshipit-source-id: 5df124a876ad03c05fbdaacae0eab659637734c1
Commit: | e0f6867 | |
---|---|---|
Author: | Lu Fang | |
Committer: | Facebook Github Bot |
Restore device when importing a jit script module (#14454) Summary: We align the restore logic with `torch.load`: we try to restore to the right device, and if the device is not available, an exception is raised. We allow the user to remap the device through a parameter `map_location`; it can be 1) a string like `cuda:0` or `cpu`, 2) a device, e.g. torch.device('cpu'), 3) a dict, e.g. {'cuda:1': 'cuda:0'}, or 4) a function whose signature looks like string map_location(tensor, saved_device_string). Pull Request resolved: https://github.com/pytorch/pytorch/pull/14454 Reviewed By: zrphercule Differential Revision: D13271956 Pulled By: houseroad fbshipit-source-id: dfd6b6049b0dc07549ddeddf2dea03ac53ba6d49
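In today's API the first two forms look like this (a hedged sketch; the file name is hypothetical, and current torch.jit.load documents only the string and torch.device forms):

```python
import torch

# Remap a module that was saved on GPU onto the CPU at load time.
m = torch.jit.load("model.pt", map_location="cpu")                # string form
m = torch.jit.load("model.pt", map_location=torch.device("cpu"))  # device form
```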
Commit: | 170ff77 | |
---|---|---|
Author: | Zachary DeVito | |
Committer: | Facebook Github Bot |
Use a zip archive as our container format (#14521) Summary: After consulting with Owen, who pointed out the existence of the miniz library, I decided to take one last shot at using zip as our container format. miniz makes this surprisingly feasible and I think the benefits of using zip are large enough that we should do it. This replaces our custom container format with a zip archive, preserving all of the desirable features of our custom format, such as append-oriented writing, and mmap'able tensor data while adding a bunch of debugging advantages: 1. You can unzip and explore the container to debug what is going on with a model. 2. You can edit the model using a text editor (e.g. change the definition of a method, or editing the json-serialized meta-data), re-zip the file use OSX's native 'Compress' option, and re-load the result into pytorch. Note: this enables you to, e.g., print-debug serialized models. 3. We can easily enable features like compression in the future. 4. Stock python , without pytorch installed, and other programming languages can reasonably consume this format,using json and zipfile packages, which enables people to build tools like visualizers without those visualizers depending on pytorch. This will be especially useful if you want to, for instance, write a visualizer in javascript. Notes: * This add miniz (https://github.com/richgel999/miniz) as a dependency. miniz is a self-contained library for reading/writing zipfiles that unlike other zip libraries also includes libz compatible compress/decompress support. It is a single header and a single C file without any other dependencies. Note that the instructions for miniz explicitly state: > Please use the files from the releases page in your projects. Do not use the git checkout directly! So we have checked in the 'release' source. Miniz supports zip64, and its API is amenable to doing zip-align style things to align data. * Removes 'size' from RecordRef. This allows you to edit files in the zip archive without editing the meta-data file. Very important if you want to print-debug serialized models. * PyTorchStreamReader/PyTorchStreamWriter keep mostly the same API (though keys become strings) However, their implementation is completely swapped out to use miniz. * Code exists to check for the old magic number to give a decent warning to our preview users after we change the format. * Container version information is now put in a stand-alone 'version' file in the archive and serves a similar purpose to the other container version info. * All files in the zip archive start at 64-byte boundaries, using an approach similar to zip-align. Tests check that this property remains true. While the writer does this, the reader doesn't depend on it, allowing user-created archives that can use compression, and do not have to align data. * Added test to check for > 4GB files and archives. Disabled by default because it takes almost 2 minutes to run. * torchscript files are now optional: if a submodule does not have methods, it will not be written. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14521 Reviewed By: jamesr66a Differential Revision: D13252945 Pulled By: zdevito fbshipit-source-id: 01209294c0f6543d0fd716f85a38532249c52f8c
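A practical upshot of the zip container described above: the archive can be opened with stock tooling. A minimal sketch (the file name is hypothetical):

```python
import zipfile

# A saved TorchScript module is an ordinary zip archive; list its entries
# without importing torch at all.
with zipfile.ZipFile("model.pt") as zf:
    for name in zf.namelist():
        print(name)  # e.g. code files, pickled data, and the 'version' file
```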
Commit: | fd31eae | |
---|---|---|
Author: | Zachary DeVito | |
Committer: | Facebook Github Bot |
Switch import/export to python printing (#14400) Summary: Stacked on https://github.com/pytorch/pytorch/pull/14378, only look at the last commit. This changes the way methods are defined in TorchScript archives to use PythonPrint rather than ONNX protobufs. It also updates torch.proto to directly document the tensor data structure actually being serialized. Notes: * because PythonPrint prints all the methods at once per module, this removes MethodDef in favor of a single torchscript_area and a separate caffe2_graphs entry. Note that NetDefs already have method names, so there is no need for a separate method name entry. * This switches the cpp/pickle area to RecordRef (references to a file in the container format) since it is possible the data in these arenas may be large and not suited to json output. * Removes 'annotations' -- annotations should be re-added on the first commit that actually has a practical use for them. In the current state it is unlikely they are representing the right information. * Some expect files have changed because PythonPrint is preserving more debug name information for parameter names. * MethodEncoder (the ONNX output format) has been deleted. There is still some cleanup possible combining EncoderBase and GraphEncode now that there is only a single pathway using EncoderBase. * This incorporates the changes from #14397 to define TensorDef Pull Request resolved: https://github.com/pytorch/pytorch/pull/14400 Reviewed By: suo Differential Revision: D13231800 Pulled By: zdevito fbshipit-source-id: af5c1152d0bd6bca8b06c4703f59b161bb19f571
Commit: | 7a65461 | |
---|---|---|
Author: | Lu Fang | |
Committer: | Facebook Github Bot |
Add tensor table in ModelDef and use it for jit script serialization and deserialization (#13861) Summary: As we discussed, the tensors in the torch script will be associated with the tensor data in the serialized file. So let's add a table of tensors (actually a repeated TensorProto field) in the ModelDef. TensorProto.name will be the id. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13861 Reviewed By: dzhulgakov Differential Revision: D13036940 Pulled By: zrphercule fbshipit-source-id: ecb91b062ac4bc26af2a8d6d12c91d5614efd559
Commit: | f34c848 | |
---|---|---|
Author: | Lu Fang | |
Committer: | Facebook Github Bot |
Store the optimize flag in module (#14166) Summary: For the save/load of a script module, we store the optimize flag in the module instead of encoding it in each method. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14166 Reviewed By: ezyang Differential Revision: D13117577 Pulled By: dzhulgakov fbshipit-source-id: dc322948bda0ac5809d8ef9a345497ebb8f33a61
Commit: | e2a7d43 | |
---|---|---|
Author: | Lu Fang | |
Committer: | Facebook Github Bot |
Use the torch.proto to store script module (#13736) Summary: Directly operate on protobuf in the serializer/deserializer. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13736 Reviewed By: dzhulgakov Differential Revision: D13028487 Pulled By: houseroad fbshipit-source-id: e578474008874f00f2a22f0a2ffd85f52643881a
Commit: | 274f3c0 | |
---|---|---|
Author: | Hector Yuen | |
Committer: | Facebook Github Bot |
add explicit fpga context (#13318) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13318 Add a context to describe FPGA. This removes the need for an OpenCL-with-FPGA engine. The next step is to change the OpenCL implementation to explicitly use the FPGA context. Reviewed By: soumith Differential Revision: D12828795 fbshipit-source-id: 0700a83672d117d7aa3d941cd39c2ae627cb6e5f
Commit: | 2f15428 | |
---|---|---|
Author: | Roy Li | |
Committer: | Facebook Github Bot |
reduce Device to 32bits (#12767) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12767 In preparation of using TypeMeta in TensorOptions. We need TensorOptions to fit in 128-bits, this isn't possible if both TypeMeta and Device are 64-bit. Reviewed By: ezyang Differential Revision: D10416051 fbshipit-source-id: 23c75db14650f7f3045b1298977f61a0690a8534
Commit: | e290a9d | |
---|---|---|
Author: | Junjie Bai | |
Committer: | Facebook Github Bot |
Back out "Migrate DeviceOption.numa_node_id to DeviceOption.device_id" Summary: Original commit changeset: 82583d0ad4b8 Reviewed By: enosair, ilia-cher Differential Revision: D10560741 fbshipit-source-id: e289a37d441bd2243b369810abf451292891d9ee
Commit: | 202893f | |
---|---|---|
Author: | Junjie Bai | |
Committer: | Facebook Github Bot |
Migrate DeviceOption.numa_node_id to DeviceOption.device_id Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12717 Reviewed By: ilia-cher Differential Revision: D10408325 fbshipit-source-id: 82583d0ad4b8db094ee4c5c607b52500826328f7
Commit: | 30aaa07 | |
---|---|---|
Author: | Lu Fang | |
Committer: | Facebook Github Bot |
New serialization format (#12384) Summary: Addressed Dima's feedback. The proposal is here: https://fb.quip.com/TbQmAuqIznCf Pull Request resolved: https://github.com/pytorch/pytorch/pull/12384 Reviewed By: dzhulgakov Differential Revision: D10246743 Pulled By: houseroad fbshipit-source-id: c80db0c35d60ca32965275da705f2b1dfb2a7265
Commit: | 89010d6 | |
---|---|---|
Author: | Junjie Bai | |
Committer: | Facebook Github Bot |
Migrate HIP to use DeviceOption.device_id and delete DeviceOption.hip_gpu_id Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12546 Reviewed By: hyuen, xw285cornell Differential Revision: D10305222 fbshipit-source-id: 955e1d2878508a25fe4e9980ae66f8f54aaf7db9
Commit: | f54ab54 | |
---|---|---|
Author: | Junjie Bai | |
Committer: | Facebook Github Bot |
Rename cuda_gpu_id to device_id in DeviceOption (#12456) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12456 codemod with 'Yes to all': codemod -d . --extensions h,cc,cpp,cu,py,proto,pbtxt,pb.txt,config cuda_gpu_id device_id Also overloads TextFormat::ParseFromString to do a string replace when parsing from the protobuf text format. Reviewed By: Yangqing Differential Revision: D10240535 fbshipit-source-id: 5e6992bec961214be8dbe26f16f5794154a22b25
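After this rename, a GPU placement in the proto reads as follows (a hedged sketch; the enum spelling later drifted between CUDA and PROTO_CUDA, with Python aliases keeping the old names working):

```python
from caffe2.proto import caffe2_pb2

opt = caffe2_pb2.DeviceOption()
opt.device_type = caffe2_pb2.CUDA  # enum value 1
opt.device_id = 0                  # formerly cuda_gpu_id
print(opt)
```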
Commit: | 1d3f650 | |
---|---|---|
Author: | Dmytro Dzhulgakov | |
Committer: | Facebook Github Bot |
Revert D10098106: [pytorch][PR] [WIP] New version of PT1 model format Differential Revision: D10098106 Original commit changeset: 94ec7fc57c84 fbshipit-source-id: 38f729b0970618f38359797b806cbbcd865f4715
Commit: | ff608a9 | |
---|---|---|
Author: | Junjie Bai | |
Committer: | Facebook Github Bot |
Back out "Revert D10123245: Back out "codemod cuda_gpu_id to device_id"" (#12232) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12232 Original commit changeset: fca91fea58b7 This adds proper modifications to the DeviceType <->DeviceOption conversion code added in D10033396 Reviewed By: jerryzh168 Differential Revision: D10132473 fbshipit-source-id: 801ef777e2950982cb47b48051b1471a0a91e64b
Commit: | 35becd1 | |
---|---|---|
Author: | Lu Fang | |
Committer: | Facebook Github Bot |
New version of PT1 model format (#12149) Summary: Considered four different existing formats: 1) static graph, 2) torch script, 3) pickle files, 4) PyTorch C++ serialize APIs Pull Request resolved: https://github.com/pytorch/pytorch/pull/12149 Reviewed By: BIT-silence Differential Revision: D10098106 Pulled By: houseroad fbshipit-source-id: 94ec7fc57c842e50fae5286ddeda657a4967a07a
Commit: | 3010dc4 | |
---|---|---|
Author: | Rick Ratmansky | |
Committer: | Facebook Github Bot |
Revert D10123245: Back out "codemod cuda_gpu_id to device_id" Differential Revision: D10123245 Original commit changeset: d83da8e00a12 fbshipit-source-id: fca91fea58b7df208edc2e218a1d514f9821ec7b
Commit: | 7d7d336 | |
---|---|---|
Author: | Yang Liu | |
Committer: | Facebook Github Bot |
Back out "codemod cuda_gpu_id to device_id" Summary: Original commit changeset: f5614a5d2607 D9986213 is causing Multifeed Aggregator a [huge performance different](https://our.intern.facebook.com/intern/ads/analyze_canary/412951953278781781/) and is blocking aggregator push since last Friday night: https://fburl.com/feedtools/b6izvwjz We need to land this revert ASAP to unblock aggregator push. Reviewed By: orionr Differential Revision: D10123245 fbshipit-source-id: d83da8e00a1250f5d09811a0a587c127e377aab2
Commit: | 3eb5940 | |
---|---|---|
Author: | Junjie Bai | |
Committer: | Facebook Github Bot |
codemod cuda_gpu_id to device_id (#12022) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12022 codemod -d . --extensions h,cc,cpp,cu,py,proto,pbtxt,pb.txt,config cuda_gpu_id device_id codemod with 'Yes to all' Reviewed By: orionr Differential Revision: D9986213 fbshipit-source-id: f5614a5d26078817aee8caf79a494abfd6a95ff1
Commit: | 30521a3 | |
---|---|---|
Author: | Roy Li | |
Committer: | Facebook Github Bot |
codemod: caffe::float16 -> at::Half (#11785) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11785 Replace each instance of float16 with Half. Reviewed By: Yangqing Differential Revision: D9892158 fbshipit-source-id: b9225ca7bd5c84fd1c04a9d24b026c8b6cbff120
Commit: | 32494c2 | |
---|---|---|
Author: | Lu Fang | |
Committer: | Facebook Github Bot |
OperatorDef <==> NodeProto Conversion (#11621) Summary: Operator level proto conversion between (new) torch proto and (old) caffe2 proto. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11621 Reviewed By: BIT-silence Differential Revision: D9892422 Pulled By: houseroad fbshipit-source-id: 01a55ec0a09479876a27082d90fc970723f4d431
Commit: | 727a445 | |
---|---|---|
Author: | Lu Fang | |
Committer: | Facebook Github Bot |
New Serialization Proto Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11166 Reviewed By: mingzhe09088 Differential Revision: D9623522 Pulled By: houseroad fbshipit-source-id: f21153034a398de7959404321d8534234cd58a40
Commit: | 9f4bcdf | |
---|---|---|
Author: | Jerry Zhang | |
Committer: | Facebook Github Bot |
caffe2::DeviceType -> at::DeviceType (#11254) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11254 Previously we used DeviceType from caffe2.proto directly, but it's an `enum` with implicit conversion to int, which offers no type safety; e.g., we have to explicitly check that a device type is valid in event.h:

```
template <int d>
struct EventCreateFunctionRegisterer {
  explicit EventCreateFunctionRegisterer(EventCreateFunction f) {
    static_assert(d < MaxDeviceTypes, "");
    Event::event_creator_[d] = f;
  }
};
```

at::DeviceType is an `enum class`; it has no implicit conversion to int and provides better type-safety guarantees. In this diff we have done the following refactor (taking CPU as an example): 1. caffe2::DeviceType → caffe2::DeviceTypeProto 2. caffe2::CPU → caffe2::PROTO_CPU 3. caffe2::DeviceType = at::DeviceType 4. caffe2::CPU = at::DeviceType::CPU codemod -d caffe2/caffe2 --extensions h,cc,cpp 'device_type\(\), ' 'device_type(), PROTO_' + some manual changes In short, after this diff, in C++, caffe2::CPU refers to at::DeviceType::CPU, and the old proto caffe2::CPU becomes caffe2::PROTO_CPU. On the Python side, we have a temporary workaround that aliases `caffe2_pb2.CPU = caffe2_pb2.PROTO_CPU` to make the change easier to review; this will be removed later. Reviewed By: ezyang Differential Revision: D9545704 fbshipit-source-id: 461a28a4ca74e616d3ee183a607078a717fd38a7
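The temporary Python alias mentioned above can be checked directly (a hedged sketch; it assumes the compatibility aliases installed by the caffe2.proto package are still in place):

```python
from caffe2.proto import caffe2_pb2

# Both spellings should name the same proto enum value while the
# compatibility alias exists.
assert caffe2_pb2.CPU == caffe2_pb2.PROTO_CPU
assert caffe2_pb2.CUDA == caffe2_pb2.PROTO_CUDA
```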
Commit: | 9aa92bc | |
---|---|---|
Author: | Junjie Bai | |
Committer: | Facebook Github Bot |
Change the default value of DeviceOption.numa_node_id from -1 to 0 (#10877) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10877 Change the default value of DeviceOption.numa_node_id to 0, and use has_numa_node_id() to check whether the field is set. Reviewed By: ilia-cher Differential Revision: D9473891 fbshipit-source-id: 91ac6a152f445644691023110c93d20a3ce80d43
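From Python, the presence check corresponding to the C++ has_numa_node_id() accessor is protobuf's HasField (a minimal sketch):

```python
from caffe2.proto import caffe2_pb2

opt = caffe2_pb2.DeviceOption()
print(opt.numa_node_id)              # 0, the new default
print(opt.HasField("numa_node_id"))  # False: the field was never set
opt.numa_node_id = 1
print(opt.HasField("numa_node_id"))  # True
```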
Commit: | 40109b1 | |
---|---|---|
Author: | Yangqing Jia | |
Committer: | Facebook Github Bot |
Remove caffe1 specific proto (#10380) Summary: This was used as a convenient way for us to convert c1 models. Now that conversion is more or less done, we should probably require any users who need to convert c1 models to explicitly install c1. This PR removes the explicit c1 proto (which was copied from c1) in favor of explicit installation. Note that caffe_translator would still work properly, only difference is that now users need to install c1 separately. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10380 Differential Revision: D9267981 Pulled By: Yangqing fbshipit-source-id: a6ce5d9463e6567976da83f2d08b2c3d94d14390
Commit: | 3693941 | |
---|---|---|
Author: | Edward Yang | |
Committer: | Facebook Github Bot |
Introduce at::DeviceType, which subsumes at::Device::Type and (partially) caffe2::DeviceType (#10175) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10175 Previously, we had at::Device::Type and caffe2::DeviceType (from protobuf), intended to help us distinguish between CPU, CUDA, etc. devices. This replaces at::Device::Type entirely with at::DeviceType, which in turn is a direct 'enum class' version of the protobuf-generated caffe2::DeviceType 'enum'. We can't eliminate the 'enum' because this would be a pretty drastic API change (enum is interconvertible with integers, enum class is not), but we can make the two line up exactly and share code for, e.g., printing. Reviewed By: Yangqing Differential Revision: D9137156 fbshipit-source-id: 566385cd6efb1ed722b25e6f7849a910b50342ab
Commit: | 553c41f | |
---|---|---|
Author: | Yavuz Yetim | |
Committer: | Facebook Github Bot |
Adds serialization path (#9035) Summary: Closes https://github.com/pytorch/pytorch/pull/9035 This diff builds on the structure in the stacked diff to add serialization/deserialization. It supports the old format and a new suggested format. Reviewed By: ilia-cher Differential Revision: D8415115 fbshipit-source-id: acaacce2b015f4c6ac0ae22625455290a3f30262
Commit: | edb88b5 | |
---|---|---|
Author: | Orion Reblitz-Richardson | |
Committer: | GitHub |
Update from Facebook (#8887) * add opencl + fpga context adds an opencl context inside caffe2/fb which can be used for fpga access * [Caffe2] Force tensor inference checks to be triggered during testing We've started to rely on TensorInference functions more for different analysis. This diff ensures that the TensorInference function's result matches what is expected from the definition of the operator. * Enable building //caffe2:torch with @mode/opt In @mode/opt, python runs out of a PAR, which breaks a lot of assumptions in the code about where templates/ folders live relative to __file__. Rather than introduce hacks with parutil, I simply turn template_path into a parameter for all the relevant functions and thread it through from the top level. * [Caffe2] Fix cost models for DotProduct and Div. Update Tensor Inference for dot product As title. DotProduct states that output is a 1-D tensor (https://caffe2.ai/docs/operators-catalogue.html#dotproduct) though code suggests it is either 0- or 1-D depending on inputs. TensorInference defined to support implementation. * [SG-MoE] Add an option to make the experts NOT as components * [nomnigraph] Rename and fixup convertToNeuralNetOperator API This will make things a bit cleaner * no longer symlink THNN.h and THCUNN.h * forced decoder network (onnx export) Closes https://github.com/pytorch/translate/pull/95 Add networks in ensemble_export.py to create a forced decoding network from PyTorch NMT checkpoints. This network takes an arbitrary numberized (source, target) pair and returns the model score for the translation, including penalties. Vocabulary reduction networks are also supported, but note that target indices which are not in the possible_translation_tokens generated for the source input will be trea * Revert schema change to fix production models Revert schema change to fix production models * MockLogDeviceReader - rebase on FIX # Goal 1), Build a make_mock_log_device_reader using make_mock_reader 2), Replace the real log_device_reader here: https://fburl.com/raihwf1p # Log by D8151734 Real log_device_reader: ``` I0529 20:29:05.373108 954994 tensor.h:839] Tensor print_net/log of type std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >. Dims: (): read_net/ParseOpenTrainingRow:0 I0529 20:29:05.373244 954994 tensor.h:839] Tensor read_net/ParseOpenTrainin * [C2/D2][1/n]: Nonnegative-Constrained Optimization -- log barrier implement log barrier as a regularization method * Add teacher weight screening. Add teacher weight sceening according to teacher labels. If teacher label is zero, we do not use the distill loss in the objective function. * Add NormalizerContext See task for more detail. This implementation is a copy of what exists for RegularizerContext except for how the parameters are defined in the model_definition thrift file. I'll try an alternative implementation which overrides the default arguments of functions instead like for argscopes in tensorflow. https://github.com/pytorch/pytorch/compare/master...MaximeBoucher:update-from-facebook-0939578c068c?expand=1 * Adding cosine similarity option in dot processor Add pairwise cosine similarity option in dot product. Add an option to concate dot product and cosine similarity. Add test cases. 
* [nomnigraph][redo] Concat elim for sparseNN Same as D7962948, which was reverted because Operator Schema was not defined * [pytorch] Revert pytorch/pytorch#7918 'Release GIL when copying to shared memory', breaks ASAN Revert this pytorch diff that breaks ASAN when running Filament in dev mode; in opt mode it gives "bad file descriptor" errors. Looks like a race when copying tensors to shared memory in multiple mp.Queue's (which spawn separate threads). https://github.com/pytorch/pytorch/pull/7918/files * [nomnigraph][mobile] Enable nomnigraph by default, use -Oz on nomnigraph related code to reduce code size enables nomnigraph and reduces codesize * [Warmup] Allow both offline incremental training and online training Change plan name on saving side and reading side to support both training type This diff depends on D8128530 and D8168651. * Revert D7802642: [Warmup] Allow both offline incremental training and online training This reverts commit afc213cf9b36cecf75333a788391c4d09f4afccc @bypass-lint An infra SEV is better than not reverting this diff. If you copy this password, see you in SEV Review! @cause_a_sev_many_files * Add legacy grad logic to fix div op on old graphs. Add legacy grad logic to fix div op on old graphs. * Correctly propagate operator failures Propagate errors from operators that throw exceptions and return false * Revert D8374829: [caffe2][nomnigraph][redo] Concat elim for sparseNN This reverts commit 6dda028c463e54bb5c32188bbbe9202107e188a5 @bypass-lint An infra SEV is better than not reverting this diff. If you copy this password, see you in SEV Review! @cause_a_sev_many_files * [Caffe2] Added extra_info to core.DeviceOption(), enforced extra_info to be inherited in scope.DeviceScope extra_info is a newly defined field in DeviceOption proto. This diff added extra_info to the core.DeviceOption(). And, In scope.DeviceScope(), this diff enforce the new scope to inherit the extra_info from old scope. * [opt] hgdirsync wasn't enabled, merge diverged code Here's the damage, P59732616 basically xplat was left behind but had the change from assert to CAFFE_ENFORCE * OMP parallelism over RoIs for RoIAlign op Simpler to parallelize over RoIs. Shouldn't affect other uses as it relies on the number of OMP threads set during startup. PR: https://github.com/pytorch/pytorch/pull/8562 * Use int64_t for shape in FillOps to avoid overflow of int32 * Implement Rotated RoIAlign op Based on Rotated RPNs as explained in https://arxiv.org/abs/1703.01086. The idea is simple - orientation/angle is added as an RPN anchor parameter and then the angle is further regressed similar to bbox coords. There are some additional changes related to NMS and IoU, but besides that it's a direct extension to Faster-RCNN. Further details in https://fb.quip.com/sZHlA1iMfWPZ. RoIs are represented in [center_x, center_y, width, height, angle] format. 
`angle` repre * Rotated RoIAlign op CUDA forward implementation CUDA forward impl for D8415490 * RoIAlignRotated op CUDA backward pass implementation TSIA * All remaining fixes to eliminate process_github.sh Most of this diff has already been reviewed separately, except for the parts relating to _thnn/utils.py and _utils._internal.py remove skipIf(True, 'Fbcode') line from process_github.sh replace sed of cpp file with #ifdef to control cudnnDestroy use undo sync-time deletion of .gitattributes, remove process_github.sh switch to using _utils._internal rather than try-import-except This diff also fixes the open-source bug where rebuilds have * Back out "Revert D7802642: [Warmup] Allow both offline incremental training and online training" Original commit changeset: 7707d2efe60e The original diff is backout becuase the online trainer package is backed out. This code would only work with new online trainer package * [easy] improve error log in adagrad op as title * re-allow use of thnn_h_path This fixes cffi usage in OSS * [4/4] [tum] paralyzing layerNorm for GPU full sync as title * add compile=False to pytorch tests, remove hack with pyc * Add shape and type inference for RowWiseArgMax operator See title * Revert D8515341: Back out "Revert D7802642: [Warmup] Allow both offline incremental training and online training" This reverts commit 78167eeef0af16b60f72c82f9dcdda9b41b4dcbd @bypass-lint An infra SEV is better than not reverting this diff. If you copy this password, see you in SEV Review! @cause_a_sev_many_files * [fix-flaky-test] mock_hive_reader_test flaky, because GlobalCounter collects local counts intervally # Problem `MockHiveReader` uses `GlobalCounter` to limit `max_examples`. GlobalCounter on server node collect local counts from worker nodes every 1 sec. This 1 sec delay makes it impossible to limit exactly to the `max_examples`, it will definitely exceed `max_examples`. # Plan Given, ``` Expected num_examples = max_examples + num_examples/sec (Read Speed) x 1 sec (GlobalCounter Sync Int * [Caffe2] Fix FCGradient cost inference. Prevent overflow in cost inference FCGradient missed a factor 2 in the `num_outputs == 3` case. Overflow was occurring with flop calculation for FC. Changed types to `uint64_t` to prevent future problems. * Fix binary ops with empty inputs Fix binary ops with empty inputs * Support the filling of input blob with provided data as title for Biz Integrity case * Back out "Revert D8515341: Back out "Revert D7802642: [Warmup] Allow both offline incremental training and online training"" Original commit changeset: 30c55dd38816 Original diff is reverted due to introducing bad integration test. Fixed the integration test. * [c2][easy] improve pack ops error loggings as desc. * Add ShapeTypeInference for LpNorm operator As desc * Shard test_nn to reduce runtime for each test target Closes https://github.com/pytorch/pytorch/pull/8793 The current test_nn would time out and be disabled in GreenWarden, and we need to have an option to split it up in order to pass the stress test. Right now GreenWarden roughly allows running 100 test cases in test_nn before timing out, and here we have an option to divide test_nn into 30 shards (with ~40 tests in each shard) to allow for some test suite growth in the future. * Change default caffe2_streams_per_gpu to 1 * Remove IN_SANDCASTLE from common.py and test_nn.py We prefer to disable the failing tests through Sandcastle UI instead. 
* Add a new class for an updated prof_dag.proto This diff contains: - An updated prof_dag.proto that contains blob profiles. - A class to deserialize this information (serialization is in a follow up diff) - Update to separate profiling information from NeuralNet (and use it as part of the class above). - Unit tests * Lambdarank for SparseNN This diff adds a lambda_rank_layer for SparseNN. changes include 1) Adds support for multi sessions in c2 op 2) Adds support for two different loss functions in c2 op 3) Unit tests for op * Revert D8586950: Back out "Revert D8515341: Back out "Revert D7802642: [Warmup] Allow both offline incremental training and online training"" This reverts commit 012220ed63eccc35659a57b31d16a3625da6317b @bypass-lint An infra SEV is better than not reverting this diff. If you copy this password, see you in SEV Review! @cause_a_sev_many_files * [easy] A few fixups to multithread predictor benchmark (1) support perf on T6 server (2) remove dead code * fix a bug about the map size as title * Fix reduce sum on in-place case. Fix reduce sum on in-place case. * [Warmup] Reland reverted diff Allow both offline incremental training and online training Closes https://github.com/pytorch/pytorch/pull/8827 fix net transform integration test. Allow offline and online trainer to coexist D7802642. * Add StoreHandlerNotAvailableException Add an exception for a store that is not available or has been deleted. * Use exception handling for fault tolerance, missing KV store Remove status blobs to communication ops so that exceptions propagate on failure. * [C2/D2][2/n]: Nonnegative-Constrained Optimization -- bounded grad proj for simple bounded constrained optimization, incl non-negative box constraints. * [GanH]: Adaptive Weighting with More Estimations With implemented postivity optimization, we now learn adaptive weights with different parameterizations. This improves parameter estimation and training stability. * Revert some changes for landing * Remove AutoNoGIL in StorageSharing * Temporarily disable net_tests * Revert "[Caffe2] Force tensor inference checks to be triggered during testing" This reverts commit 67ef05c22b2f71b4a489695384932f968384a2a4. * Revert "Fix reduce sum on in-place case." This reverts commit 6cb8a8e1b3db7b6d20941b0053e3f3836068eb64. * Revert "Revert "Fix reduce sum on in-place case."" This reverts commit 130a257c0893dc09f4bd6e6a45d112261807fd2c.
Commit: | 7ca8e2f | |
---|---|---|
Author: | Yangqing Jia | |
Committer: | Lu Fang |
fix old comment to point to the right file (#8416)
Commit: | e5b9972 | |
---|---|---|
Author: | bddppq | |
Committer: | GitHub |
[Caffe2] Enabling AMD GPU Backend for Caffe2 (#7955) * Add hip support for caffe2 core * Add MIOPEN header/wrapper to caffe2 core * Add HIP device into caffe2 PB * top level makefile change for rocm/hip * makefile scaffolding for AMD/RocM/HIP * Makefile scafodding for AMD/RocM/HIP; add makefile/utility for HIP files * caffe2 PB update for AMD/ROCM HIP device * Add AMD/RocM/Thrust dependency * HIP threadpool update * Fix makefile macro * makefile fix: duplicate test/binary name * makefile clean-up * makefile clean-up * add HIP operator registry * add utilities for hip device * Add USE_HIP to config summary * makefile fix for BUILD_TEST * merge latest * Fix indentation * code clean-up * Guard builds without HIP and use the same cmake script as PyTorch to find HIP * Setup rocm environment variables in build.sh (ideally should be done in the docker images) * setup locale * set HIP_PLATFORM * Revert "set HIP_PLATFORM" This reverts commit 8ec58db2b390c9259220c49fa34cd403568300ad. * continue the build script environment variables mess * HCC_AMDGPU_TARGET * Cleanup the mess, has been fixed in the lastest docker images * Assign protobuf field hip_gpu_id a new field number for backward compatibility * change name to avoid conflict * Fix duplicated thread pool flag * Refactor cmake files to not add hip includes and libs globally * Fix the wrong usage of environment variables detection in cmake * Add MIOPEN CNN operators * Revert "Add MIOPEN CNN operators" This reverts commit 6e89ad4385b5b8967a7854c4adda52c012cee42a. * Resolve merge conflicts * . * Update GetAsyncNetHIPThreadPool * Enable BUILD_CAFFE2 in pytorch build * Unifiy USE_HIP and USE_ROCM * always check USE_ROCM * . * remove unrelated change * move all core hip files to separate subdirectory * . * . * recurse glob core directory * . * correct include * .
Commit: | 82b981e | |
---|---|---|
Author: | Bram Wasti | |
Committer: | Soumith Chintala |
Update from facebook 1ee4edd286a3 (#8040) * Adding instance weight to batch distill loss as title * add bfloat 16-31 added bfloat 16-31 and their respective unit tests * [CUDA9] Upgrade - fbcode CUDA9 upgrade diff D5654023 has been out for a while thanks to Pieter. But with time growing it's becoming quite hard to rebase, because of the symlinks and auto-generated build/config files in tp2. Break D5654023 into two diffs, one touching tp2 config files, and another one touching fbcode TARGETS file (adding nvcc flag). These two should be a bit easier to rebase (for detailed procedure see "Test Plan"). This diff can only be committed if: 1. CUDA 9 rpm is rolled out fleet-wide (TBD) 2. NVidia driver 390.40 is rolled out fleet-wide (done) 3. Upgrade CUDA 9.1, cudnn 7.1, nccl 2.1 (done) 4. Make sure all dependents are built (done) 5. Test all C2 operators, PyTorch (see test plan) * Share intermediate int32 buffer across Conv ops Adding a known type * [C2 fix] infer function for ensure_cpu_output_op this is adding the missing device funtion for ensure_cpu_output_op * [int8] Add blob serializer/deserializer for Int8TensorCPU To export to logfiledb * [nomnigraph] Add try catch block to optimization passes in predictor This will catch failures that happen in the optimization pass. * Caffe2: avoid static initialization order fiasco for CAFFE_ENFORCE CAFFE_ENFORCE uses strack trace fetcher. Which is currently a global static variable. If at static initialization time CAFFE_ENFORCE is used, this is a SIOF. Recently CAFFE_ENFORCE was added into init functions registration, so we started to see this. Meyers singleton is going to provide safety here. If stacktrace fetcher was not registered yet, it will just use a dummy one. * NUMA support in SparseNN CPU benchmark Adding support for NUMA in SparseNN CPU benchmark * [mobile-roofline] Add logging needed for roofline model This should be all that's needed * Let the operators using the same input if the operators are not chained or else, we have to change the input data dims * fix null-pointer-use UBSAN errors in in reshape_op.h * revert previous fix on input blob name as title * Adding flag to let MineHardNegative automatically extract single value from dict Model exporter requires the output of the model to be a struct. This makes it convenient to use those models directly in MineHardNegative by allow automatic extraction of the single element of dict, which is a common use case. * Reverting change that broke internal tests back to OSS compatible state
Commit: | 966c658 | |
---|---|---|
Author: | bddppq | |
Committer: | Orion Reblitz-Richardson |
Revert "[Caffe2] Enabling AMD GPU Backend for Caffe2" (#7802) * Revert "[auto] Update onnx to 4898c9e - Added TensorDenotation and metadata_props for images (onnx/onnx#879) https://github.com/onnx/onnx/commit/4898c9e925b57ad4c58518515b98de66966ad3b1" This reverts commit 9c679dab5fe7cac27bb8c783fd143276e6046ef1. * Revert "Add BiasCHW fallback for GPU (#7738)" This reverts commit 14ad2e74f108d13ec98abb078f6aa7f01aae0aad. * Revert "[Caffe2] Enabling AMD GPU Backend for Caffe2 (#7566)" This reverts commit 2ebcf4bb37739733e76b754284cf8b2ffcba1c30.
Commit: | 2ebcf4b | |
---|---|---|
Author: | Peter Yeh | |
Committer: | bddppq |
[Caffe2] Enabling AMD GPU Backend for Caffe2 (#7566) * Add hip support for caffe2 core * Add MIOPEN header/wrapper to caffe2 core * Add HIP device into caffe2 PB * top level makefile change for rocm/hip * makefile scaffolding for AMD/RocM/HIP * Makefile scafodding for AMD/RocM/HIP; add makefile/utility for HIP files * caffe2 PB update for AMD/ROCM HIP device * Add AMD/RocM/Thrust dependency * HIP threadpool update * Fix makefile macro * makefile fix: duplicate test/binary name * makefile clean-up * makefile clean-up * add HIP operator registry * add utilities for hip device * Add USE_HIP to config summary * makefile fix for BUILD_TEST * merge latest * Fix indentation * code clean-up * Guard builds without HIP and use the same cmake script as PyTorch to find HIP * Setup rocm environment variables in build.sh (ideally should be done in the docker images) * setup locale * set HIP_PLATFORM * Revert "set HIP_PLATFORM" This reverts commit 8ec58db2b390c9259220c49fa34cd403568300ad. * continue the build script environment variables mess * HCC_AMDGPU_TARGET * Cleanup the mess, has been fixed in the lastest docker images * Assign protobuf field hip_gpu_id a new field number for backward compatibility * change name to avoid conflict * Fix duplicated thread pool flag * Refactor cmake files to not add hip includes and libs globally * Fix the wrong usage of environment variables detection in cmake * Add MIOPEN CNN operators * Revert "Add MIOPEN CNN operators" This reverts commit 6e89ad4385b5b8967a7854c4adda52c012cee42a.
Commit: | 26ddefb | |
---|---|---|
Author: | Jinghui | |
Committer: | Yinghai Lu |
[feature request] [Caffe2] Enable MKLDNN support for inference (#6699) * Add operators based-on IDEEP interfaces Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Enable IDEEP as a caffe2 device Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add test cases for IDEEP ops Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add IDEEP as a caffe2 submodule Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Skip test cases if no IDEEP support Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Correct cmake options for IDEEP Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add dependences on ideep libraries Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fix issues in IDEEP conv ops and etc. Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Move ideep from caffe2/ideep to caffe2/contrib/ideep Signed-off-by: Gu Jinghui <jinghui.gu@intel.com> * Update IDEEP to fix cmake issue Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fix cmake issue caused by USE_MKL option Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Correct comments in MKL cmake file Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Commit: | a73b3fd | |
---|---|---|
Author: | Bram Wasti | |
Committer: | GitHub |
[caffe2][opencl] Add OpenCL context (#6777)
Commit: | 6223bfd | |
---|---|---|
Author: | Orion Reblitz-Richardson | |
Committer: | bddppq |
Update from Facebook (#6692)

* [GanH][Easy]: Add assertion to adaptive weighting layer. A zero weight causes numeric instability and exploding NE.
* [Easy] Add cast op before computing norm in diagnose options. As LpNorm only takes floats, we add a manual cast here.
* Introduce a new caching device allocator. `cudaMalloc` and `cudaFree` calls are slow, and become slower the more GPUs there are: essentially, they grab a host-wide (not device-wide) lock, because GPU memory is transparently shared across all GPUs. Normally this isn't much of a concern, since workloads allocate memory upfront and reuse it during later computation. However, under some computation models (specifically, memory-conserving approaches like checkpoint-and-recompute; see https://medium.com/@yaroslavvb/fitting-larger-networks-into-memory-583e3c758ff9) this assumption no longer holds, and `cudaMalloc`/`cudaFree` calls become common and frequent. Furthermore, in data-parallel contexts these calls happen at nearly the same time from all GPUs, worsening lock contention. A common solution is a custom allocator; in fact, NVIDIA provides one out of the box (CUB), which Caffe2 already supports. Unfortunately, the CUB allocator suffers from very high fragmentation, primarily because it is a "buddy" allocator that neither splits nor merges free cached blocks (study https://github.com/NVlabs/cub/blob/1.8.0/cub/util_allocator.cuh#L357 if you want to convince yourself). This diff adapts a caching allocator from the Torch codebase (https://github.com/torch/cutorch/blob/master/lib/THC/THCCachingAllocator.cpp), which does split and merge free blocks and ends up working really well, at least for workloads like the checkpoint-and-recompute computation models noted above. I simplified the implementation a little and made it a bit more C++-like. I also removed a bunch of stream synchronization primitives for this diff; I plan to add them back in subsequent diffs. A minimal sketch of the caching idea appears after this list.
* Report reader progress in fblearner workflows. Integrate with the fblearner progress-reporting API and add support for reporting training progress from reader nodes. If the reader is constructed with batch limits, report based on finished batches vs. total batches; the finished count may exceed the total, because we evaluate whether to stop processing every time we dequeue a split. If the reader has no limit, report based on finished splits (Hive files) vs. total splits, which is fairly accurate.
* [GanH][Diagnose]: Fix plotting. 1. GanH diagnose needs to set plot options. 2. The modifier's blob name is used for the metric field and needs to be fixed before generating the net.
* Automatic update of fbcode/onnx to 985af3f5a0f7e7d29bc0ee6b13047e7ead9c90c8.
* Make CompositeReader stop as soon as one reader finishes. Previously, CompositeReader called all readers before stopping. This resulted in a flaky test, since the last batch may be read by different threads, leading to dropped data.
* [dper] Make sure loss is not NaN, as described.
* [rosetta2] [mobile-vision] Option to export NHWC order for RoIWarp/RoIAlign. Thanks for finding this, @stzpz and @wangyanghan. NHWC looks more optimized. For OCR it doesn't help yet, since NHWC uses more memory bandwidth, but it will soon become important.
* Intra-op parallel FC operator.
* [C2 Proto] Extra info in device option: pass extra information in the device option. Design doc: https://fb.quip.com/yAiuAXkRXZGx
* Unregister MKL fallbacks for NCHW conversions.
* Tracing for more executors: modified Tracer to work with other executors and added more tracing.
* Remove ShiftActivationDevices().
* Check for a blob entry iff it is present: when processing placeholder ops, ignore the blob if it is not present in blob_to_device.
* Internalize use of Eigen tensor: move the use of Eigen tensor out of the header file so we don't get template partial-specialization errors when building other libraries.
* Feature importance for transformed features.
* Fix unused-parameter warnings: this diff comments out unused parameters, which will allow us to enable -Wunused-parameter as an error. #accept2ship
* Add opencv dependencies to caffe2: the video input op requires additional opencv packages; add them to cmake so it can build.
* Add clip_by_value option in gradient clipping: when the value is bigger than max or smaller than min, clip it to the bound.
* std::round compat.
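The caching idea described in the allocator item above can be sketched in a few lines. This is a hypothetical, simplified illustration, not the actual Caffe2/THC implementation: the class name, the best-fit policy, and the integer-address model are all assumptions, and block merging is omitted.

```python
import bisect

class CachingAllocator:
    """Toy caching allocator: reuse freed blocks instead of calling the
    device allocator. Splitting is shown; merging is omitted for brevity."""

    def __init__(self, raw_alloc, raw_free):
        self.raw_alloc = raw_alloc   # e.g. a cudaMalloc-like callable
        self.raw_free = raw_free     # e.g. a cudaFree-like callable
        self.free_blocks = []        # sorted list of (size, addr)

    def allocate(self, size):
        # Best fit: the smallest cached block that is large enough.
        i = bisect.bisect_left(self.free_blocks, (size,))
        if i < len(self.free_blocks):
            block_size, addr = self.free_blocks.pop(i)
            if block_size > size:
                # Split: cache the unused tail as its own free block.
                bisect.insort(self.free_blocks, (block_size - size, addr + size))
            return addr
        # Cache miss: pay for a real (slow, lock-taking) device allocation.
        return self.raw_alloc(size)

    def free(self, addr, size):
        # Do not return memory to the device; cache it for reuse instead.
        bisect.insort(self.free_blocks, (size, addr))

    def empty_cache(self):
        # Hand all cached blocks back to the device allocator.
        for _size, addr in self.free_blocks:
            self.raw_free(addr)
        self.free_blocks = []
```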
Commit: | 9e71de3 | |
---|---|---|
Author: | Dmytro Dzhulgakov | |
Committer: | Dmytro Dzhulgakov |
[core] Graph-level NUMA awareness in Caffe2 Adding NUMA awareness through numa_node_id in DeviceOption. Blobs of operators with a numa_node_id are allocated on the corresponding memory banks, using CPU pools with NUMA affinity set to run the operators.
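A minimal sketch of how the new field is set from Python, assuming the bindings generated from caffe2.proto (`caffe2_pb2`):

```python
from caffe2.proto import caffe2_pb2

# Ops carrying this DeviceOption have their blobs allocated on the memory
# bank of NUMA node 0 and run on a CPU pool pinned to that node.
opt = caffe2_pb2.DeviceOption()
opt.device_type = caffe2_pb2.CPU
opt.numa_node_id = 0
```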
Commit: | 4e5df5c | |
---|---|---|
Author: | Bram Wasti | |
Committer: | Bram Wasti |
added debug info to OperatorDef
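For illustration, a short sketch of the new field, assuming it is named `debug_info` as in current caffe2.proto; the contents shown are hypothetical:

```python
from caffe2.proto import caffe2_pb2

# Attach a human-readable provenance note to an op for debugging.
op = caffe2_pb2.OperatorDef(type="Add", input=["A", "B"], output=["C"])
op.debug_info = "created_by=trainer.py:42"  # hypothetical contents
```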
Commit: | 4847f8c | |
---|---|---|
Author: | Yan Shang | |
Committer: | Facebook Github Bot |
Remove unused field in tensor proto Summary: This new field is not needed anymore, so this diff removes it Reviewed By: kennyhorror Differential Revision: D6316744 fbshipit-source-id: f8afc1c42a0592fd03c7939f8e6f78afc8510ec9
Commit: | fc8532c | |
---|---|---|
Author: | Yangqing Jia | |
Committer: | Facebook Github Bot |
Allow serialization of custom types inside Tensor Summary: The use case is that sometimes we need a Tensor of a custom type instead of POD or string. This diff allows one to delegate to BlobSerializerBase to further serialize the contents inside the Tensor. Design choices: (1) Each element is serialized as a BlobProto string and stored in the repeated string field. (2) UNDEFINED is used as the enum value for the tensor data type, and the exact type string is stored in an additional field. (3) BlobSerializer is called on each item to obtain the serialized string. (4) This requires the custom type to have a copy constructor; otherwise it will simply not be possible to copy over the deserialized content without an explicit type. See blob_test.cc for an example. Reviewed By: sunnieshang Differential Revision: D6300196 fbshipit-source-id: 18bf94a22a07337e0fa83d3f1004b3651e38cf27
Commit: | d2e94d0 | |
---|---|---|
Author: | Bram Wasti | |
Committer: | Facebook Github Bot |
change device enums to be contiguous Summary: quick change Reviewed By: ajtulloch Differential Revision: D5976025 fbshipit-source-id: a5a1538a380edb7c3b0af76e74c2ccee09ecb928
Commit: | 49396c6 | |
---|---|---|
Author: | Bram Wasti | |
Committer: | Facebook Github Bot |
add openglv2 to experimental Summary: only changes needing review are in proto_utils.cc and caffe2.proto Reviewed By: jerryzh168 Differential Revision: D5956743 fbshipit-source-id: e03fffaf5bc8413f2320c20a89a421f1a69b2870
Commit: | 68f3584 | |
---|---|---|
Author: | Alisson Gusatti Azzolini | |
Committer: | Facebook Github Bot |
Add node_name to DeviceOption Summary: Allow for generalizing net transforms. Reviewed By: Yangqing Differential Revision: D5812140 fbshipit-source-id: e3f30acad362ae1f0614ee218d331b525710b88e
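A minimal sketch of the field in use, assuming the `caffe2_pb2` bindings; the node name shown is hypothetical:

```python
from caffe2.proto import caffe2_pb2

# Tag an op with a logical node name so net transforms can group ops
# by the node they belong to.
net = caffe2_pb2.NetDef(name="example")
op = net.op.add(type="Relu", input=["X"], output=["Y"])
op.device_option.node_name = "node:0"  # hypothetical logical node name
```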
Commit: | e33dfe9 | |
---|---|---|
Author: | Ilia Cherniavskii | |
Committer: | Facebook Github Bot |
Update proto definition Summary: Update Argument's definition to allow direct passing of NetDef Reviewed By: azzolini Differential Revision: D5681837 fbshipit-source-id: e6c618bff051f9bbc56075c796aeba0094fa97dd
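A sketch of what direct NetDef passing enables, assuming the NetDef-typed field on Argument is named `n` (as in current caffe2.proto); the consuming op type here is hypothetical:

```python
from caffe2.proto import caffe2_pb2

# Embed a subnet directly in an op argument, e.g. as a control-flow body,
# instead of referencing it by name.
body = caffe2_pb2.NetDef(name="loop_body")
op = caffe2_pb2.OperatorDef(type="While")  # hypothetical consumer op
arg = op.arg.add(name="body")
arg.n.CopyFrom(body)
```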
Commit: | 65112f3 | |
---|---|---|
Author: | Yangqing Jia | |
Committer: | Facebook Github Bot |
code cleanup: separate the several net implementations to separate files. Summary: TSIA. Reviewed By: harouwu Differential Revision: D5670906 fbshipit-source-id: 507e789978144341bf696fb20dc11f3c2d55493b
Commit: | 5d24a4e | |
---|---|---|
Author: | Yangqing Jia | |
Committer: | Facebook Github Bot |
Early design for a general Event abstraction across devices. Summary: There have been ad-hoc efforts to avoid excessive device synchronizations, such as async_dag, singlethread_async, etc. This diff aims to provide an early design for a general Event class that achieves the following: (1) It is device-agnostic, essentially using a vtable to do cross-device record, wait and synchronization. (2) It creates new functions WaitEvent and Record in the Context class for interacting with Events. (3) It exposes the corresponding WaitEvent and Record functions in the OperatorBase class as well. An example use case: after potential future refactoring, one can achieve truly async execution per operator by running op.WaitEvent(previous_event); op.RunAsync(); op.RecordEvent(this_op_event); and the next op can do next_op.WaitEvent(this_op_event); Right now, I changed the async_dag net implementation so that it uses the general event design. The old Event class is assimilated into the general Event class, and the old Stream class is now essentially taken over by the Context class itself. Reviewed By: harouwu Differential Revision: D5648463 fbshipit-source-id: 58bd84d06e4a9977b0b835110ddb2f18be3b7cbc
Commit: | 14950a9 | |
---|---|---|
Author: | Lei Chen | |
Committer: | Facebook Github Bot |
Support session in distributed realtime trainer Summary: Convert from PlanDef ProtoBuf into python Plan object by recursively creating Nets and ExecutionSteps. Also support running Plan object directly in Session. Reviewed By: azzolini Differential Revision: D5608393 fbshipit-source-id: c0ae3b6da743a759af6db3b614a5a3935fe0b34c
Commit: | 7df8598 | |
---|---|---|
Author: | Dmitrii Podoprikhin | |
Committer: | Facebook Github Bot |
Added functionality that allows users to store huge blobs Summary: Added functionality that allows users to store huge blobs of any type, not only Tensors. The blob has to be divided into chunks in the same way as a Tensor blob. Reviewed By: kennyhorror Differential Revision: D5432762 fbshipit-source-id: c171faacd99d209bfae6f9707ebde7c4e23ba3b9
Commit: | 7d48274 | |
---|---|---|
Author: | Alisson Gusatti Azzolini | |
Committer: | Facebook Github Bot |
Allow tasks/execution_steps to be cloned at runtime Summary: Advantages of cloning the tasks/execution_steps at runtime:
- Less complexity on the python side: no need to clone nets and add prefixes to blob names.
- Faster start-up: we had cases of complex plans that took up to 30 min to be created.
- Better isolation: each task cloned at runtime has its own child workspace, preventing false sharing of blobs.
- Opens up the possibility of dynamic scheduling: the number of threads per task can be increased on the fly, at runtime.
Reviewed By: dzhulgakov Differential Revision: D5100730 fbshipit-source-id: 71b83193b135da4e6eaf2536d8fc266528e1fdcc
Commit: | 83e6a0b | |
---|---|---|
Author: | Alexander Sidorov | |
Committer: | Facebook Github Bot |
Revert uuid change to OperatorDef protobuf Summary: A few issues: 1. Randomization hurts memoization. 2. Even if we make it non-random, we can get key collisions when loading it back. 3. RNNs use prototxt for the step net, and apparently it's not forward-compatible like normal protobuf is. I am thinking of a better, less invasive solution now. Reviewed By: jamesr66a Differential Revision: D5272118 fbshipit-source-id: ab577fad04fbfc632e1fceffa923377a0d3da1be
Commit: | 2ec294a | |
---|---|---|
Author: | haracejacob | |
Committer: | Facebook Github Bot |
Fix a few typos and grammar issues in comments Summary: Fix a few typos and grammar issues in comments using language-check, a Python spell-checking library. The spell-checker source code is here: https://github.com/17-1-SKKU-OSS/011A/blob/master/spell_checker/spell_checker.py and here is the text file that indicates what should be fixed: https://github.com/17-1-SKKU-OSS/011A/tree/master/spell_checker/fix/caffe2 Closes https://github.com/caffe2/caffe2/pull/719 Differential Revision: D5165118 Pulled By: aaronmarkham fbshipit-source-id: 7fb8ef7a99d03cd5fd2f9ebdb01b9865e90fc37b
Commit: | eebda50 | |
---|---|---|
Author: | Alexander Sidorov | |
Committer: | Facebook Github Bot |
Operator python traceback Summary: This shows a Caffe2 Python user where a failed operator was created. The motivation for keeping this information out of the protobuf itself is to avoid making it too verbose and to preserve the ability to read a net's protobufs after a simple print() call. Reviewed By: jamesr66a Differential Revision: D5226047 fbshipit-source-id: 7edfe850e05a2ec209577142aa3368664a57a108
Commit: | 3b0069a | |
---|---|---|
Author: | Mohammad Hossain | |
Committer: | Yangqing Jia |
Expose operators execution statistics to python frontend. Summary: To expose operator execution statistics in Python, the profiling measurements collected in the ProfDAGNet class are leveraged. In the current implementation, a new operator is defined that outputs the statistics in a protobuf message. In the frontend, OperatorStatsContainer works as a wrapper to print ProfDAGNet statistics. Differential Revision: D4923009 fbshipit-source-id: 18a6d76a405ef277a3fca7a312609051cf943207
Commit: | 39fa092 | |
---|---|---|
Author: | Fei Sun | |
Committer: | Facebook Github Bot |
Constant string is generated from Protobuf instead of Thrift Summary: To make the predictor open source, move the constants that are generated from Thrift to Protobuf. Reviewed By: salexspb Differential Revision: D4656884 fbshipit-source-id: d4dbb3416e8396185e0981fcd9a090fbb054a18a
Commit: | 834142b | |
---|---|---|
Author: | Fei Sun | |
Committer: | Facebook Github Bot |
Change the predictor to use Protobuf Reviewed By: salexspb Differential Revision: D4644798 fbshipit-source-id: 0cf96dfc9061f87978a57d2fedcfe4a0bb012405
Commit: | 41a3ec2 | |
---|---|---|
Author: | Bram Wasti | |
Committer: | Facebook Github Bot |
QTensor serialization/deserialization Summary: Added protobuf style serialization/deserialization w/o chunking for qtensors Reviewed By: salexspb Differential Revision: D4622677 fbshipit-source-id: 1f845ad773a61b7ae2c362ec31d8de04e4217f68
Commit: | 8dff5a8 | |
---|---|---|
Author: | Fei Sun | |
Committer: | Facebook Github Bot |
Change the type of content in BlobProto from string to bytes Summary: We are converting MetaNetDef from Thrift to protobuf. The protobuf uses binary encoding, and bytes is a superset of string, so change the field to bytes so that no warning is generated when compiling caffe2. Reviewed By: Yangqing Differential Revision: D4635581 fbshipit-source-id: 916b799e1fb9466658e1dd198bfb5c6928f22488
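A small sketch of the practical difference from Python, assuming the `caffe2_pb2` bindings: a bytes field accepts arbitrary binary payloads that a text-typed string field would not handle cleanly.

```python
from caffe2.proto import caffe2_pb2

# `content` is bytes, so non-UTF-8 binary payloads round-trip cleanly.
blob = caffe2_pb2.BlobProto(name="my_blob", type="std::string")
blob.content = b"\x00\xffbinary-safe payload"
```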
Commit: | 0293790 | |
---|---|---|
Author: | Aapo Kyrola | |
Committer: | Facebook Github Bot |
add inference for gradient ops + a couple of missing shape inference functions + fix to scalars Summary: A bit too much stuff in one diff, sorry: 1. Add inference for gradient types by using the fact that x_grad is the gradient of x and must be of the same shape. Using string matching for this is somewhat awkward, but in addition I rely on the operator actually being a gradient op. 2. dzhulgakov was right: scalar shape is () and not (1). Sorry, my earlier claim was #fakenews. 3. Added inference functions for MakeTwoClass, MomentumSGDUpdate and cross-entropy ops. Reviewed By: dzhulgakov Differential Revision: D4569758 fbshipit-source-id: 0db13f33819777fdddefe21d4b1ebf906fcaf98c
Commit: | 8fa156d | |
---|---|---|
Author: | Alisson Gusatti Azzolini | |
Committer: | Facebook Github Bot |
Improve "reporter net" design Summary: Previously we had several limitations for a reporter net: - needed to be a net, not an execution step - only one allowed per execution step, with a single interval Now, "reporter nets" become repoter steps and multiple of them can be specified with different timeouts. Reviewed By: dzhulgakov Differential Revision: D4583686 fbshipit-source-id: ad7266e16f96e7829fd24dcc1f165f39e9db573d
Commit: | dcefc74 | |
---|---|---|
Author: | Aapo Kyrola | |
Committer: | Facebook Github Bot |
Shape and Type Inference Part1 Summary: This is a bit of a large diff, sorry about it. It includes basic shape and type inference functionality, based on YQ's Schema scaffolding. I added some helper functions to make it easier to write simple translations. Bigger refactoring was needed for ConvPoolBase so that we could use the shape inference already present in the schema. I annotated enough operators to be able to infer forward-pass shapes for a basic convnet, and added a test for that. I intend to bootcamp some annotations and annotate enough to handle Resnets fully. Need to think about gradients and whether they could be annotated in an easier way. Only shapes are exposed to Python for now; types will follow later. Also, the inference is not yet called anywhere but the unit test, and I am not sure everything is in the best location in the code, but it shouldn't be hard to move stuff around. Reviewed By: dzhulgakov Differential Revision: D4436818 fbshipit-source-id: eebee5937ccc9ac09c245465302388a1fae6933c
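As a rough illustration of the inference exposed to Python, a minimal sketch assuming the frontend entry point is `workspace.InferShapesAndTypes`, as in later Caffe2 (types came later, per the commit message):

```python
import numpy as np
from caffe2.python import core, workspace

# Feed a concrete input so inference has a known starting shape.
workspace.FeedBlob("X", np.random.rand(8, 3, 32, 32).astype(np.float32))

net = core.Net("shape_demo")
net.Relu("X", "Y")

# Returns dicts mapping blob name -> inferred shape / type.
shapes, types = workspace.InferShapesAndTypes([net])
print(shapes["Y"])  # [8, 3, 32, 32] -- Relu preserves its input's shape
```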
Commit: | 2790043 | |
---|---|---|
Author: | Yangqing Jia | |
Committer: | Bram Wasti |
Add the MKLDNN type to the tensor type strings and added proper docs. Summary: TSIA Reviewed By: dzhulgakov Differential Revision: D4217541 fbshipit-source-id: f68d1aba9c20af0fb0aed2cc1b2099961f6fa7a4
Commit: | 5893989 | |
---|---|---|
Author: | Yangqing Jia |
fbsync at f5a877
Commit: | 238ceab | |
---|---|---|
Author: | Yangqing Jia | |
Committer: | Yangqing Jia |
fbsync. TODO: check if build files need update.
Commit: | c15e45c | |
---|---|---|
Author: | Yangqing Jia |
chunky sync again
Commit: | 3c98934 | |
---|---|---|
Author: | Yangqing Jia |
caffe translator with legacy pooling support added back
Commit: | bcea409 | |
---|---|---|
Author: | Yangqing Jia | |
Committer: | Yangqing Jia |
sync
Commit: | f01f206 | |
---|---|---|
Author: | Yangqing Jia |
bring up caffe.proto to master
Commit: | 09bed67 | |
---|---|---|
Author: | Yangqing Jia |
add untracked files
Commit: | 6463eeb | |
---|---|---|
Author: | Yangqing Jia |
chunky sync - build scripts to be written
Commit: | 559053d | |
---|---|---|
Author: | Yangqing Jia |
chunky sync