The Protocol Buffers files changed in these 14 commits:
Commit: | b5e3d97 | |
---|---|---|
Author: | TFRT team | |
Committer: | Copybara-Service |
clean up TFRT distributed_runtime PiperOrigin-RevId: 490026485
The documentation is generated from this commit.
Commit: | 079fe32 | |
---|---|---|
Author: | Haoyu Zhang | |
Committer: | Copybara-Service |
Support multi-client initialization.

The process typically involves 3 rounds of RPCs, with the first two rounds almost the same as in single-client initialization:

1. The leader task collects remote device info from all other tasks.
2. The leader task creates distributed contexts on all other tasks, and gets back remote ready chains from them.
3. The leader task broadcasts all remote chains to other tasks.

PiperOrigin-RevId: 353711143
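The three rounds described in this commit message can be pictured with a small in-process simulation. This is only an illustrative sketch, not TFRT code: `Task`, `initialize_multi_client`, the handler method names, and the string-valued "ready chains" are all hypothetical, and plain method calls stand in for RPCs. It also assumes the leader records its own devices and chain alongside the others, which the commit message does not state.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for one task; method calls stand in for RPCs."""
    name: str
    devices: list
    known_devices: dict = field(default_factory=dict)
    known_chains: dict = field(default_factory=dict)

    def get_device_info(self):
        # Round 1 handler: report this task's local devices.
        return list(self.devices)

    def create_context(self, all_devices):
        # Round 2 handler: store the cluster-wide device info and
        # hand back an opaque "ready chain" token for this task.
        self.known_devices = dict(all_devices)
        return f"ready_chain:{self.name}"

    def receive_chains(self, chains):
        # Round 3 handler: store the ready chains of every task.
        self.known_chains = dict(chains)


def initialize_multi_client(leader, others):
    # Round 1: the leader collects device info from all other tasks.
    devices = {leader.name: leader.get_device_info()}
    for task in others:
        devices[task.name] = task.get_device_info()

    # Round 2: the leader creates contexts on the other tasks and
    # gets back one ready chain from each of them.
    # Assumption: the leader also creates a local context for itself.
    chains = {leader.name: leader.create_context(devices)}
    for task in others:
        chains[task.name] = task.create_context(devices)

    # Round 3: the leader broadcasts all chains to the other tasks.
    leader.receive_chains(chains)
    for task in others:
        task.receive_chains(chains)
    return chains
```

After the third round every task holds the same view of devices and ready chains, which is the property the extra broadcast round is meant to provide in this sketch.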
Commit: | 592aa21 | |
---|---|---|
Author: | Ayush Dubey | |
Committer: | Copybara-Service |
Add remote op execution support. PiperOrigin-RevId: 347458002
Commit: | 2c51c89 | |
---|---|---|
Author: | Haoyu Zhang | |
Committer: | Copybara-Service |
Propagate remote device information in distributed initialization. PiperOrigin-RevId: 346924443
Commit: | 7c7b841 | |
---|---|---|
Author: | TFRT team | |
Committer: | Copybara-Service |
- Implement kernels for sending and receiving bytes. These kernels simply wrap the underlying fabric communicator.
- Implement serialization & deserialization of DHT.

PiperOrigin-RevId: 346331191
Commit: | c7a5316 | |
---|---|---|
Author: | Haoyu Zhang | |
Committer: | Copybara-Service |
Add garbage collection and keep-alive mechanisms to distributed TFRT.

Distributed contexts created by remote clients are subject to leaks if the remote clients are disconnected. To avoid this, we add a garbage collection mechanism so that the servers periodically check for stale contexts and delete them if they have been inactive for a certain period of time. The GC timeout is 600 seconds by default and is configurable in ServerContextConfiguration. The last access time for a distributed context is updated automatically by any RPC that carries the context_id.

The client also sends out KeepAlive messages periodically to make sure the remote contexts stay alive. The time interval is half of the GC timeout (i.e., by default, 300 seconds).

PiperOrigin-RevId: 345709415
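The interaction between the GC timeout and the keep-alive interval can be sketched as follows. This is a toy model under stated assumptions, not the TFRT implementation: `ContextRegistry`, `touch`, `collect_garbage`, and `keep_alive_interval` are hypothetical names, and the injectable `clock` parameter exists only so the behavior can be checked without waiting. Only the 600-second default and the "half of the GC timeout" rule come from the commit message.

```python
import time

DEFAULT_GC_TIMEOUT_SECS = 600.0  # default stated in the commit message


class ContextRegistry:
    """Hypothetical server-side table of distributed contexts."""

    def __init__(self, gc_timeout_secs=DEFAULT_GC_TIMEOUT_SECS,
                 clock=time.monotonic):
        self.gc_timeout_secs = gc_timeout_secs
        self._clock = clock
        self._last_access = {}  # context_id -> last access timestamp

    def create(self, context_id):
        self._last_access[context_id] = self._clock()

    def touch(self, context_id):
        # Any RPC carrying the context_id (including KeepAlive)
        # refreshes the last access time.
        if context_id not in self._last_access:
            raise KeyError(f"unknown context_id: {context_id}")
        self._last_access[context_id] = self._clock()

    def collect_garbage(self):
        # Delete contexts that have been inactive longer than the timeout
        # and return their ids.
        now = self._clock()
        stale = [cid for cid, last in self._last_access.items()
                 if now - last > self.gc_timeout_secs]
        for cid in stale:
            del self._last_access[cid]
        return stale

    def __contains__(self, context_id):
        return context_id in self._last_access


def keep_alive_interval(gc_timeout_secs=DEFAULT_GC_TIMEOUT_SECS):
    # The client pings at half the GC timeout.
    return gc_timeout_secs / 2
```

Pinging at half the timeout leaves room for one delayed or lost KeepAlive before a live context would be considered stale.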
Commit: | 1afb012 | |
---|---|---|
Author: | Bramandia Ramadhana | |
Committer: | Copybara-Service |
Adds support for remote chain handling:

- Introduces RemoteChainManager, which is a container for a single chain for every host.
- Introduces kernels that use RemoteChainManager to get/set the chain for a particular host.
- Introduces a test kernel that creates RemoteChainManager. In production, there will be a separate kernel that reads RemoteChainManager from ExecutionContext. That kernel will come in a later commit.
- Adds an InitializeContext method to RemoteClient. In this commit, it simply initializes the remote chain. There is a TODO to implement this more elaborately to initialize the context.
- DistributedContext now manages a set of ready chains. During initialization, it calls InitializeContext on all participants to retrieve the ready chains.
- RemoteChainManager is initialized with a set of ready chains.

PiperOrigin-RevId: 342895156
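The "one chain per host" container described above can be modeled in a few lines. This is a hypothetical Python analogue for illustration only: it borrows the name `RemoteChainManager` from the commit message, but the method names `get_chain` and `set_chain`, the string host keys, and the opaque chain values are assumptions and do not reflect the actual C++ interface.

```python
class RemoteChainManager:
    """Toy model: holds exactly one chain for every known host."""

    def __init__(self, ready_chains):
        # Initialized from the ready chains gathered during
        # context initialization (host -> chain).
        self._chains = dict(ready_chains)

    def get_chain(self, host):
        return self._chains[host]

    def set_chain(self, host, chain):
        # Replaces the single chain tracked for this host.
        # Assumption: unknown hosts are rejected rather than added.
        if host not in self._chains:
            raise KeyError(f"unknown host: {host}")
        self._chains[host] = chain
```

A caller would read the current chain for a host, sequence a remote operation after it, and then store the resulting chain back with `set_chain`.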
Commit: | d488a3c | |
---|---|---|
Author: | Bramandia Ramadhana | |
Committer: | Copybara-Service |
Change the field types of the RemoteObjectId proto to the correct types. This was missed in a previous commit. PiperOrigin-RevId: 342680061
Commit: | 6eff110 | |
---|---|---|
Author: | Haoyu Zhang | |
Committer: | Copybara-Service |
Support remote context initialization in distributed TFRT.

Add functions to create and close `DistributedContext`s on remote workers. These functions should be used by the "master" node at the beginning/end of distributed execution. If there are errors, the done callback will be invoked with an llvm::Error (one error) or ErrorCollection (multiple errors).

PiperOrigin-RevId: 341953241
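The error-reporting convention in this commit message (no error, a single error, or a collection of errors passed to a done callback) can be sketched as below. This is a hedged illustration, not the TFRT API: Python exceptions stand in for `llvm::Error`, the `ErrorCollection` class here is a hypothetical analogue of the one named above, and `create_remote_contexts` and `create_fn` are invented names. The real functions are asynchronous; this sketch runs synchronously for simplicity.

```python
class ErrorCollection(Exception):
    """Hypothetical analogue: bundles several errors into one value."""

    def __init__(self, errors):
        super().__init__(f"{len(errors)} errors")
        self.errors = list(errors)


def create_remote_contexts(workers, create_fn, done):
    # Try to create a context on each worker, collecting failures.
    errors = []
    for worker in workers:
        try:
            create_fn(worker)
        except Exception as err:
            errors.append(err)

    # Invoke the done callback exactly once with the aggregated result.
    if not errors:
        done(None)                     # success
    elif len(errors) == 1:
        done(errors[0])                # one error
    else:
        done(ErrorCollection(errors))  # multiple errors
```

Aggregating before invoking the callback means the caller sees every failed worker at once instead of only the first failure.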
Commit: | 8d5b096 | |
---|---|---|
Author: | Haoyu Zhang | |
Committer: | Copybara-Service |
Define cluster configuration as protos in preparation for remote initialization.

The main purposes of the proto definitions are:

* Better serialization/deserialization support, as these configurations will be used as part of remote distributed context initialization;
* Better cross-language support. In the future the API layers in Python should be able to directly construct these configurations for the runtime.

PiperOrigin-RevId: 341691961
Commit: | c996b81 | |
---|---|---|
Author: | TFRT team | |
Committer: | chuanhaozhuge |
Internal change PiperOrigin-RevId: 340714785
Commit: | bd1e47b | |
---|---|---|
Author: | Bramandia Ramadhana | |
Committer: | Copybara-Service |
Adds support for remote object deletion:

- Adds DeleteRemoteObjects method to RemoteObjectManager
- Adds DeleteRemoteObjectsAsync method to RemoteClient

PiperOrigin-RevId: 340279592
Commit: | 7bffa4e | |
---|---|---|
Author: | Haoyu Zhang | |
Committer: | Copybara-Service |
Modularize distributed TFRT. PiperOrigin-RevId: 339919426
Commit: | 1bef0ea | |
---|---|---|
Author: | TFRT team | |
Committer: | chuanhao |
Internal change. PiperOrigin-RevId: 308308798