The Protocol Buffers files changed in the following 47 commits:
Commit: | 74d414b | |
---|---|---|
Author: | Lalith Suresh | |
Committer: | GitHub |
Revert "Simple anti-entropy mechanism (#24)" (#28)

This reverts commit 0a1788f671ef6690b5bd3d97951578ede9cf693a.
The documentation is generated from this commit.
Commit: | 51bedce | |
---|---|---|
Author: | Lalith Suresh | |
Committer: | GitHub |
Revert "Simple anti-entropy mechanism (#24)"

This reverts commit 0a1788f671ef6690b5bd3d97951578ede9cf693a.
The documentation is generated from this commit.
Commit: | 0a1788f | |
---|---|---|
Author: | Manuel Bernhardt | |
Committer: | GitHub |
Simple anti-entropy mechanism (#24)

* Anti-entropy mechanism

  It can happen that a node misses a part of the consensus messages whilst still being able to send out its own vote (unidirectional network partition, message overload, ...). In this case, the rest of the group will see this node as being part of the group and the monitoring mechanism will still be working as expected, but the stale node will run an old configuration.

  In order to enforce consistency in this case, the following new anti-entropy mechanism is used:

  - each node maintains a set of configurations it has been part of
  - probe messages now contain the configuration ID of the observer
  - when a node receives a probe message with a configuration ID it does not know, it will start a background task to check again after a configured timeout (1 minute by default)
  - if the configuration ID is still unknown after the timeout has been reached, the node leaves (using the LEAVE protocol)

* Allows a node to catch up if it misses a consensus round
* Removing leave when out-of-sync strategy, improve implementation
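The bookkeeping described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the class `AntiEntropyTracker` and its method names are hypothetical, and the reaction to a still-unknown configuration is left to the caller, since the message lists both a leave strategy and its later removal.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; names are illustrative and not taken from the commit. */
final class AntiEntropyTracker {
    // Configuration IDs this node has been part of.
    private final Set<Long> knownConfigurations = ConcurrentHashMap.newKeySet();
    // Unknown configuration IDs seen in probes that still await their re-check.
    private final Set<Long> pendingChecks = ConcurrentHashMap.newKeySet();

    /** Called whenever this node installs a new configuration. */
    void recordConfiguration(long configurationId) {
        knownConfigurations.add(configurationId);
        pendingChecks.remove(configurationId);
    }

    /**
     * Called for every probe message, with the observer's configuration ID.
     * Returns true if the caller should schedule a delayed re-check for this ID
     * (the ID is unknown and no re-check is pending for it yet).
     */
    boolean onProbe(long observerConfigurationId) {
        if (knownConfigurations.contains(observerConfigurationId)) {
            return false;
        }
        return pendingChecks.add(observerConfigurationId);
    }

    /**
     * Called by the delayed re-check once the timeout has elapsed.
     * Returns true if the ID is still unknown, i.e. the node is out of sync.
     */
    boolean isStillUnknown(long configurationId) {
        pendingChecks.remove(configurationId);
        return !knownConfigurations.contains(configurationId);
    }
}
```

A caller could run the re-check with a `java.util.concurrent.ScheduledExecutorService`, scheduling `isStillUnknown` one minute after `onProbe` returns true, and then decide whether to leave or to catch up.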
Commit: | 941aeb3 | |
---|---|---|
Author: | Manuel Bernhardt | |
Committer: | GitHub |
Endpoint performance and memory pressure optimization (#19)

This is a bit of a controversial change from the viewpoint of the API, yet makes a lot of sense from the viewpoint of performance and memory utilization for very large clusters.

The issue here is that the hostname of an Endpoint is modelled as a protobuf "string" type. This type carries with it the overhead of encoding to or decoding from UTF-8 every time a message is sent or received (and the field accessed). From the viewpoint of the algorithms in place, there's no added value in having the endpoint host data be encoded as a byte array or as a UTF-8 encoded string. It is just data; what matters is that the ordering of the endpoints can be established. Having a string only matters at the interfaces: when configuring a hostname, when sending a message to one and when printing log statements (most of which are at DEBUG/TRACE level).

Yet at the moment, when adding a new endpoint to the membership ring(s), the following code runs:

```
public java.lang.String getHostname() {
  java.lang.Object ref = hostname_;
  if (ref instanceof java.lang.String) {
    // Already decoded: return the cached String.
    return (java.lang.String) ref;
  } else {
    // Still raw bytes: decode from UTF-8 and cache the result.
    com.google.protobuf.ByteString bs = (com.google.protobuf.ByteString) ref;
    java.lang.String s = bs.toStringUtf8();
    hostname_ = s;
    return s;
  }
}
```

For freshly received messages containing Endpoints, this means running `toStringUtf8()`, which, when there are many, is quite expensive in terms of CPU and memory usage.

This PR does the following:

- use `bytes` rather than `string` to encode the hostname in protobuf
- adjust all interfaces; the Cluster APIs are actually (almost) not affected since they use the `HostAndPort` construct
- use the existing underlying byte array when computing the hashcode of an Endpoint in `Utils.AddressComparator`
- getting rid of the mapping between `Map<String, Metadata>` and `Map<Endpoint, Metadata>` by representing the map as two lists in protobuf (keys and values)

On local tests with 1000 concurrent nodes joining, there's a 10% improvement in memory allocation and a 20% improvement in CPU usage of the stack starting at the `TreeSet.add` method (39% vs 58%).
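The idea of ordering endpoints on raw bytes, decoding to a string only at the interfaces, can be illustrated with the sketch below. It is an assumption-laden simplification: `RawEndpoint` and `RawEndpointOrder` are hypothetical names, and a plain lexicographic byte comparison stands in for the hash-based `Utils.AddressComparator` mentioned in the commit message.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

/** Hypothetical stand-in for an endpoint whose hostname is kept as raw bytes. */
final class RawEndpoint {
    final byte[] hostname; // raw bytes as received; never decoded for ordering
    final int port;

    RawEndpoint(byte[] hostname, int port) {
        this.hostname = hostname.clone();
        this.port = port;
    }

    // Encoding happens only at the interface, e.g. when configuring a hostname.
    static RawEndpoint of(String host, int port) {
        return new RawEndpoint(host.getBytes(StandardCharsets.UTF_8), port);
    }

    // Decoding happens only on demand, e.g. for log statements.
    String hostnameAsString() {
        return new String(hostname, StandardCharsets.UTF_8);
    }
}

final class RawEndpointOrder {
    // Orders endpoints by unsigned lexicographic byte comparison, then by port.
    // Assumption: any stable total order is enough for the ring; the real
    // comparator in the project is hash-based.
    static final Comparator<RawEndpoint> BY_BYTES = (a, b) -> {
        int c = Arrays.compareUnsigned(a.hostname, b.hostname);
        return c != 0 ? c : Integer.compare(a.port, b.port);
    };

    static TreeSet<RawEndpoint> newRing() {
        return new TreeSet<>(BY_BYTES);
    }
}
```

With this shape, `TreeSet.add` compares byte arrays directly, so no UTF-8 decoding happens on the insertion path. `Arrays.compareUnsigned` requires Java 9 or later.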
The documentation is generated from this commit.
Commit: | b04d666 | |
---|---|---|
Author: | Manuel Bernhardt | |
Committer: | GitHub |
Proactively informing observers when shutting down (#15)

Rather than waiting for edge failure detection to kick in when a cluster has been shut down, this change proactively informs the observers of a node with a new Leaving message. In turn, the observers then trigger edge failure alerting immediately.

Adds Cluster.leaveGracefully() and Cluster.shutdown() APIs for graceful and forced shutdowns respectively. Accessing membership state after either of these APIs is invoked is illegal and will result in a thrown exception.

* Proactively informs observer nodes that the node is leaving when the cluster is shut down
* Leave notifications delivered in parallel, call to leave() protected by try/finally
* Fixing parallel leave message sending
  - tolerating failure in delivering the messages, i.e. not cancelling other notifications
  - adjusting test intervals in order to reach agreement faster
* Throw exceptions when trying to access membership state after shutting down
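The behaviour listed above (notifications sent without waiting on each other, delivery failures tolerated, state access rejected after shutdown) could look roughly like the following sketch. Everything here is hypothetical: `GracefulLeaveSketch`, its `sendLeave` parameter and the string-typed observers are illustrative and do not reflect the actual `Cluster` implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;

/** Hypothetical sketch of a graceful leave; not the project's Cluster class. */
final class GracefulLeaveSketch {
    private final AtomicBoolean shutDown = new AtomicBoolean(false);
    private final List<String> members;

    GracefulLeaveSketch(List<String> members) {
        this.members = List.copyOf(members);
    }

    /** Membership state is off limits once the node has left or shut down. */
    List<String> getMembers() {
        if (shutDown.get()) {
            throw new IllegalStateException("membership state accessed after shutdown");
        }
        return members;
    }

    /**
     * Sends a leave notification to every observer and returns how many were
     * delivered. All sends are started before any result is awaited, and a
     * failed delivery does not cancel the others.
     */
    int leaveGracefully(List<String> observers,
                        Function<String, CompletableFuture<Void>> sendLeave) {
        try {
            List<CompletableFuture<Boolean>> results = new ArrayList<>();
            for (String observer : observers) {
                // handle() maps both success and failure to a plain boolean.
                CompletableFuture<Boolean> delivered =
                        sendLeave.apply(observer).handle((ok, err) -> err == null);
                results.add(delivered);
            }
            int count = 0;
            for (CompletableFuture<Boolean> result : results) {
                if (result.join()) {
                    count++;
                }
            }
            return count;
        } finally {
            // Mark the node as shut down even if notification throws.
            shutDown.set(true);
        }
    }
}
```

Whether the sends truly overlap depends on `sendLeave` returning asynchronous futures; the sketch only guarantees that no send waits for an earlier one to complete.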
Commit: | bce1e27 | |
---|---|---|
Author: | Lalith Suresh | |
Committer: | GitHub |
Terminology edits (#11)

* Rename APIs to match observer -> subject terminology
* WatermarkBuffer -> almost-everywhere agreement filter
* Rename monitoring links -> monitoring edges
* Use cut detection terminology
Commit: | 11f4b73 | |
---|---|---|
Author: | lalithsuresh |
Endpoint is now tagged with metadata
Commit: | bfed8d4 | |
---|---|---|
Author: | lalithsuresh | |
Committer: | lalithsuresh |
Endpoint protobuf type now represents each node to avoid back-and-forth conversions between strings and Guava HostAndPort
Commit: | 80738c9 | |
---|---|---|
Author: | lalithsuresh | |
Committer: | lalithsuresh |
Add Classic Paxos implementation for recovering from Fast Paxos conflicts
Commit: | e452ed0 | |
---|---|---|
Author: | lalithsuresh | |
Committer: | lalithsuresh |
Refactor messaging interfaces to decouple Rapid from the messaging implementation
Commit: | 5271b37 | |
---|---|---|
Author: | lalithsuresh |
Cleanup interface boundaries for messaging
Commit: | c119ef4 | |
---|---|---|
Author: | lalithsuresh |
Netty tests
Commit: | 0045ce4 | |
---|---|---|
Author: | lalithsuresh |
Metadata values are now ByteStrings
Commit: | 80e4acd | |
---|---|---|
Author: | lalithsuresh | |
Committer: | lalithsuresh |
Remove back-and-forth conversions for UUIDs
Commit: | 67c9bdf | |
---|---|---|
Author: | lalithsuresh | |
Committer: | lalithsuresh |
Use best effort broadcast and re-organize executor usage.
Commit: | afd9b50 | |
---|---|---|
Author: | lalithsuresh |
Avoid creating redundant copies of link-update-messages
Commit: | 538e894 | |
---|---|---|
Author: | lalithsuresh | |
Committer: | lalithsuresh |
Changes to the metadata API to avoid sending strings around
Commit: | e6e4b81 | |
---|---|---|
Author: | lalithsuresh |
Batch join-messages for multiple rings that are directed to the same monitor
Commit: | 252b907 | |
---|---|---|
Author: | lalithsuresh | |
Committer: | lalithsuresh |
Support informing ProbeMessage-based failure detectors about whether a monitoree is bootstrapping
Commit: | 5184c26 | |
---|---|---|
Author: | lalithsuresh | |
Committer: | lalithsuresh |
Supply executors to prevent grpc's usage of a cachedThreadPool
Commit: | 856c3e4 | |
---|---|---|
Author: | lalithsuresh |
Revert changes to receiving join-confirmations
Commit: | 4cc8f8c | |
---|---|---|
Author: | lalithsuresh |
Refactor join protocol to be retry friendly
Commit: | 7f89246 | |
---|---|---|
Author: | lalithsuresh |
Metadata manager now maintains a set of key-value pairs per-node
Commit: | 5b7f43b | |
---|---|---|
Author: | lalithsuresh |
Cluster can now track metadata per-node. Confined to features like roles for now.
Commit: | d4a1c07 | |
---|---|---|
Author: | lalithsuresh |
Refactor repository into a parent project with modules
Commit: | 3798363 | |
---|---|---|
Author: | lalithsuresh | |
Committer: | lalithsuresh |
Consensus implementation
Commit: | cf1f4e7 | |
---|---|---|
Author: | lalithsuresh |
Implement monitoring support
Commit: | 9c28c32 | |
---|---|---|
Author: | lalithsuresh |
Avoid proposal logging by default + nits
Commit: | 4423070 | |
---|---|---|
Author: | lalithsuresh |
Cleanup protobuf descriptions
Commit: | 72c122d | |
---|---|---|
Author: | lalithsuresh |
Cleanup protobuf descriptions
Commit: | 91d82a4 | |
---|---|---|
Author: | lalithsuresh | |
Committer: | lalithsuresh |
Refactor out redundant LinkUpdateMessage class. We only use the protobuf definition now.
Commit: | 5305e39 | |
---|---|---|
Author: | lalithsuresh |
Implement update batching
Commit: | fe196f2 | |
---|---|---|
Author: | lalithsuresh |
Use InProcessChannel for tests.
Commit: | af88219 | |
---|---|---|
Author: | lalithsuresh |
Join protocol works until a configuration change. Need to stream back configuration.
Commit: | 731faaa | |
---|---|---|
Author: | lalithsuresh |
Refactor code to accommodate changes to bootstrap procedure
Commit: | a13daf1 | |
---|---|---|
Author: | lalithsuresh |
Checkpoint before re-working MembershipView
Commit: | 1edb1e2 | |
---|---|---|
Author: | lalithsuresh |
Checkpoint before gossip implementation
Commit: | 9c9abbc | |
---|---|---|
Author: | lalithsuresh |
Checkpoint before async implementation
Commit: | 00a1c2d | |
---|---|---|
Author: | lalithsuresh |
Test bootstrap
Commit: | c87a041 | |
---|---|---|
Author: | lalithsuresh |
Improve tests and hashing stability
Commit: | da5dd04 | |
---|---|---|
Author: | lalithsuresh |
Part 1 of join protocol
Commit: | 2da809a | |
---|---|---|
Author: | lalithsuresh |
Prepare to implement join protocol
Commit: | 5b73d2c | |
---|---|---|
Author: | lalithsuresh |
Performance improvements to MembershipView
Commit: | d86641a | |
---|---|---|
Author: | lalithsuresh |
Introduce node-id maintenance
Commit: | b6c2c62 | |
---|---|---|
Author: | lalithsuresh |
First take at messaging tests with a simple broadcaster
Commit: | 13a4bd5 | |
---|---|---|
Author: | lalithsuresh |
Split protobuf generated definitions into multiple files
Commit: | 1cf205a | |
---|---|---|
Author: | lalithsuresh |
Transition to gRPC and remove checker framework