The Protocol Buffers files changed in the following 39 commits:
Commit: | 96b4bd1 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Improve handling of command/cluster selector runtime errors

Motivation: allow a client or user to distinguish between two failure cases:
- The job is submitted with criteria that cannot be satisfied; the criteria must be updated for this job to be accepted.
- The selector encountered a runtime error; resubmitting the job with the same criteria may work.

Until now, both cases were handled the same way, leading to:
* Error `412: Precondition Failed` via API
* Fatal error resolving job request via Agent

The job would then be marked FAILED with status message JobStatusMessages.FAILED_TO_RESOLVE_JOB.

This change better distinguishes the two error conditions such that:
- Selector runtime error returns a `500` via API
- Selector runtime error produces a retryable error in the agent
- Selector runtime error sets a different message with the final job status
- Cluster selector runtime error does not proceed to the next selector (for better determinism)

This allows clients to retry, if appropriate.
The documentation is generated from this commit.
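The distinction drawn in commit 96b4bd1 can be sketched as a small mapping from failure kind to API status and retryability. This is only an illustration: `FailureKind`, `http_status_for`, and `is_retryable` are hypothetical names, not Genie classes; only the `412` and `500` status codes and the retry semantics come from the commit message.

```python
from enum import Enum


class FailureKind(Enum):
    """Hypothetical labels for the two failure cases in the commit message."""
    UNSATISFIABLE_CRITERIA = "criteria cannot be satisfied"
    SELECTOR_RUNTIME_ERROR = "selector runtime error"


def http_status_for(kind: FailureKind) -> int:
    # 412 (Precondition Failed) and 500 are the codes named in the commit message.
    if kind is FailureKind.UNSATISFIABLE_CRITERIA:
        return 412
    return 500


def is_retryable(kind: FailureKind) -> bool:
    # Resubmitting with the same criteria can only help when the selector itself failed.
    return kind is FailureKind.SELECTOR_RUNTIME_ERROR


if __name__ == "__main__":
    for kind in FailureKind:
        print(kind.name, http_status_for(kind), is_retryable(kind))
```

A client could use the retryable flag to decide between resubmitting unchanged and asking the user to revise the criteria.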
Commit: | 1815e05 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Bug: integer overflow for large files

Updates the gRPC protocol to allow downloading files that are over 2GB in size. The previous version of the protocol was (naively) using int32 for offsets, so anything beyond the 2GB mark would produce an integer overflow server-side.
- Rename the deprecated 32-bit integer fields; they will be ignored by current and future agents in favor of the new int64 fields
- Add a field so that an agent can inform the server it is recent and supports files/ranges over 2GB
- Block requests whose range is over the 2GB mark if and only if the target agent is old and does not support it
- Migrate from int to long where appropriate

Old behavior:
1) Files smaller than 2GB: no problem
2) Range below 2GB for a file larger than 2GB: no problem
3) Range above 2GB for a file larger than 2GB: error 500, integer overflow

New behavior:
1) Unchanged
2) Unchanged
3) (against old agent) Unchanged (but better error message)
3) (against new agent) Successful
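To illustrate why offsets beyond the 2GB mark fail in a signed 32-bit field, and how the server-side gate described in commit 1815e05 might look, here is a minimal sketch. The function and parameter names (`as_int32`, `can_serve_range`, `agent_supports_large_files`) are assumptions made for this example and do not appear in the protocol.

```python
INT32_MAX = 2**31 - 1  # largest offset a signed 32-bit field can hold


def as_int32(value: int) -> int:
    """Wrap an integer the way a signed 32-bit (two's complement) field would."""
    wrapped = value & 0xFFFFFFFF
    return wrapped - 2**32 if wrapped > INT32_MAX else wrapped


def can_serve_range(start: int, end: int, agent_supports_large_files: bool) -> bool:
    """Block a range past the 2GB mark only when the target agent is old."""
    if start < 0 or end < start:
        return False  # malformed range
    exceeds_int32 = end > INT32_MAX
    return agent_supports_large_files or not exceeds_int32


if __name__ == "__main__":
    three_gib = 3 * 1024**3
    print(as_int32(three_gib))                      # negative: the offset overflowed
    print(can_serve_range(0, three_gib, False))     # old agent: blocked
    print(can_serve_range(0, three_gib, True))      # new agent: allowed
```

Ranges that stay below the 2GB mark pass the check regardless of agent age, which matches cases 1 and 2 in the commit message.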
Commit: | 0a22e12 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Define and implement 'changeJobArchiveStatus' gRPC API
Commit: | 3804e73 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Define and implement agent gRPC service to retrieve job status
Commit: | 4115229 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Implement client and server for Agent configure endpoint

The server-side underlying service is still a NOOP and sends back an empty map of properties. The Agent still makes no use of the properties received.
Commit: | 194f9d5 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Define new gRPC service for agent runtime configuration

Create new 'configure' RPC method and Request/Response messages. For now, this is a stub: messages are empty and the service is a NOOP.
Commit: | d982a36 | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Add option to disable archiving from agent CLI
Commit: | 489448d | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Deprecate agent requested archive location prefix

This makes the `--archiveLocationPrefix` argument from the agent CLI a no-op.
Commit: | 0d51fac | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Add missing service description in genie.proto
Commit: | 92346d6 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Fix handling of failed job resolution in agent/server protocol

Bug overview: When the agent requested a job resolution with criteria that cannot be satisfied, the server did not handle the case properly and sent back a generic error (error code 'UNKNOWN'). On the agent side, this was surfaced to the user as a generic exception, i.e.:
`GenieRuntimeException Unhandled error: UNKNOWN: com.netflix.genie.common.exceptions.GeniePreconditionException:No cluster/command combination found for the given criteria. Unable to continue`

New behavior: When the specification resolution fails due to unsatisfiable criteria, the server sends a newly introduced error code: RESOLUTION_FAILED. This results in clearer handling of the error on the agent, and the user message becomes:
`JobSpecificationResolutionException: No cluster/command combination found for the given criteria`

Backward compatibility:
- Agent newer than server: no behavior change, the agent surfaces the error the same as before.
- Server newer than agent: no behavior change, the agent fails to recognize the new RESOLUTION_FAILED code and treats it as UNKNOWN, therefore executing the same codepath as before.
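The backward-compatibility argument in commit 92346d6 rests on an old agent falling back to UNKNOWN when it receives a code it does not know. The sketch below models that fallback. The enum classes, their numeric values, and the helper names are hypothetical; only the code names UNKNOWN and RESOLUTION_FAILED and the two user-facing message prefixes come from the commit message.

```python
from enum import Enum


class OldAgentErrorCode(Enum):
    """Codes an older agent knows about (numeric values are hypothetical)."""
    UNKNOWN = 0


class NewAgentErrorCode(Enum):
    """Codes a newer agent knows about (numeric values are hypothetical)."""
    UNKNOWN = 0
    RESOLUTION_FAILED = 1


def decode_error_code(known_codes, wire_value: int):
    """Map a value received on the wire to a known code, defaulting to UNKNOWN."""
    try:
        return known_codes(wire_value)
    except ValueError:
        # Unrecognized value: treat it as UNKNOWN, as an older agent would.
        return known_codes.UNKNOWN


def user_message(code, detail: str) -> str:
    """Build the message surfaced to the user for a failed resolution."""
    if code.name == "RESOLUTION_FAILED":
        return f"JobSpecificationResolutionException: {detail}"
    return f"GenieRuntimeException Unhandled error: UNKNOWN: {detail}"


if __name__ == "__main__":
    detail = "No cluster/command combination found for the given criteria"
    for agent_codes in (OldAgentErrorCode, NewAgentErrorCode):
        code = decode_error_code(agent_codes, 1)
        print(agent_codes.__name__, "->", user_message(code, detail))
```

Because the fallback lands on the same branch the old agent already used, a newer server does not change what an older agent shows its user.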
Commit: | 86b81ba | |
---|---|---|
Author: | Marco Primi | |
Committer: | Tom Gianos |
Update Agent/Server protocol to send executable and arguments separately

Send executable and arguments (provided by Command) and job arguments (provided by user) as two separate arrays of strings via gRPC. Keep populating the deprecated `command_args` field so that older agents keep working.
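One way to picture the split described in commit 86b81ba is a message that carries both the new separate lists and the deprecated combined list. In this sketch the field names `executable_args` and `job_args` are hypothetical, and filling `command_args` by concatenating the two lists is an assumption; the commit message only states that the deprecated field keeps being populated.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JobSpecificationMessage:
    """Stand-in for the protocol message; not the generated protobuf class."""
    executable_args: List[str]                               # hypothetical name: from the Command
    job_args: List[str]                                      # hypothetical name: from the user
    command_args: List[str] = field(default_factory=list)    # deprecated combined form


def build_job_specification(executable_and_args: List[str], job_args: List[str]) -> JobSpecificationMessage:
    """Populate the new split fields and, for older agents, the deprecated field."""
    return JobSpecificationMessage(
        executable_args=list(executable_and_args),
        job_args=list(job_args),
        # Assumption: the deprecated field is the concatenation of both lists.
        command_args=list(executable_and_args) + list(job_args),
    )


def launch_arguments(spec: JobSpecificationMessage) -> List[str]:
    """A newer agent prefers the split fields and falls back to the deprecated one."""
    if spec.executable_args:
        return spec.executable_args + spec.job_args
    return spec.command_args


if __name__ == "__main__":
    spec = build_job_specification(["spark-submit", "--master", "yarn"], ["job.py", "--date", "today"])
    print(launch_arguments(spec))
```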
Commit: | 4c989ed | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Add optional Timeout field to JobSpecification

- Crosses DTO, Proto and Entity layers
- Add corresponding converter logic
Commit: | 1bc987a | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Autoformat proto file
Commit: | fd1c9cc | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Add optional timeout to agent config proto messages

Include in convert methods for the proto to DTO and vice versa.
Commit: | 1bf40b1 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Proto definition for agent-server file streaming
Commit: | 69b6ea0 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Update genie.proto comments for consistency
Commit: | 50cb815 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Remove abandoned JobFileSync protocol
Commit: | 6e07e92 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Define and implement gRPC handshake protocol

Agent sends metadata to the server. The latter inspects it and can reject the client. This facilitates turning away agents that are running an incompatible or deprecated version.
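The handshake in commit 6e07e92 can be pictured as a server-side check over the metadata an agent sends. The sketch below is illustrative only: the metadata fields, the minimum-version policy, and every name in it are assumptions, since the commit message does not say which metadata is inspected or how compatibility is decided.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AgentMetadata:
    """Hypothetical metadata sent by the agent during the handshake."""
    agent_version: str
    hostname: str


@dataclass(frozen=True)
class HandshakeResult:
    accepted: bool
    message: str


# Assumed policy for this sketch: agents below this version are turned away.
MINIMUM_SUPPORTED_VERSION: Tuple[int, ...] = (4, 0, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a dotted numeric version such as '4.1.2'; raises ValueError otherwise."""
    return tuple(int(part) for part in version.split("."))


def handshake(metadata: AgentMetadata) -> HandshakeResult:
    """Inspect the agent metadata and accept or reject the agent."""
    try:
        version = parse_version(metadata.agent_version)
    except ValueError:
        return HandshakeResult(False, f"Unparseable agent version: {metadata.agent_version!r}")
    if version < MINIMUM_SUPPORTED_VERSION:
        return HandshakeResult(False, f"Agent version {metadata.agent_version} is no longer supported")
    return HandshakeResult(True, "Agent accepted")


if __name__ == "__main__":
    print(handshake(AgentMetadata("3.9.1", "host-a")))
    print(handshake(AgentMetadata("4.2.0", "host-b")))
```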
Commit: | 488a1be | |
---|---|---|
Author: | Sumit Tandon | |
Committer: | sumitnetflix |
Persist archiveLocationPrefix and archiveLocation on the server

Persist archiveLocationPrefix when reserving the job id. Calculate archiveLocation from archiveLocationPrefix when resolving the job specification and return it to the client. Persist archiveLocation when saving the job specification.
Commit: | 99a11d0 | |
---|---|---|
Author: | Sumit Tandon | |
Committer: | sumitnetflix |
GRPC implementation of job kill service interfaces
Commit: | e406c1e | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Define HeartBeat service, implement client and server services

HeartBeat service using bi-directional gRPC streaming to detect connections and disconnections at both ends.
Commit: | 1b74a51 | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Fix misspelling of 'dependencies' in ExecutionResource message for proto JobService
Commit: | 758c38b | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Initial proto definition of the gRPC job file sync service

Includes messages for a bi-directional stream implementation which will push files from the agent location to the server for storage and retrieval via REST API.
Commit: | 37e51e7 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Review Agent/Server protocol and service implementations on both ends

- Align all protocols to return similar kind of error messages
- Add missing services declaration and implementation in AgentJobService (both client and server)
- Update JobServiceProtoConverter to throw GenieConversionException where appropriate
- Add more conversion types to JobServiceProtoConverter and remove some redundant ones
- Stricter validation for parameters of AgentJobService
Commit: | 20b523b | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Remove deprecated 'agent registration' functionality
Commit: | 7badc30 | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Add initial error handling for gRPC DTO conversion code
Commit: | 9d8208a | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Initial implementation of change job status gRPC API
Commit: | 88cfe13 | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Initial implementation of the claimJob gRPC API
Commit: | 043126d | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Implement server side reservation of a job id
Commit: | 86a0267 | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Modify proto definition file for updated service APIs and commit associated changes to supporting classes.
Commit: | 361cda4 | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Refactor GRpc JobSpecificationService into JobService in preparation for the newly agreed-upon workflow of id reservation -> spec resolve -> initialization -> running
Commit: | 789a711 | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Add version as an option for criterion
Commit: | 25822b2 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Use consistent field naming convention in protobuf definitions

Consistently use lowercase with underscores for field names. This does not affect the generated Java classes.
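The note that the rename does not affect generated Java classes follows from how protobuf field names are turned into Java accessor names: underscores are dropped and the following letter is capitalized. The function below is a simplified approximation of that rule written for illustration; it is not the protobuf compiler's code, and the field names in the example are hypothetical.

```python
def java_getter_name(field_name: str) -> str:
    """Approximate the Java getter name generated for a protobuf field."""
    parts = []
    capitalize_next = True  # the first letter is always capitalized after 'get'
    for ch in field_name:
        if ch == "_":
            capitalize_next = True  # drop the underscore, capitalize what follows
        elif capitalize_next:
            parts.append(ch.upper())
            capitalize_next = False
        else:
            parts.append(ch)
    return "get" + "".join(parts)


if __name__ == "__main__":
    # Both spellings of a hypothetical field yield the same accessor.
    print(java_getter_name("job_directory_location"))
    print(java_getter_name("jobDirectoryLocation"))
```

Under this rule a field spelled in camel case and the same field spelled with underscores produce the same accessor, so callers of the generated classes need no changes.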
Commit: | a8efd61 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Add JobDirectoryLocation field to agent/server protocol and DTOs

JobDirectory is specified via CLI and echoed by the server (and logged for auditing and debugging). Later this also allows the server to choose the location (useful when launching an agent in response to a job request via API).
Commit: | 0230c09 | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Server side implementation of the resolveJobSpecification rpc API. First versions of supporting v4 proto definitions and DTO classes
Commit: | 3a42699 | |
---|---|---|
Author: | Tom Gianos | |
Committer: | Tom Gianos |
Reformat for 4 spaces per tab/indent in proto file
Commit: | 9ef594c | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Define and implement agent registration protocol
Commit: | df7867d | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Repurpose 'Greeter' gRPC service example to 'Ping' service

The ping protocol can be used by the agent to test for server connectivity and exchange client/server metadata.
Commit: | 5bb2552 | |
---|---|---|
Author: | Marco Primi | |
Committer: | Marco Primi |
Add protobuf/gRPC to build

- Add protobuf and gRPC dependencies
- Add and configure protobuf Gradle plugin and gRPC extension
- Set up additional source directory for generated classes
- Add new genie-proto module with placeholder message definitions
- Exclude generated code from various check-type tasks
- Set up Spring Boot auto-configuration and starter for gRPC server components (disabled by default)
- Add example gRPC server interceptor
- Create example service and unit test