Proto commits in AcademySoftwareFoundation/OpenCue

These 47 commits changed the Protocol Buffers files:

Commit:94d1761
Author:Ramon Figueiredo
Committer:GitHub

[cuegui] Add LockState Filter to "Monitor Hosts" window in CueCommander (#1679)

- Add `LockStateSeq` message with `repeated LockState state` in `host.proto` and update HostSearchCriteria
- Update `HostMonitor.py` to include the menu to filter by lock state
- Update `HostSearch.java` to include lock state filtering
- Add lock state handling in `_setOptions` function in `search.py`

**Link the Issue(s) this Pull Request is related to.**
https://github.com/AcademySoftwareFoundation/OpenCue/issues/1678

The documentation is generated from this commit.
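The lock state filter described in commit 94d1761 can be pictured with a small Python sketch. It is not the code from `HostMonitor.py` or `search.py`: the `Host` class, the `filter_by_lock_state` helper, and the enum values are assumptions made for illustration, and the "empty selection keeps every host" behavior is a guess at how such a filter would usually act.

```python
from dataclasses import dataclass
from enum import Enum


class LockState(Enum):
    # Assumed enum members; check host.proto for the real definition.
    OPEN = 0
    LOCKED = 1
    NIMBY_LOCKED = 2


@dataclass
class Host:
    # Hypothetical stand-in for a host record returned by a search.
    name: str
    lock_state: LockState


def filter_by_lock_state(hosts, states):
    """Return the hosts whose lock state is in `states`.

    An empty selection is treated as "no filter" (an assumption).
    """
    if not states:
        return list(hosts)
    wanted = set(states)
    return [host for host in hosts if host.lock_state in wanted]
```

A caller would pass the states ticked in the filter menu, for example `filter_by_lock_state(hosts, [LockState.LOCKED, LockState.NIMBY_LOCKED])`.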

Commit:81b0fe1
Author:Jimmy Christensen
Committer:GitHub

[rqd] [cuegui] Add support for Loki for frame logs (#1577)

**Link the Issue(s) this Pull Request is related to.**
#1571

**Summarize your change.**
This adds the ability to use Loki as the backend for frame logs in rqd. It also adds a new plugin/widget in cuegui to read the logs from the Loki server. This means log files are no longer bound to a single namespace, and it opens the possibility of storing telemetry about the frame as well.

This is still at an experimental stage and any input is appreciated.

Screenshot of new widget: ![image](https://github.com/user-attachments/assets/aabd4e92-34c2-431f-a271-8211b7dbb579)

The loki-urllib3-client python module is optional; if it is not detected, the cuegui widget will show an error message.

Not yet implemented:
- Search log
- Timestamp in log view

---------

Co-authored-by: Diego Tavares <dtavares@imageworks.com>

Commit:219444b
Author:Diego Tavares
Committer:GitHub

[rqd] Add frame recovery logic for docker mode (#1614)

Whenever RQD restarts, it loses track of all the frames it launched that haven't finished. This change adds a new configurable option to back up frame states to a file, which is used to recover the frame cache state and try to re-bind to the running frames. This first version only works in docker mode.

---------

Signed-off-by: Diego Tavares <dtavares@imageworks.com>

Commit:c00d214
Author:Diego Tavares
Committer:GitHub

[rqd/cuebot] Hard and Soft memory limits (#1589) Currently, frames are created with minimal requirements, but limits are not enforced. This PR implements soft and hard limits on RQD when running on docker mode. For more information about soft and hard limits [read](https://docs.docker.com/engine/containers/resource_constraints/). Limits are calculated using the minimum memory defined for the layer and a multiplier that can be tuned at Dispatcher.java. Ideally, these values should be extracted to opencue.properties, but they are being used in a static context, and interpreted before the file is actually read. --------- Signed-off-by: Diego Tavares <dtavares@imageworks.com>
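The limit calculation described in commit c00d214 (layer minimum memory times a tunable multiplier) can be sketched as follows. The function name and the multiplier values are hypothetical; the real multipliers live in Dispatcher.java and are not quoted in the commit message.

```python
def docker_memory_limits(min_memory_kb, soft_multiplier=1.5, hard_multiplier=2.0):
    """Derive (soft, hard) memory limits from a layer's minimum memory.

    The multiplier defaults are illustrative assumptions, not the values
    used by cuebot.
    """
    if min_memory_kb <= 0:
        raise ValueError("min_memory_kb must be positive")
    if hard_multiplier < soft_multiplier:
        raise ValueError("the hard limit must not be below the soft limit")
    soft_limit_kb = int(min_memory_kb * soft_multiplier)
    hard_limit_kb = int(min_memory_kb * hard_multiplier)
    return soft_limit_kb, hard_limit_kb
```

In Docker terms, the soft value would correspond to a memory reservation and the hard value to the memory cap documented on the resource constraints page linked in the commit message.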

Commit:633df41
Author:Diego Tavares
Committer:GitHub

[cuegui/pycue] Fix Local Booking widget (#1581)

This feature has been inactive in OpenCue since the beginning. The changes made to port it from Ice to gRPC were not properly tested, and this widget never really worked.

Local Rendering is a feature that allows users to claim ownership of a host (their workstation) and assign a job to execute frames on that host. This is very useful in situations where the farm is busy but user workstations aren't.

To access the feature, right-click on a job/layer and select "Use local cores..". In the widget that opens, the user can select how many cores and how much memory and GPU to allocate to executing cue jobs. Once confirmed, cuebot will start dispatching frames to that host.

---------

Signed-off-by: Diego Tavares <dtavares@imageworks.com>

Commit:291b694
Author:Diego Tavares
Committer:GitHub

[cuebot/rqd] Add feature to run frames on a containerized environment using docker (#1549)

### Motivation
Running OpenCue in a multi-operating-system environment requires segregating the farm, which means hosts have to be assigned to one OS and cannot be shared between shows that have different OS requirements. This can be a challenge when sharing resources between shows is necessary.

### Proposed solution
A new execution mode on **rqd**, `runDocker`, to live alongside `runLinux`, `runWindows`, and `runDarwin` (macOS). This mode launches the frame command in a docker container based on the frame's expected OS. With this, rqd is able to run jobs from different OSs on the same host. To make this possible, an rqd host needs to advertise itself not with its own OS code (defined by `SP_OS` in rqd.conf), but with all the OSs of the images it is capable of executing.

### Configuration changes
The following sections were added to rqd.conf:
```ini
[docker.config]
# Setting this to True requires all the additional "docker.[]" sections to be filled
RUN_ON_DOCKER=True

# This section is only required if RUN_ON_DOCKER=True
# List of volume mounts following docker run's format, but replacing = with :
[docker.mounts]
TEMP=type:bind,source:/tmp,target:/tmp,bind-propagation:slave
NET=type:bind,source:/net,target:/net,bind-propagation:slave

# This section is only required if RUN_ON_DOCKER=True
# - keys represent OSs this rqd is capable of executing jobs in
# - values are docker image tags
[docker.images]
centos7=centos7.3:latest
rocky9=rocky9.3:latest
```
In this case, the rqd host would advertise itself with `OS=centos7,rocky9`, and the dispatch logic has been changed to account for dispatching frames to nodes that support multiple OSs.

The feature has been documented at https://github.com/AcademySoftwareFoundation/opencue.io/pull/302

---------

Signed-off-by: Diego Tavares <dtavares@imageworks.com>
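As a rough illustration of the advertising rule in commit 291b694, the sketch below reads a trimmed copy of the rqd.conf sections with Python's standard `configparser` and derives the comma-separated OS string. The helper names `advertised_os` and `host_can_run` and the `default_os` fallback are hypothetical; rqd's own parsing and cuebot's dispatch matching may work differently.

```python
import configparser

# Trimmed copy of the sections from the commit message.
RQD_CONF = """\
[docker.config]
RUN_ON_DOCKER=True

[docker.images]
centos7=centos7.3:latest
rocky9=rocky9.3:latest
"""


def advertised_os(conf_text, default_os="linux"):
    """Return the OS string a host would advertise.

    `default_os` stands in for the SP_OS value (an assumption).
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.getboolean("docker.config", "RUN_ON_DOCKER", fallback=False):
        return default_os
    # The keys of [docker.images] are the OSs this host can execute.
    return ",".join(parser["docker.images"].keys())


def host_can_run(frame_os, host_os):
    """Check whether a frame's OS is among the OSs a host advertises."""
    return frame_os in host_os.split(",")
```

With the sample configuration, `advertised_os(RQD_CONF)` yields `centos7,rocky9`, matching the `OS=centos7,rocky9` value quoted in the commit message.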

Commit:2c0a97e
Author:Diego Tavares
Committer:GitHub

Rest gateway (#1355)

## Summarize your change
Create a service to expose a REST endpoint for the gRPC interface. The motivation behind having a REST endpoint is to create a web version of cuegui (coming soon).

## How does it work
See the [module README](https://github.com/AcademySoftwareFoundation/OpenCue/blob/1f3229599a58af64aa9c7feda7657f8fece9c97d/rest_gateway/README.md) for full documentation.

---------

Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Co-authored-by: Zachary Fong <zfong@imageworks.com>
Co-authored-by: Ramon Figueiredo <rfigueiredo@imageworks.com>

Commit:b117568
Author:Diego Tavares
Committer:GitHub

Update placeholder branch for containerized_rqd (#1550) Signed-off-by: Diego Tavares <dtavares@imageworks.com>

Commit:e67a8b3
Author:Diego Tavares
Committer:GitHub

[cuebot/rqd] Prevent running frames on Swap memory (#1497)

Improve the logic previously implemented to handle out-of-memory conditions so that it considers swap usage. When a host is using more than `dispatcher.oom_max_safe_used_physical_memory_threshold` of its physical memory and more than `dispatcher.oom_max_safe_used_swap_memory_threshold` of its swap memory, logic that kills frames relying heavily on swap memory is triggered. This logic automatically marks killed frames to be retried and may increase the parent layer's memory requirements if a frame had been using more memory than initially reserved.

Co-authored-by: Ramon Figueiredo <rfigueiredo@imageworks.com>
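The trigger condition in commit e67a8b3 requires both thresholds to be exceeded at once. A minimal sketch of that check is below; the function name and the default threshold values are assumptions, since the commit message names the two properties but not their values.

```python
def should_kill_swapping_frames(
    physical_used,
    physical_total,
    swap_used,
    swap_total,
    physical_threshold=0.9,  # assumed value for the physical memory threshold
    swap_threshold=0.05,     # assumed value for the swap memory threshold
):
    """Return True when both physical and swap usage exceed their thresholds."""
    if physical_total <= 0 or swap_total <= 0:
        # Without a known total (for example, a host with no swap) the
        # swap-based rule cannot apply.
        return False
    physical_ratio = physical_used / physical_total
    swap_ratio = swap_used / swap_total
    return physical_ratio > physical_threshold and swap_ratio > swap_threshold
```

Requiring both conditions means a host that is merely full of physical memory, or one that has stale pages in swap but plenty of free RAM, would not trigger the kill logic.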

Commit:7bb39c2
Author:Zach-Fong
Committer:GitHub

[cueweb] CueWeb improvements and add unit testing (#1457)

Features and improvements

1. CueWeb authorization updates
   - Modified CueWeb to include authorization headers in all HTTP requests to the gRPC REST gateway.
   - Added functionality to generate JWT tokens using a secret for inclusion in authorization headers.
2. Include unit testing
   - Introduced Jest tests for CueWeb to cover: Authentication middleware, error handling, and JWT creation.

Co-authored-by: Zachary Fong <zfong@imageworks.com>
Co-authored-by: Ramon Figueiredo <rfigueiredo@imageworks.com>
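To make the authorization change in commit 7bb39c2 concrete, here is a standard-library Python sketch of signing a JWT with a shared secret and placing it in an `Authorization` header. CueWeb itself is not written in Python, and the signing algorithm (HS256), the claim names, and the helper names are all assumptions chosen for illustration.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def create_jwt(secret: str, claims: dict) -> str:
    """Build an HS256-signed JWT (algorithm choice is an assumption)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"


def auth_headers(secret: str, subject: str, ttl_seconds: int = 3600, now=None) -> dict:
    """Return headers carrying a freshly signed bearer token."""
    issued_at = int(time.time() if now is None else now)
    claims = {"sub": subject, "iat": issued_at, "exp": issued_at + ttl_seconds}
    return {"Authorization": f"Bearer {create_jwt(secret, claims)}"}
```

A production client would normally rely on a maintained JWT library rather than hand-rolled signing; the sketch only shows what "generate a token from a secret and attach it to every request" amounts to.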

Commit:1219377
Author:Diego Tavares
Committer:GitHub

Rest gateway (#1449) Merge changes from local fork into ASWF repo --------- Signed-off-by: Diego Tavares <dtavares@imageworks.com>

Commit:4e35887
Author:Jimmy Christensen
Committer:GitHub

Add command to layerDetails (#1436) **Summarize your change.** This adds the layer `command` attribute to the layer details in the cuegui attribute widget.

Commit:85050da
Author:Diego Tavares
Committer:GitHub

Remove rqd restart feature (#1435) The feature was never really functional. Its implementation relied on a hardcoded path only valid for the init.d deployment of rqd, and even for this use case it didn't work as expected. For the three existing rqd deployment modes (docker, init.d and systemd) restarting should be accomplished by calling stop and start sequentially. Out of the three modes, only Docker is not handled by this PR, as it requires external control of the rqd service on docker, and service definitions are out of the scope of this repo.

Commit:8ea92bf
Author:Rosa Behrens Camp
Committer:GitHub

Implement feature to override frame state display text/color in UI (#1246)

The scheduled task to remove old jobs has been failing for jobs with frames that used the new override feature due to a foreign key not being handled at the delete trigger.

---------

Authored-by: RosaBehrensCamp <rbehrens@imageworks.com>
Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Co-authored-by: Diego Tavares <dtavares@imageworks.com>

Commit:174f397
Author:Diego Tavares
Committer:GitHub

Subscribe to a job using email (#1368)

Summarize your change.
This change gives users the ability to subscribe to a job through the API or using the GUI. A user subscribed to a job will receive the same emails as the owner of the job.

Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Co-authored-by: Akim Ruslanov <aruslanov@imageworks.com>

Commit:a314c6c
Author:Diego Tavares
Committer:GitHub

Kill job reason (#1367)

Adding requester information to JobKillRequest

This feature requires more information from job kill actions requested through the API.

Link the Issue(s) this Pull Request is related to.
This feature is motivated by a situation where a script was misusing the API and calling kill on all the jobs for a show on a regular basis. Without this feature, finding where the requests were coming from was a big endeavor.

Summarize your change.
This change requires that a kill request also provide username, pid, host_kill and reason.

---------

Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Co-authored-by: Roula O'Regan <roregan@imageworks.com>
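The requirement introduced by commit a314c6c (a kill request must carry username, pid, host_kill and reason) can be sketched as a simple validation step. The `KillRequest` class and `missing_kill_fields` helper are hypothetical illustrations, not the generated `JobKillRequest` message or cuebot's actual checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KillRequest:
    # Hypothetical stand-in for the request; the four requester fields
    # are the ones named in the commit message.
    job_name: str
    username: str = ""
    pid: str = ""
    host_kill: str = ""
    reason: str = ""


REQUIRED_FIELDS = ("username", "pid", "host_kill", "reason")


def missing_kill_fields(request):
    """Return the names of required requester fields that are blank."""
    return [
        name for name in REQUIRED_FIELDS
        if not str(getattr(request, name)).strip()
    ]
```

A server-side handler could reject any request for which `missing_kill_fields` returns a non-empty list, which leaves an audit trail for every kill.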

Commit:9a033a9
Author:Diego Tavares
Committer:GitHub

Fix Criteria.java query search parameters (#1369) Fix a bug in the overloaded method addPhraseRange for InRangeIntegerSearchCriterion. The method needs an extra search parameter and needs to call min/max and not value. Added two new parameters to the buildWhereClause for "greater than" and "lower than" memory proc searches. --------- Co-authored-by: Roula O'Regan <roregan@imageworks.com>

Commit:d9d24f0
Author:Rosa Behrens Camp
Committer:GitHub

Min mem increase (#1157)

Added the ability for admins to set the minimum memory increase value on a service. OpenCue increases the minimum memory requirement every time a job is retried after failing from running out of memory, but the old memory increase (hardcoded to 2G) was causing some jobs to be retried many times before they finally succeeded.

Commit:6fafd62
Author:Diego Tavares da Silva
Committer:GitHub

[rqd] Core affinity for cache optimization (#1171) When possible, try to book frames from the same Layer on the same core to leverage shared cache

Commit:81a272a
Author:Akim Ruslanov
Committer:GitHub

Add layer max cores. (#1125)

Commit:3893b87
Author:Diego Tavares da Silva
Committer:GitHub

Improve view running procs (#1141)

* Improve view running procs
  This changes the widget to only display info about the selected frame instead of all the running frames for the job.
* pylint pass
* pylint pass
* Update VERSION.in

Co-authored-by: Brian Cipriano <brian.cipriano@gmail.com>

Commit:02cb67f
Author:roulaoregan-spi
Committer:GitHub

Add more exit codes to Frame state waiting (#1131)

* Add more exit codes to Frame state waiting
  When determining the frame state, cuebot needs to make sure that certain exit statuses put the frame state back into waiting. This will help save time for users when a frame fails because of host hardware issues.
  (cherry picked from commit cbeda9387b09a8713bb76d9075fdee72c3c795f1)
* Add more exit codes to Frame state waiting
* Removed unused method updateFrameHostDown
* Removed deprecated oracle FrameDaoJdbc file
* Remove log debugging from FrameCompleteHandler
* Reverting, adding updateFrameHostDown FrameDao method
* Version bump, add missing interface checkRetries method

Co-authored-by: Diego Tavares da Silva <dtavares@imageworks.com>

Commit:5536904
Author:roulaoregan-spi
Committer:GitHub

Add Proc's child PIDs to Host report stats (#1130)

* Add Proc's child PIDs to Host report stats
  Users require additional information about the running frame.data pid. For each parent process (frame.pid) add to report.proto the child process: name, rss, vsize, state, cmdline, pid. This additional info will get stored in the Proc table while the proc is running; users can view child proc stats via Cuegui, and rqlog will output the highest recorded values for rss for each child pid.
* Fix Proc Pylint errors
* Add more exit codes to Frame state waiting
  When determining the frame state cuebot needs to make sure that certain exit statuses put the frame state back into waiting, this will help save time for users when a frame fails for host hardware issues.
  (cherry picked from commit cbeda9387b09a8713bb76d9075fdee72c3c795f1)
* Add more exit codes to Frame state waiting
* Removed unused method updateFrameHostDown
* Removed deprecated oracle FrameDaoJdbc file
* Remove log debugging from FrameCompleteHandler
* Fixes to RQD and unittests
* Remove unrelated changes from FrameDao
* Fix pylint errors for rqmachine.py
* Remove unrelated frame changes from pycue
* Removing unrelated change from DispatchSupportService
* Fix merged conflicts for ProcDaoTests

Commit:9e03b95
Author:Kazuki Sakamoto
Committer:GitHub

Add Job.shutdownIfCompleted API method. (#1033)

Commit:c22fe12
Author:Kazuki Sakamoto
Committer:GitHub

Add multiple GPU support #760 (#924)

Commit:847ee50
Author:Kazuki Sakamoto
Committer:GitHub

Add GetDefault and SetDefault to AllocationInterface (#939)

Commit:a77c1ff
Author:Kazuki Sakamoto
Committer:GitHub

Remove blackout (#946)

* Remove blackout
* Bump up version

Commit:dff882c
Author:Lars van der Bijl
Committer:GitHub

feat: Add timeout and LLU timeout (#761)

Add support for layers to have a timeout. If a frame goes past its hard timeout, it gets killed. The LLU timeout is usually a lower value that checks when the last log update happened; if no update happens within the LLU window, the frame is also killed.

Closes #462
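The two timeout rules from commit dff882c can be summarized in a short Python sketch. The function name, the use of a single time unit for all arguments, and the convention that a value of 0 disables a timeout are assumptions made for illustration.

```python
def timeout_verdict(runtime, since_last_log_update, timeout=0, timeout_llu=0):
    """Decide whether a frame should be killed, and why.

    All arguments are durations in the same unit. A timeout of 0 is
    treated as disabled (an assumption). Returns "timeout",
    "timeout_llu", or None when the frame may keep running.
    """
    if timeout > 0 and runtime > timeout:
        # Hard timeout: the frame has simply run for too long.
        return "timeout"
    if timeout_llu > 0 and since_last_log_update > timeout_llu:
        # LLU timeout: the frame has stopped writing to its log.
        return "timeout_llu"
    return None
```

Because the LLU window is usually shorter than the hard timeout, a frame that hangs silently is caught well before it would hit the hard limit.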

Commit:3877834
Author:George Pollard
Committer:GitHub

Make UID optional in frame submission (#618)

Commit:a481531
Author:Lars van der Bijl
Committer:Brian Cipriano

Fix passing message to kill frame (#546)

Commit:9e4fb1c
Author:Greg Denton
Committer:GitHub

Add basic limits functionality (#414)

* Initial add of Limits to Layers.
  Allows users to specify arbitrary limits to job layers, to prevent running too many tasks. This is just the backend implementation, there will be separate commits for the front end and dispatcher updates.
* Adding helper get limit names method to LayerDao, to make it easier to populate the Limit list on a returned layer object
* cleanup
* remove unnecessary import
* initial add of limit client code
* hooking up limit database queries
* adding current_value to limit object
* adding limits to pycue
* update layer when properties change
* adding current running column to LimitsWidget
* Add functionality to edit limits on an existing layer
* improving tests
* fixing some layer tests
* switch to use streams instead of loops
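The idea behind the limits in commit 9e4fb1c (named caps attached to layers so that too many tasks do not run at once) can be sketched as below. Only `current_value` is mentioned in the commit message; the `Limit` class, the `max_value` field, and the `can_run_another` helper are hypothetical, and the commit itself notes that the dispatcher side was left to later changes.

```python
from dataclasses import dataclass


@dataclass
class Limit:
    name: str
    max_value: int          # assumed field: the cap on concurrent tasks
    current_value: int = 0  # tasks currently counted against the limit


def can_run_another(layer_limit_names, limits):
    """Return True only if every limit on the layer has spare capacity.

    `limits` maps a limit name to its Limit record.
    """
    return all(
        limits[name].current_value < limits[name].max_value
        for name in layer_limit_names
    )
```

A layer tagged with several limits is held back as soon as any one of them is saturated, which is what makes the limits useful for shared resources such as software licenses.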

Commit:cebab82
Author:Greg Denton
Committer:GitHub

Fixes for Layer.getFrames (#447)

* Remove unnecessary layer clean logic
* Allow getFrames to set limit value

Commit:14b3b43
Author:Brian Cipriano
Committer:GitHub

Add more MenuActions tests. (#410)

Commit:0ffe7b2
Author:Greg Denton
Committer:GitHub

Adding unittests for group and host wrappers. (#324) NestedGroup asGroup function now returns a Group wrapper object. Correctly labeling allocation id as id in Host SetAllocation call.

Commit:2e95537
Author:Brian Cipriano
Committer:GitHub

CueAdmin code cleanup and test coverage. (#295)

Commit:7f874e0
Author:Greg Denton
Committer:GitHub

getJobWhiteboard returns groups only and a list of job ids (#264) * getJobWhiteboard returns groups only and a list of job ids, instead of all nested items. In response to Issue #251.

Commit:1addf71
Author:Greg Denton
Committer:GitHub

Fixing subgroups for monitor cue plugin in cuegui (#236) * Fixing subgroups for monitor cue plugin in cuegui

Commit:9a5a114
Author:Brian Cipriano
Committer:GitHub

Clean up PyCue unit tests and enable them in the docker build. (#201)

Commit:2489e55
Author:Greg Denton
Committer:GitHub

opencue python renaming (#109) * opencue python renaming

Commit:dd8b4ae
Author:Greg Denton
Committer:GitHub

Rqd cleanup & fixes (#98)

* adding attrs so rqd doesn't need to overload the message object
* grpc messages should not be overloaded and use snake case
* switch syslog setup to try to use /dev/log, if not available use localhost on all OSs.

Commit:a280b6a
Author:Brian Cipriano
Committer:GitHub

Move proto files to be a top-level directory. (#47)

Commit:cca6828
Author:Brian Cipriano
Committer:GitHub

Migrate RQD to only use GRPC. (#32)

Commit:02f79f9
Author:Brian Cipriano
Committer:GitHub

Migrate Criterion usage to GRPC. (#30)

Commit:daf1b38
Author:Greg Denton
Committer:GitHub

grpc for spi_cue (#23) * grpc for spi-cue

Commit:d58f05a
Author:Greg Denton
Committer:GitHub

GRPC changes for cuebot (#15)

* Switch the servants to GRPC and remove the ICE servants.
* Rename the top level entity objects to clear up naming conflicts with rpc objects. The naming convention is now:
  Action : rpc object
  ActionEntity : Object that inherits from Entity
  ActionInterface: Interface to the Entity
* Updates to the Whiteboard to work with the gRPC objects.
* Move methods in CueStatic into object specific classes since they are all static now.
* Enums are capital by convention

Commit:1b05364
Author:Greg Denton
Committer:GitHub

Adding proto files for grpc (#11)

* Adding protos for cuebot
* gRPC setup for Allocations, Subscriptions, and Tasks
* Update rqd and spi_cue to use new protos
* Rename top level Allocation object to AllocationEntity and AllocationInterface

Commit:01677b9
Author:Greg Denton
Committer:GitHub

Initial GRPC setup for cue3 (#3)

This is an initial commit showing the general process of how we are adding in gRPC.

- Add grpc servers to cue3bot and rqd
- Add cue and rqd proto files in cue3bot/src/main/proto
- Add grpc applicationContext to spring
- Switch Facility to using gRPC
- Update spi_cue to use gRPC and add tests

* Cleaning up whitespace
* remove todo question