The Protocol Buffers files changed in these 47 commits:
Commit: | 94d1761 | |
---|---|---|
Author: | Ramon Figueiredo | |
Committer: | GitHub |
[cuegui] Add LockState Filter to "Monitor Hosts" window in CueCommander (#1679)

- Add `LockStateSeq` message with `repeated LockState state` in `host.proto` and update HostSearchCriteria
- Update `HostMonitor.py` to include the menu to filter by lock state
- Update `HostSearch.java` to include lock state filtering
- Add lock state handling in `_setOptions` function in `search.py`

**Link the Issue(s) this Pull Request is related to.** https://github.com/AcademySoftwareFoundation/OpenCue/issues/1678
The documentation is generated from this commit.
Commit: | 81b0fe1 | |
---|---|---|
Author: | Jimmy Christensen | |
Committer: | GitHub |
[rqd] [cuegui] Add support for Loki for frame logs (#1577)

**Link the Issue(s) this Pull Request is related to.** #1571

**Summarize your change.** This adds the ability to use Loki as the backend for frame logs in rqd. It also adds a new plugin/widget in cuegui to read the logs from the Loki server. This allows log files to not be bound to a single namespace and also makes it possible to store telemetry about the frame. This is still at an experimental stage and any input is appreciated.

The loki-urllib3-client python module is optional; if it is not detected, the cuegui widget will show an error message.

Not yet implemented:

- Search log
- Timestamp in log view

--------- Co-authored-by: Diego Tavares <dtavares@imageworks.com>
Commit: | 219444b | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
[rqd] Add frame recovery logic for docker mode (#1614) Whenever RQD restarts, it loses track of all the unfinished frames it launched. This change adds a new configurable option to back up frame states to a file, which is used to recover the frame cache state and try to re-bind to the running frames. This first version only works in docker mode. --------- Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Commit: | c00d214 | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
[rqd/cuebot] Hard and Soft memory limits (#1589) Currently, frames are created with minimal requirements, but limits are not enforced. This PR implements soft and hard limits on RQD when running in docker mode. For more information about soft and hard limits, see the [Docker resource constraints documentation](https://docs.docker.com/engine/containers/resource_constraints/). Limits are calculated using the minimum memory defined for the layer and a multiplier that can be tuned in Dispatcher.java. Ideally, these values should be extracted to opencue.properties, but they are used in a static context and interpreted before the file is actually read. --------- Signed-off-by: Diego Tavares <dtavares@imageworks.com>
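
The limit calculation described above could be sketched roughly as follows. This is only an illustration: the function names, the multiplier values, and the use of kilobytes are assumptions, since the commit message does not state the values defined in Dispatcher.java. The `--memory` and `--memory-reservation` flags are the hard and soft limit options of `docker run`.

```python
def docker_memory_limits(layer_min_memory_kb, soft_multiplier=1.5, hard_multiplier=2.0):
    """Derive soft and hard limits from a layer's minimum memory.

    The multipliers are hypothetical placeholders, not the values used by cuebot.
    """
    if layer_min_memory_kb <= 0:
        raise ValueError("layer_min_memory_kb must be positive")
    soft_kb = int(layer_min_memory_kb * soft_multiplier)
    hard_kb = int(layer_min_memory_kb * hard_multiplier)
    return soft_kb, hard_kb


def docker_run_memory_args(soft_kb, hard_kb):
    """Build the docker run arguments for a soft and a hard memory limit."""
    return ["--memory-reservation", f"{soft_kb}k", "--memory", f"{hard_kb}k"]


soft, hard = docker_memory_limits(1000)
print(docker_run_memory_args(soft, hard))
```

With a soft limit the container may exceed the reservation while the host has spare memory, whereas exceeding the hard limit is not allowed.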
Commit: | 633df41 | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
[cuegui/pycue] Fix Local Booking widget (#1581) This feature has been inactive on opencue since the beginning. Changes to port from Ice to gRPC were not properly tested and this widget never really worked. Local Rendering is a feature that allows users to claim ownership of a host (their workstation) and assign a job to execute frames on that host. This is very useful in situations where the farm is busy but user workstations aren't. To access the feature, right-click on a job/layer and select "Use local cores..". On the opened widget, the user can select how many cores and how much memory and GPU to allocate to execute cue jobs. When confirmed, cuebot will start dispatching frames to that host. --------- Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Commit: | 291b694 | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
[cuebot/rqd] Add feature to run frames on a containerized environment using docker (#1549)

### Motivation
Running OpenCue in a multi-operating-system environment requires segregating the farm, which means hosts have to be assigned to one OS and cannot be shared between shows that have different OS requirements. This can be a challenge when sharing resources between shows is necessary.

### Proposed solution
A new execution mode on **rqd**, `runDocker`, to live alongside `runLinux`, `runWindows`, and `runDarwin` (macOS). This mode launches the frame command in a docker container based on the frame's expected OS. With this, rqd is now able to run jobs from different OSs on the same host. To make this possible, an rqd host needs to advertise itself not with its own OS code (defined by `SP_OS` in rqd.conf), but with all the OSs of the images it is capable of executing.

### Configuration changes
The following sections were added to rqd.conf:

```ini
[docker.config]
# Setting this to True requires all the additional "docker.[]" sections to be filled
RUN_ON_DOCKER=True

# This section is only required if RUN_ON_DOCKER=True
# List of volume mounts following docker run's format, but replacing = with :
[docker.mounts]
TEMP=type:bind,source:/tmp,target:/tmp,bind-propagation:slave
NET=type:bind,source:/net,target:/net,bind-propagation:slave

# This section is only required if RUN_ON_DOCKER=True
# - keys represent OSs this rqd is capable of executing jobs in
# - values are docker image tags
[docker.images]
centos7=centos7.3:latest
rocky9=rocky9.3:latest
```

In this case, the rqd host would advertise itself with `OS=centos7,rocky9`, and the dispatch logic has been changed accordingly to account for dispatching frames to nodes that support multiple OSs.

Feature has been documented at https://github.com/AcademySoftwareFoundation/opencue.io/pull/302

--------- Signed-off-by: Diego Tavares <dtavares@imageworks.com>
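
As an illustration of how the configuration above maps to the advertised OS string and to docker mount arguments, here is a minimal sketch using Python's standard `configparser`. The helper names (`advertised_os`, `mount_args`) and the `default_os` fallback are hypothetical and are not taken from the rqd source; the sketch also assumes that mount values contain no `:` other than the ones standing in for `=`.

```python
import configparser

# Sample taken from the commit message above (comments omitted).
RQD_CONF = """
[docker.config]
RUN_ON_DOCKER=True

[docker.mounts]
TEMP=type:bind,source:/tmp,target:/tmp,bind-propagation:slave
NET=type:bind,source:/net,target:/net,bind-propagation:slave

[docker.images]
centos7=centos7.3:latest
rocky9=rocky9.3:latest
"""


def load_conf(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def advertised_os(parser, default_os="linux"):
    """Return the OS string a host would advertise (hypothetical helper)."""
    if not parser.getboolean("docker.config", "RUN_ON_DOCKER", fallback=False):
        return default_os
    if not parser.has_section("docker.images"):
        raise ValueError("RUN_ON_DOCKER=True requires a [docker.images] section")
    # Keys of [docker.images] are the OSs this host can execute jobs in.
    return ",".join(parser.options("docker.images"))


def mount_args(parser):
    """Translate [docker.mounts] values back to docker run's --mount syntax."""
    if not parser.has_section("docker.mounts"):
        return []
    args = []
    for _, value in parser.items("docker.mounts"):
        # The config replaces '=' with ':', so reverse that substitution.
        args += ["--mount", value.replace(":", "=")]
    return args


conf = load_conf(RQD_CONF)
print(advertised_os(conf))
print(mount_args(conf))
```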
Commit: | 2c0a97e | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
Rest gateway (#1355)

## Summarize your change
Create a service to expose a REST endpoint for the gRPC interface. The motivation behind having a REST endpoint is to create a web version of cuegui (coming soon).

## How does it work
See the [module README](https://github.com/AcademySoftwareFoundation/OpenCue/blob/1f3229599a58af64aa9c7feda7657f8fece9c97d/rest_gateway/README.md) for full documentation.

--------- Signed-off-by: Diego Tavares <dtavares@imageworks.com> Co-authored-by: Zachary Fong <zfong@imageworks.com> Co-authored-by: Ramon Figueiredo <rfigueiredo@imageworks.com>
Commit: | b117568 | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
Update placeholder branch for containerized_rqd (#1550) Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Commit: | e67a8b3 | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
[cuebot/rqd] Prevent running frames on Swap memory (#1497) Improve the logic previously implemented to handle out-of-memory conditions so that it considers swap usage. When a host is using more than `dispatcher.oom_max_safe_used_physical_memory_threshold` of its physical memory and more than `dispatcher.oom_max_safe_used_swap_memory_threshold` of its swap memory, logic is triggered that kills frames relying heavily on swap memory. This logic automatically marks killed frames to be retried and may increase the memory requirements of their parent layer if a frame had been using more memory than initially reserved. Co-authored-by: Ramon Figueiredo <rfigueiredo@imageworks.com>
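
The trigger condition described above can be sketched as a simple predicate. This is a hedged illustration and not cuebot's implementation: the function name, the default threshold values, and the assumption that both thresholds are expressed as fractions between 0 and 1 are hypothetical.

```python
def should_kill_swapping_frames(phys_used_kb, phys_total_kb,
                                swap_used_kb, swap_total_kb,
                                phys_threshold=0.95, swap_threshold=0.05):
    """Return True when both physical and swap usage exceed their thresholds.

    phys_threshold and swap_threshold stand in for
    dispatcher.oom_max_safe_used_physical_memory_threshold and
    dispatcher.oom_max_safe_used_swap_memory_threshold; the defaults here
    are placeholders, not the values shipped with cuebot.
    """
    if phys_total_kb <= 0 or swap_total_kb <= 0:
        # Without swap (or without memory stats) the swap rule cannot apply.
        return False
    phys_ratio = phys_used_kb / phys_total_kb
    swap_ratio = swap_used_kb / swap_total_kb
    return phys_ratio > phys_threshold and swap_ratio > swap_threshold


print(should_kill_swapping_frames(96, 100, 10, 100))
```

Requiring both conditions means a host that merely has stale pages in swap, but plenty of free physical memory, is left alone.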
Commit: | 7bb39c2 | |
---|---|---|
Author: | Zach-Fong | |
Committer: | GitHub |
[cueweb] CueWeb improvements and add unit testing (#1457)

Features and improvements

1. CueWeb authorization updates
   - Modified CueWeb to include authorization headers in all HTTP requests to the gRPC REST gateway.
   - Added functionality to generate JWT tokens using a secret for inclusion in authorization headers.
2. Include unit testing
   - Introduced Jest tests for CueWeb to cover authentication middleware, error handling, and JWT creation.

Co-authored-by: Zachary Fong <zfong@imageworks.com> Co-authored-by: Ramon Figueiredo <rfigueiredo@imageworks.com>
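
To illustrate the general idea of signing a JWT with a shared secret and sending it in an authorization header, here is a minimal standard-library sketch. It is not CueWeb's code (CueWeb is not written in Python), and several details are assumptions: the HS256 algorithm, the `Bearer` scheme, and the payload fields are not specified in the commit message.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used by JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def create_jwt(payload: dict, secret: str) -> str:
    """Create an HS256-signed JWT (the algorithm choice is an assumption)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_jwt(token: str, secret: str) -> bool:
    """Check the signature only; expiry and claims are not validated here."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        return False
    expected = hmac.new(
        secret.encode("utf-8"),
        f"{header_b64}.{payload_b64}".encode("ascii"),
        hashlib.sha256,
    ).digest()
    return hmac.compare_digest(_b64url(expected), signature_b64)


def auth_headers(token: str) -> dict:
    """Build the header for a request to the REST gateway (Bearer scheme assumed)."""
    return {"Authorization": f"Bearer {token}"}


token = create_jwt({"sub": "example-user"}, "example-secret")
print(auth_headers(token))
```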
Commit: | 1219377 | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
Rest gateway (#1449) Merge changes from local fork into ASWF repo --------- Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Commit: | 4e35887 | |
---|---|---|
Author: | Jimmy Christensen | |
Committer: | GitHub |
Add command to layerDetails (#1436) **Summarize your change.** This adds the layer `command` attribute to the layer details in the cuegui attribute widget.
Commit: | 85050da | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
Remove rqd restart feature (#1435) The feature was never really functional. Its implementation relied on a hardcoded path only valid for the init.d deployment of rqd, and even for this use case it didn't work as expected. For the three existing rqd deployment modes (docker, init.d and systemd) restarting should be accomplished by calling stop and start sequentially. Out of the three modes, only Docker is not handled by this PR, as it requires external control of the rqd service on docker, and service definitions are out of the scope of this repo.
Commit: | 8ea92bf | |
---|---|---|
Author: | Rosa Behrens Camp | |
Committer: | GitHub |
Implement feature to override frame state display text/color in UI (#1246) The scheduled task to remove old jobs has been failing for jobs with frames that used the new override feature due to a foreign key not being handled at the delete trigger. --------- Authored-by: RosaBehrensCamp <rbehrens@imageworks.com> Signed-off-by: Diego Tavares <dtavares@imageworks.com> Co-authored-by: Diego Tavares <dtavares@imageworks.com>
Commit: | 174f397 | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
Subscribe to a job using email (#1368) Summarize your change. This change gives users the ability to subscribe to a job through the API or using the GUI. A user subscribed to a job will receive the same emails as the owner of the job. Signed-off-by: Diego Tavares <dtavares@imageworks.com> Co-authored-by: Akim Ruslanov <aruslanov@imageworks.com>
Commit: | a314c6c | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
Kill job reason (#1367)

Adding requester information to JobKillRequest. This feature requires more information from job kill actions requested through the API.

Link the Issue(s) this Pull Request is related to. This feature is motivated by a situation where a script was misusing the API and calling kill on all the jobs for a show on a regular basis. Without this feature, finding where the requests were coming from was a big endeavor.

Summarize your change. This change requires that a kill request also provide username, pid, host_kill and reason.

--------- Signed-off-by: Diego Tavares <dtavares@imageworks.com> Co-authored-by: Roula O'Regan <roregan@imageworks.com>
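
The requirement above, that every kill request carries `username`, `pid`, `host_kill` and `reason`, can be illustrated with a small validation sketch. The `KillRequestInfo` class and `default_kill_request` helper are hypothetical and are not part of the OpenCue API; only the four field names come from the commit message.

```python
import getpass
import os
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class KillRequestInfo:
    """Requester details attached to a kill request (hypothetical container)."""
    username: str
    pid: int
    host_kill: str
    reason: str

    def __post_init__(self):
        # Reject empty strings so the origin of a kill can always be traced.
        for name in ("username", "host_kill", "reason"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
        if self.pid <= 0:
            raise ValueError("pid must be a positive process id")


def default_kill_request(reason: str) -> KillRequestInfo:
    """Fill requester details from the current process and machine."""
    return KillRequestInfo(
        username=getpass.getuser(),
        pid=os.getpid(),
        host_kill=socket.gethostname(),
        reason=reason,
    )


info = KillRequestInfo("artist", 4242, "workstation01", "Job submitted by mistake")
print(info)
```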
Commit: | 9a033a9 | |
---|---|---|
Author: | Diego Tavares | |
Committer: | GitHub |
Fix Criteria.java query search parameters (#1369) Fix a bug in the overloaded method addPhraseRange for InRangeIntegerSearchCriterion. The method needs an extra search parameter and needs to call min/max and not value. Added two new parameters to the buildWhereClause for "greater than" and "lower than" memory proc searches. --------- Co-authored-by: Roula O'Regan <roregan@imageworks.com>
Commit: | d9d24f0 | |
---|---|---|
Author: | Rosa Behrens Camp | |
Committer: | GitHub |
Min mem increase (#1157) Added the ability for admins to set the minimum memory increase value on a service. OpenCue increases the minimum memory requirement every time a job is retried after failing from running out of memory, but the old memory increase (hardcoded to 2G) was causing some jobs to be retried many times before they finally succeeded.
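
A short worked example shows why a fixed 2G step can lead to many retries. The function below is a hypothetical illustration and not OpenCue code; it assumes memory grows by one fixed step per out-of-memory retry, and the 4G and 20G figures are made up for the example.

```python
def retries_until_fit(reserved_gb, needed_gb, increase_gb):
    """Count the out-of-memory retries needed before the reservation fits."""
    if increase_gb <= 0:
        raise ValueError("increase_gb must be positive")
    shortfall = needed_gb - reserved_gb
    if shortfall <= 0:
        return 0
    # Ceiling division: each retry adds one increase_gb step.
    return -(-shortfall // increase_gb)


# A layer reserving 4G whose frames actually need 20G:
print(retries_until_fit(4, 20, 2))  # fixed 2G step
print(retries_until_fit(4, 20, 8))  # larger, service-specific step
```

With the 2G step the frame fails eight times before it fits, while an 8G step set on the service gets there after two retries.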
Commit: | 6fafd62 | |
---|---|---|
Author: | Diego Tavares da Silva | |
Committer: | GitHub |
[rqd] Core affinity for cache optimization (#1171) When possible, try to book frames from the same Layer on the same core to leverage shared cache
Commit: | 81a272a | |
---|---|---|
Author: | Akim Ruslanov | |
Committer: | GitHub |
Add layer max cores. (#1125)
Commit: | 3893b87 | |
---|---|---|
Author: | Diego Tavares da Silva | |
Committer: | GitHub |
Improve view running procs (#1141)

* Improve view running procs. This changes the widget to only display info about the selected frame instead of all the running frames for the job.
* pylint pass
* pylint pass
* Update VERSION.in

Co-authored-by: Brian Cipriano <brian.cipriano@gmail.com>
Commit: | 02cb67f | |
---|---|---|
Author: | roulaoregan-spi | |
Committer: | GitHub |
Add more exit codes to Frame state waiting (#1131)

* Add more exit codes to Frame state waiting. When determining the frame state, cuebot needs to make sure that certain exit statuses put the frame state back into waiting. This will help save time for users when a frame fails because of host hardware issues. (cherry picked from commit cbeda9387b09a8713bb76d9075fdee72c3c795f1)
* Add more exit codes to Frame state waiting
* Removed unused method updateFrameHostDown
* Removed deprecated oracle FrameDaoJdbc file
* Remove log debugging from FrameCompleteHandler
* Reverting, adding updateFrameHostDown FrameDao method
* Version bump, add missing interface checkRetries method

Co-authored-by: Diego Tavares da Silva <dtavares@imageworks.com>
Commit: | 5536904 | |
---|---|---|
Author: | roulaoregan-spi | |
Committer: | GitHub |
Add Proc's child PIDs to Host report stats (#1130)

* Add Proc's child PIDs to Host report stats. Users require additional information about the running frame.data pid. For each parent process (frame.pid) add to report.proto the child process: name, rss, vsize, state, cmdline, pid. This additional info will get stored in the Proc table while the proc is running; users can view child proc stats via Cuegui, and rqlog will output the highest recorded values for rss for each child pid.
* Fix Proc Pylint errors
* Add more exit codes to Frame state waiting. When determining the frame state, cuebot needs to make sure that certain exit statuses put the frame state back into waiting. This will help save time for users when a frame fails because of host hardware issues. (cherry picked from commit cbeda9387b09a8713bb76d9075fdee72c3c795f1)
* Add more exit codes to Frame state waiting
* Removed unused method updateFrameHostDown
* Removed deprecated oracle FrameDaoJdbc file
* Remove log debugging from FrameCompleteHandler
* Fixes to RQD and unittests
* Remove unrelated changes from FrameDao
* Fix pylint errors for rqmachine.py
* Remove unrelated frame changes from pycue
* Removing unrelated change from DispatchSupportService
* Fix merged conflicts for ProcDaoTests
Commit: | 9e03b95 | |
---|---|---|
Author: | Kazuki Sakamoto | |
Committer: | GitHub |
Add Job.shutdownIfCompleted API method. (#1033)
Commit: | c22fe12 | |
---|---|---|
Author: | Kazuki Sakamoto | |
Committer: | GitHub |
Add multiple GPU support #760 (#924)
Commit: | 847ee50 | |
---|---|---|
Author: | Kazuki Sakamoto | |
Committer: | GitHub |
Add GetDefault and SetDefault to AllocationInterface (#939)
Commit: | a77c1ff | |
---|---|---|
Author: | Kazuki Sakamoto | |
Committer: | GitHub |
Remove blackout (#946) * Remove blackout * Bump up version
Commit: | dff882c | |
---|---|---|
Author: | Lars van der Bijl | |
Committer: | GitHub |
feat: Add timeout and LLU timeout (#761) Add support for layers to have a timeout. If a frame goes past its hard timeout, it gets killed. The LLU timeout is usually a lower value that checks when the last log update happened. If no update happens within the LLU window, the frame is also killed. Closes #462
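
The two checks described above can be sketched as follows. This is an illustrative sketch and not the rqd or cuebot implementation: the function name, the use of seconds, and the convention that a value of 0 disables a check are assumptions.

```python
def frame_kill_reason(now, start_time, last_log_update, timeout=0, timeout_llu=0):
    """Return why a frame should be killed, or None if it may keep running.

    All arguments are in seconds; 0 disables the corresponding check
    (an assumption made for this sketch).
    """
    if timeout > 0 and now - start_time > timeout:
        return "timeout"
    if timeout_llu > 0 and now - last_log_update > timeout_llu:
        return "timeout_llu"
    return None


# Frame started at t=0 and last wrote to its log at t=100.
print(frame_kill_reason(now=500, start_time=0, last_log_update=100,
                        timeout=3600, timeout_llu=300))
```

In this example the frame is still within its hard timeout, but it has been silent for longer than the LLU window, so it would be killed for that reason.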
Commit: | 3877834 | |
---|---|---|
Author: | George Pollard | |
Committer: | GitHub |
Make UID optional in frame submission (#618)
Commit: | a481531 | |
---|---|---|
Author: | Lars van der Bijl | |
Committer: | Brian Cipriano |
Fix passing message to kill frame (#546)
Commit: | 9e4fb1c | |
---|---|---|
Author: | Greg Denton | |
Committer: | GitHub |
Add basic limits functionality (#414)

* Initial add of Limits to Layers. Allows users to specify arbitrary limits to job layers, to prevent running too many tasks. This is just the backend implementation; there will be separate commits for the front end and dispatcher updates.
* Adding helper get limit names method to LayerDao, to make it easier to populate the Limit list on a returned layer object
* cleanup
* remove unnecessary import
* initial add of limit client code
* hooking up limit database queries
* adding current_value to limit object
* adding limits to pycue
* update layer when properties change
* adding current running column to LimitsWidget
* Add functionality to edit limits on an existing layer
* improving tests
* fixing some layer tests
* switch to use streams instead of loops
Commit: | cebab82 | |
---|---|---|
Author: | Greg Denton | |
Committer: | GitHub |
Fixes for Layer.getFrames (#447) * Remove unnecessary layer clean logic * Allow getFrames to set limit value
Commit: | 14b3b43 | |
---|---|---|
Author: | Brian Cipriano | |
Committer: | GitHub |
Add more MenuActions tests. (#410)
Commit: | 0ffe7b2 | |
---|---|---|
Author: | Greg Denton | |
Committer: | GitHub |
Adding unittests for group and host wrappers. (#324) NestedGroup asGroup function now returns a Group wrapper object. Correctly labeling allocation id as id in Host SetAllocation call.
Commit: | 2e95537 | |
---|---|---|
Author: | Brian Cipriano | |
Committer: | GitHub |
CueAdmin code cleanup and test coverage. (#295)
Commit: | 7f874e0 | |
---|---|---|
Author: | Greg Denton | |
Committer: | GitHub |
getJobWhiteboard returns groups only and a list of job ids (#264) * getJobWhiteboard returns groups only and a list of job ids, instead of all nested items. In response to Issue #251.
Commit: | 1addf71 | |
---|---|---|
Author: | Greg Denton | |
Committer: | GitHub |
Fixing subgroups for monitor cue plugin in cuegui (#236) * Fixing subgroups for monitor cue plugin in cuegui
Commit: | 9a5a114 | |
---|---|---|
Author: | Brian Cipriano | |
Committer: | GitHub |
Clean up PyCue unit tests and enable them in the docker build. (#201)
Commit: | 2489e55 | |
---|---|---|
Author: | Greg Denton | |
Committer: | GitHub |
opencue python renaming (#109) * opencue python renaming
Commit: | dd8b4ae | |
---|---|---|
Author: | Greg Denton | |
Committer: | GitHub |
Rqd cleanup & fixes (#98)

* adding attrs so rqd doesn't need to overload the message object
* grpc messages should not be overloaded and use snake case
* switch syslog setup to try to use /dev/log; if not available, use localhost on all OSs.
Commit: | a280b6a | |
---|---|---|
Author: | Brian Cipriano | |
Committer: | GitHub |
Move proto files to be a top-level directory. (#47)
Commit: | cca6828 | |
---|---|---|
Author: | Brian Cipriano | |
Committer: | GitHub |
Migrate RQD to only use GRPC. (#32)
Commit: | 02f79f9 | |
---|---|---|
Author: | Brian Cipriano | |
Committer: | GitHub |
Migrate Criterion usage to GRPC. (#30)
Commit: | daf1b38 | |
---|---|---|
Author: | Greg Denton | |
Committer: | GitHub |
grpc for spi_cue (#23) * grpc for spi-cue
Commit: | d58f05a | |
---|---|---|
Author: | Greg Denton | |
Committer: | GitHub |
GRPC changes for cuebot (#15)

* Switch the servants to GRPC and remove the ICE servants.
* Rename the top level entity objects to clear up naming conflicts with rpc objects. The naming convention is now:
  - Action: rpc object
  - ActionEntity: Object that inherits from Entity
  - ActionInterface: Interface to the Entity
* Updates to the Whiteboard to work with the gRPC objects.
* Move methods in CueStatic into object specific classes since they are all static now.
* Enums are capital by convention
Commit: | 1b05364 | |
---|---|---|
Author: | Greg Denton | |
Committer: | GitHub |
Adding proto files for grpc (#11)

* Adding protos for cuebot
* gRPC setup for Allocations, Subscriptions, and Tasks
* Update rqd and spi_cue to use new protos
* Rename top level Allocation object to AllocationEntity and AllocationInterface
Commit: | 01677b9 | |
---|---|---|
Author: | Greg Denton | |
Committer: | GitHub |
Initial GRPC setup for cue3 (#3)

This is an initial commit showing the general process of how we are adding in gRPC.

- Add grpc servers to cue3bot and rqd
- Add cue and rqd proto files in cue3bot/src/main/proto
- Add grpc applicationContext to spring
- Switch Facility to using gRPC
- Update spi_cue to use gRPC and add tests

* Cleaning up whitespace
* remove todo question