These are the commits in which the Protocol Buffers files changed (only the last 100 relevant commits are shown):
| Commit: | de0fce6 | |
|---|---|---|
| Author: | Junlin Li | |
| Committer: | GitHub | |
feat(requeue): add frontend requeue support and follow-up fixes (#431)
The documentation is generated from this commit.
| Commit: | 4f315c2 | |
|---|---|---|
| Author: | junlinli | |
| Committer: | junlinli | |
feat(requeue): add frontend requeue support and follow-up fixes
The documentation is generated from this commit.
| Commit: | 8a73029 | |
|---|---|---|
| Author: | huerni | |
| Committer: | GitHub | |
feat: Add [account/user-partition] resource limit (#287)

* feat: add partition resource limits and modify account handling
* feat: add support for --show-partition flag in global flags
* fix: format ModifyAccountRequest initialization for consistency
* fix: standardize comment formatting in proto files for clarity
* fix: add missing newline at end of proto files
* fix: rename flag from --show-partition to --partition-limit for clarity
* feat: add partition-specific resource limit error codes and messages
* docs/test: Add partition resource limit help and test script
  - Update help.go to document new partition resource limit commands:
    - Add --partition-limit/-P flag description in show account/user
    - Add account partition resource limit options in modify section
    - Add user partition resource limit options in modify section
    - Add --partition-limit/-P to GLOBAL OPTIONS
  - Add scripts/test_partition_limit.sh:
    - Test setting account partition limits (maxJobs, maxSubmitJobs, maxTres, maxTresPerJob, maxWall, maxWallPerJob)
    - Test setting user partition limits (same fields)
    - Test show partition limits with --partition-limit and -P flags
    - Test error cases (missing partition in where clause)
    - Test job submission enforcement of partition limits
    - Test resetting partition limits
* fix: remove newline from error messages and add comments for maxWall limit clarification
* fix: add missing newlines to error messages for partition-specific resource limits
* fix: remove test script for partition-specific resource limits
* fix: update partition-specific resource limit error codes for consistency
* fix: update MaxTresPerJob and MaxWallDurationPerJob enum values for resource limits
* fix: add error message for PMIx error in error message ma
| Commit: | a42e507 | |
|---|---|---|
| Author: | RileyWen | |
proto: sync frontend definitions with backend
| Commit: | 9c3f389 | |
|---|---|---|
| Author: | RileyWen | |
Merge remote-tracking branch 'origin/master' into feat/metrics # Conflicts: # protos/Crane.proto # protos/PublicDefs.proto
| Commit: | f495dcb | |
|---|---|---|
| Author: | RileyWen | |
trace: add queue count and writer stats
| Commit: | 776fdc4 | |
|---|---|---|
| Author: | NamelessOIer | |
| Committer: | NamelessOIer | |
chore: sync step launch mode proto
| Commit: | 93482b8 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: add partition-specific resource limit error codes and messages
| Commit: | 24ab4db | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
fix: standardize comment formatting in proto files for clarity
| Commit: | 22a50c5 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
fix: add missing newline at end of proto files
| Commit: | db8d8b7 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
fix: update partition-specific resource limit error codes for consistency
| Commit: | 6feb90a | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
fix: update MaxTresPerJob and MaxWallDurationPerJob enum values for resource limits
| Commit: | 6f99586 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: add partition resource limits and modify account handling
| Commit: | 642c7c3 | |
|---|---|---|
| Author: | huerni | |
| Committer: | GitHub | |
feat: cattach (#325) * feat * feat: cancel task when cfored restart * feat: cattach recive msg * feat: when task finish cattach close * feat: cattach input * fix some bug * feat: add cattach history * refactor * feat: ctrl+c not input * fix: multi node * feat: --layout * refactor * merge master * feat * refactor ctldReplyChannelMapForCattachByStep * feat: cattach connect step * fix: execCranedIds is empty * fix: uid * refactor(cattach,cfored): rename task-based proto types to step-based Replace `TASK_CONNECT_REQUEST/REPLY` with `STEP_CONNECT_REQUEST/REPLY`, `TASK_COMPLETION_ACK_REPLY` with `STEP_COMPLETION_ACK_REPLY`, and `TASK_X11_FORWARD` with `STEP_X11_FORWARD` across cattach and cfored to align with updated proto definitions. Also remove redundant `TaskId` fields from IO and X11 forward request payloads, and fix stale log messages that referenced wrong component names (Crun instead of Cattach). * feat(cattach): handle task exit status and step cancel request Add handling for `TASK_EXIT_STATUS` reply in the `StateForwarding` method to capture and report non-zero exit codes. If the task was signaled, a "Terminated" message is printed; otherwise, the specific exit code is reported. The exit code is stored in `m.err` for propagation. Also add handling for `STEP_CANCEL_REQUEST` to log a trace message and passively wait for `STEP_COMPLETION_ACK_REPLY` instead of actively closing the step. 
* refactor(cfored): rename proto types to align with step/job semantics - Rename `StreamTaskIORequest` to `StreamStepIORequest` across cattach_server.go and server.go - Rename `TASK_X11_OUTPUT` to `STEP_X11_OUTPUT` and update X11 forward reply to include `CranedId` and `LocalId` fields - Rename `TASK_CANCEL_REQUEST` to `JOB_CANCEL_REQUEST` and `TASK_COMPLETION_REQUEST` to `JOB_COMPLETION_REQUEST` in ctld_client.go - Replace `InteractiveTaskType` with `InteractiveJobType` in completion request payloads - Clean up duplicate imports in cfored.go * refactor(cfored): consolidate X11 I/O forwarding and fix stream error handling - Remove `forwardRemoteIoToCrun` function and replace its usage with the existing `forwardRemoteIoToFront` for X11-related messages, eliminating redundant forwarding logic - Fix misplaced `toSupervisorStream.Send` error handling block: move it from the `cattachReq` branch to the `crunReq` branch where it belongs - Fix indentation of `channel <- nil` in `SupervisorCrashAndRemoveAllChannel` - Reorder imports in `cfored.go` and `x11.go` to follow Go conventions (stdlib first, then internal, then third-party) - Remove stale TODO comment in `StepIOStream` * fix(cfored): handle TASK_EXIT_STATUS and X11 msg types in cattach Previously, the default case would call `log.Fatalf` and crash cfored upon receiving unexpected message types. This change explicitly handles `TASK_EXIT_STATUS` by skipping it (completion is signaled via `JOB_COMPLETION_ACK_REPLY`), adds stub handling for `STEP_X11_CONN` and `STEP_X11_EOF` with TODO notes for future support, and replaces `log.Fatalf` with `log.Errorf` in the default case to avoid crashing cfored on truly unexpected message types. 
* fix(cattach): handle step completion race before IO forwarding - Add early detection in CattachWaitIOForward to check if the crun channel is still active before attempting to wait for supervisor channels, preventing indefinite blocking when a step completes early - Add a non-blocking pre-forwarding check for JOB_COMPLETION_ACK_REPLY from ctld to handle the case where the step finishes between state transitions - Add a second crun channel liveness check after the readyChannel wait to catch completions that occur during supervisor setup - Remove unused `forwardEstablished` atomic bool that was never read - Simplify switch statement indentation in cattach StateForwarding and add a default warning log for unhandled message types - Remove TASK_EXIT_STATUS and STEP_CANCEL_REQUEST handling from cattach StateForwarding as exit status reporting is now handled elsewhere * feat(cattach): add --label, --output-filter, --input-filter, and --quiet flags Introduce `TaskOutputMsg` to carry task ID alongside output data through internal channels, enabling per-task I/O control: - `--label`: prepend `[task_id]: ` to each output line via `applyLabel` - `--output-filter`: suppress output from tasks other than the specified one - `--input-filter`: direct stdin to a specific task instead of broadcasting - `--quiet`: suppress the "Task io forward ready" status message * refactor(cfored): rename task states to step/job in cattach server Replace `CattachWaitTaskMeta` with `CattachWaitStepMeta` and `CattachWaitTaskComplete` with `CattachWaitJobComplete` to align state names with step/job terminology. Also replace incorrect `End` state transitions with `DeadCattach` for broken connection handling, and update log messages to reflect the new naming conventions. 
* fix(cattach/cfored): fix goroutine leaks and missing cleanup on task end Call `taskFinishCb()` before transitioning to the End state in `StateForwarding` to ensure all IO goroutines (StdinReaderRoutine, StdoutWriterRoutine, input-forward) are signaled to exit cleanly and don't leak when the task finishes, encounters an error, or the Cfored connection is broken. Also add `broadcastStopWaiting()` in cfored's cattach and crun server state machines to wake up goroutines blocked in `waitSupervisorChannelsReady` so they can observe `stopWaiting == true` and exit promptly after receiving a completion/cancel reply from Ctld. * fix(cattach/crun/cfored): prevent channel deadlock on Ctrl+C or slow terminal Replace bare channel sends in `StateForwarding` and `StateWaitAck` with `select` statements that also listen on a cancellation context (`taskFinishCtx`/`stopStepCtx`). Without this guard, a slow terminal fills `chanOutputFromRemote`, permanently blocking the forwarding loop and making SIGINT/Ctrl+C ineffective. Additional fixes: - Pre-initialize `stopStepCtx`/`stopStepCb` in `crun.Init()` to avoid nil-pointer dereferences when `StateWaitAck` is reached without ever entering `StateForwarding`. - Add a `globalCtx.Done()` case in cfored's `WAIT_TASK_IO_FORWARD` loop so cfored transitions to `CancelJobOfDeadCrun` immediately on shutdown instead of racing against a synthetic `JOB_CANCEL_REQUEST` that may not arrive within the 30-second window. * fix(cfored): downgrade spurious "unknown crun/cattach" warning to debug Replace the Warning-level log with a Debug-level message when no front-end is connected for a step during IO forwarding. This condition is expected during the brief window between crun exiting and the supervisor stopping its output stream, so a Warning was misleading and noisy. The new message also better explains the situation. 
* fix(cfored): use non-blocking send to prevent supervisor IO deadlock Replace the blocking channel send in `forwardRemoteIoToFront` with a non-blocking `select` to prevent deadlocks caused by slow front-end consumers (e.g. terminals). Previously, a backed-up front-end channel could block the supervisor IO stream, causing cfored, the supervisor stream, and crun/cattach to wait on each other indefinitely. The new behavior: - Delivers messages immediately if the channel has capacity - Drops messages when the front-end channel is full, logging a trace - Returns early if cfored is shutting down (globalCtx cancelled) Dropping output lines for slow interactive consumers (crun/cattach) is an acceptable trade-off to keep the supervisor stream flowing. * refactor(cattach): replace X11 channels with X11SessionMgr Replace the single-channel X11 forwarding approach (`chanX11InputFromLocal`, `chanX11OutputFromRemote`) with an `X11SessionMgr` that handles per-session demultiplexing by `(CranedId, LocalId)` pairs. This mirrors the session manager used in crun and enables multi-task X11 jobs where each task can open concurrent X11 sessions. The old `StartX11ReaderWriterRoutine` is removed in favor of `X11SessionMgr.SessionMgrRoutine`, and reply routing now handles `STEP_X11_CONN`, `STEP_X11_FORWARD`, and `STEP_X11_EOF` messages through the manager. * style: fix indentation and alignment in cattach state machine code Correct misaligned switch/case blocks, const declarations, and select statements in cattach.go and cattach_server.go to properly reflect their logical nesting level. No functional changes were made. * refactor(cattach): remove X11 forwarding support from cattach Remove all X11 forwarding logic from the cattach state machine, including the X11SessionMgr field, X11 request/reply channel handling in StateForwarding, and X11 session initialization in StartIOForward. 
Also removes corresponding X11 message routing in cfored's cattach server (STEP_X11_FORWARD case) and cleans up related forwarding logic in forwardTaskMsgToCattach. * style: fix indentation of switch cases in cattach_server.go Corrected improper indentation of `case` blocks within the `switch` statement in the `CforedCattachStateMachineLoop` function to align with Go formatting conventions. * fix(cfored): correct map references, types, and IO routing in server - Fix wrong map reference from `taskIORequestChannelMap` to `stepIORequestChannelMap` in `setRemoteIoToFrontChannel` - Replace incorrect `StreamTaskIORequest` type with `StreamStepIORequest` in channel and buffer initialization - Remove duplicate `getStepDoneChannel` function definition - Fix `TASK_ERR_OUTPUT` to route via `forwardRemoteIoToFront` instead of `forwardRemoteIoToCrun` - Fix indentation in supervisor request switch statement * feat(cattach): add stderr forwarding support for attached tasks - Add `chanErrOutputFromRemote` channel to `StateMachineOfCattach` for handling stderr output separately from stdout - Implement `StderrWriterRoutine` to write remote stderr output to local `os.Stderr`, with proper draining on task completion - Handle new `TASK_ERR_OUTPUT_FORWARD` reply type in `StateForwarding` to route stderr messages to the dedicated channel - Update `cfored` to forward `TASK_ERR_OUTPUT` messages from supervisor to cattach clients using the new proto message type - Add `TaskIOErrOutputForwardReply` proto message and `TASK_ERR_OUTPUT_FORWARD` enum value to `StreamCattachReply` This enables cattach to properly separate and display stderr output from remote tasks instead of mixing it with stdout. 
* fix(crun): handle stopStepCtx and globalCtx in I/O forwarding states Add missing context cancellation cases in `StateForwarding` and `StateWaitAck` to prevent goroutine hangs when a job is terminated or cfored connection is lost: - In `StateForwarding`: transition to `JobKilling` when `stopStepCtx` is done during stdout/stderr forwarding - In `StateWaitAck`: silently drop output messages when `stopStepCtx` is done (allows draining replyChannel to receive `STEP_COMPLETION_ACK_REPLY`), and transition to `End` state when `globalCtx` is cancelled due to cfored shutdown or connection loss * feat(cattach): add read-only mode when step has exclusive stdin routing When a job step is started with `crun --input=<task_id>`, the step's `IoMeta.InputTaskId` is set, meaning stdin is exclusively routed to one specific task. In this case, `cattach` now automatically enters read-only mode: it displays task output but does not forward any stdin from the local terminal. Key changes: - Add `FlagReadOnly` flag in `cmd.go`, set automatically during `StateWaitForward` by inspecting `IoMeta.InputTaskId` (non-nil and non-PTY implies read-only) - Skip `StdinReaderRoutine` and the stdin-forwarding goroutine in `StartIOForward`/`StateForwarding` when `FlagReadOnly` is true - PTY mode is explicitly excluded from read-only since it always requires full interactive control * fix(cfored): fix channel leak and IO forwarding in cattach/supervisor - Remove ctldReplyChannelMapByPid entry when cattach disconnects before ctld replies, preventing a permanent channel leak that bypassed the normal cleanup path in the STEP_META_REPLY branch - Remove the non-blocking select default case in forwardRemoteIoToFront to stop silently dropping IO messages when the front-end channel is full - Delete the outer stepIORequestChannelMap entry when the last cattach disconnects so forwardRemoteIoToFront correctly detects no connected front-end instead of iterating over an empty map * fix(cfored): prevent supervisor IO 
stream blocking on slow front-end consumers Add a `default` branch to the channel send select in `forwardRemoteIoToFront` so that a full or already-exited front-end IO channel never blocks the supervisor IO forwarding goroutine. Messages are dropped with a warning log when the channel is full, ensuring the supervisor stream remains unblocked. * fix(cfored): reject cattach for non-interactive jobs Add a check for interactive metadata in the STEP_META_REPLY handler. If a job lacks interactive metadata (e.g., a batch job), cattach is now explicitly rejected with a clear failure reason instead of causing a potential nil pointer dereference. Also refactors the reply block and extracts `failureReason` earlier to be used consistently in all failure paths. * fix(cfored): increase channel capacities and use dedicated cattach map Increase `ctldReplyChannel` buffer from 2→8 in both calloc and cattach servers to prevent overflow when `WaitAllFrontEnd` pre-sends cancel and completion ACK messages alongside pre-existing ctld messages. Increase `TaskIoRequestChannel` buffer from 2→64 in cattach to reduce message drops in `forwardRemoteIoToFront` when the supervisor produces output faster than cattach consumes it. Replace usage of `ctldReplyChannelMapByPid` with the dedicated `ctldReplyChannelMapForCattachByPid` in cattach so that `WaitAllFrontEnd` sends the correct termination message type (`JOB_COMPLETION_ACK_REPLY`) instead of the calloc/crun-specific `JOB_ID_REPLY`, and ensures proper cleanup on early disconnects to avoid channel leaks. 
* fix(cattach): improve data handling in StateMachine and avoid potential leaks in cattach server * fix(cfored): improve error handling in CattachStream and replace fatal logs with error logs * fix(crun): remove unused stopStepCtx references and clean up StateForwarding logic * Implement feature X to enhance user experience and optimize performance * fix(cattach): align RootCmd variable declaration for improved readability * fix(cattach): add licensing information and improve code comments for clarity * fix(cattach): improve EOF handling in StateForwarding and enhance error messages for job and step IDs Co-authored-by: Copilot <copilot@github.com> * fix(cattach): capture pre-registration history in setRemoteIoToFrontChannel to prevent message duplication * fix(cattach): enhance crash signal handling and remove unused X11 forward messages from StreamCattachRequest * fix(cattach): update copyright year to 2026 in cattach.go * style: address PR #325 review comments from L-Xiafeng - LX1: add comment clarifying map key is cattach process PID - LX3: rename cattachRequestChannelMap -> cattachRequestChannel (the field is a chan, not a map; fix misleading name) - LX4: add Doxygen-style comments to TaskIOBuffer struct and its Push/GetHistory methods - LX7: rename getRemoteHistory(taskId) param to jobId for consistency with StepIdentifier.JobId and rest of the codebase * fix(cfored): send STEP_META_REPLY failure instead of JOB_COMPLETION_ACK_REPLY to cattach on cfored disconnect When WaitAllFrontEnd notifies cattach clients waiting for STEP_META_REPLY, sending JOB_COMPLETION_ACK_REPLY was semantically incorrect: CattachWaitStepMeta would transition to DeadCattach and send STEP_COMPLETION_ACK_REPLY back to the cattach client, making it look like the step completed successfully with no error. Fix by sending STEP_META_REPLY{Ok: false} instead. 
This reuses the existing normal failure path in CattachWaitStepMeta, which correctly sends STEP_CONNECT_REPLY{Ok: false, FailureReason: "Cfored is not connected to CraneCtld."} back to the cattach client, causing it to print an error message and exit with a non-zero exit code. * fix(cfored): update message handling to use STEP_META_REPLY instead of JOB_COMPLETION_ACK_REPLY for cattach sessions * fix(cfored): update comments to clarify usage of job ID and process PID in cattach and Crun/Cattach mappings * fix(cattach): clarify comment to specify removal of cattach by [jobid, stepId] in ctldReplyChannel * fix(cattach, cfored): improve handling of channel sends to prevent blocking and ensure graceful exits * refactor(cfored): generalize request forwarding to reduce code duplication * feat(cattach): add craned_task_map to StreamCtldReply for task ID mapping * refactor(cattach, cfored): replace StepToCtld with CattachStepInfo in StreamCtldReply and related structures * feat(cattach): enhance PrintStepLayout to display per-node task breakdown and add craned_task_map to CattachStepInfo * refactor(cfored): update StreamCtldReply to remove craned_task_map and adjust step_info usage * feat(cattach, cfored): implement task ID validation and improve stdin handling for input filters * refactor(cattach): improve error handling for non-interactive jobs in cattach * feat(goreleaser): add cattach binary to package contents --------- Co-authored-by: Copilot <copilot@github.com>
| Commit: | 6a53f6b | |
|---|---|---|
| Author: | Zhang Yanwen | |
| Committer: | GitHub | |
feat: Add array (#422)

* feat: add array job support across CLI and protos
* fix: address review comments and bug fixes
* accept comments
* fix
* fix
* delete dead code
* fix
* accept comments
* cancel output

---------

Co-authored-by: crane-dev <crane-dev@local>
| Commit: | da739ba | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
merge master
| Commit: | aa645d9 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: cattach recive msg
| Commit: | 464c91b | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: cattach connect step
| Commit: | d55fce9 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor(cattach,cfored): rename task-based proto types to step-based

Replace `TASK_CONNECT_REQUEST/REPLY` with `STEP_CONNECT_REQUEST/REPLY`, `TASK_COMPLETION_ACK_REPLY` with `STEP_COMPLETION_ACK_REPLY`, and `TASK_X11_FORWARD` with `STEP_X11_FORWARD` across cattach and cfored to align with updated proto definitions. Also remove redundant `TaskId` fields from IO and X11 forward request payloads, and fix stale log messages that referenced wrong component names (Crun instead of Cattach).
| Commit: | 9c11f09 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
fix(cattach): enhance crash signal handling and remove unused X11 forward messages from StreamCattachRequest
| Commit: | 1fb8fa1 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
Implement feature X to enhance user experience and optimize performance
| Commit: | 10e5fae | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
fix(cattach): add licensing information and improve code comments for clarity
| Commit: | 62b32b1 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat(cattach): enhance PrintStepLayout to display per-node task breakdown and add craned_task_map to CattachStepInfo
| Commit: | 9e7e971 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
fix(cfored): correct map references, types, and IO routing in server

- Fix wrong map reference from `taskIORequestChannelMap` to `stepIORequestChannelMap` in `setRemoteIoToFrontChannel`
- Replace incorrect `StreamTaskIORequest` type with `StreamStepIORequest` in channel and buffer initialization
- Remove duplicate `getStepDoneChannel` function definition
- Fix `TASK_ERR_OUTPUT` to route via `forwardRemoteIoToFront` instead of `forwardRemoteIoToCrun`
- Fix indentation in supervisor request switch statement
| Commit: | 38ea077 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat(cattach): add craned_task_map to StreamCtldReply for task ID mapping
| Commit: | 5519716 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat(cattach): add --label, --output-filter, --input-filter, and --quiet flags

Introduce `TaskOutputMsg` to carry task ID alongside output data through internal channels, enabling per-task I/O control:

- `--label`: prepend `[task_id]: ` to each output line via `applyLabel`
- `--output-filter`: suppress output from tasks other than the specified one
- `--input-filter`: direct stdin to a specific task instead of broadcasting
- `--quiet`: suppress the "Task io forward ready" status message
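The `--label` behavior described in this commit (prefixing each output line with `[task_id]: `) can be illustrated with a short Go sketch. Only the function name `applyLabel` comes from the commit message. The signature, the `uint32` task ID type, and the line-splitting strategy are assumptions made for illustration, and the sketch ignores chunks that end in the middle of a line.

```go
package main

import (
	"fmt"
	"strings"
)

// applyLabel prefixes every line in data with "[taskID]: ".
// Hypothetical signature; the project's own helper may differ.
func applyLabel(taskID uint32, data string) string {
	prefix := fmt.Sprintf("[%d]: ", taskID)
	var b strings.Builder
	// SplitAfter keeps the trailing "\n" on each piece, so line
	// endings are preserved exactly as they arrived.
	for _, line := range strings.SplitAfter(data, "\n") {
		if line == "" {
			continue // skip the empty piece after a final newline
		}
		b.WriteString(prefix)
		b.WriteString(line)
	}
	return b.String()
}

func main() {
	fmt.Print(applyLabel(3, "hello\nworld\n"))
}
```

With several tasks writing to one terminal, a prefix like this is what lets a reader tell the interleaved streams apart.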
| Commit: | 918b55e | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor(cfored): update StreamCtldReply to remove craned_task_map and adjust step_info usage
| Commit: | 7092998 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor(cattach, cfored): replace StepToCtld with CattachStepInfo in StreamCtldReply and related structures
| Commit: | 10b09b2 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat(cattach): add stderr forwarding support for attached tasks

- Add `chanErrOutputFromRemote` channel to `StateMachineOfCattach` for handling stderr output separately from stdout
- Implement `StderrWriterRoutine` to write remote stderr output to local `os.Stderr`, with proper draining on task completion
- Handle new `TASK_ERR_OUTPUT_FORWARD` reply type in `StateForwarding` to route stderr messages to the dedicated channel
- Update `cfored` to forward `TASK_ERR_OUTPUT` messages from supervisor to cattach clients using the new proto message type
- Add `TaskIOErrOutputForwardReply` proto message and `TASK_ERR_OUTPUT_FORWARD` enum value to `StreamCattachReply`

This enables cattach to properly separate and display stderr output from remote tasks instead of mixing it with stdout.
| Commit: | 2a54365 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor(cfored): rename proto types to align with step/job semantics

- Rename `StreamTaskIORequest` to `StreamStepIORequest` across cattach_server.go and server.go
- Rename `TASK_X11_OUTPUT` to `STEP_X11_OUTPUT` and update X11 forward reply to include `CranedId` and `LocalId` fields
- Rename `TASK_CANCEL_REQUEST` to `JOB_CANCEL_REQUEST` and `TASK_COMPLETION_REQUEST` to `JOB_COMPLETION_REQUEST` in ctld_client.go
- Replace `InteractiveTaskType` with `InteractiveJobType` in completion request payloads
- Clean up duplicate imports in cfored.go
| Commit: | 00cceea | |
|---|---|---|
| Author: | huerni | |
| Committer: | GitHub | |
feat: Add CpuTopology message and integrate CPU socket configuration (#437)

* feat: Update CPU topology definitions and add socket information
* feat: Update CPU topology structure to NodeTopoInfo and adjust references
* fix: Remove redundant comment in NodeTopoInfo message
| Commit: | 94a0ebd | |
|---|---|---|
| Author: | huerni | |
| Committer: | GitHub | |
feat: pmix v2 (#421)

* feat
* feat: merge master
* style: fix alignment of FlagMpi variable declaration in calloc cmd.go
* fix(calloc): remove unused MPI flag from calloc command
* fix: add MpiTypePmix constant and validate MPI type in MainCrun function
* fix(crun): validate FlagMpi before comparing with MpiTypePmix in MainCrun function
| Commit: | 61d62dd | |
|---|---|---|
| Author: | NamelessOIer | |
| Committer: | GitHub | |
feat: preempt_v2 (#429)

* feat: preempt_v2
* style: cleanup
* fix
| Commit: | 7592f86 | |
|---|---|---|
| Author: | huerni | |
feat: Add NodeTopoInfo to CranedInfo and implement topology formatting
| Commit: | 00ef75f | |
|---|---|---|
| Author: | RileyWen | |
Merge branch 'master' into feat/metrics

Resolve proto conflicts:

- Supervisor.proto: keep tracing fields + master's TerminateStepRequest
- Crane.proto: keep Suspend/Resume/TerminateOrphaned RPCs from feature branch

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| Commit: | 4337761 | |
|---|---|---|
| Author: | huerni | |
refactor: Enhance error code documentation in Crane.proto
| Commit: | b05f276 | |
|---|---|---|
| Author: | Junlin Li | |
| Committer: | GitHub | |
feat: Add completing status (#423)

* feat: Add completing status
* sync protos
| Commit: | d36b65c | |
|---|---|---|
| Author: | RileyWen | |
Merge branch 'master' into feat/metrics

Sync proto files with backend CraneSched feat/metrics branch. Resolve merge conflict in cacct.go by adopting new ResourceV3 API.
| Commit: | ea9b083 | |
|---|---|---|
| Author: | RileyWen | |
feat: add service name and span status to SpanInfo in Plugin.proto
| Commit: | 11a2ad7 | |
|---|---|---|
| Author: | NamelessOIer | |
| Committer: | GitHub | |
resource v3 (#424)

* feat: resource v3
* fix
| Commit: | 0c18873 | |
|---|---|---|
| Author: | Zhang Yanwen | |
| Committer: | GitHub | |
feat: add TaskStatus Suspend (#335)

* feat: add suspend and resume job commands
* add suspended task status to TaskStatus enum
* ban cgroupv1
* fix
* add suspend in ccancel -t
* fix
* fix
* fix
* fix: correct JobStatus enum numbering for Suspended and Deadline

  Suspended = 10, Deadline = 11 to avoid proto enum collision. Matches backend proto definition.
| Commit: | 5c82c70 | |
|---|---|---|
| Author: | NamelessOIer | |
fix
| Commit: | cc5967d | |
|---|---|---|
| Author: | NamelessOIer | |
| Committer: | NamelessOIer | |
feat: resource v3
| Commit: | 166380c | |
|---|---|---|
| Author: | edragain | |
| Committer: | GitHub | |
feat: cbatch --deadline (#341)

* feat: add cbatch --deadline (squash commits)
* fix fmt
| Commit: | 89513ff | |
|---|---|---|
| Author: | Junlin Li | |
| Committer: | GitHub | |
feat: Batch input/output/error option with file pattern (#377) * feat: Crun Cbatch input/output/err fix: Usage showed when crun cmd return exit code 2 feat: Crun input output err feat: Crun varible from env feat: Batch input/output/error option with file pattern Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: crun input Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: Format Signed-off-by: lijunlin <xiafeng.li@foxmail.com> chore: Log format Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: crun output Signed-off-by: lijunlin <xiafeng.li@foxmail.com> style: format Signed-off-by: lijunlin <xiafeng.li@foxmail.com> refactor: crun step res field, inherit job when not specified Signed-off-by: lijunlin <xiafeng.li@foxmail.com> refactor: Update terminology from task to job and pass jobId to StartTerminal refactor: Remove pending reason for step Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: cacct elapsed time Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: Auth uid Signed-off-by: lijunlin <xiafeng.li@foxmail.com> refactor step query Signed-off-by: lijunlin <xiafeng.li@foxmail.com> docs: Add error string Signed-off-by: lijunlin <xiafeng.li@foxmail.com> chore: Sync protos Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix Signed-off-by: lijunlin <xiafeng.li@foxmail.com> refactor feat: Crun step submit feat: Enhance job and step processing with unified data structure and parsing functions Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix:step erro msg fix: x11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: X11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> feat: X11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> task exit status Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: Crun app args Signed-off-by: lijunlin <xiafeng.li@foxmail.com> sync protos Signed-off-by: lijunlin <xiafeng.li@foxmail.com> feat: Introcduce multi level cgrop for step/task Signed-off-by: lijunlin <xiafeng.li@foxmail.com> 
format sync protos fix: Enhance error handling and cleanup in crun and supervisor channels (#419) docs: Add comments to clarify the purpose of stepDoneChannel in SupervisorChannelKeeper refactor: Improve IO handling with goroutines and context management in StateMachineOfCrun refactor: Improve context management for IO forwarding in StateMachineOfCrun * format * refactor: reorder fields in StepToD message for clarity
| Commit: | ed2090c | |
|---|---|---|
| Author: | RileyWen | |
Merge branch 'master' into feat/metrics
| Commit: | 3ac9370 | |
|---|---|---|
| Author: | junlinli | |
refactor: reorder fields in StepToD message for clarity
| Commit: | c9fa425 | |
|---|---|---|
| Author: | Junlin Li | |
| Committer: | junlinli | |
feat: Crun Cbatch input/output/err fix: Usage showed when crun cmd return exit code 2 feat: Crun input output err feat: Crun varible from env feat: Batch input/output/error option with file pattern Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: crun input Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: Format Signed-off-by: lijunlin <xiafeng.li@foxmail.com> chore: Log format Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: crun output Signed-off-by: lijunlin <xiafeng.li@foxmail.com> style: format Signed-off-by: lijunlin <xiafeng.li@foxmail.com> refactor: crun step res field, inherit job when not specified Signed-off-by: lijunlin <xiafeng.li@foxmail.com> refactor: Update terminology from task to job and pass jobId to StartTerminal refactor: Remove pending reason for step Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: cacct elapsed time Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: Auth uid Signed-off-by: lijunlin <xiafeng.li@foxmail.com> refactor step query Signed-off-by: lijunlin <xiafeng.li@foxmail.com> docs: Add error string Signed-off-by: lijunlin <xiafeng.li@foxmail.com> chore: Sync protos Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix Signed-off-by: lijunlin <xiafeng.li@foxmail.com> refactor feat: Crun step submit feat: Enhance job and step processing with unified data structure and parsing functions Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix:step erro msg fix: x11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: X11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> feat: X11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> task exit status Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: Crun app args Signed-off-by: lijunlin <xiafeng.li@foxmail.com> sync protos Signed-off-by: lijunlin <xiafeng.li@foxmail.com> feat: Introcduce multi level cgrop for step/task Signed-off-by: lijunlin <xiafeng.li@foxmail.com> format sync protos fix: Enhance error handling and cleanup in 
crun and supervisor channels (#419) docs: Add comments to clarify the purpose of stepDoneChannel in SupervisorChannelKeeper refactor: Improve IO handling with goroutines and context management in StateMachineOfCrun refactor: Improve context management for IO forwarding in StateMachineOfCrun
| Commit: | 6a63477 | |
|---|---|---|
| Author: | NamelessOIer | |
| Committer: | GitHub | |
refactor: job/step/task misuse (#418) * fix job/step/task misuse * bug fix * fix * format * fix Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> * fix * fix: x11 task to step * fix --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
| Commit: | de618f6 | |
|---|---|---|
| Author: | NamelessOIer | |
fix
| Commit: | 71c2db5 | |
|---|---|---|
| Author: | NamelessOIer | |
fix: x11 task to step
| Commit: | ad156a7 | |
|---|---|---|
| Author: | NamelessOIer | |
| Committer: | NamelessOIer | |
fix job/step/task misuse
| Commit: | 0565217 | |
|---|---|---|
| Author: | huerni | |
| Committer: | GitHub | |
Feat:Add Qos tres fields and Qos tres limit (#286) * feat: qos add max_jobs_per_account max_submit_jobs_per_user max_submit_jobs_per_account * refactor qos table * refactor * feat: qos add max_jobs_per_account max_submit_jobs_per_user max_submit_jobs_per_account * refactor qos table * feat * feat: qos tres modify * feat: add qos flags DenyOnLimit * feat: add ERR_MAX_TRES_PER_USER_BEYOND and ERR_MAX_TRES_PER_ACCOUNT_BEYOND * fix: flag empty when add qos * fix: 1. grpwall 2. mem bytes 3. CraneErrStr * fix: err print * refactor * refactor * refactor * fix * feat: modify add tres validate * refactor err code name * feat: add / parser * feat: cacctmgr update help * feat: refactor max memory * add help * fix flags format * fix: parseTres * fix: flags * refactor ParseGresForQosLimit * fix: max wall * feat: merge master
| Commit: | c477f37 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor err code name
| Commit: | e7e4e27 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor
| Commit: | ef4e296 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor
| Commit: | 9f15a33 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor
| Commit: | 706bb41 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: add ERR_MAX_TRES_PER_USER_BEYOND and ERR_MAX_TRES_PER_ACCOUNT_BEYOND
| Commit: | b918cfb | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: add qos flags DenyOnLimit
| Commit: | 9f83830 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: qos tres modify
| Commit: | 3d3b0a0 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat
| Commit: | be145ce | |
|---|---|---|
| Author: | Junlin Li | |
| Committer: | GitHub | |
feat: X11 multi connection and multi task for step (#371) * fix: x11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: X11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> feat: X11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> task exit status Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: Crun app args Signed-off-by: lijunlin <xiafeng.li@foxmail.com> sync protos Signed-off-by: lijunlin <xiafeng.li@foxmail.com> feat: Introcduce multi level cgrop for step/task Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix:step erro msg * feat: ntasks (#392) * fix: set default CpusPerTask * fix mem limit * rename req_res_view to req_total_res_view * fix: defalut NodeNum * fix: Mem check * more friendly error message * fix: invalid value check --------- Co-authored-by: NamelessOIer <70872016+NamelessOIer@users.noreply.github.com>
| Commit: | a75bfb6 | |
|---|---|---|
| Author: | zhansan114514 | |
| Committer: | RileyWen | |
insert data into influxdb
| Commit: | a6a0213 | |
|---|---|---|
| Author: | edragain | |
| Committer: | GitHub | |
feat: cinfo --list-reasons/-R (#405) * cinfo -R * add single node mode which only used in cinfo -R yet * fix filter * change style * expand pattern * fix format * fix format(2) * fix * fix codestyle
| Commit: | 2dc74cb | |
|---|---|---|
| Author: | RileyWen | |
chore: sync proto with backend.
| Commit: | a82174b | |
|---|---|---|
| Author: | edragain | |
| Committer: | GitHub | |
feat: add field SubmitNode to ccontrol show job (#410) * feat:submit_node * add print * change name
| Commit: | 1fc4dab | |
|---|---|---|
| Author: | RileyWen | |
| Committer: | RileyWen | |
feat: crun input/output/error redirection and batch file pattern support - Add crun stdin/stdout/stderr redirection with file options - Add batch input/output/error option with file pattern support - Add crun environment variable passthrough from env - Fix x11 multi connection handling - Fix step error message display - Fix usage shown when crun cmd returns exit code 2
| Commit: | 6b4e7d0 | |
|---|---|---|
| Author: | Riley W | |
| Committer: | GitHub | |
feat: reset task id counter (#413) * feat: add ccontrol reset next-task-id / next-task-db-id commands Admin-only commands to reset task ID counters in ctld's embedded DB. Used by the test framework to reset state between test cases without restarting ctld. Usage: ccontrol reset next-task-id # reset to 1 ccontrol reset next-task-id 5 # reset to 5 ccontrol reset next-task-db-id # reset to 1 ccontrol reset next-task-db-id 5 # reset to 5 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add delete ALL --force and reset commands for all metadata Support bulk cleanup via CLI: - cacctmgr delete account/qos/wckey/resource ALL --force - ccontrol delete reservation ALL --force - ccontrol reset partition-acl (reload defaults from config) - ccontrol reset next-step-db-id Skip required params (user/server) when deleting ALL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add ccontrol reset task-history command Purges all task/step history from ctld's embedded DB via RPC. Safe alternative to file-level wipe — works regardless of ctld restart state. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: standardize formatting in EntityType struct --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| Commit: | b9385e8 | |
|---|---|---|
| Author: | Riley W | |
| Committer: | GitHub | |
Merge branch 'master' into feat/reset-task-id-counter
| Commit: | 8637e43 | |
|---|---|---|
| Author: | 1daidai1 | |
| Committer: | GitHub | |
Add creport cmd (#357) * add creport * refactor * fix: Remove creport activate * refactor: gid parsing and remove unused functions * chore: Remove debug output * chore: Update json format * set default time to UTC time * refactor --------- Co-authored-by: junlinli <xiafeng.li@foxmail.com>
| Commit: | 53b3d8c | |
|---|---|---|
| Author: | RileyWen | |
feat: add ccontrol reset task-history command Purges all task/step history from ctld's embedded DB via RPC. Safe alternative to file-level wipe — works regardless of ctld restart state. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| Commit: | 185a5a3 | |
|---|---|---|
| Author: | RileyWen | |
feat: add delete ALL --force and reset commands for all metadata

Support bulk cleanup via CLI:
- cacctmgr delete account/qos/wckey/resource ALL --force
- ccontrol delete reservation ALL --force
- ccontrol reset partition-acl (reload defaults from config)
- ccontrol reset next-step-db-id

Skip required params (user/server) when deleting ALL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| Commit: | 6a2aa78 | |
|---|---|---|
| Author: | RileyWen | |
| Committer: | RileyWen | |
feat: add ccontrol reset next-task-id / next-task-db-id commands

Admin-only commands to reset task ID counters in ctld's embedded DB. Used by the test framework to reset state between test cases without restarting ctld.

Usage:
- ccontrol reset next-task-id # reset to 1
- ccontrol reset next-task-id 5 # reset to 5
- ccontrol reset next-task-db-id # reset to 1
- ccontrol reset next-task-db-id 5 # reset to 5

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| Commit: | 079d5a7 | |
|---|---|---|
| Author: | huerni | |
| Committer: | GitHub | |
feat: Add QOS fields: MaxJobsPerUser, MaxSubmitJobsPerUser, MaxJobsPerAccount, MaxSubmitJobsPerAccount (#252) * feat: qos add max_jobs_per_account max_submit_jobs_per_user max_submit_jobs_per_account * refactor * refactor qos table * refactor modify field array str * feat: task account_chain * refactor * refactor * fix fmt * feat: update doc
| Commit: | 440e743 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: task account_chain
| Commit: | 31643a8 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor modify field array str
| Commit: | 9a89113 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor
| Commit: | 0350ae3 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: qos add max_jobs_per_account max_submit_jobs_per_user max_submit_jobs_per_account
| Commit: | 1691cc8 | |
|---|---|---|
| Author: | huerni | |
| Committer: | GitHub | |
feat: delete wckey add force (#404)
| Commit: | 7d3a23b | |
|---|---|---|
| Author: | Yongkun Li | |
| Committer: | GitHub | |
feat: Add CNI for CoreDNS and DNS related options (#407) * add annotations * add ccon --dns * rebase * add cbatch --pod-dns and adjust format * add cni plugin * refactor: Refactor DNS related logic * refactor: Refactor ipv4 checking * fix: Align with backend * fix: Fix comments * fix: Fix validation bug in dns flags --------- Co-authored-by: edragain2nd <edragain@163.com>
| Commit: | 98db4ad | |
|---|---|---|
| Author: | huerni | |
feat: delete wckey add force
| Commit: | fab2756 | |
|---|---|---|
| Author: | huerni | |
| Committer: | GitHub | |
feat: cbatch signal (#394) * add cbatch signal * feat: add R|B and support multi signal * refactor * refactor * fix format * refactor * fix step mod * feat: crun and calloc not bash * fix err output --------- Co-authored-by: db <1301189887@qq.com>
| Commit: | b477f85 | |
|---|---|---|
| Author: | NamelessOIer | |
| Committer: | NamelessOIer | |
feat: ntasks
| Commit: | 69a3048 | |
|---|---|---|
| Author: | Junlin Li | |
| Committer: | junlinli | |
fix: x11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: X11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> feat: X11 multi conn Signed-off-by: lijunlin <xiafeng.li@foxmail.com> task exit status Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix: Crun app args Signed-off-by: lijunlin <xiafeng.li@foxmail.com> sync protos Signed-off-by: lijunlin <xiafeng.li@foxmail.com> feat: Introcduce multi level cgrop for step/task Signed-off-by: lijunlin <xiafeng.li@foxmail.com> fix:step erro msg
| Commit: | 8f6fcab | |
|---|---|---|
| Author: | huerni | |
| Committer: | GitHub | |
feat: Crunprolog/epilog (#342) * feat: crunprolog and crunepilog * refactor * refactor * refactor * faet: crun add --task_prolog and --task_epilog * merge master * fix and add timeout
| Commit: | e8b9a35 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
merge master
| Commit: | b56f737 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
faet: crun add --task_prolog and --task_epilog
| Commit: | e4aef92 | |
|---|---|---|
| Author: | huerni | |
| Committer: | GitHub | |
feat: remote license (#372) * feat: licenses * feat: ccontrol show query for multiple specified licenses. * refactor * rebase master * feat: show lic * feat: show lic add json * feat * feat: add query resource * feat * refactor * refactor * feat: show lic add remote * feat: last update show * refactor modify to one request * feat: add flag * feat: add update * refactor * merge master * feat: support parse foorba@db * refactor * refactor * feat: add reserved last_deficit last_consumed in monitor * refactor * refactor * fix pattern * refactor * fix * fix timeStr and show lic * refactor * refactor
| Commit: | 011ff31 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor
| Commit: | 0a08b26 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor
| Commit: | cd04d05 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor
| Commit: | d5b5a47 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor
| Commit: | 446da43 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: last update show
| Commit: | 2da59b7 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
refactor modify to one request
| Commit: | 83650e1 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: add flag
| Commit: | 768986c | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
merge master
| Commit: | 9c401ef | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: show lic add remote
| Commit: | c3ab77b | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat: add query resource
| Commit: | d84dba8 | |
|---|---|---|
| Author: | huerni | |
| Committer: | huerni | |
feat
| Commit: | 23848c9 | |
|---|---|---|
| Author: | edragain | |
| Committer: | GitHub | |
fix: duplicate modification message output (#367) * fix issue * add qos rich_error_list * remove brace