A user hit an issue after submitting ~1000 jobs to a cluster with ~100 nodes: at the end, 4 jobs remained in PENDING state while all other jobs were in terminal states.
In the `ray job list` output, it seems the most recently submitted job (via `ray job submit`) is in PENDING state even though `ray status` shows all CPUs/GPUs as available, i.e. Ray never starts the job that is in PENDING state.
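To make the symptom easier to confirm on a large run, here is a minimal sketch that tallies job statuses and lists the jobs that never reached a terminal state. It assumes the statuses have already been collected (for example, from the `ray job list` output) into `(job_id, status)` pairs; the `summarize_jobs` helper, the input shape, and the sample job IDs are hypothetical and not part of SkyPilot or Ray.

```python
from collections import Counter

# Ray job statuses that are terminal; PENDING and RUNNING are not.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED"}


def summarize_jobs(jobs):
    """Count jobs per status and collect the IDs of non-terminal jobs.

    `jobs` is an iterable of (job_id, status) pairs. This input shape is
    an assumption made for illustration only.
    """
    jobs = list(jobs)
    counts = Counter(status for _, status in jobs)
    stuck = [job_id for job_id, status in jobs
             if status not in TERMINAL_STATUSES]
    return counts, stuck


if __name__ == "__main__":
    # Made-up sample mirroring the report: 996 finished jobs, 4 left PENDING.
    sample = [(f"job-{i}", "SUCCEEDED") for i in range(996)]
    sample += [(f"job-{i}", "PENDING") for i in range(996, 1000)]
    counts, stuck = summarize_jobs(sample)
    print(dict(counts))
    print(stuck)
```

If the list of non-terminal jobs stays the same across repeated checks while `ray status` reports idle resources, that matches the behavior described above.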
Version & Commit info:
sky -v: PLEASE_FILL_IN
sky -c: PLEASE_FILL_IN