Article #6188: Unscheduled Gilbreth outage
Gilbreth is experiencing scheduling issues and jobs have been paused while RCAC works to resolve this issue. Running jobs have also been impacted, so...
Fortress began experiencing issues with its tape library around 5:00pm. Engineers are currently diagnosing the issue and are working to identify a fix...
Multiple clusters have been powered off in the MATH G109 datacenter due to a water issue in the building. Affected systems are Bell, Brown, Geddes, Gilbre...
At about noon today (Tuesday 12 September), we discovered an issue with the scheduler database related to the power outage last Sunday. Scheduling o...
Anvil is experiencing more issues related to the power outage yesterday in the Purdue Data Center. Users are currently unable to log in via any method,...
Update: As of 3:45pm, the Bell cluster has returned to production status. Scheduling is still paused on the Negishi cluster, and we will have an updat...
Users of Data Depot on RCAC clusters are currently experiencing significant performance degradation. The symptoms manifest as delays in listing or ac...
Open OnDemand services for the Hammer cluster are currently offline. Engineers are investigating a boot disk failure on the server that hosts the gat...
The Hammer cluster began experiencing issues with the Slurm scheduler around 5:00am, Thursday, July 6th. The Slurm scheduler is non-responsive, as a r...
The Geddes cluster began experiencing issues overnight. Engineers are currently diagnosing the issue and are working to identify a fix. Workloads will...
The Data Depot began experiencing issues with its network drive mapping capability around 1:30pm EDT. The symptoms manifest as users being unable to...
Around 2:10pm EST, the Brown cluster began experiencing issues with home directory mounts. Job scheduling on the Brown cluster has been paused while en...
The Anvil cluster began experiencing issues with its scratch filesystem around 6:45pm EDT. Access to scratch directories may be slow or hang. Engineer...
The Scholar cluster began experiencing issues with its ThinLinc remote desktop (desktop.scholar.rcac.purdue.edu) and its RStudio Server (rstudio.schol...
The Gilbreth cluster began experiencing issues with its scheduler spool filesystem around 10:30pm EDT on Saturday, March 18th, 2023. The problem manif...
The Bell cluster began experiencing issues with its scratch filesystem around 7:50pm EDT. File access operations (e.g. ls) may appear to hang. Logins...
The Bell cluster began experiencing issues with its scratch filesystem around 12:55pm EST. File access operations (e.g. ls) may appear to hang. Login...
The Anvil cluster began experiencing issues with its scratch and project file system around 10:00am EST. Access to scratch and project directories may...
The Bell cluster began experiencing issues with its Lustre scratch filesystem around 12:30pm EST. Engineers are currently diagnosing the issue and are...
The Data Depot began experiencing issues around 9:50am EST. While engineers work to diagnose and fix this issue, users may notice degraded performance...