HPC Cluster back up, test installations and restore files


Date: Fri, 10 Apr 2020 16:56:38 -0500
From: chtc-users@xxxxxxxxxxx
Subject: HPC Cluster back up, test installations and restore files
Greetings CHTC users,

This message is for users of our HPC cluster.

The HPC cluster is back up. During the downtime, jobs were removed from the queue, so they will need to be resubmitted.

The downtime for the cluster was prolonged due to instabilities and errors in the file system that contains user home directories. We have made our best effort at restoring the file system, but some home directories may still be missing files. We strongly recommend restoring software installations and files from your own backups and running test jobs to confirm that your job submissions are still functional.

As a reminder, because file system issues like this can happen and we do not maintain backup copies of files used in CHTC, it is crucial to remove files from CHTC systems when you are no longer using them and to keep your own backups of regularly used files.

Because it's Friday afternoon, we will be less responsive to issues that may crop up over the weekend. Please be a good citizen and take the time to test your submissions and gradually scale up. Email us at chtc@xxxxxxxxxxx with any issues you see and we will get to them as soon as we can.

Best,
Your CHTC team