
Re: [HTCondor-users] HTCondor can't execute the job with error: Error: can't find resource with ClaimId



An interesting observation: if I do not start the jobs immediately but submit them at intervals of 15-20 minutes, the error does not occur. Or is it just luck? I still need help, so any ideas are welcome.


From: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>
To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>
Cc: "Dmitry Golubkov" <dmitry.golubkov@xxxxxxxxxxxxxx>
Sent: Thursday, June 17, 2021 1:32:43 PM
Subject: [HTCondor-users] HTCondor can't execute the job with error: Error: can't find resource with ClaimId

Dear all,

I have an HTCondor cluster configured to use partitionable slots. After a job is submitted, HTCondor creates dynamic slots and tries to execute the job, but fails immediately with the error "Error: can't find resource with ClaimId (...) for 444 (ACTIVATE_CLAIM)" (please take a look at the attached log). After some time it re-creates the dynamic slots and the job runs successfully, but why? Is this a known issue? It happens very often and slows down job execution. Any ideas on how this can be solved? Or is it specific to my setup? The latest version of HTCondor behaves the same way.

Thanks in advance,
Dmitry.


