Re: [Condor-users] Checkpointing Errors



Simon David Hammond wrote:
> Hi All,
>
> [snip]
>
> Our LOWPORT is 9000 and HIGHPORT is 9500 for servers and 9060 for clients. I'm confused as to why the checkpointing system is picking 53211 and I can't seem to find a configuration option to change it!

Is Condor configured to send the checkpoint back to the condor_shadow process, or have you configured a checkpoint server?

If the former, do you consider the machine where the condor_shadow runs (your submit machine) to be a client or a server? If a client, perhaps 60 ports isn't enough --- how many jobs are simultaneously running from the submit machine?
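
As a rough way to reason about that question, here is a small sketch that counts the ports in a range and checks whether a given port falls inside it. It assumes the LOWPORT/HIGHPORT range is inclusive at both ends. The function names and the ports_per_job figure are hypothetical, chosen only for illustration; how many ports a running job actually consumes on the submit machine is not stated here and should be checked against your own setup.

```python
def ports_in_range(low: int, high: int) -> int:
    """Count the ports in the inclusive range [low, high]."""
    if high < low:
        raise ValueError("high must be >= low")
    return high - low + 1


def in_range(port: int, low: int, high: int) -> bool:
    """True if port lies inside the inclusive range [low, high]."""
    return low <= port <= high


def max_simultaneous_jobs(low: int, high: int, ports_per_job: int) -> int:
    """Upper bound on jobs if each one needs ports_per_job ports.

    ports_per_job is an assumed, illustrative figure, not a Condor value.
    """
    if ports_per_job < 1:
        raise ValueError("ports_per_job must be >= 1")
    return ports_in_range(low, high) // ports_per_job


if __name__ == "__main__":
    # Values taken from the question: client range and the observed port.
    print(ports_in_range(9000, 9060))
    print(in_range(53211, 9000, 9500))
    print(max_simultaneous_jobs(9000, 9060, ports_per_job=2))
```

With the numbers from the question, the client range holds 61 ports, and 53211 lies outside both the client and the server range.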

Finally, there may be some good clues in the ShadowLog file from the submit machine at the same time the job is trying to checkpoint.
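
One simple way to look for those clues is to filter the log for lines that mention checkpointing or the unexpected port. The sketch below is a plain substring search, not anything Condor-specific. The function name, the search terms, and the log path in the usage comment are assumptions, so substitute the real location of your ShadowLog.

```python
from typing import Iterable, List


def find_clues(lines: Iterable[str], needles: Iterable[str]) -> List[str]:
    """Return the lines that contain any of the needles (case-insensitive)."""
    wanted = [n.lower() for n in needles]
    return [
        line.rstrip("\n")
        for line in lines
        if any(n in line.lower() for n in wanted)
    ]


# Example usage (the path is a placeholder, not a known location):
#
#     with open("/path/to/ShadowLog") as log:
#         for hit in find_clues(log, ["checkpoint", "53211"]):
#             print(hit)
```

Narrowing the matches further to the time the job tried to checkpoint should make the relevant entries easier to spot.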

regards
Todd