
[HTCondor-users] Remote cluster test failed when using condor_remote_cluster command



Dear HTCondor Development Team,
    I have access to two campus clusters: one is LSF-based and the other is Slurm-based. I am not an administrator of either cluster, but I would like to use both of them to execute a single workflow simultaneously, and I think condor_remote_cluster may let me do that. First question: can I use HTCondor to run one workflow across both clusters at the same time?
    So far, I have installed HTCondor (MiniCondor) on my PC workstation, which is on the same local area network as the campus clusters. I used the condor_remote_cluster command to add the LSF cluster and the Slurm cluster. Both were added successfully and appear in the remote cluster list. However, when I run the test with "condor_remote_cluster -t", the test job is not dispatched to the remote cluster and remains idle in condor_q.
Could you suggest how I should set up my environment? Is it possible to achieve this without root access on the clusters? I look forward to your reply.

****Log from my PC workstation****

root@ubuntu:~/bosco-test/boscotest.p3SGb# condor_remote_cluster -t cse-liyf@xxxxxxxxxxxx

Testing ssh to cse-liyf@xxxxxxxxxxxxxxxxxxxxx!

Testing remote submission...Passed!

Submission and log files for this job are in /root/bosco-test/boscotest.2DBlK

Waiting for jobmanager to accept job...Passed

Checking for submission to remote lsf cluster (could take ~30 seconds)...grep: /root/bosco-test/boscotest.2DBlK/logfile: No such file or directory

grep: /root/bosco-test/boscotest.2DBlK/logfile: No such file or directory

grep: /root/bosco-test/boscotest.2DBlK/logfile: No such file or directory

grep: /root/bosco-test/boscotest.2DBlK/logfile: No such file or directory

grep: /root/bosco-test/boscotest.2DBlK/logfile: No such file or directory

After that, the test failed.



Yifei Li