
Re: [Condor-users] Condor-G - job submission problem - updated



On Sep 1, 2005, at 9:27 AM, duane waktu wrote:

Thanks for your email. I just had a chance to get back to playing around with condor again today.

Btw, I followed your suggestion to simply use '/tmp' directory and it 'seemed' to work fine. I said it 'seemed' because I was not sure if everything went smoothly. The GridmanagerLog file still shows that I got SIGTERM at the end, although I seemed to get the expected output and the condor queue shows no more jobs to be submitted.

The SIGTERM is normal. When the gridmanager has no more jobs to manage, it exits by sending itself a SIGTERM. A little unusual, but that's how it works.


Another question is what do I need to do so that Condor transfers the jobs to other machines too?
Earlier I was able to do that when testing the simple Hello.java example from the Condor guide, which uses the JAVA universe. With that example, I could see the job go to the Condor Manager and execute there instead of on the local machine.


However, once I changed to the GRID universe, the job seems to be executed only on the local machine, i.e. it didn't get transferred to other machines. Is this because my 'globusscheduler' is set to my local machine? What do I need to do in order to have my jobs transferred to other machines?

Below is my submit file:
=============================================
 ...
 globusscheduler = https://<my_local_machine>:8443
 jobmanager_type = Fork
 should_transfer_files = YES
 when_to_transfer_output = ON_EXIT
 queue
=============================================

First, the machine you want to submit to has to have GT4 installed. Then you have to change globusscheduler in your submit file to point at that machine.
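
As a rough sketch of that change, the snippet below rewrites only the globusscheduler line of the submit file quoted above. The host name gt4-host.example.org is a made-up placeholder, and the temporary file holds just the lines shown in the original message (the elided part is left out), so treat this as an illustration rather than a complete submit file.

```shell
# Hypothetical host name; replace it with a machine that really has GT4 installed.
remote_host="gt4-host.example.org"

# Recreate only the submit-file lines quoted above (the elided part is omitted).
submit=$(mktemp)
cat > "$submit" <<'EOF'
globusscheduler = https://<my_local_machine>:8443
jobmanager_type = Fork
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
queue
EOF

# Point globusscheduler at the remote machine, keeping the port from the original.
updated=$(sed "s|^globusscheduler *=.*|globusscheduler = https://${remote_host}:8443|" "$submit")
echo "$updated"
rm -f "$submit"
```

All other lines pass through unchanged; only the target of the submission moves.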


If you want Condor to pick the machine using match-making (like it does for the other universes), you have to make the machine advertise itself to your central manager. There are instructions for how to do this in the Condor manual:
http://www.cs.wisc.edu/condor/manual/v6.7/5_3Grid_Universe.html#SECTION00634000000000000000


Jaime, back to the permission problem, I hope you don't mind me asking simple questions as follows:
1. What is the sticky bit?

% ls -ld /tmp
drwxrwxrwt 7 root root 8192 Sep 1 04:26 /tmp/

The 't' at the end of the permissions means the sticky bit is set. It means that although all users can write files in the directory, they can rename and remove only their own files. If the sticky bit isn't set, user A can delete user B's files (even if he's not allowed to read them).

2. What do I need to do to set a sticky bit on a directory?

You can set the sticky bit on a directory like so:

% chmod +t /scratch/grid-jobs
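
If you want to try this out safely first, here is a minimal sketch that uses a throwaway directory from mktemp as a stand-in for a real scratch directory (the stand-in is an assumption, not part of your setup). Mode 1777 is the numeric form of world-writable plus sticky, the same permissions /tmp has above. The -k test is supported by bash and most common shells, though it is not guaranteed everywhere.

```shell
# Throwaway directory used as a stand-in for a real scratch directory.
dir=$(mktemp -d)

# 1777 = rwx for everyone (777) plus the sticky bit (leading 1), like /tmp.
chmod 1777 "$dir"

# The mode column should now end in 't', e.g. drwxrwxrwt.
ls -ld "$dir"

# 'test -k' is true when the sticky bit is set on the file or directory.
if [ -k "$dir" ]; then
    sticky=yes
else
    sticky=no
fi
echo "sticky bit set: $sticky"
```

Using chmod +t instead of 1777 adds the sticky bit without touching the existing read/write/execute permissions.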

+----------------------------------+---------------------------------+
|            Jaime Frey            |  Public Split on Whether        |
|        jfrey@xxxxxxxxxxx         |  Bush Is a Divider              |
|  http://www.cs.wisc.edu/~jfrey/  |         -- CNN Scrolling Banner |
+----------------------------------+---------------------------------+