Re: [Condor-users] Failed to chmod file



Simon,

Do you run Condor as root? In my case, I have a condor user, and there is a file named condor in the /etc/init.d directory that runs condor_master, so Condor is started automatically as root when the machine restarts. Have you changed /etc/condor/condor_config.root or done anything else with it? I still have the same problem, and I am thinking about trying this.

Sophie
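
One way to check which user the Condor daemons actually run as is sketched below. The function name is made up for illustration, and the sketch assumes a ps that accepts the POSIX -e and -o options. On some systems the comm column may show a full path instead of the bare program name, in which case the match needs adjusting.

```shell
#!/bin/sh
# condor_master_owner: print the user(s) owning any running condor_master.
# Hypothetical helper name; relies only on POSIX ps, awk and sort.
condor_master_owner() {
    # "user=" and "comm=" suppress the header line of ps.
    owner=$(ps -e -o user= -o comm= 2>/dev/null |
        awk '$2 == "condor_master" { print $1 }' | sort -u)
    if [ -n "$owner" ]; then
        echo "condor_master runs as: $owner"
    else
        echo "condor_master is not running on this host"
    fi
}

condor_master_owner
```

If the reported user is root, Condor can switch to the submitting user when it writes job files. If it is an unprivileged account such as condor, every file operation is done as that account.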

>>> Si Hammond <simon.hammond@xxxxxxxxx> 15/08/2007 20:31 >>>

Has anyone got Condor to work with ACLs in Linux? We seem to have
tried a lot between us and can't get this to work.


Si Hammond
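
A possible reason the ACL approach does not help, offered as an assumption rather than a diagnosis: the ShadowLog error quoted at the bottom of this thread is a failed chmod, and chmod succeeds only when the caller owns the file or is root, whatever the ACL grants. The sketch below checks that condition for the current user. The function name can_chmod is hypothetical and is not part of Condor.

```shell
#!/bin/sh
# can_chmod FILE: print "yes" if the current user may chmod FILE, else "no".
# chmod requires the caller to be the file's owner or root; an ACL entry
# that grants rwx does not change this.
can_chmod() {
    file=$1
    me=$(id -u)
    # With -n, the third field of ls -l is the owner's numeric uid.
    owner=$(ls -ldn "$file" 2>/dev/null | awk '{ print $3 }')
    if [ "$me" -eq 0 ] || [ "$me" -eq "$owner" ] 2>/dev/null; then
        echo "yes"
    else
        echo "no"
    fi
}

# Check the file named on the command line, or the current directory.
can_chmod "${1:-.}"
```

Run as the account the Condor daemons use, against one of the job's output files, an answer of "no" would be consistent with the "Not owner (errno: 1)" message.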


On 15 Aug 2007, at 14:14, Sophie Prieur wrote:

> Hi Simon,
>
> No, the problem remains the same. From a normal user account, a job
> for Unix doesn't work, but a job for Windows has no problem.
> Submitting a job on Windows works, and submitting a job as the
> condor user works too.
>
> Sophie
>
>>>> "Simon Hammond" <simon.hammond@xxxxxxxxx> 15/08/2007 10:57 >>>
> Sophie,
>
> Have you managed to fix this? I've been experimenting here and we
> are having the same problems. The ACL solution doesn't seem to work
> here.
>
> Thanks,
>
> Si Hammond
>
>
> On 08/08/07, Sophie Prieur <s.prieur@xxxxxxxxxxxxxx> wrote:
>>
>> Simon,
>>
>> Sorry, but I did not understand your answer. The job runs well and
>> gives results with the condor account, but with any account other
>> than condor the job has these problems. I am using NFS, and I get
>> this warning when I submit the job:
>> WARNING: Log file /home/sp5978/Simul/condorRes/neutral/out1/condor_log is on NFS.
>> This could cause log file corruption and is _not_ recommended.
>>
>> Thank you
>> Sophie
>>
>>>>> Si Hammond <simon.hammond@xxxxxxxxx> 07/08/2007 18:25 >>>
>>
>> Sophie,
>>
>> Are you running Stork to handle the file transfers? If you're not
>> using a shared filesystem then you might need that.
>>
>>
>>
>> S.
>>
>>
>> On 7 Aug 2007, at 10:46, Sophie Prieur wrote:
>>
>>> Thank you very much Simon
>>> I have tried changing the ACL, as I am not running Condor as root, but
>>> the result is always the same, and I have this in the StarterLog:
>>> ******************************************************
>>> 8/7 10:41:49 Using config source: /home/condor/condor_config
>>> 8/7 10:41:49 Using local config sources:
>>> 8/7 10:41:49    /home/condor/hosts/balsa/condor_config.local
>>> 8/7 10:41:49 DaemonCore: Command Socket at <143.234.88.55:63601>
>>> 8/7 10:41:49 Done setting resource limits
>>> 8/7 10:41:49 Communicating with shadow <143.234.88.55:63599>
>>> 8/7 10:41:49 Submitting machine is "balsa.macaulay.ac.uk"
>>> 8/7 10:41:50 File transfer completed successfully.
>>> 8/7 10:41:51 Starting a VANILLA universe job with ID: 176.0
>>> 8/7 10:41:51 IWD: /home/condor/hosts/balsa/execute/dir_24354
>>> 8/7 10:41:51 Output file: /home/condor/hosts/balsa/execute/
>>> dir_24354/condor_output
>>> 8/7 10:41:51 Error file: /home/condor/hosts/balsa/execute/dir_24354/
>>> condor_error
>>> 8/7 10:41:51 About to exec /home/condor/hosts/balsa/execute/
>>> dir_24354/condor_exec.exe Simul --batch -cfg /home/sp5978/simul2/
>>> configFiles/neutral/config-neutral0
>>> 8/7 10:41:51 Create_Process succeeded, pid=24357
>>> 8/7 10:42:03 Process exited, pid=24357, status=134
>>> 8/7 10:42:03 condor_write(): send() 65536 bytes to unknown source
>>> returned -1, timeout=30, errno=32 (Broken pipe).  Assuming failure.
>>> 8/7 10:42:03 ReliSock::put_bytes_nobuffer: Send failed.
>>> 8/7 10:42:03 ReliSock::put_file: failed to put 65536 bytes
>>> (put_bytes_nobuffer() returned -1)
>>> 8/7 10:42:03 DoUpload: STARTER at 143.234.88.55 failed to send file
>>> (s) to <143.234.88.55:63599>: error sending /home/condor/hosts/
>>> balsa/execute/dir_24354/core.176.0; SHADOW at 143.234.88.55 failed
>>> to receive file /home/sp5978/simul2/condorRes/neutral/out0/
>>> condor_output
>>> 8/7 10:42:03 File transfer failed, forcing disconnect.
>>> 8/7 10:42:03 JIC::allJobsDone() failed, waiting for job lease to
>>> expire or for a reconnect attempt
>>> 8/7 10:42:03 Accepted request to reconnect from <0.0.0.0:0>
>>> 8/7 10:42:03 Ignoring old shadow <143.234.88.55:63599>
>>> 8/7 10:42:03 Communicating with shadow <143.234.88.55:63599>
>>> 8/7 10:42:04 condor_write(): send() 65536 bytes to unknown source
>>> returned -1, timeout=30, errno=32 (Broken pipe).  Assuming failure.
>>> 8/7 10:42:04 ReliSock::put_bytes_nobuffer: Send failed.
>>> 8/7 10:42:04 ReliSock::put_file: failed to put 65536 bytes
>>> (put_bytes_nobuffer() returned -1)
>>> 8/7 10:42:04 DoUpload: STARTER at 143.234.88.55 failed to send file
>>> (s) to <143.234.88.55:63599>: error sending /home/condor/hosts/
>>> balsa/execute/dir_24354/core.176.0; SHADOW at 143.234.88.55 failed
>>> to receive file /home/sp5978/simul2/condorRes/neutral/out0/
>>> condor_output
>>> 8/7 10:42:04 File transfer failed, forcing disconnect.
>>> 8/7 10:42:04 JIC::allJobsDone() failed, waiting for job lease to
>>> expire or for a reconnect attempt
>>> 8/7 10:42:04 Accepted request to reconnect from <0.0.0.0:0>
>>> 8/7 10:42:04 Ignoring old shadow <143.234.88.55:63599>
>>> 8/7 10:42:04 Communicating with shadow <143.234.88.55:63599>
>>> 8/7 10:42:04 condor_write(): send() 65536 bytes to unknown source
>>> returned -1, timeout=30, errno=32 (Broken pipe).  Assuming failure.
>>> 8/7 10:42:04 ReliSock::put_bytes_nobuffer: Send failed.
>>> 8/7 10:42:04 ReliSock::put_file: failed to put 65536 bytes
>>> (put_bytes_nobuffer() returned -1)
>>> 8/7 10:42:04 DoUpload: STARTER at 143.234.88.55 failed to send file
>>> (s) to <143.234.88.55:63599>: error sending /home/condor/hosts/
>>> balsa/execute/dir_24354/core.176.0; SHADOW at 143.234.88.55 failed
>>> to receive file /home/sp5978/simul2/condorRes/neutral/out0/
>>> condor_output
>>> 8/7 10:42:04 File transfer failed, forcing disconnect.
>>> 8/7 10:42:04 JIC::allJobsDone() failed, waiting for job lease to
>>> expire or for a reconnect attempt
>>> 8/7 10:42:04 Accepted request to reconnect from <0.0.0.0:0>
>>> 8/7 10:42:04 Ignoring old shadow <143.234.88.55:63599>
>>> 8/7 10:42:04 Communicating with shadow <143.234.88.55:63599>
>>> 8/7 10:42:05 condor_write(): send() 65536 bytes to unknown source
>>> returned -1, timeout=30, errno=32 (Broken pipe).  Assuming failure.
>>> 8/7 10:42:05 ReliSock::put_bytes_nobuffer: Send failed.
>>> 8/7 10:42:05 ReliSock::put_file: failed to put 65536 bytes
>>> (put_bytes_nobuffer() returned -1)
>>> 8/7 10:42:05 DoUpload: STARTER at 143.234.88.55 failed to send file
>>> (s) to <143.234.88.55:63599>: error sending /home/condor/hosts/
>>> balsa/execute/dir_24354/core.176.0; SHADOW at 143.234.88.55 failed
>>> to receive file /home/sp5978/simul2/condorRes/neutral/out0/
>>> condor_output
>>> 8/7 10:42:05 File transfer failed, forcing disconnect.
>>> 8/7 10:42:05 JIC::allJobsDone() failed, waiting for job lease to
>>> expire or for a reconnect attempt
>>> 8/7 10:42:05 Accepted request to reconnect from <0.0.0.0:0>
>>> 8/7 10:42:05 Ignoring old shadow <143.234.88.55:63599>
>>> 8/7 10:42:05 Communicating with shadow <143.234.88.55:63599>
>>> 8/7 10:42:05 condor_write(): send() 65536 bytes to unknown source
>>> returned -1, timeout=30, errno=32 (Broken pipe).  Assuming failure.
>>> 8/7 10:42:05 ReliSock::put_bytes_nobuffer: Send failed.
>>> 8/7 10:42:05 ReliSock::put_file: failed to put 65536 bytes
>>> (put_bytes_nobuffer() returned -1)
>>> 8/7 10:42:05 DoUpload: STARTER at 143.234.88.55 failed to send file
>>> (s) to <143.234.88.55:63599>: error sending /home/condor/hosts/
>>> balsa/execute/dir_24354/core.176.0; SHADOW at 143.234.88.55 failed
>>> to receive file /home/sp5978/simul2/condorRes/neutral/out0/
>>> condor_output
>>> 8/7 10:42:05 JIC::allJobsDone() failed, waiting for job lease to
>>> expire or for a reconnect attempt
>>> 8/7 10:42:18 Got SIGQUIT.  Performing fast shutdown.
>>> 8/7 10:42:18 ShutdownFast all jobs.
>>> 8/7 10:42:18 **** condor_starter (condor_STARTER) EXITING WITH
>>> STATUS 0
>>>
>>> I can see the results in the condor_output file, but the job  
>>> restarts.
>>>
>>>>>> "Simon Hammond" <simon.hammond@xxxxxxxxx> 07/08/2007 09:53 >>>
>>> I guess you are not running Condor as root?
>>>
>>> If so, you can use ACLs to give the user Condor is running as
>>> access to the file,
>>>
>>> e.g. setfacl -m u:condor:rwx ./myfile.txt
>>>
>>> This will enable just the condor user to read/write the file. You
>>> may need to adjust the mask to get this to work correctly.
>>>
>>>
>>> On 07/08/07, Sophie Prieur <s.prieur@xxxxxxxxxxxxxx> wrote:
>>>>
>>>>  Hi everybody,
>>>>
>>>> I have a problem when I submit a job. I have this in the ShadowLog:
>>>> ReliSock::get_file_with_permissions(): Failed to chmod file '/home/sp5978/simul2/condorRes/neutral/out1/condor_output': Not owner (errno: 1)
>>>> and this in the StarterLog:
>>>> DoUpload: STARTER at 143.234.88.55 failed to send file(s) to <143.234.88.55:51883>; SHADOW at 143.234.88.55 failed to receive file /home/sp5978/simul2/condorRes/neutral/out0/condor_output
>>>>
>>>> This is the submit file:
>>>> Universe = vanilla
>>>> Executable = /software/guiswarm/swarm-2.2/bin/javaswarm
>>>>
>>>> Log = condor_log
>>>> Error = condor_error
>>>> Output = condor_output
>>>>
>>>> getenv = true
>>>>
>>>> requirements = ((( OpSys == "SOLARIS29" ) && ( Arch == "SUN4u" )) || (( OpSys == "SOLARIS28" ) && ( Arch == "SUN4u" )))
>>>> transfer_input_files = /home/sp5978/simul2/bin/Simul.class,
>>>> /home/sp5978/simul2/bin/BatchSwarm.class,
>>>> /home/sp5978/simul2/bin/Beq0.class,
>>>> /home/sp5978/simul2/bin/Cell.class,
>>>> /home/sp5978/simul2/bin/DNA.class,
>>>> /home/sp5978/simul2/bin/DsupK.class,
>>>> /home/sp5978/simul2/bin/incorrectValue.class,
>>>> /home/sp5978/simul2/bin/Individual.class,
>>>> /home/sp5978/simul2/bin/Map.class,
>>>> /home/sp5978/simul2/bin/missingValue.class,
>>>> /home/sp5978/simul2/bin/ModelSwarm.class,
>>>> /home/sp5978/simul2/bin/noEnoughValues.class,
>>>> /home/sp5978/simul2/bin/ObserverSwarm.class,
>>>> /home/sp5978/simul2/bin/Param.class,
>>>> /home/sp5978/simul2/bin/Plant.class,
>>>> /home/sp5978/simul2/bin/Project.class,
>>>> /home/sp5978/simul2/bin/Seed.class,
>>>> /home/sp5978/simul2/bin/Specie.class,
>>>> /home/sp5978/simul2/bin/SwarmUtils.class
>>>> transfer_files = ALWAYS
>>>>
>>>> InitialDir = /home/sp5978/simul2/condorRes/neutral/out0
>>>> Arguments = Simul --batch -cfg /home/sp5978/simul2/configFiles/neutral/config-neutral0
>>>> Queue
>>>> InitialDir = /home/sp5978/simul2/condorRes/neutral/out1
>>>> Arguments = Simul --batch -cfg /home/sp5978/simul2/configFiles/neutral/config-neutral1
>>>> Queue
>>>> And the permissions on condor_output:
>>>> bash-2.03$ ls -l ../condorRes/neutral/out0
>>>> total 377
>>>> -rw-rw-r--    1 sp5978   staff           0 Aug  6 11:18 condor_error
>>>> -rw-rw-r--    1 sp5978   staff      370097 Aug  7 09:30 condor_log
>>>> -rwxrwxrwx    1 sp5978   staff         400 Aug  7 09:29 condor_output
>>>>
>>>> The job runs and I can see the results in the condor_output
>>>> files, but it doesn't stop: it remains in the queue and restarts
>>>> after a while.
>>>> Can someone help me?
>>>> Thanks in advance
>>>> Sophie
>>>>
>>>>
>>>> --
>>>> Please note that the views expressed in this e-mail are those of the
>>>> sender and do not necessarily represent the views of the Macaulay
>>>> Institute. This email and any attachments are confidential and are
>>>> intended solely for the use of the recipient(s) to whom they are
>>>> addressed. If you are not the intended recipient, you should not read,
>>>> copy, disclose or rely on any information contained in this e-mail, and
>>>> we would ask you to contact the sender immediately and delete the email
>>>> from your system. Thank you.
>>>> Macaulay Institute and Associated Companies, Macaulay Drive,
>>>> Craigiebuckler, Aberdeen, AB15 8QH.
>>>> _______________________________________________
>>>> Condor-users mailing list
>>>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx
>>>> with a subject: Unsubscribe
>>>> You can also unsubscribe by visiting
>>>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users 
>>>>
>>>> The archives can be found at:
>>>> https://lists.cs.wisc.edu/archive/condor-users/ 
>>>>
>>>>
