
Re: [HTCondor-users] StartLog Error



Thanks for the suggestions. I was able to get condor_master running on the machines. However, I now have another issue. I have this setup:

Machine 1 - Submitter
Machine 2 - Master
Machine 3 - Executor
Machine 4 - Executor

When I submit a job from Machine 1, it makes it to Machine 3, where it should be executed. It is successfully sent there by Machine 2, which chooses it as the machine to run the job. However, once it gets there, the output cannot be sent back to Machine 1. The error I get when viewing condor_q -analyze <job_id> is below.

014.000:  Request is held.


Hold reason: Error from slot1@xxxxxxxxxxxxxxxxxxxxxxx: Failed to open '/Users/condor/Desktop/HTCondor/tests/simple.out' as standard output: No such file or directory (errno 2)


I am running condor_master as root on the executors and on the submitter. Why would the job be unable to write to the submitter machine, and what is the correct way to go about fixing this (without weakening security)?

Thanks for all your help!


Mike



On Wed, May 28, 2014 at 12:36 PM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
Mike, another idea re the below....


Although you did not tell us what platform you are using, it looks from the logs like you are using a Mac, and it looks like you are struggling a bit to install HTCondor on a Mac. If so, one option is to install HTCondor via MacPorts. I am not a Mac person, but supposedly MacPorts makes installing software correctly on a Mac very easy (similar to RPMs on Linux). Follow this URL if interested:


http://fumodibit.blogspot.com/2013/05/installing-and-configuring-personal.html

regards,
Todd




On 5/28/2014 2:28 PM, Todd Tannenbaum wrote:
On 5/28/2014 2:12 PM, Mike Ferraco wrote:
I receive this error in my StartLog each time I start condor_master on an
execute machine.

*** Last 20 line(s) of file
/Users/condor/HTCondor/local.highland/log/StartLog:
slot type 0: Cpus: 1, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1, Memory: 1024, Swap: 12.50%, Disk: 12.50%
slot type 0: Cpus: 1, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1, Memory: 1024, Swap: 12.50%, Disk: 12.50%
slot type 0: Cpus: 1, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1, Memory: 1024, Swap: 12.50%, Disk: 12.50%
slot type 0: Cpus: 1, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1, Memory: 1024, Swap: 12.50%, Disk: 12.50%
slot type 0: Cpus: 1, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1, Memory: 1024, Swap: 12.50%, Disk: 12.50%
05/28/14 15:09:25 slot1: New machine resource allocated
05/28/14 15:09:25 slot2: New machine resource allocated
05/28/14 15:09:25 slot3: New machine resource allocated
05/28/14 15:09:25 slot4: New machine resource allocated
05/28/14 15:09:25 slot5: New machine resource allocated
05/28/14 15:09:25 slot6: New machine resource allocated
05/28/14 15:09:25 slot7: New machine resource allocated
05/28/14 15:09:25 slot8: New machine resource allocated
05/28/14 15:09:25 WARNING: /Users/condor/HTCondor/local.highland/execute
root-squashed or not condor-owned: requiring world-writability
05/28/14 15:09:25 ERROR "chmod exec path
(/Users/condor/HTCondor/local.highland/execute),
errno: 1 (Operation not permitted)" at line 165 in file
/Volumes/MacintoshHD2/condor/slot1/dir_56609/userdir/src/
condor_startd.V6/util.cpp

Can anyone help identify the source of this issue?


At first blush, it looks like whatever directory you specified as
LOCAL_DIR in your condor_config file is living on a shared fileserver
(such as NFS).  Try to make it a subdirectory on your local disk.  If
you cannot (i.e. your machine does not have much local disk, or is
diskless), then make sure the ownership of this directory is "condor"
and/or the permissions are 01777.  I.e. ls -ld should look like this:

ingwe.cs.wisc.edu{tannenba}8: ls -ld `condor_config_val execute`
drwxrwxrwt 2 condor condor 4096 Dec  8  2011 /scratch/condor/execute/
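
The same check can be scripted. What follows is only a minimal sketch, assuming a Unix-like execute machine with Python available; the function name check_execute_dir and the default owner "condor" are hypothetical choices for illustration, and the path to pass in is whatever `condor_config_val execute` prints on your machine.

```python
import os
import pwd
import stat


def check_execute_dir(path, owner="condor"):
    """Report whether `path` matches the ownership/permissions advice above.

    `owner` is an assumption: the account the HTCondor daemons are
    expected to use on this machine.
    """
    st = os.stat(path)
    # Keep only the permission bits, including the sticky bit.
    mode = stat.S_IMODE(st.st_mode)
    try:
        owner_name = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # The uid has no passwd entry; fall back to the numeric id.
        owner_name = str(st.st_uid)
    return {
        "is_dir": stat.S_ISDIR(st.st_mode),
        "owner": owner_name,
        "owned_by_expected_user": owner_name == owner,
        "mode": oct(mode),
        "mode_is_01777": mode == 0o1777,
    }
```

For example, check_execute_dir("/scratch/condor/execute") returns a dictionary in which either "owned_by_expected_user" or "mode_is_01777" should be True if the directory follows the advice above.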


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@cs.wisc.edu with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing  Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685
