
Re: [HTCondor-users] Default value for "Iwd" classad? (Python-Condor)



Oh shoot, those are the classads for a job that ran fine (I temporarily set the Iwd to "/home/ubuntu", as I knew that existed).

Classads for failing job:

ImageSize = 1
LeaveJobInQueue = true
JobNotification = 2
TransferExecutable = false
StreamIn = false
AutoClusterId = 1
StreamErr = false
ShouldTransferFiles = "YES"
JobStatus = 1
LastJobStatus = 0
Owner = "ubuntu"
MyType = "Job"
Cmd = "/usr/bin/blender"
WhenToTransferOutput = "ON_EXIT"
GlobalJobId = "<machine-ip>#670.22#1364323301"
PeriodicRemove = false
ImageSize_RAW = 1
User = "ubuntu@<machine-ip>"
CurrentTime = time()
PeriodicHold = false
RootDir = "/"
Iwd = "/"
AutoClusterAttrs = "JobUniverse,LastCheckpointPlatform,NumCkpts,jordan,Requirements,NiceUser,ConcurrencyLimits"
QDate = 1364323304
ClusterId = 670
PeriodicRelease = false
Requirements = OpSys == "LINUX" && Arch == "INTEL"
StreamOut = false
Arguments = "-b dolphin.blend -o //render_# -F PNG -x 1 -f $(Process)"
TargetType = "Machine"
TransferInput = "<url>"
RemoteUserCpu = 0
JobPrio = 0
JobUniverse = 5
ProcId = 22
ServerTime = 1364324445

Hold error:

Error from <execute-node>: STARTER at <execute-node> failed to send file(s) to <execute-node>; SHADOW at <execute-node> failed to write to file //_condor_stdout: (errno 13) Permission denied

Here, I tried using "/" as the Iwd. If I used something like "/etc", the error would say "failed to write to file /etc/_condor_stdout", etc.
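
Since the hold message shows the shadow writing _condor_stdout under the Iwd, a candidate Iwd could be checked on the submit machine before queueing. This is only a rough sketch: the helper names iwd_is_usable and stdout_path are made up for illustration, and it assumes the check runs as the same user that owns the job.

```python
import os


def stdout_path(iwd):
    """Hypothetical helper: where _condor_stdout would land under this Iwd."""
    return os.path.join(iwd, "_condor_stdout")


def iwd_is_usable(iwd):
    """Return True if iwd is an existing directory the current user can write to.

    Assumption: the current user is the job owner, so a directory this
    check rejects is one where creating _condor_stdout would fail as well.
    """
    return os.path.isdir(iwd) and os.access(iwd, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Try the directories mentioned in this thread.
    for candidate in ("/", "/etc", "/home/ubuntu"):
        print(candidate, stdout_path(candidate), iwd_is_usable(candidate))
```

For an unprivileged user this would normally reject "/" and "/etc", which matches the Permission denied errors above.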

On Tue, Mar 26, 2013 at 2:34 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote:
Hi Jordan,

Looks like things are running right now.  What is the hold message you eventually receive?

FWIW - it would also be interesting to see the ClassAd you give to the Schedd object for submission.

Brian

On Mar 26, 2013, at 1:29 PM, Jordan Williamson <jordan.williamson@xxxxxxxxxxx> wrote:

Classads:

DiskUsage_RAW = 319
Requirements = OpSys == "LINUX" && Arch == "INTEL"
RemoteUserCpu = 0.0
JobFinishedHookDone = 1364322130
GlobalJobId = "<machine-ip>#669.23#1364321911"
NumJobStarts = 1
ExitCode = 0
StreamIn = false
ImageSize = 15000
CurrentTime = time()
JobStartDate = 1364322127
CurrentHosts = 0
JobCurrentStartDate = 1364322127
TargetType = "Machine"
ServerTime = 1364322453
LastPublicClaimId = "<machine-ip>#1364246102#73#..."
Cmd = "/usr/bin/blender"
TransferExecutable = false
JobUniverse = 5
BytesRecvd = 74.000000
RemoteWallClockTime = 3.000000
JobNotification = 2
Iwd = "/home/ubuntu"
RemoteSysCpu = 0.0
MachineAttrCpus0 = 1
Owner = "ubuntu"
LastJobStatus = 2
MemoryUsage = ( ( ResidentSetSize + 1023 ) / 1024 )
WhenToTransferOutput = "ON_EXIT"
EnteredCurrentStatus = 1364322130
LastJobLeaseRenewal = 1364322130
PeriodicHold = false
AutoClusterId = 1
JobCurrentStartExecutingDate = 1364322129
BytesSent = 24849.000000
JobPrio = 0
RootDir = "/"
PeriodicRelease = false
NumJobMatches = 1
LastMatchTime = 1364322127
PeriodicRemove = false
LeaveJobInQueue = true
StreamOut = false
CommittedSlotTime = 3.000000
DiskUsage = 325
AutoClusterAttrs = "JobUniverse,LastCheckpointPlatform,NumCkpts,jordan,Requirements,NiceUser,ConcurrencyLimits"
ClusterId = 669
CommittedTime = 3
CompletionDate = 1364322130
SpooledOutputFiles = "render_0.png"
StartdPrincipal = "unauthenticated@unmapped/10.194.169.234"
JobCurrentStartTransferOutputDate = 1364322130
TransferInput = "<url>"
CumulativeSlotTime = 3.000000
MyType = "Job"
JobRunCount = 1
LastRemoteHost = "<machine-ip>"
StreamErr = false
ResidentSetSize = 0
ProcId = 23
User = "ubuntu@<machine-ip>"
ExitBySignal = false
Arguments = "-b dolphin.blend -o //render_# -F PNG -x 1 -f $(Process)"
ResidentSetSize_RAW = 0
LastSuspensionTime = 0
JobStatus = 4
NumShadowStarts = 1
OrigMaxHosts = 1
MachineAttrSlotWeight0 = 1
ImageSize_RAW = 14260
ShouldTransferFiles = "YES"
QDate = 1364321914
TerminationPending = true

On Tue, Mar 26, 2013 at 2:19 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote:
Hi Jordan,

What do the ClassAds you are submitting look like?

Iwd should refer to a directory on the submit machine (or the spool directory, if you are using spooling).  By default, Iwd is set to the $PWD of the submitting process.
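
As a rough illustration of that default, a script building the job ad could fill in Iwd from its own working directory when none is given. This is a sketch in plain Python only: make_job_ad is a hypothetical helper, the attribute values are copied from the ads in this thread, and handing the resulting ad to the Schedd object is left out on purpose.

```python
import os


def make_job_ad(cmd, arguments, iwd=None):
    """Hypothetical helper: build a dict of job attributes.

    If no Iwd is given, fall back to the current working directory of
    the submitting process, i.e. a directory on the submit machine.
    """
    if iwd is None:
        iwd = os.getcwd()
    return {
        "Cmd": cmd,
        "Arguments": arguments,
        "Iwd": os.path.abspath(iwd),  # always store an absolute path
        "JobUniverse": 5,
        "ShouldTransferFiles": "YES",
        "WhenToTransferOutput": "ON_EXIT",
    }


if __name__ == "__main__":
    ad = make_job_ad(
        "/usr/bin/blender",
        "-b dolphin.blend -o //render_# -F PNG -x 1 -f $(Process)",
    )
    # Print the ad in the same "Name = value" layout used in this thread.
    for name, value in sorted(ad.items()):
        shown = '"%s"' % value if isinstance(value, str) else value
        print("%s = %s" % (name, shown))
```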

Brian

On Mar 26, 2013, at 1:09 PM, Jordan Williamson <jordan.williamson@xxxxxxxxxxx> wrote:

I'm trying to run some jobs using the python bindings for Condor 7.9.4. They keep being held because the "Iwd" classad seems to be required, but I can't find a general "default" value for it that would work on any execute machine (that is, if I set it to some hard-coded directory, it would error out on a machine that didn't have that exact directory structure).

Is it possible to leave this classad out and let the execute nodes take care of it? (If so, I can't seem to find any classads that would enable this, and just leaving it out altogether produces errors) Is there a default value for Iwd that would enable this action? I've tried "/", "." and the directory it's being submitted from on the submit machine, but none of those worked.
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/

