Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab

Re: [Condor-users] HoldReason = "Streaming not supported"

Date: Tue, 8 Jul 2008 16:40:12 -0500
From: Jaime Frey <jfrey@xxxxxxxxxxx>
Subject: Re: [Condor-users] HoldReason = "Streaming not supported"

On Jun 27, 2008, at 12:36 PM, Sean Manning wrote:

Jaime Frey wrote:

On Jun 23, 2008, at 6:50 PM, Sean Manning wrote:

I am working on a Web Services interface to submit jobs to our Globus
grid.  It uses the condor and birdbath Java packages.  We can
successfully submit the attached JDL on the command line of a condor
head node (the metascheduler of our grid) and see it complete, but
when we submit it with the Java program from an external Condor client
machine the job stays Idle then Halts with an error.  Running the
condor daemons as root got rid of one error, but now we get another
one: HoldReason = "Streaming not supported".  I can't find any
information about this error in the usergroup archives.  Does anyone
here have an idea what could be causing this?

For GT4 GRAM jobs, if StreamOut and StreamErr aren't explicitly set to
False in the job ad, then Condor assumes you want stdout and stderr to
be streamed, which isn't supported by Condor for GT4 GRAM jobs. This
appears to be a bug, as the default behavior for other job types is no
streaming.

If you add the following two attributes to your job ads, it should
eliminate the problem:
StreamOut = False
StreamErr = False

Thanks and regards,
Jaime Frey
UW-Madison Condor Team


Dear Jaime,

 Thanks for the reply.

 I made that change, but jobs are still hanging with HoldReason =
"Streaming not supported."  I can submit the new file with
condor_submit from the grid metascheduler and see it appear on the head
node of a worker cluster, when condor_config has SOAP enabled.  The
output and error come back to the machine I submitted the job from just
like they are supposed to.  But when I submit the same JDL to the grid
metascheduler using our Web Services code, the job always holds after a
delay.

 Right now, the Condor daemons are running as root.  The web services
code is running on my personal account (seangwm) on my workstation.
The spool directory on the metascheduler
($CONDOR_LOCATION/local.babargt4/spool) belongs to condor:root.  We
have been changing the owner of the job folder on the spool
($CONDOR_LOCATION/local.babargt4/spool/cluster5252.proc0.subproc0) by
hand from root:root to my personal account and group, because jobs stay
idle until I do so.  I think that this has to do with the fact that the
proxy file must have very specific permissions so the grid will trust
it.  If I change the owner of the spool folder to root:root I get a
HoldReason = "Failed to get expiration time of proxy" instead.

 In principle, if we can submit a job to the grid using condor_submit,
then the web services submission should work as well.  I would be very
grateful if you have any further advice about what I am missing.

 I have attached our main Java class for job submission and the JDL
which I have been trying with the Web Services code.  In the attached
files, babargt4 is the grid metascheduler and ugdev07 is the head of
one of the clusters of worker nodes.

Can you look at the values of StreamOut and StreamErr in the classad
of the held job in the schedd? I'm guessing they're either missing or
set to the string "False". They need to be set to False (no quotes).
I'll bet your JobHelper class isn't handling these attributes correctly.


Thanks and regards,
Jaime Frey
UW-Madison Condor Team
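
One way to check for the condition described in the reply above is to look at how StreamOut and StreamErr appear in the text of the held job's ad (for example, as printed by condor_q -long). The following is only a minimal sketch under stated assumptions: StreamAttrCheck, Kind and classify are hypothetical names that belong to neither Condor nor birdbath, the ad text is assumed to hold one "Name = value" pair per line, and the check only distinguishes a quoted string from an unquoted boolean literal.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of Condor or birdbath. */
final class StreamAttrCheck {

    enum Kind { MISSING, BOOLEAN_LITERAL, STRING_LITERAL, OTHER }

    /**
     * Classifies the value of attribute `name` in ClassAd text that has
     * one "Name = value" pair per line (assumed input format).
     */
    static Kind classify(String adText, String name) {
        for (String line : adText.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // not an attribute line
            }
            String attr = line.substring(0, eq).trim();
            // ClassAd attribute names are case-insensitive.
            if (!attr.equalsIgnoreCase(name)) {
                continue;
            }
            String value = line.substring(eq + 1).trim();
            if (value.startsWith("\"")) {
                return Kind.STRING_LITERAL; // e.g. StreamOut = "False"
            }
            String lower = value.toLowerCase(Locale.ROOT);
            if (lower.equals("true") || lower.equals("false")) {
                return Kind.BOOLEAN_LITERAL; // e.g. StreamOut = False
            }
            return Kind.OTHER;
        }
        return Kind.MISSING;
    }

    public static void main(String[] args) {
        // Sample ad text made up for this sketch.
        String ad = "Cmd = \"/bin/hostname\"\n"
                  + "StreamOut = \"False\"\n"
                  + "StreamErr = FALSE\n";
        for (String name : new String[] {"StreamOut", "StreamErr"}) {
            System.out.println(name + ": " + classify(ad, name));
        }
    }
}
```

A result of MISSING or STRING_LITERAL for either attribute would match the two cases guessed at in the reply; BOOLEAN_LITERAL is the form the reply asks for.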
