
[Condor-users] Web Services JDL Parsing



Hi,  

  I appreciate that my last email was somewhat lengthy, and I have made 
some progress since then.  I now have a very specific question about 
how to stage back output in a grid environment.

  Again, I am working on Web Services code using the birdbath and 
condor Java packages.  I can submit a job (see the attached JDL) 
through my Web Services interface from my account, and see it appear in 
the Condor queue of the grid metascheduler.  The input files are 
transferred correctly from my client machine to the metascheduler (they 
go to the folder 
/opt/condor/local.babargt4/spool/cluster1234.proc0.subproc0 or 
similar), but the folder and its contents belong to root (the user who 
is running Condor) rather than to me (the user who submitted the job).  
Unless I change the owner of the files to myself by hand, the job goes 
on hold with HoldReason = "Failed to get expiration time of Proxy", 
because the job and the proxy certificate must be owned by the same user.
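
  To make the ownership problem easy to reproduce, here is a minimal 
sketch in plain Java that lists which entries in a spool folder are not 
owned by the expected user.  The class name SpoolOwnerCheck and the 
method ownedBy are hypothetical names chosen for illustration; they are 
not part of birdbath or Condor, and the sketch uses only the standard 
java.nio.file API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of birdbath or Condor.
class SpoolOwnerCheck {

    // Returns true if the owner name of 'file' equals 'expectedUser'.
    static boolean ownedBy(Path file, String expectedUser) throws IOException {
        return Files.getOwner(file).getName().equals(expectedUser);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java SpoolOwnerCheck <spool-dir> <user>");
            return;
        }
        Path dir = Paths.get(args[0]);
        String user = args[1];

        // Check the spool folder itself, then each entry directly inside it.
        if (!ownedBy(dir, user)) {
            System.out.println("wrong owner: " + dir);
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (!ownedBy(entry, user)) {
                    System.out.println("wrong owner: " + entry);
                }
            }
        }
    }
}
```

  Run it with the spool folder and the submitting user's name as the two 
arguments.  It only reports mismatches and does not change ownership.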

  Once we change the owner of the spool/cluster folder and its 
contents to myself, the job can create a gridftp wrapper and start 
running.  We can see it on the head node of one of our clusters, and 
see it create a scratch folder (in /hepuser/gcprod01/.globus/scratch on 
our NFS) and store the output and error there.  But the output is not 
staged back from the head node to the metascheduler and on to the 
client, and the job hangs in state C (Completed).  We have tried several 
variant JDL files without success.

  In other words, we have two problems:

(i) How can we run the jobs as the user who submits them, not the user 
who runs Condor?

(ii) How can we get output to stage back from the cluster to the 
metascheduler and the client machine?

  Can anyone advise how to solve either of these problems?

Thanks,

Sean Manning

Attachment: testjdl-gt4
Description: Binary data