
Re: [Condor-users] birdbath dag submission



All right, I'm still stuck on this job submission. I have no idea why my submission is not working. At first I thought it was because I was submitting the job from Java, but I have the same problem when submitting it with condor_submit!

So my job is basically a script with several steps. It works perfectly when launched outside Condor, but I get the following log when running it under Condor:
000 (508.000.000) 10/02 23:09:00 Job submitted from host: <127.0.0.1:8181>
...
001 (508.000.000) 10/02 23:09:04 Job executing on host: <127.0.0.1:51445>
...
006 (508.000.000) 10/02 23:09:12 Image size of job updated: 11968
...
010 (508.000.000) 10/02 23:12:06 Job was suspended.
        Number of processes actually suspended: 5
...
006 (508.000.000) 10/02 23:12:13 Image size of job updated: 70820
...
011 (508.000.000) 10/02 23:22:11 Job was unsuspended.
...
004 (508.000.000) 10/02 23:22:12 Job was evicted.
        (0) Job was not checkpointed.
                Usr 0 00:00:13, Sys 0 00:00:11  -  Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
...
001 (508.000.000) 10/02 23:29:08 Job executing on host: <127.0.0.1:51445>
...
005 (508.000.000) 10/02 23:29:09 Job terminated.
        (1) Normal termination (return value 1)
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
        0  -  Total Bytes Sent By Job
        0  -  Total Bytes Received By Job
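
For reading a log like the one above, here is a minimal Java sketch that pulls out the three-digit event codes and checks whether the job started executing again after an eviction. The meanings of the codes (001 executing, 004 evicted, 005 terminated) are read off this log only, and the class and method names are made up for illustration; they are not part of Condor or birdbath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CondorLogEvents {
    // Event header line, e.g. "004 (508.000.000) 10/02 23:22:12 Job was evicted."
    // Indented detail lines and "..." separators do not match and are skipped.
    private static final Pattern HEADER = Pattern.compile(
        "^(\\d{3}) \\((\\d+)\\.(\\d+)\\.(\\d+)\\) (\\d{2}/\\d{2} \\d{2}:\\d{2}:\\d{2}) (.*)$");

    /** Returns the three-digit event codes in the order they appear in the log. */
    public static List<String> eventCodes(List<String> lines) {
        List<String> codes = new ArrayList<>();
        for (String line : lines) {
            Matcher m = HEADER.matcher(line);
            if (m.matches()) {
                codes.add(m.group(1));
            }
        }
        return codes;
    }

    /** True if an eviction (004) is later followed by another execute event (001). */
    public static boolean restartedAfterEviction(List<String> lines) {
        boolean evicted = false;
        for (String code : eventCodes(lines)) {
            if (code.equals("004")) {
                evicted = true;
            } else if (code.equals("001") && evicted) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
            "001 (508.000.000) 10/02 23:09:04 Job executing on host: <127.0.0.1:51445>",
            "...",
            "004 (508.000.000) 10/02 23:22:12 Job was evicted.",
            "        (0) Job was not checkpointed.",
            "...",
            "001 (508.000.000) 10/02 23:29:08 Job executing on host: <127.0.0.1:51445>");
        System.out.println(eventCodes(log));
        System.out.println(restartedAfterEviction(log));
    }
}
```

Run against the log above, the second check would come out true: the job was evicted without a checkpoint and then executed again.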


What does "evicted" mean?
It looks like my job is put back in the queue and then re-executed from the beginning, so it crashes because some files already exist!

What is going wrong?
Thanks

Jerome
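
One way to survive a restart from the beginning is to make each step safe to run twice. The job in this thread is a script, so the following is only a Java illustration of the idea, with a hypothetical class name and file layout: create directories and write outputs with calls that tolerate leftovers from an earlier, interrupted run.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class RestartSafeStep {
    /** Writes one step's output so that a second run over the same directory does not fail. */
    public static Path writeStepOutput(Path workDir, String name, String content)
            throws IOException {
        // createDirectories does nothing if the directory already exists,
        // unlike createDirectory, which throws FileAlreadyExistsException.
        Files.createDirectories(workDir);
        Path out = workDir.resolve(name);
        // With no options given, Files.write creates the file or truncates an
        // existing one, so output left by an interrupted run is overwritten.
        Files.write(out, content.getBytes(StandardCharsets.UTF_8));
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("restart-demo").resolve("step1");
        writeStepOutput(dir, "result.txt", "first run");
        // Simulates the job being started again from the beginning.
        Path out = writeStepOutput(dir, "result.txt", "second run");
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```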







-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx on behalf of Matthew Farrellee
Sent: Mon 10/1/2007 6:39 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] birdbath dag submission
 


Mariette, Jerome wrote:
> 
> For the full path ... what should I use? Where the job is running

You shouldn't need a full path at all unless you want Condor to write 
the out/err to a special place on the execute machine. Just give the 
file name. You do not need to (and should not) assume the layout of the 
machine your job might be run on.
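
As a small illustration of "just give the file name", here is a Java sketch that strips the directory part from a path before it is used as the Out or Err value. The class and method names are hypothetical, and it assumes the path has a file name component (the root path "/" does not).

```java
import java.nio.file.Paths;

public class OutputNames {
    /** Drops any directory part, e.g. "/home/jerome/temp.out" becomes "temp.out". */
    public static String bareName(String path) {
        // getFileName() returns the last element of the path; it is null only
        // for a path with no elements, such as the root "/".
        return Paths.get(path).getFileName().toString();
    }

    public static void main(String[] args) {
        System.out.println(bareName("/home/jerome/temp.out"));
        System.out.println(bareName("temp.err"));
    }
}
```

The result could then be passed where the full path was used before, for example as the third argument of the ClassAdStructAttr constructor shown later in this thread.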


> that is kind of obscure to me, as I send the job to Condor and then
> don't know where Condor is going to write the file! I'm sure I have

Condor will write the files to a location it manages and then bring them 
back for you (into a spool directory, use ListSpool/GetFile SOAP calls 
to access it).


> the right to write there, and the /home/jerome folder exists as it's my
> home directory! Condor is running the job as user jerome, so it should be
> allowed to write there, right?

Should, if the directory exists on all machines that might run your job. 
Generally you should not make assumptions about the machines that will 
run your job.


> Sorry for my English on the last question ... my job executes a script
> and the point is that this script does not run to the end (I'm
> sure the script works, since if I run it from the command line it
> works fine)! The job is neither running nor idle; there is a
> C for the state of the job but no job is in the queue!

C stands for "Completed" meaning your job has run and successfully 
(return code of 0) finished.


Best,


matt

> What does it mean ??
> 
> 
> thx
> 
> 
> 
> 
> 
> 
> 
> 
> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx on behalf of Matthew Farrellee
> Sent: Fri 9/28/2007 1:08 PM
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] birdbath dag submission
> 
> Kent had an interesting idea for submitting a DAG: why not submit
> condor_submit_dag instead of condor_dagman? It'd work if you can deal
> with parsing the output of condor_submit_dag to figure out the job
> id of your condor_dagman job.
> 
> Mariette, Jerome wrote:
>> Hi, I have kind of given up on DAGs for now, but I keep working on
>> submitting jobs through Java code, so I still need your help :) I
>> can't get Condor to write my out and err files, and I don't
>> know why. This is my code:
>> 
>> ClassAdStructAttr[] extraAttributes = {
>>     new ClassAdStructAttr("Out", ClassAdAttrType.value3, "/home/jerome/temp.out"),
>>     new ClassAdStructAttr("Err", ClassAdAttrType.value3, "/home/jerome/temp.err"),
>> };
> 
> Is there a particular reason you are specifying a full path to these?
> Do you know if the full path exists on the execute machine or if
> Condor has permissions to write to it?
> 
> 
>> xact.submit(cluster, job, "jerome", UniverseType.VANILLA,
>>     "/home/jerome/aved/scripts/runAVEDworkflow-condor",
>>     "-i " + video + " -d " + AvedService.SCRATCH_DIR,
>>     null, extraAttributes, null);
>> xact.commit();
>> 
>> 
>> Moreover, the script I launch through that (because the job runs
>> perfectly) does not execute to the end! Is that normal? When
>> I launch my script by hand it executes perfectly. I was
>> wondering whether, when opening a connection through the Condor web
>> service, there is some time limit, as in this line:
> 
> I'm not sure I understand the question. Executing until the end of
> what?
> 
> 
>> xact.begin(30);
>> 
>> I'm not sure what it means.
> 
> Submission happens on an all or nothing basis. xact.begin(30) means
> you are starting a transaction that will stay open as long as you
> don't wait more than 30 seconds between operations within the
> transaction.
> 
> Best,
> 
> 
> 
> matt
> 
> 
>> thx for your help, Jerome
>> 
>> 
>> 
>> -----Original Message-----
>> From: condor-users-bounces@xxxxxxxxxxx on behalf of R. Kent Wenger
>> Sent: Thu 9/27/2007 9:11 AM
>> To: Condor-Users Mail List
>> Subject: Re: [Condor-users] birdbath dag submission
>> 
>> On Mon, 24 Sep 2007, Mariette, Jerome wrote:
>> 
>>> I'm pretty new to the Condor world and have some trouble submitting
>>> a DAG. Here is my problem: I'm using the birdbath wrapper and
>>> I'm submitting the DAG file like this:
>>> 
>>> Schedd schedd = new Schedd(new URL("http://localhost:8181"));
>>> Transaction xact = schedd.createTransaction();
>>> xact.begin(30);
>>> int cluster = xact.createCluster();
>>> int job = xact.createJob(cluster);
>>> xact.submit(cluster, job, "jerome", UniverseType.SCHEDULER,
>>>     "/opt/condor-6.8.5/bin/condor_dagman", /* Path to the dagman binary */
>>>     "-f -l . -Debug 3 " + "-Lockfile myLockFile -Dag myDag -Rescue myRescuDag -Condorlog myLog",
>>>     null, null, null);
>>> xact.commit();
>>> 
>>> What am I doing wrong? (The DAG file is OK: I tried it from the
>>> command line and it works.) thx
>> At least one more thing you need from the DAGMan end -- 
>> _CONDOR_DAGMAN_LOG must be set in DAGMan's environment.  (This
>> needs to point to a file DAGMan can log to.)
>> 
>> It sounds like you've been able to run condor_submit_dag on the 
>> command line, so take a look at the .condor.sub file it produces to
>>  see how _CONDOR_DAGMAN_LOG is set there.
>> 
>> Kent Wenger Condor Team 
>> _______________________________________________ Condor-users
>> mailing list To unsubscribe, send a message to 
>> condor-users-request@xxxxxxxxxxx with a subject: Unsubscribe You
>> can also unsubscribe by visiting 
>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>> 
>> The archives can be found at: 
>> https://lists.cs.wisc.edu/archive/condor-users/
>> 
>> 
>> 
