
Re: [Condor-users] basic submission?



All right, I think I found my problem!
When running my jobs from my Java code, no environment variables are set at all. I figured that out by submitting a /bin/env job. Once I set my environment variable in my Java code, everything works!

I guess the condor_submit command picks up the environment variables it knows about, right?
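
A minimal sketch of the fix described above, assuming the environment is handed over as a space-separated list of NAME=value pairs, the same shape as the single AVED_BIN entry in the Java code quoted further down. The class name CondorEnv and the method build are hypothetical helpers, not part of the Condor or Birdbath API. Entries containing whitespace or quote characters are skipped, because their escaping rules are not covered in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: turns a map of variables into one environment string.
final class CondorEnv {
    /**
     * Joins NAME=value pairs with single spaces, sorted by name so the
     * result is deterministic. Pairs containing whitespace or quote
     * characters are left out rather than escaped (assumption: escaping
     * is out of scope for this sketch).
     */
    static String build(Map<String, String> vars) {
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, String> e : new TreeMap<>(vars).entrySet()) {
            String pair = e.getKey() + "=" + e.getValue();
            if (pair.matches("[^\\s'\"]+")) {
                parts.add(pair);
            }
        }
        return String.join(" ", parts);
    }
}
```

The returned string could then be used as the value of the "environment" attribute, for example CondorEnv.build(System.getenv()) to mirror the submitting process, or a small map holding only AVED_BIN if that is all the script needs.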

thx so much for your help,
Jerome









-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx on behalf of Matthew Farrellee
Sent: Sat 10/6/2007 12:48 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] basic submission?
 
When you use condor_submit, some assumptions are made that let Condor
"transfer back" your results. Your results are always transferred
back, but only into the Schedd's spool directory. When you think of
"transferred back" you really mean copied from the Schedd's spool
directory to the directory where you ran condor_submit. Those
assumptions are broken when you use SOAP to submit your job, or if
you pass -s or -r to condor_submit. Basically, the Schedd can't be
sure where to send results except to its spool directory. This means
you need to transfer the results yourself. Either use
condor_transfer_data or the SOAP ListSpool/GetFile commands. Make
sure you don't try to transfer your output back before the job has
completed...
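
As a rough illustration of the first option, here is a sketch that runs condor_transfer_data from Java once the job is known to have completed. It assumes condor_transfer_data is on the PATH and is given the job as cluster.proc (for example 508.0). The class and method names are made up for this example, and checking that the job has actually completed is left to the caller.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper around the condor_transfer_data command-line tool.
final class FetchSpooledOutput {
    /** Builds the command line for one job, e.g. cluster 508, proc 0 gives "508.0". */
    static List<String> transferCommand(int cluster, int proc) {
        return Arrays.asList("condor_transfer_data", cluster + "." + proc);
    }

    /**
     * Runs the command and returns its exit code. Call this only after
     * the job has completed; the output is not in the spool before that.
     */
    static int fetch(int cluster, int proc) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(transferCommand(cluster, proc))
                .inheritIO()   // show the tool's own messages on our console
                .start();
        return p.waitFor();
    }
}
```

With the values returned by createCluster() and createJob() in the submission code, a call would look like FetchSpooledOutput.fetch(cluster, job).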


matt

On Oct 6, 2007, at 12:45 PM, Mariette, Jerome wrote:

>
> All right,
> I took a look at the condor-6.8.5/local.lacalypso/execute/ folder
> while my jobs were executing, and it looks like the processing goes
> well: the files are transferred to the pool, but nothing is
> transferred back, so I don't get my results!
>
> I changed my Java code to add some attributes specifying that I want
> the results to be transferred, without any success.
>
>
>
>     Schedd schedd = new Schedd(new URL("http://localhost:8181"));
>     Transaction xact = schedd.createTransaction();
>     xact.begin(30);
>     int cluster = xact.createCluster();
>     int job = xact.createJob(cluster);
>
>     File[] files = {
>         new File("/home/jerome/aved/scripts/runmbarivision"),
>         new File("/home/jerome/aved/scripts/runAVEDworkflow-condor"),
>         new File(video)
>     };
>
>     ClassAdStructAttr[] extraAttributes = {
>         new ClassAdStructAttr("Out", ClassAdAttrType.value3, "temp.out"),
>         new ClassAdStructAttr("Err", ClassAdAttrType.value3, "temp.err"),
>         new ClassAdStructAttr("should_transfer_files", ClassAdAttrType.value3, "YES"),
>         new ClassAdStructAttr("when_to_transfer_output", ClassAdAttrType.value3, "ON_EXIT"),
>         new ClassAdStructAttr("Log", ClassAdAttrType.value3, "temp.log"),
>         new ClassAdStructAttr("getenv", ClassAdAttrType.value3, "true"),
>         new ClassAdStructAttr("environment", ClassAdAttrType.value3,
>             "AVED_BIN=/home/jerome/aved/mbarivision/bin/mbarivision")
>     };
>
>     xact.submit(cluster, job, "jerome", UniverseType.VANILLA,
>         "runAVEDworkflow-condor", "-i Neptune-2006-03-30_1730.mp2", null,
>         extraAttributes, files);
>     xact.commit();
>
>
>
> What is wrong? The files are transferred back when I use a job file
> and the condor_submit command!
> Thanks
>
>
>
>
>
>
> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx on behalf of Matthew Farrellee
> Sent: Wed 10/3/2007 10:09 AM
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] basic submission?
>
> You'll either have to specify your job cannot be restarted, which
> would be unfortunate, or change your script to allow for restarting.
> It's probably an issue with having a scratch directory outside the
> execute/ directory where Condor runs your script?
>
> Others on the list might have more experience with this than I do.
>
>
> matt
>
> Mariette, Jerome wrote:
>> Sure, I don't mind sharing it, though I'm not sure it will really help.
>> The point is that the job starts executing, then stops and goes back
>> to idle, and is then started again. But it restarts from the beginning:
>> it tries to run the first step again and crashes because the video
>> result is already written (that's what I got from the .err file).
>>
>>
>> #!/bin/bash
>> # Name:runAVEDworkflow-condor
>> #
>> # Usage:  runAVEDworkflow-condor < filename.AVI , filename.MOV > <scratch directory to process in>
>> ###################################################################################
>> # Print usage
>> print_usage()
>> {
>>   echo "  "
>>   echo "  "
>>   echo -e "\033[1m USAGE:  runAVEDworkflow-condor [OPTION] -i [filename.AVI,MOV,MPG] -d path/to/scratch \033[0m"
>>   echo "  "
>>   echo "  "
>>   echo "OPTION"
>>   echo "  "
>>   echo "  "
>>   echo -e "\033[1m -p \033[0m"
>>   echo "  "
>>   echo "      Use parallel AVED code for processing "
>>   echo "  "
>>   echo "      (Example:  runAVEDworkflow-condor -p -d mydagfile.dag -i filename.AVI or filename.MOV)"
>>   echo " "
>>   echo -e ""
>> }
>> ###################################################################################
>> if test $# -lt 1
>>   then print_usage
>>   exit 1
>> fi
>>
>> # Check arguments
>> while getopts d:i:p option
>> do
>>   case $option in
>>    i)  file="$OPTARG";;
>>    d)  scratchDir="$OPTARG";;
>>    p)  useparallel=1;;
>>    *)  echo "Unimplemented option chosen."
>>        echo "  "
>>        print_usage;;
>>   esac
>> done
>>
>> basefile=$(basename $file)
>> filestem=${basefile%.*}
>>
>> if [ -z "$file" ] || [ -z "$scratchDir" ]
>>   then print_usage
>>   exit 1
>> else
>>   mkdir -p $scratchDir/$filestem
>>
>>   ffmpeg -i $file -vcodec copy $scratchDir/$filestem/$filestem".mpg"
>>
>>   echo "transcode -i $scratchDir/$filestem/$filestem".mpg"  -y ppm,null -o $scratchDir/$filestem/ppms/f"
>>   transcode -i $scratchDir/$filestem/$filestem".mpg"  -y ppm,null -o $scratchDir/$filestem/f
>>   rm -f $scratchDir/$filestem/$filestem".mpg"
>>   runmbarivision -i $scratchDir/$filestem
>>   rmall f0*
>>
>> fi
>>
>> exit 0
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: condor-users-bounces@xxxxxxxxxxx on behalf of Matthew Farrellee
>> Sent: Wed 10/3/2007 8:56 AM
>> To: Condor-Users Mail List
>> Subject: Re: [Condor-users] basic submission? (was: birdbath dag submission)
>>
>> It is definitely helpful to know your job will run with simply
>> condor_submit before you start using the SOAP interface to submit
>> your job.
>>
>> Eviction happens when a machine decides to run some other job instead
>> of yours. Maybe it had better priority? Anyway, that shouldn't be an
>> issue.
>>
>> Your job isn't successfully terminating though. Since you said it is a
>> script you should make sure you aren't embedding any paths that might
>> not exist on the execution machine (and in the execute machine's
>> execute directory).
>>
>> Care to share your script?
>>
>>
>>
>> matt
>>
>> Mariette, Jerome wrote:
>>> All right, I'm still stuck with this job submission. I have no idea
>>> why my submission is not working. At first I thought it was because
>>> I was submitting my job from Java, but I have the same problem
>>> submitting this job with condor_submit!
>>>
>>> My job is basically a script with several steps. It works perfectly
>>> when launched outside Condor, but I get the following log when
>>> running it under Condor:
>>> 000 (508.000.000) 10/02 23:09:00 Job submitted from host: <127.0.0.1:8181>
>>> ...
>>> 001 (508.000.000) 10/02 23:09:04 Job executing on host: <127.0.0.1:51445>
>>> ...
>>> 006 (508.000.000) 10/02 23:09:12 Image size of job updated: 11968
>>> ...
>>> 010 (508.000.000) 10/02 23:12:06 Job was suspended.
>>>         Number of processes actually suspended: 5
>>> ...
>>> 006 (508.000.000) 10/02 23:12:13 Image size of job updated: 70820
>>> ...
>>> 011 (508.000.000) 10/02 23:22:11 Job was unsuspended.
>>> ...
>>> 004 (508.000.000) 10/02 23:22:12 Job was evicted.
>>>         (0) Job was not checkpointed.
>>>                 Usr 0 00:00:13, Sys 0 00:00:11  -  Run Remote Usage
>>>                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
>>>         0  -  Run Bytes Sent By Job
>>>         0  -  Run Bytes Received By Job
>>> ...
>>> 001 (508.000.000) 10/02 23:29:08 Job executing on host: <127.0.0.1:51445>
>>> ...
>>> 005 (508.000.000) 10/02 23:29:09 Job terminated.
>>>         (1) Normal termination (return value 1)
>>>                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
>>>                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
>>>                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
>>>                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
>>>         0  -  Run Bytes Sent By Job
>>>         0  -  Run Bytes Received By Job
>>>         0  -  Total Bytes Sent By Job
>>>         0  -  Total Bytes Received By Job
>>>
>>>
>>> What does "evicted" mean?
>>> It sounds like my job is placed back in the queue and then re-executed
>>> from the beginning, so it crashes because some files already exist!
>>>
>>> What is going wrong?
>>> Thanks
>>>
>>> Jerome
>>>
>> _______________________________________________
>> Condor-users mailing list
>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx  
>> with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/condor-users/
