
Re: [Condor-users] Condor-C to PBS



Hisham Ihshaish <hisham@xxxxxxxxxxxxx> wrote:
>
>Hi all,
>
>Thanks very much, Francesco and Mark, for your information and interest!
>Let me explain in more detail what is going on, as you both asked:
>
>1) About the remote_grid_resource setup, and whether to use 5 or grid: I
>actually tried it with grid, and even so the job keeps executing in the
>Condor pool.
>
>2) Nothing appears in the Gridmanager log. I watch it from the moment I
>submit the job until I receive the results. The only time I see lines in
>this log is when I invoke the condor_gridmanager command myself: it writes
>that it has started and is working, but the last line reports an error,
>"can't determine schedd address". So I think no attempt is made to use
>BLAH_JOB_SUBMIT; in other words, BLAH plays no part at all.
>
>I need to see a working example, or at least how the batch_gahp.config
>file is configured, if one is available. I would be grateful to receive it!
>
>Thanks a lot for your notes,
>and thanks in advance for any help or ideas!
>
>Regards,
>Hisham
>
>Francesco Prelz <Francesco.Prelz@xxxxxxxxxx> wrote:
>
>>On Thu, 17 May 2007, Hisham Ihshaish wrote:
>>
>>> It also executes in the Condor pool, and doesn't go to PBS!!
>>
>>Do you know if it is trying and failing to go there ? Do you see a
>>gridmanager starting and any attempts at BLAH_JOB_SUBMIT logged in the
>>Gridmanager log ?
>>
>>> However, I am really confused about how to configure this environment,
>>> and how the remote machine knows that it should execute the job in a
>>> certain PBS pool, given that, as I mentioned before, there are 4 PBS
>>> pools managed by machine (B)....
>>
>>Could you give an example of a valid 'qsub' command for the various pools?
>>I am adding the colleagues in charge of the development of BLAH in Cc:,
>>so they can help in checking this out. BLAH contains script callouts to
>>adapt the PBS submit file based on arbitrary constraints passed in
>>the 'CERequirements' attribute of the job ad. This looks like one possible
>>mechanism that could be used to select pools.
>>
>>> I would be grateful for your help, and for any examples of
>>> configuration files (the batch_gahp.config file), submit files, or any
>>> other available configuration examples! I look forward to hearing from
>>> you!!
>>
>>'blah' has seen recent bugfixing and development in the EU EGEE context,
>>which probably hasn't made it yet into the version distributed with 
>>Condor. If necessary, we could try putting together a test with a
>>more recent version of BLAH.
>>
>>Francesco Prelz
>>INFN Milano
>>
>>
>>
>
>Original message:
>Hi all, 
>Dear Mark, Francesco 
>
>I have been trying for the past two weeks to send Condor-C jobs to a PBS
>pool, but it is not working for me. I have configured batch_gahp.config to
>point to the PBS installation. Here is a description of my environment:
>
>I send a job from my local machine (A), which runs the Condor daemons
>Schedd, Master, and Collector.
>
>The remote machine (B), which receives the job, is the central manager of a
>Condor pool and runs the master, collector, schedd, ..... daemons. It is
>also the head node of four (4) PBS pools.
>
>When I send this job using the grid universe, specifying that the remote
>grid resource is condor, it works fine and executes in the remote Condor
>pool managed by (B). Here is the submit file:
>
>        Universe   = grid
>        Executable = hello.sh
>        Output     = hello.out
>        Error      = hello.err
>        Log        = hello.log
>
>        Should_transfer_files = YES
>        When_to_transfer_output = ON_EXIT
>
>        grid_resource = condor hisham@xxxxxxxxxxxxxxx aocegrid.uab.es
>        +remote_jobuniverse = 5
>        +remote_requirements = True
>        +remote_ShouldTransferFiles = "YES"
>        +remote_WhenToTransferOutput = "ON_EXIT"
>        +remote_grid_resource = condor
>        queue
>
>
>
>Now, when I send it with this submit file:
>
>
>        Universe   = grid
>        Executable = hello.sh
>        Output     = hello.out
>        Error      = hello.err
>        Log        = hello.log
>
>        Should_transfer_files = YES
>        When_to_transfer_output = ON_EXIT
>
>        grid_resource = condor hisham@xxxxxxxxxxxxxxx aocegrid.uab.es
>        +remote_jobuniverse = 5
>        +remote_requirements = True
>        +remote_ShouldTransferFiles = "YES"
>        +remote_WhenToTransferOutput = "ON_EXIT"
>        +remote_grid_resource = pbs
>        queue
>
>It also executes in the Condor pool and doesn't go to PBS! However, I am
>really confused about how to configure this environment, and how the
>remote machine knows that it should execute the job in a certain PBS pool,
>given that, as I mentioned before, there are 4 PBS pools managed by
>machine (B)....
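
One setup that may be worth trying, sketched here under assumptions rather
than as a confirmed fix: universe 5 is vanilla, so a job carrying
+remote_jobuniverse = 5 would be expected to run in the remote Condor pool
whatever the remote grid resource says. The sketch below changes only the
two remote lines of the second submit file, using universe 9 (grid) and a
quoted string value. The attribute name remote_GridResource is an assumption
based on the job ad attribute GridResource, and should be checked against
the manual for the installed Condor version. It does not address how to
choose among the four PBS pools.

```
Universe   = grid
Executable = hello.sh
Output     = hello.out
Error      = hello.err
Log        = hello.log

Should_transfer_files = YES
When_to_transfer_output = ON_EXIT

grid_resource = condor hisham@xxxxxxxxxxxxxxx aocegrid.uab.es
# Assumption: 9 (grid) instead of 5 (vanilla), so that the remote schedd
# hands the job to a gridmanager rather than matching it in its own pool.
+remote_jobuniverse = 9
+remote_requirements = True
+remote_ShouldTransferFiles = "YES"
+remote_WhenToTransferOutput = "ON_EXIT"
# Assumption: string values in "+" attributes need double quotes; an
# unquoted pbs would be read as an attribute reference, not a string.
+remote_GridResource = "pbs"
queue
```

If this takes effect, the Gridmanager log on machine (B) would be the place
to look for BLAH_JOB_SUBMIT attempts.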
>
>I would be grateful for your help, and for any examples of configuration
>files (the batch_gahp.config file), submit files, or any other available
>configuration examples! I look forward to hearing from you!!
>
>Regards,
>Hisham
>
>
>
>
>  
>-- 
>/*****************************************
>* Hisham W. Ihshaish
>*
>* DACSO-CAOS
>* ETSE-UAB-08193 Bellaterra (Barcelona)
>* Office: QC 3088
>* Phone: +34 93 581 28 88
>* e-mail: hisham@xxxxxxxxxxxxx
>*****************************************/