
Re: [Condor-users] Jobs will not run in another pool

I've tried the same sort of thing before with no success, and I know there was no communications issue. It seems that flocking negotiations do not work in this way. If you specify requirements that can be met only by machines in the other pool, Condor never seems to match them.

- dave

Kewley, J (John) wrote:
Does either pool (or do both) have a firewall?
Does the submitting node, or any potential execute node in the flocked pool, have a host-level firewall?
Is either pool on a private network?
All are worth checking before expecting flocking to work. Remember that ALL submit nodes must have routes through any intervening firewalls to all potential execute nodes, even in another pool. Such routes include TCP and UDP access across a variety of ports. For more details on this, check out
If you are happy that there are no firewalls in the way, then you need to post the output of
condor_q -anal
or the appropriate log files
Cheers JK
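
As a rough first pass at the firewall questions above, a submit node can at least test whether a TCP connection to a given host and port in the other pool succeeds. The sketch below is only an illustration: the function name `tcp_reachable` is made up here, it checks TCP only (not UDP), and since Condor daemons may also use dynamically assigned ports, a single successful check does not prove the whole path is open.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the name and attempts a full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `tcp_reachable("cm.other-pool.example", 9618)` would probe a hypothetical central manager on 9618, the usual default collector port; substitute the real host names and whatever ports your pools are configured to use.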

    -----Original Message-----
    *From:* condor-users-bounces@xxxxxxxxxxx
    [mailto:condor-users-bounces@xxxxxxxxxxx]*On Behalf Of *Junaid N.
    *Sent:* Wednesday, April 05, 2006 7:08 AM
    *To:* Condor-Users Mail List
    *Subject:* [Condor-users] Jobs will not run in another pool


    I tried to flock a job to another pool using the following cmd file:

    executable      = HADAMeanFilter
    universe        = vanilla

    Requirements   =  HOSTNAME != caudate-nh.nsw.cmis.csiro.au
    should_transfer_files = YES
    when_to_transfer_output = ON_EXIT

    transfer_input_files =

    notification    = COMPLETE
    notify_user     = Oscar.AcostaTamayo@xxxxxxxx
    output          = output.HADAMeanFilterI.vanilla.dynamic
    error           = error.HADAMeanFilterI.vanilla.dynamic
    log             = log.HADAMeanFilterI.vanilla.dynamic
    arguments       = -n ResampledProstate7.out.nostripes.raw -x  256 -y 349 -z 348  -r 0 -s 8 -L 64 -q 0 -t 255 -K 2 -u 3 -v 3 -w 3 -o 49x348_CAutHomgnty -m 2 -X 20 -c 5 -A 1 -T 0.125 -E 0.0001 -I 0 -H 1 -Z 0

    My job is stuck in the queue and is not executing, although there
    are plenty of resources available.

    You can check that here: http://condorview.csiro.au/

    The reason I have placed the requirement is that I wanted to
    check flocking.

    So I have told it to run on any machine except caudate-nh, which is
    the only machine in my pool right now. I have turned off the rest of

    any problems?

    *Junaid N. Sahibzada*
    *Cell # (+61) 404 998 494 *
    *International Student MSc Internetworking, UTS, Australia*
    *Bachelor of Information Technology, NUST, Pakistan*


