
[Condor-users] Re: parallel job submission



Thank you very much!

- Yang Guang
 
 
------------------ Original Message ------------------
Sent: Sunday, August 28, 2011, 12:36 AM
To: "Condor-Users Mail List"<condor-users@xxxxxxxxxxx>;
Subject: Re: [Condor-users] parallel job submission
 
I don't have any experience with the parallel universe yet, but you should check out our Condor project wiki for the troubleshooting page dedicated to this problem: <http://servo.cs.wlu.edu/dokuwiki/doku.php/condor/submit/troubleshoot>.  Basically, it's possible that the job is being submitted but then exits immediately with an error.  If this is the case, figure out which slot it's being run on and look at the corresponding StarterLog.SlotX log file to see what error your program is encountering.
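A minimal sketch of that log check, assuming a POSIX shell. The helper name starter_errors and the error|fail search pattern are illustrative only and are not part of Condor. The log directory is passed in as an argument (condor_config_val LOG should print it on a working install), and the slot name is whichever slot the job ran on, for example slot1 as listed by condor_status.

```shell
# Hypothetical helper (not a Condor tool): print the most recent
# error-like lines from one slot's starter log.
# Usage: starter_errors LOG_DIR SLOT_NAME
starter_errors() {
    log_file="$1/StarterLog.$2"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    # Case-insensitive match; the pattern is a guess at what a failure
    # looks like, so widen or narrow it as needed.
    grep -i -E 'error|fail' "$log_file" | tail -n 20
}

# Example call on a machine with Condor installed (slot1 is an assumption):
#   starter_errors "$(condor_config_val LOG)" slot1
```

If the helper prints nothing, the job may never have reached a starter at all, in which case the problem lies earlier, in matchmaking.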
Hope this helps!

Best Regards,
 ~ Garrett K.

On Aug 27, 2011, at 5:34 AM, 关中大侠 wrote:

I am new to Condor, and I installed it using the --make-personal-condor option.
It can run vanilla universe jobs.
I created a new job using the parallel universe and wrote the submit description file following the example in the manual:
--------------------------------------------------
universe                =       parallel
executable              =       /bin/sleep
arguments               =       30
machine_count           =       1
log                     =       log
queue
--------------------------------------------------

Note that I set machine_count to 1.
When I run condor_status, the output is:
--------------------------------------------------
Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@xxxxxxxx     LINUX      X86_64 Unclaimed Idle     0.000  1428  0+01:02:09
slot2@xxxxxxxx     LINUX      X86_64 Unclaimed Idle     0.000  1428  0+01:02:10
slot3@xxxxxxxx     LINUX      X86_64 Unclaimed Idle     1.000  1428  0+01:40:07
slot4@xxxxxxxx     LINUX      X86_64 Unclaimed Idle     0.040  1428  0+01:40:08
                     Total Owner Claimed Unclaimed Matched Preempting Backfill

        X86_64/LINUX     4     0       0         4       0          0        0

               Total     4     0       0         4       0          0        0
------------------------------------------------------------
Then I submit the job:
condor_submit <desc>
But it does not run. I used condor_q -analyze to see what went wrong; the output is:
------------------------------------------------------------
-- Submitter: tree.org : <125.216.243.80:43021> : tree.org
---
016.000:  Request has not yet been considered by the matchmake
---------------------------------------------------------------------------
Could anyone help me with this problem? I wonder whether I can run a parallel universe job on my computer, which is configured as a personal Condor.

Thank you!
