[Condor-users] issues in heterogeneous pool

Hi,
I have a heterogeneous pool that includes IA64 and X86_64 machines. The X86_64 server is the submit node, and the others are worker nodes. I compiled my source file on the worker nodes and submitted it from the submit node. After submitting my job, I ran condor_q to query it, and the result is as follows.
 
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE  CMD   
7046.0   zhxue           9/14 17:41   0+00:27:19   R    0   0.0   ia64              
7047.0   zhxue           9/14 17:41   0+00:00:01   H    0   0.0   data 
 
Why was job 7047.0 generated?
 
Furthermore, I ran the "condor_q -analyze" command, and it reports the following:
 
7046.000:  Request is being serviced
---
7047.000:  Request is held.
Hold reason: Error from starter on slot2@**.**.**: Failed to execute '/home/zhxue/.globus/.gass_cache/local/md5/58/1da5713002eb7a2d6fe3f76e3f673a/md5/b5/f7f0ea2e16e03c4fdb16fcbbb5abd9/data': Exec format error
 
It seems 7047.0 is the execution process, but it cannot be scheduled to the IA64 servers (slot2@**.**.** is a core with the x86_64 architecture).
 
I specified "requirements" in the submit script, but it does not seem to work. The script is as follows:
 
universe=grid
grid_resource = gt2   ***.***.***:/jobmanager-condor
requirements = Arch == "IA64" && OpSys == "Linux"
output = ......
error=....
log = ......
queue 
 
Could you help me? Any suggestions are appreciated. Thanks.