
Re: [Condor-users] shadow exception error?



On Wed, Jul 19, 2006 at 11:35:26PM -0600, Jun Wang wrote:
> 
> When I typed condor_q and condor_status on the master node (central manager) and the slave nodes (compute nodes), I got the normal screen output, which told me how many jobs are running, etc., and which machines are in my pool. Then I tried to run the test example "sh_loop" under condor-6.6.11/examples as user condor, with condor_submit sh_loop.cmd, on my master node. The job terminated normally. However, when I tried to submit sh_loop.cmd on my slave node, I got a shadow exception error message in the file sh_loop.log, as below:
<snip>
> 
> Does anybody know the possible reason? 

I'm new to Condor myself, and this is a lesson I learned today: shadow
exceptions can happen for lots of reasons, so the message in the job log
alone usually isn't enough. To get more information, look at the StarterLog
on the node where the job was actually running.  Reference:

http://docs.optena.com/display/CONDOR/Shadow+Exception

That site, based at:

http://docs.optena.com/display/CONDOR/Troubleshooting

...has lots of good info for newbies who don't know what's going wrong
with their Condor cluster.
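
In case it helps, here is a rough sketch of how one might dig through the
StarterLog on the execute node. It assumes the StarterLog lives in the
directory that "condor_config_val LOG" reports, which may not match your
setup. The function names (find_log_dir, scan_starter_logs) and the grep
pattern are my own inventions for illustration, not anything Condor ships,
and the pattern is only a guess at what an interesting line looks like.

```shell
#!/bin/sh
# Sketch only: show error-looking lines from the StarterLog file(s).
# Run this on the node where the job ran, not on the submit node.

# find_log_dir [DIR]: print DIR if one is given; otherwise ask Condor
# for the value of its LOG configuration macro.
find_log_dir() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    else
        condor_config_val LOG
    fi
}

# scan_starter_logs DIR: for every StarterLog* file in DIR, print the
# last 20 lines that mention an error, exception, or failure.
# The glob is an assumption meant to also catch suffixed log files.
scan_starter_logs() {
    dir=$1
    for f in "$dir"/StarterLog*; do
        [ -f "$f" ] || continue
        echo "== $f"
        grep -i -E 'error|exception|fail' "$f" | tail -n 20
    done
}

# Usage, on the execute node:
#   scan_starter_logs "$(find_log_dir)"
```

If nothing turns up, reading the tail of the whole StarterLog around the
time the job started is probably the next thing to try.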

HTH,
--Michael