
[Condor-users] FW: Jobs not running ->>No condor_shadow installed that supports vanilla jobs



Hi,

All the Condor daemons are running:

ps -ef | grep condor

condor   19294     1  0 Mar21 ?        00:00:03 /usr/local/condor/sbin/condor_master

condor   19295 19294  0 Mar21 ?        00:00:11 condor_collector -f

condor   19296 19294  0 Mar21 ?        00:00:03 condor_negotiator -f

condor   19297 19294  0 Mar21 ?        00:00:06 condor_startd -f

condor   19298 19294  4 Mar21 ?        00:03:25 condor_schedd -f -p 9600

 

But the pool cannot run any jobs. They all go on hold, and Condor keeps complaining: No condor_shadow installed …

 

Is it possible to fix this without reinstalling Condor?

 


From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Natarajan, Senthil
Sent: Thursday, March 22, 2007 12:38 AM
To: Condor-Users Mail List
Subject: [Condor-users] FW: Jobs not running ->>No condor_shadow installed that supports vanilla jobs
Importance: High

 

Hi,

Looking into this further, condor_schedd exited with status 44, and I am seeing this file in the log directory:

“dprintf_failure.SCHEDD”

 

Here is the file content.

 

3/21 18:55:48 dprintf() had a fatal error in pid 6362

Can't link(/u/condor/log/SchedLog,/u/condor/log/SchedLog.old)

errno: 17 (File exists)

euid: 32768, ruid: 0
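
For reference, errno 17 is EEXIST, which link(2) reports when the destination name already exists. The error above therefore suggests that a SchedLog.old was already present when the log rotation tried to create it. The following is only a minimal Python sketch of that failure mode, not Condor's own code; the helper names try_rotate and demo are hypothetical, and the files live in a temporary directory rather than /u/condor/log.

```python
import errno
import os
import tempfile


def try_rotate(log_path, old_path):
    """Hard-link log_path to old_path; return 0 on success, else the errno."""
    try:
        os.link(log_path, old_path)
        return 0
    except OSError as exc:
        return exc.errno


def demo():
    """Show that link() fails with EEXIST while the .old file is present."""
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "SchedLog")
        old = os.path.join(d, "SchedLog.old")
        for path in (log, old):
            with open(path, "w") as f:
                f.write("placeholder\n")
        first = try_rotate(log, old)   # destination exists, link is refused
        os.remove(old)                 # clear the stale destination
        second = try_rotate(log, old)  # same call now succeeds
        return first, second


if __name__ == "__main__":
    first, second = demo()
    print("with SchedLog.old present:", errno.errorcode.get(first, first))
    print("after removing it:", second)
```

Whether removing or renaming the existing SchedLog.old is the right remedy on a live pool depends on why it was left behind, so treat this as an illustration of the errno only.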

 

 

 


From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Natarajan, Senthil
Sent: Thursday, March 22, 2007 12:09 AM
To: Condor-Users Mail List
Subject: [Condor-users] Jobs not running ->>No condor_shadow installed that supports vanilla jobs
Importance: High

 

Hi,

We are running a Condor 6.8 pool (around 150 nodes), and until now it has run fine. It seems somebody submitted 2000 jobs.

After that, none of the jobs are running, and I keep getting emails from Condor like this:

 

Condor job 13221.0 has been put on hold.

No condor_shadow installed that supports vanilla jobs on V6.3.3 or newer resources

Please correct this problem and release the job with "condor_release"

 

Here is the SchedLog. I even killed condor_master and restarted it, then released all the jobs, but the problem persists: all the jobs go back on hold.

Could you please help me figure out what the problem might be and how to fix it?

 

Thanks,

Senthil

 

SchedLog

***********

3/22 00:00:22 Marked job 14508.0 as IDLE

3/22 00:00:22 Job 14508.0 put on hold: No condor_shadow installed that supports vanilla jobs on V6.3.3 or newer resources

3/22 00:00:22 abort_job_myself: 14508.0 action:Hold log_hold:true notify:true

3/22 00:00:22 Writing record to user logfile=/u/jum18/AIRTP/Patient_278.log owner=jum18

3/22 00:00:22 Forking Mailer process...

3/22 00:00:22 start next job after 2 sec, JobsThisBurst 0

3/22 00:00:22 DaemonCore: No more children processes to reap.

3/22 00:00:24 Job prep for 14509.0 will not block, calling aboutToSpawnJobHandler() directly

3/22 00:00:24 aboutToSpawnJobHandler() completed for job 14509.0, attempting to spawn job handler

3/22 00:00:24 Trying to run a VANILLA job on a 6.3.3 or later resource, but you do not have condor_shadow that will work, aborting.
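
With a couple of thousand jobs queued, it can help to confirm that every hold has the same reason before digging further. The sketch below tallies hold reasons from SchedLog lines. It assumes the "Job <id> put on hold: <reason>" layout seen in the excerpt above; the names HOLD_RE, hold_reasons, and SAMPLE are hypothetical, and other Condor versions may format these lines differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   3/22 00:00:22 Job 14508.0 put on hold: <reason>
HOLD_RE = re.compile(r"^\d+/\d+ \d+:\d+:\d+ Job (\d+\.\d+) put on hold: (.*)$")


def hold_reasons(lines):
    """Count how many jobs were put on hold for each distinct reason."""
    counts = Counter()
    for line in lines:
        match = HOLD_RE.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


# Lines copied from the SchedLog excerpt in this message.
SAMPLE = [
    "3/22 00:00:22 Marked job 14508.0 as IDLE",
    "3/22 00:00:22 Job 14508.0 put on hold: No condor_shadow installed "
    "that supports vanilla jobs on V6.3.3 or newer resources",
    "3/22 00:00:22 Forking Mailer process...",
]

if __name__ == "__main__":
    for reason, count in hold_reasons(SAMPLE).most_common():
        print(count, reason)
```

To run it against a real log, pass an open file object instead of SAMPLE, for example hold_reasons(open(path)) with path pointing at the SchedLog.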

_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at either
https://lists.cs.wisc.edu/archive/condor-users/
http://www.opencondor.org/spaces/viewmailarchive.action?key=CONDOR