
Re: [Condor-users] Identifying condor_shadow PID in Windows?



Thanks Ben. Yep, I ended up getting it from the ShadowLog files.

I was after something "automatic" to "reset" "hung" jobs. Unfortunately
condor_hold/condor_release, condor_vacate_job, etc. did not work, although
they would have provided an elegant solution via the -constraint option.

I was checking for jobs that have obviously run away and are taking
far too long. There is a condor_shadow running for each of them, BUT the
execute node where the shadow thinks the job is running is now
unclaimed/idle. The only way to clear them was to kill the associated
condor_shadow, as NO condor commands would do it (not even
condor_vacate_job -f).

The decidedly inelegant DOS windows solution was the following batch script
(I've included it in the unlikely event someone else wants to do something similar!)

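rem Collect "job-id shadow-pid" pairs from the ACCEPTED lines in the shadow logs
cd c:\progra~1\condor\log
type ShadowLog.* | findstr ACCEPTED > accepted1.txt
rem Keep tokens 3 and 4 of each line, then strip the surrounding parentheses
for /f "tokens=3,4" %%i in (accepted1.txt) do echo %%i %%j >> accepted2.txt
for /f "tokens=1,2 delims=(" %%i in (accepted2.txt) do echo %%i %%j >> accepted3.txt
for /f "tokens=1,2 delims=)" %%i in (accepted3.txt) do echo %%i %%j >> accepted4.txt
rem List jobs that have been running (JobStatus == 2) for more than 1800 seconds
condor_q -run -currentrun -constraint "(JobStatus == 2) && (CurrentTime - JobCurrentStartDate) > 1800" | findstr + > jobs1.txt
for /f %%i in (jobs1.txt) do echo %%i >> jobs2.txt
rem Keep only the pairs whose job ID appears in the long-running list
type accepted4.txt | findstr /G:jobs2.txt > jobs3.txt
rem Dry run: the active line only prints the taskkill commands.
rem The commented-out variant writes them to jobs4.bat (which "del jobs*.*" below would also delete).
rem for /f "tokens=1,2" %%i in (jobs3.txt) do echo taskkill /f /pid %%j >> jobs4.bat
for /f "tokens=1,2" %%i in (jobs3.txt) do echo taskkill /f /pid %%j
rem Clean up the temporary files
del accepted*.*
del jobs*.*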
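
For anyone who would rather avoid the chain of temporary files, the same matching step can be sketched in Python. This is only a rough sketch, not something I have run against a real pool: it assumes each ACCEPTED line in ShadowLog has the shape "<date> <time> (<cluster>.<proc>) (<pid>): ... ACCEPTED", which is what the token positions in the batch script imply. The names shadow_pids and kill_commands are made up for illustration, and the list of long-running job IDs is expected to come from the same condor_q query as above.

```python
import re

# ASSUMPTION: line shape inferred from the batch script's token positions:
#   <date> <time> (<cluster>.<proc>) (<pid>): ... ACCEPTED
ACCEPTED_LINE = re.compile(
    r"^\S+\s+\S+\s+\((\d+\.\d+)\)\s+\((\d+)\):.*ACCEPTED"
)


def shadow_pids(log_lines):
    """Map each job ID ('cluster.proc') to the last shadow PID seen for it."""
    pids = {}
    for line in log_lines:
        match = ACCEPTED_LINE.match(line)
        if match:
            # Later lines overwrite earlier ones, so a restarted job
            # ends up mapped to its most recent shadow.
            pids[match.group(1)] = int(match.group(2))
    return pids


def kill_commands(pids, stale_job_ids):
    """Build (but do not run) one taskkill command per stale job with a known PID."""
    return [
        f"taskkill /f /pid {pids[job_id]}"
        for job_id in stale_job_ids
        if job_id in pids
    ]
```

Like the batch version, this only builds the taskkill commands rather than running them, so the list can be checked by eye first.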

Hope it doesn't put too many heads in a spin! :)

Cheers

Greg

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Burnett, Ben
Sent: Tuesday, 20 July 2010 2:30 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Identifying condor_shadow PID in Windows?

You should be able to parse the shadow log file for the information you are looking for (the PID is in the banner, and the job cluster.process is listed further down).

-B

On 2010-07-19, at 10:07 PM, <Greg.Hitchen@xxxxxxxx> <Greg.Hitchen@xxxxxxxx> wrote:

> Is it possible on Windows to identify which condor_shadow process (PID number) is associated
> with which running job number?
> 
> Thanks.
> 
> Cheers
> 
> Greg
> _______________________________________________
> Condor-users mailing list
> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> 
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/condor-users/
