
Re: [Condor-users] SCHEDD dying on multiple process submission



Hi John:

Could you please send a stack trace from one of these dead processes?  You
should have received some in the admin e-mail account.

-B

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Scillieri, John
Sent: Thursday, May 17, 2007 6:57 AM
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] SCHEDD dying on multiple process submission

All,

I'm having trouble with a job that sporadically kills the SCHEDD on the
submission host. I think the job in question is our large job that
queues up multiple processes (about 25). I've attached the majority of
the job description file below. Is there something I'm doing that is bad
etiquette or unsupported?  The MasterLog file reports "The SCHEDD (pid
XXXX) died due to exception ACCESS_VIOLATION", if that helps anyone. I've
submitted the job both as a standalone submission and as a piece within
a DAG, and it happens both ways.

Also, because the schedd keeps dying on start-up, there's no way for me
to use condor_rm to remove the bad jobs. Is there another way to
manually remove a job from the queue?

Any help would be great, thanks a lot for your time.

John Scillieri

-------------------- CONDOR JOB SUBMIT FILE --------------
JobName = PORTFOLIOS
LogHeader = $(JobName).$(Cluster).$(Process).$$(Machine)
 
input = \\NAS-OMF-01\cpsshare\All\Risk\Software\R\prod\Energy\VaR\Overnight\run.one.VaR.R
output = \\NAS-OMF-01\cpsshare\All\Risk\Reports\VaR\prod\CondorNightlyLogs\$(LogHeader).out
error = \\NAS-OMF-01\cpsshare\All\Risk\Reports\VaR\prod\CondorNightlyLogs\$(LogHeader).error
log = \\NAS-OMF-01\cpsshare\All\Risk\Reports\VaR\prod\CondorNightlyLogs\dagLog.log
Executable = C:\Program Files\R\R-2.5.0\bin\Rscript.exe
Arguments = --vanilla - 

transfer_executable = False
should_transfer_files = No
getenv = True
EnvSettings = TMP=C:\temp TEMP=C:\temp
run_as_owner = true
Requirements = UidDomain == "DOMAIN" && FileSystemDomain == "DOMAIN" 
Universe = vanilla
 
# Begin individual job section
environment = "portfolio=$(portfolio) $(EnvSettings)"
 
portfolio = 'Portfolio 1'
Queue
portfolio = 'Portfolio 2'
Queue

... And so on ...

portfolio = 'Portfolio 30'
Queue
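
For reference, the repeated portfolio/Queue pairs above could be generated
rather than typed by hand. This is only a minimal sketch: it assumes the
portfolios really are named 'Portfolio 1' through 'Portfolio N', and the
function name portfolio_stanzas is hypothetical, not part of Condor. It
produces only the per-job section, not the header of the submit file.

```python
def portfolio_stanzas(count):
    """Return submit-file text with one "portfolio = ..." / "Queue" pair
    per portfolio, numbered 1..count (naming scheme is an assumption)."""
    lines = []
    for i in range(1, count + 1):
        # Single quotes keep the embedded space inside one environment value.
        lines.append(f"portfolio = 'Portfolio {i}'")
        lines.append("Queue")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the individual job section for 30 portfolios to stdout.
    print(portfolio_stanzas(30), end="")
```

The output can be appended after the "# Begin individual job section"
lines of the submit file.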