
[HTCondor-users] Schedd crash by setting concurrency limits by schedd_cron



Hello!

I get the following error on my test pool and I don't know how to fix it. Sorry for my bad English; I will try to explain as well as I (and Google Translate) can. ;D
I set up a pool and defined a cron job list for the schedd with two cron job names corresponding to two concurrency limits (free_low_limit and free_high_limit). As far as I know, the schedd creates new ClassAds for the names in the cron job list on startup. Now I want to periodically execute two scripts that set new concurrency limit values at run time. Persistent and runtime configuration are enabled. But when I start Condor, the schedd crashes with the errors shown in the ScheddLog.
Without the _LIMIT part, the schedd starts but does not set the concurrency limits.

 

condor_config:

SCHEDD_CRON_JOBLIST = FREE_LOW_LIMIT FREE_HIGH_LIMIT

SCHEDD_CRON_FREE_LOW_LIMIT_EXECUTABLE = $(LIBEXEC)/low_limit

SCHEDD_CRON_FREE_LOW_LIMIT_PERIOD = 30s

SCHEDD_CRON_FREE_HIGH_LIMIT_EXECUTABLE = $(LIBEXEC)/high_limit

SCHEDD_CRON_FREE_HIGH_LIMIT_PERIOD = 30s
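
For context, a cron executable configured this way is expected to print ClassAd attribute assignments on standard output, one "Name = Value" per line. As far as I understand the cron mechanism, a line beginning with "-" marks the end of the ad. The following is only a minimal sketch of what such a script could look like. It is not the actual low_limit script, and the attribute name FreeLowLimitValue and the value 5 are made up for illustration. Whether publishing an attribute like this changes a concurrency limit at all is exactly the open question here.

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for $(LIBEXEC)/low_limit (not the real script)."""
import sys


def format_cron_output(attrs):
    """Render a dict of integer attributes as 'Name = value' lines.

    A trailing '-' line is appended to mark the end of the ad
    (assumed cron output convention).
    """
    lines = ["%s = %d" % (name, value) for name, value in attrs.items()]
    lines.append("-")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed attribute name and value, chosen only for illustration.
    sys.stdout.write(format_cron_output({"FreeLowLimitValue": 5}))
```

Running the script by hand and checking that it prints clean "Name = Value" lines is a quick way to rule out the script output itself as the cause of a problem.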

 

ScheddLog:

04/10/15 17:55:45 (pid:25416) Setting maximum file descriptors to 4096.

04/10/15 17:55:45 (pid:25416) ******************************************************

04/10/15 17:55:45 (pid:25416) ** condor_schedd (CONDOR_SCHEDD) STARTING UP

04/10/15 17:55:45 (pid:25416) ** /opt/condor-8.2.9/sbin/condor_schedd

04/10/15 17:55:45 (pid:25416) ** SubsystemInfo: name=SCHEDD type=SCHEDD(5) class=DAEMON(1)

04/10/15 17:55:45 (pid:25416) ** Configuration: subsystem:SCHEDD local:<NONE> class:DAEMON

04/10/15 17:55:45 (pid:25416) ** $CondorVersion: 8.2.9 Aug 12 2015 BuildID: 335399 $

04/10/15 17:55:45 (pid:25416) ** $CondorPlatform: x86_64_RedHat7 $

04/10/15 17:55:45 (pid:25416) ** PID = 25416

04/10/15 17:55:45 (pid:25416) ** Log last touched 4/10 17:51:23

04/10/15 17:55:45 (pid:25416) ******************************************************

04/10/15 17:55:45 (pid:25416) Using config source: /etc/condor/condor_config

04/10/15 17:55:45 (pid:25416) Using local config sources:

04/10/15 17:55:45 (pid:25416)    /home/condor/.condor/condor_config.local

04/10/15 17:55:45 (pid:25416) config Macros = 183, Sorted = 183, StringBytes = 3821, TablesBytes = 6636

04/10/15 17:55:45 (pid:25416) CLASSAD_CACHING is ENABLED

04/10/15 17:55:45 (pid:25416) Daemon Log is logging: D_ALWAYS D_ERROR

04/10/15 17:55:45 (pid:25416) SharedPortEndpoint: waiting for connections to named socket 7945_0bde_5

04/10/15 17:55:45 (pid:25416) DaemonCore: command socket at <192.168.13.71:41158?sock=7945_0bde_5>

04/10/15 17:55:45 (pid:25416) DaemonCore: private command socket at <192.168.13.71:41158?sock=7945_0bde_5>

04/10/15 17:55:45 (pid:25416) History file rotation is enabled.

04/10/15 17:55:45 (pid:25416)   Maximum history file size is: 20971520 bytes

04/10/15 17:55:45 (pid:25416)   Number of rotated history files is: 2

04/10/15 17:55:45 (pid:25416) Launched startd for local jobs with pid 25418

04/10/15 17:55:45 (pid:25416) NOTE: QUEUE_ALL_USERS_TRUSTED=TRUE - all queue access checks disabled!

04/10/15 17:55:45 (pid:25416) CronJobList: Adding job 'FREE_LOW_LIMIT'

04/10/15 17:55:45 (pid:25416) CronJobList: Adding job 'FREE_HIGH_LIMIT'

Stack dump for process 25416 at timestamp 1428681345 (15 frames)

/opt/condor-8.2.9/sbin/../lib/libcondor_utils_8_2_9.so(dprintf_dump_stack+0x72)[0x7fe0b6b0e812]

/opt/condor-8.2.9/sbin/../lib/libcondor_utils_8_2_9.so(_Z18linux_sig_coredumpi+0x24)[0x7fe0b6c441a4]

/lib64/libpthread.so.0(+0xf890)[0x7fe0b2532890]

/lib64/libc.so.6(+0x95487)[0x7fe0b2210487]

/opt/condor-8.2.9/sbin/../lib/libcondor_utils_8_2_9.so(_ZN8MyString10assign_strEPKci+0x4e)[0x7fe0b6b08f8e]

/opt/condor-8.2.9/sbin/../lib/libcondor_utils_8_2_9.so(_ZN8MyStringaSERKS_+0x1d)[0x7fe0b6b08fdd]

/opt/condor-8.2.9/sbin/../lib/libcondor_utils_8_2_9.so(_ZN14ClassAdCronJob10InitializeEv+0x4c)[0x7fe0b6aea05c]

/opt/condor-8.2.9/sbin/../lib/libcondor_utils_8_2_9.so(_ZN17CondorCronJobList13InitializeAllEv+0x22)[0x7fe0b6b3bd62]

/opt/condor-8.2.9/sbin/../lib/libcondor_utils_8_2_9.so(_ZN10CronJobMgr8DoConfigEb+0xa5)[0x7fe0b6ad1165]

/opt/condor-8.2.9/sbin/../lib/libcondor_utils_8_2_9.so(_ZN10CronJobMgr10InitializeEPKc+0x27)[0x7fe0b6ad11d7]

condor_schedd(_ZN9Scheduler4InitEv+0x2067)[0x4b34d7]

condor_schedd(_Z9main_initiPPc+0xb4)[0x474294]

/opt/condor-8.2.9/sbin/../lib/libcondor_utils_8_2_9.so(_Z7dc_mainiPPc+0x139c)[0x7fe0b6c4772c]

/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fe0b219cb05]

condor_schedd[0x44a5f1]

 

 

Thomas