
Re: [HTCondor-users] JobRouter fails on long classad



Hi Max,

Glancing at the hook handling code, I see an undocumented configuration variable PIPE_BUFFER_MAX. It is set to 10240. Does configuring a larger value change the behavior in your problem case?
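
One way to test this hypothesis from the hook side is to compare the size of the hook's output with the quoted value. The sketch below is only an illustration: it assumes the limit is counted in bytes of the hook's standard output, which the thread does not confirm, and the helper names are hypothetical.

```python
# Value quoted in this thread for PIPE_BUFFER_MAX.
# Assumption: the limit is measured in bytes of hook output.
PIPE_BUFFER_MAX = 10240


def hook_output_size(ad_text: str) -> int:
    """Return the size in bytes of the text a hook would write to stdout."""
    return len(ad_text.encode("utf-8"))


def fits_in_pipe_buffer(ad_text: str, limit: int = PIPE_BUFFER_MAX) -> bool:
    """Return True if the serialized ad is no larger than the assumed limit."""
    return hook_output_size(ad_text) <= limit


if __name__ == "__main__":
    # Hypothetical ad text with one long string attribute.
    sample_ad = 'Owner = "mfischer"\nLONG = "' + "-" * 12000 + '"\n'
    print(hook_output_size(sample_ad), fits_in_pipe_buffer(sample_ad))
```

If ads just above the limit fail and ads just below it pass, that would point at the pipe buffer rather than the ClassAd parser.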

--Dan

On 2/23/15 12:29 PM, Brian Bockelman wrote:
Hi Max,

I'm not aware of any specific buffer limits (although I suppose something must exist!).

What happens if you try to parse the classad with the python bindings?  Do they similarly fail?  This might allow one to differentiate between failures in the classad library versus failures in the JR.

(Apologies - I can't quite test this for you, my mail client mauled the formatting of your email.)

On to slightly crazier ideas: in other contexts where I've needed to pass huge amounts of data via classads (i.e., when there's some reason I can't do it as an input file), I've had good luck passing the string through gzip and converting the binary output to base64.
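
A minimal sketch of that gzip-plus-base64 round trip, using only the Python standard library. The function names are hypothetical, and how the packed string is then placed into a job attribute is left out. The standard base64 alphabet contains no double quotes or backslashes, so the result should not need escaping inside a ClassAd string literal.

```python
import base64
import gzip


def pack_for_classad(text: str) -> str:
    """Gzip a string and return the result as base64-encoded ASCII text."""
    compressed = gzip.compress(text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def unpack_from_classad(packed: str) -> str:
    """Reverse pack_for_classad: base64-decode, then gunzip."""
    compressed = base64.b64decode(packed.encode("ascii"))
    return gzip.decompress(compressed).decode("utf-8")


if __name__ == "__main__":
    # Repetitive test data, similar in spirit to the LONG attribute in the log.
    payload = "|-+-+-" * 1500
    packed = pack_for_classad(payload)
    print(len(payload), "->", len(packed))
    assert unpack_from_classad(packed) == payload
```

Highly repetitive data shrinks a lot this way. Data that is already compressed or random will grow instead, because base64 adds roughly a third on top of the gzip output.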

Brian

On Feb 23, 2015, at 5:06 AM, Fischer, Max (SCC) <max.fischer@xxxxxxx> wrote:

Hi All,

we use the JobRouter to update scheduling information from external sources with Hooks [1]. When doing an information scale test, we found that the JobRouter crashes for large job ClassAds. This seems to be a problem with how the Router parses the ClassAd.

The JobRouter fails to extract the Owner from the job ClassAd [2], even though it is enforced by the Route [3]. (Checking condor_q also correctly shows the Owner.)
In a similar fashion, if the hook produces too much output (the translate hook *must* output the entire ClassAd), we see errors that indicate the ClassAd is not fully digested [4]. E.g. parsing of RequestMemory stops before the expression is complete.

This appears to be due to a limited buffer size. For our hooks, we can also pass the information via files, but is there a fixed limit on ClassAd sizes we need to be aware of? Is this exclusive to the JobRouter?

Cheers,
Max


[1] JobRouter Hooks
# launch job router to hook into submitted jobs
DAEMON_LIST = $(DAEMON_LIST), JOB_ROUTER

# intrinsic job router
JOB_ROUTER_ENTRIES = \
        [ \
                requirements = (target.INPUT_FILES isnt undefined); \
                GridResource = "NONE"; \
                name = "HPDA"; \
                OverrideRoutingEntry = True; \
                TargetUniverse = 5;\
                set_HPDA_Route = True; \
                set_HookKeyword = "HPDA"; \
        ]
# router may poll frequently as hooks may skip frequent updates
JOB_ROUTER_POLLING_PERIOD = 10
# add external HPDA hooks
JOB_ROUTER_HOOK_KEYWORD = HPDA
HPDA_HOOK_TRANSLATE_JOB = /opt/hpda/repo/bin/htc_translate.py
HPDA_HOOK_UPDATE_JOB_INFO = /opt/hpda/repo/bin/htc_update.py
HPDA_HOOK_JOB_EXIT = /opt/hpda/repo/bin/htc_finalize.py
HPDA_HOOK_JOB_FINALIZE = /opt/hpda/repo/bin/htc_finalize.py
# Disable PROCD as job_router runs as submitting user
JOB_ROUTER.USE_PROCD = False

[2] JobRouterLog
02/23/15 10:20:34 (D_ALWAYS:2) HookClient /opt/hpda/repo/bin/htc_translate.py (pid 31454) exited with status 0
02/23/15 10:20:34 (D_ALWAYS:2) JobRouter (route=HPDA): Setting attribute HookKeyword
02/23/15 10:20:34 (D_ALWAYS:2) JobRouter (route=HPDA): Setting attribute HPDA_Route
CoreSize = 0
CumulativeSlotTime = 0
BufferBlockSize = 32768
ExecutableSize_RAW = 1
CurrentTime = time()
WantCheckpoint = false
ManagedManager = ""
CommittedTime = 0
RoutedBy = "jobrouter"
TargetType = "Machine"
WhenToTransferOutput = "ON_EXIT"
Cmd = "/usr/users/mfischer/condor-test/test.sh"
JobUniverse = 5
TransferIn = false
Iwd = "/usr/users/mfischer/condor-test"
CommittedSuspensionTime = 0
NumSystemHolds = 0
CumulativeSuspensionTime = 0
HookKeyword = "HPDA"
Environment = ""
HPDA_REQUIREMENTS = HPDA_RANK > 0
HPDA_LOCATORS = "http://ekpsg03.physik.uni-karlsruhe.de:8081";
MinHosts = 1
JobNotification = 0
NumCkpts = 0
LastSuspensionTime = 0
NumJobStarts = 0
WantRemoteSyscalls = false
JobPrio = 0
RootDir = "/"
LONG = "0-+-|-+-+-10+-|-+-+-20+-|-+-+-30+-|-+-+-40+-|-+-+-50+-|-+-+-60+-|-+-+-70+-|-+-+-80+-|-+-+-90+-|-+-+-100-|-+-+-110-|-+-+-120-|-+-+-130-|-+-+-140-|-+-+-150-|-+-+-160-|-+-+-170-|-+-+-180-|-+-+-190-|-+-+-200-|-+-+-210-|-+-
+-220-|-+-+-230-|-+-+-240-|-+-+-250-|-+-+-260-|-+-+-270-|-+-+-280-|-+-+-290-|-+-+-300-|-+-+-310-|-+-+-320-|-+-+-330-|-+-+-340-|-+-+-350-|-+-+-360-|-+-+-370-|-+-+-380-|-+-+-390-|-+-+-400-|-+-+-410-|-+-+-420-|-+-+-430-|-+-+-440-|-+-+-450-|-+-+-460-|-+-+-470-|-+-+-480-|-+-+-490-|-+-+-500-|-+-+-510-|-+-+-520-|-+-+-530-|-+-+-540-|-+-+-550-|-+-+-560-|-+-+-570-|-+-+-580-|-+-+-590-|-+-+-600-|-+-+-610-|-+-+-620-|-+-+-630-|-+-+-640-|-+-+-650-|-+-+-660-|-+-+-670-|-+-+-680-|-+-+-690-|-+-+-700-|-+-+-710-|-+-+-720-|-+-+-730-|-+-+-740-|-+-+-750-|-+-+-760-|-+-+-770-|-+-+-780-|-+-+-790-|-+-+-800-|-+-+-810-|-+-+-820-|-+-+-830-|-+-+-840-|-+-+-850-|-+-+-860-|-+-+-870-|-+-+-880-|-+-+-890-|-+-+-900-|-+-+-910-|-+-+-920-|-+-+-930-|-+-+-940-|-+-+-950-|-+-+-960-|-+-+-970-|-+-+-980-|-+-+-990-|-+-+-1000|-+-+-1010|-+-+-1020|-+-+-1030|-+-+-1040|-+-+-1050|-+-+-1060|-+-+-1070|-+-+-1080|-+-+-1090|-+-+-1100|-+-+-1110|-+-+-1120|-+-+-1130|-+-+-1140|-+-+-1150|-+-+-1160|-+-+-1170|-+-+-1180|-+-+-1190|-+-+-1200|-+-+-1210|-+-+-1220|-+-+-1230|-+-+-1240|-+-+-1250|-+-+-1260|-+-+-1270|-+-+-1280|-+-+-1290|-+-+-1300|-+-+-1310|-+-+-1320|-+-+-1330|-+-+-1340|-+-+-1350|-+-+-1360|-+-+-1370|-+-+-1380|-+-+-1390|-+-+-1400|-+-+-1410|-+-+-1420|-+-+-1430|-+-+-1440|-+-+-1450|-+-+-1460|-+-+-1470|-+-+-1480|-+-+-1490|-+-+-1500|-+-+-1510|-+-+-1520|-+-+-1530|-+-+-1540|-+-+-1550|-+-+-1560|-+-+-1570|-+-+-1580|-+-+-1590|-+-+-1600|-+-+-1610|-+-+-1620|-+-+-1630|-+-+-1640|-+-+-1650|-+-+-1660|-+-+-1670|-+-+-1680|-+-+-1690|-+-+-1700|-+-+-1710|-+-+-1720|-+-+-1730|-+-+-1740|-+-+-1750|-+-+-1760|-+-+-1770|-+-+-1780|-+-+-1790|-+-+-1800|-+-+-1810|-+-+-1820|-+-+-1830|-+-+-1840|-+-+-1850|-+-+-1860|-+-+-1870|-+-+-1880|-+-+-1890|-+-+-1900|-+-+-1910|-+-+-1920|-+-+-1930|-+-+-1940|-+-+-1950|-+-+-1960|-+-+-1970|-+-+-1980|-+-+-1990|-+-+-2000|-+-+-2010|-+-+-2020|-+-+-2030|-+-+-2040|-+-+-2050|-+-+-2060|-+-+-2070|-+-+-2080|-+-+-2090|-+-+-2100|-+-+-2110|-+-+-2120|-+-+-2130|-+-+-2140|-+-+-2150|-+-+-2160|-+-+-2170|-+-+-2180|-+-+-2190|-+-+-2200|-+-+-2210|-+-
+-2220|-+-+-2230|-+-+-2240|-+-+-2250|-+-+-2260|-

+-+-2270|-+-+-2280|-+-+-2290|-+-+-2300|-+-+-2310|-+-+-2320|-+-+-2330|-+-+-2340|-+-+-2350|-+-+-2360|-+-+-2370|-+-+-2380|-+-+-2390|-+-+-2400|-+-+-2410|-+-+-2420|-+-+-2430|-+-+-2440|-+-+-2450|-+-+-2460|-+-+-2470|-+-+-2480|-+-+-2490|-+-+-2500|-+-+-2510|-+-+-2520|-+-+-2530|-+-+-2540|-+-+-2550|-+-+-2560|-+-+-2570|-+-+-2580|-+-+-2590|-+-+-2600|-+-+-2610|-+-+-2620|-+-+-2630|-+-+-2640|-+-+-2650|-+-+-2660|-+-+-2670|-+-+-2680|-+-+-2690|-+-+-2700|-+-+-2710|-+-+-2720|-+-+-2730|-+-+-2740|-+-+-2750|-+-+-2760|-+-+-2770|-+-+-2780|-+-+-2790|-+-+-2800|-+-+-2810|-+-+-2820|-+-+-2830|-+-+-2840|-+-+-2850|-+-+-2860|-+-+-2870|-+-+-2880|-+-+-2890|-+-+-2900|-+-+-2910|-+-+-2920|-+-+-2930|-+-+-2940|-+-+-2950|-+-+-2960|-+-+-2970|-+-+-2980|-+-+-2990|-+-+-3000|-+-+-3010|-+-+-3020|-+-+-3030|-+-+-3040|-+-+-3050|-+-+-3060|-+-+-3070|-+-+-3080|-+-+-3090|-+-+-3100|-+-+-3110|-+-+-3120|-+-+-3130|-+-+-3140|-+-+-3150|-+-+-3160|-+-+-3170|-+-+-3180|-+-+-3190|-+-+-3200|-+-+-3210|-+-+-3220|-+-+-3230|-+-+-3240|-+-+-3250|-+-+-3260|-+-+-3270|-+-+-3280|-+-+-3290|-+-+-3300|-+-+-3310|-+-+-3320|-+-+-3330|-+-+-3340|-+-+-3350|-+-+-3360|-+-+-3370|-+-+-3380|-+-+-3390|-+-+-3400|-+-+-3410|-+-+-3420|-+-+-3430|-+-+-3440|-+-+-3450|-+-+-3460|-+-+-3470|-+-+-3480|-+-+-3490|-+-+-3500|-+-+-3510|-+-+-3520|-+-+-3530|-+-+-3540|-+-+-3550|-+-+-3560|-+-+-3570|-+-+-3580|-+-+-3590|-+-+-3600|-+-+-3610|-+-+-3620|-+-+-3630|-+-+-3640|-+-+-3650|-+-+-3660|-+-+-3670|-+-+-3680|-+-+-3690|-+-+-3700|-+-+-3710|-+-+-3720|-+-+-3730|-+-+-3740|-+-+-3750|-+-+-3760|-+-+-3770|-+-+-3780|-+-+-3790|-+-+-3800|-+-+-3810|-+-+-3820|-+-+-3830|-+-+-3840|-+-+-3850|-+-+-3860|-+-+-3870|-+-+-3880|-+-+-3890|-+-+-3900|-+-+-3910|-+-+-3920|-+-+-3930|-+-+-3940|-+-+-3950|-+-+-3960|-+-+-3970|-+-+-3980|-+-+-3990|-+-+-4000|-+-+-4010|-+-+-4020|-+-+-4030|-+-+-4040|-+-+-4050|-+-+-4060|-+-+-4070|-+-+-4080|-+-+-4090|-+-+-4100|-+-+-4110|-+-+-4120|-+-+-4130|-+-+-4140|-+-+-4150|-+-+-4160|-+-+-4170|-+-+-4180|-+-+-4190|-+-+-4200|-+-+-4210|-+-+-4220|-+-+-4230|-+-+-4240|-+-+-4250|-+-+-4260|-
+-+-4270|-+-+-4280|-+-+-4290|-+-+-4300|-+-+-4310

|-+-+-4320|-+-+-4330|-+-+-4340|-+-+-4350|-+-+-4360|-+-+-4370|-+-+-4380|-+-+-4390|-+-+-4400|-+-+-4410|-+-+-4420|-+-+-4430|-+-+-4440|-+-+-4450|-+-+-4460|-+-+-4470|-+-+-4480|-+-+-4490|-+-+-4500|-+-+-4510|-+-+-4520|-+-+-4530|-+-+-4540|-+-+-4550|-+-+-4560|-+-+-4570|-+-+-4580|-+-+-4590|-+-+-4600|-+-+-4610|-+-+-4620|-+-+-4630|-+-+-4640|-+-+-4650|-+-+-4660|-+-+-4670|-+-+-4680|-+-+-4690|-+-+-4700|-+-+-4710|-+-+-4720|-+-+-4730|-+-+-4740|-+-+-4750|-+-+-4760|-+-+-4770|-+-+-4780|-+-+-4790|-+-+-4800|-+-+-4810|-+-+-4820|-+-+-4830|-+-+-4840|-+-+-4850|-+-+-4860|-+-+-4870|-+-+-4880|-+-+-4890|-+-+-4900|-+-+-4910|-+-+-4920|-+-+-4930|-+-+-4940|-+-+-4950|-+-+-4960|-+-+-4970|-+-+-4980|-+-+-4990|-+-+-5000|-+-+-5010|-+-+-5020|-+-+-5030|-+-+-5040|-+-+-5050|-+-+-5060|-+-+-5070|-+-+-5080|-+-+-5090|-+-+-5100|-+-+-5110|-+-+-5120|-+-+-5130|-+-+-5140|-+-+-5150|-+-+-5160|-+-+-5170|-+-+-5180|-+-+-5190|-+-+-5200|-+-+-5210|-+-+-5220|-+-+-5230|-+-+-5240|-+-+-5250|-+-+-5260|-+-+-5270|-+-+-5280|-+-+-5290|-+-+-5300|-+-+-5310|-+-+-5320|-+-+-5330|-+-+-5340|-+-+-5350|-+-+-5360|-+-+-5370|-+-+-5380|-+-+-5390|-+-+-5400|-+-+-5410|-+-+-5420|-+-+-5430|-+-+-5440|-+-+-5450|-+-+-5460|-+-+-5470|-+-+-5480|-+-+-5490|-+-+-5500|-+-+-5510|-+-+-5520|-+-+-5530|-+-+-5540|-+-+-5550|-+-+-5560|-+-+-5570|-+-+-5580|-+-+-5590|-+-+-5600|-+-+-5610|-+-+-5620|-+-+-5630|-+-+-5640|-+-+-5650|-+-+-5660|-+-+-5670|-+-+-5680|-+-+-5690|-+-+-5700|-+-+-5710|-+-+-5720|-+-+-5730|-+-+-5740|-+-+-5750|-+-+-5760|-+-+-5770|-+-+-5780|-+-+-5790|-+-+-5800|-+-+-5810|-+-+-5820|-+-+-5830|-+-+-5840|-+-+-5850|-+-+-5860|-+-+-5870|-+-+-5880|-+-+-5890|-+-+-5900|-+-+-5910|-+-+-5920|-+-+-5930|-+-+-5940|-+-+-5950|-+-+-5960|-+-+-5970|-+-+-5980|-+-+-5990|-+-+-6000|-+-+-6010|-+-+-6020|-+-+-6030|-+-+-6040|-+-+-6050|-+-+-6060|-+-+-6070|-+-+-6080|-+-+-6090|-+-+-6100|-+-+-6110|-+-+-6120|-+-+-6130|-+-+-6140|-+-+-6150|-+-+-6160|-+-+-6170|-+-+-6180|-+-+-6190|-+-+-6200|-+-+-6210|-+-+-6220|-+-+-6230|-+-+-6240|-+-+-6250|-+-+-6260|-+-+-6270|-+-+-6280|-+-+-6290|-+-+-6300|-+-+-6310
|-+-+-6320|-+-+-6330|-+-+-6340|-+-+-6350|-+-+-63

60|-+-+-6370|-+-+-6380|-+-+-6390|-+-+-6400|-+-+-6410|-+-+-6420|-+-+-6430|-+-+-6440|-+-+-6450|-+-+-6460|-+-+-6470|-+-+-6480|-+-+-6490|-+-+-6500|-+-+-6510|-+-+-6520|-+-+-6530|-+-+-6540|-+-+-6550|-+-+-6560|-+-+-6570|-+-+-6580|-+-+-6590|-+-+-6600|-+-+-6610|-+-+-6620|-+-+-6630|-+-+-6640|-+-+-6650|-+-+-6660|-+-+-6670|-+-+-6680|-+-+-6690|-+-+-6700|-+-+-6710|-+-+-6720|-+-+-6730|-+-+-6740|-+-+-6750|-+-+-6760|-+-+-6770|-+-+-6780|-+-+-6790|-+-+-6800|-+-+-6810|-+-+-6820|-+-+-6830|-+-+-6840|-+-+-6850|-+-+-6860|-+-+-6870|-+-+-6880|-+-+-6890|-+-+-6900|-+-+-6910|-+-+-6920|-+-+-6930|-+-+-6940|-+-+-6950|-+-+-6960|-+-+-6970|-+-+-6980|-+-+-6990|-+-+-7000|-+-+-7010|-+-+-7020|-+-+-7030|-+-+-7040|-+-+-7050|-+-+-7060|-+-+-7070|-+-+-7080|-+-+-7090|-+-+-7100|-+-+-7110|-+-+-7120|-+-+-7130|-+-+-7140|-+-+-7150|-+-+-7160|-+-+-7170|-+-+-7180|-+-+-7190|-+-+-7200|-+-+-7210|-+-+-7220|-+-+-7230|-+-+-7240|-+-+-7250|-+-+-7260|-+-+-7270|-+-+-7280|-+-+-7290|-+-+-7300|-+-+-7310|-+-+-7320|-+-+-7330|-+-+-7340|-+-+-7350|-+-+-7360|-+-+-7370|-+-+-7380|-+-+-7390|-+-+-7400|-+-+-7410|-+-+-7420|-+-+-7430|-+-+-7440|-+-+-7450|-+-+-7460|-+-+-7470|-+-+-7480|-+-+-7490|-+-+-7500|-+-+-7510|-+-+-7520|-+-+-7530|-+-+-7540|-+-+-7550|-+-+-7560|-+-+-7570|-+-+-7580|-+-+-7590|-+-+-7600|-+-+-7610|-+-+-7620|-+-+-7630|-+-+-7640|-+-+-7650|-+-+-7660|-+-+-7670|-+-+-7680|-+-+-7690|-+-+-7700|-+-+-7710|-+-+-7720|-+-+-7730|-+-+-7740|-+-+-7750|-+-+-7760|-+-+-7770|-+-+-7780|-+-+-7790|-+-+-7800|-+-+-7810|-+-+-7820|-+-+-7830|-+-+-7840|-+-+-7850|-+-+-7860|-+-+-7870|-+-+-7880|-+-+-7890|-+-+-7900|-+-+-7910|-+-+-7920|-+-+-7930|-+-+-7940|-+-+-7950|-+-+-7960|-+-+-7970|-+-+-7980|-+-+-7990|-+-+-8000|-+-+-8010|-+-+-8020|-+-+-8030|-+-+-8040|-+-+-8050|-+-+-8060|-+-+-8070|-+-+-8080|-+-+-8090|-+-+-8100|-+-+-8110|-+-+-8120|-+-+-8130|-+-+-8140|-+-+-8150|-+-+-8160|-+-+-8170|-+-+-8180|-+-+-8190|-+-+-8200|-+-+-8210|-+-+-8220|-+-+-8230|-+-+-8240|-+-+-8250|-+-+-8260|-+-+-8270|-+-+-8280|-+-+-8290|-+-+-8300|-+-+-8310|-+-+-8320|-+-+-8330|-+-+-8340|-+-+-8350|-+-+-83
60|-+-+-8370|-+-+-8380|-+-+-8390|-+-+-8400|-+-+-

8410|-+-+-8420|-+-+-8430|-+-+-8440|-+-+-8450|-+-+-8460|-+-+-8470|-+-+-8480|-+-+-8490|-+-+-8500|-+-+-"
WantRemoteIO = true
OnExitRemove = true
Managed = "Schedd"
DiskUsage = 1
PeriodicRemove = false
HPDA_ROUTE = true
LocalUserCpu = 0.0
LastRejMatchTime = 1424683225
ClusterId = 8017
RoutedFromJobId = "8017.0"
CompletionDate = 0
RemoteWallClockTime = 0.0
Rank = ( 0.0 ) + HPDA_RANK
LeaveJobInQueue = false
TargetVar = "$$(HOSTNAME:Undefined).host"
CondorVersion = "$CondorVersion: 8.2.1 Jun 27 2014 BuildID: 256063 $"
LastRejMatchReason = "no match found"
StreamErr = false
DiskUsage_RAW = 1
TEST_HPDA = true
RouteName = "HPDA"
ProcId = 0
PeriodicHold = false
User = "mfischer@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
LastJobStatus = 0
Arguments = ""
Out = "job.8017.0.stdout"
MyVar = myCustomValue
HPDA_RANK = 0 + ( ( machine == "ekpsg01.physik.uni-karlsruhe.de" ) * 3 )
UserLog = "/usr/users/mfischer/condor-test/job.8017.log"
JobStatus = 1
PeriodicRelease = false
MaxHosts = 1
TotalSuspensions = 0
CommittedSlotTime = 0
TransferInputSizeMB = 0
CondorPlatform = "$CondorPlatform: x86_64_RedHat6 $"
ShouldTransferFiles = "IF_NEEDED"
EnteredCurrentStatus = 1424679507
QDate = 1424679507
MyHost = "foo.host.var"
02/23/15 10:20:34 (D_ALWAYS|D_FAILURE) ERROR "Failed to find Owner in job ad." at line 34 in file /slots/04/dir_29130/userdir/src/condor_utils/set_user_priv_from_ad.cpp

[3] JobRouterLog
02/23/15 11:46:23 (D_ALWAYS:2) JobRouter: umbrella constraint: ( (( target.INPUT_FILES isnt undefined )) ) && (target.ProcId >= 0 && target.JobStatus == 1 && (target.StageInStart is undefined || target.StageInFinish isnt undefined) && target.Managed isnt "ScheddDone" && target.Managed isnt "External" && target.Owner isnt Undefined && target.RoutedBy isnt "jobrouter")

[4] JobRouterLog
02/23/15 11:46:44 (D_ALWAYS) TranslateClient::hookExited (src=8035.0,route=HPDA): Failed to insert "REQUESTMEMORY = ifthenelse(MemoryUsage =!= undefined,Memo" into ClassAd, ignoring invalid hook output

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


