Hi Stefano,

for Q1, maybe the quantize() ClassAd function might be useful:

  set_MyDefaultMemPerCore = 3000
  set_MyMemScaling = xcount * MyDefaultMemPerCore
  set_TmpScaledMem = quantize(RequestMemory,MyMemScaling)

but I am unsure whether it would handle highmem jobs reasonably (it might instead be necessary to scale the core count up if the original memory-per-core request exceeds your defaults).
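To make the arithmetic behind the quantize() idea concrete, here is a minimal Python sketch. It assumes the scalar form of quantize(a, b), which returns the smallest integral multiple of b that is at least a. The helper names scaled_memory and cores_for_memory are hypothetical and only illustrate the two options mentioned above (rounding memory up per core count, or scaling the core count up for highmem requests); this is not route configuration.

```python
# Sketch of the arithmetic only; not HTCondor code.
# Assumption: 3000 MB per core, taken from MyDefaultMemPerCore above.
DEFAULT_MEM_PER_CORE = 3000


def quantize(value, step):
    """Smallest integral multiple of step that is >= value (scalar case)."""
    if step <= 0:
        raise ValueError("step must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-value // step) * step


def scaled_memory(request_memory, cores):
    """Hypothetical helper: memory rounded up to a multiple of cores * per-core default."""
    return quantize(request_memory, cores * DEFAULT_MEM_PER_CORE)


def cores_for_memory(request_memory, cores):
    """Hypothetical helper: raise the core count if memory per core exceeds the default."""
    needed = -(-request_memory // DEFAULT_MEM_PER_CORE)
    return max(cores, needed)


if __name__ == "__main__":
    for mem, cores in [(2048, 4), (13000, 4)]:
        print(mem, cores, scaled_memory(mem, cores), cores_for_memory(mem, cores))
```

With a 4-core job, a 13000 MB request exceeds 4 * 3000 MB, so quantize() jumps to the next multiple (24000 MB), whereas scaling the cores instead would ask for 5 cores. That gap is the highmem concern raised above.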
---
For Q2, my interpretation is that the xcount is reflected in OriginalCpus = 4, since the xcount ad is AFAIK only something CE-internal and gets copied over to the RequestCpus & OriginalCpus ads.
But maybe you can check whether your route actually got applied to your job? E.g., we set a few defaults with [1]. Note that the ad is added to JOB_ROUTER_DEFAULTS (the route has not been touched since CE4 and is in the []-syntax).
For specific rules like [2], it might be best for testing to always include a Requirements rule to distinguish which route a job takes, and to add the route to JOB_ROUTE_NAMES/JOB_ROUTE_ENTRIES. I also prefer adding a 'tag' like "DESYROUTEPRIO" to routes so that I can more easily identify where a job went.
Cheers,
  Thomas

[1]
MERGE_JOB_ROUTER_DEFAULT_ADS=True
DESYDEFAULTS @=end
[
  set_DESYDEFAULTSSET = True;
  set_default_xcount = 1;
  set_default_maxWallTime = 5760;
  set_default_maxMemory = 2048;
  set_requirements= ...
]
@end
JOB_ROUTER_DEFAULTS = $(JOB_ROUTER_DEFAULTS) $(DESYDEFAULTS)

[2]
DESYPRIO @=end
[
  TargetUniverse = 5;
  name = "DESYPRIO";
  set_DESYROUTEPRIO = True;
  Requirements = x509UserProxyVOName =?= "ops" ... ;
  # some more ads
]
@end
JOB_ROUTER_ENTRIES = $(JOB_ROUTER_ENTRIES) $(DESYPRIO)
JOB_ROUTE_NAMES = $(JOB_ROUTE_NAMES) $(DESYPRIO)

On 31/08/2021 18.08, Stefano Dal Pra wrote:
Hello,

I'm working to configure an htcondor-ce 5.1 and have a few doubts on how to properly set default job limits. I'm following the examples from here:
https://htcondor.github.io/htcondor-ce/v5/configuration/writing-job-routes/
such as this one:

JOB_ROUTER_ROUTE_Condor_Pool @=jrt
  UNIVERSE VANILLA
  # Set the requested memory to 1 GB
  default_maxMemory = 1000
@jrt
JOB_ROUTER_ROUTE_NAMES = Condor_Pool

Q1: Is it possible to set default_maxMemory to a value proportional to RequestCpus of the incoming job? I.e. something like

default_maxMemory = $(RequestCpus:1) * 3000

Q2: I applied the following defaults:

JOB_ROUTER_ROUTE_t1_defaults @=jrt
  UNIVERSE VANILLA
  default_xcount = 4
  default_maxMemory = 4321
  default_maxWallTime = 61
@jrt

But I'm a bit confused by the overall results:

0) I submit a minimal test job:

[sdalpra@ui-htc htjobs]$ condor_submit -pool ce01t-htc.cr.cnaf.infn.it:9619 -remote ce01t-htc.cr.cnaf.infn.it ce_testp308.sub
Submitting job(s).
1 job(s) submitted to cluster 610.

1) The job is routed:

[root@ce01t-htc ~]# condor_ce_q 610. -af routedtojobid
8428.0

2) I check classads from the routed job:

[root@ce01t-htc ~]# condor_q 8428.0 -af:jln jobstatus CpusProvisioned xcount requestcpus OriginalCpus remote_NodeNumber remote_SMPGranularity BatchRuntime OriginalMemory remote_OriginalMemory OriginalCpus remote_NodeNumber remote_SMPGranularity
ID = 8428.0
jobstatus = 2
CpusProvisioned = 1
xcount = undefined
requestcpus = 1
OriginalCpus = 4
remote_NodeNumber = 4
remote_SMPGranularity = 4
BatchRuntime = 3660
OriginalMemory = 4321
remote_OriginalMemory = 4321
OriginalCpus = 4
remote_NodeNumber = 4
remote_SMPGranularity = 4

So this is where I'm puzzled:

- I would expect to see xcount = 4, but it is undefined instead.
- The running job reports CpusProvisioned = 1, which makes me think that remote_NodeNumber = 4, remote_SMPGranularity = 4, OriginalCpus = 4 are somehow ignored.
- BatchRuntime is there, with the proper value set as expected (61 * 60), however I'm not sure of its meaning. The htcondor manual says:
  << For *batch* grid universe jobs, a limit in seconds on the job's execution time, enforced by the remote batch system. >>
  Who is "remote" in this context? Does that mean that condor-ce would stop the running routed job after 61 minutes? Moreover, we have here a Vanilla universe job, at both CE and batch side:

[root@ce01t-htc ~]# condor_ce_q 610. -l | grep -i univer
JobUniverse = 5
[root@ce01t-htc ~]# condor_q -l 8428.0 | grep -i univer
JobUniverse = 5
Remote_JobUniverse = 5

Thanks for any comment
Stefano

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/