
Re: [HTCondor-users] Increase memory on release



I've tried this on my own machine (9.0.14), and MEMORY values print as expected for condor_q -autocluster. You can add the -long flag to see which attributes the schedd is returning to condor_q:

% condor_q -autocluster -long
AutoClusterId = 2
DiskUsage = 5
JobCount = 4
JobIds = "2101.0 ... 2101.3"
Rank = 0.0
RequestCpus = 1
RequestDisk = DiskUsage
RequestMemory = ifthenelse(((LastHoldReasonCode != 34) || IsUndefined(Memoryshmovisioned)),2048,4096)
Requirements = (false) && (TARGET.Arch == "arm64") && (TARGET.OpSys == "macOS") && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory) && (TARGET.HasFileTransfer)
ServerTime = 1654698108

You can alternatively add the -af flag to see how the RequestMemory attribute evaluates in the returned autocluster ad:

% condor_q -autocluster -af requestmemory
2048
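
If you want to see how the expression evaluates for each individual job rather than for the returned autocluster ad, the same autoformat option works on a plain per-job query as well. A quick sketch (the :j modifier prepends the job id, so you get one line per job with the evaluated RequestMemory and the last hold code):

% condor_q -af:j RequestMemory LastHoldReasonCode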

I did notice a problem where the job universe isn't always returned by the schedd, which should be a simple fix.

 - Jaime

On May 29, 2022, at 7:14 AM, Dudu Handelman <duduhandelman@xxxxxxxxxxx> wrote:

Hi Jaime,
I think it's only a display issue with condor_q --autocluster

As you can see below, it's an autocluster of 15 jobs, and the Negotiator treats it as a single autocluster.


Using condor_version 9.0.1


Thanks
David.

Submit lines:


InitialMemorySize = 2048
IncreasedMemorySize = 4096
RequestMemory = ifthenelse(((LastHoldReasonCode != 34) || IsUndefined(Memoryshmovisioned)), $(InitialMemorySize), $(IncreasedMemorySize))
periodic_release = (JobStatus == 5) && (HoldReasonCode == 34) && (Memoryshmovisioned <= $(IncreasedMemorySize))
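
For context, a minimal complete submit description around these lines could look like the sketch below. The executable, output/log names and the queue count are illustrative only; Memoryshmovisioned stands in for the provisioned-memory attribute the expression tests, and HoldReasonCode 34 is the memory-exceeded hold:

executable          = my_job.sh
output              = my_job.$(Process).out
error               = my_job.$(Process).err
log                 = my_job.log
InitialMemorySize   = 2048
IncreasedMemorySize = 4096
# Start with the small request; switch to the larger one only after a memory-exceeded hold (code 34)
RequestMemory       = ifthenelse(((LastHoldReasonCode != 34) || IsUndefined(Memoryshmovisioned)), $(InitialMemorySize), $(IncreasedMemorySize))
# Automatically release jobs held for exceeding memory so they retry with the larger request
periodic_release    = (JobStatus == 5) && (HoldReasonCode == 34) && (Memoryshmovisioned <= $(IncreasedMemorySize))
queue 15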

--------------------------------
condor_q --autocluster output:

-- Schedd: fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo : <192.20.9.70:20123?... @ 05/29/22 10:04:31
   ID COUNT UNIVERSE CPUS MEMORY   DISK REQUIREMENTS
   87     0 Vanilla     1 [????]  15360 TARGET.HasDocker && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory) && (TARGET.HasFileTransfer)

dudu@fleetnetworks-sbmt01:~$
---------------------------------

Negotiator log:

05/29/22 10:04:20   Negotiating with guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx at <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0>
05/29/22 10:04:20 0 seconds so far for this submitter
05/29/22 10:04:20 0 seconds so far for this schedd
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 1 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.39.26:9618?addrs=192.3.39.26-9618&alias=shmo-server75.gorgo&noUDP&sock=startd_6181_3af9> slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 2 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.29.56:9618?addrs=192.3.29.56-9618&alias=shmo-server1074.gorgo&noUDP&sock=startd_16567_2999> slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 3 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.32.131:9618?addrs=192.3.32.131-9618&alias=shmo-server95.gorgo&noUDP&sock=startd_1208_2a7e> slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 4 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.53.112:9618?addrs=192.3.53.112-9618&alias=shmo-server10.gorgo&noUDP&sock=startd_12323_21a9> slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 5 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.21.177:9618?addrs=192.3.21.177-9618&alias=shmo-server1086.gorgo&noUDP&sock=startd_24598_c2b0> slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 6 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.106.141:9618?addrs=192.3.106.141-9618&alias=GLUS171087.gorgo&noUDP&sock=startd_31737_854c> slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 7 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.30.44:9618?addrs=192.3.30.44-9618&alias=shmo-server152.gorgo&noUDP&sock=startd_432_e500> slot1@xxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 8 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.3.69:9618?addrs=192.3.3.69-9618&alias=NIR-SSD029.gorgo&noUDP&sock=startd_25279_deaf> slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 9 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.63.176:9618?addrs=192.3.63.176-9618&alias=NIR-069.gorgo&noUDP&sock=startd_6202_a276> slot1@xxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 10 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.59.148:9618?addrs=192.3.59.148-9618&alias=shmo-server1149.gorgo&noUDP&sock=startd_12655_db1a> slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:21       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:21     Request 02537.00000: autocluster 86 (request count 11 of 15)
05/29/22 10:04:21       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.64.122:9618?addrs=192.3.64.122-9618&alias=NIR-SSD017.gorgo&noUDP&sock=startd_27893_db3f> slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21       Successfully matched with slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21     Request 02537.00000: autocluster 86 (request count 12 of 15)
05/29/22 10:04:21       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.3.73:9618?addrs=192.3.3.73-9618&alias=NIR-SSD024.gorgo&noUDP&sock=startd_27636_9368> slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21       Successfully matched with slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21     Request 02537.00000: autocluster 86 (request count 13 of 15)
05/29/22 10:04:21       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.47.15:9618?addrs=192.3.47.15-9618&alias=shmo-server1168.gorgo&noUDP&sock=startd_13945_b046> slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:21       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:21     Request 02537.00000: autocluster 86 (request count 14 of 15)
05/29/22 10:04:21       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.3.71:9618?addrs=192.3.3.71-9618&alias=NIR-SSD036.gorgo&noUDP&sock=startd_7313_b02e> slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21       Successfully matched with slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21     Request 02537.00000: autocluster 86 (request count 15 of 15)
05/29/22 10:04:21       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.30.127:9618?addrs=192.3.30.127-9618&alias=shmo-server116.gorgo&noUDP&sock=startd_11345_4e09> slot1@xxxxxxxxxxxxxxxxxxxx



From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Dudu Handelman <duduhandelman@xxxxxxxxxxx>
Sent: 28 May 2022 11:08
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Increase memory on release
 
Thanks Jaime.
I will recreate this in the lab and provide the information.


Thanks 
David 


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Jaime Frey <jfrey@xxxxxxxxxxx>
Sent: Thursday, May 26, 2022, 03:41
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Increase memory on release

Can you tell us the expression you're using, and the command and output with question marks?

 - Jaime

On May 20, 2022, at 1:09 AM, Dudu Handelman <duduhandelman@xxxxxxxxxxx> wrote:

Hi All. 
I'm trying to increase memory on failed jobs. 
It's actually working using an if statement on the requested memory.
But the autocluster is unable to handle it.
While looking at the autocluster queue, it displays question marks in the memory column.

I personally think it's a very important feature. 

Thanks 
David 


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/