
Re: [HTCondor-users] Increase memory on release



Hi Jaime,
I think it's only a display issue with condor_q --autocluster

As you can see below, it is a single autocluster of 15 jobs, and the Negotiator treats it as one autocluster.
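For anyone who wants to double-check from the schedd side, one way to confirm that all 15 jobs share one autocluster is to print the autocluster id next to each job. This is only a sketch, and it assumes the AutoClusterId attribute is already visible in the job ads (it normally appears once the schedd has built its autoclusters):

# Print each job's id and the autocluster id the schedd assigned to it;
# all 15 jobs should report the same AutoClusterId.
condor_q -af:j AutoClusterId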


Using condor_version 9.0.1


Thanks
David.

Submit lines:


InitialMemorySize = 2048
IncreasedMemorySize = 4096
RequestMemory = ifthenelse(((LastHoldReasonCode != 34) || IsUndefined(Memoryshmovisioned)), $(InitialMemorySize), $(IncreasedMemorySize))
periodic_release = (JobStatus == 5) && (HoldReasonCode == 34) && (Memoryshmovisioned <= $(IncreasedMemorySize))
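For context, here is roughly how those lines sit in a complete submit description. This is only a minimal sketch under my assumptions: the executable and log file names are placeholders, queue 15 just mirrors the 15 jobs in the example, and Memoryshmovisioned stands in for whatever provisioned-memory attribute the pool records.

# Hypothetical job using the retry-with-more-memory pattern.
executable          = my_job.sh
log                 = my_job.log
output              = my_job.out
error               = my_job.err

InitialMemorySize   = 2048
IncreasedMemorySize = 4096

# Ask for the small amount first; after a memory-related hold (HoldReasonCode 34),
# ask for the larger amount instead.
RequestMemory = ifthenelse(((LastHoldReasonCode != 34) || IsUndefined(Memoryshmovisioned)), $(InitialMemorySize), $(IncreasedMemorySize))

# Release the held job automatically, but only while the bigger request can still help.
periodic_release = (JobStatus == 5) && (HoldReasonCode == 34) && (Memoryshmovisioned <= $(IncreasedMemorySize))

queue 15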

--------------------------------
condor_q --autocluster output:

-- Schedd: fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo : <192.20.9.70:20123?... @ 05/29/22 10:04:31
   ID COUNT UNIVERSE CPUS MEMORY   DISK REQUIREMENTS
   87     0 Vanilla     1 [????]  15360 TARGET.HasDocker && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory) && (TARGET.HasFileTransfer)

dudu@fleetnetworks-sbmt01:~$
---------------------------------
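The [????] in the MEMORY column presumably shows up because RequestMemory is an expression rather than a literal, so the autocluster view has no single number to print. One quick way to see what the schedd actually holds is to print the attribute unevaluated; this is just a sketch, with the cluster id 2537 taken from the Negotiator log below:

# Show the unevaluated RequestMemory expression (and the hold attributes it
# references) for one job from the autocluster.
condor_q 2537.0 -af:r RequestMemory LastHoldReasonCode HoldReasonCode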

Negotiator log:

05/29/22 10:04:20   Negotiating with guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx at <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0>
05/29/22 10:04:20 0 seconds so far for this submitter
05/29/22 10:04:20 0 seconds so far for this schedd
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 1 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.39.26:9618?addrs=192.3.39.26-9618&alias=shmo-server75.gorgo&noUDP&sock=startd_6181_3af9> slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 2 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.29.56:9618?addrs=192.3.29.56-9618&alias=shmo-server1074.gorgo&noUDP&sock=startd_16567_2999> slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 3 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.32.131:9618?addrs=192.3.32.131-9618&alias=shmo-server95.gorgo&noUDP&sock=startd_1208_2a7e> slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 4 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.53.112:9618?addrs=192.3.53.112-9618&alias=shmo-server10.gorgo&noUDP&sock=startd_12323_21a9> slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 5 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.21.177:9618?addrs=192.3.21.177-9618&alias=shmo-server1086.gorgo&noUDP&sock=startd_24598_c2b0> slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 6 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.106.141:9618?addrs=192.3.106.141-9618&alias=GLUS171087.gorgo&noUDP&sock=startd_31737_854c> slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 7 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.30.44:9618?addrs=192.3.30.44-9618&alias=shmo-server152.gorgo&noUDP&sock=startd_432_e500> slot1@xxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 8 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.3.69:9618?addrs=192.3.3.69-9618&alias=NIR-SSD029.gorgo&noUDP&sock=startd_25279_deaf> slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 9 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.63.176:9618?addrs=192.3.63.176-9618&alias=NIR-069.gorgo&noUDP&sock=startd_6202_a276> slot1@xxxxxxxxxxxxx
05/29/22 10:04:20       Successfully matched with slot1@xxxxxxxxxxxxx
05/29/22 10:04:20     Request 02537.00000: autocluster 86 (request count 10 of 15)
05/29/22 10:04:20       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.59.148:9618?addrs=192.3.59.148-9618&alias=shmo-server1149.gorgo&noUDP&sock=startd_12655_db1a> slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:21       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:21     Request 02537.00000: autocluster 86 (request count 11 of 15)
05/29/22 10:04:21       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.64.122:9618?addrs=192.3.64.122-9618&alias=NIR-SSD017.gorgo&noUDP&sock=startd_27893_db3f> slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21       Successfully matched with slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21     Request 02537.00000: autocluster 86 (request count 12 of 15)
05/29/22 10:04:21       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.3.73:9618?addrs=192.3.3.73-9618&alias=NIR-SSD024.gorgo&noUDP&sock=startd_27636_9368> slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21       Successfully matched with slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21     Request 02537.00000: autocluster 86 (request count 13 of 15)
05/29/22 10:04:21       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.47.15:9618?addrs=192.3.47.15-9618&alias=shmo-server1168.gorgo&noUDP&sock=startd_13945_b046> slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:21       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx
05/29/22 10:04:21     Request 02537.00000: autocluster 86 (request count 14 of 15)
05/29/22 10:04:21       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.3.71:9618?addrs=192.3.3.71-9618&alias=NIR-SSD036.gorgo&noUDP&sock=startd_7313_b02e> slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21       Successfully matched with slot1@xxxxxxxxxxxxxxxx
05/29/22 10:04:21     Request 02537.00000: autocluster 86 (request count 15 of 15)
05/29/22 10:04:21       Matched 2537.0 guest.dudu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <192.20.9.70:20123?addrs=192.20.9.70-20123&alias=fleetnetworks-SBMT01.ORG.fleetnetworks.gorgo&noUDP&sock=schedd_2426173_74c0> preempting none <192.3.30.127:9618?addrs=192.3.30.127-9618&alias=shmo-server116.gorgo&noUDP&sock=startd_11345_4e09> slot1@xxxxxxxxxxxxxxxxxxxx



From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Dudu Handelman <duduhandelman@xxxxxxxxxxx>
Sent: 28 May 2022 11:08
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Increase memory on release
 
Thanks Jaime.
I will recreate this in the lab and provide the information.


Thanks 
David 


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Jaime Frey <jfrey@xxxxxxxxxxx>
Sent: Thursday, May 26, 2022, 03:41
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Increase memory on release

Can you tell us the expression you’re using, and the command and output with question marks?

 - Jaime

On May 20, 2022, at 1:09 AM, Dudu Handelman <duduhandelman@xxxxxxxxxxx> wrote:

Hi All. 
I'm trying to increase memory on failed jobs. 
It's actually working, using an if statement on the requested memory.
But the autocluster is unable to handle it.
When I look at the autocluster queue, it displays question marks in the memory column.

I personally think it's a very important feature. 

Thanks 
David