
Re: [HTCondor-users] max_materialize not working with HTC 8.9.3 (failed to create ClassAd for Job)



Mixing versions may or may not work because submit and the Schedd need to agree on what goes into the submit digest, and that is still in flux.   

 

Run

 

condor_q -factory -wide <jobid>

 

Then have a look at the file listed under the DIGEST column. I think you will see that it has no executable.
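
The check on the digest file could also be scripted. The following is only a minimal sketch: it assumes the digest is plain text made of submit-language `keyword = value` lines, and the helper names `digest_keys` and `has_executable` are hypothetical, not part of HTCondor.

```python
def digest_keys(digest_text):
    """Return the set of lower-cased keywords assigned in a submit digest.

    Assumption: the digest uses submit-language "keyword = value" lines.
    """
    keys = set()
    for raw in digest_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment (e.g. "queue 1000").
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Submit keywords are case-insensitive, so normalize to lower case.
        keys.add(line.split("=", 1)[0].strip().lower())
    return keys


def has_executable(digest_text):
    """True if the digest text assigns the 'executable' keyword."""
    return "executable" in digest_keys(digest_text)
```

To use it, read the file named in the DIGEST column and pass its contents to `has_executable`. A result of `False` would match the "No 'executable' parameter was provided" message in the job log.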

 

I think it was early in the 8.9 series that the form of the submit digest changed so that it no longer includes keywords that are constant for the submission. This fixes a host of bugs, but it also means that the 8.9 series cannot do late materialization submits into older Schedds.

 

In general, you should assume that late materialization requires condor_submit and the schedd to be the same version for now.
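
A pre-submit guard for this rule might look like the sketch below. It assumes both sides report a version banner containing `$CondorVersion: X.Y.Z ...`, the form printed by condor_version; how you obtain the schedd's banner is left open, and the function names are hypothetical.

```python
import re


def parse_condor_version(banner):
    """Extract (major, minor, patch) from a version banner, or None if absent.

    Assumption: the banner contains "$CondorVersion: X.Y.Z ...".
    """
    m = re.search(r"\$CondorVersion:\s*(\d+)\.(\d+)\.(\d+)", banner)
    return tuple(int(g) for g in m.groups()) if m else None


def same_version(submit_banner, schedd_banner):
    """True only if both banners parse and report the identical version."""
    a = parse_condor_version(submit_banner)
    b = parse_condor_version(schedd_banner)
    return a is not None and a == b
```

If `same_version` returns `False`, the conservative choice is to submit without max_materialize, or to submit from a node whose condor_submit matches the schedd.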

 

-tj 

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of ervikrant06@xxxxxxxxx
Sent: Wednesday, May 20, 2020 3:47 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] max_materialize not working with HTC 8.9.3 (failed to create ClassAd for Job)

 

Don't have 8.9.3 submit atm. 

 

Yesterday I tried this with version 8.8.5 and it worked fine: it did not allow more than 50 idle jobs in the queue.

 

materialize_max_idle = 50

 


Thanks & Regards,

Vikrant Aggarwal

 

 

On Wed, May 20, 2020 at 1:51 PM Beyer, Christoph <christoph.beyer@xxxxxxx> wrote:

Hi,

I tried something like:

max_materialize = 100
queue 1000

This works fine using a remote submit with 8.7.1 on the submitting node, but fails with 8.9.3.

In the log file I see:


035 (6200261.-01.000) 05/20 10:12:18 Factory submitted from host: <131.169.223.39:9618?addrs=131.169.223.39-9618&noUDP&sock=schedd_3371_cdd8_145>
...
037 (6200261.-01.000) 05/20 10:12:18 Job Materialization Paused
        failed to create ClassAd for Job 6200261.0 : Submit:-1:No 'executable' parameter was provided

        PauseCode 1
...

The submit file is exactly the same in both cases, so the 'executable' is definitely not missing ;)

I also noticed that with the older HTCondor version the scheduler becomes unresponsive to condor_q requests while it processes the max_materialize job, but I guess that is a different issue.

I did not find anything about late materialization in the docs. Is that on purpose, or is it on the to-do list?

Best
christoph

--
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx