
Re: [HTCondor-users] Python API submission of DAGs



On 18/12/2013 10:47, Brian Candler wrote:
> I am starting to conclude the Python API is (a) not sufficiently documented, and (b) different from the built-in tools in subtle and confusing ways (e.g. the Requirements hack is not required when you use condor_submit to submit a scheduler universe job).

I'm just trying to chase down the source of this inconsistency when submitting jobs to the scheduler universe.

1. Submission the manual way

condor_submit_dag -no_submit foo.dag
condor_submit foo.dag.condor.sub

gives the following in the job classAd:

Requirements = ( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory )
ShouldTransferFiles = "IF_NEEDED"


2. Submission via the python API:

The Requirements expression varies depending on the setting of ShouldTransferFiles, which defaults to "YES" if you don't set it explicitly. With the default you get:

ShouldTransferFiles = "YES"
Requirements = true && TARGET.OPSYS == "LINUX" && TARGET.ARCH == "X86_64" && TARGET.HasFileTransfer && TARGET.Disk >= RequestDisk && TARGET.Memory >= RequestMemory

[Notice the different capitalisation in the Requirements expression compared to case (1).]

If you set it to "IF_NEEDED" then you get:

Requirements = true && TARGET.OPSYS == "LINUX" && TARGET.ARCH == "X86_64" && ( TARGET.HasFileTransfer || ( TARGET.FileSystemDomain == MY.FileSystemDomain ) ) && TARGET.Disk >= RequestDisk && TARGET.Memory >= RequestMemory

If you set it to NO then you get:

Requirements = true && TARGET.OPSYS == "LINUX" && TARGET.ARCH == "X86_64" && TARGET.FileSystemDomain == MY.FileSystemDomain && TARGET.Disk >= RequestDisk && TARGET.Memory >= RequestMemory


Now, case (1), with its "OpSys" capitalisation, clearly comes from src/condor_submit.V6/submit.cpp.

There is no check on the filesystem domain. This is because check_requirements() doesn't add any file transfer requirements if mightTransfer(JobUniverse) is false, which is the case for scheduler universe.
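The effect of that gate can be sketched in pure Python (an illustration of the behaviour described above; the universe names, clause strings, and function names are mine, not HTCondor's actual symbols):

```python
# Sketch: file-transfer clauses are only appended for universes that
# might transfer files, mirroring mightTransfer() in submit.cpp.
# Universe names and clause strings are illustrative, not HTCondor enums.
TRANSFER_UNIVERSES = {"vanilla", "java", "vm", "parallel"}

def might_transfer(universe):
    return universe in TRANSFER_UNIVERSES

def build_requirements(base, universe, stf="IF_NEEDED"):
    clauses = [base]
    if might_transfer(universe):
        if stf == "YES":
            clauses.append("TARGET.HasFileTransfer")
        elif stf == "NO":
            clauses.append("TARGET.FileSystemDomain == MY.FileSystemDomain")
        else:  # IF_NEEDED
            clauses.append("(TARGET.HasFileTransfer || "
                           "(TARGET.FileSystemDomain == MY.FileSystemDomain))")
    return " && ".join(clauses)

# Scheduler universe: no file-transfer clause is added.
print(build_requirements('TARGET.OpSys == "LINUX"', "scheduler"))
```

With "vanilla" the IF_NEEDED clause gets appended; with "scheduler" the base expression is returned untouched, which is what condor_submit produces and what the bindings currently fail to do.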

Also, ShouldTransferFiles explicitly defaults to IF_NEEDED:

        if (!should) {
                should = "IF_NEEDED";
                should_transfer = STF_IF_NEEDED;
                default_should = true;
        }


Case (2) is harder to track down.

I *think* defaulting to ShouldTransferFiles = "YES" is in condor_utils/classad_helpers.cpp

        job_ad->Assign( ATTR_SHOULD_TRANSFER_FILES,
                                        getShouldTransferFilesString( STF_YES ) );

and this is also where we get Requirements = true

        job_ad->Assign( ATTR_REQUIREMENTS, true );
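So the two code paths even disagree on the default: condor_submit falls back to IF_NEEDED, while the helper that seeds the ad for the bindings assigns YES. Restating the two defaults in pure Python (function names are mine, not the real symbols):

```python
# Illustration of the two default paths for ShouldTransferFiles
# described above. Names are illustrative, not HTCondor's symbols.

def default_stf_condor_submit(should=None):
    # submit.cpp: if the submit file doesn't say, fall back to IF_NEEDED
    return should if should else "IF_NEEDED"

def default_stf_classad_helpers(should=None):
    # classad_helpers.cpp: the freshly built ad is assigned STF_YES
    return should if should else "YES"

assert default_stf_condor_submit() == "IF_NEEDED"
assert default_stf_classad_helpers() == "YES"
```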

As for appending the requirements clauses, I think this actually happens within the python bindings themselves, in python-bindings/schedd.cpp, make_requirements():

make_requirements(ExprTree *reqs, ShouldTransferFiles_t stf)
{
    // Copied ideas from condor_submit.  Pretty lame.
...
    ADD_PARAM(OPSYS);
    ADD_PARAM(ARCH);
    switch (stf)
    {
    case STF_NO:
        ADD_REQUIREMENT(FILE_SYSTEM_DOMAIN, "TARGET." ATTR_FILE_SYSTEM_DOMAIN " == MY." ATTR_FILE_SYSTEM_DOMAIN);
        break;
    case STF_YES:
        ADD_REQUIREMENT(HAS_FILE_TRANSFER, "TARGET." ATTR_HAS_FILE_TRANSFER);
        break;
    case STF_IF_NEEDED:
        ADD_REQUIREMENT(HAS_FILE_TRANSFER, "(TARGET." ATTR_HAS_FILE_TRANSFER " || (TARGET." ATTR_FILE_SYSTEM_DOMAIN " == MY." ATTR_FILE_SYSTEM_DOMAIN "))");
        break;
    }
    ADD_REQUIREMENT(REQUEST_DISK, "TARGET.Disk >= " ATTR_REQUEST_DISK);
    ADD_REQUIREMENT(REQUEST_MEMORY, "TARGET.Memory >= " ATTR_REQUEST_MEMORY);
    return result;
}

So I would argue there are a couple of bugs here:

1. The ShouldTransferFiles-derived requirements should only be added for certain universes (as mightTransfer() does in submit.cpp).
2. The existing Requirements expression should be wrapped in parentheses before further clauses are appended to it.
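Bug (2) matters because ClassAd expressions follow C-like precedence: && binds tighter than ||. If the existing Requirements expression contains a top-level ||, textually appending "&& clause" changes its meaning. A pure-Python illustration of the precedence trap (not the bindings themselves):

```python
# Illustration only: appending "&& C" to an unparenthesised expression
# changes its meaning when the original contains "||", because &&
# binds tighter than || (ClassAd precedence matches C here).

def appended_naive(a, b, c):
    # Original expression "a || b" with " && c" appended textually
    # parses as:  a || (b && c)
    return a or (b and c)

def appended_paren(a, b, c):
    # What was intended:  (a || b) && c
    return (a or b) and c

# With a=True, c=False the two forms disagree:
assert appended_naive(True, False, False) is True
assert appended_paren(True, False, False) is False
```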

Regards,

Brian.