
[HTCondor-users] Advice on configuring job routing instead of flocking between condor clusters



Hello Experts,

I am looking for an alternative to flocking for sending jobs from an on-premises Condor pool to a cloud Condor pool. Both pools only support vanilla and DAG jobs. The first goal is to get vanilla jobs working.

Currently we are using flocking, but we are hitting an issue where jobs stay in the idle state for a long time before getting scheduled. The issue is described in [1].

Hence I thought of exploring the job router. While reading [2] I realized that I need to customize both the on-premises and the cloud Condor clusters. On the cloud cluster I need to run HTCondor-C, which probably requires the configuration in [3]. The on-premises cluster configuration should look like this:
JOB_ROUTER_ROUTE_CondorSite @=rt
  MaxIdleJobs = 20
  GridResource = "condor cloud_submit.example.com cloud_master.example.com"
  SET remote_jobuniverse = 5
@rt
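
My reading of [2] is that the route also has to be switched on with a few surrounding knobs on the on-premises submit host. The fragment below is only a sketch of that reading, not a tested setup: the WantJobRouter attribute is an opt-in name I am assuming jobs would set in their submit files, and I am assuming the job router runs on the same host as the existing schedd.

```
# Assumption: the job router runs next to the existing schedd on this host
DAEMON_LIST = $(DAEMON_LIST) JOB_ROUTER

# Activate the route defined above by its name
JOB_ROUTER_ROUTE_NAMES = CondorSite

# Only consider vanilla jobs (universe 5) that opt in.
# WantJobRouter is a custom job attribute (assumed name), set in the
# submit file with: +WantJobRouter = True
JOB_ROUTER_SOURCE_JOB_CONSTRAINT = (JobUniverse == 5) && (WantJobRouter =?= True)
```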

Apart from this configuration, do I need anything else in place to make it work? Also, is using the job router instead of flocking a good idea?

[1] https://lists.cs.wisc.edu/archive/htcondor-users/2020-April/msg00051.shtml
[2] https://htcondor.readthedocs.io/en/latest/grid-computing/job-router.html
[3] https://research.cs.wisc.edu/htcondor/manual/v7.9/5_3Grid_Universe.html#SECTION00631100000000000000


Thanks & Regards,
Vikrant Aggarwal