[HTCondor-users] DAG jobs with data affinity

Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab

Date: Mon, 29 Apr 2013 11:14:31 +0300

From: "Edward Aronovich" <eddiea@xxxxxxxxx>

Subject: [HTCondor-users] DAG jobs with data affinity

Hi,

I want to run a batch of jobs whose input data overlaps. Each job uses 10 input files (out of thousands), and for each job there are 9 other jobs that each use 9 of those 10 files plus one additional file.

Since these files are large, data movement takes most of the time (roughly as long as the processing itself), so we would like to minimize data transfer.
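To make the overlap concrete, here is a minimal Python sketch of the transfer counts involved. It is only an illustration under assumptions: the file names (f0..f9, extra1..extra9), the job names, and the idea of a single node that keeps every file it has already fetched are all hypothetical, and nothing here is an HTCondor feature.

```python
from typing import Dict, List, Set


def transfers_without_reuse(jobs: Dict[str, Set[str]]) -> int:
    """Files moved if every job fetches all of its inputs from scratch."""
    return sum(len(files) for files in jobs.values())


def transfers_with_reuse(jobs: Dict[str, Set[str]], order: List[str]) -> int:
    """Files moved if the jobs run in `order` on one node that keeps
    every file it has already fetched (hypothetical unlimited cache)."""
    on_node: Set[str] = set()
    fetched = 0
    for name in order:
        missing = jobs[name] - on_node  # only files not yet on the node
        fetched += len(missing)
        on_node |= jobs[name]
    return fetched


# Hypothetical instance of the pattern: one job uses f0..f9, and nine
# related jobs each use 9 of those 10 files plus one additional file.
base = {f"f{i}" for i in range(10)}
jobs: Dict[str, Set[str]] = {"job0": base}
for k in range(1, 10):
    jobs[f"job{k}"] = (base - {f"f{k}"}) | {f"extra{k}"}

naive = transfers_without_reuse(jobs)
colocated = transfers_with_reuse(jobs, sorted(jobs))
print(naive, colocated)
```

Under these assumptions, running the ten related jobs where the shared files already reside moves far fewer files than fetching every input for every job, which is the saving I am after.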

There is a notion of gang scheduling that deals with CPU affinity, but I could not find a similar solution for data affinity.

Any suggestions?

Thanks,

Eddie
