
Re: [Condor-users] Run Jobs in two architectures



Edier,

That makes it clearer. In my case I have a Windows-based Condor pool with a wide range of operating systems at my disposal (Windows XP, Windows 2003, Windows Vista, and Windows 7), so my submit description file has something like the following as requirements:

requirements = Opsys == "WINNT51" || Opsys == "WINNT52" || Opsys == "WINNT60" || Opsys == "WINNT61"

The best way to obtain the exact Arch and OpSys values from your worker nodes is to issue the following command: condor_status -long worker_machinename. See below for a sample of what you can expect to see.

I hope this helps.

 

Good luck and regards.

 

Alex Alas

 

MyType = "Machine"

TargetType = "Job"

Name = "slot2@xxxxxxxxxxxxxxxxxxx"

Rank = 0

CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)

SlotWeight = Cpus

Unhibernate = MY.MachineLastMatchTime =!= UNDEFINED

MyCurrentTime = 1287071151

Machine = "MACHINE1.domain.com"

PublicNetworkIpAddr = "<xxx.xxx.xxx.xxx:64947>"

COLLECTOR_HOST_STRING = "Condor_host.domain.com"

CondorVersion = "$CondorVersion: 7.4.2 Mar 30 2010 BuildID: 227044 $"

CondorPlatform = "$CondorPlatform: INTEL-WINNT50 $"

SlotID = 2

VirtualMachineID = 2

VirtualMemory = 912100

TotalDisk = 51321764

Disk = 25660882

CondorLoadAvg = 0.000000

LoadAvg = 0.000000

KeyboardIdle = 1201380

ConsoleIdle = 1201380

Memory = 511

Cpus = 1

StartdIpAddr = "<xxx.xxx.xxx.xxx:64947>"

Arch = "INTEL"

OpSys = "WINNT60"

UidDomain = "domain.com"

FileSystemDomain = "Machine1.domain.com"

HasIOProxy = TRUE

CheckpointPlatform = "WINNT60 INTEL Unknown normal"

WindowsMajorVersion = 6

WindowsMinorVersion = 0

WindowsBuildNumber = 6002

WindowsServicePackMajorVersion = 2

WindowsServicePackMinorVersion = 0

WindowsProductType = 1

TotalVirtualMemory = 1824200

TotalCpus = 2

TotalMemory = 1022

KFlops = 612062

Mips = 3345

LocalCredd = "Cred_host.domain.com"

LastBenchmark = 1287058246

TotalLoadAvg = 0.010000

TotalCondorLoadAvg = 0.000000

ClockMin = 705

ClockDay = 4

TotalSlots = 2

TotalVirtualMachines = 2

HasFileTransfer = TRUE

HasPerFileEncryption = TRUE

HasReconnect = TRUE

HasMPI = TRUE

HasTDP = TRUE

HasJobDeferral = TRUE

HasJICLocalConfig = TRUE

HasJICLocalStdin = TRUE

HasWindowsRunAsOwner = TRUE

StarterAbilityList = "HasFileTransfer,HasPerFileEncryption,HasReconnect,HasMPI,HasTDP,HasJobDeferral,HasJICLocalConfig,HasJICLocalStdin,HasVM,HasWindowsRunAsOwner"

HasVM = FALSE

HibernationLevel = 0

HibernationState = "NONE"

HibernationSupportedStates = ""

CanHibernate = FALSE

HardwareAddress = "00:0B:CD:CE:2B:F0"

SubnetMask = "255.255.255.0"

IsWakeOnLanSupported = TRUE

IsWakeOnLanEnabled = TRUE

IsWakeAble = TRUE

WakeOnLanSupportedFlags = "Magic Packet"

WakeOnLanEnabledFlags = "Magic Packet"

CpuBusyTime = 0

CpuIsBusy = FALSE

TimeToLive = 2147483647

State = "Unclaimed"

EnteredCurrentState = 1287009787

Activity = "Idle"

EnteredCurrentActivity = 1287058246

TotalTimeOwnerIdle = 75

TotalTimeUnclaimedIdle = 824420

TotalTimeUnclaimedBenchmarking = 158

TotalTimeMatchedIdle = 1144

TotalTimeClaimedIdle = 25340

TotalTimeClaimedBusy = 350151

TotalTimePreemptingVacating = 47

TotalTimePreemptingKilling = 60

Start = TRUE

Requirements = (START) && (IsValidCheckpointPlatform)

IsValidCheckpointPlatform = (((TARGET.JobUniverse == 1) == FALSE) || ((MY.CheckpointPlatform =!= UNDEFINED) && ((TARGET.LastCheckpointPlatform =?= MY.CheckpointPlatform) || (TARGET.NumCkpts == 0))))

MaxJobRetirementTime = 0

LastFetchWorkSpawned = 0

LastFetchWorkCompleted = 0

NextFetchWorkDelay = -1

CurrentRank = 0.000000

MonitorSelfTime = 1287071116

MonitorSelfCPUUsage = 0.000000

MonitorSelfImageSize = 82108.000000

MonitorSelfResidentSetSize = 19668

MonitorSelfAge = 1201362

MonitorSelfRegisteredSocketCount = 1

DaemonStartTime = 1286300421

UpdateSequenceNumber = 3571

MyAddress = "<xxx.xxx.xxx.xxx:64947>"

LastHeardFrom = 1287071152

UpdatesTotal = 2838

UpdatesSequenced = 2837

UpdatesLost = 3

UpdatesHistory = "0x00000000000000000000000000000000"
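
A dump like the one above can be reduced to just the attributes of interest with a few lines of script. The following is only a minimal sketch, assuming the output of condor_status -long has already been captured as text; the function name parse_classad is made up for this illustration and is not part of Condor.

```python
def parse_classad(text):
    """Parse 'Name = value' lines (as printed by condor_status -long) into a dict.

    Hypothetical helper for illustration only. Values are kept as strings;
    surrounding double quotes are removed from plain string values.
    """
    ad = {}
    for line in text.splitlines():
        # Split on the first " = " only, so operators such as "=!=" that
        # appear later in an expression are left intact.
        name, sep, value = line.partition(" = ")
        if not sep:
            continue  # blank or unrecognised line
        value = value.strip()
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        ad[name.strip()] = value
    return ad


# A few lines taken from the sample machine ad above.
sample = '''
Machine = "MACHINE1.domain.com"
Arch = "INTEL"
OpSys = "WINNT60"
Cpus = 1
'''

ad = parse_classad(sample)
print(ad["Arch"], ad["OpSys"])  # prints: INTEL WINNT60
```

The Arch and OpSys strings obtained this way are the ones to use in the requirements line of the submit file.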

 

 

 

 

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Edier Alberto Zapata Hernández
Sent: Thursday, October 14, 2010 11:16 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Run Jobs in two architectures

 

Thanks Alex,

 Maybe I was not clear. I want the same task to run on both architectures at the same time. That is, I submit the following submit file from the submit node (an X86_64 machine):

## submitFile##

Universe = Vanilla
should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT
transfer_input_files = est2genome.pl

transfer_executable = False
Executable = /opt/test/bin/exonerate

 

Arguments = --model est2genome --bestn 1 --query 001_seqs --target /home/condor/Manihot_v1.n50.merMasked.repeat.masked.fa
Input = 001_seqs
Log = Log-001_seqs.txt
Error = Err-001_seqs.txt
Output = Out-001_seqs.txt
=?= 0
Queue

Arguments = --model est2genome --bestn 1 --query 002_seqs --target /home/condor/Manihot_v1.n50.merMasked.repeat.masked.fa
Input = 002_seqs
Log = Log-002_seqs.txt
Error = Err-002_seqs.txt
Output = Out-002_seqs.txt
=?= 0
Queue

...

Arguments = --model est2genome --bestn 1 --query 999_seqs --target /home/condor/Manihot_v1.n50.merMasked.repeat.masked.fa
Input = 999_seqs
Log = Log-999_seqs.txt
Error = Err-999_seqs.txt
Output = Out-999_seqs.txt
=?= 0
Queue
##############

Condor will add Arch == "X86_64" and OpSys == "LINUX" to the jobs in the cluster, so the jobs will run only on the X86_64 nodes, and the PCs running at 32 bits (Arch == "INTEL") will be omitted. That is what I don't want; I want the PCs to process the jobs too.
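
One way to avoid the default architecture clause is to name Arch explicitly in the requirements: as far as I know, condor_submit only appends its own Arch clause when the requirements expression does not already reference Arch. The sketch below generates a submit file of the shape shown above with such a line. It is an illustration under that assumption, not a tested recipe for this pool; the helper names any_of and make_submit_file are made up, and the executable is assumed to be installed at the same path on every node.

```python
def any_of(attr, values):
    """Build a ClassAd clause that is true when attr equals any of the values."""
    return "(" + " || ".join('%s == "%s"' % (attr, v) for v in values) + ")"


# Settings shared by every job, copied from the submit file above,
# plus an explicit Requirements line.
HEADER = """\
Universe = Vanilla
should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT
transfer_input_files = est2genome.pl
transfer_executable = False
Executable = /opt/test/bin/exonerate
Requirements = {requirements}
"""

# Per-job stanza; {name} is the sequence file, e.g. 001_seqs.
JOB = """
Arguments = --model est2genome --bestn 1 --query {name} --target {target}
Input = {name}
Log = Log-{name}.txt
Error = Err-{name}.txt
Output = Out-{name}.txt
Queue
"""


def make_submit_file(first, last, target, archs):
    """Return submit-file text for jobs first..last, allowed on any of archs."""
    parts = [HEADER.format(requirements=any_of("Arch", archs))]
    for i in range(first, last + 1):
        parts.append(JOB.format(name="%03d_seqs" % i, target=target))
    return "".join(parts)


text = make_submit_file(
    1, 3,
    "/home/condor/Manihot_v1.n50.merMasked.repeat.masked.fa",
    ["X86_64", "INTEL"],
)
print(text)
```

Calling make_submit_file(1, 999, ...) would cover the full range of input files. Whether the jobs then match the 32-bit PCs can be checked with condor_q -better-analyze.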

 

Thank you.

On Thu, Oct 14, 2010 at 9:39 AM, Alas, Alex [FEDI] <aalas@domain.com> wrote:

Edier,

In your case I would start by tracing the cause: run condor_q -better-analyze and see what comes up. If the match is failing for some reason, this command can give you a good idea of where the problem lies.

Alex

 

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Edier Alberto Zapata Hernández
Sent: Thursday, October 14, 2010 9:25 AM
To: condor-users
Subject: [Condor-users] Run Jobs in two architectures

 

Good morning

 

 I have a heterogeneous cluster running Condor:

 2 DualQuadCore Servers (X86_64)

 3 PCs (x86)
All of them are execute nodes. I have the same programs installed on all of them, and I want a submitted job to be able to run on any of them (but Condor "requires" the same architecture on the submit and execute nodes). I have disabled the transfer of the executable to the nodes:

 transfer_executable = False

and I thought of adding a Requirements line like this to the submit file:

      requirements = Arch=="X86_64" || Arch=="INTEL"

 But that only means that the job requires one of those architectures, not that it can run there. What line should I add to the submit file to allow the task to run on both architectures at the same time?

 

Thank you.


----
Edier Alberto Zapata Hernández
Est. Ingeniería de Sistemas
Universidad de Valle





