
Re: [Condor-users] BOINC running, all machine Owner




Hi

I compared the timestamps in the StartLog and the StarterLog.boinc (see below) and found that the boinc job was evicted because MachineBusy became true, which in turn was caused by CpuBusy.

I tried with EVICT_BACKFILL = FALSE, and I can see that when boinc is running in backfill, LoadAvg = 1.0000 and CondorLoadAvg = 0.00.
So CpuBusy is true (1 - 0 >= 0.5): apparently the load generated by boinc is not counted as Condor load.
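
To make the arithmetic explicit, here is a small Python sketch of how I understand the expressions. It assumes the policy macros are defined as NonCondorLoadAvg = LoadAvg - CondorLoadAvg, CpuBusy = (NonCondorLoadAvg >= HighLoad) with HighLoad = 0.5, and MachineBusy = (CpuBusy || KeyboardBusy). The function names are only illustrative; this is not Condor code, and the values in a given condor_config may differ.

```python
# Illustrative model of the load-based policy macros (assumed definitions,
# not taken from Condor source code).
HIGH_LOAD = 0.5  # assumed value of the HighLoad macro


def non_condor_load(load_avg, condor_load_avg):
    """Load on the machine that Condor does not attribute to its own jobs."""
    return load_avg - condor_load_avg


def cpu_busy(load_avg, condor_load_avg, high_load=HIGH_LOAD):
    """Assumed CpuBusy expression: non-Condor load at or above HighLoad."""
    return non_condor_load(load_avg, condor_load_avg) >= high_load


def machine_busy(cpu_is_busy, keyboard_is_busy):
    """Assumed MachineBusy expression: CpuBusy || KeyboardBusy."""
    return cpu_is_busy or keyboard_is_busy


# Values observed while boinc runs in backfill: LoadAvg = 1.0, CondorLoadAvg = 0.0
print(cpu_busy(1.0, 0.0))

# Hypothetical case where the boinc load were counted in CondorLoadAvg
print(cpu_busy(1.0, 1.0))
```

With the observed values the first call gives True, so any EVICT_BACKFILL expression built on MachineBusy fires even though nobody is using the machine.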

I think I have to find a policy for EVICT_BACKFILL
that doesn't use CpuBusy. Any ideas?

Manu



4/7 15:45:23 State change: START_BACKFILL is TRUE
4/7 15:45:23 Changing state: Unclaimed -> Backfill
4/7 15:45:28 State change: BOINC client running for vm1
4/7 15:45:28 Changing activity: Idle -> Busy
4/7 15:46:13 State change: EVICT_BACKFILL is TRUE
4/7 15:46:13 Changing activity: Busy -> Killing
4/7 15:46:14 BOINC client (pid 22443) exited with status 0
4/7 15:46:14 State change: starter exited
4/7 15:46:14 Changing state and activity: Backfill/Killing -> Owner/Idle
4/7 15:46:14 State change: IS_OWNER is false
4/7 15:46:14 Changing state: Owner -> Unclaimed
4/7 15:46:14 State change: IS_OWNER is TRUE


4/7 15:45:28 About to exec /usr/local/BOINC/boinc condor_exec.exe -dir /usr/local/BOINC
4/7 15:45:28 Create_Process succeeded, pid=22444
4/7 15:46:13 Got SIGQUIT.  Performing fast shutdown.
4/7 15:46:13 ShutdownFast all jobs.
4/7 15:46:13 Process exited, pid=22444, signal=9





On Thu, 30 Mar 2006, Emmanuel Le Guirriec wrote:



Hi

I have now put the binary file in place, but it still doesn't work.
The starter gets a SIGQUIT about 45 seconds after boinc has started.

The boinc.out seems to be right (compared to the one I get when boinc
is started by hand).
2006-03-30 10:18:26 [---] Starting BOINC client version 5.2.13 for i686-pc-linux-gnu
2006-03-30 10:18:26 [---] libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3
2006-03-30 10:18:26 [---] Data directory: /usr/local/BOINC
2006-03-30 10:18:26 [---] Processor: 1 AuthenticAMD AMD Athlon(tm) Processor
2006-03-30 10:18:26 [---] Memory: 250.39 MB physical, 619.66 MB virtual
2006-03-30 10:18:26 [---] Disk: 5.68 GB total, 1.13 GB free
2006-03-30 10:18:26 [Einstein@Home] Computer ID: 490504; location: ; project prefs: default
2006-03-30 10:18:26 [---] No general preferences found - using BOINC defaults
2006-03-30 10:18:26 [---] Remote control not allowed; using loopback address
2006-03-30 10:18:26 [Einstein@Home] Resuming computation for result r1_1436.0__1970_S4R2a_2 using albert version 440


I have set BOINC_Arguments to -dir $(HOME_BOINC), but the result is the same as without it.
The exact log for the boinc starter is:

3/30 10:18:26 Using config file: /home/prof/condor/condor_config
3/30 10:18:26 Using local config files: /home/prof/condor/hosts/strauss/condor_config.local
3/30 10:18:26 DaemonCore: Command Socket at <192.168.45.110:39205>
3/30 10:18:26 Done setting resource limits
3/30 10:18:26 Starter running a local job with no shadow
3/30 10:18:26 Getting job ClassAd from config file with keyword: "boinc"
3/30 10:18:26 "boinc_proc" not found in config file
3/30 10:18:26 Starting a VANILLA universe job with ID: 1.0
3/30 10:18:26 IWD: /usr/local/BOINC/
3/30 10:18:26 Output file: /usr/local/BOINC//boinc.out
3/30 10:18:26 Error file: /usr/local/BOINC//boinc.err
3/30 10:18:26 About to exec /usr/local/BOINC//boinc condor_exec.exe -dir /usr/local/BOINC/
3/30 10:18:26 Create_Process succeeded, pid=12997
3/30 10:19:11 Got SIGQUIT.  Performing fast shutdown.
3/30 10:19:11 ShutdownFast all jobs.
3/30 10:19:11 Process exited, pid=12997, signal=9
3/30 10:19:11 All jobs have exited... starter exiting
3/30 10:19:11 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0


Manu


On Tue, 28 Mar 2006, Derek Wright wrote:


On Tue, 28 Mar 2006 21:22:28 +0200 (CEST)  Emmanuel Le Guirriec wrote:

StarterLog.boinc
...
3/28 18:02:56 Create_Process: child failed with errno 8 (Exec format error) before exec()

there's your problem.  /usr/local/BOINC/run_client isn't the right
kind of binary for this machine, or doesn't exist, or something.  you
need to install a working copy of the "boinc_client" program in that
directory (if it doesn't already exist), and set BOINC_Executable to
point to that.

from the condor manual:
http://www.cs.wisc.edu/condor/manual/v6.7.18/3_13Setting_Up.html#SECTION004138500000000000000
----------
Required settings:

BOINC_Executable
   The full path to the boinc_client binary to use.
----------

notice, the docs say "... path to the boinc_client binary", not
"run_client" script. ;)


good luck,
-derek







--
Emmanuel Le Guirriec
Ingenieur de Recherche Calcul Scientifique CNRS
UMR6628-MAPMO
Federation Denis Poisson
Universite d'Orleans
BP 6759
45067 Orleans Cedex 2
tel	02.38.49.46.69 / 48.50