
Re: [Condor-users] How to solve problem between condor and globus?



I see several problems: staging, universe, and expression. I need to see the
submit file and the condor_config file. Perhaps then I can solve your problem.

Pedro

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Fu-Ming Tsai
Sent: Tuesday, 20 December 2005 11:06
To: Condor-Users Mail List
Subject: Re: [Condor-users] How to solve problem between condor and globus?

Sorry, all.
After trying so many times, I gave up and used NFS.
However, I still cannot submit a Globus job to Condor,
so I tried to get some debug information.

[sary357@osgc01 job]$ condor_q -analyze
---
4206.000:  Run analysis summary.  Of 4 machines,
      3 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      1 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job

WARNING: Analysis is only meaningful for Globus universe jobs using matchmaking.
---
4207.000:  Run analysis summary.  Of 4 machines,
      0 are rejected by your job's requirements
      3 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      1 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
        Last successful match: Tue Dec 20 09:45:24 2005
        Last failed match: Tue Dec 20 09:55:31 2005
        Reason for last match failure: no match found

== StarterLog.vm2==
12/20 17:35:36 Shadow version: $CondorVersion: 6.7.7 Apr 27 2005 $
12/20 17:35:36 Submitting machine is "osgc01.grid.sinica.edu.tw"
12/20 17:35:36 ShouldTransferFiles is "NO", NOT transfering files
12/20 17:35:36 Submit UidDomain: "grid.sinica.edu.tw"
12/20 17:35:36  Local UidDomain: "grid.sinica.edu.tw"
12/20 17:35:36 Initialized user_priv as "sary357"
12/20 17:35:36 Done moving to directory "/opt/osg/osgs01/execute/dir_6591"
12/20 17:35:36 JICShadow::initIOProxy(): Job does not define WantIOProxy
12/20 17:35:36 No StarterUserLog found in job ClassAd
12/20 17:35:36 Starter will not write a local UserLog
12/20 17:35:36 Starting a VANILLA universe job with ID: 4207.0
12/20 17:35:36 In OsProc::OsProc()
12/20 17:35:36 Main job KillSignal: 15 (SIGTERM)
12/20 17:35:36 Main job RmKillSignal: 15 (SIGTERM)
12/20 17:35:36 Main job HoldKillSignal: 15 (SIGTERM)
12/20 17:35:36 in VanillaProc::StartJob()
12/20 17:35:36 in OsProc::StartJob()
12/20 17:35:36 IWD: /home/sary357/gram_scratch_tUb21E3Wqv
12/20 17:35:36 Input file: /dev/null
12/20 17:35:36 Failed to open '/home/sary357/.globus/job/osgc01.grid.sinica.edu.tw/17186.1135070994/stdout' as standard output: No such file or directory (errno 2)
12/20 17:35:36 Failed to open '/home/sary357/.globus/job/osgc01.grid.sinica.edu.tw/17186.1135070994/stderr' as standard error: No such file or directory (errno 2)
12/20 17:35:36 Failed to open some/all of the std files...
12/20 17:35:36 Aborting OsProc::StartJob.
12/20 17:35:36 Failed to start job, exiting
12/20 17:35:36 ShutdownFast all jobs.
12/20 17:35:36 Got ShutdownFast when no jobs running.
12/20 17:35:36 Removing /opt/osg/osgs01/execute/dir_6591
12/20 17:35:36 Attempting to remove /opt/osg/osgs01/execute/dir_6591 as SuperUser (root)
=========================

[sary357@osgc01 job]$ condor_q -better-analyze 4206


-- Submitter: osgc01.grid.sinica.edu.tw : <140.109.98.41:41846> : osgc01.grid.sinica.edu.tw
---
4206.000:  Run analysis summary.  Of 4 machines,
      3 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      1 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job

The Requirements expression for your job is:

( ( target.Name == "vm2@xxxxxxxxxxxxxxxxxxxxxxxxx" ) )

    Condition                         Machines Matched    Suggestion
    ---------                         ----------------    ----------
1   ( ( target.Name == "vm2@xxxxxxxxxxxxxxxxxxxxxxxxx" ) )
                                      1

WARNING: Analysis is only meaningful for Globus universe jobs using matchmaking.
[sary357@osgc01 job]$ condor_q -better-analyze 4207


-- Submitter: osgc01.grid.sinica.edu.tw : <140.109.98.41:41846> : osgc01.grid.sinica.edu.tw
Segmentation fault


I'm sure the FileDomain on those 2 machines is the same.
It looks like the output file and error file cannot be created. Does anyone
know why?
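
To narrow this down, here is a minimal sketch (not a Condor tool) of a check
that could be run on the execute machine. It pulls the stdout/stderr paths
out of the "Failed to open" lines in the StarterLog and reports whether the
directory that should contain each file exists on that machine. The function
names are hypothetical, and it assumes the log file on disk has each message
on a single line, in the format shown above.

```python
import os
import re

# Matches StarterLog lines of the form seen above, e.g.
#   Failed to open '/some/path/stdout' as standard output: No such file ...
# Assumption: each message is on one line in the log file on disk.
FAILED_OPEN = re.compile(r"Failed to open '([^']+)' as standard (output|error)")


def failed_std_files(log_text):
    """Return (stream, path) pairs for every failed stdout/stderr open."""
    return [(m.group(2), m.group(1)) for m in FAILED_OPEN.finditer(log_text)]


def describe(path):
    """Report whether the directory that should hold `path` exists here."""
    parent = os.path.dirname(path)
    if os.path.isdir(parent):
        return "parent directory exists: " + parent
    return "parent directory missing: " + parent


def check_log(log_path):
    """Print one line per failed open found in the given StarterLog file."""
    with open(log_path) as fh:
        for stream, path in failed_std_files(fh.read()):
            print(stream, path, "->", describe(path))
```

Calling check_log() with the path to the StarterLog.vm2 file on the execute
machine would show whether the .globus/job/... directory is visible there.
If the parent directory is reported missing, the errno 2 above would be
explained by the directory not being reachable from that machine rather
than by the files themselves.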

BR

----------------------------------------------------------------------
"Gravitation is not responsible for people falling in love." 

Fu-Ming Tsai
Academia Sinica Computing Centre, Academia Sinica
sary357@xxxxxxxxxxxxxxxxxx
------------------------------------------------------------------------

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users