
Re: [Condor-users] gridmonitor



On Jan 3, 2006, at 11:56 AM, Marco Mambelli wrote:

Below are the details of the submission, which all seems OK.
But I found these errors in the GridmanagerLog.marco (they seem to be the
same with all servers):
1/3 11:44:47 [2463] Deleting job 587.0 from schedd
1/3 11:44:47 [2463] Schedd connection error! Will retry

This is unrelated to the gridmonitor. If it only happens sporadically, don't worry about it.

On the submit host the gridmonitor seems to be enabled (and streaming is off):
$ condor_submit csub.sub
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 2959.
$ condor_config_val ENABLE_GRID_MONITOR
TRUE
...
On the server there are plenty of globus-job-managers (they are mine,
start times correspond):
...
Any idea of what happened?

Two things to check:

1) Is the gridmonitor running successfully on the server? Run 'ps auwx' there and look for processes named grid_manager_monitor_agent.

2) Set 'GRIDMANAGER_DEBUG = D_FULLDEBUG' in your Condor config file and look for lines containing 'grid_monitor' in the gridmanager log.
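
The two checks above can be sketched as a pair of small shell helpers. This is only a rough sketch: the function names monitor_agent_running and grid_monitor_lines are made up for illustration, and the location of the gridmanager log is site-specific, so the path is passed in as an argument rather than assumed.

```shell
#!/bin/sh

# Check 1 (run on the server): list any grid monitor agent processes.
# The [g] in the pattern keeps grep from matching its own command line.
# Hypothetical helper name, not part of Condor.
monitor_agent_running() {
    ps auwx | grep '[g]rid_manager_monitor_agent'
}

# Check 2 (run on the submit host): print the lines that mention the
# grid monitor in a gridmanager log. The log path is the first argument.
# Hypothetical helper name, not part of Condor.
grid_monitor_lines() {
    grep 'grid_monitor' "$1"
}
```

On the server, call monitor_agent_running and see whether anything is printed. On the submit host, call grid_monitor_lines with the path to your gridmanager log (GridmanagerLog.marco in the message above) once full debugging is turned on.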

+--------------------------------+-----------------------------------+
|           Jaime Frey           | I used to be a heavy gambler.     |
|       jfrey@xxxxxxxxxxx        | But now I just make mental bets.  |
| http://www.cs.wisc.edu/~jfrey/ | That's how I lost my mind.        |
+--------------------------------+-----------------------------------+