
Re: [Condor-users] problems with MPI-example in manual



Hmm, it seems I forgot to attach the log file; it is attached to this reply.

Yours sincerely,
Jakob van Bethlehem


J.S. van Bethlehem wrote:
> Being a complete newbie to Condor, I may be overlooking something very
> simple here, but alas, that's what these lists are for, I guess.
> 
> I literally copied the little MPI example on pages 61/62 of the manual
> (for Condor 7.0.5, in section 2.9), except the lines that do something
......
> (and actually: I could repeat the whole story for Fortran 90, for which
> I get precisely the same behaviour/results)
> 
> Yours sincerely,
> Jakob van Bethlehem
> 
> -----------------------------------------
> Kapteyn Astronomical Institute
> Groningen, the Netherlands
000 (060.000.000) 12/11 12:38:03 Job submitted from host: <10.0.6.29:33027>
...
014 (060.000.001) 12/11 12:38:09 Node 1 executing on host: <10.0.6.184:33390>
...
014 (060.000.000) 12/11 12:38:09 Node 0 executing on host: <10.0.6.184:33390>
...
014 (060.000.002) 12/11 12:38:09 Node 2 executing on host: <10.0.6.171:56957>
...
015 (060.000.001) 12/11 12:38:09 Node 1 terminated.
	(1) Normal termination (return value 127)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	0  -  Run Bytes Sent By Node
	0  -  Run Bytes Received By Node
	0  -  Total Bytes Sent By Node
	0  -  Total Bytes Received By Node
...
015 (060.000.000) 12/11 12:38:09 Node 0 terminated.
	(1) Normal termination (return value 127)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	0  -  Run Bytes Sent By Node
	0  -  Run Bytes Received By Node
	0  -  Total Bytes Sent By Node
	0  -  Total Bytes Received By Node
...
005 (060.000.000) 12/11 12:38:09 Job terminated.
	(1) Normal termination (return value 127)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	0  -  Run Bytes Sent By Job
	0  -  Run Bytes Received By Job
	0  -  Total Bytes Sent By Job
	0  -  Total Bytes Received By Job
...
000 (061.000.000) 12/11 12:39:11 Job submitted from host: <10.0.6.29:33027>
...
014 (061.000.001) 12/11 12:39:15 Node 1 executing on host: <10.0.6.184:33390>
...
014 (061.000.000) 12/11 12:39:15 Node 0 executing on host: <10.0.6.184:33390>
...
014 (061.000.002) 12/11 12:39:15 Node 2 executing on host: <10.0.6.171:56957>
...
015 (061.000.000) 12/11 12:39:15 Node 0 terminated.
	(0) Abnormal termination (signal 11)
	(0) No core file
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	0  -  Run Bytes Sent By Node
	0  -  Run Bytes Received By Node
	0  -  Total Bytes Sent By Node
	0  -  Total Bytes Received By Node
...
005 (061.000.000) 12/11 12:39:15 Job terminated.
	(0) Abnormal termination (signal 11)
	(0) No core file
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	0  -  Run Bytes Sent By Job
	0  -  Run Bytes Received By Job
	0  -  Total Bytes Sent By Job
	0  -  Total Bytes Received By Job
...
000 (062.000.000) 12/11 12:40:51 Job submitted from host: <10.0.6.29:33027>
...
014 (062.000.000) 12/11 12:40:52 Node 0 executing on host: <10.0.6.184:33390>
...
014 (062.000.001) 12/11 12:40:52 Node 1 executing on host: <10.0.6.184:33390>
...
014 (062.000.002) 12/11 12:40:52 Node 2 executing on host: <10.0.6.171:56957>
...
014 (062.000.003) 12/11 12:40:52 Node 3 executing on host: <10.0.6.151:37743>
...
001 (062.000.000) 12/11 12:40:52 Job executing on host: MPI_job
...
015 (062.000.000) 12/11 12:40:52 Node 0 terminated.
	(1) Normal termination (return value 0)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	0  -  Run Bytes Sent By Node
	0  -  Run Bytes Received By Node
	0  -  Total Bytes Sent By Node
	0  -  Total Bytes Received By Node
...
005 (062.000.000) 12/11 12:40:52 Job terminated.
	(1) Normal termination (return value 0)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	0  -  Run Bytes Sent By Job
	0  -  Run Bytes Received By Job
	0  -  Total Bytes Sent By Job
	0  -  Total Bytes Received By Job
...
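
For anyone skimming the log above, here is a rough sketch of how the termination events could be tabulated per job and node. It only assumes the line layout visible in this log (a three-digit event code, then `(cluster.proc.node)`, date, time and a message, followed by an indented status line); the function name `summarize_terminations` and the regular expressions are my own and not part of Condor, and other Condor versions may format the log differently.

```python
import re

# Event header as it appears in the log above, e.g.
# "015 (060.000.001) 12/11 12:38:09 Node 1 terminated."
HEADER = re.compile(
    r"^(?P<code>\d{3}) \((?P<cluster>\d+)\.(?P<proc>\d+)\.(?P<node>\d+)\) "
    r"(?P<date>\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<text>.*)$"
)

# Indented status line that follows a termination event, e.g.
# "(1) Normal termination (return value 127)" or
# "(0) Abnormal termination (signal 11)"
STATUS = re.compile(
    r"^\s*\(\d\) (?P<kind>Normal|Abnormal) termination "
    r"\((?P<how>return value|signal) (?P<value>\d+)\)"
)


def summarize_terminations(log_text):
    """Return (cluster, node, event_text, kind, how, value) tuples.

    Hypothetical helper: it looks only at events 005 (job terminated)
    and 015 (node terminated), the two codes seen in this log.
    """
    results = []
    pending = None  # header of the termination event being read
    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            pending = header if header["code"] in ("005", "015") else None
            continue
        status = STATUS.match(line)
        if status and pending:
            results.append((
                int(pending["cluster"]),
                int(pending["node"]),
                pending["text"],
                status["kind"],
                status["how"],
                int(status["value"]),
            ))
            pending = None
    return results


# Two events copied from the log above
sample = """\
015 (060.000.001) 12/11 12:38:09 Node 1 terminated.
\t(1) Normal termination (return value 127)
...
005 (061.000.000) 12/11 12:39:15 Job terminated.
\t(0) Abnormal termination (signal 11)
\t(0) No core file
...
"""

for row in summarize_terminations(sample):
    print(row)
```

Run on the full log, this should list the three jobs side by side: return value 127 for cluster 60, signal 11 for cluster 61, and return value 0 for cluster 62.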