
Re: [Condor-users] Problem in getting the Result (Output File) back



Hi,

I just came across this fairly recent thread. I have exactly the same problem. It appears sporadically (sometimes 80% of a job is not returned, sometimes a mere 1%) but persistently, and my logs are very similar to yours.
Do you get the standard output back? I don't even get that.

005 (2264.067.000) 12/30 17:59:21 Job terminated.
  (1) Normal termination (return value 0)
[...]
  0  -  Run Bytes Sent By Job
  5375989  -  Run Bytes Received By Job
  0  -  Total Bytes Sent By Job
  5375989  -  Total Bytes Received By Job

Have you (or anyone) since had any luck fixing this problem?

Cheers, Mike


Natarajan, Senthil wrote:
Thanks Simon for your reply.

Actually, I tried all those combinations and nothing worked; that’s why I posted.

As you suggested, I also tried it like this:

Rscript.bat

*************

C:\setR.bat

R --vanilla a0

Does anybody know what might be the problem?

SchedLog

***********

11/29 17:03:53 -------- Done starting jobs --------

11/29 17:03:53 Inside SelfDrainingQueue::timerHandler() for job_is_finished_queue

11/29 17:03:53 Job cleanup for 802.0 will block, calling jobIsFinished() in a thread

11/29 17:03:53 statfs() failed: 13/Permission denied

11/29 17:03:53 SelfDrainingQueue job_is_finished_queue is empty, not resetting timer

11/29 17:03:53 Canceling timer for SelfDrainingQueue job_is_finished_queue (timer id: 10343)

11/29 17:03:53 DaemonCore: No more children processes to reap.

11/29 17:03:53 jobIsFinished() completed, calling DestroyProc(802.0)

11/29 17:03:53 Saving classad to history file

11/29 17:03:53 DaemonCore: Command received via TCP from host <xxx.xx.xxx.xx:53072>

11/29 17:03:53 DaemonCore: received command 1111 (QMGMT_CMD), calling handler (handle_q)

11/29 17:03:53 condor_read(): Socket closed when trying to read buffer

11/29 17:03:53 IO: EOF reading packet header

11/29 17:03:53 QMGR Connection closed

11/29 17:03:53 Got VACATE_SERVICE from <xxx.xxx.xxx.xxx:9610>

------------------------------------------------------------------------

*From:* condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] *On Behalf Of *Simon Hoyle
*Sent:* Wednesday, November 29, 2006 5:48 PM
*To:* Condor-Users Mail List
*Subject:* Re: [Condor-users] Problem in getting the Result (Output File) back

I guess you were calling R directly before and have now switched to using a script.

Looks like you need to pass the submit-file arguments through to R in your script:

R %1 %2

Also, nothing will be returned from the directory C:\condor\execute itself. Your process needs to put its output in the directory that Condor sets up for the job and loads the input files into (referenced with _CONDOR_SCRATCH_DIR), or your script should copy the files back there when R finishes.
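
The two points above can be combined in a wrapper along the following lines. This is only a sketch, not a tested fix for this pool: it assumes that C:\setR.bat just sets up the environment for R, that R resolves to an executable rather than another batch file, and that _CONDOR_SCRATCH_DIR is set in the job's environment. It also uses "call" for the setup script, because cmd.exe does not return to the calling batch file when a second batch file is started without it, in which case the R line would never be reached.

```bat
@echo off
rem Hypothetical variant of Rscript.bat; adjust paths for your site.

rem CALL is required so that control comes back here after setR.bat ends.
call C:\setR.bat

rem Work in the per-job scratch directory instead of C:\condor\execute,
rem so that files R writes there can be transferred back on exit.
rem Assumption: Condor sets _CONDOR_SCRATCH_DIR for the job.
if defined _CONDOR_SCRATCH_DIR cd /d "%_CONDOR_SCRATCH_DIR%"

rem Pass every argument from the submit file (e.g. --vanilla) on to R.
R %*

rem Report R's exit code to Condor rather than that of a later command.
exit /b %ERRORLEVEL%
```

With when_to_transfer_output = ON_EXIT and no explicit list of output files, Condor should then pick up new or changed files in the top level of that scratch directory.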

Simon

--------------------------------------------------------------------

Simon Hoyle

Senior Fisheries Scientist

Stock Assessment and Modelling Section, Oceanic Fisheries Programme

Secretariat of the Pacific Community

BP D5, 98848 Noumea CEDEX, New Caledonia

(Direct): +687 266 776, (office) +687 262 000 xt 455, (Fax) +687 263 818

Web: www.spc.int

------------------------------------------------------------------------

*From:* condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] *On Behalf Of *Natarajan, Senthil
*Sent:* Thursday, 30 November 2006 9:13 AM
*To:* Condor-Users Mail List
*Subject:* [Condor-users] Problem in getting the Result (Output File) back

Hi,

I am submitting a job to Windows from Linux. The job runs fine, and the log says Normal termination (return value 0).

But I am not getting the result (output file) back. I am submitting an R job. Here are the submit file and the script. Please let me know what the problem might be and why the results are not copied back to the submitting machine.

universe = vanilla

executable =  Rscript.bat

arguments =  --vanilla a0

Log = Ra0.log

requirements   = Arch=="INTEL" && OpSys == "WINNT51" && HasR == TRUE

notification = error

input = a0

output = Ra0.out

error = Ra0.err

transfer_input_files = a0

should_transfer_files = YES

when_to_transfer_output = ON_EXIT

Queue

Rscript.bat

************

C:\setR.bat

cd C:\condor\execute

R

SchedLog

***********

11/29 17:03:53 -------- Done starting jobs --------

11/29 17:03:53 Inside SelfDrainingQueue::timerHandler() for job_is_finished_queue

11/29 17:03:53 Job cleanup for 802.0 will block, calling jobIsFinished() in a thread

11/29 17:03:53 statfs() failed: 13/Permission denied

11/29 17:03:53 SelfDrainingQueue job_is_finished_queue is empty, not resetting timer

11/29 17:03:53 Canceling timer for SelfDrainingQueue job_is_finished_queue (timer id: 10343)

11/29 17:03:53 DaemonCore: No more children processes to reap.

11/29 17:03:53 jobIsFinished() completed, calling DestroyProc(802.0)

11/29 17:03:53 Saving classad to history file

11/29 17:03:53 DaemonCore: Command received via TCP from host <xxx.xx.xxx.xx:53072>

11/29 17:03:53 DaemonCore: received command 1111 (QMGMT_CMD), calling handler (handle_q)

11/29 17:03:53 condor_read(): Socket closed when trying to read buffer

11/29 17:03:53 IO: EOF reading packet header

11/29 17:03:53 QMGR Connection closed

11/29 17:03:53 Got VACATE_SERVICE from <xxx.xxx.xxx.xxx:9610>

