
Re: [Condor-users] Greetings, and rtgen over Condor

On 7/4/05, Miguel Dilaj <mdilaj@xxxxxxxxxxxxx> wrote:
> Hi Matt et al.,
> Thanks for the comments. I think that I'm not going to be affected by
> evictions because the nodes have been configured to stop and continue later,
> not to migrate.

Since you are using ON_EXIT rather than ON_EXIT_OR_EVICT, this is fine anyway.

> I already took care of indicating an external (big) drive as the
> destination, so the files should be transferred back there. BUT, I run into
> an unexpected problem...
> When I was piloting with 3 dedicated machines, I launched my .sub file, and
> got the *.rt (the rainbow tables) back in the central manager's disk, but
> now I'm not getting the files back, and I don't see the reason for that...
> The .sub is simple, only minor mods since the first version I used for
> piloting, mainly due to the fact that I'm using E:\ as the destination now:
>        Universe = vanilla
>        Executable = rtgen.exe
>        Log = rtgeneration.log
>        Arguments = lm alpha 1 5 $(Process) 1000 4000
>        Initialdir = E:\

I would imagine the change to Initialdir is the problem. (Incidentally,
I think you would be OK using e:/ instead.)

>        # the extra line above is to avoid problems with the backslash and
> command continuation!
>        Should_transfer_files = YES
>        When_to_transfer_output = ON_EXIT
>        Transfer_input_files = charset.txt
>        Nice_user = True
>        Notification = Never
>        Requirements = (OpSys == "WINNT50") || (OpSys == "WINNT51")
>        Queue 5
> Inspecting the log file shows that the jobs finished OK. It can also be
> monitored using condor_q during the few seconds it is running. However, the
> logs show "0 - Run bytes sent by job", when I expect the table to be
> transferred back.

Try doing exactly the same thing, but replace your exe with a batch
file that does "touch foo.bar", and see if it comes back. If it doesn't,
start looking in the schedd and shadow logs to see what they say.
The Condor daemon may be unable to write to the e:\ location.
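
A minimal sketch of such a test submit file, assuming the rest of your
setup stays as it is. The names test.bat and test.log are placeholders,
and "echo" stands in for "touch" on the assumption that a stock Windows
machine has no touch command:

       # test.bat (placeholder name) contains the single line:
       #     echo hello > foo.bar
       Universe = vanilla
       Executable = test.bat
       Log = test.log
       Initialdir = E:\

       # blank line kept after the trailing backslash, as in your file
       Should_transfer_files = YES
       When_to_transfer_output = ON_EXIT
       Notification = Never
       Requirements = (OpSys == "WINNT50") || (OpSys == "WINNT51")
       Queue 1

If foo.bar turns up in E:\ the transfer itself works and the problem is
more likely on the rtgen side. If it does not, the logs are the next
place to look.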

The other thing to check is that the files created by rtgen are in the
same directory as before (i.e. the directory the command is launched
in). But I am guessing that hasn't changed.