
Re: [HTCondor-users] Corrupt files on HTCondor transfer to node



Hi,


Is it possible pack_17.tar.gz on the submit side is still being created at the time when your HTCondor job starts?

I can't see how... I used the Python script below:

for d in os.listdir("./data"):
  files = os.listdir("./data/%s"%d)
  for i in range(0, len(files), groupSize):
    if len(sys.argv) == 1:
      selectedFiles = files[i:i+groupSize]
      os.system("echo '%s' > commandLine.txt"%configuration)
      command = ("tar -zcf ./sandbox/pack_%d.tar.gz commandLine.txt ./data/%s/"%(tarId,d)) + (" ./data/%s/"%d).join(selectedFiles)
      os.system(command)
      os.system("rm commandLine.txt")
      for teste in testes:
        configuration = "--method=%s --stage2=%s --stage3=%s --n=%s"%(teste["init"], teste["stage2"], teste["stage3"], teste["n"])
        command = "condor_submit condorexecfile n=%d alg=%s"%(tarId, teste["alg"])
        os.system(command)
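
One way to rule out a half-written archive at submit time is to build each pack under a temporary name, re-read it, and only then rename it into place. The following is a minimal sketch, not the script actually used in this thread: the helper name `build_pack` and the `.partial` suffix are hypothetical, and it uses Python's `tarfile` module instead of shelling out to `tar`.

```python
import os
import tarfile


def build_pack(tar_path, members):
    """Write members into a gzip tarball at tar_path and return its entry names.

    The archive is built under a temporary name next to tar_path and renamed
    into place only after it has been closed and re-read, so nothing that
    looks at tar_path ever sees a partially written file.
    """
    tmp_path = tar_path + ".partial"  # hypothetical temporary name
    try:
        with tarfile.open(tmp_path, "w:gz") as tar:
            for member in members:
                tar.add(member)
        # Re-read the finished archive; a truncated or corrupt file raises here.
        with tarfile.open(tmp_path, "r:gz") as tar:
            names = tar.getnames()
        # Rename only after the archive has been verified.
        os.replace(tmp_path, tar_path)
    except BaseException:
        # Do not leave a broken temporary file behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return names
```

With a helper like this, `condor_submit` would be called only after `build_pack` has returned, and an error while packing would stop the script instead of being silently ignored as it is with `os.system`.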

As Greg said, the timestamps are odd. Perhaps the HTCondor job was launched before pack_17.tar.gz was actually ready? Or something else outside of HTCondor is modifying it? Are these pack_ files static or being created dynamically as part of a workflow?

I create the pack files just once, and I have synced the clocks with a time server. Still the same issue...
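
To tell whether the file changes on the submit side or gets truncated in transfer, the size and a checksum can be recorded on the submit node just before `condor_submit` and again inside the job on the execute node. A minimal sketch, assuming Python is available on both machines; the helper name `fingerprint` is hypothetical, and a command-line tool such as `sha256sum` would serve equally well:

```python
import hashlib
import os


def fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large pack files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()
```

Printing `fingerprint("pack_17.tar.gz")` on both sides and comparing the two lines shows directly whether the execute node received the same bytes that were on disk at submit time.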

Ideas?

Thank you!!!

Roberto




Cheers,
-zach


> -----Original Message-----
> From: HTCondor-users <htcondor-users-bounces@cs.wisc.edu> On Behalf Of Greg
> Thain
> Sent: Thursday, April 05, 2018 11:00 AM
> To: htcondor-users@xxxxxxxxxxx
> Subject: Re: [HTCondor-users] Corrupt files on HTCondor transfer to node
>
> On 04/05/2018 08:17 AM, Roberto Tavares wrote:
>
>
>    Hello,
>
>    Well, I ran the set of jobs twice (same procedure). The first
> time, it worked. The second time, I got some errors.
>
>    On the submission node, I got
>
>    -rw-rw-r-- 1 myuser mygroup 110193 Abr 5 07:45 pack_17.tar.gz
>
>
>    On the execute side, I inserted an "ls -al" into the job, and I got a
> smaller file:
>
>    -rw-rw-r-- 1 nobody nogroup 49152 Apr 5 07:44 pack_17.tar.gz
>
>
>
> Assuming your clocks are synchronized across submit and execute machines,
> these timestamps seem suspicious.
>
> -greg
>
>

