
Re: [Condor-users] Bug in condor_transfer_data (Windows) in Condor 7.6 series



That's good news! Thanks! I'll upgrade to 7.6.5 now and test if it works
for me. :-)




On Tuesday, 03.01.2012, at 11:56 -0600, John (TJ) Knoeller wrote:
> Thanks for taking the trouble to figure out why condor_transfer_data was
> not working.
> I stumbled across this bug and fixed it as part of fixing another bug in 
> 7.6.5.
> The current code in the 7.6.5 branch is
> 
> ACL_SIZE_INFORMATION acl_info;
> 
> // first get the number of ACEs in the ACL
> if (! GetAclInformation( pacl,              // acl to get info from
>                           &acl_info,         // buffer to receive info
>                           sizeof(acl_info),  // size in bytes of buffer
>                           AclSizeInformation // class of info to retrieve
>                          ) ) {
> 
> 
> This is different than your proposed fix, but still correct I think.
> 
> -tj
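
For anyone reading this in the archive, here is a minimal, portable sketch of why the 7.6.5 version works and the 7.6.4 version does not. It does not call the Windows API. AclSizeInfoSketch and mock_get_acl_info are hypothetical stand-ins: the struct mirrors the three DWORD members listed in the quoted source comments (assuming DWORD is 32 bits, so 12 bytes in total), and the mock only assumes that the real call rejects a length smaller than the struct, which is what error 122 in the SchedLog suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ACL_SIZE_INFORMATION: three 32-bit fields.
struct AclSizeInfoSketch {
    std::uint32_t AceCount;
    std::uint32_t AclBytesInUse;
    std::uint32_t AclBytesFree;
};
static_assert(sizeof(AclSizeInfoSketch) == 12,
              "three 32-bit fields, no padding expected");

// Hypothetical mock of the buffer-length check (an assumption, not the real API):
// fail when the caller-supplied length cannot hold the struct.
bool mock_get_acl_info(AclSizeInfoSketch* out, std::size_t len) {
    if (out == nullptr || len < sizeof(AclSizeInfoSketch)) {
        return false;  // corresponds to the "buffer too small" failure
    }
    *out = AclSizeInfoSketch{0, 0, 0};  // dummy values
    return true;
}

// 7.6.4 pattern: acl_info is a pointer, so sizeof(acl_info) is the pointer
// size (4 or 8 bytes), which is smaller than the 12-byte struct.
bool call_like_7_6_4() {
    AclSizeInfoSketch* acl_info = new AclSizeInfoSketch();
    bool ok = mock_get_acl_info(acl_info, sizeof(acl_info));
    delete acl_info;
    return ok;
}

// 7.6.5 pattern: acl_info is the struct itself, so sizeof(acl_info) is the
// struct size and the address is passed with &.
bool call_like_7_6_5() {
    AclSizeInfoSketch acl_info;
    return mock_get_acl_info(&acl_info, sizeof(acl_info));
}

// Alternative if the pointer is kept: take sizeof of the pointed-to object.
bool call_with_deref_sizeof() {
    AclSizeInfoSketch* acl_info = new AclSizeInfoSketch();
    bool ok = mock_get_acl_info(acl_info, sizeof(*acl_info));
    delete acl_info;
    return ok;
}
```

In this sketch call_like_7_6_4() fails on both 32-bit and 64-bit builds, while the other two succeed. Any length of at least 12 passes the mock check, which is consistent with the literal 24 in the 7.4.4 code having worked.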
> 
> 
> On 1/3/2012 10:27 AM, Felix Wolfheimer wrote:
> > Some months ago I reported on this list that condor_transfer_data
> > stopped working on my Windows machines when I upgraded from the 7.4
> > series to the 7.6 series. As I got no reply at the time, it seems
> > that no one else has run into this issue so far. Since I would really
> > like to upgrade to Condor 7.6, I have now investigated the issue and
> > found the change in the source that seems to break the
> > condor_transfer_data tool on my machines. As a reminder: as soon as I
> > try to transfer data with condor_transfer_data on any of my Windows
> > machines, I get the following error:
> >
> > C:\Documents and Settings\FelixWolfheimer\Desktop\condor>condor_transfer_data 65
> > Fetching data files...
> > DCSchedd::receiveJobSandbox:7003:File transfer failed for target job 65.0: SCHEDD at 10.2.4.60 failed to send file(s) to<10.2.4.60:1318>: error reading from C:\Condor/spool\65\0\cluster65.proc0.subproc0\horn.out.log: permission denied; TOOL failed to receive file(s) from<10.2.4.60:9619>
> > ERROR: Failed to spool job files.
> >
> > When I look into the SchedLog file I can see the following error (the
> > important line is the first one):
> > 01/03/12 16:30:44 (pid:47892) Perm::GetAclInformation failed with error 122
> >
> > 01/03/12 16:30:44 (pid:47892) DoUpload: (Condor error code 13, subcode
> > 1) SCHEDD at 10.2.4.60 failed to send file(s) to<10.2.4.60:1318>:
> > error reading from
> > C:\Condor/spool\65\0\cluster65.proc0.subproc0\horn.out.log: permission
> > denied; TOOL failed to receive file(s) from<10.2.4.60:9619>
> >
> > 01/03/12 16:30:44 (pid:47892) generalJobFilesWorkerThread(): failed to
> > transfer files for job 65.0
> >
> > This is the piece of code in Condor causing the error in version 7.6.4
> > (src/condor_utils/perm.WINDOWS.cpp):
> > 	ACL_SIZE_INFORMATION* acl_info = new ACL_SIZE_INFORMATION();
> > 		// Structure contains the following members:
> > 		//  DWORD   AceCount;
> > 		//  DWORD   AclBytesInUse;
> > 		//  DWORD   AclBytesFree;
> >
> >
> >
> > 	// first get the number of ACEs in the ACL
> > 		if (! GetAclInformation( pacl,		// acl to get info from
> > 					acl_info,	// buffer to receive info
> > 					sizeof(acl_info),  // size in bytes of buffer
> > 					AclSizeInformation // class of info to retrieve
> > 					) ) {
> > 			dprintf(D_ALWAYS, "Perm::GetAclInformation failed with error %d\n", GetLastError() );
> > 			return -1;
> > 		}
> >
> > Here is the piece of code which worked for me (Condor
> > 7.4.4,src/condor_c++_util/perm.cpp):
> > 	ACL_SIZE_INFORMATION* acl_info = new ACL_SIZE_INFORMATION();
> > 		// Structure contains the following members:
> > 		//  DWORD   AceCount;
> > 		//  DWORD   AclBytesInUse;
> > 		//  DWORD   AclBytesFree;
> >
> > 	// first get the number of ACEs in the ACL
> > 		if (! GetAclInformation( pacl,		// acl to get info from
> > 								acl_info,	// buffer to receive info
> > 								24,			// size in bytes of buffer
> > 								AclSizeInformation // class of info to retrieve
> > 								) ) {
> > 			dprintf(D_ALWAYS, "Perm::GetAclInformation failed with error %d\n", GetLastError() );
> > 			return -1;
> > 		}
> >
> > The code in Condor 7.6.4, although a good attempt to parameterize the
> > size of ACL_SIZE_INFORMATION, is wrong. Because acl_info is a pointer,
> > sizeof(acl_info) asks for the size of the pointer (8 bytes in a 64-bit
> > build) and not for the size of the struct, so the call no longer passes
> > the 24 bytes that the 7.4.4 code did. Is there any chance that you can
> > fix that soon in the stable series (7.6)? BTW: GetLastError returns 122,
> > which means "The buffer provided is too small" and is consistent with
> > my observations. I wonder why no one else seems to have this problem,
> > as condor_transfer_data will surely fail with this bug on any Windows
> > system...
> > _______________________________________________
> > Condor-users mailing list
> > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> > subject: Unsubscribe
> > You can also unsubscribe by visiting
> > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >
> > The archives can be found at:
> > https://lists.cs.wisc.edu/archive/condor-users/