
Re: [Condor-users] gt4 grid universe problem



Hi all, especially Jaime,

I'm still having trouble submitting grid-universe jobs to a GT4-WS gatekeeper. Gory details below.

On 2 Oct 2006, at 16:58, Jaime Frey wrote:
On Sep 29, 2006, at 10:46 AM, Andrew Walker wrote:
On 27 Sep 2006, at 16:37, Jaime Frey wrote:
On Sep 26, 2006, at 8:30 AM, Andrew Walker wrote:
Having recently upgraded from condor 6.6 to 6.8(.0), I'm trying to submit a grid universe gt4 job to a remote gatekeeper in front of a condor pool. Currently my job is failing with the error "Failed to create proxy delegation" (which is Code 0 Subcode 0 in the user log file). Does anybody have any idea how to debug this?

The gatekeeper is running globus 4.0.1 and I can successfully submit jobs using the pre-WS gram (using both the gt2 grid universe and the globus universe). At the moment I have pre-staged the executable and am not attempting to recover the output back to the submit machine - all I want to do is run a shell script on a condor node and return the output to the gatekeeper. I think my problem is with the condor-g submit machine, but I have access to log and configuration files at both ends.


snip...


One possibility is that gridftp is not correctly traversing the firewalls between the gatekeeper and the condor submit machine (I have two firewalls to worry about - both filter traffic in both directions). What are the network requirements for a gt4 resource? I guess the gatekeeper has to connect back to the submitting machine on TCP port 2811. However, I don't think this is the immediate problem as I'm not seeing any activity (or failing outbound network connections) from the gatekeeper.

The problem is not with the gridftp server, but with delegating your proxy to the Delegation service on the gatekeeper machine. The best way to debug this is to try Globus' WS GRAM client to submit an equivalent job. Try this:

globusrun-ws -submit -job-delegate -factory cartman.niees.group.cam.ac.uk -factory-type Condor -job-command /bin/date

This will delegate a credential, then submit a job that uses that credential. If this fails, then you know that the problem is not related to Condor-G.

A couple other notes:

The 'globus_rsl' attribute doesn't work for WS GRAM jobs. Instead, there's a globus_xml attribute, for use with WS GRAM's XML-based RSL description.

The gridftp server Condor-G starts up for WS GRAM file transfers listens on a dynamic port, not 2811. If you have a hole in your firewall and LOWPORT/HIGHPORT set appropriately in your Condor config file, then the gridftp server shouldn't have any problems.
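
As a rough illustration, the port range is set with two lines in the Condor config file on the submit machine. The numbers below are placeholders, not values taken from this thread; use whatever range your firewall actually opens:

LOWPORT  = 40000
HIGHPORT = 41000

The gatekeeper side then needs to be allowed to connect to the submit machine on ports in that range.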




Jaime, 

Thanks for the info - it turned out that this was a firewall issue, resolved by moving my tests to a new pair of machines. However, I have now run up against a new problem. (I'm now submitting from a 6.8.1 condor machine to a gatekeeper running globus 4.0.2 in front of a 6.8.1 condor pool; the firewalls between the two machines have been set to allow all traffic in both directions.)

I have also simplified my submit script to try to work out what is going on - all I want to see is the hostname of the execute node on the remote condor pool:

Universe        = grid
grid_resource = gt4 cete.niees.group.cam.ac.uk Condor
Executable      = /bin/hostname
Notification    = NEVER
Output          = host_$(PROCESS).out
Error           = host.err
Log             = host.log
Queue 1


Again the job enters the local queue, the gridftp server starts up and then the job fails and enters the held state. This time I have a different error in the log (Globus error: Staging error for RSL element fileStageIn):


000 (192.000.000) 09/29 16:31:55 Job submitted from host: <131.111.20.163:9661>
...
017 (192.000.000) 09/29 16:32:50 Job submitted to Globus
    RM-Contact: cete.niees.group.cam.ac.uk
    JM-Contact: https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?b8486b60-4fcf-11db-ba9e-8b423672fa7f
    Can-Restart-JM: 0
...
027 (192.000.000) 09/29 16:32:50 Job submitted to grid resource
    GridResource: gt4 cete.niees.group.cam.ac.uk Condor
    GridJobId: gt4 https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?b8486b60-4fcf-11db-ba9e-8b423672fa7f
...
012 (192.000.000) 09/29 16:32:53 Job was held.
        Globus error: Staging error for RSL element fileStageIn.
        Code 0 Subcode 0
...


However, running the equivalent command using the globus client works (and the returned output file shows that the job ran on a condor execute node): 

globusrun-ws -streaming -stdout-file testout -submit -job-delegate -factory cete.niees.group.cam.ac.uk -factory-type Condor -job-command /bin/hostname
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:3cc015da-4faa-11db-8c27-00042388e7a7
Termination time: 09/30/2006 11:04 GMT
Current job state: Pending
Current job state: Active
Current job state: CleanUp-Hold
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.


Using condor's GT2 interface also works as expected:

Universe        = grid
grid_resource = gt2 cete.niees.group.cam.ac.uk/jobmanager-condor
Executable      = /bin/hostname
Notification    = NEVER
Output          = host_$(PROCESS).out
Error           = host.err
Log             = host.log
Queue 1



I see exactly the same behavior when I replace the condor jobmanager with the fork jobmanager in each of the commands above. Again, I'm after a starting place for debugging. Does anybody have any ideas?

Condor is trying to transfer /bin/hostname to the GRAM server. globusrun-ws is using the /bin/hostname that's already there. Something about the transfer is failing. You can confirm this by adding 'transfer_executable = false' to your submit file.

Interestingly, adding "transfer_executable = false" to the submit file gives exactly the same behavior. That is, I see:

012 (200.000.000) 10/10 14:35:34 Job was held.
        Globus error: Staging error for RSL element fileStageIn.
        Code 0 Subcode 0

in the user log file.
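
For reference, the gt4 submit file with that line added looks roughly like this. It is a sketch reconstructed from the submit file earlier in this message, so treat everything other than the transfer_executable line as an assumption:

Universe            = grid
grid_resource       = gt4 cete.niees.group.cam.ac.uk Condor
Executable          = /bin/hostname
transfer_executable = false
Notification        = NEVER
Output              = host_$(PROCESS).out
Error               = host.err
Log                 = host.log
Queue 1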


Can you transfer files from your submit machine to cete.niees.group.cam.ac.uk using globus-url-copy?

This appears to be working correctly. The command:

globus-url-copy file:///home/amw75/testfile gsiftp://cete.niees.group.cam.ac.uk:2811/home/andreww/testfile

creates a copy of "testfile" on cete. Is that what you wanted me to test?



If you have GRIDMANAGER_DEBUG=D_FULLDEBUG in your condor config file, you should see a java exception stack trace in your gridmanager daemon log. That may give us more detail on what exactly is failing. The server's container output should contain the same stack trace.

The log is below (and quite long). Job 200 is the submission that ends up in the "held" state, job 201 is the gsiftp server that gets started on the submit machine, and 128.232.232.28 is cete, the gatekeeper. The interesting line seems to be in the java stack trace:

10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Error authenticating user at source/dest hostnull. Caused by java.io.EOFException
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.ftp.extended.GridFTPInputStream.readMsg(GridFTPInputStream.java:100)

but I'm not clear why authentication is failing at this stage.


Cheers,

Andrew




10/10 14:34:51 ******************************************************
10/10 14:34:51 ** condor_gridmanager (CONDOR_GRIDMANAGER) STARTING UP
10/10 14:34:51 ** /Condor/RH9/condor-6.8.1-dynamic/sbin/condor_gridmanager
10/10 14:34:51 ** $CondorVersion: 6.8.1 Sep 17 2006  $
10/10 14:34:51 ** $CondorPlatform: I386-LINUX_RHEL3 $
10/10 14:34:51 ** PID = 13000
10/10 14:34:51 ** Log last touched time unavailable (No such file or directory)
10/10 14:34:51 ******************************************************
10/10 14:34:51 Using config source: /home/condor/condor_config
10/10 14:34:51 Using local config sources:
10/10 14:34:51    /home/condor/condor_config.local
10/10 14:34:51 DaemonCore: Command Socket at <131.111.20.163:9652>
10/10 14:34:51 Welcome to the all-singing, all dancing, "amazing" GridManager!
10/10 14:34:51 [13000] Getting monitoring info for pid 13000
10/10 14:34:51 [13000] Checking proxies
10/10 14:34:52 [13000] DaemonCore: in SendAliveToParent()
10/10 14:34:52 [13000] DaemonCore: attempting to connect to '<131.111.20.163:9661>'
10/10 14:34:54 [13000] Received ADD_JOBS signal
10/10 14:34:54 [13000] in doContactSchedd()
10/10 14:34:54 [13000] querying for new jobs
10/10 14:34:54 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && (Managed =!= "ScheddDone") && (((Matched =!= FALSE) && (JobStatus != 5)) || (Managed =?= "External"))
10/10 14:34:54 [13000] Using job type GT4 for job 200.0
10/10 14:34:54 [13000] (200.0) SetJobLeaseTimers()
10/10 14:34:54 [13000] Found job 200.0 --- inserting
10/10 14:34:54 [13000] Fetched 1 new job ads from schedd
10/10 14:34:54 [13000] querying for removed/held jobs
10/10 14:34:54 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:34:54 [13000] Fetched 0 job ads from schedd
10/10 14:34:54 [13000] leaving doContactSchedd()
10/10 14:34:54 [13000] gahp server not up yet, delaying ping
10/10 14:34:54 [13000] *** UpdateLeases called
10/10 14:34:54 [13000]     Leases not supported, cancelling timer
10/10 14:34:54 [13000] *** checkDelegation()
10/10 14:34:54 [13000] gahp server not up yet, delaying checkDelegation
10/10 14:34:54 [13000] GridftpServer: Scanning schedd for previously submitted gridftp server jobs
10/10 14:34:54 [13000] GridftpServer: Submitting job for proxy '/C=UK/O=eScience/OU=Cambridge/L=UCS/CN=andrew walker'
10/10 14:34:54 [13000] entering FileTransfer::SimpleInit
10/10 14:34:54 [13000] entering FileTransfer::UploadFiles (final_transfer=0)
10/10 14:34:54 [13000] entering FileTransfer::Upload
10/10 14:34:54 [13000] entering FileTransfer::DoUpload
10/10 14:34:54 [13000] DoUpload: send file /tmp/condor_g_scratch.0x8572df0.6105/grid-mapfile
10/10 14:34:54 [13000] ReliSock::put_file_with_permissions(): going to send permissions 100644
10/10 14:34:54 [13000] put_file: going to send from filename /tmp/condor_g_scratch.0x8572df0.6105/grid-mapfile
10/10 14:34:54 [13000] put_file: Found file size 61
10/10 14:34:54 [13000] put_file: senting 61 bytes
10/10 14:34:54 [13000] ReliSock: put_file: sent 61 bytes
10/10 14:34:54 [13000] DoUpload: send file /tmp/condor_g_scratch.0x8572df0.6105/master_proxy.2
10/10 14:34:54 [13000] DoUpload: send file /Condor/Debian/condor/libexec/gridftp_wrapper.sh
10/10 14:34:54 [13000] ReliSock::put_file_with_permissions(): going to send permissions 100755
10/10 14:34:54 [13000] put_file: going to send from filename /Condor/Debian/condor/libexec/gridftp_wrapper.sh
10/10 14:34:54 [13000] put_file: Found file size 111
10/10 14:34:54 [13000] put_file: senting 111 bytes
10/10 14:34:54 [13000] ReliSock: put_file: sent 111 bytes
10/10 14:34:54 [13000] DoUpload: exiting at 2090
10/10 14:34:57 [13000] (200.0) doEvaluateState called: gmState GM_INIT, globusState 32
10/10 14:34:57 [13000] GAHP server pid = 13003
10/10 14:34:58 [13000] GAHP server version: $GahpVersion: 1.4.0 Jun 02 2005 GT4 GAHP (GT-4.0.0) $
10/10 14:34:58 [13000] GAHP[13003] <- 'COMMANDS'
10/10 14:34:58 [13000] GAHP[13003] -> 'S' 'ASYNC_MODE_OFF' 'ASYNC_MODE_ON' 'CACHE_PROXY_FROM_FILE' 'COMMANDS' 'GASS_SERVER_INIT' 'GT4_DELEGATE_CREDENTIAL' 'GT4_GENERATE_SUBMIT_ID' 'GT4_GRAM_CALLBACK_ALLOW' 'GT4_GRAM_JOB_CALLBACK_REGISTER' 'GT4_GRAM_JOB_DESTROY' 'GT4_GRAM_JOB_START' 'GT4_GRAM_JOB_STATUS' 'GT4_GRAM_JOB_SUBMIT' 'GT4_GRAM_PING' 'GT4_REFRESH_CREDENTIAL' 'GT4_SET_TERMINATION_TIME' 'INITIALIZE_FROM_FILE' 'QUIT' 'REFRESH_PROXY_FROM_FILE' 'RESPONSE_PREFIX' 'RESULTS' 'UNCACHE_PROXY' 'USE_CACHED_PROXY' 'VERSION'
10/10 14:34:58 [13000] GAHP[13003] <- 'RESPONSE_PREFIX GAHP:'
10/10 14:34:58 [13000] GAHP[13003] -> 'S'
10/10 14:34:58 [13000] GAHP[13003] <- 'ASYNC_MODE_ON'
10/10 14:34:58 [13000] GAHP[13003] -> 'S'
10/10 14:34:58 [13000] GAHP[13003] <- 'INITIALIZE_FROM_FILE /tmp/condor_g_scratch.0x8572df0.6105/master_proxy.2'
10/10 14:34:59 [13000] GAHP[13003] -> 'S'
10/10 14:34:59 [13000] GAHP[13003] <- 'CACHE_PROXY_FROM_FILE 2 /tmp/condor_g_scratch.0x8572df0.6105/master_proxy.2'
10/10 14:34:59 [13000] GAHP[13003] -> 'S'
10/10 14:34:59 [13000] GAHP[13003] <- 'CACHE_PROXY_FROM_FILE 1 /tmp/x509up_u1501'
10/10 14:34:59 [13000] GAHP[13003] -> 'S'
10/10 14:34:59 [13000] GAHP[13003] <- 'GT4_GRAM_CALLBACK_ALLOW 2'
10/10 14:35:00 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:00 [13000] (200.0) gm state change: GM_INIT -> GM_START
10/10 14:35:00 [13000] (200.0) gm state change: GM_START -> GM_CLEAR_REQUEST
10/10 14:35:00 [13000] (200.0) UpdateJobLeaseSent(-1)
10/10 14:35:00 [13000] (200.0) gm state change: GM_CLEAR_REQUEST -> GM_UNSUBMITTED
10/10 14:35:00 [13000] GridftpServer: Updating job leases for gridftp server jobs
10/10 14:35:00 [13000] GAHP[13003] <- 'GT4_GRAM_PING 3 https://cete.niees.group.cam.ac.uk'
10/10 14:35:00 [13000] GAHP[13003] -> 'S'
10/10 14:35:00 [13000] *** checkDelegation()
10/10 14:35:00 [13000] in doContactSchedd()
10/10 14:35:00 [13000] querying for removed/held jobs
10/10 14:35:00 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:35:00 [13000] Fetched 0 job ads from schedd
10/10 14:35:00 [13000] 201.0 job status: 2
10/10 14:35:00 [13000] leaving doContactSchedd()
10/10 14:35:00 [13000] (200.0) doEvaluateState called: gmState GM_UNSUBMITTED, globusState 32
10/10 14:35:00 [13000] (200.0) gm state change: GM_UNSUBMITTED -> GM_DELEGATE_PROXY
10/10 14:35:00 [13000] getDelegationError(): failed to find ProxyDelegation for proxy /tmp/x509up_u1501
10/10 14:35:00 [13000] *** getDelegationURI(/tmp/x509up_u1501)
10/10 14:35:00 [13000]     creating new ProxyDelegation
10/10 14:35:00 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:00 [13000] GAHP[13003] -> 'R'
10/10 14:35:00 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:00 [13000] GAHP[13003] -> '3' '0' 'NULL'
10/10 14:35:00 [13000] *** checkDelegation()
10/10 14:35:00 [13000]     new delegation
10/10 14:35:00 [13000] GAHP[13003] <- 'USE_CACHED_PROXY 1'
10/10 14:35:00 [13000] GAHP[13003] -> 'S'
10/10 14:35:00 [13000] GAHP[13003] <- 'GT4_DELEGATE_CREDENTIAL 4 https://cete.niees.group.cam.ac.uk/wsrf/services/DelegationFactoryService'
10/10 14:35:00 [13000] GAHP[13003] -> 'S'
10/10 14:35:00 [13000] resource https://cete.niees.group.cam.ac.uk is now up
10/10 14:35:00 [13000] (200.0) doEvaluateState called: gmState GM_DELEGATE_PROXY, globusState 32
10/10 14:35:00 [13000] *** getDelegationURI(/tmp/x509up_u1501)
10/10 14:35:00 [13000]     found ProxyDelegation
10/10 14:35:06 [13000] DaemonCore::IsPidAlive(): kill returned EPERM, assuming pid 6105 is alive.
10/10 14:35:19 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:19 [13000] GAHP[13003] -> 'R'
10/10 14:35:19 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:19 [13000] GAHP[13003] -> '4' '0' 'https://128.232.232.28:8443/wsrf/services/DelegationService?4045c610-5864-11db-b222-fa0bb9bed964' 'NULL'
10/10 14:35:19 [13000] *** checkDelegation()
10/10 14:35:19 [13000]     new delegation
10/10 14:35:19 [13000]       https://128.232.232.28:8443/wsrf/services/DelegationService?4045c610-5864-11db-b222-fa0bb9bed964
10/10 14:35:19 [13000]     signalling jobs for https://128.232.232.28:8443/wsrf/services/DelegationService?4045c610-5864-11db-b222-fa0bb9bed964
10/10 14:35:19 [13000] (200.0) doEvaluateState called: gmState GM_DELEGATE_PROXY, globusState 32
10/10 14:35:19 [13000] *** getDelegationURI(/tmp/x509up_u1501)
10/10 14:35:19 [13000]     found ProxyDelegation
10/10 14:35:19 [13000] (200.0) gm state change: GM_DELEGATE_PROXY -> GM_GENERATE_ID
10/10 14:35:19 [13000] GAHP[13003] <- 'GT4_GENERATE_SUBMIT_ID 5 '
10/10 14:35:19 [13000] GAHP[13003] -> 'S'
10/10 14:35:19 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:19 [13000] GAHP[13003] -> 'R'
10/10 14:35:19 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:19 [13000] GAHP[13003] -> '5' 'uuid:2bbf41d0-5864-11db-b2c4-d34c60b0c3c6'
10/10 14:35:19 [13000] (200.0) doEvaluateState called: gmState GM_GENERATE_ID, globusState 32
10/10 14:35:19 [13000] (200.0) gm state change: GM_GENERATE_ID -> GM_SUBMIT_ID_SAVE
10/10 14:35:19 [13000] in doContactSchedd()
10/10 14:35:19 [13000] querying for removed/held jobs
10/10 14:35:19 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:35:19 [13000] Fetched 0 job ads from schedd
10/10 14:35:19 [13000] Updating classad values for 200.0:
10/10 14:35:19 [13000]    GridftpUrlBase = "gsiftp://holbein.escience.cam.ac.uk:41225"
10/10 14:35:19 [13000]    GlobusDelegationUri = "https://128.232.232.28:8443/wsrf/services/DelegationService?4045c610-5864-11db-b222-fa0bb9bed964"
10/10 14:35:19 [13000]    GlobusSubmitId = "uuid:2bbf41d0-5864-11db-b2c4-d34c60b0c3c6"
10/10 14:35:19 [13000] leaving doContactSchedd()
10/10 14:35:19 [13000] (200.0) doEvaluateState called: gmState GM_SUBMIT_ID_SAVE, globusState 32
10/10 14:35:19 [13000] (200.0) gm state change: GM_SUBMIT_ID_SAVE -> GM_SUBMIT
10/10 14:35:19 [13000] GAHP[13003] <- 'GT4_GRAM_JOB_SUBMIT 6 uuid:2bbf41d0-5864-11db-b2c4-d34c60b0c3c6 https://cete.niees.group.cam.ac.uk Condor 1 <job><executable>/bin/date</executable><directory>/${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6/</directory><stdout>/${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6//date_0.out</stdout><stderr>/${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6//date.err</stderr><fileStageIn><maxAttempts>5</maxAttempts><transferCredentialEndpoint\ xsi:type="ns1:EndpointReferenceType"\ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\ xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\ xsi:type="ns1:AttributedURI">https://128.232.232.28:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\ xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\ xmlns:ns1="http://www.globus.org/08/2004/delegationService">4045c610-5864-11db-b222-fa0bb9bed964</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\ xsi:type="ns1:ReferenceParametersType"/></transferCredentialEndpoint><transfer><sourceUrl>gsiftp://holbein.escience.cam.ac.uk:41225/tmp/condor_g_empty_dir_u1501/</sourceUrl><destinationUrl>file:///${GLOBUS_SCRATCH_DIR}</destinationUrl><rftOptions><sourceSubjectName>/C=UK/O=eScience/OU=Cambridge/L=UCS/CN=andrew\ walker</sourceSubjectName></rftOptions></transfer><transfer><sourceUrl>gsiftp://holbein.escience.cam.ac.uk:41225/tmp/condor_g_empty_dir_u1501/</sourceUrl><destinationUrl>file:///${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6/</destinationUrl><rftOptions><sourceSubjectName>/C=UK/O=eScience/OU=Cambridge/L=UCS/CN=andrew\ walker</sourceSubjectName></rftOptions></transfer></fileStageIn><fileStageOut><maxAttempts>5</maxAttempts><transferCredentialEndpoint\ xsi:type="ns1:EndpointReferenceType"\ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\ xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\ 
xsi:type="ns1:AttributedURI">https://128.232.232.28:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\ xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\ xmlns:ns1="http://www.globus.org/08/2004/delegationService">4045c610-5864-11db-b222-fa0bb9bed964</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\ xsi:type="ns1:ReferenceParametersType"/></transferCredentialEndpoint><transfer><sourceUrl>file:///${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6/date_0.out</sourceUrl><destinationUrl>gsiftp://holbein.escience.cam.ac.uk:41225/home/amw75/cete_grid_gt4/date_0.out</destinationUrl><rftOptions><destinationSubjectName>/C=UK/O=eScience/OU=Cambridge/L=UCS/CN=andrew\ walker</destinationSubjectName></rftOptions></transfer><transfer><sourceUrl>file:///${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6/date.err</sourceUrl><destinationUrl>gsiftp://holbein.escience.cam.ac.uk:41225/home/amw75/cete_grid_gt4/date.err</destinationUrl><rftOptions><destinationSubjectName>/C=UK/O=eScience/OU=Cambridge/L=UCS/CN=andrew\ walker</destinationSubjectName></rftOptions></transfer></fileStageOut><fileCleanUp><transferCredentialEndpoint\ xsi:type="ns1:EndpointReferenceType"\ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\ xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\ xsi:type="ns1:AttributedURI">https://128.232.232.28:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\ xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\ xmlns:ns1="http://www.globus.org/08/2004/delegationService">4045c610-5864-11db-b222-fa0bb9bed964</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\ xsi:type="ns1:ReferenceParametersType"/></transferCredentialEndpoint><deletion><file>file:///${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6/</file></deletion></fileCleanUp><jobCredentialEndpoint\ xsi:type="ns1:EndpointReferenceType"\ 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\ xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\ xsi:type="ns1:AttributedURI">https://128.232.232.28:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\ xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\ xmlns:ns1="http://www.globus.org/08/2004/delegationService">4045c610-5864-11db-b222-fa0bb9bed964</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\ xsi:type="ns1:ReferenceParametersType"/></jobCredentialEndpoint><stagingCredentialEndpoint\ xsi:type="ns1:EndpointReferenceType"\ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\ xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\ xsi:type="ns1:AttributedURI">https://128.232.232.28:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\ xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\ xmlns:ns1="http://www.globus.org/08/2004/delegationService">4045c610-5864-11db-b222-fa0bb9bed964</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\ xsi:type="ns1:ReferenceParametersType"/></stagingCredentialEndpoint><holdState>StageIn</holdState></job> NULL'
10/10 14:35:19 [13000] GAHP[13003] -> 'S'
10/10 14:35:22 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:22 [13000] GAHP[13003] -> 'R'
10/10 14:35:22 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:22 [13000] GAHP[13003] -> '6' '0' 'https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6' 'NULL'
10/10 14:35:22 [13000] (200.0) doEvaluateState called: gmState GM_SUBMIT, globusState 32
10/10 14:35:22 [13000] (200.0) gm state change: GM_SUBMIT -> GM_SUBMIT_SET_LIFETIME
10/10 14:35:22 [13000]     Starting sent lease
10/10 14:35:22 [13000] *** (200.0) CalculateLease: new lease should expire at 1160530522
10/10 14:35:22 [13000] GAHP[13003] <- 'GT4_SET_TERMINATION_TIME 7 https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6 43200'
10/10 14:35:22 [13000] GAHP[13003] -> 'S'
10/10 14:35:24 [13000] in doContactSchedd()
10/10 14:35:24 [13000] querying for removed/held jobs
10/10 14:35:24 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:35:24 [13000] Fetched 0 job ads from schedd
10/10 14:35:24 [13000] Updating classad values for 200.0:
10/10 14:35:24 [13000]    GridJobId = "gt4 https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6"
10/10 14:35:24 [13000] leaving doContactSchedd()
10/10 14:35:24 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:24 [13000] GAHP[13003] -> 'R'
10/10 14:35:24 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:24 [13000] GAHP[13003] -> '7' '0' '1160530523' 'NULL'
10/10 14:35:24 [13000] (200.0) doEvaluateState called: gmState GM_SUBMIT_SET_LIFETIME, globusState 32
10/10 14:35:24 [13000] (200.0) UpdateJobLeaseSent(1160530523)
10/10 14:35:24 [13000] (200.0) SetJobLeaseTimers()
10/10 14:35:24 [13000] (200.0) gm state change: GM_SUBMIT_SET_LIFETIME -> GM_SUBMIT_SAVE
10/10 14:35:29 [13000] in doContactSchedd()
10/10 14:35:29 [13000] querying for removed/held jobs
10/10 14:35:29 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:35:29 [13000] Fetched 0 job ads from schedd
10/10 14:35:29 [13000] Updating classad values for 200.0:
10/10 14:35:29 [13000]    JobLeaseExpiration = 1160530523
10/10 14:35:29 [13000] leaving doContactSchedd()
10/10 14:35:29 [13000] (200.0) doEvaluateState called: gmState GM_SUBMIT_SAVE, globusState 32
10/10 14:35:29 [13000] (200.0) gm state change: GM_SUBMIT_SAVE -> GM_SUBMIT_COMMIT
10/10 14:35:29 [13000] GAHP[13003] <- 'GT4_GRAM_JOB_START 8 https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6'
10/10 14:35:29 [13000] GAHP[13003] -> 'S'
10/10 14:35:29 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:29 [13000] GAHP[13003] -> 'R'
10/10 14:35:29 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:29 [13000] GAHP[13003] -> '8' '0' 'NULL'
10/10 14:35:29 [13000] (200.0) doEvaluateState called: gmState GM_SUBMIT_COMMIT, globusState 32
10/10 14:35:29 [13000] (200.0) gm state change: GM_SUBMIT_COMMIT -> GM_SUBMITTED
10/10 14:35:29 [13000] *** (200.0) CalculateLease: no new lease at present
10/10 14:35:31 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:31 [13000] GAHP[13003] -> 'R'
10/10 14:35:31 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:31 [13000] GAHP[13003] -> '2' 'https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6' 'StageIn' 'NULL' '0'
10/10 14:35:31 [13000] (200.0) gram callback: state StageIn, fault (null), exit code 0
10/10 14:35:31 [13000] (200.0) doEvaluateState called: gmState GM_SUBMITTED, globusState 32
10/10 14:35:31 [13000] (200.0) globus state change: Unsubmitted -> StageIn
10/10 14:35:31 [13000] (200.0) Writing globus submit record to user logfile
10/10 14:35:31 [13000] (200.0) Writing grid submit record to user logfile
10/10 14:35:31 [13000] *** (200.0) CalculateLease: no new lease at present
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Full fault for job https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> fault type: org.globus.exec.generated.StagingFaultType:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> attribute: fileStageIn
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> description:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Staging error for RSL element fileStageIn.
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> faultReason:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> faultString:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> gt2ErrorCode: 0
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> originator: Address: https://128.232.232.28:8443/wsrf/services/ManagedJobFactoryService
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Reference property[0]:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> <ns1:ResourceID xmlns:ns1="http://www.globus.org/namespaces/2004/10/gram/job">2bbf41d0-5864-11db-b2c4-d34c60b0c3c6</ns1:ResourceID>
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> stackTrace:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> org.globus.exec.generated.StagingFaultType: Staging error for RSL element fileStageIn.
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Timestamp: Tue Oct 10 14:36:05 BST 2006
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Originator: Address: https://128.232.232.28:8443/wsrf/services/ManagedJobFactoryService
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Reference property[0]:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> <ns1:ResourceID xmlns:ns1="http://www.globus.org/namespaces/2004/10/gram/job">2bbf41d0-5864-11db-b2c4-d34c60b0c3c6</ns1:ResourceID>
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.lang.reflect.Constructor.newInstance(Constructor.java:274)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.lang.Class.newInstance0(Class.java:308)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.lang.Class.newInstance(Class.java:261)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.exec.utils.FaultUtils.makeFault(FaultUtils.java:485)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.exec.utils.FaultUtils.createStagingFault(FaultUtils.java:363)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.exec.service.exec.StateMachine.processStageInResponseState(StateMachine.java:995)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.lang.reflect.Method.invoke(Method.java:324)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.exec.service.exec.StateMachine.processState(StateMachine.java:367)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.exec.service.exec.RunThread.run(RunThread.java:93)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Error authenticating user at source/dest hostnull. Caused by java.io.EOFException
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.ftp.extended.GridFTPInputStream.readMsg(GridFTPInputStream.java:100)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.gsi.gssapi.net.GssInputStream.hasData(GssInputStream.java:81)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.gsi.gssapi.net.GssInputStream.read(GssInputStream.java:55)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(StreamDecoder.java:408)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.nio.cs.StreamDecoder$CharsetSD.implRead(StreamDecoder.java:450)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:182)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.io.InputStreamReader.read(InputStreamReader.java:167)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.io.BufferedReader.fill(BufferedReader.java:136)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.io.BufferedReader.readLine(BufferedReader.java:299)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.io.BufferedReader.readLine(BufferedReader.java:362)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.ftp.vanilla.Reply.<init>(Reply.java:66)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.ftp.vanilla.FTPControlChannel.read(FTPControlChannel.java:257)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.ftp.extended.GridFTPControlChannel.authenticate(GridFTPControlChannel.java:278)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.ftp.GridFTPClient.authenticate(GridFTPClient.java:99)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.ftp.GridFTPClient.authenticate(GridFTPClient.java:84)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.transfer.reliable.service.TransferClient.authenticateSource(TransferClient.java:538)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.transfer.reliable.service.TransferClient.authenticate(TransferClient.java:527)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.transfer.reliable.service.TransferWork.getNewClient(TransferWork.java:432)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.transfer.reliable.service.TransferWork.getTransferClient(TransferWork.java:369)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.transfer.reliable.service.TransferWork.run(TransferWork.java:692)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.wsrf.impl.work.WorkManagerImpl$WorkWrapper.run(WorkManagerImpl.java:345)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.lang.Thread.run(Thread.java:534)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.lang.Class.newInstance0(Class.java:350)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.lang.Class.newInstance(Class.java:303)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.encoding.ser.BeanDeserializer.<init>(BeanDeserializer.java:90)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.encoding.ser.BeanDeserializer.<init>(BeanDeserializer.java:76)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.exec.generated.StagingFaultType.getDeserializer(StagingFaultType.java:152)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.lang.reflect.Method.invoke(Method.java:585)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.encoding.DeserializationContext.getDeserializerForClass(DeserializationContext.java:510)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.encoding.ser.BeanDeserializer.onStartChild(BeanDeserializer.java:250)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.encoding.DeserializationContext.startElement(DeserializationContext.java:1035)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at javax.xml.parsers.SAXParser.parse(SAXParser.java:375)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.wsrf.encoding.ObjectDeserializer.toObject(ObjectDeserializer.java:59)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at condor.gahp.gt4.JobListener.deliver(JobListener.java:157)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.wsrf.impl.notification.NotificationConsumerProvider.notify(NotificationConsumerProvider.java:109)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at java.lang.reflect.Method.invoke(Method.java:585)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.providers.java.RPCProvider.invokeMethod(RPCProvider.java:384)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.providers.java.RPCProvider.processMessage(RPCProvider.java:281)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:319)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:450)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.apache.axis.server.AxisServer.invoke(AxisServer.java:285)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.wsrf.container.ServiceThread.doPost(ServiceThread.java:665)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.wsrf.container.ServiceThread.process(ServiceThread.java:396)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->  at org.globus.wsrf.container.ServiceThread.run(ServiceThread.java:300)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> stateWhenFailureOccurred: StageIn
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> timestamp: java.util.GregorianCalendar[time=1160487365693,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="GMT",offset=0,dstSavings=0,useDaylight=false,transitions=0,lastRule=null],firstDayOfWeek=2,minimalDaysInFirstWeek=4,ERA=1,YEAR=2006,MONTH=9,WEEK_OF_YEAR=41,WEEK_OF_MONTH=2,DAY_OF_MONTH=10,DAY_OF_YEAR=283,DAY_OF_WEEK=3,DAY_OF_WEEK_IN_MONTH=2,AM_PM=1,HOUR=1,HOUR_OF_DAY=13,MINUTE=36,SECOND=5,MILLISECOND=693,ZONE_OFFSET=0,DST_OFFSET=0]
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Message:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> org.globus.exec.generated.StagingFaultType: Staging error for RSL element fileStageIn.
10/10 14:35:33 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:33 [13000] GAHP[13003] -> 'R'
10/10 14:35:33 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:33 [13000] GAHP[13003] -> '2' 'https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6' 'Failed' 'Staging error for RSL element fileStageIn.' '0'
10/10 14:35:33 [13000] (200.0) gram callback: state Failed, fault Staging error for RSL element fileStageIn., exit code 0
10/10 14:35:33 [13000] (200.0) doEvaluateState called: gmState GM_SUBMITTED, globusState 64
10/10 14:35:33 [13000] (200.0) globus state change: StageIn -> Failed
10/10 14:35:33 [13000] (200.0) gm state change: GM_SUBMITTED -> GM_FAILED
10/10 14:35:33 [13000] GAHP[13003] <- 'GT4_GRAM_JOB_DESTROY 9 https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6'
10/10 14:35:33 [13000] GAHP[13003] -> 'S'
10/10 14:35:33 [13000] (200.0) doEvaluateState called: gmState GM_FAILED, globusState 4
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Cmd 9: gramJob.cancel()
10/10 14:35:34 [13000] GAHP[13003] (stderr) -> Cmd 9: CallbackSing.getAllCallbackSinks()
10/10 14:35:34 [13000] GAHP[13003] (stderr) -> Cmd 9: iter.removeJobListener()
10/10 14:35:34 [13000] in doContactSchedd()
10/10 14:35:34 [13000] querying for removed/held jobs
10/10 14:35:34 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:35:34 [13000] Fetched 0 job ads from schedd
10/10 14:35:34 [13000] Updating classad values for 200.0:
10/10 14:35:34 [13000]    NumGlobusSubmits = 1
10/10 14:35:34 [13000]    GlobusStatus = 4
10/10 14:35:34 [13000] leaving doContactSchedd()
10/10 14:35:34 [13000] (200.0) doEvaluateState called: gmState GM_FAILED, globusState 4
10/10 14:35:34 [13000] GAHP[13003] (stderr) -> Cmd 9: Done
10/10 14:35:34 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:34 [13000] GAHP[13003] -> 'R'
10/10 14:35:34 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:34 [13000] GAHP[13003] -> '9' '0' 'NULL'
10/10 14:35:34 [13000] (200.0) doEvaluateState called: gmState GM_FAILED, globusState 4
10/10 14:35:34 [13000] (200.0) gm state change: GM_FAILED -> GM_HOLD
10/10 14:35:34 [13000] (200.0) Writing hold record to user logfile
10/10 14:35:34 [13000] (200.0) gm state change: GM_HOLD -> GM_DELETE
10/10 14:35:39 [13000] in doContactSchedd()
10/10 14:35:39 [13000] querying for removed/held jobs
10/10 14:35:39 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:35:39 [13000] Fetched 0 job ads from schedd
10/10 14:35:39 [13000] Updating classad values for 200.0:
10/10 14:35:39 [13000]    GlobusDelegationUri = UNDEFINED
10/10 14:35:39 [13000]    GridftpUrlBase = UNDEFINED
10/10 14:35:39 [13000]    GlobusSubmitId = UNDEFINED
10/10 14:35:39 [13000]    GridJobId = UNDEFINED
10/10 14:35:39 [13000]    GlobusStatus = 32
10/10 14:35:39 [13000]    JobStatus = 5
10/10 14:35:39 [13000]    EnteredCurrentStatus = 1160487334
10/10 14:35:39 [13000]    HoldReason = "Globus error: Staging error for RSL element fileStageIn."
10/10 14:35:39 [13000]    HoldReasonCode = 0
10/10 14:35:39 [13000]    HoldReasonSubCode = 0
10/10 14:35:39 [13000]    ReleaseReason = UNDEFINED
10/10 14:35:39 [13000]    NumSystemHolds = 1
10/10 14:35:39 [13000]    Managed = "Schedd"
10/10 14:35:39 [13000] No jobs left, shutting down
10/10 14:35:39 [13000] leaving doContactSchedd()
10/10 14:35:39 [13000] Got SIGTERM. Performing graceful shutdown.
10/10 14:35:39 [13000] Started timer to call main_shutdown_fast in 1800 seconds
10/10 14:35:39 [13000] **** condor_gridmanager (condor_GRIDMANAGER) EXITING WITH STATUS 0
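
Since the stack traces bury the useful lines, here is a minimal sketch of how the log above could be boiled down to the job's state transitions plus the underlying "Caused by" message. It is only an illustration: the function name summarize and the two regular expressions are my own assumptions, based purely on the line layout visible in this excerpt, not on any documented Condor log format.

```python
import re

# Assumed layout, taken from the excerpt above, e.g.:
# 10/10 14:35:33 [13000] (200.0) gm state change: GM_SUBMITTED -> GM_FAILED
STATE_CHANGE = re.compile(
    r"^(?P<stamp>\d+/\d+ \d+:\d+:\d+) \[\d+\] \((?P<job>\d+\.\d+)\) "
    r"(?P<kind>gm|globus) state change: (?P<old>\S+) -> (?P<new>\S+)$"
)
# GAHP stderr lines that carry the underlying Java exception.
CAUSE = re.compile(r"\(stderr\) -> (?P<msg>.*Caused by .*)$")


def summarize(lines):
    """Return (state_changes, causes) found in GridManager log lines.

    Hypothetical helper: state_changes is a list of
    (timestamp, job id, 'gm' or 'globus', old state, new state) tuples,
    causes is a list of the 'Caused by' messages in order of appearance.
    """
    changes, causes = [], []
    for line in lines:
        line = line.rstrip("\n")
        m = STATE_CHANGE.match(line)
        if m:
            changes.append(
                (m["stamp"], m["job"], m["kind"], m["old"], m["new"])
            )
            continue
        m = CAUSE.search(line)
        if m:
            causes.append(m["msg"])
    return changes, causes


# A few lines copied from the log above.
sample = [
    "10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Error authenticating "
    "user at source/dest hostnull. Caused by java.io.EOFException",
    "10/10 14:35:33 [13000] (200.0) globus state change: StageIn -> Failed",
    "10/10 14:35:33 [13000] (200.0) gm state change: GM_SUBMITTED -> GM_FAILED",
    "10/10 14:35:34 [13000] (200.0) gm state change: GM_FAILED -> GM_HOLD",
]
changes, causes = summarize(sample)
for stamp, job, kind, old, new in changes:
    print(stamp, job, kind, old, "->", new)
for msg in causes:
    print("cause:", msg)
```

To run it over a whole log, pass an open file object to summarize instead of the sample list; the location of the GridManager log is site-specific, so no path is assumed here.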

Dr Andrew Walker

Department of Earth Sciences
University of Cambridge
Downing Street
Cambridge 
CB2 3EQ
UK

phone +44 (0)1223 333432