
[Condor-users] Error from starter on host: Internal vmgahp server error (with change in Q)



Hi,

We have installed CentOS 5.2 on the machines in our Condor pool. The Condor version is
7.0.3 (the central manager runs Fedora 5).

Before that, we had CentOS 5.1 in the pool, and starting, suspending, etc. of VMs all worked fine.

Now, after the upgrade, the following errors appear in the log files when starting the VM. I am using the same VM image as before.

In the StarterLog file of the executor:
7/23 10:33:14 About to start new VM
7/23 10:33:14 Will send the part of job ClassAd to vmgahp
7/23 10:33:14 About to exec /opt/condor-7.0.3/sbin/condor_vm-gahp -f -M 1
7/23 10:33:14 Env = VMGAHP_WORKING_DIR=/vm/local.grid7/execute/dir_22472 VMGAHP_USER_GID=49527 _CONDOR_SLOT=1 CONDOR_IDS=49527.49527 VMGAHP_VMTYPE=vmware VMGAHP_USER_UID=49527 _CONDOR_SCRATCH_DIR=/vm/local.grid7/execute/dir_22472 VMGAHP_CONFIG=/opt/condor-7.0.3/etc/condor_vmgahp_config.vmware
7/23 10:33:14 Create_Process: using fast clone() to create child process.
7/23 10:33:14 VMGAHP server pid=22475
7/23 10:33:14 Failed to read vmgahp server version
7/23 10:33:14 Inside VM_GAHP_SERVER::cleanup()
7/23 10:33:14 VMGAHP write line(QUIT) Error
7/23 10:33:14 End of VM_GAHP_SERVER::cleanup
7/23 10:33:15 Failed to start vm-gahp server
7/23 10:33:16 Inside VMProc::cleanup()
7/23 10:33:16 Failed to start job, exiting
7/23 10:33:16 ShutdownFast all jobs.
7/23 10:33:16 Got ShutdownFast when no jobs running.
7/23 10:33:16 Removing /vm/local.grid7/execute/dir_22472
7/23 10:33:16 Attempting to remove /vm/local.grid7/execute/dir_22472 as SuperUser (root)
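
To isolate the failure lines from a longer StarterLog like the excerpt above, a small filter script may help. This is only a minimal sketch: the function name `find_failures` and the keyword list are assumptions made for illustration and are not part of Condor. The timestamp pattern is taken from the format seen in this excerpt.

```python
import re

# Timestamp prefix as it appears in the excerpt: "7/23 10:33:14 message"
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2}) (\d{2}:\d{2}:\d{2}) (.*)$")

# Hypothetical keyword list, chosen from the messages seen in this excerpt.
KEYWORDS = ("Failed", "Error", "ERROR")


def find_failures(lines):
    """Return (date, time, message) tuples for log lines that mention a failure."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and any(k in m.group(3) for k in KEYWORDS):
            hits.append(m.groups())
    return hits


if __name__ == "__main__":
    sample = [
        "7/23 10:33:14 VMGAHP server pid=22475",
        "7/23 10:33:14 Failed to read vmgahp server version",
        "7/23 10:33:14 VMGAHP write line(QUIT) Error",
        "7/23 10:33:15 Failed to start vm-gahp server",
    ]
    for date, time, msg in find_failures(sample):
        print(date, time, msg)
```

The same function can be fed the lines of a real log file, for example `find_failures(open(path))`, where `path` is wherever the StarterLog lives on the execute machine.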

In the StartLog file of the executor:
7/23 14:34:48 get_file(): going to write to filename /vm/local.grid7/execute/dir_24214/centos.vmdk
7/23 14:34:48 get_file: Receiving 494731264 bytes
7/23 14:35:17 Got SIGTERM. Performing graceful shutdown.

In the ShadowLog file of the submitter:
7/23 10:32:31 (15.0) (26031): ReliSock::put_file_with_permissions(): going to send permissions 100600
7/23 10:32:31 (15.0) (26031): put_file: going to send from filename /home/idealgrid/Emailcentos/centos.vmdk
7/23 10:32:31 (15.0) (26031): put_file: Found file size 494731264
7/23 10:32:31 (15.0) (26031): put_file: sending 494731264 bytes
7/23 10:32:58 (13.0) (25379): Getting monitoring info for pid 25379
7/23 10:33:13 (15.0) (26031): ReliSock: put_file: sent 494731264 bytes
7/23 10:33:13 (15.0) (26031): DoUpload: exiting at 2357
7/23 10:33:14 (15.0) (26031): Resource grid7.pesgrid.wipro.com changing state from STARTUP to EXECUTING
7/23 10:33:14 (15.0) (26031): scheddname = scorpio.pesgrid.wipro.com
7/23 10:33:14 (15.0) (26031): executeHost = <10.201.42.237:45684>
7/23 10:33:14 (15.0) (26031): start = <10.201.42.237:45684>
7/23 10:33:14 (15.0) (26031): end = :45684>
7/23 10:33:14 (15.0) (26031): tmpaddr = 10.201.42.237
7/23 10:33:14 (15.0) (26031): Executehost name = grid7.pesgrid.wipro.com (hp->h_name)
7/23 10:33:14 (15.0) (26031): Started timer to evaluate periodic user policy expressions every 60 seconds
7/23 10:33:14 (15.0) (26031): QmgrJobUpdater: started timer to update queue every 900 seconds (tid=10)
7/23 10:33:14 (15.0) (26031): Set NumJobStarts to 1
7/23 10:33:16 (15.0) (26031): ERROR "Error from starter on grid7.pesgrid.wipro.com: Internal vmgahp server error" at line 649 in file pseudo_ops.C

In the UserLog file:
001 (015.000.000) 07/23 10:33:14 Job executing on host: <10.201.42.237:45684>
...
007 (015.000.000) 07/23 10:33:16 Shadow exception!
        Error from starter on grid7.pesgrid.wipro.com: Internal vmgahp server error
        0  -  Run Bytes Sent By Job
        494734752  -  Run Bytes Received By Job


To summarize, the key error messages are:

Error from starter on grid7.pesgrid.wipro.com: Internal vmgahp server error
7/23 10:33:14 Failed to read vmgahp server version
7/23 10:33:14 Inside VM_GAHP_SERVER::cleanup()
7/23 10:33:14 VMGAHP write line(QUIT) Error
7/23 10:33:14 End of VM_GAHP_SERVER::cleanup
7/23 10:33:15 Failed to start vm-gahp server
7/23 10:33:16 Inside VMProc::cleanup()
7/23 10:33:16 Failed to start job, exiting


by
Johnson


