
Re: [Condor-users] running VMware VIX job with Condor



Hi Arindam:

Maybe I misunderstood your original intentions: are you trying to run a VM as a job, or a job that starts a VM?  If it is the former, then you may want to look at the VM universe:

http://www.cs.wisc.edu/condor/manual/v7.6/2_11Virtual_Machine.html
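
For the former case, a VM universe submit description might look roughly like the sketch below. It is modeled on the VMware example in that manual section. The job label, log name and memory size are placeholders I made up, and the directory is simply the one holding the .vmx file named in the quoted test.c. It also assumes the execute machine has been configured for the VM universe, which I have not checked.

```
# Sketch only: label, log name and memory value are assumptions, not tested settings
universe                     = vm
# in the vm universe, executable is only a label shown by condor_q
executable                   = fedora_vm
log                          = fedora_vm.log
vm_type                      = vmware
# memory for the VM in MB (placeholder value)
vm_memory                    = 512
# directory holding the .vmx and .vmdk files
vmware_dir                   = /home/condor/vmware/Fedora
# false: use the VM files in place instead of transferring them
vmware_should_transfer_files = false
queue
```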

If it is the latter, then from the output file, it looks as if it has finished.  What does the log for the job say?

Regards,
-B

On 2011-06-21, at 2:31 PM, Arindam Choudhury wrote:

> Hi,
> with
> should_transfer_files = YES
> when_to_transfer_output = ON_EXIT
> 
> the job executes correctly and the output files are also transferred:
> 
> [condor@aow condor-job]$ cat simple2.out
> 2011-06-21 22:16:05  2124 connector "vmlocal" booted
> About to find running virtual machines
> Listing running virtual machines
> number of running virtual machine: 0
> about to open
> opened (35651647)
> powering on
> powered on
> [condor@aow condor-job]$ cat simple2.error
> 2011-06-21 22:16:05  2120 no printer configured or none available
> 2011-06-21 22:16:05  2122 no spooler system, adaptor daemon aborted
> 2011-06-21 22:16:05  2123 no locale info available
> 2011-06-21 22:16:16  2127 connector "vmlocal" socket line desynced, resyncing...
> 2011-06-21 22:17:16  2127 connector "vmlocal" socket line resynced due to timeout condition
> [condor@aow condor-job]$
> 
> Then why is the job not finishing? I tried adding return 0 and exit(0), but that did not help. Also, when I try to remove these jobs, they go into the X state and are only removed after a long time.
> 
> -Arindam
> 
> -----Original Message----- From: Arindam Choudhury
> Sent: Tuesday, June 21, 2011 8:54 PM
> To: condor users
> Subject: Re: [Condor-users] running VMware VIX job with Condor
> 
> Hi,
> 
> I changed VIX_VMPOWEROP_LAUNCH_GUI to VIX_VMPOWEROP_NORMAL, and now the VM
> starts.
> 
> But when I am using:
> should_transfer_files = YES
> when_to_transfer_output = ON_EXIT
> 
> the job runs indefinitely. I tried adding a 5-minute sleep() after
> VixVM_PowerOn(), but the result is still the same.
> 
> Thanks,
> -Arindam
> 
> -----Original Message----- From: Burnett, Ben
> Sent: Tuesday, June 21, 2011 7:16 PM
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] running VMware VIX job with Condor
> 
> Hi Arindam:
> 
> It might be that the script is failing because you are asking it to launch
> the VM GUI.  You might try starting the VM without the GUI.
> 
> Regards,
> -B
> 
> On 2011-06-21, at 10:14 AM, Arindam Choudhury wrote:
> 
>> 
>> Hi,
>> 
>> I have a VMware VIX program, test.c, on the execution node. It looks for running VMs and, if none is running, starts a Fedora VM.
>> 
>> test.c:
>> 
>> #include <stdio.h>
>> #include <stdlib.h>
>> 
>> 
>> #include "vix.h"
>> 
>> #define CONNTYPE VIX_SERVICEPROVIDER_VMWARE_PLAYER
>> 
>> #define HOSTNAME NULL
>> 
>> #define HOSTPORT 0
>> #define USERNAME NULL
>> #define PASSWORD NULL
>> 
>> #define VM_NUMBER 4
>> 
>> int vmHandleIndex = 0;
>> int runningVMCount = 0;
>> VixHandle   hostHandle = VIX_INVALID_HANDLE;
>> 
>> char *runningVM[VM_NUMBER];
>> 
>> static void find_runningVM(VixHandle jobHandle,VixEventType ev,VixHandle moreEvInfo,void *cd)
>> {
>>   VixError err = VIX_OK;
>>   char *loc = NULL;
>> 
>>   if (VIX_EVENTTYPE_FIND_ITEM != ev) {
>>       return;
>>   }
>> 
>>   if (runningVMCount < VM_NUMBER) {
>>       err = Vix_GetProperties(moreEvInfo,VIX_PROPERTY_FOUND_ITEM_LOCATION,&loc,VIX_PROPERTY_NONE);
>>       if (VIX_SUCCEEDED(err)) {
>>           runningVM[runningVMCount] = loc;
>>           runningVMCount++;
>>       } else {
>>           fprintf(stderr,"GetProperties failed (%s)\n",Vix_GetErrorText(err, NULL));
>>       }
>>   } else {
>>       fprintf(stderr, "Warning: found too many virtual machines!\n");
>>   }
>> }
>> 
>> int main(int argc, char **argv)
>> {
>>   VixError    err;
>>   VixHandle   jobHandle = VIX_INVALID_HANDLE;
>>   VixHandle   vmHandle = VIX_INVALID_HANDLE;
>>   int i;
>> 
>>   jobHandle = VixHost_Connect(VIX_API_VERSION,CONNTYPE,HOSTNAME,HOSTPORT,USERNAME,PASSWORD,0,VIX_INVALID_HANDLE,NULL,NULL);
>>   err = VixJob_Wait(jobHandle,VIX_PROPERTY_JOB_RESULT_HANDLE,&hostHandle,VIX_PROPERTY_NONE);
>>   Vix_ReleaseHandle(jobHandle);
>>   if (VIX_FAILED(err)) {
>>       fprintf(stderr,"Failed to connect to host (%s)\n",Vix_GetErrorText(err, NULL));
>>       goto abort;
>>   }
>> 
>>   printf("About to find running virtual machines\n");
>> 
>>   jobHandle = VixHost_FindItems(hostHandle,VIX_FIND_RUNNING_VMS,VIX_INVALID_HANDLE, -1,find_runningVM,NULL);
>> 
>>   err = VixJob_Wait(jobHandle, VIX_PROPERTY_NONE);
>>   Vix_ReleaseHandle(jobHandle);
>>   if (VIX_FAILED(err)) {
>>       fprintf(stderr,"FindItems failed (%s)\n",Vix_GetErrorText(err, NULL));
>>       goto abort;
>>   }
>>   printf("Listing running virtual machines\n");
>> 
>>   printf("number of running virtual machine: %d\n", runningVMCount);
>>   for(i = 0; i < runningVMCount; ++i) {
>>       printf("%s\n",runningVM[i]);
>>   }
>> 
>>   if(runningVMCount > 0)
>>   {
>>       printf("can not start more virtual machine\n");
>>       goto abort;
>>   } else {
>> 
>>       printf("about to open \n");
>>       jobHandle = VixVM_Open(hostHandle, "/home/condor/vmware/Fedora/Fedora.vmx",NULL,NULL);
>>       err = VixJob_Wait(jobHandle, VIX_PROPERTY_JOB_RESULT_HANDLE, &vmHandle, VIX_PROPERTY_NONE);
>>       Vix_ReleaseHandle(jobHandle);
>>       if (VIX_FAILED(err)) {
>>           fprintf(stderr, "failed to open virtual machine (%"FMT64"d %s)\n", err,
>>                   Vix_GetErrorText(err, NULL));
>>           goto abort;
>>       }
>>       printf("opened (%d)\n", vmHandle);
>> 
>>       printf("powering on\n");
>>       jobHandle = VixVM_PowerOn(vmHandle,VIX_VMPOWEROP_LAUNCH_GUI,VIX_INVALID_HANDLE,NULL,NULL);
>>       err = VixJob_Wait(jobHandle, VIX_PROPERTY_NONE);
>>       Vix_ReleaseHandle(jobHandle);
>>       if (VIX_FAILED(err)) {
>>           fprintf(stderr, "failed to power on virtual machine (%"FMT64"d %s)\n", err,
>>                   Vix_GetErrorText(err, NULL));
>>           goto abort;
>>       }
>>       printf("powered on\n");
>>   }
>> 
>>   abort:
>>       /* Release the VM handle (if one was opened) and disconnect from the host. */
>>       Vix_ReleaseHandle(vmHandle);
>>       VixHost_Disconnect(hostHandle);
>>       return 0;
>> }
>> 
>> To compile, I use a makefile:
>> 
>> WRAPPER = -lvixAllProducts -ldl
>> SERVER11 = /usr/lib/vmware-vix/lib/server-1/32bit/libvix.so
>> WORKST60 = /usr/lib/vmware-vix/lib/ws-3/32bit/libvix.so
>> SERVER20 = /usr/lib/vmware-vix/lib/VIServer-2.0.0/32bit/libvix.so
>> WORKST65 = /usr/lib/vmware-vix/lib/Workstation-6.5.0/32bit/libvix.so
>> WRAPORNOT = $(WRAPPER)
>> VIXH = -I/usr/include/vmware-vix
>> 
>> all:test
>> 
>> test: test.c
>>   gcc $(VIXH) test.c -o test $(WRAPORNOT)
>> clean:
>>   rm -f test
>> 
>> When run in a terminal, it gives the following output and starts the VM:
>> 
>> [condor@aopcach experiment]$ ./test
>> About to find running virtual machines
>> Listing running virtual machines
>> number of running virtual machine: 0
>> about to open
>> opened (34603069)
>> powering on
>> powered on
>> [condor@aopcach experiment]$
>> 
>> When executed with Condor using the following submit file:
>> 
>> Universe   = vanilla
>> transfer_executable = false
>> Executable = /home/condor/experiment/test
>> Log        = simple2.log
>> Output     = simple2.out
>> Error      = simple2.error
>> Requirements = (Machine == "aopcach.uab.es")
>> run_as_owner = true
>> copy_to_spool = false
>> Queue
>> 
>> it gives the following output without starting the VM. Here the submit node and the execute node are the same machine:
>> 
>> [condor@aopcach ~]$ cat simple2.out
>> About to find running virtual machines
>> Listing running virtual machines
>> number of running virtual machine: 0
>> about to open
>> opened (35651645)
>> powering on
>> [condor@aopcach ~]$
>> 
>> If I use the following submit file, the job keeps running indefinitely:
>> 
>> Universe   = vanilla
>> transfer_executable = false
>> Executable = /home/condor/experiment/test
>> Log        = simple2.log
>> Output     = simple2.out
>> Error      = simple2.error
>> Requirements = (Machine == "aopcach.uab.es")
>> run_as_owner = true
>> copy_to_spool = false
>> should_transfer_files = YES
>> when_to_transfer_output = ON_EXIT
>> Queue
>> 
>> With a simple C program, everything runs fine. But with VIX, I am running into these problems.
>> 
>> -Arindam
>> 
>> 
>>     _______________________________________________
>> Condor-users mailing list
>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>> 
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/condor-users/
> 
> --
> Ben Burnett
> Optimization Research Group
> Department of Math & Computer Science
> University of Lethbridge
> http://optimization.cs.uleth.ca
> 
> "Everyone is entitled to their opinion; you're not entitled to your own
> fact."
> - Michael Specter
> 
> 
> 

--
Ben Burnett
Optimization Research Group
Department of Math & Computer Science
University of Lethbridge
http://optimization.cs.uleth.ca

"Everyone is entitled to their opinion; you're not entitled to your own fact."
- Michael Specter