
Re: [Condor-users] Condor and KVM: cannot connect to qemu:///session




On Tue, 2010-12-14 at 15:59 -0500, Ryan Jansen wrote:
> Seeing as user rjansen didn't have permission to read from his home
> directory in AFS, I changed the permissions to allow access. Now, I
> can run virsh and connect to qemu:///session as rjansen without AFS
> credentials. I'm able to list his machines.
> 
> Condor, however, gives me the same error. Does a vm universe job user
> need anything other than an accessible home directory to start a
> virtual machine under Condor? Any ideas as to what the problem is?
> 
> Also, does the condor_vm-gahp run as root

It needs Condor to be started as superuser, and it will periodically
elevate privileges for certain function calls.

> , but then switch to the job user to create and start up a virtual
> machine? Is it possible to make Condor just connect to qemu:///system
> as root (or even as the user)?

For example, from my condor_config.local:

ALWAYS_VM_UNIV_USE_NOBODY = TRUE
VM_UNIV_NOBODY_USER = tstclair

Then I have a script which starts condor_master with elevated privileges:

sudo env PATH=$PATH CONDOR_CONFIG=$CONDOR_CONFIG condor_master 
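
Before starting the master, it may be worth ruling out the home-directory
problem outside of Condor. The sketch below is only an illustration:
check_libvirt_dirs is a hypothetical helper, not part of Condor or libvirt,
and the directory names are taken from the "Failed to open dir" errors quoted
further down in this thread. It assumes the per-user libvirt state lives under
~/.libvirt, as those errors suggest for your libvirt version.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Condor or libvirt).
# Checks that the per-user libvirt directories named in the
# "Failed to open dir" errors exist and are readable and searchable
# by whoever runs this function.
check_libvirt_dirs() {
    home_dir=$1
    for sub in .libvirt/qemu .libvirt/qemu/networks .libvirt/storage; do
        dir="$home_dir/$sub"
        # A directory must exist, be readable, and be searchable (x bit)
        if [ ! -d "$dir" ] || [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
            echo "not accessible: $dir"
            return 1
        fi
    done
    echo "ok: $home_dir"
}
```

Run it as the user named in VM_UNIV_NOBODY_USER, for example
check_libvirt_dirs "$HOME" after su'ing to that user, and then try
virsh -c qemu:///session list --all from the same shell. If both work there
but the vm-gahp still fails, the problem is more likely in how Condor is
started than in the home directory.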


Hope this helps,
Tim

> 
> Any insight would be very much appreciated.
> 
> Thanks,
> Ryan
> 
> On Mon, Dec 13, 2010 at 3:48 PM, Ryan Jansen <rjansen@xxxxxx> wrote:
>         Tim,
>         
>         I currently have two VMs defined on the host machine,
>         ryankvm01, which is defined under qemu:///system for user
>         rjansen, and ryankvm02, which is defined under qemu:///system
>         for root.
>         
>         Running as root, I can connect to both URIs and run VMs under
>         each:
>         
>         # virsh -c qemu:///system list --all
>          Id Name                 State
>         ----------------------------------
>           - ryankvm02            shut off
>         
>         # virsh -c qemu:///session list --all
>          Id Name                 State
>         ----------------------------------
>           - ryankvm02            shut off
>         
>         
>         Logging in as rjansen and running virsh, I can do the same
>         thing (although it shows ryankvm01 under qemu:///session).
>         
>         However (and I think this may be part of the problem), when I
>         just su into user rjansen from root, I don't have the
>         appropriate AFS tokens to access rjansen's home. As such,
>         libvirt gives me an error:
>         
>         # su rjansen
>         # virsh -c qemu:///session list --all
>         libvir: Network Config error : Failed to open dir
>         '/afs/crc.nd.edu/user/r/rjansen/.libvirt/qemu/networks':
>         Permission denied
>         Failed to open dir
>         '/afs/crc.nd.edu/user/r/rjansen/.libvirt/storage': Permission
>         denied
>         libvir: Domain Config error : Failed to open dir
>         '/afs/crc.nd.edu/user/r/rjansen/.libvirt/qemu': Permission
>         denied
>         libvir: error : could not connect to qemu:///session
>         error: could not connect to qemu:///session
>         error: failed to connect to the hypervisor
>         
>         
>         Following up on the error with rjansen, I also tried creating
>         a local user with a home directory on the host machine,
>         condor_vm. condor_vm can use virsh, but Condor still gives me
>         the same error when I try to run the job with condor_vm set as
>         the nobody user. condor_vm can connect just fine with a simple
>         su.
>         
>         # su condor_vm
>         # virsh -c qemu:///session list --all
>          Id Name                 State
>         ----------------------------------
>         
>         The error with rjansen seems to be on the right track, but I
>         don't understand why the condor_vm user didn't work. Any
>         ideas?
>         
>         Thanks,
>         Ryan
>         
>         
>         
>         
>         On Mon, Dec 13, 2010 at 3:08 PM, Timothy St. Clair
>         <tstclair@xxxxxxxxxx> wrote:
>                 What happens when you try to open via virsh?
>                 
>                 
>                 On Mon, 2010-12-13 at 13:55 -0500, Ryan Jansen wrote:
>                 > Tim,
>                 >
>                 > That's what I suspected at first, but it looks like
>                 the vm-gahp is
>                 > running as root. Here's the vm-gahp log with
>                 D_FULLDEBUG on:
>                 >
>                 > 12/13 13:41:37 Running as root.  Enabling
>                 specialized core dump
>                 > routines
>                 > 12/13 13:41:37 DaemonCore: Command Socket at
>                 <10.32.72.74:9077>
>                 > 12/13 13:41:37 Will use UDP to update collector
>                 cclweb00.cse.nd.edu
>                 > <129.74.152.166:9618>
>                 > 12/13 13:41:37 VMGAHP[916]: VM-GAHP initialized with
>                 run-mode 3
>                 > 12/13 13:41:37 VMGAHP[916]: Initial UID/GUID=0/0,
>                 > EUID/EGUID=126019/1313, Condor UID/GID=108172,40
>                 > 12/13 13:41:37 VMGAHP[916]: Initialize Uids:
>                 caller=root, job
>                 > user=rjansen
>                 > 12/13 13:41:37 VMGAHP[916]: Constructed VMGahp
>                 > 12/13 13:41:37 VMGAHP[916]: Command: COMMANDS
>                 > 12/13 13:41:38 VMGAHP[916]: Command: SUPPORT_VMS
>                 > 12/13 13:41:38 VMGAHP[916]: Execute commands: S xen
>                 kvm vmware
>                 > 12/13 13:41:39 VMGAHP[916]: Command: ASYNC_MODE_ON
>                 > 12/13 13:41:40 VMGAHP[916]: Command: CLASSAD
>                 > 12/13 13:41:43 VMGAHP[916]: Command: CONDOR_VM_START
>                 > 12/13 13:41:43 VMGAHP[916]: Constructed VM_Type.
>                 > 12/13 13:41:43 ERROR "Failed to create libvirt
>                 connection: could not
>                 > connect to qemu:///session" at line 989 in file
>                 xen_type.cpp
>                 >
>                 > Based on the log output, it appears to be running as
>                 root, and it
>                 > knows that the job user is rjansen. Does that look
>                 normal, or do you
>                 > still think it's most likely a permissions problem?
>                 Is there any way
>                 > to get some more useful output from libvirt, maybe
>                 explaining why it
>                 > couldn't connect?
>                 >
>                 > Thanks,
>                 > Ryan
>                 >
>                 >
>                 > On Mon, Dec 13, 2010 at 1:11 PM, Timothy St. Clair
>                 > <tstclair@xxxxxxxxxx> wrote:
>                 >         If you can verify that your libvirtd is
>                 running & qemu+kvm are
>                 >         installed
>                 >         properly (check via virsh command prompt),
>                 then it is likely a
>                 >         permissions issue.  Condor's vm-gahp
>                 requires it be started
>                 >         with
>                 >         elevated priv's(~root) in order to
>                 communicate with the
>                 >         libvirtd.
>                 >
>                 >         Cheers,
>                 >         Tim
>                 >
>                 >
>                 >         On Mon, 2010-12-13 at 12:01 -0500, Ryan
>                 Jansen wrote:
>                 >         > Hi Tim,
>                 >         >
>                 >         > Thanks for the email and sorry for taking
>                 so long to get
>                 >         back to you.
>                 >         >
>                 >         > I'm using libvirt version 0.6.3.
>                 >         >
>                 >         > Ryan
>                 >         >
>                 >         > On Wed, Dec 8, 2010 at 11:13 AM, Timothy
>                 St. Clair
>                 >         > <tstclair@xxxxxxxxxx> wrote:
>                 >         >         what version of libvirt are you
>                 using?
>                 >         >
>                 >         >         Cheers,
>                 >         >         Tim
>                 >         >
>                 >         >
>                 >         >         On Tue, 2010-12-07 at 16:36 -0500,
>                 Ryan Jansen
>                 >         wrote:
>                 >         >         > Hi everyone,
>                 >         >         >
>                 >         >         > I'm having a problem getting
>                 Condor to start up a
>                 >         KVM
>                 >         >         virtual machine
>                 >         >         > in Condor. I posted an email
>                 before, and with
>                 >         advice from a
>                 >         >         few
>                 >         >         > people, I was able to sort out
>                 my KVM problems.
>                 >         But now,
>                 >         >         whenever I
>                 >         >         > run a vm universe job, the
>                 condor_vm-gahp fails
>                 >         with the
>                 >         >         following
>                 >         >         > error:
>                 >         >         >
>                 >         >         > 12/07 16:18:12 ** condor_vm-gahp
>                 (CONDOR_VM_GAHP)
>                 >         STARTING
>                 >         >         UP
>                 >         >         > 12/07 16:18:12
>                 >         >         >
>                 >         >
>                 >
>                 ** /afs/nd.edu/user37/condor/software/versions/amd64-redhat5/condor-7.4.2-dynamic/sbin/condor_vm-gahp
>                 >         >         > 12/07 16:18:12 ** SubsystemInfo:
>                 name=VM_GAHP
>                 >         type=GAHP(9)
>                 >         >         > class=DAEMON(1)
>                 >         >         > 12/07 16:18:12 ** Configuration:
>                 subsystem:VM_GAHP
>                 >         >         local:<NONE>
>                 >         >         > class:DAEMON
>                 >         >         > 12/07 16:18:12 **
>                 $CondorVersion: 7.4.2 Mar 29
>                 >         2010 BuildID:
>                 >         >         227044 $
>                 >         >         > 12/07 16:18:12 **
>                 $CondorPlatform:
>                 >         X86_64-LINUX_RHEL5 $
>                 >         >         > 12/07 16:18:12 ** PID = 13583
>                 >         >         > 12/07 16:18:12 ** Log last
>                 touched 12/7 16:18:10
>                 >         >         > 12/07 16:18:12
>                 >         >
>                 >
>                 ******************************************************
>                 >         >         > 12/07 16:18:12 Using config
>                 >         >         >
>                 source: /afs/nd.edu/user37/condor/condor_config
>                 >         >         > 12/07 16:18:12 Using local
>                 config sources:
>                 >         >         > 12/07
>                 >         >         > 16:18:12
>                 >         >
>                 >
>                  /afs/nd.edu/user37/condor/software/config/machines/dqcneh100.local
>                 >         >         > 12/07 16:18:12 DaemonCore:
>                 Command Socket at
>                 >         >         <10.32.72.74:9118>
>                 >         >         > 12/07 16:18:12 VMGAHP[13583]:
>                 VM-GAHP initialized
>                 >         with
>                 >         >         run-mode 3
>                 >         >         > 12/07 16:18:12 VMGAHP[13583]:
>                 Initial
>                 >         UID/GUID=0/0,
>                 >         >         > EUID/EGUID=126019/1313, Condor
>                 UID/GID=108172,40
>                 >         >         > 12/07 16:18:12 VMGAHP[13583]:
>                 Initialize Uids:
>                 >         caller=root,
>                 >         >         job
>                 >         >         > user=rjansen
>                 >         >         > 12/07 16:18:18 ERROR "Failed to
>                 create libvirt
>                 >         connection:
>                 >         >         could not
>                 >         >         > connect to qemu:///session" at
>                 line 989 in file
>                 >         xen_type.cpp
>                 >         >         >
>                 >         >         > Now, I have
>                 adjusted /etc/libvirt/libvirt.conf to
>                 >         allow the
>                 >         >         libvirt
>                 >         >         > group to access the libvirt rw
>                 socket, and I added
>                 >         the users
>                 >         >         root,
>                 >         >         > rjansen, and condor to that
>                 group.
>                 >         >         >
>                 >         >         > Additionally, I can connect just
>                 fine (as root and
>                 >         rjansen)
>                 >         >         to
>                 >         >         > qemu:///session, through virsh,
>                 and through the
>                 >         libvirt C
>                 >         >         library
>                 >         >         > using example code from the qemu
>                 website. In fact,
>                 >         the code
>                 >         >         I use to
>                 >         >         > connect to the library in the
>                 example program is
>                 >         essentially
>                 >         >         the same
>                 >         >         > as the code on line 989 in
>                 xen_type.cpp, which is
>                 >         failing.
>                 >         >         >
>                 >         >         > I'm not sure if I'm doing
>                 something wrong with
>                 >         Condor or
>                 >         >         something
>                 >         >         > wrong with KVM/libvirt, but I'd
>                 like to get this
>                 >         working.
>                 >         >         >
>                 >         >         > Does anyone have any ideas on
>                 how to fix this
>                 >         problem?
>                 >         >         >
>                 >         >         > Thanks,
>                 >         >         > Ryan
>                 >         >
>                 >         >         >
>                 _______________________________________________
>                 >         >         > Condor-users mailing list
>                 >         >         > To unsubscribe, send a message
>                 to
>                 >         >         condor-users-request@xxxxxxxxxxx
>                 with a
>                 >         >         > subject: Unsubscribe
>                 >         >         > You can also unsubscribe by
>                 visiting
>                 >         >         >
>                 >
>                 https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>                 >         >         >
>                 >         >         > The archives can be found at:
>                 >         >         >
>                 https://lists.cs.wisc.edu/archive/condor-users/
>                 >         >
>                 >         >
>                 >         >
>                 >
>                 >
>                 >
>                 
>                 
>         
>         
>