
Re: [HTCondor-users] Shadow Exception: Create_Process failed to register the job with the ProcD



Hi,

I am sorry for the bounced emails. I didn't know @qq.com would cause such trouble and I've changed the email address.

I first hit the problem with version 8.6.9. After switching to 8.4.9, the problem disappears. I build my image based on agaveapi/htcondor.

The Docker image uses CentOS 6.9. HTCondor is installed from RPM, as that image's Dockerfile shows, and runs as root.
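
In case it helps with reproducing, here is a rough Python sketch (not part of HTCondor) for pulling the ProcD registration failures out of a shadow log. The function name `find_shadow_errors` and the regular expression are assumptions based only on the log lines quoted further down in this thread; other HTCondor versions or log settings may format lines differently. The sample line is unwrapped, and its source path is shortened for the example.

```python
import re

# Assumed line shape, taken from the shadow log excerpt in this thread:
#   MM/DD/YY HH:MM:SS (cluster.proc) (pid): ERROR "Error from <slot>: <message>" at line ...
ERROR_RE = re.compile(
    r'^(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) '
    r'\((?P<job>\d+\.\d+)\) \(\d+\): '
    r'ERROR "Error from (?P<slot>[^:]+): (?P<msg>[^"]*)"'
)


def find_shadow_errors(text):
    """Return a list of (job, slot, message) tuples for ERROR lines in the log text."""
    results = []
    for line in text.splitlines():
        match = ERROR_RE.match(line)
        if match:
            results.append((match.group("job"), match.group("slot"), match.group("msg")))
    return results


# Two lines modeled on the excerpt in this thread (path shortened for the example).
sample = (
    '03/09/18 13:54:58 (6.0) (132964): File transfer completed successfully.\n'
    '03/09/18 13:54:59 (6.0) (132964): ERROR "Error from slot1@ddfb828b5e4d: '
    'Create_Process failed to register the job with the ProcD" '
    'at line 608 in file pseudo_ops.cpp\n'
)

for job, slot, msg in find_shadow_errors(sample):
    print(job, slot, msg)
```

To run it against a real log, read the file contents and pass them to `find_shadow_errors`; where the shadow log lives depends on the local configuration, so the path is left out here.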


------------------ Original ------------------
From: "Todd Tannenbaum"<tannenba@xxxxxxxxxxx>;
Date: Tue, Mar 13, 2018 00:24 AM
To: "Alan"<852016362@xxxxxx>;
Cc: "Greg Thain"<gthain@xxxxxxxxxxx>;
Subject: Re: [HTCondor-users] Shadow Exception: Create_Process failed to register the job with the ProcD

Hi Alan,

For whatever reason, many (hundreds!) of people on htcondor-users never
were able to see your question below because many email servers
apparently do not trust messages with a From line including qq.com when
the message originated someplace else (such as from the UW-Madison email
listserv). For example, every person on htcondor-users with a gmail
email address of xxx@xxxxxxxxx never saw your question because Google
refused to deliver your message. Perhaps you could register to
htcondor-users with an email address other than @qq.com?
Please don't blame me, I don't make all the spam detection rules, I just
see the hundreds of bounced emails whenever you post. :)

Meanwhile on the below, we have someone looking at starting HTCondor in
a docker container in order to reproduce the below issue. Could you
tell us what version of HTCondor you are trying to use inside the
container? What distro are you using inside the container? Did you
install HTCondor inside the container using RPM, DEB, or tarball? Are
you running HTCondor inside the container as root?

Thanks
Todd

On 3/9/2018 8:30 AM, Alan wrote:
> Hi,
>
> I install and configure HTCondor in a docker container. I submit a
> simple sleep.sub file as Quick Start shows but I get the log file as
> follows.
>
> 000 (007.000.000) 03/09 13:48:31 Job submitted from host:
> <172.17.0.2:9618?addrs=172.17.0.2-9618+[--1]-9618&noUDP&sock=46415_b8d0_4>
> ...
> 001 (007.000.000) 03/09 13:48:38 Job executing on host:
> <172.17.0.2:9618?addrs=172.17.0.2-9618+[--1]-9618&noUDP&sock=46415_b8d0_6>
> ...
> 007 (007.000.000) 03/09 13:48:38 Shadow exception!
>     Error from slot2@ddfb828b5e4d: Create_Process failed to
> register the job with the ProcD
>     0  -  Run Bytes Sent By Job
>     114  -  Run Bytes Received By Job
>
> =====================================
> The content of shadow_log file is as follows.
>
> 03/09/18 13:54:55 Daemon Log is logging: D_ALWAYS D_ERROR
> 03/09/18 13:54:55 SharedPortEndpoint: waiting for connections to named
> socket 46465_8bbb_383
> 03/09/18 13:54:55 DaemonCore: command socket at
> <172.17.0.2:9618?addrs=172.17.0.2-9618+[--1]-9618&noUDP&sock=46465_8bbb_383>
> 03/09/18 13:54:55 DaemonCore: private command socket at
> <172.17.0.2:9618?addrs=172.17.0.2-9618+[--1]-9618&noUDP&sock=46465_8bbb_383>
> 03/09/18 13:54:55 Initializing a VANILLA shadow for job 7.0
> 03/09/18 13:54:55 (7.0) (132973): Request to run on slot2@ddfb828b5e4d
> <172.17.0.2:9618?addrs=172.17.0.2-9618+[--1]-9618&noUDP&sock=46415_b8d0_6>
> was ACCEPTED
> 03/09/18 13:54:58 (6.0) (132964): File transfer completed successfully.
> 03/09/18 13:54:59 (6.0) (132964): ERROR "Error from slot1@ddfb828b5e4d:
> Create_Process failed to register the job with the ProcD" at line 608 in
> file
> /slots/01/dir_317056/userdir/.tmpj8HirB/BUILD/condor-8.6.9/src/condor_shadow.V6.1/pseudo_ops.cpp
> 03/09/18 13:55:00 (7.0) (132973): File transfer completed successfully.
> 03/09/18 13:55:01 (7.0) (132973): ERROR "Error from slot2@ddfb828b5e4d:
> Create_Process failed to register the job with the ProcD" at line 608 in
> file
> /slots/01/dir_317056/userdir/.tmpj8HirB/BUILD/condor-8.6.9/src/condor_shadow.V6.1/pseudo_ops.cpp
>
> =====================================
>
> If I set USE_PROCD = false in the configuration file, the job finishes
> successfully.
> I wonder if it is OK to do so, or if there is a better way to solve this.
>
>
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/
>


--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685