
Re: [HTCondor-users] my_popenv error



I tracked down what was causing this and wanted to post it here in case someone else runs into the same problem.


I tried to use strace to see what was going on, but it was less than helpful.  I then tried
valgrind like so:  

valgrind -v condor_submit_dag

This gives a lot of output.  Buried in there I found the following:

==893403==    at 0x6D76270: __close_nocancel (syscall-template.S:81)
==893403==    by 0x5475DB0: ??? (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403==    by 0x5476022: my_popen(ArgList&, char const*, int, Env*, bool, char const*) (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403==    by 0x545C3AC: Copy_macro_source_into (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403==    by 0x54627FC: Parse_macros (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403==    by 0x535DBF9: process_config_source (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403==    by 0x5363C7A: real_config (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403==    by 0x5364603: config_ex (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403==    by 0x40463E: main (in /usr/bin/condor_submit_dag)


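I downloaded the source for 8.5.7 and followed the call chain in the trace: my_popen is called while the configuration is being parsed (config_ex down through Copy_macro_source_into), which pointed to the /etc/condor/condor_config file as the issue.  I took a look at that file and found
the following: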

##
## If you've installed the condor-ec2 package, this will set TCP_FORWARDING_HOST
## to the instance's public IP and cause the startd to advertise that IP and
## the instance ID.  It will also fetch and install additional config.d files
## if the instance's IAM profile is configured correctly (pointing to a single
## specific file in S3); see the manual for condor_annex for details.
##
include ifexist command into $(LOCAL_CONFIG_DIR)/49ec2-instance.config : \
        /etc/condor/config.d/49ec2-instance.sh

I commented out these lines, since we don't have the condor-ec2 package installed on our system, and the error message
went away.
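
If you want to check your own config for the same situation, something like the following could help. It is only a rough sketch in Python: it scans config text for "include ... command" lines and reports any command whose executable does not exist on disk. The regular expression is an assumption based on the snippet above rather than the full HTCondor config grammar, the function names are made up for illustration, and it does not use any HTCondor API.

```python
import os
import re
import shlex

# Matches lines such as
#   include ifexist command into <cache-file> : <command line>
# Assumption: modelled on the snippet above, not on the full config grammar.
INCLUDE_CMD = re.compile(
    r"^\s*include\s+(?:ifexist\s+)?command\b[^:]*:\s*(?P<cmd>\S.*)$",
    re.IGNORECASE,
)


def join_continuations(text):
    """Merge each line ending in a backslash with the line that follows it."""
    logical, pending = [], ""
    for raw in text.splitlines():
        stripped = raw.rstrip()
        if stripped.endswith("\\"):
            pending += stripped[:-1] + " "
        else:
            logical.append(pending + raw)
            pending = ""
    if pending:
        logical.append(pending)
    return logical


def missing_include_commands(text, exists=os.path.exists):
    """Return executables named by 'include command' lines that do not exist."""
    missing = []
    for line in join_continuations(text):
        if line.lstrip().startswith("#"):
            continue  # comment line
        match = INCLUDE_CMD.match(line)
        if not match:
            continue
        program = shlex.split(match.group("cmd"))[0]
        if not exists(program):
            missing.append(program)
    return missing


# The include statement from the condor_config discussed above.
SAMPLE = """\
include ifexist command into $(LOCAL_CONFIG_DIR)/49ec2-instance.config : \\
        /etc/condor/config.d/49ec2-instance.sh
"""

if __name__ == "__main__":
    for program in missing_include_commands(SAMPLE):
        print("missing:", program)
```

To run it against a real file, read /etc/condor/condor_config into a string and pass that in place of SAMPLE. The "exists" parameter is only there so the check can be exercised without touching the filesystem.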


Hope this helps,

Steve 



On Nov 14, 2016, at 1:28 PM, Pietrowicz, Stephen R <srp@xxxxxxxxxxxx> wrote:

Hi,

I'm seeing a weird error that I can't quite figure out.

I executed the following commands:

bash-4.2$ cat simple.dag
JOB Simple srp.submit
bash-4.2$ cat srp.submit
executable = /usr/bin/hostname
universe = vanilla
input = /home/srp/short.input
output = test.out
error = test.error
log = test.log

queue
bash-4.2$ condor_submit_dag -force simple.dag

my_popenv: Failed to exec in child, errno=2 (No such file or directory)
Renaming rescue DAGs newer than number 0
-----------------------------------------------------------------------
File for submitting this DAG to HTCondor           : simple.dag.condor.sub
Log of DAGMan debugging messages                 : simple.dag.dagman.out
Log of HTCondor library output                     : simple.dag.lib.out
Log of HTCondor library error messages             : simple.dag.lib.err
Log of the life of condor_dagman itself          : simple.dag.dagman.log

Submitting job(s).
1 job(s) submitted to cluster 297.
-----------------------------------------------------------------------
bash-4.2$ 


Note the "my_popenv: Failed to exec in child, errno=2 (No such file or directory)"


This is under HTCondor version 8.5.7

Any ideas why this is happening?  I tried this under 8.4.9, and didn't see this error.

Steve