Re: [HTCondor-users] terminate called after throwing an instance of 'boost::python::error_already_set
- Date: Thu, 02 Jun 2022 14:51:52 +0530
- From: Vikrant Aggarwal <ervikrant06@xxxxxxxxx>
- Subject: Re: [HTCondor-users] terminate called after throwing an instance of 'boost::python::error_already_set
We have seen this issue in our environment, and my colleague found one way to reproduce it: if a job has already been removed and you try to remove it again, the program aborts. We believe this is not the only cause, though; other factors can also contribute to the abort.
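As a possible workaround for the double-remove case only, the script could ask the schedd whether the cluster is still queued before calling act(). The sketch below is an assumption-laden illustration, not a confirmed fix: remove_if_queued is a hypothetical helper name, the schedd argument is expected to be an htcondor.Schedd, and remove_action is expected to be htcondor.JobAction.Remove. It narrows the window for removing an already-removed job but does not close it, since the job can still leave the queue between the query and the act() call.

```python
def remove_if_queued(schedd, cluster_id, remove_action):
    """Remove a cluster only if the schedd still lists jobs for it.

    schedd        -- assumed to be an htcondor.Schedd (needs query() and act())
    remove_action -- assumed to be htcondor.JobAction.Remove
    Returns True if a remove was requested, False if nothing was queued.
    """
    # JobStatus 3 means "Removed"; leave jobs that are already in that state alone.
    constraint = "ClusterId == %d && JobStatus != 3" % cluster_id

    # The attribute list is passed positionally because the keyword name for
    # this argument has differed between releases of the bindings.
    ads = schedd.query(constraint, ["ClusterId", "ProcId"])
    if not ads:
        # Nothing left to remove, so do not call act() a second time.
        return False

    schedd.act(remove_action, constraint)
    return True
```

In the loop from the original script this would stand in for the direct schedd.act(...) call, e.g. remove_if_queued(schedd, id, htcondor.JobAction.Remove).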
Thanks & Regards,
One more follow-up: are you using the bindings that were installed along with the base htcondor deb package (e.g. via apt), or are you installing the bindings into a virtual or conda environment? If the latter, what version of the bindings are you installing into that environment (i.e. is it a different version than 8.9.11)?
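One way to collect that information is to run something like the following with the same interpreter that runs the failing script. This is only a sketch: describe_bindings is a hypothetical helper name, and it assumes the bindings expose htcondor.version(), which reports the version string of the installed bindings.

```python
import importlib
import sys


def describe_bindings(module_name="htcondor"):
    """Report which interpreter and which copy of the bindings are in use."""
    info = {
        "python": sys.version.split()[0],
        # True inside a venv/virtualenv. Conda environments do not set this,
        # so check bindings_path as well to see where the module lives.
        "in_virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        "bindings_path": None,
        "bindings": None,
    }
    try:
        mod = importlib.import_module(module_name)
    except ImportError:
        # The bindings are not importable from this interpreter.
        return info

    info["bindings_path"] = getattr(mod, "__file__", None)
    version = getattr(mod, "version", None)
    if callable(version):
        info["bindings"] = version()
    return info


print(describe_bindings())
```

A bindings_path under the system dist-packages directory would suggest the apt-installed bindings; a path inside an environment directory would suggest a separately installed copy.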
On 5/31/22 2:24 PM, Larry Martell wrote:
> On Tue, May 31, 2022 at 10:42 AM Cole Bollig via HTCondor-users
> <htcondor-users@xxxxxxxxxxx> wrote:
>> Hello Larry,
>> At the moment we think this issue is deeper than the python layer but could use some more information.
>> What version of condor is this happening on?
> $CondorVersion: 8.9.11 Dec 29 2020 BuildID: Debian-8.9.11-1.2
> PackageID: 8.9.11-1.2 Debian-8.9.11-1.2 $
> $CondorPlatform: X86_64-Ubuntu_20.04 $
>> Where the exception is being thrown?
>> What is the script doing when the exception is thrown?
> We have a python script, PWIL.py that is run using condor. That in
> turn runs another python script, CR.py also using condor. That runs a
> C++ program from within its own process (i.e. there is not another
> condor submission for the C++ run). I see the error in the PWIL log
> along with these errors:
> error from htcondor.Submit.queue
> Failed to abort transaction.
> The program does something like this:
> submit = htcondor.Submit(submit_dict)
> with schedd.transaction() as txn:
>     submit.queue(txn)
> completed_job_ads = schedd.query(constraint="JobStatus == 4",
> completed_jobs = [completed_job['ClusterId'] for completed_job in
> for id in job_ids:
>     if id in completed_jobs:
>         schedd.act(htcondor.JobAction.Remove, 'clusterid==%d' % id)
>> And just to be safe.
>> What version of python are you running?
> Python 3.8.10 (default, Mar 15 2022, 12:22:08)
> [GCC 9.4.0] on linux
>> -Cole Bollig
>> From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Larry Martell <larry.martell@xxxxxxxxx>
>> Sent: Sunday, May 29, 2022 3:55 PM
>> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
>> Subject: [HTCondor-users] terminate called after throwing an instance of 'boost::python::error_already_set
>> I have a script that has literally been running using condor for 10 years.
>> Suddenly, for some runs it crashes with the error:
>> terminate called after throwing an instance of 'boost::python::error_already_set
>> I assume this is coming from condor. Anyone have any thoughts on what could be causing this and/or how I can debug it?
>> HTCondor-users mailing list
>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> The archives can be found at: