
Re: [HTCondor-users] Unable to start Abaqus MPI Jobs with HTCondor



Dear Greg,


Thanks for your immediate response.

I spent some time testing: I switched back from partitionable
slots to static slots and followed your advice.

As a short answer: the Abaqus job runs fine within the interactive
session, i.e.:

$ condor_submit -i
Submitting job(s).
1 job(s) submitted to cluster 80.
Waiting for job to start...
Welcome to slot1@kallisto
You will be logged out after 7200 seconds of inactivity.
$ abq2017 input=test_06_400_1_90 job=test_06_400_1_90.inp cpus=32
user=uamp_x64_v5_KpKonst_amp_extern.f interr
Analysis initiated from SIMULIA established products
Abaqus JOB test_06_400_1_90
Abaqus 3DEXPERIENCE R2017x
Abaqus License Manager checked out the following licenses:
Begin Compiling Abaqus/Standard User Subroutines
Sat 06 Feb 2021 09:53:11 PM CET
End Compiling Abaqus/Standard User Subroutines
Begin Linking Abaqus/Standard User Subroutines
GNU ld version 2.27-44.base.el7
End Linking Abaqus/Standard User Subroutines
Sat 06 Feb 2021 09:53:13 PM CET
Begin Analysis Input File Processor
Sat 06 Feb 2021 09:53:13 PM CET
Run pre
Sat 06 Feb 2021 09:53:23 PM CET
End Analysis Input File Processor
Begin Abaqus/Standard Analysis
Sat 06 Feb 2021 09:53:23 PM CET
Run standard
Sat 06 Feb 2021 09:58:19 PM CET
End Abaqus/Standard Analysis
Begin MFS->SFS and SIM cleanup
Sat 06 Feb 2021 09:58:19 PM CET
Run SMASimUtility
Sat 06 Feb 2021 09:58:19 PM CET
End MFS->SFS and SIM cleanup
Abaqus JOB test_06_400_1_90 COMPLETED

I will spend more time on testing and analysis. This is just a quick
response in order to provide feedback.

Maybe this result indicates the right direction for debugging.
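
One difference between the two runs that could be checked is what the
job's standard streams are connected to. The backtrace in the original
message ends in fname_from_piped_fd inside libifcoremt.so.5, and that
symbol name suggests (this is an assumption, nothing here confirms it)
that the Fortran runtime builds a file name from a file descriptor. The
following is only a minimal sketch; describe_fds is a hypothetical
helper name, and it relies on the Linux /proc filesystem:

```shell
#!/usr/bin/bash
# Hypothetical diagnostic helper (not part of HTCondor or Abaqus):
# report what stdin, stdout and stderr of this shell are connected to.
describe_fds() {
    local fd target
    for fd in 0 1 2; do
        # /proc/<pid>/fd/<n> is a symlink to the terminal, pipe or file
        # behind descriptor n. $$ is used instead of "self" so that the
        # link is resolved for this shell and not for readlink itself.
        target=$(readlink "/proc/$$/fd/$fd" 2>/dev/null || echo "unknown")
        # Print to stderr so the report does not mix with job output.
        printf 'fd %s -> %s (%s characters)\n' "$fd" "$target" "${#target}" >&2
    done
}

describe_fds
```

Calling describe_fds once inside the condor_submit -i session and once
at the top of runAbaqus.sh would show whether the batch job sees a
terminal, a pipe, or a file with a long path where the interactive
session sees something else.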


Regards,

Felix




On 06/02/2021 00:20, Greg Thain wrote:

If you start an interactive shell via Condor with condor_submit -i, and
then run abaqus "by hand", do you get the same error?


-greg

On 2/5/21 4:30 PM, felix.koelzow@xxxxxx wrote:
First of all,

I would like to say thank you to everyone who has contributed or still
contributes in any way to HTCondor. I am really impressed by its
capabilities and possibilities.


Currently, I am struggling with HTCondor while running Abaqus jobs. In
short, I am able to run Abaqus jobs on each host if they are started
WITHOUT Condor. But if Condor is involved, the Abaqus job terminates
after a short successful runtime with


*** buffer overflow detected ***: /opt/Abaqus/V6R2017x/linux_a64/code/bin/standard terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x2af27504a607]
/lib64/libc.so.6(+0x116782)[0x2af275048782]
/lib64/libc.so.6(+0x115c8b)[0x2af275047c8b]
/lib64/libc.so.6(_IO_default_xsputn+0xe1)[0x2af274faefa1]
/lib64/libc.so.6(_IO_vfprintf+0x28c5)[0x2af274f7cec5]
/lib64/libc.so.6(__vsprintf_chk+0x88)[0x2af275047d18]
/lib64/libc.so.6(__sprintf_chk+0x7d)[0x2af275047c6d]
/opt/Abaqus/V6R2017x/linux_a64/code/bin/libifcoremt.so.5(fname_from_piped_fd+0xa3)[0x2af273bda993]


...


I am completely out of ideas, and any help is appreciated.


I guess you need more information, so please feel free to ask for more.


OS: CentOS7.9

Condor Version: Stable 8.8.12


Regards,

Felix


--------

submit file:

universe = parallel
executable = runAbaqus.sh
request_cpus = 1
machine_count = 1
requirements = ( machine == "specifichost")

jobname = somename
usersubroutine = uamp.f
arguments = job=$(jobname) input=$(jobname).inp user=$(usersubroutine) cpus=$(request_cpus) inter

output = outputfile
error = errorfile
log = abq2017.log
concurrency_limits = abaqus_tokens:5
getenv = True
should_transfer_files = no
queue

runAbaqus.sh file

cat runAbaqus.sh
#!/usr/bin/bash
/opt/Abaqus/Commands/abq2017 "$@"
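
Since the submit file sets getenv = True, the job inherits the
environment of the submitting shell, which may still differ from the
environment of a manual run on the execute host. As a rough sketch for
comparing the two, the wrapper could record a sorted snapshot of its
environment before starting Abaqus. The helper name snapshot_env and
the output file name are hypothetical:

```shell
#!/usr/bin/bash
# Hypothetical helper: write a sorted copy of the current environment
# to the given file so that two runs can be compared with diff.
snapshot_env() {
    local outfile=$1
    env | sort > "$outfile"
}

# One file per host and process, written to the current directory.
snapshot_env "env.${HOSTNAME:-unknown}.$$.txt"
```

Running this once by hand on the execute host and once from within
runAbaqus.sh, then running diff on the two files, would list variables
that are set, missing, or different under Condor.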


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/