
Re: [HTCondor-users] Unable to start Abaqus MPI Jobs with HTCondor




On 2/6/21 3:05 PM, felix.koelzow@xxxxxx wrote:


Just as a short answer. The abaqus job within the interactive session is
running fine, i.e.:

Hi Felix:

I don't know anything about Abaqus, but typically when this happens, it means there is an environment variable missing that the job needs. My first guess would be HOME. In your runAbaqus.sh script, can you add a line like

export HOME=$_CONDOR_SCRATCH_DIR

if you are running without a shared filesystem, or the path to the correct home if you are?
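As a minimal sketch of that suggestion, the wrapper could look like the following. The function name set_job_home and the fallback to the existing HOME (or the current directory) are assumptions added for illustration; only the export line and the abq2017 path come from this thread. With a shared filesystem you would set HOME to the real home directory instead.

```shell
#!/usr/bin/bash
# Sketch of runAbaqus.sh with HOME set explicitly before Abaqus starts.
# HTCondor sets _CONDOR_SCRATCH_DIR in the job environment. The fallback
# to the existing HOME or the current directory is an assumption, so the
# script also behaves sensibly when run outside of HTCondor.
set_job_home() {
    export HOME="${_CONDOR_SCRATCH_DIR:-${HOME:-$PWD}}"
}

set_job_home
echo "HOME is now: $HOME" >&2

# Launcher path taken from the original runAbaqus.sh; skipped if absent.
ABQ=/opt/Abaqus/Commands/abq2017
if [ -x "$ABQ" ]; then
    "$ABQ" "$@"
fi
```

The echo goes to stderr, so the value ends up in the job's error file and you can confirm what HOME the job actually saw.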

Thanks,


-greg

$ condor_submit -i
Submitting job(s).
1 job(s) submitted to cluster 80.
Waiting for job to start...
Welcome to slot1@kallisto
You will be logged out after 7200 seconds of inactivity.
$ abq2017 input=test_06_400_1_90 job=test_06_400_1_90.inp cpus=32
user=uamp_x64_v5_KpKonst_amp_extern.f interr
Analysis initiated from SIMULIA established products
Abaqus JOB test_06_400_1_90
Abaqus 3DEXPERIENCE R2017x
Abaqus License Manager checked out the following licenses:
Begin Compiling Abaqus/Standard User Subroutines
Sat 06 Feb 2021 09:53:11 PM CET
End Compiling Abaqus/Standard User Subroutines
Begin Linking Abaqus/Standard User Subroutines
GNU ld version 2.27-44.base.el7
End Linking Abaqus/Standard User Subroutines
Sat 06 Feb 2021 09:53:13 PM CET
Begin Analysis Input File Processor
Sat 06 Feb 2021 09:53:13 PM CET
Run pre
Sat 06 Feb 2021 09:53:23 PM CET
End Analysis Input File Processor
Begin Abaqus/Standard Analysis
Sat 06 Feb 2021 09:53:23 PM CET
Run standard
Sat 06 Feb 2021 09:58:19 PM CET
End Abaqus/Standard Analysis
Begin MFS->SFS and SIM cleanup
Sat 06 Feb 2021 09:58:19 PM CET
Run SMASimUtility
Sat 06 Feb 2021 09:58:19 PM CET
End MFS->SFS and SIM cleanup
Abaqus JOB test_06_400_1_90 COMPLETED

I will spend more time on testing and analysis. This is just a quick
response to provide feedback.

Maybe this result indicates the right direction for debugging.
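One way to test the missing-variable idea would be to dump the environment once in the interactive session and once from the batch job, then compare the variable names. The helper names dump_env and missing_vars below are hypothetical, and the sketch assumes no environment value contains a newline.

```shell
#!/usr/bin/bash
# Write the sorted environment to the file given as first argument.
dump_env() {
    env | sort > "$1"
}

# Print the names of variables present in dump $1 but absent from dump $2.
# Only the part before the first "=" on each line is compared.
missing_vars() {
    cut -d= -f1 "$1" | sort -u > "$1.names"
    cut -d= -f1 "$2" | sort -u > "$2.names"
    comm -23 "$1.names" "$2.names"
}

# Usage (file names are arbitrary):
#   in the interactive session:  dump_env env.interactive.txt
#   at the top of runAbaqus.sh:  dump_env env.batch.txt
#   afterwards:                  missing_vars env.interactive.txt env.batch.txt
```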


Regards,

Felix




On 06/02/2021 00:20, Greg Thain wrote:

If you start an interactive shell via condor with submit -i, and then
run abaqus "by hand", do you get the same error?


-greg

On 2/5/21 4:30 PM, felix.koelzow@xxxxxx wrote:
First of all, I would like to say thank you to everyone who has
contributed or still contributes in any way to HTCondor. I am really
impressed by its capabilities and possibilities.

Currently, I am struggling with HTCondor while running Abaqus jobs. In
short, I am able to run Abaqus jobs on each host if they are started
WITHOUT condor. But if condor is involved, after a short successful
runtime, the Abaqus job terminates with


*** buffer overflow detected ***:
/opt/Abaqus/V6R2017x/linux_a64/code/bin/standard terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x2af27504a607]
/lib64/libc.so.6(+0x116782)[0x2af275048782]
/lib64/libc.so.6(+0x115c8b)[0x2af275047c8b]
/lib64/libc.so.6(_IO_default_xsputn+0xe1)[0x2af274faefa1]
/lib64/libc.so.6(_IO_vfprintf+0x28c5)[0x2af274f7cec5]
/lib64/libc.so.6(__vsprintf_chk+0x88)[0x2af275047d18]
/lib64/libc.so.6(__sprintf_chk+0x7d)[0x2af275047c6d]
/opt/Abaqus/V6R2017x/linux_a64/code/bin/libifcoremt.so.5(fname_from_piped_fd+0xa3)[0x2af273bda993]


...


I am completely out of ideas, and any help is appreciated.


I guess you need more information, so please feel free to ask for more.


OS: CentOS7.9

Condor Version: Stable 8.8.12


Regards,

Felix


--------

submit file:

universe = parallel
executable = runAbaqus.sh
request_cpus = 1
machine_count = 1
requirements = ( machine == "specifichost")

jobname = somename
usersubroutine = uamp.f
arguments = job=$(jobname) input=$(jobname).inp user=$(usersubroutine) cpus=$(request_cpus) inter

output = outputfile
error = errorfile
log = abq2017.log
concurrency_limits = abaqus_tokens:5
getenv = True
should_transfer_files = no
queue

runAbaqus.sh file

cat runAbaqus.sh
#!/usr/bin/bash
/opt/Abaqus/Commands/abq2017 "$@"


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/