
Re: [HTCondor-users] HTCondor as NoSql database.



Alexander,

Does the main process run on one of the nodes of the parallel universe job? If so, you could use condor_chirp ( https://htcondor.readthedocs.io/en/latest/man-pages/condor_chirp.html ) to push and pull files from a dedicated spool directory for your job ( $_CONDOR_REMOTE_SPOOL_DIR ) that is automatically created on the submit machine:

# get path to chirp
CONDOR_CHIRP="$(condor_config_val libexec)/condor_chirp"

# put this node's information in spool
echo "I'm running on port 8080" | "$CONDOR_CHIRP" put -mode cwa -perm 400 - "$_CONDOR_REMOTE_SPOOL_DIR/this_node_info.$_CONDOR_PROCNO"

# fetch other nodes' information from spool
for node in $(seq 0 $(( _CONDOR_NPROCS - 1 ))); do
    "$CONDOR_CHIRP" fetch "$_CONDOR_REMOTE_SPOOL_DIR/this_node_info.$node" "./node_info.$node"
done
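
One caveat with the loop above: the nodes of a parallel universe job do not necessarily reach this point at the same moment, so a fetch can run before the other node has written its file. A minimal sketch of a retry wrapper follows, assuming condor_chirp exits non-zero when the remote file does not exist yet. fetch_with_retry, MAX_TRIES and SLEEP_SECS are names made up for this example, not part of HTCondor.

```shell
# Hypothetical helper: retry a chirp fetch until the remote file appears.
# CONDOR_CHIRP is expected to be set as in the script above.
# MAX_TRIES and SLEEP_SECS are illustrative knobs, not HTCondor settings.
: "${MAX_TRIES:=30}"
: "${SLEEP_SECS:=2}"

fetch_with_retry() {
    remote=$1
    local_file=$2
    tries=0
    # Assumption: a failed fetch makes condor_chirp return a non-zero status.
    until "$CONDOR_CHIRP" fetch "$remote" "$local_file" 2>/dev/null; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$MAX_TRIES" ]; then
            echo "gave up waiting for $remote" >&2
            return 1
        fi
        sleep "$SLEEP_SECS"
    done
}
```

A worker could then call fetch_with_retry "$_CONDOR_REMOTE_SPOOL_DIR/this_node_info.0" ./node_info.0 to wait for the main process, assuming the main process runs on node 0.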

Jason Patton

On Fri, Aug 9, 2019 at 3:32 AM Alexander Prokhorov <prokher@xxxxxxxxx> wrote:
Dear Colleagues,

Thank you for the responses. The goal Ivan and I are trying to achieve is the following; perhaps you can help us find a proper HTCondor-based solution.

We run a parallel universe job to be sure all processes are running at the same time (to guarantee there are no deadlocks when many such jobs run). Then we need to establish connections between all worker processes and the main one. The difficulty is that all running processes (both main and workers) bind to arbitrary ports, so we need some discovery mechanism to let them find each other. The idea was to publish the main process's endpoint somewhere (that is where we thought of HTCondor as a key-value store) and let the workers request this endpoint and check in with the main process.

Maybe you can advise something here. Thanks in advance.

All the best,
Alexander A. Prokhorov


On 9 Aug 2019, at 00:36, Todd L Miller <tlmiller@xxxxxxxxxxx> wrote:

condor_advertise can write to the COLLECTOR, but the data in the collector is stored only in memory, so this would be a runtime-only key-value store.  You would have to re-advertise everything each time the collector was restarted.
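
As a rough sketch of what that could look like for an endpoint record: condor_advertise takes an update command and a file containing a ClassAd. The helper below only writes such a file. write_endpoint_ad and the MainEndpoint attribute are invented for this example, and it is an assumption that the pool's security configuration lets the job's host advertise to the collector.

```shell
# Hypothetical helper: write a generic ClassAd describing the main process.
# MainEndpoint is an attribute name invented for this sketch.
write_endpoint_ad() {
    ad_file=$1
    name=$2
    endpoint=$3
    cat > "$ad_file" <<EOF
MyType = "Generic"
Name = "$name"
MainEndpoint = "$endpoint"
EOF
}

# Possible usage on a machine with HTCondor installed (not run here):
#   write_endpoint_ad main.ad "job_1234_main" "node17:8080"
#   condor_advertise UPDATE_AD_GENERIC main.ad
#   condor_status -any -constraint 'Name == "job_1234_main"' -af MainEndpoint
```

The ad name and endpoint in the usage comment are placeholders. Since the ad lives only in collector memory, the main process would need to advertise it again after a collector restart.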

The collector can be forced to persist certain ads to disk; see COLLECTOR_PERSISTENT_AD_LOG and/or the absent ads feature.  These features weren't intended for use as a general-purpose NoSQL database, but if you're still interested:

https://htcondor.readthedocs.io/en/latest/admin-manual/configuration-macros.html#index-730
and/or
https://htcondor.readthedocs.io/en/latest/admin-manual/monitoring.html#absent-classads

- ToddM
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
