
[Condor-users] Reply: Re: condor-C to DAGman problem




Hi Matthew,

    Thank you for the pointer. I have already seen that page, but the approach looks too complicated to me, so I decided to use a remote DAGMan to submit the jobs.

Tao



Matthew Farrellee <matt@xxxxxxxxxx>
Sent by: condor-users-bounces@xxxxxxxxxxx

10/02/2009 02:07 PM

Please reply to
Condor-Users Mail List <condor-users@xxxxxxxxxxx>

To
Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Cc
Subject
Re: [Condor-users] condor-C to DAGman problem





Have you seen this? It appears to be a start at answering this question...

http://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=DagManUnderCondorc

Best,


matt

Tao.3.Chen@xxxxxxxxxxxxxxxxxxxxxxxxxxx wrote:
> Hi,
>    I am trying to run DAGMan through Condor-C, that is, to submit a
> DAGMan job to a remote machine. As the Condor team suggested, I modified
> the generated dag_test.dag.condor.sub file and submitted it with
> condor_submit.
>    However, the DAG job ended very quickly without submitting or running
> any jobs. The file dag_test.dag.lib.err is empty, but
> dag_test.dag.dagman.out contains the error below, which confuses me. I
> searched for it on Google but did not find much that helped. Can anyone
> give some suggestions? Thank you very much!
>    PS: I can submit a local DAGMan job and it runs well; the error occurs
> only in remote mode. I can also submit local jobs from the remote
> machine, which means the schedd daemon on the remote machine works.
> The error is as below:
> 9/1 14:25:15 Submitting Condor Node testA job(s)...
> 9/1 14:25:15 submitting: condor_submit -a dag_node_name' '=' 'testA -a +DAGManJobId' '=' '-1 -a DAGManJobId' '=' '-1 -a submit_event_notes' '=' 'DAG' 'Node:' 'testA -a +DAGParentNodeNames' '=' '"" testA.sub
> 9/1 14:25:16 From submit:
> 9/1 14:25:16 From submit: ERROR: Can't find address of local schedd
> 9/1 14:25:16 failed while reading from pipe.
> 9/1 14:25:16 Read so far: ERROR: Can't find address of local schedd
> 9/1 14:25:16 ERROR: submit attempt failed
> 9/1 14:25:16 submit command was: condor_submit -a dag_node_name' '=' 'testA -a +DAGManJobId' '=' '-1 -a DAGManJobId' '=' '-1 -a submit_event_notes' '=' 'DAG' 'Node:' 'testA -a +DAGParentNodeNames' '=' '"" testA.sub
> 9/1 14:25:16 Job submit try 2/6 failed, will try again in >= 2 seconds.
>
> Here are the files I use:
>
> DAG file:
> JOB testA testA.sub
> JOB testB testB.sub
> JOB testC testC.sub
> PARENT testA CHILD testB testC
> PARENT testC CHILD testB
>
> dag.condor.sub file:
> # Filename: dag_test.dag.condor.sub
> # Generated by condor_submit_dag dag_test.dag
> universe        = grid
> grid_resource = condor L50.com L50**.com
> executable      = C:\condor\bin\condor_dagman.exe
> getenv          = True
> output          = dag_test.dag.lib.out
> error           = dag_test.dag.lib.err
> log             = dag_test.dag.dagman.log
> # Note: default on_exit_remove expression:
> # ( ExitSignal =?= 11 || (ExitCode =!= UNDEFINED && ExitCode >=0 && ExitCode <= 2))
> # attempts to ensure that DAGMan is automatically
> # requeued by the schedd if it exits abnormally or
> # is killed (e.g., during a reboot).
> on_exit_remove  = ( ExitSignal =?= 11 || (ExitCode =!= UNDEFINED && ExitCode >=0 && ExitCode <= 2))
> copy_to_spool   = False
> arguments       = -f -l . -Debug 3 -Lockfile dag_test.dag.lock -Condorlog DAGmantest.log.txt -Dag dag_test.dag -Rescue dag_test.dag.rescue
> environment     = _CONDOR_DAGMAN_LOG=dag_test.dag.dagman.out|_CONDOR_MAX_DAGMAN_LOG=0
> should_transfer_files = YES
> when_to_transfer_output = ON_EXIT
> transfer_input_files = DAGmantestA.bat,testA.sub,DAGmantestB.bat,testB.sub,DAGmantestC.bat,testC.sub,dag_test.dag
> queue
>
> Job submit file (only testA is given here):
> Universe = Vanilla
> Executable =DAGmantestA.bat
> GetEnv     = True
> RunAsOwner = True
> Log        = DAGmantest.log.txt
> Error      = DAGmantest.bat.error.txt
> Queue
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Condor-users mailing list
> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/condor-users/
