
Re: [Condor-users] HoldReason = "Streaming not supported"



Jaime Frey wrote:

>On Jun 27, 2008, at 12:36 PM, Sean Manning wrote:
>
>> Jaime Frey wrote:
>>
>>> On Jun 23, 2008, at 6:50 PM, Sean Manning wrote:
>>>
>>>> I am working on a Web Services interface to submit jobs to our Globus
>>>> grid.  It uses the condor and birdbath Java packages.  We can
>>>> successfully submit the attached JDL on the command line of a condor
>>>> head node (the metascheduler of our grid) and see it complete, but
>>>> when we submit it with the Java program from an external Condor client
>>>> machine the job stays Idle then Halts with an error.  Running the
>>>> condor daemons as root got rid of one error, but now we get another
>>>> one: HoldReason = "Streaming not supported".  I can't find any
>>>> information about this error in the usergroup archives.  Does anyone
>>>> here have an idea what could be causing this?
>>>
>>> For GT4 GRAM jobs, if StreamOut and StreamErr aren't explicitly set to
>>> False in the job ad, then Condor assumes you want stdout and stderr to
>>> be streamed, which isn't supported by Condor for GT4 GRAM jobs. This
>>> appears to be a bug, as the default behavior for other job types is no
>>> streaming.
>>>
>>> If you add the following two attributes to your job ads, it should
>>> eliminate the problem:
>>> StreamOut = False
>>> StreamErr = False
>>>
>>> Thanks and regards,
>>> Jaime Frey
>>> UW-Madison Condor Team
>>>
>>
>> Dear Jaime,
>>
>>  Thanks for the reply.
>>
>>  I made that change, but jobs are still hanging with HoldReason =
>> "Streaming not supported."  I can submit the new file with
>> condor_submit from the grid metascheduler and see it appear on the head
>> node of a worker cluster, when condor_config has SOAP enabled.  The
>> output and error come back to the machine I submitted the job from just
>> like they are supposed to.  But when I submit the same JDL to the grid
>> metascheduler using our Web Services code, the job always holds after a
>> delay.
>>
>>  <snip>
>>
>>  In principle, if we can submit a job to the grid using condor_submit,
>> then the web services submission should work as well.  I would be very
>> grateful if you have any further advice about what I am missing.
>>
>
>
>Can you look at the values of StreamOut and StreamErr in the classad
>of the held job in the schedd? I'm guessing they're either missing or
>set to the string "False". They need to be set to False (no quotes). I'll
>bet your JobHelper class isn't handling these attributes correctly.
>
>Thanks and regards,
>Jaime Frey
>UW-Madison Condor Team
>

Dear Jaime,

  Thanks again for your help.  I think that part of my problem is 
definitely related to how I am parsing the JDL file in my class 
JobHelper.java.

  I discovered that all the attributes of the JDL were being 
interpreted as Strings, for reasons I will explain below.  Early on, I 
discovered that if I took a JDL which had worked with command-line 
submission and submitted it to the Web Services interface, I got a 
java.text.ParseException from the parser here (lines 66 and 67 of 
JobHelper.java):

Ad jobad = new Ad();	// This is an org.glite.jdl.Ad.  That's all I know about it.
jobad.fromFile(file);

with messages like this:

Unable to parse: Doesn't seem to be a valid Expression
  at org.glite.jdl.Ad.fromString(Ad.java:497)
  at org.glite.jdl.Ad.fromFile(Ad.java:433)
  at birdbath.JobHelper.getJobAttrFromJDL(JobHelper.java:67)
 <snip>

To avoid these errors, I followed my predecessor in making some changes 
to the JDL:

Terminate every line with a semicolon.
Wrap quotes around every value,
  e.g. foo = bar becomes foo = "bar"; and
  out = out.$(Cluster).$(Process) becomes out = "$(Cluster).$(Process)";
Add a line InputSandbox = {*} where * is the full path to the JDL, a 
  comma, and a full path to the executable.
Change the variable StreamOutput to StreamOut, and StreamError to 
  StreamErr.
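
The rewriting rules above can be sketched in code.  This is only an 
illustration under assumptions: the class JdlLineRewriter and its method 
toJdlLine are hypothetical names, not part of the condor, birdbath, or 
glite packages, and the sketch handles only simple name = value lines (it 
does not generate the InputSandbox line).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class JdlLineRewriter {
    // Attribute renames taken from the list above.
    private static final Map<String, String> RENAMES = new LinkedHashMap<>();
    static {
        RENAMES.put("StreamOutput", "StreamOut");
        RENAMES.put("StreamError", "StreamErr");
    }

    /** Rewrites one "name = value" line into the quoted, semicolon-terminated form. */
    static String toJdlLine(String line) {
        int eq = line.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("Not a name = value line: " + line);
        }
        String name = line.substring(0, eq).trim();
        String value = line.substring(eq + 1).trim();
        // Drop an existing trailing semicolon so it is not doubled.
        if (value.endsWith(";")) {
            value = value.substring(0, value.length() - 1).trim();
        }
        // Drop existing surrounding quotes so they are not doubled.
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        name = RENAMES.getOrDefault(name, name);
        return name + " = \"" + value + "\";";
    }

    public static void main(String[] args) {
        System.out.println(toJdlLine("foo = bar"));            // foo = "bar";
        System.out.println(toJdlLine("StreamOutput = False")); // StreamOut = "False";
    }
}
```

Note that the second example turns False into the string "False", which 
is exactly the side effect of quoting every value.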

I don't understand the need for many of these changes, but they 
appeared to work.  However, the quotes around every value cause it to 
be interpreted as a STRING-ATTR not a BOOLEAN-ATTR or EXPRESSION-ATTR 
or whatever.  I'm having trouble debugging this because it uses various 
classes in the condor package, and I don't know of any detailed 
documentation for that package.  Right now, JobHelper.java is checking 
for variables with values like "TRUE" and "false" and treating them as 
booleans when it creates a condor.ClassAdStructAttr to represent that 
line.  This code appears to work for booleans: I see the correct line 
StreamOut = FALSE; in condor_q -l where I used to see StreamOut = 
"False";
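
To make the workaround concrete, here is a minimal sketch of guessing an 
attribute type from the (unquoted) string value.  The class 
AttrTypeGuesser and the method guessType are hypothetical names, and the 
sketch returns only the type-name strings used in JobHelper.java rather 
than building a condor.ClassAdStructAttr.  Treating integers and $(...) 
macros specially is an assumption that goes beyond the boolean check 
described above.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class AttrTypeGuesser {
    // 0, or a digit 1-9 followed by zero or more digits, with optional whitespace around it.
    private static final Pattern INTEGER = Pattern.compile("\\s*(0|[1-9]\\d*)\\s*");
    // Any text containing a $(name) macro.
    private static final Pattern MACRO = Pattern.compile(".*\\$\\(.+\\).*");

    /** Guesses a type name for a value that the parser delivered as a string. */
    static String guessType(String val) {
        String v = val.trim();
        // Assumption: "true"/"false" in any letter case are always meant as booleans.
        if (v.equalsIgnoreCase("true") || v.equalsIgnoreCase("false")) {
            return "BOOLEAN-ATTR";
        }
        if (INTEGER.matcher(val).matches()) {
            return "INTEGER-ATTR";
        }
        if (MACRO.matcher(val).matches()) {
            return "EXPRESSION-ATTR";
        }
        return "STRING-ATTR";
    }

    public static void main(String[] args) {
        for (String s : new String[] {"False", "42", "out.$(Cluster).$(Process)", "bar"}) {
            System.out.println(s + " -> " + guessType(s));
        }
    }
}
```

The weakness is the same as in my current code: a value that really is 
the string "true" or "42" would be given the wrong type.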

Status
=-=-=-=

  Right now, all lines of the JDL are being interpreted as either 
Booleans or Strings.  I can get jobs to run, but not to complete.  If I 
manually change the owner of the spool/cluster1234.proc0.subproc0 
folder and its contents from root to myself, the job runs on the grid 
then goes into state C (completed?).  The output never gets staged 
over, and the job never terminates.  Alternatively, jobs fail to run and 
halt with HoldReason = "Failed to get expiration time of proxy".  If I 
change the owner of the contents of the 
spool/cluster1234.proc0.subproc0 folder from root to myself, the job 
halts with HoldReason = "Globus error: Staging error for RSL element 
fileStageOut."  I have noticed that, in either case, some attributes 
like GridJobId, GlobusSubmitId, GridftpUrlBase, and WallClockCheckpoint 
are being left UNDEFINED.  When I submit by command line, they are 
either absent (the last three) or set to a specific value (GlobusJobId 
= "babargt4.phys.uvic.ca#12114125215125#5330.0" or similar).

  I have attached my class JobHelper.java.  Could you look at the 
parser code in the getJobAttrFromJDL () method and tell me if I am 
doing anything wrong? 

Regards,

Sean Manning
/*
 * JobHelper.java
 *
 * Created on November 29, 2007, 2:23 PM
 *
 * To change this template, choose Tools | Template Manager
 * and open the template in the editor.
 */

package birdbath;


import condor.ClassAdStructAttr;
import condor.Status;	// SM Not used
import condor.StatusCode;
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.List;
import org.glite.jdl.JobAd;
import org.glite.jdl.*;
import condor.classad.*;
import java.util.Iterator;
import java.util.Vector;
import condor.ClassAdAttrType;

/**
 *
 * @author David Gong
 */
public class JobHelper {
 
    public static  String[] attrTypeString = {"ERROR-ATTR", "EXPRESSION-ATTR", "BOOLEAN-ATTR", "INTEGER-ATTR", 
                                "FLOAT-ATTR", "STRING-ATTR", "ERROR-ATTR", "UNDEFINED-ATTR"};
 
    private ClassAdStructAttr[] jobAttr;
    private Vector<ClassAdStructAttr> jobAttrVect;
    private Expr owner;
    private Expr jobUniverse;
    private Expr command;
    private Expr arguments;
    private Expr requirements;
    private Expr inputSandbox;
    
    public JobHelper(){
        
    }
    
    
    /** Creates a new instance of jobHelper */
    public JobHelper(String file) throws Exception{
            getJobAttrFromJDL(file);
    }
    
    public   ClassAdStructAttr[] getJobAttrFromJDL(String file)throws NoSuchFieldException, Exception
    {

    	/* See pp. 177ff of the paper manual for details of Condor ClassAds */
        // Type names come from the static attrTypeString table declared at the top of the class.
        // ERROR AND UNDEFINED NEED TO BE CONFIRMED- DG
        Vector<ClassAdStructAttr> result = new Vector <ClassAdStructAttr> ();


        Ad jobad = new Ad();	// SM This is an org.glite.jdl.Ad.  That's all I know about it.
//        Vector <ClassAdStructAttr> myResult = new Vector <ClassAdStructAttr> ();
        jobad.fromFile(file);
        
      owner = jobad.lookup("Owner");
      jobUniverse = jobad.lookup("JobUniverse");
      command = jobad.lookup("Executable");
      arguments = jobad.lookup("Arguments");
      requirements = jobad.lookup("Requirements");
      inputSandbox = jobad.lookup("InputSandbox");
/*
      try{
      jobad.delAttribute("Owner");
      System.out.println("Owner is deleted");
      jobad.delAttribute("JobUniverse");
      System.out.println("JobUniverse is deleted");
      jobad.delAttribute("Executable");
      System.out.println("Executable is deleted");
      jobad.delAttribute("Arguments");
      System.out.println("Arguments is deleted");
      jobad.delAttribute("Requirements");
      System.out.println("Requirements is deleted");
      jobad.delAttribute("InputSandbox");
      System.out.println("InputSandbox is deleted");
      }
      catch(Exception err){;}     
*/      

      String test = "$(foo).$(bar)";
      if (!containsVariables (test)) {
    	  System.out.println ("Error in containsVariables ... choose a better regex");
      }
      
      Iterator it = jobad.attributes();
      AttrName temp = null;
      int iAttrType = 0;
      ClassAdStructAttr currentAttr = null;
      while (it.hasNext()) {
    	  temp = (AttrName) it.next();	// a condor.classad.AttrName from classad.jar
    	  iAttrType = jobad.getType(temp.rawString());	// Attribute type as int
    	  // SM Added this line
    	  System.out.print ("Type of \"" + temp.toString () + "\" is " + attrTypeString[iAttrType] + "; ");
    	  Expr tempV = jobad.lookup(temp.toString());
    	  String val = null;
    	  String raw = temp.rawString();
    	  
    	  /*
    	   * Get the "value" attribute of the attribute in one of two ways.
    	   */
          if (tempV instanceof ListExpr) {	// SM What is a ListExpr?
        	  val = jobad.lookup (temp.rawString()).toString();
          }
          else {
        	  // java.lang.ArithmeticException: boolean false in string context thrown here
        	  // when I remove quotes around a boolean value in the JDL
        	  val = jobad.lookup(temp.rawString()).stringValue();
          }

          /*
           * Create the ClassAdStructAttr
           */
          
    	  /*
    	   * The parser wrongly reads many attributes of boolean type as strings.  
    	   * This causes obvious difficulties; for example, StreamOut (which must 
    	   * be explicitly set to FALSE for the code to work) cannot be set because 
    	   * it takes a boolean value not a string.
    	   * 
    	   * This clause corrects the problem BY ASSUMING THAT 'TRUE' AND 'FALSE' 
    	   * ARE ALWAYS BOOLEAN VALUES, NEVER STRINGS.  If you ever need to have 
    	   * a string "true" or similar, you will need to change this ... perhaps 
    	   * letting name determine type.
    	   */
          // TODO Change interpretation of floats and integers too?
          if (val.equalsIgnoreCase ("true") || val.equalsIgnoreCase("false")) {
    		  System.out.print ("changed type to BOOLEAN_ATTR; ");
    		  currentAttr = new ClassAdStructAttr (temp.rawString (),	// Name
    				  ClassAdAttrType.fromString("BOOLEAN-ATTR"), 		// Type
    				  val);												// Value
    	  }
    	  else if (isNonNegativeInt (val)) {
    		  // Same caveat as for booleans: a quoted non-negative integer is 
    		  // assumed to be an integer value, never a string.
    		  System.out.print ("changed type to INTEGER_ATTR; ");
    		  currentAttr = new ClassAdStructAttr (temp.rawString (),	// Name
    				  ClassAdAttrType.fromString("INTEGER-ATTR"), 		// Type
    				  val.trim ());										// Value
    	  }
    	  else if (containsVariables (val)) { // TODO Choose whether to keep
    		  System.out.print("changed type to EXPRESSION_ATTR; ");
    		  currentAttr = new ClassAdStructAttr (temp.rawString (),	// Name
    				  ClassAdAttrType.fromString("EXPRESSION-ATTR"), 	// Type
    				  val);												// Value
    	  }
    	  else {
    		  currentAttr = new ClassAdStructAttr (temp.rawString(), 	// Name
    				  ClassAdAttrType.fromString(attrTypeString[iAttrType]), // Type
    				  val);												// Value
    	  }
          
          // SM Added this line.
          System.out.println ("value is \"" + currentAttr.getValue () + "\"");
          result.add (currentAttr);
      }

      jobAttrVect = result;
      
      return jobAttr = (ClassAdStructAttr[]) result.toArray(new ClassAdStructAttr[0]);
    }
     
    public static boolean isNonNegativeInt (String s) {
    	/* A non-negative integer consists of 0, 
    	 * or one digit from 1-9 followed by zero or more digits, 
    	 * with zero or more whitespace characters before and after it.
    	 */
    	String intFormat = "\\s*[1-9]\\d*\\s*";	// \d* (not \d+) so single digits match
    	String zero = "\\s*0\\s*";
    	return s.matches (intFormat) || s.matches (zero);
    }
    
    public static boolean containsVariables (String s) {
    	// A variable consists of $(*), where * is one or more characters
    	// It has zero or more other characters before or after it
    	String varFormat = ".*\\$\\(.+\\).*";
    	
    	return s.matches (varFormat);
    }
    
     public Vector<ClassAdStructAttr> getJobAttrVector(){
         return jobAttrVect;
     }
     
     
     
     public static ClassAdStructAttr createStringAttr(String attrName, String attrValue){
         return new ClassAdStructAttr(attrName, ClassAdAttrType.fromString(attrTypeString[5]), attrValue );
     }
    
     public Expr getOwner(){
         return owner;
     }
     
     public Expr getJobUniverse(){
         return jobUniverse;
     }
    
     public Expr getCommand(){
         return command;
     }
    
     public Expr getRequirements(){
         return requirements;
     }
     
     public Expr getArguments(){
         return arguments;
     }
     
     
     public Expr getInputSandboxExpr(){
         return inputSandbox;
     }
     
     public String[] getStageInFiles(){
         // Same contents as getStageInFilesVector(), returned as an array
         return getStageInFilesVector().toArray(new String[0]);
     }
     
     public Vector<String> getStageInFilesVector(){
         Vector<String> retVal = new Vector<String>();
         if (inputSandbox instanceof ListExpr) {
             Iterator it = ((ListExpr) inputSandbox).iterator();
             Expr tmp = null;
             while (it.hasNext()){
                 tmp = (Expr)it.next();
                 retVal.add(tmp.stringValue());
             }        
         } 
         else {
             retVal.add(inputSandbox.toString());
         }
         return retVal;
        
     }
     
     
     public void addProxy()throws Exception{
         
     }
     
    public static void main(String[] args) throws Exception{

        JobHelper my = new JobHelper("/hepuser/seangwm/workspace_ganymede/CondorWSProjectRon/src/supportfiles/testjdl-gt4");
        String[] files = my.getStageInFiles();
        return;

   }

     public ClassAdStructAttr[] getJobAttr(){
         return jobAttr;
     }
}