Class XMLChar

java.lang.Object
org.apache.xmlbeans.impl.common.XMLChar

public class XMLChar
extends Object
This class defines the basic XML character properties. The data in this class can be used to verify that a character is a valid XML character, or to check whether it is a space, name start, or name character.

A series of convenience methods is supplied to ease the burden on the developer. Because inlining the checks can improve per-character performance, the tables of character properties are public. Using the character as an index into the CHARS array and applying the appropriate mask flag (e.g. MASK_VALID) yields the same result as calling the corresponding convenience method. There is one exception: see the comments for the isValid method for details.
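
The inlined lookup described above can be sketched roughly as follows. This is a self-contained illustration, not the library's actual table: the contents of CHARS, its 0x00-0x7F extent, and the numeric flag values are assumptions made for the sketch. Only the names CHARS, MASK_VALID, and MASK_SPACE are taken from this page.

```java
final class MaskTableSketch {
    // Hypothetical flag values, chosen only for this sketch.
    static final int MASK_VALID = 0x01;
    static final int MASK_SPACE = 0x02;

    // Hypothetical property table covering only 0x00-0x7F.
    static final byte[] CHARS = new byte[0x80];
    static {
        CHARS[0x09] = MASK_VALID | MASK_SPACE;
        CHARS[0x0A] = MASK_VALID | MASK_SPACE;
        CHARS[0x0D] = MASK_VALID | MASK_SPACE;
        CHARS[0x20] = MASK_VALID | MASK_SPACE;
        for (int c = 0x21; c < 0x80; c++) {
            CHARS[c] = MASK_VALID;
        }
    }

    // Convenience-method form: bounds check plus table lookup.
    static boolean isSpace(int c) {
        return c >= 0 && c < CHARS.length && (CHARS[c] & MASK_SPACE) != 0;
    }

    // Inlined form: the same test written directly in a hot loop.
    static int countSpacesInlined(char[] buf) {
        int n = 0;
        for (char ch : buf) {
            if (ch < CHARS.length && (CHARS[ch] & MASK_SPACE) != 0) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countSpacesInlined("a b\tc\n".toCharArray()));
    }
}
```

Both forms give the same answer for any character the table covers; the inlined form only saves the method call.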

Version:
$Id: XMLChar.java 111285 2004-12-08 16:54:26Z cezar $
Author:
Glenn Marcy, IBM, Andy Clark, IBM, Eric Ye, IBM, Arnaud Le Hors, IBM, Rahul Srivastava, Sun Microsystems Inc.
  • Field Summary

    Fields 
    Modifier and Type Field Description
    static int MASK_CONTENT
    Content character mask.
    static int MASK_NAME
    Name character mask.
    static int MASK_NAME_START
    Name start character mask.
    static int MASK_NCNAME
    NCName character mask.
    static int MASK_NCNAME_START
    NCName start character mask.
    static int MASK_PUBID
    Pubid character mask.
    static int MASK_SPACE
    Space character mask.
    static int MASK_VALID
    Valid character mask.
  • Constructor Summary

    Constructors 
    Constructor Description
    XMLChar()  
  • Method Summary

    Modifier and Type Method Description
    static char highSurrogate​(int c)
    Returns the high surrogate of a supplemental character.
    static boolean isContent​(int c)
    Returns true if the specified character can be considered content.
    static boolean isHighSurrogate​(int c)
    Returns whether the given character is a high surrogate.
    static boolean isInvalid​(int c)
    Returns true if the specified character is invalid.
    static boolean isLowSurrogate​(int c)
    Returns whether the given character is a low surrogate.
    static boolean isMarkup​(int c)
    Returns true if the specified character can be considered markup.
    static boolean isName​(int c)
    Returns true if the specified character is a valid name character as defined by production [4] in the XML 1.0 specification.
    static boolean isNameStart​(int c)
    Returns true if the specified character is a valid name start character as defined by production [5] in the XML 1.0 specification.
    static boolean isNCName​(int c)
    Returns true if the specified character is a valid NCName character as defined by production [5] in the Namespaces in XML recommendation.
    static boolean isNCNameStart​(int c)
    Returns true if the specified character is a valid NCName start character as defined by production [4] in the Namespaces in XML recommendation.
    static boolean isPubid​(int c)
    Returns true if the specified character is a valid Pubid character as defined by production [13] in the XML 1.0 specification.
    static boolean isSpace​(int c)
    Returns true if the specified character is a space character as defined by production [3] in the XML 1.0 specification.
    static boolean isSupplemental​(int c)
    Returns true if the specified character is a supplemental character.
    static boolean isValid​(int c)
    Returns true if the specified character is valid.
    static boolean isValidIANAEncoding​(String ianaEncoding)
    Returns true if the encoding name is a valid IANA encoding.
    static boolean isValidJavaEncoding​(String javaEncoding)
    Returns true if the encoding name is a valid Java encoding.
    static boolean isValidName​(String name)
    Checks whether a string is a valid Name according to production [5] in the XML 1.0 Recommendation.
    static boolean isValidNCName​(String ncName)
    Checks whether a string is a valid NCName according to production [4] in the XML Namespaces 1.0 Recommendation.
    static boolean isValidNmtoken​(String nmtoken)
    Checks whether a string is a valid Nmtoken according to production [7] in the XML 1.0 Recommendation.
    static boolean isXML11Space​(int c)
    Returns true if the specified character is a space character as amended in the XML 1.1 specification.
    static char lowSurrogate​(int c)
    Returns the low surrogate of a supplemental character.
    static int supplemental​(char h, char l)
    Returns the supplemental character corresponding to the given surrogates.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Constructor Details

  • Method Details

    • isSupplemental

      public static boolean isSupplemental​(int c)
      Returns true if the specified character is a supplemental character.
      Parameters:
      c - The character to check.
    • supplemental

      public static int supplemental​(char h, char l)
      Returns the supplemental character corresponding to the given surrogates.
      Parameters:
      h - The high surrogate.
      l - The low surrogate.
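
Combining a surrogate pair into a supplemental character follows the standard UTF-16 arithmetic. The sketch below is a stand-alone re-implementation under that assumption, not the library source; the class name SupplementalSketch is hypothetical. It cross-checks the result against the JDK's Character.toCodePoint.

```java
final class SupplementalSketch {
    // Standard UTF-16 decoding: each surrogate contributes 10 bits,
    // and the result is offset by 0x10000.
    static int supplemental(char h, char l) {
        return (h - 0xD800) * 0x400 + (l - 0xDC00) + 0x10000;
    }

    public static void main(String[] args) {
        char h = (char) 0xD83D;
        char l = (char) 0xDE00;
        int cp = supplemental(h, l);
        System.out.println("U+" + Integer.toHexString(cp).toUpperCase());
        // The JDK helper should agree with the hand-written arithmetic.
        System.out.println(cp == Character.toCodePoint(h, l));
    }
}
```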
    • highSurrogate

      public static char highSurrogate​(int c)
      Returns the high surrogate of a supplemental character.
      Parameters:
      c - The supplemental character to "split".
    • lowSurrogate

      public static char lowSurrogate​(int c)
      Returns the low surrogate of a supplemental character.
      Parameters:
      c - The supplemental character to "split".
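
Splitting a supplemental character is the inverse operation. The following is a minimal sketch using the standard UTF-16 encoding arithmetic; it assumes the argument is in the range 0x10000 to 0x10FFFF and does not claim to match the library's code line for line. The class name SplitSketch is hypothetical.

```java
final class SplitSketch {
    // Top 10 bits of (c - 0x10000), offset into the high-surrogate range.
    static char highSurrogate(int c) {
        return (char) (((c - 0x10000) >> 10) + 0xD800);
    }

    // Bottom 10 bits of (c - 0x10000), offset into the low-surrogate range.
    static char lowSurrogate(int c) {
        return (char) (((c - 0x10000) & 0x3FF) + 0xDC00);
    }

    public static void main(String[] args) {
        int cp = 0x1F600;
        char h = highSurrogate(cp);
        char l = lowSurrogate(cp);
        System.out.println(Integer.toHexString(h) + " " + Integer.toHexString(l));
        // Round trip through the JDK to confirm the pair decodes back to cp.
        System.out.println(Character.toCodePoint(h, l) == cp);
    }
}
```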
    • isHighSurrogate

      public static boolean isHighSurrogate​(int c)
      Returns whether the given character is a high surrogate.
      Parameters:
      c - The character to check.
    • isLowSurrogate

      public static boolean isLowSurrogate​(int c)
      Returns whether the given character is a low surrogate.
      Parameters:
      c - The character to check.
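
The surrogate and supplemental tests reduce to range checks on the UTF-16 code unit or code point. The sketch below uses the standard Unicode ranges; the helper hasUnpairedSurrogate and the class name are hypothetical additions that show how the checks might be combined when scanning a string.

```java
final class SurrogateRangeSketch {
    static boolean isHighSurrogate(int c) {
        return 0xD800 <= c && c <= 0xDBFF;
    }

    static boolean isLowSurrogate(int c) {
        return 0xDC00 <= c && c <= 0xDFFF;
    }

    static boolean isSupplemental(int c) {
        return 0x10000 <= c && c <= 0x10FFFF;
    }

    // Hypothetical helper: true if any surrogate in s lacks its partner.
    static boolean hasUnpairedSurrogate(String s) {
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (isHighSurrogate(ch)) {
                if (i + 1 < s.length() && isLowSurrogate(s.charAt(i + 1))) {
                    i++; // skip the low half of a well-formed pair
                } else {
                    return true;
                }
            } else if (isLowSurrogate(ch)) {
                return true; // low surrogate with no preceding high surrogate
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String ok = "a" + new String(Character.toChars(0x1F600));
        String broken = "a" + (char) 0xD83D;
        System.out.println(hasUnpairedSurrogate(ok));
        System.out.println(hasUnpairedSurrogate(broken));
    }
}
```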
    • isValid

      public static boolean isValid​(int c)
      Returns true if the specified character is valid. This method also checks the supplemental character range from 0x10000 to 0x10FFFF, which is encoded with surrogate pairs.

      A program that applies the mask directly to the CHARS array is itself responsible for checking this range.

      Parameters:
      c - The character to check.
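
For reference, the Char production of XML 1.0 allows #x9, #xA, #xD, #x20 to #xD7FF, #xE000 to #xFFFD, and #x10000 to #x10FFFF. The sketch below encodes that production directly with range checks rather than a lookup table; it is an illustration of the rule, not the library's table-driven implementation, and the helper allValid is a hypothetical addition.

```java
final class ValidCharSketch {
    // Direct transcription of the XML 1.0 Char production.
    static boolean isValid(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }

    // Hypothetical helper: checks every code point of a string.
    // An unpaired surrogate is reported by codePoints() as-is and fails the test.
    static boolean allValid(String s) {
        return s.codePoints().allMatch(ValidCharSketch::isValid);
    }

    public static void main(String[] args) {
        System.out.println(allValid("ok\n"));
        System.out.println(allValid("bad" + (char) 0x0B));
    }
}
```

Iterating by code point rather than by char is what takes care of the supplemental range here; a loop over chars that indexes a 16-bit table would need the extra range check mentioned above.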
    • isInvalid

      public static boolean isInvalid​(int c)
      Returns true if the specified character is invalid.
      Parameters:
      c - The character to check.
    • isContent

      public static boolean isContent​(int c)
      Returns true if the specified character can be considered content.
      Parameters:
      c - The character to check.
    • isMarkup

      public static boolean isMarkup​(int c)
      Returns true if the specified character can be considered markup. Markup characters include '<', '&', and '%'.
      Parameters:
      c - The character to check.
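
A scanner typically uses a markup test to find where a run of character data ends. The sketch below covers only the three characters named above ('<', '&', and '%'); whether the library treats any further characters as markup is not stated here, so that limit is an assumption. The helper indexOfMarkup and the class name are hypothetical.

```java
final class MarkupSketch {
    // Only the three characters listed in the description above.
    static boolean isMarkup(int c) {
        return c == '<' || c == '&' || c == '%';
    }

    // Hypothetical helper: index of the first markup character, or -1.
    static int indexOfMarkup(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (isMarkup(s.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfMarkup("text<tag>"));
        System.out.println(indexOfMarkup("plain"));
    }
}
```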
    • isSpace

      public static boolean isSpace​(int c)
      Returns true if the specified character is a space character as defined by production [3] in the XML 1.0 specification.
      Parameters:
      c - The character to check.
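
Production [3] of XML 1.0 defines white space as one or more of #x20, #x9, #xD, and #xA. This is narrower than Java's own notion of whitespace, as the sketch below illustrates. The helper trimXmlSpace and the class name are hypothetical and serve only to show the check in use.

```java
final class XmlSpaceSketch {
    // The four characters of XML 1.0 production [3].
    static boolean isSpace(int c) {
        return c == 0x20 || c == 0x9 || c == 0xD || c == 0xA;
    }

    // Hypothetical helper: strips leading and trailing XML white space only.
    static String trimXmlSpace(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isSpace(s.charAt(start))) {
            start++;
        }
        while (end > start && isSpace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println("[" + trimXmlSpace(" \t a b \r\n") + "]");
        // Form feed is Java whitespace but not XML white space.
        System.out.println(Character.isWhitespace(0x0C) + " " + isSpace(0x0C));
    }
}
```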
    • isXML11Space

      public static boolean isXML11Space​(int c)
      Returns true if the specified character is a space character as amended in the XML 1.1 specification.
      Parameters:
      c - The character to check.
    • isNameStart

      public static boolean isNameStart​(int c)
      Returns true if the specified character is a valid name start character as defined by production [5] in the XML 1.0 specification.
      Parameters:
      c - The character to check.
    • isName

      public static boolean isName​(int c)
      Returns true if the specified character is a valid name character as defined by production [4] in the XML 1.0 specification.
      Parameters:
      c - The character to check.
    • isNCNameStart

      public static boolean isNCNameStart​(int c)
      Returns true if the specified character is a valid NCName start character as defined by production [4] in the Namespaces in XML recommendation.
      Parameters:
      c - The character to check.
    • isNCName

      public static boolean isNCName​(int c)
      Returns true if the specified character is a valid NCName character as defined by production [5] in the Namespaces in XML recommendation.
      Parameters:
      c - The character to check.
    • isPubid

      public static boolean isPubid​(int c)
      Returns true if the specified character is a valid Pubid character as defined by production [13] in the XML 1.0 specification.
      Parameters:
      c - The character to check.
    • isValidName

      public static boolean isValidName​(String name)
      Checks whether a string is a valid Name according to production [5] in the XML 1.0 Recommendation.
      Parameters:
      name - string to check
      Returns:
      true if name is a valid Name
    • isValidNCName

      public static boolean isValidNCName​(String ncName)
      Checks whether a string is a valid NCName according to production [4] in the XML Namespaces 1.0 Recommendation.
      Parameters:
      ncName - string to check
      Returns:
      true if ncName is a valid NCName
    • isValidNmtoken

      public static boolean isValidNmtoken​(String nmtoken)
      Checks whether a string is a valid Nmtoken according to production [7] in the XML 1.0 Recommendation.
      Parameters:
      nmtoken - string to check
      Returns:
      true if nmtoken is a valid Nmtoken
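
The three string checks differ only in which per-character test applies at which position: a Name needs a name start character followed by name characters, an NCName is a Name without a colon, and an Nmtoken consists of name characters throughout. The sketch below shows that structure. It is heavily simplified: the two character predicates cover ASCII only, whereas the real tables also cover non-ASCII letters, combining characters, and extenders, and returning false for null is an assumption of the sketch. All names here are hypothetical.

```java
import java.util.function.IntPredicate;

final class NameSketch {
    // ASCII-only approximation of a name start character (assumption).
    static boolean isAsciiNameStart(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_' || c == ':';
    }

    // ASCII-only approximation of a name character (assumption).
    static boolean isAsciiName(int c) {
        return isAsciiNameStart(c) || (c >= '0' && c <= '9') || c == '.' || c == '-';
    }

    // Shared shape: one test for the first character, another for the rest.
    static boolean matches(String s, IntPredicate first, IntPredicate rest) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        if (!first.test(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!rest.test(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    static boolean isValidName(String s) {
        return matches(s, NameSketch::isAsciiNameStart, NameSketch::isAsciiName);
    }

    static boolean isValidNCName(String s) {
        return isValidName(s) && s.indexOf(':') < 0;
    }

    static boolean isValidNmtoken(String s) {
        return matches(s, NameSketch::isAsciiName, NameSketch::isAsciiName);
    }

    public static void main(String[] args) {
        System.out.println(isValidName("xs:element"));
        System.out.println(isValidNCName("xs:element"));
        System.out.println(isValidNmtoken("1st"));
    }
}
```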
    • isValidIANAEncoding

      public static boolean isValidIANAEncoding​(String ianaEncoding)
      Returns true if the encoding name is a valid IANA encoding. This method does not verify that there is a decoder available for this encoding, only that the characters are valid for an IANA encoding name.
      Parameters:
      ianaEncoding - The IANA encoding name.
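
A purely syntactic check of this kind can be sketched with the EncName production of XML 1.0, which requires an ASCII letter followed by ASCII letters, digits, '.', '_', or '-'. Whether the library applies exactly this rule is an assumption; the class and method names below are hypothetical.

```java
import java.util.regex.Pattern;

final class EncNameSketch {
    // XML 1.0 EncName: [A-Za-z] ([A-Za-z0-9._] | '-')*
    private static final Pattern ENC_NAME = Pattern.compile("[A-Za-z][A-Za-z0-9._-]*");

    // Syntax only: says nothing about whether a decoder exists.
    static boolean isValidEncName(String name) {
        return name != null && ENC_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidEncName("UTF-8"));
        System.out.println(isValidEncName("8859_1"));
    }
}
```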
    • isValidJavaEncoding

      public static boolean isValidJavaEncoding​(String javaEncoding)
      Returns true if the encoding name is a valid Java encoding. This method does not verify that there is a decoder available for this encoding, only that the characters are valid for a Java encoding name.
      Parameters:
      javaEncoding - The Java encoding name.
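
Because the check above is limited to the characters of the name, a caller who also needs to know whether the encoding can actually be decoded might follow it with a lookup through the JDK's java.nio.charset.Charset. The sketch below shows one way to do that; the class and method names are hypothetical, and treating a malformed name as "not usable" rather than an error is a design choice of the sketch.

```java
import java.nio.charset.Charset;

final class CharsetAvailabilitySketch {
    // True only if the running JVM has a charset registered under this name.
    static boolean isUsable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Covers null and syntactically illegal charset names
            // (IllegalCharsetNameException is a subclass).
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsable("UTF-8"));
        System.out.println(isUsable("no such charset!"));
    }
}
```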