Interface IStructuredTextExpert


  • public interface IStructuredTextExpert
    Provides advanced methods for processing bidirectional text that has a specific structure, to ensure proper presentation. For a general introduction to structured text, see the package documentation. For details about when the advanced methods are needed, see the documentation of this package.

    Identifiers for several common handlers are included in StructuredTextTypeHandlerFactory. For handlers supplied by other packages, a handler instance can be obtained using the StructuredTextTypeHandlerFactory.getHandler(java.lang.String) method for the registered handlers, or by instantiating a private handler.

    Most of the methods in this interface have a text argument which may be just a part of a larger body of text. When the text is submitted in parts with repeated calls, information may need to be passed from one invocation to the next. For instance, one invocation may detect that a comment or a literal has been started but has not been completed. In such cases, the state must be managed by an IStructuredTextExpert instance obtained with the StructuredTextExpertFactory.getStatefulExpert(java.lang.String) method.

    The state returned after processing a string can be retrieved, set and reset using the getState(), setState(Object) and clearState() methods.

    When submitting the initial part of a text, the state should be reset if it is not the first processing call for this IStructuredTextExpert instance.

    Values returned by getState() are opaque objects whose meaning is internal to the relevant structured type handler. These values can only be used in setState(Object) calls to restore a state previously obtained after processing a given part of a text before processing the next part of the text.

    Note that if the user does not modify the state, the state returned by a given processing call is automatically passed as initial state to the next processing call, provided that the expert is a stateful one.
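
    The state-passing contract described above can be illustrated with a small standalone sketch. It does not use the bidi classes: ToyCommentTracker and its process method are hypothetical stand-ins that only mimic the getState() / setState(Object) / clearState() pattern, by tracking whether a block comment is still open at the end of each submitted part.

```java
/** Hypothetical stand-in for a stateful expert; not part of the bidi API. */
final class ToyCommentTracker {
    // Opaque state object: null means there is nothing to carry over.
    private static final Object IN_COMMENT = new Object();
    private Object state;

    Object getState() { return state; }
    void setState(Object newState) { state = newState; }
    void clearState() { state = null; }

    /**
     * Scans one part of a larger text, starting from the current state.
     * Returns true if the part ends inside an unterminated block comment.
     * The resulting state is kept for the next call automatically.
     */
    boolean process(String part) {
        boolean inComment = (state == IN_COMMENT);
        int i = 0;
        while (i < part.length()) {
            if (!inComment && part.startsWith("/*", i)) {
                inComment = true;
                i += 2;
            } else if (inComment && part.startsWith("*/", i)) {
                inComment = false;
                i += 2;
            } else {
                i++;
            }
        }
        state = inComment ? IN_COMMENT : null;
        return inComment;
    }

    public static void main(String[] args) {
        ToyCommentTracker tracker = new ToyCommentTracker();
        tracker.process("int i = 3; /* comment starts here");
        Object saved = tracker.getState();   // opaque value, only useful for setState
        tracker.process("and ends here */ i += 4;");
        tracker.setState(saved);             // restore the state saved after the first part
        tracker.clearState();                // reset before submitting a new text
    }
}
```

    As in the interface, the saved value is treated as opaque: the caller only stores it and hands it back, and clears the state before starting a new text.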

    Code Samples

    The following code shows how to transform a certain type of structured text (directory and file paths) in order to obtain the full text corresponding to the given lean text.

       IStructuredTextExpert expert = StructuredTextExpertFactory.getExpert(StructuredTextTypeHandlerFactory.FILE);
       String leanText = "D:\\אב\\ג\\ד.ext";
       String fullText = expert.leanToFullText(leanText);
       System.out.println("full text = " + fullText);
     

    The following code shows how to transform successive lines of Java code in order to obtain the full text corresponding to the lean text of each line.

       IStructuredTextExpert expert = StructuredTextExpertFactory.getStatefulExpert(StructuredTextTypeHandlerFactory.JAVA);
       String leanText = "int i = 3; // first Java statement";
       String fullText = expert.leanToFullText(leanText);
       System.out.println("full text = " + fullText);
       leanText = "i += 4; // next Java statement";
       fullText = expert.leanToFullText(leanText);
       System.out.println("full text = " + fullText);
     
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static int DIR_LTR
      Constant specifying that the base direction of a structured text is LTR.
      static int DIR_RTL
      Constant specifying that the base direction of a structured text is RTL.
    • Method Summary

      Modifier and Type Method Description
      void clearState()
      Resets the state to initial.
      int[] fullBidiCharOffsets​(String text)
      Given a full string, returns the offsets of characters which are directional formatting characters that have been added in order to ensure correct presentation.
      int[] fullToLeanMap​(String text)
      Given a full string, computes the positions of each of its characters within the corresponding lean string.
      String fullToLeanText​(String text)
      Removes directional formatting characters which were added to a structured text string to ensure correct presentation.
      StructuredTextEnvironment getEnvironment()
      Obtains the environment associated with this IStructuredTextExpert instance.
      Object getState()
      Gets the state established by the last text processing call.
      int getTextDirection​(String text)
      Gets the base direction of a structured text.
      StructuredTextTypeHandler getTypeHandler()
      Obtains the structured type handler associated with this IStructuredTextExpert instance.
      String insertMarks​(String text, int[] offsets, int direction, int affixLength)
      Adds directional marks to the given text before the characters specified in the given array of offsets.
      int[] leanBidiCharOffsets​(String text)
      Given a lean string, computes the offsets of characters before which directional formatting characters must be added in order to ensure correct presentation.
      int[] leanToFullMap​(String text)
      Given a lean string, computes the positions of each of its characters within the corresponding full string.
      String leanToFullText​(String text)
      Adds directional formatting characters to a structured text to ensure correct presentation.
      void setState​(Object state)
      Sets the state for the next text processing call.
    • Field Detail

      • DIR_LTR

        static final int DIR_LTR
        Constant specifying that the base direction of a structured text is LTR. The base direction may depend on whether the GUI is mirrored and may be different for Arabic and for Hebrew. This constant can appear as the value returned by the getTextDirection(java.lang.String) method.
        See Also:
        Constant Field Values
      • DIR_RTL

        static final int DIR_RTL
        Constant specifying that the base direction of a structured text is RTL. The base direction may depend on whether the GUI is mirrored and may be different for Arabic and for Hebrew. This constant can appear as the value returned by the getTextDirection(java.lang.String) method.
        See Also:
        Constant Field Values
    • Method Detail

      • getTypeHandler

        StructuredTextTypeHandler getTypeHandler()
        Obtains the structured type handler associated with this IStructuredTextExpert instance.
        Returns:
        the type handler instance.
      • getEnvironment

        StructuredTextEnvironment getEnvironment()
        Obtains the environment associated with this IStructuredTextExpert instance.
        Returns:
        the environment instance.
      • leanToFullText

        String leanToFullText​(String text)
        Adds directional formatting characters to a structured text to ensure correct presentation.
        Parameters:
        text - is the structured text string
        Returns:
        the structured text with directional formatting characters added to ensure correct presentation.
      • leanToFullMap

        int[] leanToFullMap​(String text)
        Given a lean string, computes the positions of each of its characters within the corresponding full string.
        Parameters:
        text - is the structured text string.
        Returns:
        an array of integers with one element for each of the characters in the text argument, equal to the offset of the corresponding character in the full string.
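
        The relationship between lean and full positions can be sketched without the library. The class below is a hypothetical illustration, not the implementation behind leanToFullMap: it assumes the mark offsets are already known (for example, as returned by leanBidiCharOffsets) and that prefixLength characters are added in front of the text.

```java
/** Hypothetical illustration of the lean-to-full position mapping. */
final class LeanToFullMapSketch {
    /**
     * @param leanLength   number of characters in the lean string
     * @param markOffsets  ascending offsets before which one mark is inserted (may be null)
     * @param prefixLength number of characters prefixed to the full string
     * @return for each lean index, the index of the same character in the full string
     */
    static int[] leanToFullMap(int leanLength, int[] markOffsets, int prefixLength) {
        int[] offsets = (markOffsets == null) ? new int[0] : markOffsets;
        int[] map = new int[leanLength];
        int added = prefixLength; // characters inserted so far
        int next = 0;             // next mark offset to consume
        for (int i = 0; i < leanLength; i++) {
            if (next < offsets.length && offsets[next] == i) {
                added++;          // a mark goes in front of character i
                next++;
            }
            map[i] = i + added;
        }
        return map;
    }
}
```

        For a five-character lean string with marks before offsets 2 and 4 and no prefix, this sketch yields the map [0, 1, 3, 4, 6].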
      • leanBidiCharOffsets

        int[] leanBidiCharOffsets​(String text)
        Given a lean string, computes the offsets of characters before which directional formatting characters must be added in order to ensure correct presentation.

        Only LRMs (for a string with LTR base direction) and RLMs (for a string with RTL base direction) are considered. Leading and trailing LRE, RLE and PDF which might be prefixed or suffixed depending on the orientation of the GUI component used for display are not reflected in this method.

        Parameters:
        text - is the structured text string
        Returns:
        an array of offsets to the characters in the text argument before which directional marks must be added to ensure correct presentation. The offsets are sorted in ascending order.
      • fullToLeanText

        String fullToLeanText​(String text)
        Removes directional formatting characters which were added to a structured text string to ensure correct presentation.
        Parameters:
        text - is the structured text string including directional formatting characters.
        Returns:
        the structured text string without directional formatting characters which might have been added by processing it with leanToFullText(java.lang.String).
      • fullToLeanMap

        int[] fullToLeanMap​(String text)
        Given a full string, computes the positions of each of its characters within the corresponding lean string.
        Parameters:
        text - is the structured text string including directional formatting characters.
        Returns:
        an array of integers with one element for each of the characters in the text argument, equal to the offset of the corresponding character in the lean string. If there is no corresponding character in the lean string (because the specified character is a directional formatting character added when invoking leanToFullText(java.lang.String)), the value returned for this character is -1.
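
        The -1 convention can be illustrated with a standalone sketch. It is a simplification, not the library's algorithm: it assumes that every LRM, RLM, LRE, RLE or PDF found in the string was added by leanToFullText, whereas an actual handler may have to distinguish added characters from ones already present in the original text. The class and method names are hypothetical.

```java
/** Hypothetical illustration of the full-to-lean position mapping. */
final class FullToLeanMapSketch {
    /** True for LRM, RLM, LRE, RLE and PDF. */
    static boolean isFormattingChar(char c) {
        return c == '\u200E' || c == '\u200F'
            || c == '\u202A' || c == '\u202B' || c == '\u202C';
    }

    /**
     * Maps each index of the full string to the index of the same character
     * in the lean string, or -1 for a directional formatting character.
     */
    static int[] fullToLeanMap(String full) {
        int[] map = new int[full.length()];
        int leanIndex = 0;
        for (int i = 0; i < full.length(); i++) {
            map[i] = isFormattingChar(full.charAt(i)) ? -1 : leanIndex++;
        }
        return map;
    }
}
```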
      • fullBidiCharOffsets

        int[] fullBidiCharOffsets​(String text)
        Given a full string, returns the offsets of characters which are directional formatting characters that have been added in order to ensure correct presentation.

        LRMs (for a string with LTR base direction) and RLMs (for a string with RTL base direction) are considered, as well as leading and trailing LRE, RLE and PDF, which might be prefixed or suffixed depending on the orientation of the GUI component used for display.

        Parameters:
        text - is the structured text string including directional formatting characters
        Returns:
        an array of offsets to the characters in the text argument which are directional formatting characters added to ensure correct presentation. The offsets are sorted in ascending order.
      • insertMarks

        String insertMarks​(String text,
                           int[] offsets,
                           int direction,
                           int affixLength)
        Adds directional marks to the given text before the characters specified in the given array of offsets. It can be used to add a prefix and/or a suffix of directional formatting characters.

        The directional marks will be LRMs for structured text strings with LTR base direction and RLMs for strings with RTL base direction.

        If necessary, leading and trailing directional formatting characters (LRE, RLE and PDF) can be added depending on the value of the affixLength argument.

        • A value of 1 means that one LRM or RLM must be prefixed, depending on the direction. This is useful when the GUI component presenting this text has a contextual orientation.
        • A value of 2 means that LRE+LRM or RLE+RLM must be prefixed, depending on the direction, and LRM+PDF or RLM+PDF must be suffixed, depending on the direction. This is useful if the GUI component presenting this text needs to have the text orientation explicitly specified.
        • A value of 0 means that no prefix or suffix is needed.
        Parameters:
        text - the structured text string
        offsets - an array of offsets to characters in text before which an LRM or RLM will be inserted. The array must be sorted in ascending order without duplicates. This argument may be null if there are no marks to add.
        direction - the base direction of the structured text. It must be one of the values DIR_LTR or DIR_RTL.
        affixLength - specifies the length of prefix and suffix which should be added to the result.
        0 means no prefix or suffix
        1 means one LRM or RLM as prefix and no suffix
        2 means 2 characters in both prefix and suffix.
        Returns:
        a string corresponding to the source text with directional marks (LRMs or RLMs) added at the specified offsets, and directional formatting characters (LRE, RLE, PDF) added as prefix and suffix if so required.
        See Also:
        leanBidiCharOffsets(String)
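
        The behavior described for the three affixLength values can be sketched as follows. This is a standalone illustration written from the description above, not the library's implementation; the class name and the numeric values chosen for DIR_LTR and DIR_RTL are assumptions made for the sketch.

```java
/** Hypothetical re-creation of the documented insertMarks behavior. */
final class InsertMarksSketch {
    static final int DIR_LTR = 0; // value assumed for this sketch
    static final int DIR_RTL = 1; // value assumed for this sketch

    static final char LRM = '\u200E';
    static final char RLM = '\u200F';
    static final char LRE = '\u202A';
    static final char RLE = '\u202B';
    static final char PDF = '\u202C';

    static String insertMarks(String text, int[] offsets, int direction, int affixLength) {
        if (direction != DIR_LTR && direction != DIR_RTL) {
            throw new IllegalArgumentException("direction must be DIR_LTR or DIR_RTL");
        }
        if (affixLength < 0 || affixLength > 2) {
            throw new IllegalArgumentException("affixLength must be 0, 1 or 2");
        }
        char mark = (direction == DIR_RTL) ? RLM : LRM;
        char embed = (direction == DIR_RTL) ? RLE : LRE;
        int count = (offsets == null) ? 0 : offsets.length;

        StringBuilder sb = new StringBuilder(text.length() + count + 4);
        if (affixLength == 2) {
            sb.append(embed);          // LRE or RLE
        }
        if (affixLength >= 1) {
            sb.append(mark);           // LRM or RLM prefix
        }
        int next = 0;
        for (int i = 0; i < text.length(); i++) {
            if (next < count && offsets[next] == i) {
                sb.append(mark);       // mark goes before character i
                next++;
            }
            sb.append(text.charAt(i));
        }
        if (affixLength == 2) {
            sb.append(mark).append(PDF); // LRM+PDF or RLM+PDF suffix
        }
        return sb.toString();
    }
}
```

        The sketch relies on the offsets being sorted in ascending order without duplicates, as the parameter description requires.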
      • getTextDirection

        int getTextDirection​(String text)
        Gets the base direction of a structured text. This base direction may depend on whether the text contains Arabic or Hebrew words. If the text contains both, the first Arabic or Hebrew letter in the text determines which is the governing script.
        Parameters:
        text - is the structured text string.
        Returns:
        the base direction of the structured text, DIR_LTR or DIR_RTL
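
        The "first Arabic or Hebrew letter" rule can be sketched with the standard Character.UnicodeScript API. This hypothetical helper only finds the governing script; how that script maps to DIR_LTR or DIR_RTL depends on the handler and the environment and is not modeled here.

```java
/** Hypothetical helper: finds the script of the first Arabic or Hebrew letter. */
final class GoverningScriptSketch {
    /**
     * Returns Character.UnicodeScript.ARABIC or Character.UnicodeScript.HEBREW
     * for the first such letter in the text, or null if the text has neither.
     */
    static Character.UnicodeScript governingScript(String text) {
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            if (Character.isLetter(cp)) {
                Character.UnicodeScript script = Character.UnicodeScript.of(cp);
                if (script == Character.UnicodeScript.ARABIC
                        || script == Character.UnicodeScript.HEBREW) {
                    return script;
                }
            }
            i += Character.charCount(cp); // step over surrogate pairs correctly
        }
        return null;
    }
}
```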
      • setState

        void setState​(Object state)
        Sets the state for the next text processing call. This method does nothing if the expert instance is not a stateful one.
        Parameters:
        state - an object returned by a previous call to getState().
      • getState

        Object getState()
        Gets the state established by the last text processing call. This is null if the expert instance is not a stateful one, or if the last text processing call had nothing to pass to the next call.
        Returns:
        the last established state.
      • clearState

        void clearState()
        Resets the state to initial. This method does nothing if the expert instance is not a stateful one.