dk.brics.automaton
Class CustomAutomatonMatcher

java.lang.Object
  extended by dk.brics.automaton.CustomAutomatonMatcher
All Implemented Interfaces:
java.util.regex.MatchResult

public class CustomAutomatonMatcher
extends java.lang.Object
implements java.util.regex.MatchResult

Performs match operations on a given character sequence using a compiled automaton. This variant is modified so that regular expressions can be assigned IDs, which are used to tell apart the individual automatons after they have been joined.

Author:
John Gibson <jgibson@mitre.org>, Martin Gerner
See Also:
RunAutomaton.newMatcher(java.lang.CharSequence), RunAutomaton.newMatcher(java.lang.CharSequence, int, int)

Field Summary
private  CustomRunAutomaton automaton
           
private  java.lang.CharSequence chars
           
private  int matchEnd
           
private  java.util.ArrayList<java.lang.String> matchIDs
           
private  int matchStart
           
 
Constructor Summary
CustomAutomatonMatcher(java.lang.CharSequence chars, CustomRunAutomaton automaton)
           
 
Method Summary
 int end()
          Returns the offset after the last character matched.
 int end(int group)
          Returns the offset after the last character matched of the specified capturing group.
 boolean find()
          Finds the next matching subsequence of the input.
 boolean findWithDelimitedID(char delimiter)
          Equivalent to find(), but for automatons whose IDs have been separated from the regular expression by the given delimiter.
private  java.lang.CharSequence getChars()
           
private  java.util.ArrayList<java.lang.String> getIDs(int state)
           
private  int getMatchEnd()
           
 java.util.ArrayList<java.lang.String> getMatchIDs()
           
private  int getMatchStart()
           
 java.lang.String group()
          Returns the subsequence of the input found by the previous match.
 java.lang.String group(int group)
          Returns the subsequence of the input found by the specified capturing group during the previous match operation.
 int groupCount()
          Returns the number of capturing groups in the underlying automaton.
private  void matchGood()
          Helper method to check that the last match attempt was valid.
private static void onlyZero(int group)
          Helper method that requires the group argument to be 0.
private  void setMatch(int matchStart, int matchEnd)
           
private  void setMatch(int matchStart, int matchEnd, java.util.ArrayList<java.lang.String> matchIDs)
           
 int start()
          Returns the offset of the first character matched.
 int start(int group)
          Returns the offset of the first character matched of the specified capturing group.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

automaton

private final CustomRunAutomaton automaton

chars

private final java.lang.CharSequence chars

matchStart

private int matchStart

matchEnd

private int matchEnd

matchIDs

private java.util.ArrayList<java.lang.String> matchIDs

Constructor Detail

CustomAutomatonMatcher

CustomAutomatonMatcher(java.lang.CharSequence chars,
                       CustomRunAutomaton automaton)

Method Detail

getIDs

private java.util.ArrayList<java.lang.String> getIDs(int state)

findWithDelimitedID

public boolean findWithDelimitedID(char delimiter)
Equivalent to find(), but for automatons whose IDs have been separated from the regular expression by the given delimiter.

Parameters:
delimiter - the character that separates an ID from its regular expression.
Returns:
true if a match (delimited using delimiter) can be found, false otherwise.
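
This page does not say how an ID and its regular expression are laid out around the delimiter. Purely as an illustration of the idea, the sketch below assumes each entry has the form expression, then delimiter, then ID, and splits it at the last delimiter. The class DelimitedIdSketch and its split method are hypothetical and are not part of dk.brics.automaton.

```java
class DelimitedIdSketch {
    /**
     * Splits an entry of the assumed form "expression<delimiter>id" at the
     * last occurrence of the delimiter and returns {expression, id}.
     * The layout is an assumption made for this example only.
     */
    static String[] split(String entry, char delimiter) {
        int i = entry.lastIndexOf(delimiter);
        if (i < 0) {
            throw new IllegalArgumentException("No delimiter in entry: " + entry);
        }
        return new String[] { entry.substring(0, i), entry.substring(i + 1) };
    }

    public static void main(String[] args) {
        String[] parts = split("colou?r\tC1", '\t');
        System.out.println("expression=" + parts[0] + ", id=" + parts[1]);
    }
}
```

A delimiter that cannot occur in the expressions or the IDs (a tab in this example) keeps the split unambiguous.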

find

public boolean find()
Finds the next matching subsequence of the input.
This also updates the values for the start, end, and group methods.

Returns:
true if there is a matching subsequence.
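
The usual calling pattern is to invoke find() repeatedly and read start(), end(), and group() only after it returns true. The sketch below uses java.util.regex.Matcher as a stand-in, because it exposes the same MatchResult accessors and the CustomAutomatonMatcher constructor is not public. The class FindLoopSketch and the sample input are illustrative, and an automaton-based matcher may select matches differently from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindLoopSketch {
    /** Collects "start-end:text" for every match found in the input. */
    static List<String> collect(String regex, CharSequence input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        // Match state is only valid after find() has returned true.
        while (m.find()) {
            out.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect("[0-9]+", "ab12cd345"));
    }
}
```

Note that end() is exclusive, so input.subSequence(start(), end()) yields the same text as group().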

setMatch

private void setMatch(int matchStart,
                      int matchEnd,
                      java.util.ArrayList<java.lang.String> matchIDs)
               throws java.lang.IllegalArgumentException
Throws:
java.lang.IllegalArgumentException

setMatch

private void setMatch(int matchStart,
                      int matchEnd)
               throws java.lang.IllegalArgumentException
Throws:
java.lang.IllegalArgumentException

getMatchStart

private int getMatchStart()

getMatchEnd

private int getMatchEnd()

getChars

private java.lang.CharSequence getChars()

end

public int end()
        throws java.lang.IllegalStateException
Returns the offset after the last character matched.

Specified by:
end in interface java.util.regex.MatchResult
Returns:
The offset after the last character matched.
Throws:
java.lang.IllegalStateException - if there has not been a match attempt or if the last attempt yielded no results.

end

public int end(int group)
        throws java.lang.IndexOutOfBoundsException,
               java.lang.IllegalStateException
Returns the offset after the last character matched of the specified capturing group.
Note that because the automaton does not support capturing groups the only valid group is 0 (the entire match).

Specified by:
end in interface java.util.regex.MatchResult
Parameters:
group - the desired capturing group.
Returns:
The offset after the last character matched of the specified capturing group.
Throws:
java.lang.IllegalStateException - if there has not been a match attempt or if the last attempt yielded no results.
java.lang.IndexOutOfBoundsException - if the specified capturing group does not exist in the underlying automaton.

group

public java.lang.String group()
                       throws java.lang.IllegalStateException
Returns the subsequence of the input found by the previous match.

Specified by:
group in interface java.util.regex.MatchResult
Returns:
The subsequence of the input found by the previous match.
Throws:
java.lang.IllegalStateException - if there has not been a match attempt or if the last attempt yielded no results.

group

public java.lang.String group(int group)
                       throws java.lang.IndexOutOfBoundsException,
                              java.lang.IllegalStateException
Returns the subsequence of the input found by the specified capturing group during the previous match operation.
Note that because the automaton does not support capturing groups the only valid group is 0 (the entire match).

Specified by:
group in interface java.util.regex.MatchResult
Parameters:
group - the desired capturing group.
Returns:
The subsequence of the input found by the specified capturing group during the previous match operation, or null if the given group did not match.
Throws:
java.lang.IllegalStateException - if there has not been a match attempt or if the last attempt yielded no results.
java.lang.IndexOutOfBoundsException - if the specified capturing group does not exist in the underlying automaton.

groupCount

public int groupCount()
Returns the number of capturing groups in the underlying automaton.
Note that because the automaton does not support capturing groups this method will always return 0.

Specified by:
groupCount in interface java.util.regex.MatchResult
Returns:
The number of capturing groups in the underlying automaton.

start

public int start()
          throws java.lang.IllegalStateException
Returns the offset of the first character matched.

Specified by:
start in interface java.util.regex.MatchResult
Returns:
The offset of the first character matched.
Throws:
java.lang.IllegalStateException - if there has not been a match attempt or if the last attempt yielded no results.

start

public int start(int group)
          throws java.lang.IndexOutOfBoundsException,
                 java.lang.IllegalStateException
Returns the offset of the first character matched of the specified capturing group.
Note that because the automaton does not support capturing groups the only valid group is 0 (the entire match).

Specified by:
start in interface java.util.regex.MatchResult
Parameters:
group - the desired capturing group.
Returns:
The offset of the first character matched of the specified capturing group.
Throws:
java.lang.IllegalStateException - if there has not been a match attempt or if the last attempt yielded no results.
java.lang.IndexOutOfBoundsException - if the specified capturing group does not exist in the underlying automaton.

onlyZero

private static void onlyZero(int group)
                      throws java.lang.IndexOutOfBoundsException
Helper method that requires the group argument to be 0.

Throws:
java.lang.IndexOutOfBoundsException
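
Because group 0 is the only valid group, a helper of this kind can be a single comparison. The following is a minimal sketch of such a check, not the library's actual source; the class GroupZeroSketch and the exception message are assumptions.

```java
class GroupZeroSketch {
    /** Hypothetical stand-in for onlyZero: rejects every group index except 0. */
    static void onlyZero(int group) {
        if (group != 0) {
            throw new IndexOutOfBoundsException("The only group supported is 0.");
        }
    }

    public static void main(String[] args) {
        onlyZero(0); // group 0 (the entire match) is accepted silently
        try {
            onlyZero(1);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```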

matchGood

private void matchGood()
                throws java.lang.IllegalStateException
Helper method to check that the last match attempt was valid.

Throws:
java.lang.IllegalStateException
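
One way such a validity check can work is to keep the match offsets at a sentinel value until a match has been recorded. The sketch below assumes -1 as that sentinel and assumes setMatch rejects a start greater than the end; the class MatchStateSketch, the sentinel, and the messages are illustrative and may differ from the real implementation.

```java
class MatchStateSketch {
    // Assumed sentinel: -1 means no match has been recorded yet.
    private int matchStart = -1;
    private int matchEnd = -1;

    /** Records a match; rejects offsets where start lies after end (assumed rule). */
    void setMatch(int start, int end) {
        if (start > end) {
            throw new IllegalArgumentException(
                "Start must be less than or equal to end: " + start + ", " + end);
        }
        matchStart = start;
        matchEnd = end;
    }

    /** Throws if no valid match is currently recorded. */
    void matchGood() {
        if (matchStart < 0 || matchEnd < 0) {
            throw new IllegalStateException("There was no available match.");
        }
    }

    int start() { matchGood(); return matchStart; }

    int end() { matchGood(); return matchEnd; }
}
```

With this layout every public accessor calls matchGood() first, so a caller that skips find() gets an IllegalStateException instead of a meaningless offset.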

getMatchIDs

public java.util.ArrayList<java.lang.String> getMatchIDs()
Returns:
the IDs of the matches from the last call to find()