CharStream Classes
Four different types of CharStream classes are generated, depending on the combination of options.
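Class | Options | Description |
---|---|---|
ASCII_CharStream | Generated when neither of the two options, UNICODE_INPUT or JAVA_UNICODE_ESCAPE, is set. | This class treats the input as a stream of 1-byte (ISO-LATIN1) characters. Note that this class can also be used to parse binary files. It simply reads a byte and returns it as a 16-bit quantity to the lexical analyzer, so any character returned by this class is in the range \u0000-\u00ff. |
ASCII_UCodeESC_CharStream | Generated when the JAVA_UNICODE_ESCAPE option is set and the UNICODE_INPUT option is not set. | This class treats the input as a stream of 1-byte characters. However, the special escape sequence ("\\")* "\" ("u")+ is treated as a tag indicating that the next 4 bytes following the tag will be hexadecimal digits forming a 4-digit hex number whose value will be treated as the value of the character at the position indicated by the first backslash. Note that this value can be any value in the range 0x0-0xffff. |
UCode_CharStream | Generated when the UNICODE_INPUT option is set and the JAVA_UNICODE_ESCAPE option is not set. | This class treats the input as a stream of 2-byte characters. It reads 2 bytes, b1 and b2, and returns them as a single character using the expression b1 << 8 \| b2, assuming big-endian order. |
UCode_UCodeESC_CharStream | Generated when both the UNICODE_INPUT and JAVA_UNICODE_ESCAPE options are set. | This class treats the input as a stream of 2-byte characters (just like the UCode_CharStream class), and the special escape sequence ("\\")* "\" ("u")+ is treated as a tag indicating that the next 4 2-byte characters following the tag will be hexadecimal digits forming a 4-digit hex number whose value will be treated as the value of the character at the position indicated by the first backslash. Note that this value can be any value in the range 0x0-0xffff. Also note that the backslash(es) and u(s) are all assumed to be given as 2-byte characters (with the higher-order byte value being 0). |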
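N.B. None of the above classes can be used to read characters in a mixed mode, i.e., with some characters given as 1-byte characters and others as 2-byte characters. To do this, you need to set the USER_CHAR_STREAM option to true and define your own CharStream.
In the following sections, we use the notation XXXCharStream to stand for any one of the 4 classes above.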
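The decoding rules in the table can be sketched in plain Java. This is only an illustration, not the generated code: the class `DecodeSketch` and its method names are hypothetical, and the escape decoder works on a `String` rather than on a byte stream.

```java
// Hypothetical helper class; it only illustrates the decoding rules from the table.
final class DecodeSketch {

    // 1-byte mode: one byte becomes one char in the range 0x0000-0x00ff.
    static char oneByteChar(byte b) {
        return (char) (b & 0xff); // mask, because Java bytes are signed
    }

    // 2-byte mode: bytes b1 and b2 are combined as b1 << 8 | b2 (big-endian).
    static char twoByteChar(byte b1, byte b2) {
        return (char) ((b1 & 0xff) << 8 | (b2 & 0xff));
    }

    // Escape mode: a backslash preceded by an even number of backslashes and
    // followed by one or more 'u' characters introduces 4 hexadecimal digits.
    static String decodeEscapes(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c != '\\') { out.append(c); i++; continue; }
            int start = i;
            while (i < text.length() && text.charAt(i) == '\\') i++;
            int run = i - start; // length of the backslash run
            boolean escape = run % 2 == 1 && i < text.length() && text.charAt(i) == 'u';
            int literal = escape ? run - 1 : run; // backslashes copied unchanged
            for (int k = 0; k < literal; k++) out.append('\\');
            if (!escape) continue;
            while (i < text.length() && text.charAt(i) == 'u') i++;
            if (i + 4 > text.length()) throw new IllegalArgumentException("truncated escape");
            int value = 0;
            for (int k = 0; k < 4; k++) {
                int digit = Character.digit(text.charAt(i + k), 16);
                if (digit < 0) throw new IllegalArgumentException("bad hex digit");
                value = (value << 4) | digit;
            }
            out.append((char) value); // value lies in 0x0-0xffff
            i += 4;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(twoByteChar((byte) 0x4e, (byte) 0x2d)));
        System.out.println(decodeEscapes("a\\u0041b"));
    }
}
```

Note that a run with an even number of backslashes before the `u` is left untouched, which matches the `("\\")*` prefix of the escape pattern.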
Constructors
/**
* Takes an input stream, starting line and column numbers
* and constructs a CharStream object. It also creates buffers
* of initial size 4K for buffering the characters and also for
* line and column numbers for each of those characters.
*/
public XXXCharStream(java.io.InputStream dstream, int startline, int startcolumn)
/**
* Takes an input stream, starting line and column numbers
* and constructs a CharStream object. It also creates buffers
* of initial size buffersize for buffering the characters and also
* for line and column numbers for each of those characters.
*/
public XXXCharStream(java.io.InputStream dstream, int startline, int startcolumn, int buffersize)
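So, when you have an estimate of the maximum size of any token that can occur, you can use that size to optimize the buffer sizes. Note that these sizes are only initial sizes; the buffers will be expanded as needed (in steps of 2K).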
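The growth behavior described above can be modeled roughly as follows. This is a simplified sketch under stated assumptions: `BufferGrowthSketch` is a hypothetical class, it uses linear arrays instead of the circular buffers of the generated classes, and it only shows the three parallel buffers growing together in 2K steps.

```java
import java.util.Arrays;

// Hypothetical, simplified model of the buffering described in the text.
final class BufferGrowthSketch {
    static final int STEP = 2048; // expansion step (2K), as stated in the text

    private char[] chars;   // buffered characters
    private int[] lines;    // line number of each buffered character
    private int[] columns;  // column number of each buffered character
    private int used = 0;

    BufferGrowthSketch(int initialSize) {
        chars = new char[initialSize];
        lines = new int[initialSize];
        columns = new int[initialSize];
    }

    // Stores one character with its position, growing all buffers when full.
    void store(char c, int line, int column) {
        if (used == chars.length) {
            int newSize = chars.length + STEP;
            chars = Arrays.copyOf(chars, newSize);
            lines = Arrays.copyOf(lines, newSize);
            columns = Arrays.copyOf(columns, newSize);
        }
        chars[used] = c;
        lines[used] = line;
        columns[used] = column;
        used++;
    }

    int capacity() { return chars.length; }

    public static void main(String[] args) {
        BufferGrowthSketch buf = new BufferGrowthSketch(4096);
        for (int i = 0; i < 5000; i++) buf.store('x', 1, i + 1);
        System.out.println(buf.capacity()); // prints 6144 (4096 + one 2K step)
    }
}
```

Choosing an initial size at least as large as the longest expected token avoids these copy operations altogether.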
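Methods
All the methods below will be static or non-static depending on whether the STATIC option is true or false at generation time. Also, only those methods that a user can use in lexical actions (through the input_stream variable of the lexical analyzer) are documented here.
The rest of the (public) methods are tightly coupled with the implementation of the lexical analyzer and so should not be used in lexical actions. In the future, we will simplify this by making that part of the interface an inner class of the lexical analyzer.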
/**
* This method returns the next "character" in the input according
* to the rules of the CharStream class as described above. It will
* throw java.io.IOException if it reaches EOF during the process
* of "constructing" the character. It also updates the line and
* column number and buffers the character for any possible
* backtracking that may be required later. It also stores the
* line and column numbers for the same purpose.
*/
public final char readChar() throws java.io.IOException
/**
* This method returns the line number for the beginning of the current match.
*/
public final int getBeginLine()
/**
* This method returns the column number for the beginning of the current match.
*/
public final int getBeginColumn()
/**
* This method returns the line number for the ending of the current match.
*/
public final int getEndLine()
/**
* This method returns the column number for the ending of the current match.
*/
public final int getEndColumn()
/**
* This method puts back amount number of characters into the stream.
*
* N.B. The amount indicates the number of characters as constructed
* by readChar. Since the buffers used are circular buffers, it cannot
* check for illegal amount values; it just wraps around. So it is the
* user's responsibility to make sure that that many characters have
* actually been produced before a call to this method.
*/
public final void backup(int amount)
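The wrap-around caveat for backup can be illustrated with a small model. This is a sketch, not the generated implementation: `CircularBackupSketch` and its members are hypothetical, and the model tracks only a single read position in a fixed-size circular buffer.

```java
// Hypothetical model of a circular buffer with a single read position.
final class CircularBackupSketch {
    private final char[] buffer;
    private int pos = -1; // index of the most recently stored character

    CircularBackupSketch(int size) { buffer = new char[size]; }

    // Stores a character in the next slot, wrapping around at the end.
    void put(char c) {
        pos = (pos + 1) % buffer.length;
        buffer[pos] = c;
    }

    // Moves the position back by amount slots. There is no validity check:
    // an amount larger than the number of stored characters simply wraps.
    void backup(int amount) {
        pos = Math.floorMod(pos - amount, buffer.length);
    }

    // Valid only after at least one call to put.
    char current() { return buffer[pos]; }

    int position() { return pos; }

    public static void main(String[] args) {
        CircularBackupSketch s = new CircularBackupSketch(4);
        s.put('a'); s.put('b'); s.put('c');
        s.backup(1);                      // legal: back to 'b'
        System.out.println(s.current());
        s.backup(5);                      // illegal amount: wraps silently
        System.out.println(s.position());
    }
}
```

In this model the second call does not fail; it lands on an arbitrary slot, which is why the caller has to guarantee that the amount is valid.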
/**
* Returns the image of the current match. As far as the XXXCharStream
* is concerned, all characters after the last call to the private
* method BeginToken are considered a part of the current match.
*/
public final String GetImage()
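How the position getters and GetImage relate to the current match can be sketched with a toy stream. Everything here is an assumption made for illustration: `MiniCharStream` is a hypothetical stand-in that reads from a `String`, exposes BeginToken publicly, counts only '\n' as a line break, and signals end of input with an unchecked exception, whereas the documented readChar throws java.io.IOException.

```java
// Hypothetical, heavily simplified stand-in for a generated CharStream class.
final class MiniCharStream {
    private final String input;
    private int next = 0;        // index of the next character to read
    private int tokenStart = 0;  // index where the current match began
    private int line = 1;
    private int column = 0;
    private int beginLine;
    private int beginColumn;
    private boolean prevNewline = false;

    MiniCharStream(String input) { this.input = input; }

    // Marks the start of a new match and reads its first character.
    char BeginToken() {
        tokenStart = next;
        char c = readChar();
        beginLine = line;
        beginColumn = column;
        return c;
    }

    // Reads one character and updates the line and column counters.
    char readChar() {
        if (next >= input.length()) throw new IllegalStateException("EOF");
        if (prevNewline) { line++; column = 0; prevNewline = false; }
        char c = input.charAt(next++);
        column++;
        if (c == '\n') prevNewline = true;
        return c;
    }

    int getBeginLine() { return beginLine; }
    int getBeginColumn() { return beginColumn; }
    int getEndLine() { return line; }
    int getEndColumn() { return column; }

    // Everything read since the last BeginToken call is the current match.
    String GetImage() { return input.substring(tokenStart, next); }

    public static void main(String[] args) {
        MiniCharStream in = new MiniCharStream("ab\ncd");
        in.BeginToken();
        in.readChar();
        System.out.println(in.GetImage() + " ends at column " + in.getEndColumn());
    }
}
```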
/**
* This method reinitializes the XXXCharStream classes with a
* (possibly new) input stream and starting line and column numbers.
*/
public void ReInit(java.io.InputStream dstream, int startline, int startcolumn)
/**
* This method reinitializes the XXXCharStream classes with a
* (possibly new) input stream and starting line and column numbers
* and adjusts the size of the buffers to buffersize, by extending them.
*
* N.B. If the value of buffersize is less than the current buffer sizes,
* they remain unchanged.
*/
public void ReInit(java.io.InputStream dstream, int startline, int startcolumn, int buffersize)
/**
* This method adjusts the line and column number of the beginning
* of the current match to newLine and newCol and also adjusts the
* line and column numbers for all the characters in the lookahead buffer.
*/
public void adjustBeginLineColumn(int newLine, int newCol)
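One plausible reading of this adjustment is sketched below. It is an assumption, not the generated code: `AdjustPositionSketch.adjust` is a hypothetical helper that operates on plain arrays holding the recorded line and column of each buffered character, shifts every line number by the same offset, and shifts columns only for characters on the first line of the match.

```java
import java.util.Arrays;

// Hypothetical helper that re-bases recorded positions of buffered characters.
final class AdjustPositionSketch {

    // lines[i] and columns[i] hold the position of the i-th buffered character,
    // starting with the first character of the current match.
    static void adjust(int[] lines, int[] columns, int newLine, int newCol) {
        if (lines.length == 0) return;
        int firstLine = lines[0];
        int lineShift = newLine - firstLine;
        int columnShift = newCol - columns[0];
        for (int i = 0; i < lines.length; i++) {
            // Columns restart on later lines, so only the first line is shifted.
            if (lines[i] == firstLine) columns[i] += columnShift;
            lines[i] += lineShift;
        }
    }

    public static void main(String[] args) {
        int[] lines = {3, 3, 4};
        int[] columns = {5, 6, 1};
        adjust(lines, columns, 10, 1);
        System.out.println(Arrays.toString(lines) + " " + Arrays.toString(columns));
    }
}
```

A typical use for such an adjustment is a grammar that processes an included or embedded fragment and wants reported positions to refer to the enclosing file.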