Class Source

java.lang.Object
java.io.Reader
org.htmlparser.lexer.Source
All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, Readable
Direct Known Subclasses:
InputStreamSource, StringSource

public abstract class Source extends Reader implements Serializable
A buffered source of characters. A Source is very similar to a Reader, such as:
 new InputStreamReader (connection.getInputStream (), charset)
 
It differs from the above in three ways:
  • the fetching of bytes may be asynchronous
  • the character set may be changed, which resets the input stream
  • characters may be requested more than once, so in general they will be buffered
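
The third point, that characters may be requested more than once, can be illustrated with a minimal sketch. `BufferingReader` below is a hypothetical stand-in written only against `java.io.Reader`; it is not the org.htmlparser implementation, and its `offset()` and `getCharacter(int)` merely mirror the names documented on this page.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration only: not the org.htmlparser implementation.
final class RereadSketch {

    /** Wraps any Reader and keeps every character it has handed out. */
    static final class BufferingReader extends Reader {
        private final Reader in;
        private final StringBuilder seen = new StringBuilder();

        BufferingReader(Reader in) {
            this.in = in;
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            int n = in.read(cbuf, off, len);
            if (n > 0) {
                seen.append(cbuf, off, n); // remember what was delivered
            }
            return n;
        }

        @Override
        public void close() throws IOException {
            in.close();
        }

        /** Number of characters read so far. */
        int offset() {
            return seen.length();
        }

        /** Returns an already-read character without touching the wrapped reader. */
        char getCharacter(int offset) throws IOException {
            if (offset < 0 || offset >= seen.length()) {
                throw new IOException("offset is outside the characters read so far");
            }
            return seen.charAt(offset);
        }
    }

    /** Reads three characters, then asks for the first one again. */
    static String demo() {
        try (BufferingReader reader = new BufferingReader(new StringReader("<html>"))) {
            char[] buf = new char[3];
            int n = reader.read(buf, 0, 3);
            return new String(buf, 0, n) + "|" + reader.getCharacter(0) + "|" + reader.offset();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "<ht|<|3"
    }
}
```

Keeping the delivered characters in a buffer is what makes the second request cheap: the wrapped reader is never asked for the same character twice.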
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    EOF
    Return value when the source is exhausted.

    Fields inherited from class java.io.Reader

    lock
  • Constructor Summary

    Constructors
    Constructor
    Description
    Source()
  • Method Summary

    Modifier and Type
    Method
    Description
    abstract int
    available()
    Get the number of available characters.
    abstract void
    close()
    Does nothing.
    abstract void
    destroy()
    Close the source.
    abstract char
    getCharacter(int offset)
    Retrieve a character again.
    abstract void
    getCharacters(char[] array, int offset, int start, int end)
    Retrieve characters again.
    abstract void
    getCharacters(StringBuffer buffer, int offset, int length)
    Append characters already read into a StringBuffer.
    abstract String
    getEncoding()
    Get the encoding being used to convert characters.
    abstract String
    getString(int offset, int length)
    Retrieve a string comprised of characters already read.
    abstract void
    mark(int readAheadLimit)
    Mark the present position.
    abstract boolean
    markSupported()
    Tell whether this source supports the mark() operation.
    abstract int
    offset()
    Get the position (in characters).
    abstract int
    read()
    Read a single character.
    abstract int
    read(char[] cbuf)
    Read characters into an array.
    abstract int
    read(char[] cbuf, int off, int len)
    Read characters into a portion of an array.
    abstract boolean
    ready()
    Tell whether this source is ready to be read.
    abstract void
    reset()
    Reset the source.
    abstract void
    setEncoding(String character_set)
    Set the encoding to the given character set.
    abstract long
    skip(long n)
    Skip characters.
    abstract void
    unread()
    Undo the read of a single character.

    Methods inherited from class java.io.Reader

    nullReader, read, transferTo

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • EOF

      public static final int EOF
      Return value when the source is exhausted. Has a value of -1.
  • Constructor Details

    • Source

      public Source()
  • Method Details

    • getEncoding

      public abstract String getEncoding()
      Get the encoding being used to convert characters.
      Returns:
      The current encoding.
    • setEncoding

      public abstract void setEncoding(String character_set) throws ParserException
      Set the encoding to the given character set. If the current encoding is the same as the requested encoding, this method is a no-op. Otherwise any subsequent characters read from this source will have been decoded using the given character set.

      If characters have already been consumed from this source, an exception is expected when the characters read so far would have been different had the new encoding been in effect from the start.

      Parameters:
      character_set - The character set to use to convert characters.
      Throws:
      ParserException - If a character mismatch occurs between characters already provided and those that would have been returned had the new character set been in effect from the beginning. An exception is also thrown if the character set is not recognized.
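
The mismatch condition described above can be sketched with the standard `java.nio.charset` classes. `prefixUnchanged` is a hypothetical helper, not part of this class; it assumes the raw bytes consumed so far are still available, and the check inside an actual Source subclass may well differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the real check inside a Source subclass may differ.
final class EncodingSwitchSketch {

    /**
     * True if the first charsConsumed characters decode identically under
     * both character sets, so switching would not contradict earlier reads.
     */
    static boolean prefixUnchanged(byte[] raw, int charsConsumed,
                                   Charset current, Charset requested) {
        String before = new String(raw, current);
        String after = new String(raw, requested);
        int n = Math.min(charsConsumed, before.length());
        return after.length() >= n && before.regionMatches(0, after, 0, n);
    }

    /** Runs the three cases discussed in the note below the block. */
    static String demo() {
        // 0xE9 is 'é' in ISO-8859-1 but is malformed on its own in UTF-8.
        byte[] raw = { 'a', (byte) 0xE9, 'b' };
        boolean afterOneChar = prefixUnchanged(
                raw, 1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        boolean afterTwoChars = prefixUnchanged(
                raw, 2, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        // An unrecognized name is the other documented failure case.
        boolean recognized = Charset.isSupported("no-such-charset");
        return afterOneChar + "," + afterTwoChars + "," + recognized;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "true,false,false"
    }
}
```

With only `a` consumed, the switch from ISO-8859-1 to UTF-8 is harmless. Once the byte 0xE9 has been delivered as `é`, the two decodings disagree, which is the situation in which setEncoding is documented to throw.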
    • close

      public abstract void close() throws IOException
      Does nothing. This method would normally close the source; call destroy() instead.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Specified by:
      close in class Reader
      Throws:
      IOException - Not used.
    • read

      public abstract int read() throws IOException
      Read a single character. This method will block until a character is available, an I/O error occurs, or the source is exhausted.
      Overrides:
      read in class Reader
      Returns:
      The character read, as an integer in the range 0 to 65535 (0x00-0xffff), or EOF if the source is exhausted.
      Throws:
      IOException - If an I/O error occurs.
    • read

      public abstract int read(char[] cbuf, int off, int len) throws IOException
      Read characters into a portion of an array. This method will block until some input is available, an I/O error occurs, or the source is exhausted.
      Specified by:
      read in class Reader
      Parameters:
      cbuf - Destination buffer
      off - Offset at which to start storing characters
      len - Maximum number of characters to read
      Returns:
      The number of characters read, or EOF if the source is exhausted.
      Throws:
      IOException - If an I/O error occurs.
    • read

      public abstract int read(char[] cbuf) throws IOException
      Read characters into an array. This method will block until some input is available, an I/O error occurs, or the source is exhausted.
      Overrides:
      read in class Reader
      Parameters:
      cbuf - Destination buffer.
      Returns:
      The number of characters read, or EOF if the source is exhausted.
      Throws:
      IOException - If an I/O error occurs.
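
      Because a Source is a Reader and EOF has the value -1, a read loop written against java.io.Reader applies unchanged. The sketch below uses a StringReader as a stand-in for a concrete Source; `drain` and `DrainSketch` are hypothetical names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// StringReader stands in for a concrete Source; both extend java.io.Reader.
final class DrainSketch {

    /** Reads until the reader reports end of input (-1, the documented value of EOF). */
    static String drain(Reader reader, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        StringBuilder out = new StringBuilder();
        char[] cbuf = new char[chunkSize];
        try {
            int n;
            while ((n = reader.read(cbuf)) != -1) {
                out.append(cbuf, 0, n); // n may be smaller than cbuf.length
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    static String demo() {
        return drain(new StringReader("<p>hello</p>"), 4);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "<p>hello</p>"
    }
}
```

      Only the first n characters of cbuf are valid after each call, so the loop appends exactly that many.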
    • ready

      public abstract boolean ready() throws IOException
      Tell whether this source is ready to be read.
      Overrides:
      ready in class Reader
      Returns:
      true if the next read() is guaranteed not to block for input, false otherwise. Note that returning false does not guarantee that the next read will block.
      Throws:
      IOException - If an I/O error occurs.
    • reset

      public abstract void reset()
      Reset the source. Repositions the read point to begin at zero.
      Overrides:
      reset in class Reader
    • markSupported

      public abstract boolean markSupported()
      Tell whether this source supports the mark() operation.
      Overrides:
      markSupported in class Reader
      Returns:
      true if and only if this source supports the mark operation.
    • mark

      public abstract void mark(int readAheadLimit) throws IOException
      Mark the present position. Subsequent calls to reset() will attempt to reposition the source to this point. Not all sources support the mark() operation.
      Overrides:
      mark in class Reader
      Parameters:
      readAheadLimit - The minimum number of characters that can be read before this mark becomes invalid.
      Throws:
      IOException - If an I/O error occurs.
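
      The mark() and reset() pair follows the java.io.Reader contract. A minimal sketch, again with StringReader standing in for a Source subclass that supports marking; how a particular Source treats readAheadLimit is not shown here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// StringReader stands in for a Source here; both extend java.io.Reader.
final class MarkResetSketch {

    /** Reads one character, marks, reads two more, then rewinds to the mark. */
    static String demo() {
        try (Reader reader = new StringReader("abcdef")) {
            if (!reader.markSupported()) {
                throw new IllegalStateException("mark() is not supported by this reader");
            }
            StringBuilder seen = new StringBuilder();
            seen.append((char) reader.read()); // 'a'
            reader.mark(16);                   // remember the position just after 'a'
            seen.append((char) reader.read()); // 'b'
            seen.append((char) reader.read()); // 'c'
            reader.reset();                    // back to the marked position
            seen.append((char) reader.read()); // 'b' again
            return seen.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "abcb"
    }
}
```

      Checking markSupported() first matters because not every source supports marking.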
    • skip

      public abstract long skip(long n) throws IOException
      Skip characters. This method will block until some characters are available, an I/O error occurs, or the source is exhausted. Note: n is treated as an int.
      Overrides:
      skip in class Reader
      Parameters:
      n - The number of characters to skip.
      Returns:
      The number of characters actually skipped.
      Throws:
      IOException - If an I/O error occurs.
    • unread

      public abstract void unread() throws IOException
      Undo the read of a single character.
      Throws:
      IOException - If the source is closed or no characters have been read.
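
      A common use of unread() is a one-character peek: read, inspect, then push the read point back. The sketch below uses `CursorSource`, a hypothetical in-memory class that imitates only read(), unread(), offset() and available() as documented on this page; it is not a subclass of Source.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical in-memory imitation of the documented contract.
final class UnreadSketch {
    static final int EOF = -1;

    static final class CursorSource {
        private final String text;
        private int position;

        CursorSource(String text) {
            this.text = text;
        }

        /** Returns the next character, or EOF when the text is exhausted. */
        int read() {
            return position < text.length() ? text.charAt(position++) : EOF;
        }

        /** Moves the read point back by one character. */
        void unread() throws IOException {
            if (position == 0) {
                throw new IOException("no characters have been read");
            }
            position--;
        }

        int offset() {
            return position;
        }

        int available() {
            return text.length() - position;
        }
    }

    /** Peeks at the next character by reading it and immediately unreading it. */
    static int peek(CursorSource source) throws IOException {
        int c = source.read();
        if (c != EOF) {
            source.unread();
        }
        return c;
    }

    static String demo() {
        try {
            CursorSource source = new CursorSource("<a>");
            int peeked = peek(source); // '<', read point stays at 0
            int first = source.read(); // '<' again, read point moves to 1

            boolean threw = false;
            try {
                new CursorSource("x").unread(); // nothing read yet
            } catch (IOException expected) {
                threw = true;
            }
            return "" + (char) peeked + (char) first
                    + "|" + source.offset() + "|" + source.available() + "|" + threw;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "<<|1|2|true"
    }
}
```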
    • getCharacter

      public abstract char getCharacter(int offset) throws IOException
      Retrieve a character again.
      Parameters:
      offset - The offset of the character.
      Returns:
      The character at offset.
      Throws:
      IOException - If the source is closed or the offset is beyond offset().
    • getCharacters

      public abstract void getCharacters(char[] array, int offset, int start, int end) throws IOException
      Retrieve characters again.
      Parameters:
      array - The array of characters.
      offset - The starting position in the array where characters are to be placed.
      start - The starting position, zero based.
      end - The ending position (exclusive, i.e. the character at the ending position is not included), zero based.
      Throws:
      IOException - If the source is closed or the start or end is beyond offset().
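
      The four parameters follow the same convention as String.getChars: a half-open source range [start, end) copied to the destination beginning at array[offset]. The sketch below leans on that standard method to show the index arithmetic; the String argument is a hypothetical stand-in for the characters the source has already read.

```java
// Illustrates the documented index semantics with String.getChars.
final class RangeCopySketch {

    /** Copies alreadyRead[start, end) into array, beginning at array[offset]. */
    static void getCharacters(String alreadyRead, char[] array,
                              int offset, int start, int end) {
        alreadyRead.getChars(start, end, array, offset);
    }

    static String demo() {
        char[] array = "........".toCharArray();
        // Characters 1 through 5 of "<title>" are "title"; index 6 is excluded.
        getCharacters("<title>", array, 2, 1, 6);
        return new String(array);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "..title."
    }
}
```

      The destination must have room for end - start characters from offset onward.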
    • getString

      public abstract String getString(int offset, int length) throws IOException
      Retrieve a string comprised of characters already read.
      Parameters:
      offset - The offset of the first character.
      length - The number of characters to retrieve.
      Returns:
      A string containing length characters starting at offset.
      Throws:
      IOException - If the source is closed.
    • getCharacters

      public abstract void getCharacters(StringBuffer buffer, int offset, int length) throws IOException
      Append characters already read into a StringBuffer.
      Parameters:
      buffer - The buffer to append to.
      offset - The offset of the first character.
      length - The number of characters to retrieve.
      Throws:
      IOException - If the source is closed or the offset or (offset + length) is beyond offset().
    • destroy

      public abstract void destroy() throws IOException
      Close the source. Once a source has been closed, further read, ready, mark, reset, skip, unread, getCharacter or getString invocations will throw an IOException. Closing a previously-closed source, however, has no effect.
      Throws:
      IOException - If an I/O error occurs.
    • offset

      public abstract int offset()
      Get the position (in characters).
      Returns:
      The number of characters that have already been read, or EOF if the source is closed.
    • available

      public abstract int available()
      Get the number of available characters.
      Returns:
      The number of characters that can be read without blocking.