Package nu.validator.htmlparser.impl
Class TreeBuilder<T>
java.lang.Object
nu.validator.htmlparser.impl.TreeBuilder<T>
- All Implemented Interfaces:
TokenHandler, TreeBuilderState<T>
- Direct Known Subclasses:
CoalescingTreeBuilder
-
Field Summary
Fields
Modifier and Type, Field
protected char[] charBuffer
protected int charBufferLen
protected ErrorHandler errorHandler
protected Tokenizer tokenizer
-
Constructor Summary
Constructors
protected TreeBuilder()
-
Method Summary
Modifier and Type, Method, Description
protected void accumulateCharacters(char[] buf, int start, int length)
protected abstract void addAttributesToElement(T element, HtmlAttributes attributes)
protected abstract void appendCharacters(T parent, char[] buf, int start, int length)
protected abstract void appendChildrenToNewParent(T oldParent, T newParent)
protected abstract void appendComment(T parent, char[] buf, int start, int length)
protected abstract void appendCommentToDocument(char[] buf, int start, int length)
protected void appendDoctypeToDocument(String name, String publicIdentifier, String systemIdentifier)
protected abstract void appendElement(T child, T newParent)
protected abstract void appendIsindexPrompt(T parent)
boolean cdataSectionAllowed - Checks if the CDATA sections are allowed.
final void characters(char[] buf, int start, int length) - Receive character tokens.
final void comment(char[] buf, int start, int length) - Receive a comment token.
protected abstract T createElement(String ns, String name, HtmlAttributes attributes)
protected T createElement(String ns, String name, HtmlAttributes attributes, T form)
protected abstract T createHtmlElementSetAsRoot(HtmlAttributes attributes)
protected final T currentNode
protected abstract void detachFromParent(T element)
final void doctype(String name, String publicIdentifier, String systemIdentifier, boolean forceQuirks) - Receive a doctype token.
protected void documentMode(DocumentMode m, String publicIdentifier, String systemIdentifier, boolean html4SpecificAdditionalErrorChecks)
protected void elementPopped(String ns, String name, T node)
protected void elementPushed(String ns, String name, T node)
protected void end()
void endTag(ElementName elementName) - Receive an end tag token.
final void endTokenization - Perform final cleanup.
final void eof() - The end-of-file token.
static String extractCharsetFromContent(String attributeValue) - C++ memory note: The return value must be released.
protected void fatal() - Reports a condition that would make the infoset incompatible with XML 1.0 as fatal.
protected final void fatal
final void flushCharacters - Flushes the pending characters.
getDeepTreeSurrogateParent - Returns the deepTreeSurrogateParent.
getErrorHandler - Returns the errorHandler.
getFormPointer - Returns the formPointer.
getHeadPointer - Returns the headPointer.
nu.validator.htmlparser.impl.StackNode<T>[] getListOfActiveFormattingElements - Returns the listOfActiveFormattingElements.
int getListOfActiveFormattingElementsLength() - Return the length of the list of active formatting elements.
int getMode() - Returns the mode.
int getOriginalMode() - Returns the originalMode.
nu.validator.htmlparser.impl.StackNode<T>[] getStack() - Returns the stack.
int getStackLength() - Return the length of the stack.
protected abstract boolean hasChildren(T element)
protected abstract void insertFosterParentedCharacters(char[] buf, int start, int length, T table, T stackParent)
protected abstract void insertFosterParentedChild(T child, T table, T stackParent)
boolean isFramesetOk() - Returns the framesetOk.
boolean isNeedToDropLF() - Returns the needToDropLF.
boolean isQuirks() - Returns the quirks.
boolean isScriptingEnabled() - Returns the scriptingEnabled.
void loadState(TreeBuilderState<T> snapshot, Interner interner)
protected void markMalformedIfScript(T elt)
newSnapshot - Creates a comparable snapshot of the tree builder state.
protected final void requestSuspension()
void setDoctypeExpectation(DoctypeExpectation doctypeExpectation) - Sets the doctypeExpectation.
void setDocumentModeHandler(DocumentModeHandler documentModeHandler) - Sets the documentModeHandler.
final void setErrorHandler(ErrorHandler errorHandler) - Sets the errorHandler.
final void setFragmentContext(String context) - The argument MUST be an interned string or null.
final void setFragmentContext(String context, String ns, T node, boolean quirks) - The argument MUST be an interned string or null.
void setIgnoringComments(boolean ignoreComments)
void setNamePolicy(XmlViolationPolicy namePolicy)
void setReportingDoctype(boolean reportingDoctype) - Sets the reportingDoctype.
void setScriptingEnabled(boolean scriptingEnabled) - Sets the scriptingEnabled.
boolean snapshotMatches(TreeBuilderState<T> snapshot)
protected void start(boolean fragmentMode)
void startTag(ElementName elementName, HtmlAttributes attributes, boolean selfClosing) - Receive a start tag token.
final void startTokenization(Tokenizer self) - This method is called at the start of tokenization before any other methods on this interface are called.
boolean wantsComments() - If this handler implementation cares about comments, return true.
void zeroOriginatingReplacementCharacter - Reports a U+0000 that's being turned into a U+FFFD.
-
Field Details
-
tokenizer
-
errorHandler
-
charBuffer
protected char[] charBuffer
-
charBufferLen
protected int charBufferLen
-
-
Constructor Details
-
TreeBuilder
protected TreeBuilder()
-
-
Method Details
-
fatal
Reports a condition that would make the infoset incompatible with XML 1.0 as fatal.
- Throws:
SAXException
SAXParseException
-
fatal
- Throws:
SAXException
-
startTokenization
Description copied from interface: TokenHandler
This method is called at the start of tokenization before any other methods on this interface are called. Implementations should hold the reference to the Tokenizer in order to set the content model flag and in order to be able to query for Locator data.
- Specified by:
startTokenization in interface TokenHandler
- Parameters:
self - the Tokenizer.
- Throws:
SAXException - if something went wrong
-
doctype
public final void doctype(String name, String publicIdentifier, String systemIdentifier, boolean forceQuirks) throws SAXException
Description copied from interface: TokenHandler
Receive a doctype token.
- Specified by:
doctype in interface TokenHandler
- Parameters:
name - the name
publicIdentifier - the public id
systemIdentifier - the system id
forceQuirks - whether the token is correct
- Throws:
SAXException - if something went wrong
-
comment
Description copied from interface: TokenHandler
Receive a comment token. The data is junk if wantsComments() returned false.
- Specified by:
comment in interface TokenHandler
- Parameters:
buf - a buffer holding the data
start - the offset into the buffer
length - the number of code units to read
- Throws:
SAXException - if something went wrong
-
characters
Description copied from interface: TokenHandler
Receive character tokens. This method has the same semantics as the SAX method of the same name.
- Specified by:
characters in interface TokenHandler
- Parameters:
buf - a buffer holding the data
start - offset into the buffer
length - the number of code units to read
- Throws:
SAXException - if something went wrong
-
zeroOriginatingReplacementCharacter
Description copied from interface: TokenHandler
Reports a U+0000 that's being turned into a U+FFFD.
- Specified by:
zeroOriginatingReplacementCharacter in interface TokenHandler
- Throws:
SAXException - if something went wrong
-
eof
Description copied from interface: TokenHandler
The end-of-file token.
- Specified by:
eof in interface TokenHandler
- Throws:
SAXException - if something went wrong
-
endTokenization
Description copied from interface: TokenHandler
Perform final cleanup.
- Specified by:
endTokenization in interface TokenHandler
- Throws:
SAXException - if something went wrong
-
startTag
public void startTag(ElementName elementName, HtmlAttributes attributes, boolean selfClosing) throws SAXException
Description copied from interface: TokenHandler
Receive a start tag token.
- Specified by:
startTag in interface TokenHandler
- Parameters:
elementName - the tag name
attributes - the attributes
selfClosing - TODO
- Throws:
SAXException - if something went wrong
-
extractCharsetFromContent
C++ memory note: The return value must be released.
- Returns:
- Throws:
SAXException
StopSniffingException
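The method name suggests that it pulls a charset label out of the value of a meta element's content attribute. The following is a simplified standalone sketch of that idea, not the library's algorithm: the class CharsetSketch and its helper names are hypothetical, and returning null when no label is found is an assumption.

```java
/** Hypothetical, simplified sketch; not the library's implementation. */
class CharsetSketch {
    private static boolean isSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
    }

    private static int skipSpace(String s, int pos) {
        while (pos < s.length() && isSpace(s.charAt(pos))) {
            pos++;
        }
        return pos;
    }

    /** Returns the label following "charset=" in the value, or null if there is none. */
    static String extractCharset(String content) {
        final String key = "charset";
        for (int i = 0; i + key.length() <= content.length(); i++) {
            // Case-insensitive match of the keyword at position i.
            if (!content.regionMatches(true, i, key, 0, key.length())) {
                continue;
            }
            int p = skipSpace(content, i + key.length());
            if (p >= content.length() || content.charAt(p) != '=') {
                continue; // "charset" without '=': keep scanning
            }
            p = skipSpace(content, p + 1);
            if (p >= content.length()) {
                return null;
            }
            char quote = content.charAt(p);
            if (quote == '"' || quote == '\'') {
                // Quoted label: runs up to the matching quote.
                int close = content.indexOf(quote, p + 1);
                return close < 0 ? null : content.substring(p + 1, close);
            }
            // Unquoted label: runs up to whitespace or ';'.
            int end = p;
            while (end < content.length() && !isSpace(content.charAt(end))
                    && content.charAt(end) != ';') {
                end++;
            }
            return end > p ? content.substring(p, end) : null;
        }
        return null;
    }
}
```

In this sketch, calling CharsetSketch.extractCharset("text/html; charset=utf-8") yields "utf-8".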
-
endTag
Description copied from interface: TokenHandler
Receive an end tag token.
- Specified by:
endTag in interface TokenHandler
- Parameters:
elementName - the tag name
- Throws:
SAXException - if something went wrong
-
accumulateCharacters
- Throws:
SAXException
-
requestSuspension
protected final void requestSuspension()
-
createElement
protected abstract T createElement(String ns, String name, HtmlAttributes attributes) throws SAXException
- Throws:
SAXException
-
createElement
protected T createElement(String ns, String name, HtmlAttributes attributes, T form) throws SAXException
- Throws:
SAXException
-
createHtmlElementSetAsRoot
- Throws:
SAXException
-
detachFromParent
- Throws:
SAXException
-
hasChildren
- Throws:
SAXException
-
appendElement
- Throws:
SAXException
-
appendChildrenToNewParent
- Throws:
SAXException
-
insertFosterParentedChild
protected abstract void insertFosterParentedChild(T child, T table, T stackParent) throws SAXException
- Throws:
SAXException
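As an illustration of what a subclass might do here, the sketch below applies the usual foster-parenting rule to a W3C DOM tree: if the table has a parent, the child goes immediately before the table; otherwise it is appended to the stack parent. This is a sketch under assumptions, not the code of any shipped subclass. The class FosterParentSketch and its demo method are hypothetical, and T is taken to be org.w3c.dom.Element.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical DOM-based sketch of foster parenting. */
class FosterParentSketch {
    static void insertFosterParentedChild(Element child, Element table, Element stackParent) {
        Node parent = table.getParentNode();
        if (parent != null) {
            // The table is in the tree: place the child right before it.
            parent.insertBefore(child, table);
        } else {
            // The table is detached: fall back to the stack parent.
            stackParent.appendChild(child);
        }
    }

    /** Builds a small tree, exercises both branches, and lists the body's children. */
    static String demo() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element body = doc.createElement("body");
        Element table = doc.createElement("table");
        body.appendChild(table);
        insertFosterParentedChild(doc.createElement("b"), table, body);

        Element detachedTable = doc.createElement("table");
        insertFosterParentedChild(doc.createElement("i"), detachedTable, body);

        StringBuilder names = new StringBuilder();
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (names.length() > 0) {
                names.append(',');
            }
            names.append(n.getNodeName());
        }
        return names.toString();
    }
}
```

Here demo() returns "b,table,i": the b element lands before the attached table, and the i element is appended to the stack parent because its table has no parent.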
-
insertFosterParentedCharacters
protected abstract void insertFosterParentedCharacters(char[] buf, int start, int length, T table, T stackParent) throws SAXException
- Throws:
SAXException
-
appendCharacters
protected abstract void appendCharacters(T parent, char[] buf, int start, int length) throws SAXException
- Throws:
SAXException
-
appendIsindexPrompt
- Throws:
SAXException
-
appendComment
protected abstract void appendComment(T parent, char[] buf, int start, int length) throws SAXException
- Throws:
SAXException
-
appendCommentToDocument
protected abstract void appendCommentToDocument(char[] buf, int start, int length) throws SAXException
- Throws:
SAXException
-
addAttributesToElement
protected abstract void addAttributesToElement(T element, HtmlAttributes attributes) throws SAXException
- Throws:
SAXException
-
markMalformedIfScript
- Throws:
SAXException
-
start
- Throws:
SAXException
-
end
- Throws:
SAXException
-
appendDoctypeToDocument
protected void appendDoctypeToDocument(String name, String publicIdentifier, String systemIdentifier) throws SAXException
- Throws:
SAXException
-
elementPushed
- Throws:
SAXException
-
elementPopped
- Throws:
SAXException
-
documentMode
protected void documentMode(DocumentMode m, String publicIdentifier, String systemIdentifier, boolean html4SpecificAdditionalErrorChecks) throws SAXException
- Throws:
SAXException
-
wantsComments
public boolean wantsComments()
Description copied from interface: TokenHandler
If this handler implementation cares about comments, return true. If not, return false.
- Specified by:
wantsComments in interface TokenHandler
- Returns:
- whether this handler wants comments
-
setIgnoringComments
public void setIgnoringComments(boolean ignoreComments)
-
setErrorHandler
Sets the errorHandler.
- Parameters:
errorHandler - the errorHandler to set
-
getErrorHandler
Returns the errorHandler.
- Returns:
- the errorHandler
-
setFragmentContext
The argument MUST be an interned string or null.
- Parameters:
context
-
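One plausible reason for the interning requirement is that names are compared by reference rather than with equals(); that motive is an assumption, since this page does not state it. The sketch below shows why such a comparison only works for interned strings. The class InternedContextSketch and both of its methods are hypothetical.

```java
/** Hypothetical sketch of reference comparison on interned names. */
class InternedContextSketch {
    /** Identity comparison: correct only if the caller passes an interned string or null. */
    static boolean isTableContext(String context) {
        return context == "table";
    }

    /** Prepares a caller-side value so that it satisfies the "interned or null" contract. */
    static String prepare(String raw) {
        return raw == null ? null : raw.intern();
    }
}
```

A caller holding a name that was built at run time would pass prepare(name), because an equal but non-interned copy fails the identity check.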
-
cdataSectionAllowed
Description copied from interface: TokenHandler
Checks if the CDATA sections are allowed.
- Specified by:
cdataSectionAllowed in interface TokenHandler
- Returns:
true if CDATA sections are allowed
- Throws:
SAXException - if something went wrong
-
setFragmentContext
The argument MUST be an interned string or null.
- Parameters:
context
-
-
currentNode
-
isScriptingEnabled
public boolean isScriptingEnabled()
Returns the scriptingEnabled.
- Returns:
- the scriptingEnabled
-
setScriptingEnabled
public void setScriptingEnabled(boolean scriptingEnabled)
Sets the scriptingEnabled.
- Parameters:
scriptingEnabled - the scriptingEnabled to set
-
setDoctypeExpectation
Sets the doctypeExpectation.
- Parameters:
doctypeExpectation - the doctypeExpectation to set
-
setNamePolicy
-
setDocumentModeHandler
Sets the documentModeHandler.
- Parameters:
documentModeHandler - the documentModeHandler to set
-
setReportingDoctype
public void setReportingDoctype(boolean reportingDoctype)
Sets the reportingDoctype.
- Parameters:
reportingDoctype - the reportingDoctype to set
-
flushCharacters
Flushes the pending characters. Public for document.write use cases only.
- Throws:
SAXException
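The charBuffer and charBufferLen fields, together with accumulateCharacters and flushCharacters, point to a pending-text buffer that collects character data and hands it over in one piece. The following standalone sketch illustrates that pattern only. The class PendingText, the doubling growth policy, and the flushed list (standing in for whatever a real subclass does with the text) are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a pending-text buffer; not part of the library. */
class PendingText {
    private char[] charBuffer = new char[4];
    private int charBufferLen = 0;

    /** Receives each flushed run of text; a real builder would append to the tree instead. */
    final List<String> flushed = new ArrayList<>();

    void accumulateCharacters(char[] buf, int start, int length) {
        int needed = charBufferLen + length;
        if (needed > charBuffer.length) {
            // Grow to at least the required size, doubling to amortize copies.
            char[] grown = new char[Math.max(needed, charBuffer.length * 2)];
            System.arraycopy(charBuffer, 0, grown, 0, charBufferLen);
            charBuffer = grown;
        }
        System.arraycopy(buf, start, charBuffer, charBufferLen, length);
        charBufferLen = needed;
    }

    void flushCharacters() {
        if (charBufferLen > 0) {
            flushed.add(new String(charBuffer, 0, charBufferLen));
            charBufferLen = 0; // the array is kept for reuse
        }
    }
}
```

Two accumulate calls followed by one flush produce a single combined string, and a second flush with nothing pending adds nothing.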
-
newSnapshot
Creates a comparable snapshot of the tree builder state. Snapshot creation is only supported immediately after a script end tag has been processed. In C++ the caller is responsible for calling delete on the returned object.
- Returns:
- a snapshot.
- Throws:
SAXException
-
snapshotMatches
-
loadState
- Throws:
SAXException
-
getFormPointer
Description copied from interface: TreeBuilderState
Returns the formPointer.
- Specified by:
getFormPointer in interface TreeBuilderState<T>
- Returns:
- the formPointer
-
getHeadPointer
Returns the headPointer.
- Specified by:
getHeadPointer in interface TreeBuilderState<T>
- Returns:
- the headPointer
-
getDeepTreeSurrogateParent
Returns the deepTreeSurrogateParent.
- Specified by:
getDeepTreeSurrogateParent in interface TreeBuilderState<T>
- Returns:
- the deepTreeSurrogateParent
-
getListOfActiveFormattingElements
Description copied from interface: TreeBuilderState
Returns the listOfActiveFormattingElements.
- Specified by:
getListOfActiveFormattingElements in interface TreeBuilderState<T>
- Returns:
- the listOfActiveFormattingElements
-
getStack
Description copied from interface: TreeBuilderState
Returns the stack.
- Specified by:
getStack in interface TreeBuilderState<T>
- Returns:
- the stack
-
getMode
public int getMode()
Returns the mode.
- Specified by:
getMode in interface TreeBuilderState<T>
- Returns:
- the mode
-
getOriginalMode
public int getOriginalMode()
Returns the originalMode.
- Specified by:
getOriginalMode in interface TreeBuilderState<T>
- Returns:
- the originalMode
-
isFramesetOk
public boolean isFramesetOk()
Returns the framesetOk.
- Specified by:
isFramesetOk in interface TreeBuilderState<T>
- Returns:
- the framesetOk
-
isNeedToDropLF
public boolean isNeedToDropLF()
Returns the needToDropLF.
- Specified by:
isNeedToDropLF in interface TreeBuilderState<T>
- Returns:
- the needToDropLF
-
isQuirks
public boolean isQuirks()
Returns the quirks.
- Specified by:
isQuirks in interface TreeBuilderState<T>
- Returns:
- the quirks
-
getListOfActiveFormattingElementsLength
public int getListOfActiveFormattingElementsLength()
Description copied from interface: TreeBuilderState
Return the length of the list of active formatting elements.
- Specified by:
getListOfActiveFormattingElementsLength in interface TreeBuilderState<T>
- Returns:
- the length of the list of active formatting elements.
-
getStackLength
public int getStackLength()
Description copied from interface: TreeBuilderState
Return the length of the stack.
- Specified by:
getStackLength in interface TreeBuilderState<T>
- Returns:
- the length of the stack.
-