Package nu.validator.htmlparser.impl
Class ErrorReportingTokenizer
java.lang.Object
    nu.validator.htmlparser.impl.Tokenizer
        nu.validator.htmlparser.impl.ErrorReportingTokenizer
- All Implemented Interfaces: Locator
-
Field Summary
Fields inherited from class nu.validator.htmlparser.impl.Tokenizer
AFTER_ATTRIBUTE_NAME, AFTER_ATTRIBUTE_VALUE_QUOTED, AFTER_DOCTYPE_NAME, AFTER_DOCTYPE_PUBLIC_IDENTIFIER, AFTER_DOCTYPE_PUBLIC_KEYWORD, AFTER_DOCTYPE_SYSTEM_IDENTIFIER, AFTER_DOCTYPE_SYSTEM_KEYWORD, ampersandLocation, ATTRIBUTE_NAME, ATTRIBUTE_VALUE_DOUBLE_QUOTED, ATTRIBUTE_VALUE_SINGLE_QUOTED, ATTRIBUTE_VALUE_UNQUOTED, attributeName, BEFORE_ATTRIBUTE_NAME, BEFORE_ATTRIBUTE_VALUE, BEFORE_DOCTYPE_NAME, BEFORE_DOCTYPE_PUBLIC_IDENTIFIER, BEFORE_DOCTYPE_SYSTEM_IDENTIFIER, BETWEEN_DOCTYPE_PUBLIC_AND_SYSTEM_IDENTIFIERS, BOGUS_COMMENT, BOGUS_COMMENT_HYPHEN, BOGUS_DOCTYPE, CDATA_RSQB, CDATA_RSQB_RSQB, CDATA_SECTION, CDATA_START, CHARACTER_REFERENCE_HILO_LOOKUP, CHARACTER_REFERENCE_TAIL, CLOSE_TAG_OPEN, COMMENT, COMMENT_END, COMMENT_END_BANG, COMMENT_END_DASH, COMMENT_START, COMMENT_START_DASH, confident, CONSUME_CHARACTER_REFERENCE, CONSUME_NCR, cstart, currentBufferGlobalOffset, DATA, DECIMAL_NRC_LOOP, DOCTYPE, DOCTYPE_NAME, DOCTYPE_PUBLIC_IDENTIFIER_DOUBLE_QUOTED, DOCTYPE_PUBLIC_IDENTIFIER_SINGLE_QUOTED, DOCTYPE_SYSTEM_IDENTIFIER_DOUBLE_QUOTED, DOCTYPE_SYSTEM_IDENTIFIER_SINGLE_QUOTED, DOCTYPE_UBLIC, DOCTYPE_YSTEM, encodingDeclarationHandler, endTag, endTagExpectation, errorHandler, HANDLE_NCR_VALUE, HANDLE_NCR_VALUE_RECONSUME, HEX_NCR_LOOP, html4, index, lastCR, MARKUP_DECLARATION_HYPHEN, MARKUP_DECLARATION_OCTYPE, MARKUP_DECLARATION_OPEN, NON_DATA_END_TAG_NAME, PLAINTEXT, RAWTEXT, RAWTEXT_RCDATA_LESS_THAN_SIGN, RCDATA, SCRIPT_DATA, SCRIPT_DATA_DOUBLE_ESCAPE_END, SCRIPT_DATA_DOUBLE_ESCAPE_START, SCRIPT_DATA_DOUBLE_ESCAPED, SCRIPT_DATA_DOUBLE_ESCAPED_DASH, SCRIPT_DATA_DOUBLE_ESCAPED_DASH_DASH, SCRIPT_DATA_DOUBLE_ESCAPED_LESS_THAN_SIGN, SCRIPT_DATA_ESCAPE_START, SCRIPT_DATA_ESCAPE_START_DASH, SCRIPT_DATA_ESCAPED, SCRIPT_DATA_ESCAPED_DASH, SCRIPT_DATA_ESCAPED_DASH_DASH, SCRIPT_DATA_ESCAPED_LESS_THAN_SIGN, SCRIPT_DATA_LESS_THAN_SIGN, SELF_CLOSING_START_TAG, stateSave, TAG_NAME, TAG_OPEN, tokenHandler, value
-
Constructor Summary
Constructors
ErrorReportingTokenizer(TokenHandler tokenHandler)
ErrorReportingTokenizer(TokenHandler tokenHandler, boolean newAttributesEachTime)
-
Method Summary
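Modifier and Type, Method, Description
protected char checkChar(char[] buf, int pos)
protected void errAstralNonCharacter(int ch)
protected void errAttributeValueMissing
protected void errBadCharAfterLt(char c)
protected void errBadCharBeforeAttributeNameOrNull
protected void errBogusComment
protected void errBogusDoctype
protected void errCharRefLacksSemicolon
protected void errConsecutiveHyphens
protected void errDuplicateAttribute
protected void errEofAfterLt
protected void errEofInAttributeName
protected void errEofInAttributeValue
protected void errEofInComment
protected void errEofInDoctype
protected void errEofInEndTag
protected void errEofInPublicId
protected void errEofInSystemId
protected void errEofInTagName
protected void errEofWithoutGt
protected void errEqualsSignBeforeAttributeName
protected void errExpectedPublicId
protected void errExpectedSystemId
protected void errGarbageAfterLtSlash
protected void errGtInPublicId
protected void errGtInSystemId
protected void errHtml4LtSlashInRcdata(char folded)
protected void errHtml4NonNameInUnquotedAttribute
protected void errHtml4XmlVoidSyntax
protected void errHyphenHyphenBang
protected void errLtGt()
protected void errLtOrEqualsOrGraveInUnquotedAttributeOrNull
protected void errLtSlashGt
protected void errMissingSpaceBeforeDoctypeName
protected void errNamelessDoctype
protected void errNcrControlChar
protected char errNcrControlChar(char ch)
protected void errNcrCr()
protected void errNcrInC1Range
protected char errNcrNonCharacter(char ch)
protected void errNcrOutOfRange
protected void errNcrSurrogate
protected void errNcrUnassigned
protected void errNcrZero
protected void errNoDigitsInNCR
protected void errNoNamedCharacterMatch
protected void errNoSpaceBetweenAttributes
protected void errNoSpaceBetweenDoctypePublicKeywordAndQuote
protected void errNoSpaceBetweenDoctypeSystemKeywordAndQuote
protected void errNoSpaceBetweenPublicAndSystemIds
protected void errNotSemicolonTerminated
protected void errPrematureEndOfComment
protected void errProcessingInstruction
protected void errQuoteBeforeAttributeName(char c)
protected void errQuoteOrLtInAttributeNameOrNull
protected void errSlashNotFollowedByGt
protected void errUnescapedAmpersandInterpretedAsCharacterReference
protected void errUnquotedAttributeValOrNull(char c)
protected void errWarnLtSlashInRcdata
protected void flushChars(char[] buf, int pos)
    Flushes coalesced character tokens.
int getCol()
    Returns the col.
int getColumnNumber()
int getLine()
    Returns the line.
int getLineNumber()
boolean isAlreadyComplainedAboutNonAscii()
    Returns the alreadyComplainedAboutNonAscii.
boolean isNextCharOnNewLine()
    Returns the nextCharOnNewLine.
protected void maybeErrAttributesOnEndTag
protected void maybeErrSlashInEndTag(boolean selfClosing)
protected void maybeWarnPrivateUse(char ch)
protected void maybeWarnPrivateUseAstral
void note
    Reports on an event based on profile selected.
protected void noteAttributeWithoutValue
protected void noteUnquotedAttributeValue
void setContentNonXmlCharPolicy(XmlViolationPolicy contentNonXmlCharPolicy)
    Sets the contentNonXmlCharPolicy.
void setErrorProfile(HashMap<String, String> errorProfileMap)
    Sets the errorProfile.
void setTransitionBaseOffset(int offset)
    Sets an offset to be added to the position reported to TransitionHandler.
void setTransitionHandler(TransitionHandler transitionHandler)
    Sets the transitionHandler.
protected void silentCarriageReturn()
protected void silentLineFeed()
protected void startErrorReporting
protected int transition(int from, int to, boolean reconsume, int pos)
Methods inherited from class nu.validator.htmlparser.impl.Tokenizer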
becomeConfident, end, eof, err, errTreeBuilder, fatal, getErrorHandler, getPublicId, getSystemId, initializeWithoutStarting, initLocation, internalEncodingDeclaration, isInDataState, isMappingLangToXmlLang, isPrevCR, loadState, notifyAboutMetaBoundary, requestSuspension, resetToDataState, setCommentPolicy, setContentSpacePolicy, setEncodingDeclarationHandler, setErrorHandler, setHtml4ModeCompatibleWithXhtml1Schemata, setInterner, setLineNumber, setMappingLangToXmlLang, setNamePolicy, setStateAndEndTagExpectation, setStateAndEndTagExpectation, setXmlnsPolicy, start, strBufToString, tokenizeBuffer, warn
-
Constructor Details
-
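ErrorReportingTokenizer
ErrorReportingTokenizer(TokenHandler tokenHandler, boolean newAttributesEachTime)
- Parameters:
tokenHandler -
newAttributesEachTime -
-
ErrorReportingTokenizer
ErrorReportingTokenizer(TokenHandler tokenHandler)
- Parameters:
tokenHandler -
-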
-
Method Details
-
getLineNumber
public int getLineNumber()
- Specified by: getLineNumber in interface Locator
- Overrides: getLineNumber in class Tokenizer
- See Also:
-
getColumnNumber
public int getColumnNumber()
- Specified by: getColumnNumber in interface Locator
- Overrides: getColumnNumber in class Tokenizer
- See Also:
-
setContentNonXmlCharPolicy
Sets the contentNonXmlCharPolicy.
- Overrides: setContentNonXmlCharPolicy in class Tokenizer
- Parameters:
contentNonXmlCharPolicy - the contentNonXmlCharPolicy to set
-
setErrorProfile
Sets the errorProfile.
- Parameters:
errorProfile -
-
note
Reports an event according to the selected profile.
- Parameters:
profile - the profile this message belongs to
message - the message itself
- Throws: SAXException
-
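As a rough illustration of profile-based reporting, the sketch below maps profile names to a severity and routes each note accordingly. The class `ProfileNotes`, the severity strings "warn" and "err", the profile names, and the rule that unmapped profiles are silently dropped are all assumptions made for this example; none of them is stated on this page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a tokenizer that reports notes by profile.
class ProfileNotes {
    // Assumed convention: profile name -> "warn" or "err"; anything else is ignored.
    private final Map<String, String> errorProfileMap;
    final List<String> warnings = new ArrayList<>();
    final List<String> errors = new ArrayList<>();

    ProfileNotes(Map<String, String> errorProfileMap) {
        this.errorProfileMap = errorProfileMap;
    }

    // Routes the message according to the severity configured for its profile.
    void note(String profile, String message) {
        if (errorProfileMap == null) {
            return; // no profile map set: stay silent
        }
        String level = errorProfileMap.get(profile);
        if ("warn".equals(level)) {
            warnings.add(message);
        } else if ("err".equals(level)) {
            errors.add(message);
        }
    }
}

class ProfileNotesDemo {
    public static void main(String[] args) {
        HashMap<String, String> profile = new HashMap<>();
        profile.put("profileA", "warn"); // hypothetical profile names
        profile.put("profileB", "err");
        ProfileNotes notes = new ProfileNotes(profile);
        notes.note("profileA", "Unquoted attribute value.");
        notes.note("profileB", "Attribute without value.");
        notes.note("profileC", "Not in the map, so dropped.");
        System.out.println(notes.warnings + " " + notes.errors);
    }
}
```

Keeping the severity in a map lets a caller switch a whole category of messages between warning, error, and silence without touching the tokenizer itself.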
startErrorReporting
- Overrides: startErrorReporting in class Tokenizer
- Throws: SAXException
-
silentCarriageReturn
protected void silentCarriageReturn()
- Overrides: silentCarriageReturn in class Tokenizer
-
silentLineFeed
protected void silentLineFeed()
- Overrides: silentLineFeed in class Tokenizer
-
getLine
public int getLine()
Returns the line.
-
getCol
public int getCol()
Returns the col.
-
isNextCharOnNewLine
public boolean isNextCharOnNewLine()
Returns the nextCharOnNewLine.
- Overrides: isNextCharOnNewLine in class Tokenizer
- Returns: the nextCharOnNewLine
-
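The getters above (getLine, getCol, isNextCharOnNewLine) expose a position that is updated as input is consumed. The following is a minimal sketch of one way such tracking could work, assuming 1-based lines, a line break that only takes effect on the following character, and a CRLF pair counted as a single break. The class `LineColTracker` and its `accept` method are hypothetical and not part of this package.

```java
// Hypothetical helper, not part of nu.validator.htmlparser.
class LineColTracker {
    private int line = 1;                 // assumed to be 1-based
    private int col = 0;                  // column of the last character consumed
    private boolean nextCharOnNewLine = false;
    private boolean prevCr = false;

    // Consumes one character and updates the position.
    void accept(char c) {
        if (c == '\n' && prevCr) {
            prevCr = false;               // LF completing a CRLF pair: break already recorded
            return;
        }
        if (nextCharOnNewLine) {          // the previous character ended a line
            line++;
            col = 0;
            nextCharOnNewLine = false;
        }
        col++;
        prevCr = (c == '\r');
        if (c == '\r' || c == '\n') {
            nextCharOnNewLine = true;     // defer the line bump to the next character
        }
    }

    int getLine() { return line; }
    int getCol() { return col; }
    boolean isNextCharOnNewLine() { return nextCharOnNewLine; }
}

class LineColTrackerDemo {
    public static void main(String[] args) {
        LineColTracker tracker = new LineColTracker();
        for (char c : "ab\r\ncd".toCharArray()) {
            tracker.accept(c);
        }
        System.out.println(tracker.getLine() + ":" + tracker.getCol());
    }
}
```

Deferring the line increment until the next character means a line terminator is still reported on the line it ends.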
isAlreadyComplainedAboutNonAscii
public boolean isAlreadyComplainedAboutNonAscii()
Returns the alreadyComplainedAboutNonAscii.
- Overrides: isAlreadyComplainedAboutNonAscii in class Tokenizer
- Returns: the alreadyComplainedAboutNonAscii
-
flushChars
Flushes coalesced character tokens.
- Overrides: flushChars in class Tokenizer
- Parameters:
buf - TODO
pos - TODO
- Throws: SAXException
-
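To make "coalesced character tokens" concrete, here is a small sketch of the general technique: remember where a run of plain characters starts and emit the whole run in one call when it is flushed. The field name `cstart` is borrowed from the inherited field list; its meaning here, the `CharSink` interface, and the `markStart` method are assumptions for illustration only.

```java
// Hypothetical illustration of coalescing characters into a single token.
class CharCoalescer {
    // Assumed receiver of character runs.
    interface CharSink {
        void characters(char[] buf, int start, int length);
    }

    private final CharSink sink;
    // Start of the pending run; Integer.MAX_VALUE means "nothing pending".
    private int cstart = Integer.MAX_VALUE;

    CharCoalescer(CharSink sink) {
        this.sink = sink;
    }

    // Records the start of a run unless an earlier start is already pending.
    void markStart(int pos) {
        cstart = Math.min(cstart, pos);
    }

    // Emits everything from the pending start up to (not including) pos.
    void flushChars(char[] buf, int pos) {
        if (pos > cstart) {
            sink.characters(buf, cstart, pos - cstart);
        }
        cstart = Integer.MAX_VALUE;
    }
}

class CharCoalescerDemo {
    public static void main(String[] args) {
        char[] buf = "Hi there<b>".toCharArray();
        CharCoalescer coalescer = new CharCoalescer(
                (b, start, length) -> System.out.println(new String(b, start, length)));
        coalescer.markStart(0);
        coalescer.flushChars(buf, 8); // flush the text that precedes '<'
    }
}
```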
checkChar
- Overrides: checkChar in class Tokenizer
- Throws: SAXException
-
transition
- Overrides: transition in class Tokenizer
- Throws: SAXException
- See Also:
-
errGarbageAfterLtSlash
- Overrides: errGarbageAfterLtSlash in class Tokenizer
- Throws: SAXException
-
errLtSlashGt
- Overrides: errLtSlashGt in class Tokenizer
- Throws: SAXException
-
errWarnLtSlashInRcdata
- Overrides: errWarnLtSlashInRcdata in class Tokenizer
- Throws: SAXException
-
errHtml4LtSlashInRcdata
- Overrides: errHtml4LtSlashInRcdata in class Tokenizer
- Throws: SAXException
-
errCharRefLacksSemicolon
- Overrides: errCharRefLacksSemicolon in class Tokenizer
- Throws: SAXException
-
errNoDigitsInNCR
- Overrides: errNoDigitsInNCR in class Tokenizer
- Throws: SAXException
-
errGtInSystemId
- Overrides: errGtInSystemId in class Tokenizer
- Throws: SAXException
-
errGtInPublicId
- Overrides: errGtInPublicId in class Tokenizer
- Throws: SAXException
-
errNamelessDoctype
- Overrides: errNamelessDoctype in class Tokenizer
- Throws: SAXException
-
errConsecutiveHyphens
- Overrides: errConsecutiveHyphens in class Tokenizer
- Throws: SAXException
-
errPrematureEndOfComment
- Overrides: errPrematureEndOfComment in class Tokenizer
- Throws: SAXException
-
errBogusComment
- Overrides: errBogusComment in class Tokenizer
- Throws: SAXException
-
errUnquotedAttributeValOrNull
- Overrides: errUnquotedAttributeValOrNull in class Tokenizer
- Throws: SAXException
-
errSlashNotFollowedByGt
- Overrides: errSlashNotFollowedByGt in class Tokenizer
- Throws: SAXException
-
errHtml4XmlVoidSyntax
- Overrides: errHtml4XmlVoidSyntax in class Tokenizer
- Throws: SAXException
-
errNoSpaceBetweenAttributes
- Overrides: errNoSpaceBetweenAttributes in class Tokenizer
- Throws: SAXException
-
errHtml4NonNameInUnquotedAttribute
- Overrides: errHtml4NonNameInUnquotedAttribute in class Tokenizer
- Throws: SAXException
-
errLtOrEqualsOrGraveInUnquotedAttributeOrNull
- Overrides: errLtOrEqualsOrGraveInUnquotedAttributeOrNull in class Tokenizer
- Throws: SAXException
-
errAttributeValueMissing
- Overrides: errAttributeValueMissing in class Tokenizer
- Throws: SAXException
-
errBadCharBeforeAttributeNameOrNull
- Overrides: errBadCharBeforeAttributeNameOrNull in class Tokenizer
- Throws: SAXException
-
errEqualsSignBeforeAttributeName
- Overrides: errEqualsSignBeforeAttributeName in class Tokenizer
- Throws: SAXException
-
errBadCharAfterLt
- Overrides: errBadCharAfterLt in class Tokenizer
- Throws: SAXException
-
errLtGt
- Overrides: errLtGt in class Tokenizer
- Throws: SAXException
-
errProcessingInstruction
- Overrides: errProcessingInstruction in class Tokenizer
- Throws: SAXException
-
errUnescapedAmpersandInterpretedAsCharacterReference
- Overrides: errUnescapedAmpersandInterpretedAsCharacterReference in class Tokenizer
- Throws: SAXException
-
errNotSemicolonTerminated
- Overrides: errNotSemicolonTerminated in class Tokenizer
- Throws: SAXException
-
errNoNamedCharacterMatch
- Overrides: errNoNamedCharacterMatch in class Tokenizer
- Throws: SAXException
-
errQuoteBeforeAttributeName
- Overrides: errQuoteBeforeAttributeName in class Tokenizer
- Throws: SAXException
-
errQuoteOrLtInAttributeNameOrNull
- Overrides: errQuoteOrLtInAttributeNameOrNull in class Tokenizer
- Throws: SAXException
-
errExpectedPublicId
- Overrides: errExpectedPublicId in class Tokenizer
- Throws: SAXException
-
errBogusDoctype
- Overrides: errBogusDoctype in class Tokenizer
- Throws: SAXException
-
maybeWarnPrivateUseAstral
- Overrides: maybeWarnPrivateUseAstral in class Tokenizer
- Throws: SAXException
-
maybeWarnPrivateUse
- Overrides: maybeWarnPrivateUse in class Tokenizer
- Throws: SAXException
-
maybeErrAttributesOnEndTag
- Overrides: maybeErrAttributesOnEndTag in class Tokenizer
- Throws: SAXException
-
maybeErrSlashInEndTag
- Overrides: maybeErrSlashInEndTag in class Tokenizer
- Throws: SAXException
-
errNcrNonCharacter
- Overrides: errNcrNonCharacter in class Tokenizer
- Throws: SAXException
-
errAstralNonCharacter
- Overrides: errAstralNonCharacter in class Tokenizer
- Throws: SAXException
- See Also:
-
errNcrSurrogate
- Overrides: errNcrSurrogate in class Tokenizer
- Throws: SAXException
-
errNcrControlChar
- Overrides: errNcrControlChar in class Tokenizer
- Throws: SAXException
-
errNcrCr
- Overrides: errNcrCr in class Tokenizer
- Throws: SAXException
-
errNcrInC1Range
- Overrides: errNcrInC1Range in class Tokenizer
- Throws: SAXException
-
errEofInPublicId
- Overrides: errEofInPublicId in class Tokenizer
- Throws: SAXException
-
errEofInComment
- Overrides: errEofInComment in class Tokenizer
- Throws: SAXException
-
errEofInDoctype
- Overrides: errEofInDoctype in class Tokenizer
- Throws: SAXException
-
errEofInAttributeValue
- Overrides: errEofInAttributeValue in class Tokenizer
- Throws: SAXException
-
errEofInAttributeName
- Overrides: errEofInAttributeName in class Tokenizer
- Throws: SAXException
-
errEofWithoutGt
- Overrides: errEofWithoutGt in class Tokenizer
- Throws: SAXException
-
errEofInTagName
- Overrides: errEofInTagName in class Tokenizer
- Throws: SAXException
-
errEofInEndTag
- Overrides: errEofInEndTag in class Tokenizer
- Throws: SAXException
-
errEofAfterLt
- Overrides: errEofAfterLt in class Tokenizer
- Throws: SAXException
-
errNcrOutOfRange
- Overrides: errNcrOutOfRange in class Tokenizer
- Throws: SAXException
-
errNcrUnassigned
- Overrides: errNcrUnassigned in class Tokenizer
- Throws: SAXException
-
errDuplicateAttribute
- Overrides: errDuplicateAttribute in class Tokenizer
- Throws: SAXException
-
errEofInSystemId
- Overrides: errEofInSystemId in class Tokenizer
- Throws: SAXException
-
errExpectedSystemId
- Overrides: errExpectedSystemId in class Tokenizer
- Throws: SAXException
-
errMissingSpaceBeforeDoctypeName
- Overrides: errMissingSpaceBeforeDoctypeName in class Tokenizer
- Throws: SAXException
-
errHyphenHyphenBang
- Overrides: errHyphenHyphenBang in class Tokenizer
- Throws: SAXException
-
errNcrControlChar
- Overrides: errNcrControlChar in class Tokenizer
- Throws: SAXException
-
errNcrZero
- Overrides: errNcrZero in class Tokenizer
- Throws: SAXException
-
errNoSpaceBetweenDoctypeSystemKeywordAndQuote
- Overrides: errNoSpaceBetweenDoctypeSystemKeywordAndQuote in class Tokenizer
- Throws: SAXException
-
errNoSpaceBetweenPublicAndSystemIds
- Overrides: errNoSpaceBetweenPublicAndSystemIds in class Tokenizer
- Throws: SAXException
-
errNoSpaceBetweenDoctypePublicKeywordAndQuote
- Overrides: errNoSpaceBetweenDoctypePublicKeywordAndQuote in class Tokenizer
- Throws: SAXException
-
noteAttributeWithoutValue
- Overrides: noteAttributeWithoutValue in class Tokenizer
- Throws: SAXException
-
noteUnquotedAttributeValue
- Overrides: noteUnquotedAttributeValue in class Tokenizer
- Throws: SAXException
-
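Nearly every err…, maybeErr… and note… method listed above overrides a method of the same name in Tokenizer. The sketch below shows the general shape of that hook pattern, assuming the base class leaves each hook empty so that a non-reporting tokenizer pays nothing for it; this page does not state that. The classes `QuietTokenizer` and `ReportingTokenizer`, the simplified `tokenize` method, and the message text are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: detects a condition but stays silent about it.
class QuietTokenizer {
    // Hook with an empty body (assumed); subclasses may override it to report.
    protected void errLtGt() {
    }

    // Greatly simplified scan: calls the hook for every "<>" in the input.
    void tokenize(String input) {
        int i = input.indexOf("<>");
        while (i >= 0) {
            errLtGt();
            i = input.indexOf("<>", i + 2);
        }
    }
}

// Hypothetical subclass: same scanning logic, but the hook records an error.
class ReportingTokenizer extends QuietTokenizer {
    final List<String> errors = new ArrayList<>();

    @Override
    protected void errLtGt() {
        errors.add("Saw \"<>\"."); // illustrative message text
    }
}

class HookPatternDemo {
    public static void main(String[] args) {
        ReportingTokenizer tokenizer = new ReportingTokenizer();
        tokenizer.tokenize("a<>b<>c");
        System.out.println(tokenizer.errors);
    }
}
```

With this arrangement the scanning code exists once, and error reporting is switched on purely by choosing which class to instantiate.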
setTransitionHandler
Sets the transitionHandler.
- Parameters:
transitionHandler - the transitionHandler to set
-
setTransitionBaseOffset
public void setTransitionBaseOffset(int offset)
Sets an offset to be added to the position reported to TransitionHandler.
- Overrides: setTransitionBaseOffset in class Tokenizer
- Parameters:
offset - the offset
-
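The effect of a base offset can be sketched as follows: positions passed to the transition method are relative to the current buffer, and the offset is added before they reach the handler. The nested `Handler` interface, the `TransitionReporter` class, and the choice to return the target state are assumptions for this example; only the parameter list (from, to, reconsume, pos) is taken from the method summary.

```java
// Hypothetical illustration of reporting state transitions with a base offset.
class TransitionReporter {
    // Assumed callback shape, mirroring the transition parameter list.
    interface Handler {
        void transition(int from, int to, boolean reconsume, int pos);
    }

    private Handler handler;
    private int baseOffset;

    void setTransitionHandler(Handler handler) {
        this.handler = handler;
    }

    void setTransitionBaseOffset(int offset) {
        this.baseOffset = offset;
    }

    // Reports the transition with an adjusted position and returns the new state.
    int transition(int from, int to, boolean reconsume, int pos) {
        if (handler != null) {
            handler.transition(from, to, reconsume, baseOffset + pos);
        }
        return to;
    }
}

class TransitionReporterDemo {
    public static void main(String[] args) {
        TransitionReporter reporter = new TransitionReporter();
        reporter.setTransitionHandler((from, to, reconsume, pos) ->
                System.out.println(from + " -> " + to + " at " + pos));
        reporter.setTransitionBaseOffset(100); // e.g. the buffer starts 100 chars into the stream
        reporter.transition(1, 2, false, 5);
    }
}
```

A base offset of this kind is useful when input arrives in several buffers and the handler needs positions relative to the whole stream rather than to the current buffer.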