Class TagNode

All Implemented Interfaces:
BaseToken, HtmlNode

public class TagNode extends TagToken implements HtmlNode

XML tag node: the basic node of the cleaned HTML tree. It also represents a start tag token after the HTML parsing phase and before the cleaning phase. After cleaning, the tree consists of tag nodes (TagNode), text content (ContentNode), comments (CommentNode) and, optionally, a doctype node (DoctypeToken).

  • Constructor Details

    • TagNode

      public TagNode(String name)
  • Method Details

    • getName

      public String getName()
      Overrides:
      getName in class TagToken
    • getAttributeByName

      public String getAttributeByName(String attName)
      Parameters:
      attName - Name of the attribute to look up
      Returns:
      Value of the specified attribute, or null if this tag doesn't contain it.
    • getAttributes

      public Map<String,String> getAttributes()
      Returns the attributes of the tag node.
      Returns:
      Map instance containing all attribute name/value pairs.
    • getAttributesInLowerCase

      public Map<String,String> getAttributesInLowerCase()
      Returns the attributes of the tag node, with attribute names in lower case.
      Returns:
      Map instance containing all attribute name/value pairs, with attribute names transformed to lower case.
    • setAttributes

      public void setAttributes(Map<String,String> attributes)
      Replace the current set of attributes with a new set.
      Parameters:
      attributes - New attribute name/value pairs
    • hasAttribute

      public boolean hasAttribute(String attName)
      Checks existence of specified attribute.
      Parameters:
      attName - Name of the attribute to check
      Returns:
      true if TagNode has attribute
    • addAttribute

      public void addAttribute(String attName, String attValue)
      Adds specified attribute to this tag or overrides existing one.
      Parameters:
      attName - Name of the attribute to add or override
      attValue - Value to assign to the attribute
    • removeAttribute

      public void removeAttribute(String attName)
      Removes specified attribute from this tag.
      Parameters:
      attName - Name of the attribute to remove
    • getChildren

      @Deprecated public List<TagNode> getChildren()
      Deprecated.
      Use getChildTagList() instead; this method will be refactored and possibly removed in future versions. TODO: This method should be refactored because it does not properly follow the common Java getter/setter convention.
      Returns:
      List of child TagNode objects.
    • setChildren

      public void setChildren(List<? extends BaseToken> children)
    • getAllChildren

      public List<? extends BaseToken> getAllChildren()
    • getChildTagList

      public List<TagNode> getChildTagList()
      Returns:
      List of child TagNode objects.
    • hasChildren

      public boolean hasChildren()
      Returns:
      Whether this node has child elements or not.
    • getChildTags

      public TagNode[] getChildTags()
      Returns:
      An array of child TagNode instances.
    • getText

      public CharSequence getText()
      Returns:
      Text content of this node and its subelements.
    • getChildIndex

      public int getChildIndex(HtmlNode child)
      Parameters:
      child - Child to find index of
      Returns:
      Index of the specified child node among this node's children, or -1 if the node is not a child of this node.
    • insertChild

      public void insertChild(int index, HtmlNode childToAdd)
      Inserts the specified node at the specified position in the list of children.
      Parameters:
      index - Position at which to insert the node
      childToAdd - Node to be inserted
    • insertChildBefore

      public void insertChildBefore(HtmlNode node, HtmlNode nodeToInsert)
      Inserts specified node in the list of children before specified child
      Parameters:
      node - Child before which to insert new node
      nodeToInsert - Node to be inserted at specified position
    • insertChildAfter

      public void insertChildAfter(HtmlNode node, HtmlNode nodeToInsert)
      Inserts specified node in the list of children after specified child
      Parameters:
      node - Child after which to insert new node
      nodeToInsert - Node to be inserted at specified position
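The relationship between getChildIndex, insertChildBefore and insertChildAfter can be sketched with a plain list. This is a stand-in model rather than HtmlCleaner code: children are represented as strings, and the helper names are hypothetical. The sketch assumes that an unknown reference child leaves the list unchanged, which the documentation above does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChildInsertSketch {
    // Models getChildIndex: position of the child in the list, or -1 if absent.
    static int childIndex(List<String> children, String child) {
        return children.indexOf(child);
    }

    // Models insertChildBefore: the new node takes the reference child's index.
    static void insertBefore(List<String> children, String ref, String toInsert) {
        int index = childIndex(children, ref);
        if (index >= 0) { // assumption: a missing reference child is ignored
            children.add(index, toInsert);
        }
    }

    // Models insertChildAfter: the new node lands one position past the reference child.
    static void insertAfter(List<String> children, String ref, String toInsert) {
        int index = childIndex(children, ref);
        if (index >= 0) { // assumption: a missing reference child is ignored
            children.add(index + 1, toInsert);
        }
    }

    static List<String> demo() {
        List<String> children = new ArrayList<>(Arrays.asList("h1", "p"));
        insertBefore(children, "p", "img");     // [h1, img, p]
        insertAfter(children, "p", "footer");   // [h1, img, p, footer]
        insertBefore(children, "missing", "x"); // unchanged
        return children;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [h1, img, p, footer]
    }
}
```

The list model compares children with equals; the real class may compare node identity instead.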
    • getDocType

      public DoctypeToken getDocType()
    • setDocType

      public void setDocType(DoctypeToken docType)
    • addChild

      public void addChild(Object child)
    • addChildren

      public void addChildren(List newChildren)
      Add all elements from specified list to this node.
      Parameters:
      newChildren - List of child nodes to add
    • getElementList

      public List<? extends TagNode> getElementList(ITagNodeCondition condition, boolean isRecursive)
      Get all elements in the tree that satisfy specified condition.
      Parameters:
      condition - Condition that elements must satisfy
      isRecursive - Whether to search recursively
      Returns:
      List of TagNode instances that satisfy the specified condition.
    • getAllElementsList

      public List<? extends TagNode> getAllElementsList(boolean isRecursive)
    • getAllElements

      public TagNode[] getAllElements(boolean isRecursive)
    • findElementByName

      public TagNode findElementByName(String findName, boolean isRecursive)
    • getElementListByName

      public List<? extends TagNode> getElementListByName(String findName, boolean isRecursive)
    • getElementsByName

      public TagNode[] getElementsByName(String findName, boolean isRecursive)
    • findElementHavingAttribute

      public TagNode findElementHavingAttribute(String attName, boolean isRecursive)
    • getElementListHavingAttribute

      public List<? extends TagNode> getElementListHavingAttribute(String attName, boolean isRecursive)
    • getElementsHavingAttribute

      public TagNode[] getElementsHavingAttribute(String attName, boolean isRecursive)
    • findElementByAttValue

      public TagNode findElementByAttValue(String attName, String attValue, boolean isRecursive, boolean isCaseSensitive)
    • getElementListByAttValue

      public List<? extends TagNode> getElementListByAttValue(String attName, String attValue, boolean isRecursive, boolean isCaseSensitive)
    • getElementsByAttValue

      public TagNode[] getElementsByAttValue(String attName, String attValue, boolean isRecursive, boolean isCaseSensitive)
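The search methods above share an isRecursive flag that this page does not describe. Judging by its name, it presumably selects between searching only direct children and searching the whole subtree. The sketch below illustrates that assumed distinction with a hypothetical stand-in `Node` class; it is not HtmlCleaner's implementation, and details such as name case handling may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class RecursiveSearchSketch {
    /** Hypothetical stand-in for an element node; not an HtmlCleaner class. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
        Node add(Node child) { children.add(child); return this; }
    }

    // Collects descendants of 'start' whose name matches 'findName'.
    // With isRecursive == false only direct children are inspected (assumed semantics).
    static List<Node> byName(Node start, String findName, boolean isRecursive) {
        List<Node> result = new ArrayList<>();
        collect(start, findName, isRecursive, result);
        return result;
    }

    private static void collect(Node node, String findName, boolean isRecursive, List<Node> out) {
        for (Node child : node.children) {
            if (child.name.equals(findName)) {
                out.add(child);
            }
            if (isRecursive) {
                collect(child, findName, true, out);
            }
        }
    }

    // body > [ div > [ a, p > [ a ] ], a ]
    static Node sampleTree() {
        Node p = new Node("p").add(new Node("a"));
        Node div = new Node("div").add(new Node("a")).add(p);
        return new Node("body").add(div).add(new Node("a"));
    }

    public static void main(String[] args) {
        Node body = sampleTree();
        System.out.println(byName(body, "a", false).size()); // 1
        System.out.println(byName(body, "a", true).size());  // 3
    }
}
```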
    • evaluateXPath

      public Object[] evaluateXPath(String xPathExpression) throws XPatherException
      Evaluates an XPath expression on the given node.
      This is not a full XPath parser and evaluator. The examples below show supported constructs:
      • //div//a
      • //div//a[@id][@class]
      • /body/*[1]/@type
      • //div[3]//a[@id][@href='r/n4']
      • (//div[last() >= 4]//./div[position() = last()])[position() > 22]//li[2]//a
      • //div[2]/@*[2]
      • data(//div//a[@id][@class])
      • //p/last()
      • //body//div[3][@class]//span[12.2<position()]/@id
      • data(//a['v' < @id])
      Parameters:
      xPathExpression - XPath expression to evaluate
      Returns:
      result of XPather evaluation.
      Throws:
      XPatherException
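The first two example expressions above are ordinary XPath, so what they select can be checked with the JDK's own javax.xml.xpath engine. The sketch below deliberately swaps in that engine instead of calling evaluateXPath, and the sample markup is made up for illustration. HtmlCleaner's evaluator is described above as only partially supporting XPath, so its results may differ in edge cases.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSketch {
    // Illustrative, well-formed markup (an assumption, not taken from the library).
    static final String XML = "<html><body>"
            + "<div><a id=\"a1\" class=\"nav\" href=\"/x\">one</a><a href=\"/y\">two</a></div>"
            + "<div><p><a id=\"a2\" href=\"/z\">three</a></p></div>"
            + "</body></html>";

    // Returns how many nodes the expression selects in the sample markup.
    static int count(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Every <a> nested at any depth inside a <div>.
        System.out.println(count("//div//a"));              // 3
        // Only those <a> elements that carry both an id and a class attribute.
        System.out.println(count("//div//a[@id][@class]")); // 1
    }
}
```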
    • removeFromTree

      public boolean removeFromTree()
      Remove this node from the tree.
      Returns:
      True if the element was removed (that is, if it is not the root node).
    • removeChild

      public boolean removeChild(Object child)
      Remove specified child element from this node.
      Parameters:
      child - Child element to remove
      Returns:
      True if child object existed in the children list.
    • removeAllChildren

      public void removeAllChildren()
      Removes all children (subelements and text content).
    • setAutoGenerated

      public void setAutoGenerated(boolean autoGenerated)
      Parameters:
      autoGenerated - the new value of the autoGenerated flag
    • isAutoGenerated

      public boolean isAutoGenerated()
      Returns:
      the value of the autoGenerated flag
    • isPruned

      public boolean isPruned()
      Returns:
      true, if node was marked to be pruned.
    • setPruned

      public void setPruned(boolean pruned)
    • isEmpty

      public boolean isEmpty()
    • addNamespaceDeclaration

      public void addNamespaceDeclaration(String nsPrefix, String nsURI)
      Adds namespace declaration to the node
      Parameters:
      nsPrefix - Namespace prefix
      nsURI - Namespace URI
    • getNamespaceDeclarations

      public Map<String,String> getNamespaceDeclarations()
      Returns:
      Map of namespace declarations for this node
    • serialize

      public void serialize(Serializer serializer, Writer writer) throws IOException
      Specified by:
      serialize in interface BaseToken
      Overrides:
      serialize in class BaseHtmlNode
      Throws:
      IOException
    • makeCopy

      public TagNode makeCopy()
    • isCopy

      public boolean isCopy()
    • traverse

      public void traverse(TagNodeVisitor visitor)
      Traverses the tree and performs the visitor's action on each node. Traversal stops when the whole tree has been visited or when the visitor returns false.
      Parameters:
      visitor - TagNodeVisitor implementation
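The stop-on-false contract described above can be sketched with stand-in types. `Node` and `Visitor` below are hypothetical and are not HtmlCleaner classes, and the pre-order, depth-first visiting order is an assumption; only the rule that a false return value ends the traversal comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;

public class TraverseSketch {
    /** Hypothetical stand-in for a tree node; not an HtmlCleaner class. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
        Node add(Node child) { children.add(child); return this; }
    }

    /** Hypothetical visitor: returning false asks the traversal to stop. */
    interface Visitor {
        boolean visit(Node parent, Node node);
    }

    // Visits 'node' and then its descendants; returns false once the visitor has stopped it.
    static boolean traverse(Node parent, Node node, Visitor visitor) {
        if (!visitor.visit(parent, node)) {
            return false;
        }
        for (Node child : node.children) {
            if (!traverse(node, child, visitor)) {
                return false;
            }
        }
        return true;
    }

    // Records visited node names, stopping right after the node called 'stopName'.
    static List<String> visitedUntil(String stopName) {
        Node body = new Node("body").add(new Node("div")).add(new Node("p"));
        Node root = new Node("html").add(new Node("head")).add(body);
        List<String> seen = new ArrayList<>();
        traverse(null, root, (parent, node) -> {
            seen.add(node.name);
            return !node.name.equals(stopName);
        });
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(visitedUntil("body"));    // [html, head, body]
        System.out.println(visitedUntil("no-such")); // [html, head, body, div, p]
    }
}
```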
    • isForeignMarkup

      public boolean isForeignMarkup()
      Returns:
      the value of the isForeignMarkup flag
    • setForeignMarkup

      public void setForeignMarkup(boolean isForeignMarkup)
      Parameters:
      isForeignMarkup - the new value of the isForeignMarkup flag
    • isTrimAttributeValues

      public boolean isTrimAttributeValues()
      Returns:
      the value of the isTrimAttributeValues flag
    • setTrimAttributeValues

      public void setTrimAttributeValues(boolean isTrimAttributeValues)
      Parameters:
      isTrimAttributeValues - the new value of the isTrimAttributeValues flag