Snippets has posted 5883 posts at DZone.

Match An Html Attribute

07.18.2007
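The goal of this expression is to match the "id" attribute of every "div" tag on a page, whether the value is double-quoted, single-quoted, or unquoted. It is written in free-spacing (verbose) syntax with Python-style named groups.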

<div(?=\s)
(?:
  (?!\sid=|>)
  .
)* # Consume everything until finding " id=" or ">"
   # (">" is just for failing faster)

\sid=(?P<__quote>['"])? # Consume " id=" and save the quote type (' or ") if any

(?: # while
  (?! # the next character is not
    (?(__quote) # if a quote has been opened,
      (?P=__quote) # the closing quote;
      |[\s>] # otherwise, whitespace or ">"
    )
  )
  . # consume this character
)*


(?(__quote) # If a quote was opened,
  (?P=__quote)| # it must be closed;
  (?=[\s>]) # otherwise, the value ends at whitespace or ">"
)

[^>]*> # Consume the rest of the tag
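
As a usage sketch, the expression can be compiled with Python's `re.VERBOSE` flag, which ignores the whitespace and the `#` comments. The original pattern matches the whole tag but does not capture the value, so this version wraps the value loop in an extra named group. That `value` group, the `DIV_ID` constant and the `div_ids` helper are all assumptions added for illustration and are not part of the snippet.

```python
import re

# The pattern from the snippet, compiled in verbose mode.
# Assumption: the "value" group is added here so the id can be extracted.
DIV_ID = re.compile(r"""
<div(?=\s)
(?:
  (?!\sid=|>)
  .
)*                          # skip everything up to " id=" or ">"
\sid=(?P<__quote>['"])?     # " id=" plus an optional opening quote
(?P<value>                  # hypothetical capture group for the id value
  (?:
    (?!
      (?(__quote)
        (?P=__quote)        # stop at the closing quote if one was opened
        |[\s>]              # otherwise stop at whitespace or ">"
      )
    )
    .
  )*
)
(?(__quote)
  (?P=__quote)|             # a quoted value must be closed
  (?=[\s>])                 # an unquoted value ends at whitespace or ">"
)
[^>]*>                      # consume the rest of the tag
""", re.VERBOSE)


def div_ids(html):
    """Return the id value of every <div ...> tag that has one."""
    return [m.group("value") for m in DIV_ID.finditer(html)]


sample = '<div class="a" id="main"><span id="no"></span><div id=side class=b><div class="c">'
print(div_ids(sample))  # ['main', 'side']
```

Because `.` does not match a newline by default, a tag whose attributes span several lines would need `re.DOTALL` as well. A regular expression like this is a quick filter, not a full HTML parser.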