By unchqua
via theserverside.com
Submitted: Jul 16 2008 / 09:08
HtmlCleaner is a Java library used to safely parse and transform any HTML found on the web into well-formed XML. It is designed to be small, fast, flexible and independent. HtmlCleaner may be used in Java code, as a command-line tool or as an Ant task. The result of parsing is a lightweight document object model which can easily be transformed to standards like DOM or JDOM, or serialized to XML output in various ways (compact, pretty-printed and so on).