DZone Snippets is a public source code repository. Easily build up your personal collection of code snippets, categorize them with tags / keywords, and share them with the world

Html Stripping

08.01.2005
Strips HTML tags from a string, with an optional array of tags to preserve.

	def strip_html(str, allow = ['a','img','p','br','i','b','u','ul','li'])
		str = str.strip || ''
		allow_arr = allow.join('|') << '|\/'
		str.gsub(/<(\/|\s)*[^(#{allow_arr})][^>]*>/,'')
	end
    

Comments

Snippets Manager replied on Thu, 2006/05/18 - 11:47pm

joanofarctan's snippet doesn't work. For example, it won't strip some tags, because they contain the letter 'p'. Her [^(#{allow_arr})] is not working the way it was probably intended to: inside square brackets the pattern is read as a set of single characters, not as a list of tag names. Here is a slightly improved version.


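    def self.strip_html(str, allow = ['a','img','p','br','i','b','u','ul','li'])
        # Work on a stripped copy so the caller's string is not mutated.
        str = str.strip
        # \b after every name (including the last) keeps 'li' from matching <link>;
        # the trailing '/' alternative keeps closing tags of allowed elements.
        allow_arr = allow.join('\\b|') << '\\b|/'
        # [^>]* instead of .*? so tags that span several lines are matched too.
        tag_pat = %r,<(?:(?:/?)|(?:\s*))(?!#{allow_arr})[^>]*>,
        str.gsub(tag_pat, ' ')
    end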
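
To make the difference between the two approaches concrete, here is a minimal sketch that runs a character-class pattern and a negative-lookahead pattern over the same input. The constant names (ALLOWED, CLASS_PAT, LOOKAHEAD_PAT) and the sample HTML are made up for illustration, and the lookahead pattern is one possible variant, not the exact pattern from either snippet.

```ruby
# Hypothetical names for illustration; not part of the snippets above.
ALLOWED = %w[a img p br i b u ul li].freeze

# Character-class version: the brackets hold single characters
# ('a', 'i', 'm', 'g', 'p', ...), not whole tag names.
CLASS_PAT = /<(\/|\s)*[^(#{ALLOWED.join('|')}|\/)][^>]*>/

# Lookahead version: a tag is kept only when its whole name,
# followed by a word boundary, is on the allow list.
NAMES = ALLOWED.map { |t| Regexp.escape(t) }.join('|')
LOOKAHEAD_PAT = /<(?!\s*\/?\s*(?:#{NAMES})\b)[^>]*>/i

html = '<pre>code</pre> <p>text</p> <div>box</div>'

class_result     = html.gsub(CLASS_PAT, '')
lookahead_result = html.gsub(LOOKAHEAD_PAT, '')

puts class_result      # <pre> survives because it starts with 'p'
puts lookahead_result  # only <p> survives
```

With the character class, any tag whose first character happens to appear in one of the allowed names is left alone, which is why `<pre>` slips through. The lookahead compares the full tag name instead. Neither pattern is a real HTML parser, so comments, scripts, and attribute values containing `>` are out of scope for this sketch.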

Snippets Manager replied on Mon, 2012/05/07 - 2:13pm

Coming from a PHP background, I've been looking for a way to efficiently do what the PHP strip_tags function does and this looks perfect. Thanks MUCH!