Snippets has posted 5883 posts at DZone.

Text To HTML Converter (PHP 4+)

06.20.2006
A simple PHP function that converts plain text into formatted HTML. It performs some text cleanup (collapsing double spaces) and lets some HTML pass through in the text, such as links (a href), lists (ul, ol), blockquotes and tables. That makes it a good fit for custom-made blogging engines and CMSs. It also includes a case-insensitive search/replace for PHP < 5.

<?php

function stri_replace( $find, $replace, $string ) {
// Case-insensitive str_replace()

  $parts = explode( strtolower($find), strtolower($string) );

  $pos = 0;

  foreach( $parts as $key=>$part ){
    $parts[ $key ] = substr($string, $pos, strlen($part));
    $pos += strlen($part) + strlen($find);
  }

  return( join( $replace, $parts ) );
}


function txt2html($txt) {
// Transforms txt in html

  //Kills double spaces and spaces inside tags.
  while( !( strpos($txt,'  ') === FALSE ) ) $txt = str_replace('  ',' ',$txt);
  $txt = str_replace(' >','>',$txt);
  $txt = str_replace('< ','<',$txt);

  //Transforms accents in html entities.
  $txt = htmlentities($txt);

  //We need some HTML entities back!
  $txt = str_replace('&quot;','"',$txt);
  $txt = str_replace('&lt;','<',$txt);
  $txt = str_replace('&gt;','>',$txt);
  $txt = str_replace('&amp;','&',$txt);

  //Adjusts links - anything starting with HTTP opens in a new window
  $txt = stri_replace("<a href=\"http://","<a target=\"_blank\" href=\"http://",$txt);
  $txt = stri_replace("<a href=http://","<a target=\"_blank\" href=http://",$txt);

  //Basic formatting
  $eol = ( strpos($txt,"\r") === FALSE ) ? "\n" : "\r\n";
  $html = '<p>'.str_replace("$eol$eol","</p><p>",$txt).'</p>';
  $html = str_replace("$eol","<br />\n",$html);
  $html = str_replace("</p>","</p>\n\n",$html);
  $html = str_replace("<p></p>","<p> </p>",$html);

  //Wipes <br> after block tags (for when the user includes some html in the text).
  $wipebr = array("table","tr","td","blockquote","ul","ol","li");

  foreach( $wipebr as $tag ) {
    $html = stri_replace("<$tag><br />","<$tag>",$html);
    $html = stri_replace("</$tag><br />","</$tag>",$html);
  }

  return $html;
}

?>
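
On PHP 5 and later, the built-in str_ireplace() can stand in for the stri_replace() helper above. The sketch below applies it to the link-adjusting step; the sample markup and the variable names are made up for illustration and are not part of the original snippet.

```php
<?php
// Hypothetical input: a link whose tag is written in upper case.
$html = '<A HREF="http://example.com">link</A>';

// str_ireplace() (PHP 5+) matches case-insensitively and inserts the
// replacement text exactly as given.
$out = str_ireplace(
    '<a href="http://',
    '<a target="_blank" href="http://',
    $html
);

echo $out, "\n";
```

The matched part of the tag takes the case of the replacement string, just as it does with stri_replace(); the rest of the markup is left untouched.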
    

Comments

Snippets Manager replied on Fri, 2007/08/03 - 10:56am

When drawing on a UTF-8 encoded resource with multibyte characters, such as the Norwegian "æ", "ø" and "å", I inserted $txt = utf8_decode($txt); just before $txt = htmlentities($txt); to avoid output such as "blÃ¥bærsyltetøy" instead of "blåbærsyltetøy".
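
An alternative to utf8_decode(), sketched here on the assumption that the input really is UTF-8: pass the charset to htmlentities() explicitly. Before PHP 5.4 the function assumed ISO-8859-1 by default, which is what produces the garbled output described above. The sample string and variable names are illustrative only.

```php
<?php
// "blåbærsyltetøy" written as explicit UTF-8 bytes, so the example does
// not depend on how this file itself is encoded.
$txt = "bl\xC3\xA5b\xC3\xA6rsyltet\xC3\xB8y";

// Telling htmlentities() the input charset lets it map each multibyte
// character to its named entity instead of treating the bytes separately.
$encoded = htmlentities($txt, ENT_COMPAT, 'UTF-8');

echo $encoded, "\n";
```

With the charset given, each accented letter becomes a single named entity (&aring;, &aelig;, &oslash;).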

Snippets Manager replied on Sun, 2007/04/15 - 5:25pm

This is useful. Thank you.

Nick Pao replied on Mon, 2006/09/04 - 8:22am

You can replace this part: while( !( strpos($txt,'  ') === FALSE ) ) $txt = str_replace('  ',' ',$txt); with if( !( strpos($txt,'  ') === FALSE ) ) $txt = str_replace('  ',' ',$txt); It is not a big improvement, but since str_replace replaces all the matches, you don't need a while statement.
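
A quick way to check this suggestion, as a sketch with a made-up test string: str_replace() does not rescan the text it has just produced, so a run of three or more spaces still contains a double space after one pass. The loop from the snippet handles that case, and a single preg_replace() call is another option.

```php
<?php
// Hypothetical test string: "a" and "b" separated by five spaces.
$txt = 'a' . str_repeat(' ', 5) . 'b';

// Single pass: each non-overlapping pair of spaces becomes one space,
// so five spaces shrink to three, not to one.
$once = str_replace('  ', ' ', $txt);

// Repeating until no double space is left, as the original while loop does.
$looped = $txt;
while (strpos($looped, '  ') !== false) {
    $looped = str_replace('  ', ' ', $looped);
}

// Regex alternative: collapse any run of two or more spaces in one call.
$regex = preg_replace('/ {2,}/', ' ', $txt);

var_dump($once, $looped, $regex);
```

For text that never has more than two spaces in a row, the if version and the while version give the same result; they differ only on longer runs.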