php - How to remove (most) short words from a string


I'm using the following regex to remove short words (fewer than 4 characters) from a string:

$dirty = "i welcome san diego";
$clean = preg_replace("/\b[^\s]{1,3}\b/", "", $dirty);

So the result is "welcome diego".

However, I need some words to be ignored rather than replaced, for instance:

$ignore = array("san", "you"); 

This would result in "welcome san diego".

I recommend using a callback (preg_replace_callback), which allows a more maintainable solution if you have to scale to a large number of words:

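echo preg_replace_callback(
    '/\b[^\s]{1,3}\b/',
    create_function(
        '$matches',
        // Keep the matched short word if it is on the ignore list; otherwise drop it.
        '$ignore = array("san", "you");
         if (in_array($matches[0], $ignore)) {
             return $matches[0];
         } else {
             return \'\';
         }'
    ),
    "i welcome san diego"
);
// output: welcome san diego (preceded by the space left where "i" was removed)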

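If you're using PHP 5.3 or greater, use an anonymous function rather than calling create_function. Note that create_function was deprecated in PHP 7.2 and removed in PHP 8.0, so on current PHP versions the anonymous function is the only option.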
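
For reference, the anonymous-function variant might look like the sketch below. The function name removeShortWords is hypothetical and not part of the original answer. The array_flip lookup and the final whitespace cleanup are additions made here on the assumption that you want fast lookups and no leftover spaces. The regex is the one from the question.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
function removeShortWords($text, $ignore)
{
    // Flip the list so each check is a key lookup instead of an in_array() scan.
    $keep = array_flip($ignore);

    $result = preg_replace_callback(
        '/\b[^\s]{1,3}\b/',
        function ($matches) use ($keep) {
            // Keep the short word if it is on the ignore list; otherwise drop it.
            return isset($keep[$matches[0]]) ? $matches[0] : '';
        },
        $text
    );

    // Assumption: collapse the whitespace left behind by removed words.
    return trim(preg_replace('/\s+/', ' ', $result));
}

echo removeShortWords("i welcome san diego", array("san", "you"));
// output: welcome san diego
```

Passing the ignore list as a parameter keeps it out of the callback body, so it can be loaded from a file or database as the list grows.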

