1. 用php過濾html部分標簽
$str=preg_replace("/\s+/", " ", $str); //過濾多餘回車
$str=preg_replace("/<[ ]+/si","<",$str); //過濾<__("<"號後面帶空格)
$str=preg_replace("/<\!--.*?-->/si","",$str); //注釋
$str=preg_replace("/<(\!.*?)>/si","",$str); //過濾DOCTYPE
$str=preg_replace("/<(\/?html.*?)>/si","",$str); //過濾html標簽
$str=preg_replace("/<(\/?head.*?)>/si","",$str); //過濾head標簽
$str=preg_replace("/<(\/?meta.*?)>/si","",$str); //過濾meta標簽
$str=preg_replace("/<(\/?body.*?)>/si","",$str); //過濾body標簽
$str=preg_replace("/<(\/?link.*?)>/si","",$str); //過濾link標簽
$str=preg_replace("/<(\/?form.*?)>/si","",$str); //過濾form標簽
$str=preg_replace("/cookie/si","COOKIE",$str); //過濾COOKIE標簽
$str=preg_replace("/<(applet.*?)>(.*?)<(\/applet.*?)>/si","",$str); //過濾applet標簽
$str=preg_replace("/<(\/?applet.*?)>/si","",$str); //過濾applet標簽
$str=preg_replace("/<(style.*?)>(.*?)<(\/style.*?)>/si","",$str); //過濾style標簽
$str=preg_replace("/<(\/?style.*?)>/si","",$str); //過濾style標簽
$str=preg_replace("/<(title.*?)>(.*?)<(\/title.*?)>/si","",$str); //過濾title標簽
$str=preg_replace("/<(\/?title.*?)>/si","",$str); //過濾title標簽
$str=preg_replace("/<(object.*?)>(.*?)<(\/object.*?)>/si","",$str); //過濾object標簽
$str=preg_replace("/<(\/?objec.*?)>/si","",$str); //過濾object標簽
$str=preg_replace("/<(noframes.*?)>(.*?)<(\/noframes.*?)>/si","",$str); //過濾noframes標簽
$str=preg_replace("/<(\/?noframes.*?)>/si","",$str); //過濾noframes標簽
$str=preg_replace("/<(i?frame.*?)>(.*?)<(\/i?frame.*?)>/si","",$str); //過濾frame標簽
$str=preg_replace("/<(\/?i?frame.*?)>/si","",$str); //過濾frame標簽
$str=preg_replace("/<(script.*?)>(.*?)<(\/script.*?)>/si","",$str); //過濾script標簽
$str=preg_replace("/<(\/?script.*?)>/si","",$str); //過濾script標簽
$str=preg_replace("/javascript/si","Javascript",$str); //過濾script標簽
$str=preg_replace("/vbscript/si","Vbscript",$str); //過濾script標簽
$str=preg_replace("/on([a-z]+)\s*=/si","On\\1=",$str); //過濾script標簽
$str=preg_replace("//si","&#",$str); //過濾script標簽,如javAsCript:alert(
清除空格,換行
function DeleteHtml($str)
{
$str = trim($str);
$str = strip_tags($str,"");
$str = ereg_replace("\t","",$str);
$str = ereg_replace("\r\n","",$str);
$str = ereg_replace("\r","",$str);
$str = ereg_replace("\n","",$str);
$str = ereg_replace(" "," ",$str);
return trim($str);
}
過濾HTML屬性
1,過濾所有html標簽的正則表達式:
復制代碼 代碼如下:
</?[^>]+>
//過濾所有html標簽的屬性的正則表達式:
$html = preg_replace("/<([a-zA-Z]+)[^>]*>/","<\\1>",$html);
3,過濾部分html標簽的正則表達式的排除式(比如排除<p>,即不過濾<p>):
復制代碼 代碼如下:
</?[^pP/>]+>
4,過濾部分html標簽的正則表達式的枚舉式(比如需要過濾<a><p><b>等):
復制代碼 代碼如下:
</?[aApPbB][^>]*>
5,過濾部分html標簽的屬性的正則表達式的排除式(比如排除alt屬性,即不過濾alt屬性):
復制代碼 代碼如下:
\s(?!alt)[a-zA-Z]+=[^\s]*
6,過濾部分html標簽的屬性的正則表達式的枚舉式(比如alt屬性):
復制代碼 代碼如下:
(\s)alt=[^\s]*