Scraping web pages with PHP regular expressions: extracting specific content from a page

1. How do I scrape web page data with PHP regular expressions?

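If you know regular expressions, you already know how to scrape.
If you don't, they can't be taught in a few minutes.
That said, I recommend the phpQuery library, which lets you pull data out of a page with jQuery-style selectors.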
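
phpQuery is a third-party library, so the sketch below swaps in PHP's bundled DOM extension (DOMDocument and DOMXPath) to illustrate the same selector-style idea without regular expressions. It is not phpQuery's API. The helper name extract_cell_texts, the sample markup, and the class name are all made up for illustration, and the XPath expression is only a rough stand-in for a CSS selector such as td.tableHeader.

```php
<?php
// Hypothetical helper: return the trimmed text of every <td> carrying the given class.
// $class is assumed to be a plain class name; it is not escaped for XPath.
function extract_cell_texts(string $html, string $class): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration prefix hints to libxml that the input is UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Rough XPath equivalent of the CSS selector "td.<class>".
    $query = sprintf(
        '//td[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Made-up sample markup, not taken from any real site.
$sampleHtml = '<table><tr><td class="tableHeader">Crude oil</td>'
            . '<td class="price">68.11</td></tr></table>';
print_r(extract_cell_texts($sampleHtml, 'tableHeader'));
```

A parser-based approach like this tends to survive small markup changes (attribute order, extra whitespace) that would break a hand-written pattern.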

2. Extracting page content with a PHP regular expression

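<?php
$theurl = "http://www.kitco.cn/cn/";
$contents = file_get_contents($theurl);
if ($contents === false)
{
echo 'Could not open URL';
exit;
}

/*
Do not strip the tags before matching; the pattern below needs them as anchors.
$contents = preg_replace('/<.+?>/', '', $contents);
*/

// "原油价格" ("crude oil price") is the row label on the target page.
// u = treat pattern and subject as UTF-8, s = let . match newlines too.
if (preg_match("/<td class=\"tableHeader\" align=\"left\">原油价格(.*?)<\/tr>/us", $contents, $matches))
{
print "A match was found:" . strip_tags($matches[0]);
} else {
print "A match was not found.<br />";
}
?>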

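Try it like this.
------------------------------------
The code above already comments out that line of yours. First match the one unique fragment that holds what you want, and only after extracting it strip the tags. Run it and see.
Output:
A match was found:原油价格 68.11 +0.95
That should be the result you were after, right?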
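
The order of operations described above (match while the tags are still present, strip them afterwards) can be tried offline with a small sketch. The sample table, its English labels, and the numbers for the first row are invented stand-ins for the fetched page; the real markup may differ.

```php
<?php
// Made-up stand-in for the fetched page.
$contents = '<table>'
    . '<tr><td class="tableHeader" align="left">Gold</td><td>950.20</td><td>-1.30</td></tr>'
    . '<tr><td class="tableHeader" align="left">Crude oil</td><td>68.11</td><td>+0.95</td></tr>'
    . '</table>';

// Step 1: match the one row we care about while the tags are still there to anchor on.
$pattern = '/<td class="tableHeader" align="left">Crude oil(.*?)<\/tr>/us';

$text = null;
if (preg_match($pattern, $contents, $matches)) {
    // Step 2: only now drop the tags. Put a space in front of each tag first so
    // neighbouring cells do not run together, then collapse the whitespace.
    $spaced = str_replace('<', ' <', $matches[0]);
    $text = trim(preg_replace('/\s+/', ' ', strip_tags($spaced)));
    echo $text, PHP_EOL;
} else {
    echo 'No match', PHP_EOL;
}
```

Running the same pattern against strip_tags($contents) finds nothing, because the td and tr tags it anchors on are already gone.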

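3. How to extract web page content with PHP regular expressions

If you want all the source between <div class="nav" monkey="nav"> and <div class="head-ad">, preg_match is enough; there is no need for preg_match_all. If you want the contents of every <li></li> tag inside that block, use preg_match_all.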

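// Extract everything between the two tags
$pattern = '/<div class="nav" monkey="nav">(.+?)<div class="head-ad">/is';
if (!preg_match($pattern, $string, $match)) {
    exit('Navigation block not found');
}
// $match[0] is the whole match, including both <div> opening tags;
// $match[1] is only the source between them
echo $match[0];

// Then extract each <li>...</li> element
// \b[^>]* keeps the pattern from also matching tags such as <link>
$pattern = '/<li\b[^>]*>(.+?)<\/li>/is';

preg_match_all($pattern, $match[0], $results);
// $results[0] holds the full <li> elements, $results[1] only their contents
$new_arr = array_unique($results[0]);

foreach ($new_arr as $kkk) {
    echo $kkk;
}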
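
To see what the two calls return, here is a minimal self-contained sketch of the same two-step extraction. The contents of $string are invented sample markup, and taking the plain text of each item with strip_tags is an added assumption about what "the contents" should mean.

```php
<?php
// Made-up sample markup standing in for $string.
$string = '<div class="nav" monkey="nav"><ul>'
    . '<li class="on"><a href="/">Home</a></li>'
    . '<li><a href="/news">News</a></li>'
    . '<li><a href="/news">News</a></li>'
    . '</ul></div><div class="head-ad">ad</div>';

$items = [];
$outer = '/<div class="nav" monkey="nav">(.+?)<div class="head-ad">/is';
if (preg_match($outer, $string, $match)) {
    // $match[1] is the capture group: the markup between the two opening tags.
    preg_match_all('/<li\b[^>]*>(.+?)<\/li>/is', $match[1], $results);
    // $results[0] holds whole <li>...</li> elements, $results[1] only their contents.
    $texts = array_map('strip_tags', $results[1]);
    // array_unique keeps the original keys, so reindex afterwards.
    $items = array_values(array_unique($texts));
}
print_r($items);
```

preg_match stops at the first match, which is all the outer block needs; preg_match_all is only required for the repeated li elements.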

4. Extracting specific content from a web page with PHP

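<?php
/*
 * A rather crude approach:
 * fetch the page, pull out the part you want with a PHP regular expression,
 * and have JavaScript reload the current page every 5 minutes so the content is fetched again.
 *
 * Note: change <title></title> in $mode to whatever you need
 * (for example, $mode = "#<a(.*?)</a>#is"; captures every link).
 *
 * In window.location.href="http://localhost//refesh.php"; replace
 * http://localhost//refesh.php with the URL of this script, so that the page reloads itself.
 *
 * setInterval(ref, 300000); calls ref() every 300000 ms (5 * 60 * 1000 ms, i.e. 5 minutes).
 *
 * print_r($arr); dumps everything that matched. $arr is an array, so you can print just
 * part of it (for example, echo $arr[1][0];).
 * To output the whole page instead, remove
 * $mode = "#<title>(.*?)</title>#is";
 if (preg_match_all($mode, $content, $arr)) {
 print_r($arr);
 echo "<br/>";
 echo $arr[1][0];
 }
 and add echo $content;
 */
$url = "http://www..com"; // target site

// Fetch the page once and stop if the request fails.
$content = @file_get_contents($url);
if ($content === false) {
    die("Request failed or timed out");
}

// Non-greedy, case-insensitive, and . also matches newlines.
$mode = "#<title>(.*?)</title>#is";
if (preg_match_all($mode, $content, $arr)) {
    //print_r($arr);
    echo "<br/>";
    echo $arr[1][0];
}
?>
<script type="text/javascript">
<!--
function ref(){
window.location.href="http://localhost//refesh.php";
}
setInterval(ref, 300000);
//-->
</script>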
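
The title lookup can be wrapped in a small function so it can be checked against sample input without fetching anything. This is only a sketch: the function name extract_title and the sample HTML are hypothetical, and decoding entities and collapsing whitespace are extra steps that the script above does not perform.

```php
<?php
// Hypothetical helper: return the text of the first <title> element, or null if none is found.
function extract_title(string $html): ?string
{
    // \b[^>]* tolerates attributes on the tag; (.*?) is non-greedy so the match
    // stops at the first closing tag; i = case-insensitive, s = dot matches newlines.
    if (!preg_match('#<title\b[^>]*>(.*?)</title>#is', $html, $m)) {
        return null;
    }
    // Decode entities such as &amp; and collapse runs of whitespace.
    $title = html_entity_decode($m[1], ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return trim(preg_replace('/\s+/', ' ', $title));
}

// Made-up sample input with an upper-case tag and a second, unrelated <title>.
$html = "<html><head><TITLE>\n  Prices &amp; charts\n</TITLE></head>"
      . "<body><svg><title>icon</title></svg></body></html>";
var_dump(extract_title($html));
```

With a greedy (.*) the capture would run on to the last closing title tag in the page, which is why the non-greedy form matters once a document contains more than one.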

5. How can PHP fetch data from other websites?

You can fetch a site's data in any of the following four ways:

1. Use file_get_contents to fetch the content with a GET request:

$url = 'http://localhost/test2.php';
$html = file_get_contents($url);
echo $html;

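2. Open the URL with fopen and read the content with a GET request:

$url = 'http://localhost/test2.php';
$fp = fopen($url, 'r');
if ($fp === false) {
    exit('Could not open URL');
}
// Response headers and stream state are available here if you need them
$meta = stream_get_meta_data($fp);
$result = '';
while (!feof($fp))
{
$result .= fgets($fp, 1024);
}
echo "url body: $result";
fclose($fp);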

3. Use file_get_contents to send a POST request to the URL:

$data = array(
'foo'=>'bar',
'baz'=>'boom',
'site'=>'www.jb51.net',
'name'=>'nowa magic');

// Encode the fields as an application/x-www-form-urlencoded body
$postdata = http_build_query($data);

$options = array(
'http' => array(
'method' => 'POST',
'header' => "Content-Type: application/x-www-form-urlencoded\r\n",
'content' => $postdata,
//'timeout' => 60 * 60 // timeout in seconds
)
);

$url = "http://localhost/test2.php";
$context = stream_context_create($options);
$result = file_get_contents($url, false, $context);

echo $result;

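4. Use the curl library. Before using it, you may need to check that the curl extension is enabled in php.ini.

$url = 'http://localhost/test2.php?site=jb51.net';
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // seconds allowed for connecting
$file_contents = curl_exec($ch);
if ($file_contents === false) {
    // curl_exec returns false on failure; curl_error explains why
    $file_contents = 'curl error: ' . curl_error($ch);
}
curl_close($ch);
echo $file_contents;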
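
For the stream-based methods (1 to 3), a timeout and the POST body both travel through a stream context. The sketch below builds such a context and prints its options without making a request, so it runs without a server. The helper name build_http_context and its default values are assumptions made for illustration.

```php
<?php
// Hypothetical helper: build a stream context usable with file_get_contents() or fopen().
function build_http_context(string $method = 'GET', array $fields = [], float $timeout = 10.0)
{
    $http = [
        'method'        => $method,
        'timeout'       => $timeout, // read timeout in seconds
        'ignore_errors' => true,     // still return the body on 4xx/5xx responses
    ];
    if ($method === 'POST') {
        $http['header']  = "Content-Type: application/x-www-form-urlencoded\r\n";
        $http['content'] = http_build_query($fields);
    }
    return stream_context_create(['http' => $http]);
}

$context = build_http_context('POST', ['foo' => 'bar', 'baz' => 'boom'], 5.0);
// Inspect what would be sent, without contacting any server.
$options = stream_context_get_options($context);
print_r($options);

// Usage (needs a reachable server, so it is left commented out):
// $result = @file_get_contents('http://localhost/test2.php', false, $context);
// if ($result === false) { /* network error or timeout */ }
```

Without an explicit timeout, the stream wrappers fall back to the default_socket_timeout setting in php.ini, which may be longer than a scraper wants to wait.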
