PHP program for analyzing search engine crawler records in IIS logs (page 1 of 2)

Posted on 2018-2-14 09:23:00

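Usage notes:
  Edit the absolute path of the IIS logs in iis.php.
  For example: $folder="c:/windows/system32/logfiles/<site log directory>/"; // the trailing slash (/) is required.
  ( On shared hosting and not sure of your site's absolute path? Upload a PHP probe script to find it.
  Direct viewing: http://<site domain>/iis.php
  Local viewing: download the logs to your own machine, then open http://127.0.0.1/iis.php )
  Notes:
  // The site user must have read permission on the site log directory.
  // If you download the logs to your own machine, change the URL on line 143 to your own site's URL. This step is optional and does not affect the analysis results.
  // If you rename iis.php, you must also update the code: use Ctrl+H to replace every occurrence of iis.php with the new file name, otherwise the program will fail.
  // If an IIS log file is too large, the program may time out, so using it on very large logs is not recommended.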
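The path requirements in the notes above (trailing slash, read permission) can be checked before the analyzer runs. The following is only a minimal sketch and is not part of the original iis.php: the helper name `check_log_folder` is hypothetical, and the path passed to it is a placeholder that you would replace with your own site log directory.

```php
<?php
// Hypothetical helper, not part of the original iis.php.
// Normalizes the log folder path and reports why it cannot be used.
function check_log_folder($folder)
{
    // Accept Windows-style backslashes and guarantee exactly one trailing slash.
    $folder = rtrim(str_replace('\\', '/', $folder), '/') . '/';

    if (!is_dir($folder)) {
        return array($folder, 'not a directory');
    }
    if (!is_readable($folder)) {
        return array($folder, 'no read permission');
    }
    return array($folder, null); // null means the folder is usable
}

// Placeholder path: replace it with your own site log directory.
list($folder, $error) = check_log_folder('c:/windows/system32/logfiles/your-site-log-directory');
if ($error !== null) {
    echo "Cannot use $folder: $error\n";
}
```

The returned `$folder` always ends with a slash, so it can be concatenated with a log file name directly, as the analyzer does.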
The PHP source code follows:
[code]
=0;$i--)
      {
         $indexstr.="[tr][td]".date("Y-m-d",filectime($folder.$arr_file[$i]))."[/td]
      [td][url=]Baidu[/url][/td]
      [td][url=]Google[/url][/td]
      [td][url=]Yahoo[/url][/td][/tr]";
      }
}
closedir($fp);
$html = indexhtml();
$copy = mycopy();
$html = str_replace("[showlog]",$indexstr,$html);
$html = str_replace("[copy]",$copy,$html);
echo $html;
}else{
      echo "The log directory does not exist or access is denied; please check the settings!";
      exit();
}
}elseif (in_array($type, array('Baiduspider','Googlebot','yahoo'))){
   // All three crawlers share the same report routine.
   echo show($type,$folder,$showfile,$page,$pagesize);
}
function show($type,$folder,$showfile,$page,$pagesize)
{
// Map the crawler's user-agent keyword to a display name.
if ($type=='Baiduspider')
{
      $title='Baidu';
}elseif ($type=='Googlebot'){
      $title='Google';
}elseif ($type=='yahoo'){
      $title='Yahoo';
}
if ($type&&$folder&&$showfile)
{
      if(file_exists($folder.$showfile))
      {
      $fp= fopen($folder.$showfile,"r");
      }else{
         echo "The log file does not exist; please check the settings!";
         exit;
      }
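      $j=0; // total crawler requests found
      $y=0; // requests answered with 404
      $t=0; // requests answered with 200
      $h=0; // requests answered with 304
      $temp=array(); // matching log lines; initialized so count() is safe when nothing matches
      while (($str = fgets($fp)) !== false)
      {
         $str =iconv("UTF-8","GB2312//IGNORE",$str);
         // strpos() returns 0 for a match at offset 0, so compare against false explicitly.
         if(strpos($str,$type) !== false)
         {
            $j++;
            $temp[] = $str;
            $tmpcount = explode(" ",$str);
            // Field 11 is treated as the HTTP status code; the index depends on which fields the IIS log records.
            if (isset($tmpcount[11]))
            {
               if ($tmpcount[11]==200)$t++;
               if ($tmpcount[11]==304)$h++;
               if ($tmpcount[11]==404)$y++;
            }
         }
      }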
      fclose($fp);
      $count = count($temp);
      if ($page==1)
      {
         $countshow=$count;
         $mynum = $count-$pagesize;
      }else{
         $countshow =$count-($page*$pagesize-$pagesize);
         $mynum = $count-$page*$pagesize;
      }
      $pagecount =ceil(count($temp) / $pagesize);
      if ($page>=$pagecount)
      {
         // Last page: list everything down to the oldest entry (index 0).
         $mynum = 0;
      }
      $m=0;
      $show=""; // table rows for this page; initialized to avoid an undefined-variable warning
      for ($i=$countshow-1;$i>=$mynum;$i--)
      {
         $num = explode(" ",$temp[$i]);
            $show.="
                  [tr]
                  [td]".$num[0]." ".$num[1]."[/td]
                  [td]".$num[9]."[/td]
                  [td][url=]".$num[5]."[/url][/td]
                  [td]".$num[11]."[/td]
                  [/tr]";
      }
      unset($temp);
      $showpage = "[td]".$pagesize." per page, page ".$page."/$pagecount";
      $showpage.="  [url=]First[/url]";
      if ($page!=1)
      {
         $showpage.="  [url=]Previous[/url]";
      }
      $weei = ""; // "last page" link; stays empty on the final page
      if ($page!=$pagecount)
      {
      $showpage.="  [url=]Next[/url]";
      $weei = "  [url=]Last[/url]";
      }
      $showpage.=$weei."[/td]";
      if ($show)
      {
      $html = pagehtml();
      $copy = mycopy();
      $htmltitle = "牛仔IIS日志蜘蛛爬行记录分析器-";// Please keep this credit, thanks!
      $html = str_replace("[title]",$title,$html);
      $html = str_replace("[htmltitle]",$htmltitle,$html);
      $html = str_replace("[show]",$show,$html);
      $html = str_replace("[count]",$j,$html);
      $html = str_replace("
[/code]

(The source code continues on page 2.)
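The listing reads the HTTP status from a fixed position (field 11) in each log line. W3C extended logs, which IIS writes, start with a `#Fields:` directive naming the recorded columns, so the position can also be looked up at run time. The sketch below illustrates that idea under stated assumptions: the function name `count_crawler_statuses` and the sample lines are made up for illustration and do not come from the original program.

```php
<?php
// Sketch only: count crawler requests per HTTP status in a W3C extended log,
// locating sc-status from the "#Fields:" directive instead of a fixed index.
function count_crawler_statuses($lines, $crawler)
{
    $statusIndex = null; // column of sc-status, unknown until a header is seen
    $counts = array();

    foreach ($lines as $line) {
        $line = rtrim($line, "\r\n");
        if (strpos($line, '#Fields:') === 0) {
            // Field names follow the directive, separated by single spaces.
            $fields = explode(' ', trim(substr($line, strlen('#Fields:'))));
            $pos = array_search('sc-status', $fields, true);
            $statusIndex = ($pos === false) ? null : $pos;
            continue;
        }
        if ($line === '' || $line[0] === '#' || $statusIndex === null) {
            continue; // other directives, blank lines, or no usable header yet
        }
        if (strpos($line, $crawler) === false) {
            continue; // request was not made by the crawler we are counting
        }
        $cols = explode(' ', $line);
        if (!isset($cols[$statusIndex])) {
            continue; // malformed or truncated line
        }
        $status = $cols[$statusIndex];
        $counts[$status] = isset($counts[$status]) ? $counts[$status] + 1 : 1;
    }
    return $counts;
}

// Made-up sample data in W3C extended format, for illustration only.
$sample = array(
    '#Fields: date time cs-uri-stem c-ip cs(User-Agent) sc-status',
    '2018-02-14 01:00:00 /index.html 192.0.2.1 Baiduspider 200',
    '2018-02-14 01:00:05 /old.html 192.0.2.1 Baiduspider 404',
    '2018-02-14 01:00:09 /index.html 192.0.2.2 Googlebot 200',
    '2018-02-14 01:00:12 /news.html 192.0.2.1 Baiduspider 200',
);
print_r(count_crawler_statuses($sample, 'Baiduspider'));
```

Because the function only iterates with `foreach`, it can also be given a `new SplFileObject($path)`, which yields the file one line at a time and avoids loading a large log into memory.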
