WordPress: Automatically Analyzing Search Engine Spider Crawl Logs | 祭夜博客

Other · msojocs · Source: "Plugin-free spider crawl log analysis for WordPress (tested and working!)" · 2017-10-03

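Changelog

Original author of this file: 千丝海阁
Modified by: 方法 @ http://seofangfa.com
Modified: 2015.7.8
Modified: 2017.1.19
Modified by: 祭夜 @ https://www.jysafe.cn
Added detection of the 360 spider and Baidu YunGuanCe
Modified: 2017.10.03
Modified by: 祭夜 @ https://www.jysafe.cn
Added support for PHP 7.0 and pjax (requires jQuery)
Users of the Git theme, take note!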

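Save the code below as a .php file and place it in the plugins directory on your server. A plugin named “自动分析蜘蛛” (automatic spider analysis) then appears on the Plugins page of the admin dashboard.
Once activated, it can be called through the shortcode [spiderlogs]. (Test it yourself on other themes.)
General method:

With minor adjustments, the code can also be placed in the theme's functions.php and called the same way.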
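
In a post or page you simply type [spiderlogs] into the editor. To print the report from a theme template instead, something like the sketch below should work. It assumes WordPress core's do_shortcode() is available; render_spider_report() is a hypothetical helper name, not part of the plugin, and the text="no" attribute relies on the 'text' option that get_spider_log() reads in the code below.

```php
<?php
// Sketch only: render the spider report from a theme template.
// render_spider_report() is a hypothetical helper, not part of the plugin.
function render_spider_report($withText = true) {
    // text="no" skips the per-spider detail lines and keeps the summary and chart
    $shortcode = $withText ? '[spiderlogs]' : '[spiderlogs text="no"]';
    if (function_exists('do_shortcode')) {
        // Inside WordPress: expand the shortcode into HTML
        return do_shortcode($shortcode);
    }
    // Outside WordPress (for example when testing this file alone): return the raw tag
    return $shortcode;
}

echo render_spider_report();
```

The function_exists() guard is only there so the snippet can be run on its own; inside a theme it always takes the do_shortcode() branch.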

Code
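<?php
/*
    Plugin Name: 自动分析蜘蛛
    Description: Automatically analyzes spider visits. Usage: [#spiderlogs#] (remove the # characters when calling)
    Author: 祭夜
    Purpose: automatically analyze search engine spider crawl logs in WordPress
    Usage: http://seofangfa.com/wordpress-study/wordpress-spider.html
    Original author of this file: 千丝海阁
    Modified by: 方法 @ http://seofangfa.com
    Modified: 2015.7.8
    Modified: 2017.1.19
    Modified by: 祭夜 @ https://www.jysafe.cn
    Added detection of the 360 spider and Baidu YunGuanCe
    Modified: 2017.10.03
    Modified by: 祭夜 @ https://www.jysafe.cn
    Added PHP 7.0 support
    */
?>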
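<?php
// Automatic spider analysis
make_log_file();
function make_log_file(){
	// Nothing to log outside an HTTP request (for example under WP-CLI)
	if (!isset($_SERVER["REQUEST_URI"])) return;
	// Log file, kept in the WordPress root directory
	$filename=ABSPATH.'mylogs.txt';
	$uri=$_SERVER["REQUEST_URI"];
	// Skip rc-ajax comment requests and wp-cron requests
	if (strpos($uri,"rc-ajax")===false && strpos($uri,"wp-cron.php")===false){
		// Timestamp as mdHis, shifted 8 hours to UTC+8 (assumes PHP's default timezone is UTC)
		$word=date('mdHis',$_SERVER['REQUEST_TIME']+3600*8)." ";
		// Requested page
		$word.=$uri." ";
		// Protocol
		$word.=$_SERVER['SERVER_PROTOCOL']." ";
		// Method, POST or GET
		$word.=$_SERVER['REQUEST_METHOD']." ";
		//$word .= $_SERVER['HTTP_ACCEPT'] . " ";
		// Browser information
		$word.=getbrowser()." ";
		// Query parameters
		$word.="[".(isset($_SERVER['QUERY_STRING'])?$_SERVER['QUERY_STRING']:'')."] ";
		// Referrer (not sent on direct visits)
		$word.=(isset($_SERVER['HTTP_REFERER'])?$_SERVER['HTTP_REFERER']:'')." ";
		// Client IP
		$word.=getIP()." ";
		$word.="\n";
		// Append under an exclusive lock so concurrent requests do not interleave lines
		file_put_contents($filename,$word,FILE_APPEND|LOCK_EX);
	}
}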
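// Get the client IP address (stock code found online).
// HTTP_CLIENT_IP and HTTP_X_FORWARDED_FOR come from request headers that a client can forge,
// so the logged value is only trustworthy when it falls back to REMOTE_ADDR.
function getIP(){
	if (getenv('HTTP_CLIENT_IP')){
		$ip=getenv('HTTP_CLIENT_IP');
	}elseif (getenv('HTTP_X_FORWARDED_FOR')){
		$ip=getenv('HTTP_X_FORWARDED_FOR');
	}elseif (getenv('REMOTE_ADDR')){
		$ip=getenv('REMOTE_ADDR');
	}else {
		$ip=isset($_SERVER['REMOTE_ADDR'])?$_SERVER['REMOTE_ADDR']:'';
	}
	return $ip;
}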
// Get browser information; mobile and tablet detection has not been added yet.
// Note: a crawler whose User-Agent also contains one of the browser tokens below
// is logged under the browser name, so its spider name never reaches the log.
function getbrowser(){
	$Agent=isset($_SERVER['HTTP_USER_AGENT'])?$_SERVER['HTTP_USER_AGENT']:'';
	$browser='';
	$browserver='';
	// Browser name => pattern capturing its version; a later match overrides an earlier one
	$patterns=array(
		'Chrome'=>'/Chrome\/(\S+)/',
		'Firefox'=>'/Firefox\/(\S+)/',
		'Opera'=>'/Opera[\/ ](\S+)/',
		'Internet Explorer'=>'/MSIE ([^;\s]+)/',
	);
	if (strpos($Agent,'Mozilla')!==false){
		foreach ($patterns as $name=>$pattern){
			if (preg_match($pattern,$Agent,$m)){
				$browser=$name;
				$browserver=$m[1];
			}
		}
	}
	// Unrecognized agents, which includes most spiders, are logged as the raw User-Agent string
	if ($browser!=''){
		$browseinfo=$browser.' '.$browserver;
	}else {
		$browseinfo=$Agent;
	}
	return $browseinfo;
}
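function get_spider_log($atts){
	// text="yes" (the default) prints the per-spider details; any other value leaves only the draw_canvas() output
	$atts=shortcode_atts(array('text'=>'yes'),$atts);
	$text=$atts['text'];
	// Read the log from the WordPress root directory; a missing or unreadable file counts as empty
	$logfile=ABSPATH.'mylogs.txt';
	$contents=is_readable($logfile)?file_get_contents($logfile):'';
	if ($contents===false)$contents='';
	$str="";
	// Same UTC+8 shift as the log timestamps, so that "today" matches the logged dates
	$showtime=date("md",time()+3600*8);
	// Baidu-YunGuanCe-SLABot
	$bdyjs=0;
	if ($text=="yes")$str.="<br><a href=http://ce.baidu.com/ target=_blank>Baidu-YunGuanCe-SLABot</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"Baidu-YunGuanCe-SLABot",$text);
	$bdyjs+=$mytmp[0];
	$str.=$mytmp[1];
	if ($text=="yes"){
		$str.="<br>Today's spider crawl records:";
		$str.="<div style='background-color:#33A1C9;color:white;text-align:center;'>The following are spiders commonly seen in China.</div>";
	}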
	$mytmp=array ();
	//google
	$google=0;
	if ($text=="yes")$str.="<a href=http://www.google.com/bot.html target=_blank>Google Spider</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"Googlebot\/",$text);
	$google+=$mytmp[0];
	$str.=$mytmp[1];
	$mytmp=show_spider_result ($showtime,$contents,"Googlebot-Image\/",$text);
	$google+=$mytmp[0];
	$str.=$mytmp[1];
	$mytmp=show_spider_result ($showtime,$contents,"Googlebot-Mobile\/",$text);
	$google+=$mytmp[0];
	$str.=$mytmp[1];
	$mytmp=show_spider_result ($showtime,$contents,"Feedfetcher-Google",$text);
	$google+=$mytmp[0];
	$str.=$mytmp[1];
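	// 360 spider
	// Note: "360Spider" also matches the "360Spider-Image" and "360Spider-Video" agents, and the
	// IP-prefix checks can match the same request as a User-Agent check, so one visit may be counted more than once.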
	$sanll=0;
	if ($text=="yes")$str.="<br><a href=http://www.so.com/help/spider_ip.html target=_blank>360 Spider</a>: ";
	$mytmp=ip_spider_result ($showtime,$contents,"180.153",$text);
	$sanll+=$mytmp[0];
	$str.=$mytmp[1];
	$mytmp=ip_spider_result ($showtime,$contents,"42.236",$text);
	$sanll+=$mytmp[0];
	$str.=$mytmp[1];
	$mytmp=show_spider_result ($showtime,$contents,"HaoSouSpider",$text);
	$sanll+=$mytmp[0];
	$str.=$mytmp[1];
	$mytmp=show_spider_result ($showtime,$contents,"360Spider",$text);
	$sanll+=$mytmp[0];
	$str.=$mytmp[1];
	$mytmp=show_spider_result ($showtime,$contents,"360Spider-Image",$text);
	$sanll+=$mytmp[0];
	$str.=$mytmp[1];
	$mytmp=show_spider_result ($showtime,$contents,"360Spider-Video",$text);
	$sanll+=$mytmp[0];
	$str.=$mytmp[1];
	// baidu
	$baidu=0;
	if ($text=="yes")$str.="<br><a href=http://www.baidu.com/search/spider.html target=_blank>Baidu Spider</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"Baiduspider\/",$text);
	$baidu+=$mytmp[0];
	$str.=$mytmp[1];
	//bing
	$bing=0;
	if ($text=="yes")$str.="<br><a href=http://www.bing.com/bingbot.htm target=_blank>bingbot Spider</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"bingbot\/",$text);
	$bing+=$mytmp[0];
	$str.=$mytmp[1];
	$mytmp=show_spider_result ($showtime,$contents,"msnbot-media\/",$text);
	$bing+=$mytmp[0];
	$str.=$mytmp[1];
	//sogou
	$sogou=0;
	if ($text=="yes")$str.="<br><a href=http://www.sogou.com/docs/help/webmasters.htm#07 target=_blank>Sogou Spider</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"Sogou web spider\/",$text);
	$sogou+=$mytmp[0];
	$str.=$mytmp[1];
	//soso
	$soso=0;
	if ($text=="yes")$str.="<br><a href=http://help.soso.com/webspider.htm target=_blank>Soso Spider</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"Sosospider\/",$text);
	$soso+=$mytmp[0];
	$str.=$mytmp[1];
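	if ($text=="yes")$str.="<div style='background-color:#FA8072;color:white;text-align:center;'>The following are junk spiders; their crawling can be blocked.</div>";
	// Running total for all remaining spiders
	$else=0;
	//jike (disabled)
	/*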
	     if($text == "yes")
	     $str.= "<a href=http://shoulu.jike.com/spider.html target=_blank>Jike Spider</a>: ";
	     $mytmp = show_spider_result($showtime,$contents,"JikeSpider",$text);
	     $else += $mytmp[0];
	     $str.= $mytmp[1];*/
	//easou
	if ($text=="yes")$str.="<br><a href=http://www.easou.com/search/spider.html target=_blank>Easou Spider</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"EasouSpider",$text);
	$else+=$mytmp[0];
	$str.=$mytmp[1];
	//yisou
	if ($text=="yes")$str.="<br>YisouSpider:";
	$mytmp=show_spider_result ($showtime,$contents,"YisouSpider",$text);
	$else+=$mytmp[0];
	$str.=$mytmp[1];
	if ($text=="yes")$str.="<br><a href=http://yandex.com/bots target=_blank>YandexBot Spider</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"YandexBot\/",$text);
	$else+=$mytmp[0];
	$str.=$mytmp[1];
	if ($text=="yes")$str.="<br><a href=http://go.mail.ru/help/robots target=_blank>Mail.RU Spider</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"Mail.RU_Bot\/",$text);
	$else+=$mytmp[0];
	$str.=$mytmp[1];
	if ($text=="yes")$str.="<br><a href=http://www.acoon.de/robot.asp target=_blank>AcoonBot Spider</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"AcoonBot\/",$text);
	$else+=$mytmp[0];
	$str.=$mytmp[1];
	if ($text=="yes")$str.="<br><a href=http://www.exabot.com/go/robot target=_blank>Exabot Spider</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"Exabot\/",$text);
	$else+=$mytmp[0];
	$str.=$mytmp[1];
	if ($text=="yes")$str.="<br><a href=http://www.seoprofiler.com/bot target=_blank>spbot Spider</a>: ";
	$mytmp=show_spider_result ($showtime,$contents,"spbot\/",$text);
	$else+=$mytmp[0];
	$str.=$mytmp[1];
	$str.=draw_canvas ($google,$sanll,$baidu,$bing,$sogou,$soso,$else);
	return $str;
}
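// Count today's log lines that contain an IP starting with the prefix $str (for example "180.153")
function ip_spider_result($time,$contents,$str,$text){
	$count=array(0,'');
	// ^ with the m flag pins the date to the start of a line; the prefix must begin a field and be followed by two more octets
	$count[0]=(int)preg_match_all('/^'.$time.'\d*\s\/\S*\s.*\s'.preg_quote($str,'/').'\.\d+\.\d+/m',$contents,$mymatches);
	if ($text=="yes"){
		$count[1].="<br> Spider type=>official IP ($str.xxx.xxx): crawl count=".$count[0];
		if ($count[0]>0){
			// Characters 4-9 of the last matching line hold the time as HHMMSS
			$tmp=substr($mymatches[0][$count[0]-1],4,6);
			$tmp=substr($tmp,0,2).":".substr($tmp,2,2).":".substr($tmp,4,2);
			$count[1].=" Last crawl time: ".$tmp;
		}
	}
	return $count;
}
// Count today's log lines whose logged User-Agent matches the regex fragment $str
function show_spider_result($time,$contents,$str,$text){
	$count=array(0,'');
	// ^ with the m flag pins the date to the start of a line, so digits inside other timestamps cannot match
	$count[0]=(int)preg_match_all('/^'.$time.'\d*\s\/\S*\s.*'.$str.'/m',$contents,$mymatches);
	if ($text=="yes"){
		// Strip the regex escape "\/" before using the pattern as a label
		$str=str_replace('\\/','',$str);
		$count[1].="<br> Spider type=>".$str.": crawl count=".$count[0];
		if ($count[0]>0){
			// Characters 4-9 of the last matching line hold the time as HHMMSS
			$tmp=substr($mymatches[0][$count[0]-1],4,6);
			$tmp=substr($tmp,0,2).":".substr($tmp,2,2).":".substr($tmp,4,2);
			$count[1].=" Last crawl time: ".$tmp;
		}
	}
	return $count;
}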
function draw_canvas($google,$sanll,$baidu,$bing,$sogou,$soso,$else){
	$tmp=$google+$sanll+$baidu+$bing+$sogou+$soso+$else;
	if ($tmp==0){
		return "<br><br>Not enough data to generate the analysis chart.<br><br>";
	}
	// Percentage share of each spider
	$google2=$google*100/$tmp;
	$sanll2=$sanll*100/$tmp;
	$baidu2=$baidu*100/$tmp;
	$bing2=$bing*100/$tmp;
	$sogou2=$sogou*100/$tmp;
	$soso2=$soso*100/$tmp;
	$else2=$else*100/$tmp;
	$str="<br><div style='border-top: 1px solid #e6e6e6;'><br>     <div style='float:left;width:150px;border-width:1px;border-style:groove;padding:15px;'><b>Spider crawl analysis:</b><br>";
	// Same UTC+8 shift as the log timestamps
	$str.="Date: ".date("Y-m-d",time()+3600*8);
	$str.="<br>Total spider crawls: ".$tmp."<br>";
	$str.="<li><span style='color:#33A1C9;'>google: ".$google." (".intval($google2)."%)</span></li>";
	$str.="<li><span style='color:#33B8C9;'>360: ".$sanll." (".intval($sanll2)."%)</span></li>";
	$str.="<li><span style='color:#0033ff;'>baidu: ".$baidu." (".intval($baidu2)."%)</span></li>";
	$str.="<li><span style='color:#872657;'>bing: ".$bing." (".intval($bing2)."%)</span></li>";
	$str.="<li><span style='color:#FF9912;'>sogou: ".$sogou." (".intval($sogou2)."%)</span></li>";
	$str.="<li><span style='color:#FF6347;'>soso: ".$soso." (".intval($soso2)."%)</span></li>";
	// "else" takes whatever is left so the rounded percentages add up to 100
	$str.="<li><span style='color:#55aa00;'>else: ".$else." (".(100-intval($google2)-intval($sanll2)-intval($baidu2)-intval($bing2)-intval($sogou2)-intval($soso2))."%)</span></li></div>";
	$d = "'";
	// ECharts pie chart drawn into #main. The hidden <img> with an empty src is there to trigger onerror,
	// which loads ECharts again through jQuery's $.getScript and draws the chart after 2 seconds.
	$str.='No chart? <a href="javascript:location.reload()">Refresh</a>!<script src="//lib.baomitu.com/echarts/3.7.2/echarts-en.js"></script><!-- A DOM container with an explicit width and height for ECharts --><div id="main" style="float:right;width: 600px;height:400px;"></div><div style="display:none;"><img src="" onerror='.$d.'$.getScript("//lib.baomitu.com/echarts/3.7.2/echarts-en.js");setTimeout(function(){if(typeof(proxy2016)=="undefined"){var myChart=echarts.init(document.getElementById("main"));var option={title:{text:"祭夜blog spider visits",subtext:"Today",x:"center"},tooltip:{trigger:"item",formatter:"{a} <br/>{b} : {c} ({d}%)"},legend:{orient:"vertical",left:"left",data:["Google","Baidu","360","Bing","Sogou","else"]},series:[{name:"Source",type:"pie",radius:"55%",center:["50%","60%"],data:[{value:'.$google.',name:"Google"},{value:'.$baidu.',name:"Baidu"},{value:'.$sanll.',name:"360"},{value:'.$bing.',name:"Bing"},{value:'.$sogou.',name:"Sogou"},{value:'.$else.',name:"else"}],itemStyle:{emphasis:{shadowBlur:10,shadowOffsetX:0,shadowColor:"rgba(0, 0, 0, 0.5)"}}}]};myChart.setOption(option)}},2000);'.$d.'></div></div><br><br><br><br><br><br><br><br><br><br>';
	/*$str.=	"<img src = '//www.jysafe.cn/tb/chart?cht=p3&chco=33A1C9,0033ff,872657,FF9912,FF6347,55aa00&chd=t:".$google2 .",".$sanll2.",".$baidu2.",".$bing2.",".$sogou2.",".$soso2.",".$else2."&chs=400x200&chl=google|360|baidu|bing|sogou|soso|else' /></div><br>";*/
	return $str;
}
add_shortcode ('spiderlogs','get_spider_log');
// End of automatic spider analysis
?>
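
To make the log format and the counting step easier to follow, here is a small standalone sketch. The sample lines are made up in the layout that make_log_file() writes (mdHis, URI, protocol, method, agent, [query], referrer, IP), the IPs come from the 192.0.2.0/24 documentation range, and count_spider_hits() is a hypothetical helper that only mirrors the idea behind show_spider_result(); it is not the plugin's own function.

```php
<?php
// Made-up sample log in the layout written by make_log_file():
// mdHis URI protocol method agent [query] referrer IP
$agent  = 'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)';
$sample = "1003081530 /archives/1 HTTP/1.1 GET $agent []  192.0.2.1 \n"
        . "1003091200 /archives/2 HTTP/1.1 GET $agent []  192.0.2.2 \n"
        . "1002235959 / HTTP/1.1 GET $agent []  192.0.2.3 \n";

// Hypothetical helper: count the lines logged on $day (format "md") whose agent
// contains $name, and return the time of the last hit as HH:MM:SS (or null).
function count_spider_hits($day, $contents, $name) {
    // ^ plus the m flag pins the date to the start of each line;
    // preg_quote() lets $name be given as plain text rather than as a regex.
    $pattern = '/^' . preg_quote($day, '/') . '(\d{6})\s\/\S*\s.*'
             . preg_quote($name, '/') . '/m';
    $hits = (int) preg_match_all($pattern, $contents, $m);
    // $m[1] holds the captured HHMMSS of every match, in file order
    $last = $hits > 0 ? implode(':', str_split($m[1][$hits - 1], 2)) : null;
    return array($hits, $last);
}

list($hits, $last) = count_spider_hits('1003', $sample, 'Baiduspider/');
echo "hits=$hits last=$last\n"; // prints "hits=2 last=09:12:00"
```

The third sample line is dated 10-02, so it is not counted for 10-03. Because the log stores no year, lines from the same calendar day in an earlier year would still match; rotating or clearing mylogs.txt periodically avoids that.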

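祭夜の咖啡馆: all rights reserved. Unless otherwise noted, all content is original. This site is licensed under BY-NC-SA.
When reposting, please credit the original link: WordPress: Automatically Analyzing Search Engine Spider Crawl Logs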