.NET WebClient: problem scraping Google's indexed-page count
The Chinese text in the pages I scrape comes back entirely in Traditional characters. Also, the indexed-page count I scrape doesn't match the number Google shows when I run the site: query myself. What's the explanation for this? Please help!
-------------------- Programming Q&A --------------------
Nobody has replied yet, so I'll bump this myself.
-------------------- Programming Q&A --------------------
Use HttpWebRequest and set the Accept-Language header.
-------------------- Programming Q&A --------------------
Reply to #2: here is my code:
public string GetHtml(string url, Encoding encoding)
{
    WebRequest request;
    request = WebRequest.Create(url);
    request.Credentials = CredentialCache.DefaultCredentials;
    WebResponse response;
    response = request.GetResponse();
    return new StreamReader(response.GetResponseStream(), encoding).ReadToEnd();
}
How should I change it?
-------------------- Programming Q&A --------------------
request.Headers.Add("Accept-Language", "zh-cn,en-us;q=0.8,zh-hk;q=0.6,ja;q=0.4,zh;q=0.2");
request.Headers.Add("Accept-Charset", "GB2312,utf-8;q=0.7,*;q=0.7");
-------------------- Programming Q&A --------------------
Now it throws an error, "the operation has timed out":
public string GetHtml(string url, Encoding encoding)
{
    WebRequest request;
    request = WebRequest.Create(url);
    request.Headers.Add("Accept-Language", "zh-cn,en-us;q=0.8,zh-hk;q=0.6,ja;q=0.4,zh;q=0.2");
    request.Headers.Add("Accept-Charset", "GB2312,utf-8;q=0.7,*;q=0.7");
    request.Credentials = CredentialCache.DefaultCredentials;
    WebResponse response;
    response = request.GetResponse();
    return new StreamReader(response.GetResponseStream(), encoding).ReadToEnd();
}
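On the timeout: one thing worth ruling out (it is not confirmed as the cause in this thread) is that the response is never closed. The HttpWebResponse documentation warns that failing to close the response stream can leave the application without free connections, and later calls can then block until they time out. Below is a rough sketch, assuming .NET Framework. The class name TimeoutProbe and both method names are made up for illustration. It sets explicit timeouts, disposes the response, and prints the WebException status so you can see whether the failure really is a timeout.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class, not part of the original post.
public static class TimeoutProbe
{
    // Builds the request without sending it, so the settings can be inspected.
    public static HttpWebRequest CreateRequest(string url, int timeoutMs)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = timeoutMs;          // limit for GetResponse()
        request.ReadWriteTimeout = timeoutMs; // limit for reads on the response stream
        return request;
    }

    // Returns the page body, or null if the request fails.
    public static string TryGetHtml(string url, Encoding encoding, int timeoutMs)
    {
        var request = CreateRequest(url, timeoutMs);
        try
        {
            // Disposing the response and reader releases the connection for reuse.
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream(), encoding))
            {
                return reader.ReadToEnd();
            }
        }
        catch (WebException ex)
        {
            // ex.Status is WebExceptionStatus.Timeout when the time limit was hit.
            Console.Error.WriteLine("Request failed: " + ex.Status + " (" + ex.Message + ")");
            return null;
        }
    }
}
```

If the printed status is something other than Timeout (for example a protocol error), the problem is more likely on the server side than in the timeout settings.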
-------------------- Programming Q&A --------------------
Really? Which page are you scraping?
-------------------- Programming Q&A --------------------
This one: http://www.google.com.hk/search?q=site%3Awww.cnhan.com&ie=utf-8&oe=utf-8&aq=t
-------------------- Programming Q&A --------------------
// Holds any cookies the server sets while the request is processed.
System.Net.CookieContainer c = new System.Net.CookieContainer();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.google.com.hk/search?q=site%3Awww.cnhan.com&ie=utf-8&oe=utf-8&aq=t");
request.AllowAutoRedirect = true;
request.CookieContainer = c;
// Ask for Simplified Chinese (zh-cn) ahead of the other languages.
request.Headers.Add("Accept-Language", "zh-cn,en-us;q=0.8,zh-hk;q=0.6,ja;q=0.4,zh;q=0.2");
request.Headers.Add("Accept-Charset", "GB2312,utf-8;q=0.7,*;q=0.7");
// Send a browser-style User-Agent string.
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0";
string html;
// Dispose the response and reader so the connection is released.
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (var reader = new System.IO.StreamReader(response.GetResponseStream(), Encoding.UTF8))
{
    html = reader.ReadToEnd();
}
Response.Write(html);
This is the correct code.
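For anyone who wants to keep the original GetHtml(url, encoding) signature, the settings from the working code can be folded into it roughly like this. This is a minimal sketch, assuming .NET Framework. The class name GoogleFetcher and the helper BuildRequest are made up for illustration, and the header and User-Agent values are copied unchanged from the reply above.

```csharp
using System.IO;
using System.Net;
using System.Text;

// Hypothetical wrapper class, not part of the original thread.
public static class GoogleFetcher
{
    // Applies the headers from the working reply; does not send anything yet.
    public static HttpWebRequest BuildRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AllowAutoRedirect = true;
        request.CookieContainer = new CookieContainer();
        request.Headers.Add("Accept-Language", "zh-cn,en-us;q=0.8,zh-hk;q=0.6,ja;q=0.4,zh;q=0.2");
        request.Headers.Add("Accept-Charset", "GB2312,utf-8;q=0.7,*;q=0.7");
        request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0";
        return request;
    }

    // Same signature as the GetHtml method in the question.
    public static string GetHtml(string url, Encoding encoding)
    {
        var request = BuildRequest(url);
        // Dispose the response and reader so the connection is released.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Usage would look like GoogleFetcher.GetHtml(url, Encoding.UTF8), with Encoding.UTF8 chosen to match the oe=utf-8 parameter in the search URL.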
Category: .NET technology, ASP.NET