
A question about scraping data from a web page

Here is my test code:
// Requires: using System.IO; using System.Net; using System.Text;
private string GetPageData(string url)
{
    string strResult = string.Empty;
    try
    {
        // Create the HttpWebRequest
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        // Set the timeout in milliseconds
        request.Timeout = 30000;
        request.Headers.Set("Pragma", "no-cache");
        // Dispose the response, stream and reader when done
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream streamReceive = response.GetResponseStream())
        using (StreamReader streamReader = new StreamReader(streamReceive, Encoding.GetEncoding("GB2312")))
        {
            strResult = streamReader.ReadToEnd();
        }
    }
    catch
    {
        // Exceptions are swallowed here, so a failed request also returns an empty string
    }
    return strResult;
}

GetPageData("http://www.bijiaqi.com/cyw");
But the page content does not come back.
--------------------Programming Q&A--------------------
Is the string still empty?
--------------------Programming Q&A--------------------
What I get back is this string:
<!--@R--><script>
        var k ='ecfbbca6'; 
        var d = new Date(); 
        d.setTime(d.getTime() + (3600*24*365*5*1000));
        document.cookie = "pd5=" + k +"; expires=" + d.toGMTString();
        setTimeout(function(){
            window.location.reload();
        },2000);
        </script>Downloading server data...
--------------------Programming Q&A--------------------
Quoting reply #2 by fa159004287:
What I get back is this string:
<!--@R--><script>
        var k ='ecfbbca6'; 
        var d = new Date(); 
        d.setTime(d.getTime() + (3600*24*365*5*1000));
        document.cookie = "pd5=" + k +"; expires=" ……

I thought you were still getting an empty string.
In that case you have to parse the JS.
Search for how to parse JavaScript in C#.
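
For the script quoted above, a full JavaScript parser may not be needed: all it does is set one cookie (`pd5`) and reload the page after two seconds. The following is a minimal sketch of that idea, assuming the server only checks that the cookie is sent back on the next request, which is not confirmed anywhere in this thread. The class name `ChallengeCookieSketch` and its method names are hypothetical, and the regular expressions only match the exact script shape shown above.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

static class ChallengeCookieSketch
{
    // Pulls the cookie name and value out of a script shaped like:
    //   var k ='ecfbbca6'; ... document.cookie = "pd5=" + k + ...
    // Returns null when the page does not look like that challenge page.
    public static Cookie TryParseChallengeCookie(string html, string host)
    {
        Match value = Regex.Match(html, @"var\s+k\s*=\s*'([^']*)'");
        Match name = Regex.Match(html, @"document\.cookie\s*=\s*""([^""=]+)=""");
        if (!value.Success || !name.Success)
        {
            return null;
        }
        return new Cookie(name.Groups[1].Value, value.Groups[1].Value, "/", host);
    }

    // One GET request that sends whatever cookies the container holds.
    public static string Fetch(string url, CookieContainer cookies, Encoding encoding)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = 30000;
        request.CookieContainer = cookies;
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Assumption: repeating the request with the cookie set is enough.
    // If the server checks anything else, this returns the challenge page again.
    public static string GetPageData(string url)
    {
        // GB2312 is taken from the question's code. On .NET Core and later it
        // is only available after registering the code pages encoding provider.
        Encoding encoding = Encoding.GetEncoding("GB2312");
        CookieContainer cookies = new CookieContainer();
        string html = Fetch(url, cookies, encoding);

        Cookie challenge = TryParseChallengeCookie(html, new Uri(url).Host);
        if (challenge == null)
        {
            return html; // not the challenge page, so use it as it is
        }
        cookies.Add(challenge);
        return Fetch(url, cookies, encoding);
    }
}
```

It would be called the same way as the original method, for example `ChallengeCookieSketch.GetPageData("http://www.bijiaqi.com/cyw")`. Reusing one `CookieContainer` for both requests also keeps any cookies the server sets itself.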
--------------------Programming Q&A--------------------
Scraping web page content
--------------------Programming Q&A--------------------

// Requires: using System; using System.IO; using System.Net; using System.Text;
// MessageBox additionally requires: using System.Windows.Forms;
private string GetWebContent(string sUrl)
{
    string strResult = "";
    try
    {
        // Create the HttpWebRequest
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(sUrl);
        // Set the timeout in milliseconds (3000000 ms is 50 minutes)
        request.Timeout = 3000000;
        request.Headers.Set("Pragma", "no-cache");
        // Dispose the response, stream and reader when done
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream streamReceive = response.GetResponseStream())
        using (StreamReader streamReader = new StreamReader(streamReceive, Encoding.GetEncoding("UTF-8")))
        {
            strResult = streamReader.ReadToEnd();
        }
    }
    catch (Exception exp)
    {
        //MessageBox.Show("Error");
        MessageBox.Show(exp.Message);
    }
    return strResult;
}
// Test: string str = GetWebContent("http://www.baidu.com");
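
The two snippets in this thread differ mainly in the hard-coded encoding (GB2312 in the question, UTF-8 here). One possible alternative, sketched below, is to use the charset the server declares and fall back to a default when it is missing or unknown. The class `EncodingSketch` and the method `ResolveEncoding` are hypothetical names, and the sketch assumes the server's declared charset is correct, which is not always the case.

```csharp
using System;
using System.Text;

static class EncodingSketch
{
    // Maps a charset name such as "utf-8" or "gb2312" to an Encoding.
    // Returns the fallback when the name is empty or not recognized.
    public static Encoding ResolveEncoding(string charset, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(charset))
        {
            return fallback;
        }
        try
        {
            // Some servers wrap the charset value in quotes, so strip them.
            return Encoding.GetEncoding(charset.Trim().Trim('"'));
        }
        catch (ArgumentException)
        {
            // Encoding.GetEncoding throws ArgumentException for unknown names.
            return fallback;
        }
    }
}
```

Inside either method above it could be used as `Encoding encoding = EncodingSketch.ResolveEncoding(response.CharacterSet, Encoding.UTF8);` before constructing the `StreamReader`.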
Tags: .NET Technology, C#