
Asking the experts for help with a page-scraping problem

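I have been building an NBA app that needs the player data from Sina NBA, but my scraper never gets the player information back. I would appreciate some guidance. For example, I want to fetch the player statistics shown on the page http://sports.sina.com.cn/nba/live.html?id=2013112928. Here is my code: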

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.zip.GZIPInputStream;

public class NBA_Sina {

    public static void main(String[] args) throws IOException {
        // Set up the request
        URL url = new URL(
                "http://sports.sina.com.cn/nba/live.html?id=2013112928");
        HttpURLConnection httpURLConnection = (HttpURLConnection) url
                .openConnection();
        httpURLConnection.setDoInput(true);
        // No explicit Host header: it is derived from the URL, and the
        // original value "live.sinajs.cn" did not match the requested host.
        httpURLConnection
                .setRequestProperty("User-Agent",
                        "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0");
        httpURLConnection
                .setRequestProperty("Accept",
                        "*/*");
        httpURLConnection.setRequestProperty("Accept-Language",
                "zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3");
        // Ask for gzip only, since that is the only encoding decoded below
        httpURLConnection
                .setRequestProperty("Accept-Encoding", "gzip");
        // Header name fixed: the original spelled it "Rerfer"
        httpURLConnection.setRequestProperty("Referer", "http://sports.sina.com.cn/nba/live.html?id=2013112923");
        httpURLConnection.setRequestProperty("Connection", "keep-alive");

        // Decompress only if the server actually sent a gzip body
        httpURLConnection.connect();
        InputStream inputStream = httpURLConnection.getInputStream();
        if ("gzip".equalsIgnoreCase(httpURLConnection.getContentEncoding())) {
            inputStream = new GZIPInputStream(inputStream);
        }

        // Collect the whole body, keeping only the bytes each read() returned
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int len;
        while ((len = inputStream.read(buf)) != -1) {
            body.write(buf, 0, len);
        }
        inputStream.close();
        httpURLConnection.disconnect();

        // Output. This decodes with the platform default charset, as the
        // original code did; pass the page's charset if the text is garbled.
        System.out.println(new String(body.toByteArray()));
    }
}
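
The read-and-decode step in the code above can be pulled into a small helper and checked without touching the network. What follows is only a minimal sketch: the names `BodyDecoder`, `decodeBody` and `gzip` are hypothetical, and UTF-8 is an assumption made for the in-memory sample, not a statement about the charset of the Sina page. It decompresses only when the `Content-Encoding` value is `gzip`, and it decodes the bytes once, after the whole body has been read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class BodyDecoder {

    // Hypothetical helper: reads the whole stream, gunzipping it only when
    // contentEncoding is "gzip", then decodes the bytes with the given charset.
    public static String decodeBody(InputStream raw, String contentEncoding,
                                    Charset charset) throws IOException {
        InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(raw)
                : raw;
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int len;
        // Append only the bytes actually read in each pass
        while ((len = in.read(buf)) != -1) {
            body.write(buf, 0, len);
        }
        return new String(body.toByteArray(), charset);
    }

    // Test-only helper: gzip-compresses a string in memory
    public static byte[] gzip(String text, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(charset));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Charset cs = StandardCharsets.UTF_8; // assumption for the sample only
        String sample = "player stats";

        String fromGzip = decodeBody(
                new ByteArrayInputStream(gzip(sample, cs)), "gzip", cs);
        String fromPlain = decodeBody(
                new ByteArrayInputStream(sample.getBytes(cs)), null, cs);

        System.out.println(fromGzip.equals(sample));
        System.out.println(fromPlain.equals(sample));
    }
}
```

With a real connection, the arguments would be `getInputStream()` and `getContentEncoding()` from the `HttpURLConnection`. This only makes the output readable. If the page fills in the player statistics with JavaScript from a separate request, they will not appear in this response no matter how it is decoded.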

