Answer: Original source of this article: http://www.itican.net/lmy0083/?cat=3
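I have recently been working on a WAP project and found that some phones encode Chinese characters oddly when submitting them via POST. My environment is GB18030, which supports GBK/GB2312. I develop and test with Opera 7.60, and the development language is JSP/Java.
As far as I know, Java's default charset for transmitted data is 8859_1 (a single-byte charset); most phones use UTF-8, and a few use GB2312.
For background on these charsets, see http://www.itican.net/lmy0083/?page_id=88. I mainly test with the Chinese character "人":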
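人
GBK/GB2312 (Hex): C8 CB
UTF-8 (Hex): E4 BA BA
Unicode: U+4EBA (人)
I develop and test with Opera 7.60 because it is convenient, so everything was built around what this browser supports. For example, when I set Opera's charset to UTF-8, every POSTed Chinese character arrives as if it had been sent in the default 8859_1 charset. (That is a single-byte charset and cannot contain Chinese characters: a Chinese character takes two bytes in GBK/GB2312 and three bytes in UTF-8.) Presumably the browser sends UTF-8 bytes and the server decodes each byte as one 8859_1 character. Strangely, then, the parameter has to be received like this: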
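String sPost = new String(request.getParameter("input_name").getBytes("8859_1"), "UTF-8");
Otherwise my application does not work. I cannot use
request.getParameter("input_name")
directly, and of course I cannot write
String sPost = new String(request.getParameter("input_name").getBytes("8859_1"), "some-other-charset");
either. Developed this way, some phones work without problems, for example the Nokia 6681. In real testing, however, other phones such as the N800 and Nokia 6670 deliver garbled text when Chinese input is received this way.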
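The 8859_1 round trip can be tried outside a servlet container. The following is a minimal sketch, assuming the server decodes the client's raw bytes as ISO-8859-1; the class and method names (`RoundTripDemo`, `receivedAsLatin1`, `recover`, `hex`) are made up for illustration and are not part of the original program.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RoundTripDemo {
    /** Simulates a server that decodes the client's raw bytes as ISO-8859-1. */
    static String receivedAsLatin1(String typed, Charset clientCharset) {
        return new String(typed.getBytes(clientCharset), StandardCharsets.ISO_8859_1);
    }

    /** Undoes the mis-decoding: get the raw bytes back, then decode them properly. */
    static String recover(String received, Charset clientCharset) {
        return new String(received.getBytes(StandardCharsets.ISO_8859_1), clientCharset);
    }

    /** Formats bytes as upper-case hex, for example E4BABA. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String typed = "\u4EBA"; // the test character from the article
        Charset gbk = Charset.forName("GBK");

        System.out.println("UTF-8 bytes: " + hex(typed.getBytes(StandardCharsets.UTF_8)));
        System.out.println("GBK bytes:   " + hex(typed.getBytes(gbk)));

        // A UTF-8 client whose bytes were decoded as ISO-8859-1: one char per byte.
        String received = receivedAsLatin1(typed, StandardCharsets.UTF_8);
        System.out.println("received length:  " + received.length());
        System.out.println("recovered (UTF-8): "
                + recover(received, StandardCharsets.UTF_8).equals(typed));
        // Re-decoding with a charset the client did not use does not restore the text.
        System.out.println("recovered (GBK):   " + recover(received, gbk).equals(typed));
    }
}
```

The round trip is lossless only because ISO-8859-1 maps every byte value to exactly one character, so `getBytes` with that charset returns the original bytes unchanged.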
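So I tested a number of browsers, emulators, and phones to see which encoding I was actually receiving.
Test program (receiving side; for the sending side, write any form with an input named "input_name"):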
- String sPost= request.getParameter("input_name");
- if(sPost== null || sPost.length()==0){sPost= "0";}
- String GB2312_TO_UTF_8 = new String(sPost.getBytes("GB2312"),"UTF-8");
- String GB2312_TO_GBK = new String(sPost.getBytes("GB2312"),"GBK");
- String GB2312_TO_8859_1 = new String(sPost.getBytes("GB2312"),"8859_1");
- String GBK_TO_UTF_8 = new String(sPost.getBytes("GBK"),"UTF-8");
- String GBK_TO_GB2312 = new String(sPost.getBytes("GBK"),"GB2312");
- String GBK_TO_8859_1 = new String(sPost.getBytes("GBK"),"8859_1");
- String ISO_8859_1_TO_UTF_8 = new String(sPost.getBytes("8859_1"),"UTF-8");
- String ISO_8859_1_TO_GB2312 = new String(sPost.getBytes("8859_1"),"GB2312");
- String ISO_8859_1_TO_GBK = new String(sPost.getBytes("8859_1"),"GBK");
- String UTF_8_TO_8859_1 = new String(sPost.getBytes("UTF-8"),"8859_1");
- String UTF_8_TO_GB2312 = new String(sPost.getBytes("UTF-8"),"GB2312");
- String UTF_8_TO_GBK = new String(sPost.getBytes("UTF-8"),"GBK");
- System.out.println("<WAP TEST>*********************************************************");
- System.out.println("<WAP TEST>ren");
- System.out.println("<WAP TEST> UTF_8 |" + "E4BABA|" + java.net.URLEncoder.encode(sPost,"UTF-8"));
- System.out.println("<WAP TEST> GBK |" + "C8CB |" + java.net.URLEncoder.encode(sPost,"GBK"));
- System.out.println("<WAP TEST> GB2312 |" + "C8CB |" + java.net.URLEncoder.encode(sPost,"GB2312"));
- System.out.println("<WAP TEST> 8859_1 |" + " |" + java.net.URLEncoder.encode(sPost,"8859_1"));
- System.out.println("<WAP TEST>GB2312 TO UTF_8 |" + "E4BABA|" + java.net.URLEncoder.encode(GB2312_TO_UTF_8,"UTF-8"));
- System.out.println("<WAP TEST>GB2312 TO GBK |" + "C8CB |" + java.net.URLEncoder.encode(GB2312_TO_GBK,"GBK"));
- System.out.println("<WAP TEST>GB2312 TO 8859_1 |" + "C8CB |" + java.net.URLEncoder.encode(GB2312_TO_8859_1,"8859_1"));
- System.out.println("<WAP TEST>GBK TO UTF_8 |" + "E4BABA|" + java.net.URLEncoder.encode(GBK_TO_UTF_8,"UTF-8"));
- System.out.println("<WAP TEST>GBK TO GB2312 |" + "C8CB |" + java.net.URLEncoder.encode(GBK_TO_GB2312,"GB2312"));
- System.out.println("<WAP TEST>GBK TO 8859_1 |" + "C8CB |" + java.net.URLEncoder.encode(GBK_TO_8859_1,"8859_1"));
- System.out.println("<WAP TEST>8859_1 TO UTF_8 |" + "E4BABA|" + java.net.URLEncoder.encode(ISO_8859_1_TO_UTF_8,"UTF-8"));
- System.out.println("<WAP TEST>8859_1 TO GB2312 |" + "C8CB |" + java.net.URLEncoder.encode(ISO_8859_1_TO_GB2312,"GB2312"));
- System.out.println("<WAP TEST>8859_1 TO GBK |" + "C8CB |" + java.net.URLEncoder.encode(ISO_8859_1_TO_GBK,"GBK"));
- System.out.println("<WAP TEST>UTF_8 TO 8859_1 |" + "E4BABA|" + java.net.URLEncoder.encode(UTF_8_TO_8859_1,"8859_1"));
- System.out.println("<WAP TEST>UTF_8 TO GB2312 |" + "C8CB |" + java.net.URLEncoder.encode(UTF_8_TO_GB2312,"GB2312"));
- System.out.println("<WAP TEST>UTF_8 TO GBK |" + "C8CB |" + java.net.URLEncoder.encode(UTF_8_TO_GBK,"GBK"));
Test results:
N800 Openwave V6.1
- <WAP TEST>*********************************************************
- <WAP TEST>ren
- <WAP TEST> UTF_8 |E4BABA|%E4%BA%BA
- <WAP TEST> GBK |C8CB |%C8%CB
- <WAP TEST> GB2312 |C8CB |%C8%CB
- <WAP TEST> 8859_1 | |%3F
- <WAP TEST>GB2312 TO UTF_8 |E4BABA|%EF%BF%BD%EF%BF%BD
- <WAP TEST>GB2312 TO GBK |C8CB |%C8%CB
- <WAP TEST>GB2312 TO 8859_1 |C8CB |%C8%CB
- <WAP TEST>GBK TO UTF_8 |E4BABA|%EF%BF%BD%EF%BF%BD
- <WAP TEST>GBK TO GB2312 |C8CB |%C8%CB
- <WAP TEST>GBK TO 8859_1 |C8CB |%C8%CB
- <WAP TEST>8859_1 TO UTF_8 |E4BABA|%3F
- <WAP TEST>8859_1 TO GB2312 |C8CB |%3F
- <WAP TEST>8859_1 TO GBK |C8CB |%3F
- <WAP TEST>UTF_8 TO 8859_1 |E4BABA|%E4%BA%BA
- <WAP TEST>UTF_8 TO GB2312 |C8CB |%E4%BA%3F
- <WAP TEST>UTF_8 TO GBK |C8CB |%E4%BA%3F
Opera UTF-8
- <WAP TEST>*********************************************************
- <WAP TEST>ren
- <WAP TEST> UTF_8 |E4BABA|%C3%A4%C2%BA%C2%BA
- <WAP TEST> GBK |C8CB |%3F%3F%3F
- <WAP TEST> GB2312 |C8CB |%3F%3F%3F
- <WAP TEST> 8859_1 | |%E4%BA%BA
- <WAP TEST>GB2312 TO UTF_8 |E4BABA|%3F%3F%3F
- <WAP TEST>GB2312 TO GBK |C8CB |%3F%3F%3F
- <WAP TEST>GB2312 TO 8859_1 |C8CB |%3F%3F%3F
- <WAP TEST>GBK TO UTF_8 |E4BABA|%3F%3F%3F
- <WAP TEST>GBK TO GB2312 |C8CB |%3F%3F%3F
- <WAP TEST>GBK TO 8859_1 |C8CB |%3F%3F%3F
- <WAP TEST>8859_1 TO UTF_8 |E4BABA|%E4%BA%BA
- <WAP TEST>8859_1 TO GB2312 |C8CB |%E4%BA%3F
- <WAP TEST>8859_1 TO GBK |C8CB |%E4%BA%3F
- <WAP TEST>UTF_8 TO 8859_1 |E4BABA|%C3%A4%C2%BA%C2%BA
- <WAP TEST>UTF_8 TO GB2312 |C8CB |%C3%A4%C2%BA%C2%BA
- <WAP TEST>UTF_8 TO GBK |C8CB |%C3%A4%C2%BA%C2%BA
Opera GBK/GB2312
Nokia 6681
Openwave V7 Simulator
Openwave V6.2.2 GB2312
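The test results show that:
1 N800 (Openwave V6.1): the received POSTed Chinese characters can be converted directly to UTF-8 or GB2312/GBK; all come out correct.
2 Opera UTF-8: only the 8859_1-to-UTF-8 conversion works. Some phones behave this way and are supported.
3 Opera GBK/GB2312: only the 8859_1-to-GB2312/GBK conversion works. Some phones behave this way and are supported.
4 Nokia 6681: the key rows of its results match those of Opera UTF-8, and real-world testing confirms this.
5 Openwave V7 Simulator: same results as N800 (Openwave V6.1).
6 Openwave V6.2.2 GB2312: same results as Opera GBK/GB2312.
However, I developed for the Opera UTF-8 mode (I did not think it through at the time), so my program only suits cases 2 and 4; the N800, at the very least, is not supported.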
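The key rows of the two printed result blocks can be reproduced without a phone. This sketch assumes the N800 case corresponds to a parameter that is already the correct string, and the Opera UTF-8 case to UTF-8 bytes decoded as ISO-8859-1. `DiagnosticRows`, `direct`, and `row` are hypothetical names that mirror what the test program does on each line.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class DiagnosticRows {
    /** URL-encodes s in the given charset, like the first four rows of the test. */
    static String direct(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Mirrors the "X TO Y" rows: take s's bytes in `from`, decode as `to`, URL-encode in `to`. */
    static String row(String s, String from, String to) {
        try {
            String converted = new String(s.getBytes(from), to);
            return URLEncoder.encode(converted, to);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    static void report(String label, String sPost) {
        System.out.println(label);
        System.out.println("  UTF_8           | " + direct(sPost, "UTF-8"));
        System.out.println("  8859_1          | " + direct(sPost, "ISO-8859-1"));
        System.out.println("  GBK TO UTF_8    | " + row(sPost, "GBK", "UTF-8"));
        System.out.println("  8859_1 TO UTF_8 | " + row(sPost, "ISO-8859-1", "UTF-8"));
        System.out.println("  8859_1 TO GBK   | " + row(sPost, "ISO-8859-1", "GBK"));
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String correct = "\u4EBA"; // parameter already decoded correctly
        String latin1 = new String(correct.getBytes("UTF-8"), "ISO-8859-1"); // one char per byte

        report("already-correct string (like the N800 block)", correct);
        report("UTF-8 bytes read as 8859_1 (like the Opera UTF-8 block)", latin1);
    }
}
```

Reading the output the same way as the tables: whichever row prints `%E4%BA%BA` against the expected `E4BABA` tells you how the parameter has to be treated for that client.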
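Solution:
This solution fixes garbled Chinese input on the N800, Nokia 6670, and similar phones (a whole family of handsets).
Because so few phones were tested, I cannot be sure that it fixes garbled Chinese on every phone; it is only a stopgap.
These tests depend on many factors, including the system environment and the encoding environment of the program.
Does anyone have better suggestions or approaches?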
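One way to handle all six cases in a single code path is to inspect the received string before deciding whether to convert it. The sketch below is only an assumption about how that could be done; it is not necessarily the method used in this article, and `ParamNormalizer.normalize` is a hypothetical helper. It relies on the observation that a string containing any character above U+00FF cannot have come from an 8859_1 decode, and it falls back to GBK when the raw bytes are not valid UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class ParamNormalizer {
    private static final Charset GBK = Charset.forName("GBK");

    /** Best-effort repair of a POST parameter value (heuristic, not a guarantee). */
    static String normalize(String received) {
        if (received == null) {
            return null;
        }
        boolean asciiOnly = true;
        for (int i = 0; i < received.length(); i++) {
            char c = received.charAt(i);
            if (c > 0xFF) {
                return received; // already real text, as in cases 1 and 5
            }
            if (c > 0x7F) {
                asciiOnly = false;
            }
        }
        if (asciiOnly) {
            return received; // nothing to repair
        }
        // Every char fits in one byte: assume the bytes were decoded as ISO-8859-1.
        byte[] raw = received.getBytes(StandardCharsets.ISO_8859_1);
        try {
            // Strict decode: throws instead of inserting replacement characters.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException notUtf8) {
            return new String(raw, GBK); // as in cases 3 and 6
        }
    }

    public static void main(String[] args) {
        String typed = "\u4EBA";
        String fromUtf8Client = new String(typed.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        String fromGbkClient = new String(typed.getBytes(GBK), StandardCharsets.ISO_8859_1);
        System.out.println(normalize(typed).equals(typed));
        System.out.println(normalize(fromUtf8Client).equals(typed));
        System.out.println(normalize(fromGbkClient).equals(typed));
    }
}
```

The heuristic has limits: input that really consists of accented Latin-1 characters would be misread, and a few GBK byte sequences also happen to be valid UTF-8. Like the fix described here, it has only been reasoned through for the handful of clients listed above.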