Help with text processing
The text is shown between 【】 below. How can I remove the bold links? 【http://item.taobao.com/item.htm?id=15743974559, http://item.taobao.com/item.htm?id=15743974559&on_comment=1#reviews, http://item.taobao.com/item.htm?id=15482006763, http://item.taobao.com/item.htm?id=15482006763&on_comment=1#reviews, http://item.taobao.com/item.htm?id=20254752472, http://item.taobao.com/item.htm?id=20254752472&on_comment=1#reviews, http://item.taobao.com/item.htm?id=17026847099, http://item.taobao.com/item.htm?id=17026847099&on_comment=1#reviews, http://item.taobao.com/item.htm?id=20544352592, http://item.taobao.com/item.htm?id=20544352592&on_comment=1#reviews】 --------------------Programming Q&A-------------------- Bold? Is this HTML? Whatever the format is, first work out how the bold is actually marked up; after that a regular expression should be able to handle it. --------------------Programming Q&A-------------------- To add to my question, the result after processing should look like this:
【http://item.taobao.com/item.htm?id=15743974559, http://item.taobao.com/item.htm?id=15482006763, http://item.taobao.com/item.htm?id=20254752472, http://item.taobao.com/item.htm?id=17026847099, http://item.taobao.com/item.htm?id=20544352592】 --------------------Programming Q&A-------------------- It is a plain String.
--------------------Programming Q&A-------------------- Changing it to this should be somewhat more efficient:
String str = "【http://item.taobao.com/item.htm?id=15743974559," +
" http://item.taobao.com/item.htm?id=15743974559&on_comment=1#reviews," +
" http://item.taobao.com/item.htm?id=15482006763, " +
"http://item.taobao.com/item.htm?id=15482006763&on_comment=1#reviews," +
" http://item.taobao.com/item.htm?id=20254752472," +
" http://item.taobao.com/item.htm?id=20254752472&on_comment=1#reviews, " +
"http://item.taobao.com/item.htm?id=17026847099, " +
"http://item.taobao.com/item.htm?id=17026847099&on_comment=1#reviews," +
" http://item.taobao.com/item.htm?id=20544352592, " +
"http://item.taobao.com/item.htm?id=20544352592&on_comment=1#reviews】 ";
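// Lazy quantifiers: keep each item link and drop the "#reviews" link that follows it.
System.out.println(str.replaceAll("(http:.*?), http:(?:.*?)#reviews", "$1"));
// Same replacement with [^,]* instead of .*?, so a match can never run past a comma.
System.out.println(str.replaceAll("(http:[^,]*), http:(?:[^,]*)#reviews", "$1")); --------------------Programming Q&A-------------------- I was overcomplicating it before; this is the simplest: System.out.println(str.replaceAll(", http:[^,]*#reviews", "")); --------------------Programming Q&A-------------------- str.replaceAll("(http:.*?), http:(.*?)#reviews", "$1")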
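What does the question mark mean here? I haven't used regular expressions in a long time and have forgotten some of it. A question mark usually means "one or none", but that doesn't seem to be what it means here, does it? --------------------Programming Q&A--------------------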
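Lazy matching: a ? placed after a quantifier makes it match as little as possible. Try matching 12345 with \\d+ and with \\d+? and you will see what it means.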
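To make the experiment suggested above concrete, here is a minimal sketch. The class name LazyVsGreedy, the helper methods, and the shortened sample URLs (http://x?id=1 and so on) are made up for illustration and are not from the thread; only the patterns follow the ones posted above. It compares \\d+ with \\d+? on 12345, and then shows what would likely go wrong if the lazy .*? in the link pattern were replaced by a greedy .*.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyVsGreedy {
    // Hypothetical sample: shortened, made-up URLs in the same shape as the thread's data.
    static final String SAMPLE =
            "http://x?id=1, http://x?id=1#reviews, http://x?id=2, http://x?id=2#reviews";

    // Returns the first substring of input matched by regex, or null if nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Lazy quantifiers: each match covers one item link plus the review link right after it.
    static String stripLazy(String s) {
        return s.replaceAll("(http:.*?), http:.*?#reviews", "$1");
    }

    // Greedy quantifiers: a single match swallows everything up to the last review link.
    static String stripGreedy(String s) {
        return s.replaceAll("(http:.*), http:.*#reviews", "$1");
    }

    // Negated character class: no lazy quantifier needed, a match cannot cross a comma.
    static String stripSimple(String s) {
        return s.replaceAll(", http:[^,]*#reviews", "");
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("\\d+", "12345"));   // prints 12345 (greedy: as much as possible)
        System.out.println(firstMatch("\\d+?", "12345"));  // prints 1 (lazy: as little as possible)

        System.out.println(stripLazy(SAMPLE));    // prints http://x?id=1, http://x?id=2
        System.out.println(stripGreedy(SAMPLE));  // prints http://x?id=1, http://x?id=1#reviews, http://x?id=2
        System.out.println(stripSimple(SAMPLE));  // prints http://x?id=1, http://x?id=2
    }
}
```

With the greedy version only the last review link is removed, which is why the posted pattern uses .*?. The [^,]* version gets the same result as the lazy one without relying on laziness at all.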
Additional info: Java, Java SE