
Filtering HTML Elements with Java Regular Expressions
Author: Anonymous
Keywords: JavaScript



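// Imports required by the enclosing class:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
     * Removes all HTML elements from a string.
     * For example: <a href="www.sohu.com/test">hello!</a>
     * The filtered result is: hello!
     * Note: this method removes the text between "<" and ">",
     * then decodes a few common character entities.
     *
     * @param element the HTML fragment to filter; may be null
     * @return the text without tags, or the input unchanged if it is null or blank
     */
     public static String getTxtWithoutHTMLElement (String element)
     {
//       One-call alternative (it would also strip whitespace-only brackets such as "< >"):
//       String reg="<[^<>]+>";
//       return   element.replaceAll(reg,"");

         if(null==element||"".equals(element.trim()))
         {
             return element;
         }

         // Inside a character class, '|' and a non-leading '^' are literals, so [^<>] is sufficient.
         Pattern pattern=Pattern.compile("<[^<>]*>");
         Matcher matcher=pattern.matcher(element);
         StringBuffer txt=new StringBuffer();
         while(matcher.find())
         {
             String group=matcher.group();
             if(group.matches("<[\\s]*>"))
             {
                 // Empty or whitespace-only brackets ("<>", "< >") are not tags; keep them.
                 matcher.appendReplacement(txt,group);
             }
             else
             {
                 // Anything else between "<" and ">" is treated as a tag and dropped.
                 matcher.appendReplacement(txt,"");
             }
         }
         matcher.appendTail(txt);
         // repaceEntities is a helper of the enclosing class that is not shown here.
         repaceEntities(txt,"&lt;","<");
         repaceEntities(txt,"&gt;",">");
         repaceEntities(txt,"&quot;","\"");
         repaceEntities(txt,"&nbsp;","");
         // Decode &amp; last so that text such as "&amp;lt;" is not decoded twice.
         repaceEntities(txt,"&amp;","&");

         return txt.toString();
     }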
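
The method calls a helper named `repaceEntities` that the article never shows. The following is a minimal sketch of what it might look like, assuming it replaces every occurrence of an entity inside the buffer in place. The class name `EntitySketch` and the parameter names are invented for illustration; the original helper may differ.

```java
public class EntitySketch {

    /**
     * Hypothetical stand-in for the helper the article calls but does not show.
     * Replaces every occurrence of entity in txt with replacement, in place.
     */
    public static void repaceEntities(StringBuffer txt, String entity, String replacement) {
        if (txt == null || entity == null || entity.isEmpty() || replacement == null) {
            return;
        }
        int index = txt.indexOf(entity);
        while (index != -1) {
            txt.replace(index, index + entity.length(), replacement);
            // Resume after the inserted text so the replacement itself is never rescanned.
            index = txt.indexOf(entity, index + replacement.length());
        }
    }

    public static void main(String[] args) {
        StringBuffer txt = new StringBuffer("1 &lt; 2 &amp;&amp; 3 &gt; 2");
        repaceEntities(txt, "&lt;", "<");
        repaceEntities(txt, "&gt;", ">");
        repaceEntities(txt, "&amp;", "&");
        System.out.println(txt); // prints: 1 < 2 && 3 > 2
    }
}
```

Skipping past the inserted text matters when the replacement contains the entity's first character, as with `&amp;` and `&`; otherwise the loop could rescan what it just wrote.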



The test cases follow:
public void testGetTxtWithoutHTMLElement ()
     {
        
         assertEquals("test",ExcelHssfView.getTxtWithoutHTMLElement("<a href='a/test'>test</a>"));
        
         assertEquals("test",ExcelHssfView.getTxtWithoutHTMLElement("<a href='a/test'>test"));
        
         assertEquals("test",ExcelHssfView.getTxtWithoutHTMLElement("<input type='text'>test</input>"));
        
         assertEquals("test",ExcelHssfView.getTxtWithoutHTMLElement("<p>test"));
        
         assertEquals("test",ExcelHssfView.getTxtWithoutHTMLElement("<table><tr><td>test</td></tr></table>"));
        
         assertEquals("te<st",ExcelHssfView.getTxtWithoutHTMLElement("<p>te<st"));
        
         assertEquals("te>st",ExcelHssfView.getTxtWithoutHTMLElement("<p>te>st"));
        
         assertEquals("tst",ExcelHssfView.getTxtWithoutHTMLElement("<p>t<e>st"));
        
         assertEquals("t<st",ExcelHssfView.getTxtWithoutHTMLElement("<p>t<<e>st"));
        
         assertEquals("<>test",ExcelHssfView.getTxtWithoutHTMLElement("<p><>test"));
        
         assertEquals("< >test",ExcelHssfView.getTxtWithoutHTMLElement("<p>< >test"));
        
         assertEquals("<<>test",ExcelHssfView.getTxtWithoutHTMLElement("<p><<>test"));
        
         assertEquals("test",ExcelHssfView.getTxtWithoutHTMLElement("<table><tr><td>&nbsp;test</td></tr></table>"));
        
     }
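
The commented-out lines at the top of the method hint at a simpler single-call variant built on `replaceAll`. The sketch below shows that variant and where it departs from the loop version. The names `ReplaceAllSketch` and `stripTags` are invented for illustration, and the character class is written as `[^<>]` rather than copied from the article.

```java
import java.util.regex.Pattern;

public class ReplaceAllSketch {

    // Compiled once and reused. The '+' requires at least one character
    // between the brackets, so an empty "<>" is never treated as a tag.
    private static final Pattern TAG = Pattern.compile("<[^<>]+>");

    public static String stripTags(String element) {
        if (element == null) {
            return null;
        }
        return TAG.matcher(element).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripTags("<p>t<<e>st")); // prints: t<st
        System.out.println(stripTags("<p><>test"));  // prints: <>test
        System.out.println(stripTags("<p>< >test")); // prints: test
    }
}
```

The last call is the one that differs: the loop version keeps whitespace-only brackets and returns `< >test`, which is what the test cases above expect, while the single-call variant removes them.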


Source: Internet
Date: 2008-8-19
