JWorld@TW the best professional Java site in Taiwan
Author: saijone (moderator)
Subject: The fastest way to process (parse) XML: Streaming Parser [Featured]
Posted: 2003-07-17 07:03
http://www-106.ibm.com/developerworks/xml/library/x-injava/index.html
http://www.extreme.indiana.edu/xgws/xsoap/xpp/

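In Java, besides SAX and DOM, which are already standardized in JAXP, there is
another style of XML parsing: the streaming/pull parser, which is popular for
processing SOAP messages in Web Services. Both BEA's WebLogic WebServices and
Sun's JAXRPC-RI use a (proprietary) streaming parser.

Streaming parsing should also become a standard Java API in the future. The
BEA-led JSR-173 (Streaming API for XML, or StAX), http://www.jcp.org/en/jsr/detail?id=173,
will probably become the standard Java API for streaming parsing.

In tests I ran more than a year ago, jaxrpc-ri, which uses streaming parsing,
was indeed somewhat faster than Axis, which uses SAX, but the gap narrowed under
JDK 1.4/Crimson. (Of course, that was more than a year ago; the performance of
today's Axis or jaxrpc-ri may be noticeably different.)
Systinet (http://www.systinet.com/products/java_ws), whose Web Services
performance stands out, also uses a streaming pull parser (reportedly adapted from XPP).

In addition, although SAX performance is also good, the Axis source code shows
that SAX is a poor fit for SOAP processing: Axis has to collect (record) the SAX
events so that it can use them later for deserialization.

Besides being fast, streaming pull parsing also does very well on memory usage.
For details, see: http://www-106.ibm.com/developerworks/xml/library/x-injava/index.html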
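
For readers who have not seen the pull style, here is a minimal sketch of the basic loop. It assumes the javax.xml.stream API that JSR-173 later delivered (bundled with Java SE 6 and later), not the proprietary parsers or XPP mentioned above. The class name, the method name and the SOAP-like message are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class PullLoopSketch {

    // Walks the document with a pull cursor and returns the local names of
    // all start tags in document order.
    static List<String> startTags(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        try {
            // The application asks for the next event; nothing is pushed to it.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical, simplified SOAP-like message (no namespaces).
        String xml = "<Envelope><Header><To>svc</To></Header>"
                   + "<Body><getQuote><symbol>IBM</symbol></getQuote></Body></Envelope>";
        System.out.println(startTags(xml));
    }
}
```

The point of the sketch is only the control flow: the caller owns the loop and decides when to call next(), whereas with SAX the parser owns the loop and calls back into a handler.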


Author: popcorny (moderator)
Subject: Re: The fastest way to process (parse) XML: Streaming Parser [Re: saijone]
Posted: 2003-07-17 09:55
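But a pull parser probably only shows its advantage in special-purpose cases..
For example, an RPC router (or should it be called an XML router?) is where it is really useful,
because it only needs to look at the SOAP header
and does not need to read the whole document,
so pulling the data is a better fit there.
I hope I haven't got this wrong.. :D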
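
The router case can be sketched roughly as follows, again assuming the javax.xml.stream API from JSR-173 as shipped with later JDKs. The <To> header entry, the class name and the method name are hypothetical, and namespaces are left out to keep the example short.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class HeaderOnlySketch {

    // Returns the text of the first <To> element inside <Header>, or null if
    // there is none. The method returns as soon as the header has been read,
    // so no further events are requested for the body.
    static String routeTarget(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            boolean inHeader = false;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("Header")) {
                        inHeader = true;
                    } else if (name.equals("Body")) {
                        return null; // reached the body without a routing entry
                    } else if (inHeader && name.equals("To")) {
                        return reader.getElementText();
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("Header")) {
                    return null; // header finished; stop before the body
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<Envelope><Header><To>svc-a</To></Header>"
                   + "<Body><order><item>1</item></order></Body></Envelope>";
        System.out.println(routeTarget(xml));
    }
}
```

With SAX the handler would keep receiving callbacks for the body unless it aborted the parse, typically by throwing an exception; with a pull reader, stopping is just a return statement.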


Author: saijone (moderator)
Subject: Re: The fastest way to process (parse) XML: Streaming Parser [Re: saijone]
Posted: 2003-07-17 10:26
My Chinese input is slow, so forgive me for using English ...

Partial parsing is just one of many benefits of streaming parsing.
Theoretically, SAX parsing should have the best performance and should
not take much memory. However, the reality
(http://www-106.ibm.com/developerworks/xml/library/x-injava/index.html) is not that good
for SAX. An important thing I found in my previous project is that a
streaming parser/reader is much LESS expensive to instantiate/create
than a SAX reader or DOM builder. Think about normal SOAP request
processing, where you may need to parse/process hundreds or thousands of
SOAP messages per second. For each SOAP message, you need to create, or
possibly reuse, a parser. Therefore, the parser's initialization speed and memory
footprint become significant. Because a SAX parser is internally a complex
state machine, it may be difficult (or even always buggy) to
reuse/pool.
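
A rough sketch of the "new reader per message" pattern, assuming the javax.xml.stream API from JSR-173 rather than the proprietary readers discussed here. The factory is created once and each message gets its own short-lived reader, so no parser pooling is needed. The class and method names are invented, and the sketch assumes single-threaded use because the API does not promise that the factory is thread-safe. It makes no claim about actual timings.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ReaderPerMessageSketch {

    // Created once and shared by all messages (single-threaded use assumed).
    private static final XMLInputFactory FACTORY = XMLInputFactory.newInstance();

    // Creates a fresh reader for one message and returns the local name of
    // its root element.
    static String rootName(String message) throws XMLStreamException {
        XMLStreamReader reader = FACTORY.createXMLStreamReader(new StringReader(message));
        try {
            reader.nextTag(); // advance from the document start to the root start tag
            return reader.getLocalName();
        } finally {
            reader.close(); // the reader is simply discarded, never reused
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical incoming messages.
        String[] messages = {
            "<Envelope><Body><ping/></Body></Envelope>",
            "<Envelope><Body><getQuote/></Body></Envelope>"
        };
        for (String message : messages) {
            System.out.println(rootName(message));
        }
    }
}
```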

And of course, I totally agree that streaming parsing is definitely NOT
suitable for every XML use case. For WS-Routing or some Web Service
management implementations that only process the SOAP header as service
context, a streaming pull parser is apparently the best choice.


browser edited on 2003-07-17 10:32
Author: vaduta
Subject: Re: The fastest way to process (parse) XML: Streaming Parser [Re: saijone]
Posted: 2003-07-18 16:34
http://www-106.ibm.com/developerworks/library/x-databdopt2/
You can refer to this article; JiBX uses XPP.
When processing a full document, large or small, SAX is faster than XPP.


Author: ray_linn (moderator)
Subject: Re: The fastest way to process (parse) XML: Streaming Parser [Re: saijone]
Posted: 2003-07-18 17:00
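If I understand correctly, this pull concept comes from MS, and it is one of the standard ways of processing XML in .NET (currently .NET supports only pull and DOM, not SAX's push style). The design idea at the time was to abandon SAX's event-handling style and make programs simpler.

For example, the complete C# code to search for the titles in book.xml is as follows: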
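using System;
using System.Xml;
...
XmlTextReader tr = new XmlTextReader(@".\book.xml");
while (!tr.EOF)
{
    // MoveToContent() skips whitespace, comments and processing instructions
    if (tr.MoveToContent() == XmlNodeType.Element && tr.Name == "title")
    {
        // Reads the element's text and moves past its end tag
        Console.WriteLine(tr.ReadElementString());
    }
    else
    {
        // Advance explicitly; MoveToContent() does not move off a content node
        tr.Read();
    }
}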
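
For comparison on the Java side, the same title search could look roughly like this. It is a sketch that assumes the javax.xml.stream API from JSR-173 as bundled with later JDKs, not XPP itself. The class name, the method name and the inline sample document are made up, and the sample stands in for book.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class TitleSearchSketch {

    // Collects the text of every <title> element in the document.
    static List<String> titles(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> found = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // getElementText() reads the text content and leaves the
                    // cursor on the matching end tag.
                    found.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return found;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical stand-in for the contents of book.xml.
        String xml = "<books><book><title>First</title><price>10</price></book>"
                   + "<book><title>Second</title></book></books>";
        for (String title : titles(xml)) {
            System.out.println(title);
        }
    }
}
```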


ray_linn edited on 2003-07-18 18:29
Author: popcorny (moderator)
Subject: Re: The fastest way to process (parse) XML: Streaming Parser [Re: ray_linn]
Posted: 2003-07-18 17:41
ray_linn wrote:
[...]
     if(tr.MoveToContent()=XmlNodeType.Element&&tr.Name="title")
[...]

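Cool...
But I don't know whether the pull parser was first proposed by M$ or by BEA.
At last year's JavaTwo, BEA's vendor session (presented by 勞虎) said
that they were the first to propose it.

Also... your code has a small bug:
the comparisons should use '==' rather than '='. :D :D :D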


Author: ray_linn (moderator)
Subject: Re: The fastest way to process (parse) XML: Streaming Parser [Re: saijone]
Posted: 2003-07-18 18:38
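Haha, fixing it right away.

When I was playing with VS.NET Beta 1 in August 2001, M$ was already handling XML this way,
so the earlier versions must have supported it as well.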


Author: saijone (moderator)
Subject: Re: The fastest way to process (parse) XML: Streaming Parser [Re: ray_linn]
Posted: 2003-07-18 20:29
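Wasn't pull parsing proposed by Alex (http://www.extreme.indiana.edu/~aslom/)?

Alex's paper: http://www.extreme.indiana.edu/xgws/papers/xml_push_pull.pdf

In Java, Alex's XPP should be the earliest, but it is true that the BEA-led JSR-173 was "the first to propose standardizing it". Alex has also taken part in the development of JSR-173.

P.S. In jax-rpc, the core of Java Web Services, the Serialization Framework, which has not
yet been made part of the formal specification, actually also paves the way for pull parsing.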


Author: ray_linn (moderator)
Subject: Re: The fastest way to process (parse) XML: Streaming Parser [Re: saijone]
Posted: 2003-07-18 21:56
Of course, he is No. 1 in Java, but only in Java. The first release candidate of XPP was released on 2001-08-15, but before that MS had already adopted the pull method as its standard XML parsing method. (Visual Studio .NET Beta 2 had already been released before June 2001, and Beta 1 was earlier still.)

But anyway, it is useless to argue about this, right? I hope the C# source code can help anyone who is interested in this understand how XPP works.


ray_linn edited on 2003-07-18 22:08
Author: mosman
Subject: Re: The fastest way to process (parse) XML: Streaming Parser [Re: saijone]
Posted: 2004-07-26 02:59
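Could someone explain, for the same operation, which kinds of XML perform better with which kind of parser?
That is:
For what kind of XML file does a pull parser perform better?
For what kind of XML file does a model parser perform better?
For what kind of XML file does a push parser perform better?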

