Ruby XML, XSLT and XPath Tutorial
This tutorial covers: what XML is, XML parser and API structure, parsing and creating XML with Ruby, DOM-like parsing, SAX-like parsing, XPath and Ruby, and XSLT and Ruby.
XML stands for eXtensible Markup Language.
XML is a subset of the Standard Generalized Markup Language (SGML) and is used to mark up electronic documents so that they carry structure.
It can be used to tag data and define data types, and it lets users define their own markup languages. It is well suited to transmission over the Web, providing a uniform way to describe and exchange structured data independently of applications or vendors.
For more information, please see our XML tutorial.
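XML parsers mainly come in two kinds: SAX and DOM. A SAX (stream) parser reports events, such as the start of a tag, to callbacks as it reads the document, while a DOM (tree) parser builds the whole document in memory as a tree of objects.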
Ruby can parse XML documents with the REXML library.
REXML is an XML toolkit written in pure Ruby that conforms to the XML 1.0 specification.
REXML has been included with Ruby since version 1.8 (from Ruby 3.0 onward it ships as a bundled gem).
The library is loaded with require 'rexml/document'.
All of its classes and methods are packaged in the REXML module.
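Because everything lives in the REXML module, classes can be referenced by their full names instead of including the module. The following is a minimal sketch, assuming a small inline XML string; the string and the variable names are illustrative only and are not part of the tutorial's data file.

```ruby
require 'rexml/document'

# Illustrative inline document; any well-formed XML string works here.
xml = '<collection shelf="New Arrivals"><movie title="Enemy Behind"/></collection>'

# Reference the class through the REXML namespace instead of `include REXML`.
doc = REXML::Document.new(xml)

root_name   = doc.root.name                                    # name of the root element
shelf       = doc.root.attributes["shelf"]                     # attribute on the root
first_title = doc.root.elements["movie"].attributes["title"]   # first <movie> child

puts root_name
puts shelf
puts first_title
```

Using the fully qualified names avoids adding REXML's constants to the top-level namespace, which can matter in larger programs.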
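REXML has the following advantages over other parsers: it is written entirely in Ruby, it ships with Ruby so no separate installation is needed, and it supports both tree-based (DOM-like) and stream-based (SAX-like) parsing as well as XPath.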
The following XML is used as input for the examples; save it as movies.xml:
    <collection shelf="New Arrivals">
      <movie title="Enemy Behind">
        <type>War, Thriller</type>
        <format>DVD</format>
        <year>2003</year>
        <rating>PG</rating>
        <stars>10</stars>
        <description>Talk about a US-Japan war</description>
      </movie>
      <movie title="Transformers">
        <type>Anime, Science Fiction</type>
        <format>DVD</format>
        <year>1989</year>
        <rating>R</rating>
        <stars>8</stars>
        <description>A schientific fiction</description>
      </movie>
      <movie title="Trigun">
        <type>Anime, Action</type>
        <format>DVD</format>
        <episodes>4</episodes>
        <rating>PG</rating>
        <stars>10</stars>
        <description>Vash the Stampede!</description>
      </movie>
      <movie title="Ishtar">
        <type>Comedy</type>
        <format>VHS</format>
        <rating>PG</rating>
        <stars>2</stars>
        <description>Viewable boredom</description>
      </movie>
    </collection>
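REXML can also create XML, not only parse it. The following is a rough sketch of building part of the same structure in code, assuming only the first movie and a few of its child elements; it is an illustration rather than a replacement for the hand-written file.

```ruby
require 'rexml/document'

# Start with an empty document and add the root element with its attribute.
doc = REXML::Document.new
root = doc.add_element("collection", { "shelf" => "New Arrivals" })

# Add one <movie> with a few children; the values mirror the sample data.
movie = root.add_element("movie", { "title" => "Enemy Behind" })
movie.add_element("type").text = "War, Thriller"
movie.add_element("format").text = "DVD"
movie.add_element("year").text = "2003"

# Serialize into a string with two-space indentation.
output = String.new
doc.write(output, 2)
puts output
```

The indented serializer places element text on its own lines, so the output is laid out a little differently from the hand-formatted sample; pass the string to File.write to save it.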
Let's start by parsing the XML data. First, require the rexml/document library; for convenience, the REXML module is usually included into the top-level namespace:
    #!/usr/bin/ruby -w

    require 'rexml/document'
    include REXML

    xmlfile = File.new("movies.xml")
    xmldoc = Document.new(xmlfile)

    # Get the root element
    root = xmldoc.root
    puts "Root element : " + root.attributes["shelf"]

    # Print the title of every movie
    xmldoc.elements.each("collection/movie"){
      |e| puts "Movie Title : " + e.attributes["title"]
    }

    # Print the type of every movie
    xmldoc.elements.each("collection/movie/type") {
      |e| puts "Movie Type : " + e.text
    }

    # Print the description of every movie
    xmldoc.elements.each("collection/movie/description") {
      |e| puts "Movie Description : " + e.text
    }
The example above produces the following output:
    Root element : New Arrivals
    Movie Title : Enemy Behind
    Movie Title : Transformers
    Movie Title : Trigun
    Movie Title : Ishtar
    Movie Type : War, Thriller
    Movie Type : Anime, Science Fiction
    Movie Type : Anime, Action
    Movie Type : Comedy
    Movie Description : Talk about a US-Japan war
    Movie Description : A schientific fiction
    Movie Description : Vash the Stampede!
    Movie Description : Viewable boredom

SAX-like Parsing:
The same data file, movies.xml, can also be processed with a stream parser. Stream (SAX-like) parsing is not recommended for a file this small, but the following simple example shows how it works:
    #!/usr/bin/ruby -w

    require 'rexml/document'
    require 'rexml/streamlistener'
    include REXML

    class MyListener
      include REXML::StreamListener

      # Called for every opening tag with its name and attribute hash
      def tag_start(*args)
        puts "tag_start: #{args.map {|x| x.inspect}.join(', ')}"
      end

      # Called for every run of character data between tags
      def text(data)
        # ^ and $ match per line in Ruby, so this skips the indentation
        # between tags as well as single-word values such as "DVD"
        return if data =~ /^\w*$/
        abbrev = data[0..40] + (data.length > 40 ? "..." : "")
        puts " text : #{abbrev.inspect}"
      end
    end

    list = MyListener.new
    xmlfile = File.new("movies.xml")
    Document.parse_stream(xmlfile, list)
This produces the following output:
    tag_start: "collection", {"shelf"=>"New Arrivals"}
    tag_start: "movie", {"title"=>"Enemy Behind"}
    tag_start: "type", {}
     text : "War, Thriller"
    tag_start: "format", {}
    tag_start: "year", {}
    tag_start: "rating", {}
    tag_start: "stars", {}
    tag_start: "description", {}
     text : "Talk about a US-Japan war"
    tag_start: "movie", {"title"=>"Transformers"}
    tag_start: "type", {}
     text : "Anime, Science Fiction"
    tag_start: "format", {}
    tag_start: "year", {}
    tag_start: "rating", {}
    tag_start: "stars", {}
    tag_start: "description", {}
     text : "A schientific fiction"
    tag_start: "movie", {"title"=>"Trigun"}
    tag_start: "type", {}
     text : "Anime, Action"
    tag_start: "format", {}
    tag_start: "episodes", {}
    tag_start: "rating", {}
    tag_start: "stars", {}
    tag_start: "description", {}
     text : "Vash the Stampede!"
    tag_start: "movie", {"title"=>"Ishtar"}
    tag_start: "type", {}
    tag_start: "format", {}
    tag_start: "rating", {}
    tag_start: "stars", {}
    tag_start: "description", {}
     text : "Viewable boredom"
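A stream listener is usually more useful when it collects data rather than printing every event. The sketch below assumes a hypothetical TitleCollector class and a shortened inline copy of the movie data; neither name nor data comes from the original example.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener that records the title attribute of each <movie>.
class TitleCollector
  include REXML::StreamListener

  attr_reader :titles

  def initialize
    @titles = []
  end

  # Called once per opening tag with the tag name and an attribute hash.
  def tag_start(name, attrs)
    @titles << attrs["title"] if name == "movie"
  end
end

# Shortened inline sample so the sketch runs without movies.xml.
xml = <<~XML
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind"><format>DVD</format></movie>
    <movie title="Ishtar"><format>VHS</format></movie>
  </collection>
XML

collector = TitleCollector.new
REXML::Document.parse_stream(xml, collector)
p collector.titles   # prints ["Enemy Behind", "Ishtar"]
```

Only the callbacks that matter need to be defined; REXML::StreamListener supplies empty defaults for the rest, and no document tree is built in memory.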
Another way to view XML is XPath, a language for finding information in an XML document (see: XPath Tutorial).
XPath, the XML Path Language, is a language for addressing parts of an XML document. It models the document as a tree and provides the ability to navigate to and select nodes in that tree.
Ruby supports XPath through REXML's XPath class, which operates on a parsed tree (Document Object Model).
    #!/usr/bin/ruby -w

    require 'rexml/document'
    include REXML

    xmlfile = File.new("movies.xml")
    xmldoc = Document.new(xmlfile)

    # Info for the first movie found
    movie = XPath.first(xmldoc, "//movie")
    p movie

    # Print all the movie types
    XPath.each(xmldoc, "//type") { |e| puts e.text }

    # Get all the movie formats as an array
    names = XPath.match(xmldoc, "//format").map {|x| x.text }
    p names
The above example output is:
    <movie title='Enemy Behind'> ... </>
    War, Thriller
    Anime, Science Fiction
    Anime, Action
    Comedy
    ["DVD", "DVD", "DVD", "VHS"]
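XPath expressions can also filter nodes with predicates. The following sketch assumes a shortened inline copy of the movie data and shows one attribute predicate and one child-element predicate; the variable names are illustrative.

```ruby
require 'rexml/document'

# Shortened inline sample so the sketch runs without movies.xml.
xml = <<~XML
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind"><type>War, Thriller</type><format>DVD</format></movie>
    <movie title="Ishtar"><type>Comedy</type><format>VHS</format></movie>
  </collection>
XML

doc = REXML::Document.new(xml)

# Attribute predicate: the <type> of the movie whose title is "Ishtar".
ishtar_type = REXML::XPath.first(doc, "//movie[@title='Ishtar']/type").text

# Child-element predicate: titles of all movies whose <format> is "DVD".
dvd_titles = REXML::XPath.match(doc, "//movie[format='DVD']").map { |m| m.attributes["title"] }

puts ishtar_type
p dvd_titles
```

Filtering inside the expression keeps the selection logic in one place instead of spreading it across Ruby conditionals.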
Two XSLT processors are available for Ruby. A brief description of each is given below:
Ruby-Sablotron: This processor is written and maintained by Masayoshi Takahashi. It is written mainly for the Linux operating system and depends on several external libraries.
You can find these libraries via Ruby-Sablotron.
XSLT4R: XSLT4R needs XMLScan to operate. XMLScan is included in the XSLT4R archive and is also a 100% Ruby module. Both modules can be installed using the standard Ruby installation method (i.e., ruby install.rb).
XSLT4R is run from the command line as follows:
    ruby xslt.rb stylesheet.xsl document.xml [arguments]
To use XSLT4R from within your application, require xslt and pass in the parameters you need. For example:
    require "xslt"

    # Read both files as strings (readlines(...).to_s no longer joins
    # the lines on current Ruby versions)
    stylesheet = File.read("stylesheet.xsl")
    xml_doc = File.read("document.xml")
    arguments = { 'image_dir' => '/....' }

    sheet = XSLT::Stylesheet.new( stylesheet, arguments )

    # output to StdOut
    sheet.apply( xml_doc )

    # output to 'str'
    str = ""
    sheet.output = [ str ]
    sheet.apply( xml_doc )