Commit 6f8e942

Convert README to markdown, remove benchmarks, update documentation links.
1 parent 459042e commit 6f8e942

1 file changed

Lines changed: 164 additions & 208 deletions

README.md

@@ -1,208 +1,164 @@
1-
= LibXML Ruby
2-
3-
== Overview
4-
The libxml gem provides Ruby language bindings for GNOME's Libxml2
5-
XML toolkit. It is free software, released under the MIT License.
6-
7-
We think libxml-ruby is the best XML library for Ruby because:
8-
9-
* Speed - Its much faster than REXML and Hpricot
10-
* Features - It provides an amazing number of featues
11-
* Conformance - It passes all 1800+ tests from the OASIS XML Tests Suite
12-
13-
== Requirements
14-
libxml-ruby requires Ruby 3.0.0 or higher. It depends on libxml2 to
15-
function properly. libxml2, in turn, depends on:
16-
17-
* libm (math routines: very standard)
18-
* libz (zlib)
19-
* libiconv
20-
21-
If you are running Linux or Unix you'll need a C compiler so the
22-
extension can be compiled when it is installed. If you are running
23-
Windows, then install the x64-mingw-ucr gem or build it yourself using (Ruby
24-
for Windows)[https://rubyinstaller.org/] or directly with msys2[https://msys2.github.io/]
25-
and ucrt64.
26-
27-
== Installation
28-
The easiest way to install libxml-ruby is via RubyGems. To install:
29-
30-
<tt>gem install libxml-ruby</tt>
31-
32-
If the extension compile process cannot find libxml2, you may need to indicate
33-
the location of the libxml2 configuration utility as it is used to find the
34-
required header and include files. (If you need to indicate a location for the
35-
libxml2 library or header files different than reported by <tt>xml2-config</tt>,
36-
see the additional configuration options.)
37-
38-
This may be done with RubyGems:
39-
40-
<tt>gem install libxml-ruby -- --with-xml2-dir=/path/to/xml2-config</tt>
41-
42-
Or bundler:
43-
44-
<tt>bundle config build.libxml-ruby --with-xml2-config=/path/to/xml2-config</tt>
45-
46-
<tt>bundle install libxml-ruby</tt>
47-
48-
If you are running Windows, then install the libxml-ruby-x64-mingw32 gem.
49-
The gem includes prebuilt extensions for Ruby 3.2 and 3.3.
50-
51-
The gem also includes a Microsoft VC++ solution and XCode project - these
52-
are very useful for debugging.
53-
54-
libxml-ruby's source codes lives on GitHub[https://github.com/xml4r/libxml-ruby].
55-
56-
== Getting Started
57-
Using libxml is easy. First decide what parser you want to use:
58-
59-
* Generally you'll want to use the LibXML::XML::Parser which provides a tree based API.
60-
* For larger documents that don't fit into memory, or if you prefer an input based API, use the LibXML::XML::Reader.
61-
* To parse HTML files use LibXML::XML::HTMLParser.
62-
* If you are masochistic, then use the LibXML::XML::SaxParser, which provides a callback API.
63-
64-
Once you have chosen a parser, choose a datasource. Libxml can parse files, strings, URIs
65-
and IO streams. For each data source you can specify an LibXML::XML::Encoding, a base uri and
66-
various parser options. For more information, refer the LibXML::XML::Parser.document,
67-
LibXML::XML::Parser.file, LibXML::XML::Parser.io or LibXML:::XML::Parser.string methods (the
68-
same methods are defined on all four parser classes).
69-
70-
== Advanced Functionality
71-
Beyond the basics of parsing and processing XML and HTML documents,
72-
libxml provides a wealth of additional functionality.
73-
74-
Most commonly, you'll want to use its LibXML::XML::XPath support, which makes
75-
it easy to find data inside an XML document. Although not as popular,
76-
LibXML::XML::XPointer provides another API for finding data inside an XML document.
77-
78-
Often times you'll need to validate data before processing it. For example,
79-
if you accept user generated content submitted over the Web, you'll
80-
want to verify that it does not contain malicious code such as embedded scripts.
81-
This can be done using libxml's powerful set of validators:
82-
83-
* DTDs (LibXML::XML::Dtd)
84-
* Relax Schemas (LibXML::XML::RelaxNG)
85-
* XML Schema (LibXML::XML::Schema)
86-
87-
Finally, if you'd like to use XSL Transformations to process data, then install
88-
the {libxslt gem}[https://github.com/xml4r/libxslt-rubygem].
89-
90-
== Usage
91-
For information about using libxml-ruby please refer to its
92-
documentation[https://xml4r.github.io/libxml-ruby]. Some tutorials are also
93-
available[https://github.com/xml4r/libxml-ruby/wiki].
94-
95-
All libxml classes are in the LibXML::XML module. The easiest
96-
way to use libxml is to <tt>require 'xml'</tt>. This will mixin
97-
the LibXML module into the global namespace, allowing you to
98-
write code like this:
99-
100-
require 'xml'
101-
document = XML::Document.new
102-
103-
However, when creating an application or library you plan to
104-
redistribute, it is best to not add the LibXML module to the global
105-
namespace, in which case you can either write your code like this:
106-
107-
require 'libxml'
108-
document = LibXML::XML::Document.new
109-
110-
Or you can utilize a namespace for your own work and include LibXML into it.
111-
For example:
112-
113-
require 'libxml'
114-
115-
module MyApplication
116-
include LibXML
117-
118-
class MyClass
119-
def some_method
120-
document = XML::Document.new
121-
end
122-
end
123-
end
124-
125-
For simplicity's sake, the documentation uses the xml module in its examples.
126-
127-
== Tests
128-
129-
To run tests you first need to build the shared libary:
130-
131-
rake compile
132-
133-
Once you have build the shared libary, you can then run tests using rake:
134-
135-
rake test
136-
137-
+Build status: {rdoc-image:https://github.com/xml4r/libxml-ruby/actions/workflows/mri.yml/badge.svg}[https://github.com/xml4r/libxml-ruby/actions/workflows/mri.yml]
138-
139-
== Performance
140-
141-
In addition to being feature rich and conformation, the main reason
142-
people use libxml-ruby is for performance. Here are the results
143-
of a couple simple benchmarks recently blogged about on the
144-
Web (you can find them in the benchmark directory of the
145-
libxml distribution).
146-
147-
From http://depixelate.com/2008/4/23/ruby-xml-parsing-benchmarks
148-
149-
user system total real
150-
libxml 0.032000 0.000000 0.032000 ( 0.031000)
151-
Hpricot 0.640000 0.031000 0.671000 ( 0.890000)
152-
REXML 1.813000 0.047000 1.860000 ( 2.031000)
153-
154-
From https://svn.concord.org/svn/projects/trunk/common/ruby/xml_benchmarks/
155-
156-
user system total real
157-
libxml 0.641000 0.031000 0.672000 ( 0.672000)
158-
hpricot 5.359000 0.062000 5.421000 ( 5.516000)
159-
rexml 22.859000 0.047000 22.906000 ( 23.203000)
160-
161-
162-
== Documentation
163-
Documentation is available via rdoc, and is installed automatically with the
164-
gem.
165-
166-
libxml-ruby's {online
167-
documentation}[https://xml4r.github.io/libxml-ruby/rdoc/index.html] is generated
168-
using Hanna, which is a development gem dependency.
169-
170-
Note that older versions of Rdoc, which ship with Ruby 1.8.x, will report
171-
a number of errors. To avoid them, install Rdoc 2.1 or higher. Once you have
172-
installed the gem, you'll have to disable the version of Rdoc that Ruby 1.8.x
173-
includes. An easy way to do that is rename the directory
174-
<tt>ruby/lib/ruby/1.8/rdoc</tt> to
175-
<tt>ruby/lib/ruby/1.8/rdoc_old</tt>.
176-
177-
== Support
178-
If you have any questions about using libxml-ruby, please report an issue
179-
on GitHub[https://github.com/xml4r/libxml-ruby/issues].
180-
181-
== Memory Management
182-
libxml-ruby automatically manages memory associated with the
183-
underlying libxml2 library. The bindings create a one-to-one mapping between
184-
Ruby objects and libxml documents and libxml parent nodes (ie, nodes that do not
185-
have a parent and do not belong to a document). In these cases,
186-
the bindings manage the memory. They do this by installing a free
187-
function and storing a back pointer to the Ruby object from the xmlnode
188-
using the _private member on libxml structures. When the Ruby object
189-
goes out of scope, the underlying libxml structure is freed. Libxml
190-
itself then frees all child nodes (recursively).
191-
192-
For all other nodes (the vast majority), the bindings create temporary
193-
Ruby objects that get freed once they go out of scope. Thus there can be
194-
more than one Ruby object pointing to the same xml node. To mostly hide
195-
this from a programmer on the Ruby side, the <tt>#eql?</tt> and <tt>#==</tt> methods are
196-
overriden to check if two Ruby objects wrap the same xmlnode. If they do,
197-
then the methods return true. During the mark phase, each of these temporary
198-
objects marks its owning document, thereby keeping the Ruby document object
199-
alive and thus the xmldoc tree.
200-
201-
In the sweep phase of the garbage collector, or when a program ends,
202-
there is no order to how Ruby objects are freed. In fact, the Ruby document
203-
object is almost always freed before any Ruby objects that wrap child nodes.
204-
However, this is ok because those Ruby objects do not have a free function
205-
and are no longer in scope (since if they were the document would not be freed).
206-
207-
== License
208-
See LICENSE for license information.
1+
# LibXML Ruby
2+
3+
## Overview
4+
The libxml gem provides Ruby language bindings for GNOME's Libxml2
5+
XML toolkit. It is free software, released under the MIT License.
6+
7+
We think libxml-ruby is the best XML library for Ruby because:
8+
9+
* Speed - It's much faster than REXML
10+
* Features - It provides an amazing number of features
11+
* Conformance - It passes all 1800+ tests from the OASIS XML Tests Suite
12+
13+
## Requirements
14+
libxml-ruby requires Ruby 3.2 or higher. It depends on libxml2 to
15+
function properly. libxml2, in turn, depends on:
16+
17+
* libm (math routines: very standard)
18+
* libz (zlib)
19+
* libiconv
20+
21+
If you are running Linux or Unix you'll need a C compiler so the
22+
extension can be compiled when it is installed. If you are running
23+
Windows, then install the x64-mingw-ucr gem or build it yourself using
24+
[Ruby for Windows](https://rubyinstaller.org/) or directly with
25+
[msys2](https://msys2.github.io/) and ucrt64.
26+
27+
## Installation
28+
The easiest way to install libxml-ruby is via RubyGems. To install:
29+
30+
```
31+
gem install libxml-ruby
32+
```
33+
34+
If the extension compile process cannot find libxml2, you may need to indicate
35+
the location of the libxml2 configuration utility as it is used to find the
36+
required header and include files. (If you need to indicate a location for the
37+
libxml2 library or header files different than reported by `xml2-config`,
38+
see the additional configuration options.)
39+
40+
This may be done with RubyGems:
41+
42+
```
43+
gem install libxml-ruby -- --with-xml2-config=/path/to/xml2-config
44+
```
45+
46+
Or bundler:
47+
48+
```
49+
bundle config build.libxml-ruby --with-xml2-config=/path/to/xml2-config
50+
bundle install libxml-ruby
51+
```
52+
53+
If you are running Windows, then install the libxml-ruby-x64-mingw32 gem.
54+
The gem includes prebuilt extensions for Ruby 3.2 and 3.3.
55+
56+
The gem also includes a Microsoft VC++ solution and XCode project - these
57+
are very useful for debugging.
58+
59+
libxml-ruby's source code lives on [GitHub](https://github.com/xml4r/libxml-ruby).
60+
61+
## Getting Started
62+
Using libxml is easy. First decide what parser you want to use:
63+
64+
* Generally you'll want to use the `LibXML::XML::Parser` which provides a tree based API.
65+
* For larger documents that don't fit into memory, or if you prefer an input based API, use the `LibXML::XML::Reader`.
66+
* To parse HTML files use `LibXML::XML::HTMLParser`.
67+
* If you are masochistic, then use the `LibXML::XML::SaxParser`, which provides a callback API.
68+
69+
Once you have chosen a parser, choose a datasource. Libxml can parse files, strings, URIs
70+
and IO streams. For each data source you can specify a `LibXML::XML::Encoding`, a base URI and
71+
various parser options. For more information, refer to the `LibXML::XML::Parser.document`,
72+
`LibXML::XML::Parser.file`, `LibXML::XML::Parser.io` or `LibXML::XML::Parser.string` methods (the
73+
same methods are defined on all four parser classes).
74+
75+
## Advanced Functionality
76+
Beyond the basics of parsing and processing XML and HTML documents,
77+
libxml provides a wealth of additional functionality.
78+
79+
Most commonly, you'll want to use its `LibXML::XML::XPath` support, which makes
80+
it easy to find data inside an XML document. Although not as popular,
81+
`LibXML::XML::XPointer` provides another API for finding data inside an XML document.
82+
83+
Often you'll need to validate data before processing it. For example,
84+
if you accept user generated content submitted over the Web, you'll
85+
want to verify that it does not contain malicious code such as embedded scripts.
86+
This can be done using libxml's powerful set of validators:
87+
88+
* DTDs (`LibXML::XML::Dtd`)
89+
* RELAX NG schemas (`LibXML::XML::RelaxNG`)
90+
* XML Schema (`LibXML::XML::Schema`)
91+
92+
Finally, if you'd like to use XSL Transformations to process data, then install
93+
the [libxslt gem](https://github.com/xml4r/libxslt-ruby).
94+
95+
## Usage
96+
For information about using libxml-ruby please refer to its
97+
[documentation](https://xml4r.github.io/libxml-ruby/).
98+
99+
All libxml classes are in the `LibXML::XML` module. The easiest
100+
way to use libxml is to `require 'xml'`. This will mix in
101+
the LibXML module into the global namespace, allowing you to
102+
write code like this:
103+
104+
```ruby
105+
require 'xml'
106+
document = XML::Document.new
107+
```
108+
109+
However, when creating an application or library you plan to
110+
redistribute, it is best to not add the LibXML module to the global
111+
namespace, in which case you can either write your code like this:
112+
113+
```ruby
114+
require 'libxml'
115+
document = LibXML::XML::Document.new
116+
```
117+
118+
Or you can utilize a namespace for your own work and include LibXML into it.
119+
For example:
120+
121+
```ruby
122+
require 'libxml'
123+
124+
module MyApplication
125+
include LibXML
126+
127+
class MyClass
128+
def some_method
129+
document = XML::Document.new
130+
end
131+
end
132+
end
133+
```
134+
135+
For simplicity's sake, the documentation uses the xml module in its examples.
136+
137+
## Tests
138+
139+
To run tests you first need to build the shared library:
140+
141+
```
142+
rake compile
143+
```
144+
145+
Once you have built the shared library, you can then run tests using rake:
146+
147+
```
148+
rake test
149+
```
150+
151+
[![Build Status](https://github.com/xml4r/libxml-ruby/actions/workflows/mri.yml/badge.svg)](https://github.com/xml4r/libxml-ruby/actions/workflows/mri.yml)
152+
153+
## Documentation
154+
Documentation is available at [xml4r.github.io/libxml-ruby](https://xml4r.github.io/libxml-ruby/).
155+
156+
API reference documentation is generated via rdoc and is available at
157+
[xml4r.github.io/libxml-ruby/reference](https://xml4r.github.io/libxml-ruby/reference/).
158+
159+
## Support
160+
If you have any questions about using libxml-ruby, please report an issue
161+
on [GitHub](https://github.com/xml4r/libxml-ruby/issues).
162+
163+
## License
164+
See [LICENSE](LICENSE) for license information.
