Commit e86dc23

XML-processor implementation (#1)
1 parent ac37369 commit e86dc23

35 files changed

Lines changed: 1414 additions & 3 deletions

.github/workflows/ci.yml

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
name: CI

on: [ push, pull_request ]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        env:
          # Versions are quoted so YAML does not parse 8.0 as the number 8.
          - { php: '7.4', phpunit: 9 }
          # PHPUnit 10 requires PHP >= 8.1, so PHP 8.0 stays on PHPUnit 9.
          - { php: '8.0', phpunit: 9 }
          - { php: '8.1', phpunit: 10 }
          - { php: '8.2', phpunit: 10 }
    steps:
      - uses: actions/checkout@v3
      - name: Composer cache
        uses: actions/cache@v3
        with:
          path: "vendor"
          key: ${{ runner.os }}-${{ matrix.env.php }}-composer-${{ hashFiles('composer.json') }}
      - name: Setup PHP
        uses: shivammathur/setup-php@v2
        with:
          php-version: ${{ matrix.env.php }}
      - name: Update Composer
        run: |
          sudo composer self-update
          composer --version
      - name: Validate composer.json and composer.lock
        run: composer validate
      - name: Install dependencies
        run: composer install -o --no-interaction --no-suggest --prefer-dist

      - name: PHPUnit tests
        uses: php-actions/phpunit@v3
        env:
          XDEBUG_MODE: coverage
        with:
          coverage_cobertura: "cobertura.xml"
          php_version: "${{ matrix.env.php }}"
          php_extensions: "xdebug"
          version: "${{ matrix.env.phpunit }}"
          configuration: "phpunit.xml"
      - name: "Code Coverage Report"
        if: "matrix.env.php == '8.2' && github.event_name == 'pull_request'"
        uses: irongut/CodeCoverageSummary@v1.3.0
        with:
          filename: cobertura.xml
          badge: true
          fail_below_min: true
          format: markdown
          hide_branch_rate: false
          hide_complexity: true
          indicators: true
          output: both
          thresholds: '60 80'
      - name: Add Coverage PR Comment
        if: "matrix.env.php == '8.2' && github.event_name == 'pull_request'"
        uses: marocchino/sticky-pull-request-comment@v2
        with:
          recreate: true
          path: code-coverage-results.md

      - name: Behat tests
        uses: php-actions/behat@master
        with:
          php_version: "${{ matrix.env.php }}"

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
.idea/
vendor/
composer.lock

READEME.md

Lines changed: 0 additions & 3 deletions
This file was deleted.

README.md

Lines changed: 138 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,138 @@
# XML-Processor

PHP XML-Processor based on XMLReader.

The [`XmlProcessor`] walks through the XML file with `\XMLReader` and fires an event for each node the reader visits.
This makes it easy to process huge XML files with low memory usage.

## Events

The following events are available:

| Event         | XMLReader NodeType<br>event const    | fires on                          | callback arguments       |
|---------------|--------------------------------------|-----------------------------------|--------------------------|
| `openFile`    | `XmlProcessor::EVENT_OPEN_FILE`      | after open file before first read | [`NodeProcessorContext`] |
| `endOfFile`   | `XmlProcessor::EVENT_END_OF_FILE`    | after last read before close      | [`NodeProcessorContext`] |
| `NodeType_0`  | `\XMLReader::NONE`                   | No node type                      | [`NodeProcessorContext`] |
| `NodeType_1`  | `\XMLReader::ELEMENT`                | Start element                     | [`OpenContext`]          |
| `NodeType_2`  | `\XMLReader::ATTRIBUTE`              | Attribute node                    | [`NodeProcessorContext`] |
| `NodeType_3`  | `\XMLReader::TEXT`                   | Text node                         | [`TextContext`]          |
| `NodeType_4`  | `\XMLReader::CDATA`                  | CDATA node                        | [`NodeProcessorContext`] |
| `NodeType_5`  | `\XMLReader::ENTITY_REF`             | Entity Reference node             | [`NodeProcessorContext`] |
| `NodeType_6`  | `\XMLReader::ENTITY`                 | Entity Declaration node           | [`NodeProcessorContext`] |
| `NodeType_7`  | `\XMLReader::PI`                     | Processing Instruction node       | [`NodeProcessorContext`] |
| `NodeType_8`  | `\XMLReader::COMMENT`                | Comment node                      | [`NodeProcessorContext`] |
| `NodeType_9`  | `\XMLReader::DOC`                    | Document node                     | [`NodeProcessorContext`] |
| `NodeType_10` | `\XMLReader::DOC_TYPE`               | Document Type node                | [`NodeProcessorContext`] |
| `NodeType_11` | `\XMLReader::DOC_FRAGMENT`           | Document Fragment node            | [`NodeProcessorContext`] |
| `NodeType_12` | `\XMLReader::NOTATION`               | Notation node                     | [`NodeProcessorContext`] |
| `NodeType_13` | `\XMLReader::WHITESPACE`             | Whitespace node                   | [`NodeProcessorContext`] |
| `NodeType_14` | `\XMLReader::SIGNIFICANT_WHITESPACE` | Significant Whitespace node       | [`NodeProcessorContext`] |
| `NodeType_15` | `\XMLReader::END_ELEMENT`            | End Element                       | [`CloseContext`]         |
| `NodeType_16` | `\XMLReader::END_ENTITY`             | End Entity                        | [`NodeProcessorContext`] |
| `NodeType_17` | `\XMLReader::XML_DECLARATION`        | XML Declaration node              | [`NodeProcessorContext`] |

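The `NodeType_N` event names in the table correspond to the integer values of the `\XMLReader` node type constants. The following sketch is not the library's implementation: it uses only PHP's built-in `\XMLReader` to show which event names a small document would map to, in streaming order. The helper name `eventNamesFor()` is hypothetical and exists only for this illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): reads an XML string node by
 * node and returns the event name each node would map to in the table above.
 */
function eventNamesFor(string $xml): array
{
    $reader = new XMLReader();
    $reader->XML($xml);

    $events = [];
    // read() advances exactly one node per call; the document is never
    // loaded as a whole.
    while ($reader->read()) {
        $events[] = 'NodeType_' . $reader->nodeType;
    }
    $reader->close();

    return $events;
}

$events = eventNamesFor('<root><value>foo</value></root>');
echo implode("\n", $events), "\n";
```

For this input the reader visits two start elements, one text node, and two end elements, so only `NodeType_1`, `NodeType_3`, and `NodeType_15` appear.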
## How to use

To process an XML file, you need to create a node processor class.
It has to implement the [`NodeProcessorInterface`].

In `NodeProcessorInterface::getSubscribedEvents`, you define which events you want to react to.

For easier use, you can extend the [`AbstractNodeProcessor`] class and implement one of the following interfaces:

| Interface                       | Description                   |
|---------------------------------|-------------------------------|
| [`OpenNodeProcessorInterface`]  | To react to opening tags      |
| [`CloseNodeProcessorInterface`] | To react to closing tags      |
| [`TextNodeProcessorInterface`]  | To react to text between tags |

## Example

To extract all values of `<value>` nodes of the following XML:

**file.xml**

```xml
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <value>foo</value>
    <value>bar</value>
    <value>baz</value>
</root>
```

Create a simple node processor class that collects the values of all `<value>` nodes.

**OpenTestNodeProcessor.php**

```php
<?php
use Netlogix\XmlProcessor\NodeProcessor\AbstractNodeProcessor;
use Netlogix\XmlProcessor\NodeProcessor\OpenNodeProcessorInterface;
use Netlogix\XmlProcessor\NodeProcessor\Context\OpenContext;

class OpenValueNodeProcessor extends AbstractNodeProcessor implements OpenNodeProcessorInterface
{
    const NODE_PATH = 'value';
    private $nodeValues = [];

    public function openElement(OpenContext $context)
    {
        $xml = $context->getXmlProcessorContext()->getXmlReader();
        // expand() copies the current element into a DOM node, so nodeValue holds its text content.
        $node = $xml->expand();
        $this->nodeValues[] = $node->nodeValue;
    }

    public function getNodeValues(): array
    {
        return $this->nodeValues;
    }
}
```

Create a new instance of the [`XmlProcessor`] class and attach the new node processor.

```php
<?php
use Netlogix\XmlProcessor\XmlProcessor;

// Load the autoloader first so the classes the node processor extends can be resolved.
require_once 'vendor/autoload.php';
require_once 'OpenTestNodeProcessor.php';

$valueNodeProcessor = new OpenValueNodeProcessor();
$processor = new XmlProcessor([$valueNodeProcessor]);
$processor->processFile('file.xml');

var_dump($valueNodeProcessor->getNodeValues());
```

**result:**

```php
array(3) {
  [0]=>
  string(3) "foo"
  [1]=>
  string(3) "bar"
  [2]=>
  string(3) "baz"
}
```

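To see what `expand()` and `nodeValue` contribute in `openElement()`, here is a minimal standalone sketch that collects the same values with PHP's built-in `\XMLReader` alone (`expand()` additionally needs the DOM extension). It bypasses the library entirely, and the function name `collectValues()` is made up for this illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): returns the text content
 * of every element with the given name.
 */
function collectValues(string $xml, string $elementName): array
{
    $reader = new XMLReader();
    $reader->XML($xml);

    $values = [];
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $elementName) {
            // expand() copies only the current element and its subtree into a DOMNode.
            $node = $reader->expand();
            if ($node !== false) {
                $values[] = $node->nodeValue;
            }
        }
    }
    $reader->close();

    return $values;
}

$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <value>foo</value>
    <value>bar</value>
    <value>baz</value>
</root>
XML;

var_dump(collectValues($xml, 'value'));
```

The node processor in the example does the same work inside `openElement()`, while the library takes care of the read loop and event dispatch.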
[`XmlProcessor`]: src/XmlProcessor.php

[`NodeProcessorInterface`]: src/NodeProcessor/NodeProcessorInterface.php

[`AbstractNodeProcessor`]: src/NodeProcessor/AbstractNodeProcessor.php

[`OpenNodeProcessorInterface`]: src/NodeProcessor/OpenNodeProcessorInterface.php

[`CloseNodeProcessorInterface`]: src/NodeProcessor/CloseNodeProcessorInterface.php

[`TextNodeProcessorInterface`]: src/NodeProcessor/TextNodeProcessorInterface.php

[`NodeProcessorContext`]: src/NodeProcessor/Context/NodeProcessorContext.php

[`OpenContext`]: src/NodeProcessor/Context/OpenContext.php

[`TextContext`]: src/NodeProcessor/Context/TextContext.php

[`CloseContext`]: src/NodeProcessor/Context/CloseContext.php

composer.json

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
{
    "name": "netlogix/xml-processor",
    "description": "PHP XML-Processor based on XMLReader",
    "license": "MIT",
    "require": {
        "php": "^7.4 || ^8.0",
        "ext-xmlreader": "*"
    },
    "require-dev": {
        "behat/behat": "^3.12.0",
        "phpunit/phpunit": "^9.6.6"
    },
    "autoload": {
        "psr-4": {
            "Netlogix\\XmlProcessor\\": "src/"
        }
    },
    "autoload-dev": {
        "psr-4": {
            "Netlogix\\XmlProcessor\\Tests\\": "tests/",
            "Netlogix\\XmlProcessor\\Behat\\": "features/"
        }
    },
    "prefer-stable": true
}
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
Feature: run XMLProcessor with TextNodeProcessor

  Scenario: run XMLProcessor
    Given initialize XMLProcessor with "Netlogix\XmlProcessor\Behat\NodeProcessor\ArrayNodeProcessor"
    When process xml with current XMLProcessor instance:
      """
      <root name="main">
        <product id="1">foo</product>
        <product id="2">bar</product>
        <category id="1">
          <category id="2">
            <product id="3">baz</product>
          </category>
          <category><bar/></category>
        </category>
      </root>
      """
    Then NodeProcessor "Netlogix\XmlProcessor\Behat\NodeProcessor\ArrayNodeProcessor" should return:
      """
      [
        {
          "node": "root",
          "level": 1,
          "attributes": {
            "name": "main"
          },
          "children": [
            {
              "node": "product",
              "level": 2,
              "attributes": {
                "id": "1"
              },
              "children": [],
              "text": "foo"
            },
            {
              "node": "product",
              "level": 2,
              "attributes": {
                "id": "2"
              },
              "children": [],
              "text": "bar"
            },
            {
              "node": "category",
              "level": 2,
              "attributes": {
                "id": "1"
              },
              "children": [
                {
                  "node": "category",
                  "level": 3,
                  "attributes": {
                    "id": "2"
                  },
                  "children": [
                    {
                      "node": "product",
                      "level": 4,
                      "attributes": {
                        "id": "3"
                      },
                      "children": [],
                      "text": "baz"
                    }
                  ]
                },
                {
                  "node": "category",
                  "level": 3,
                  "attributes": [],
                  "children": [
                    {
                      "node": "bar",
                      "level": 4,
                      "attributes": [],
                      "children": []
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
      """

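The scenario expects the XML to be turned into nested entries with `node`, `level`, `attributes`, `children`, and, where present, `text`. The `ArrayNodeProcessor` source is not shown here, so the following is only a rough sketch of how such a tree could be built with PHP's built-in `\XMLReader`. The function name `xmlToTree()` is hypothetical, and treating `level` as the reader depth plus one is an assumption inferred from the expected output.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the ArrayNodeProcessor used in the scenario:
 * converts XML into nested arrays of node, level, attributes, children, text.
 */
function xmlToTree(string $xml): array
{
    $reader = new XMLReader();
    $reader->XML($xml);

    $roots = [];
    $open = []; // stack of elements that were opened but not yet closed

    // Attach a finished element to its parent, or to the result list at top level.
    $attach = static function (array $node) use (&$roots, &$open): void {
        if ($open === []) {
            $roots[] = $node;
        } else {
            $open[count($open) - 1]['children'][] = $node;
        }
    };

    while ($reader->read()) {
        switch ($reader->nodeType) {
            case XMLReader::ELEMENT:
                $node = [
                    'node' => $reader->name,
                    'level' => $reader->depth + 1, // assumption: root is level 1
                    'attributes' => [],
                    'children' => [],
                ];
                // Must be read before moving onto the attributes.
                $isEmpty = $reader->isEmptyElement;
                while ($reader->moveToNextAttribute()) {
                    $node['attributes'][$reader->name] = $reader->value;
                }
                $reader->moveToElement();
                if ($isEmpty) {
                    // A self-closing element such as <bar/> has no END_ELEMENT node.
                    $attach($node);
                } else {
                    $open[] = $node;
                }
                break;
            case XMLReader::TEXT:
            case XMLReader::CDATA:
                $top = count($open) - 1;
                $open[$top]['text'] = ($open[$top]['text'] ?? '') . $reader->value;
                break;
            case XMLReader::END_ELEMENT:
                $attach(array_pop($open));
                break;
        }
    }
    $reader->close();

    return $roots;
}

$tree = xmlToTree('<root name="main"><product id="1">foo</product><category><bar/></category></root>');
echo json_encode($tree, JSON_PRETTY_PRINT), "\n";
```

Because `json_encode()` renders an empty PHP array as `[]` and a string-keyed array as an object, elements without attributes come out as `"attributes": []`, as in the expected output of the scenario.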