| layout | default |
|---|---|
| title | Scraping JSON Responses |
| parent | How-To Guides |
| grand_parent | Ruby Gem |
| nav_order | 6 |
When a website returns a JSON response (i.e., with a Content-Type of application/json), html2rss converts the JSON to XML, allowing you to use CSS selectors for data extraction.
Note: The JSON response must be an Array or a Hash for the conversion to work.
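The Array-or-Hash requirement can be checked up front before pointing html2rss at an endpoint. The following is a minimal sketch using only Ruby's standard `json` library; `convertible_json?` is a hypothetical helper name and is not part of html2rss.

```ruby
require 'json'

# Hypothetical helper, not part of html2rss: reports whether the top-level
# JSON value is an Array or a Hash. Unparseable input counts as not convertible.
def convertible_json?(body)
  parsed = JSON.parse(body)
  parsed.is_a?(Array) || parsed.is_a?(Hash)
rescue JSON::ParserError
  false
end

puts convertible_json?('[{"title": "Headline"}]') # Array at the top level
puts convertible_json?('{"data": []}')             # Hash at the top level
puts convertible_json?('"just a string"')          # scalar at the top level
```

A response whose top-level value is a bare string or number fails this check, so it would need a different approach.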
A JSON object like this:
{
  "data": [{ "title": "Headline", "url": "https://example.com" }]
}

is converted to this XML structure:

<object>
  <data>
    <array>
      <object>
        <title>Headline</title>
        <url>https://example.com</url>
      </object>
    </array>
  </data>
</object>

You would use array > object as your items selector.
A JSON array like this:
[{ "title": "Headline", "url": "https://example.com" }]

is converted to this XML structure:

<array>
  <object>
    <title>Headline</title>
    <url>https://example.com</url>
  </object>
</array>

You would use array > object as your items selector.
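The mapping in the two examples above can be sketched in plain Ruby to make the resulting element names easier to predict. This is not html2rss's own converter: `json_to_xml_sketch` is a hypothetical helper, it emits no whitespace between elements, and it assumes every Hash key is already a valid XML element name.

```ruby
require 'json'

# Hypothetical helper, not part of html2rss. It mirrors the mapping shown above:
# Hash -> <object>, Array -> <array>, each Hash key -> an element of that name.
def json_to_xml_sketch(value)
  case value
  when Hash
    children = value.map { |key, child| "<#{key}>#{json_to_xml_sketch(child)}</#{key}>" }
    "<object>#{children.join}</object>"
  when Array
    "<array>#{value.map { |child| json_to_xml_sketch(child) }.join}</array>"
  else
    # Scalars become text nodes; escape the characters XML treats specially.
    value.to_s.gsub(/[&<>]/, '&' => '&amp;', '<' => '&lt;', '>' => '&gt;')
  end
end

json = '{"data": [{"title": "Headline", "url": "https://example.com"}]}'
xml = json_to_xml_sketch(JSON.parse(json))
puts xml
```

In both input shapes each item ends up as an `<object>` that is a direct child of an `<array>`, which is why the same items selector applies to both.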
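Example configuration in Ruby:

require 'html2rss'

Html2rss.feed(
  headers: {
    Accept: 'application/json'
  },
  channel: {
    url: 'http://domainname.tld/whatever.json'
  },
  selectors: {
    # Each <object> that is a direct child of an <array> becomes one feed item.
    items: { selector: 'array > object' },
    title: { selector: 'foo' }
  }
)

The same configuration in YAML:

headers:
  Accept: application/json
channel:
  url: "http://domainname.tld/whatever.json"
selectors:
  items:
    selector: "array > object"
  title:
    selector: "foo"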
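To see how the YAML form lines up with the Ruby form, the YAML can be loaded into a Hash with symbol keys using Ruby's standard `yaml` library. This is only a sketch: the variable names are illustrative, and passing the result on to `Html2rss.feed` is an assumption based on the Ruby call shown in this guide, so that step is left as a comment.

```ruby
require 'yaml'

yaml = <<~YAML
  headers:
    Accept: application/json
  channel:
    url: "http://domainname.tld/whatever.json"
  selectors:
    items:
      selector: "array > object"
    title:
      selector: "foo"
YAML

# symbolize_names turns "headers", "channel", ... into :headers, :channel, ...
config = YAML.safe_load(yaml, symbolize_names: true)
puts config.inspect

# Assumption: with the html2rss gem loaded, the Hash could be handed over as
# keyword arguments, matching the Ruby example:
#   Html2rss.feed(**config)
```

Keeping the configuration in YAML makes it easy to store alongside other feed configs, while the Ruby form is convenient when the values are computed at runtime.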