| title | Custom HTTP Requests |
|---|---|
| description | Learn how to customize HTTP requests with custom headers, authentication, and API interactions for html2rss. |
Some websites require custom HTTP headers, authentication, or other request settings to access their content. html2rss lets you customize requests for those cases.
You might need custom HTTP requests when:
- An API requires authentication (Bearer tokens, API keys)
- A website blocks default user agents, so requests need to look like they come from a real browser
- Content sits behind a login (session cookies, authorization headers)
- A service applies rate limiting and expects headers that identify your requests
- A server uses content negotiation (specific Accept headers for different formats)
Add a headers section to your feed configuration. This example is a complete, valid config:

    headers:
      User-Agent: "Mozilla/5.0 (compatible; html2rss/1.0)"
      Authorization: "Bearer YOUR_API_TOKEN"
      Accept: "application/json"
    channel:
      url: https://api.example.com/posts
    selectors:
      items:
        selector: "array > object"
      title:
        selector: "title"
      url:
        selector: "url"

Many APIs require authentication tokens:

    headers:
      Authorization: "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
      X-API-Key: "your-api-key-here"
    channel:
      url: "https://api.example.com/posts"
    selectors:
      items:
        selector: "array > object"
      title:
        selector: "title"
      url:
        selector: "url"

Some websites block requests that don't look like real browsers:

    headers:
      User-Agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
      Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
      Accept-Language: "en-US,en;q=0.5"
      Accept-Encoding: "gzip, deflate"
    channel:
      url: "https://example.com/articles"
    selectors:
      items:
        selector: "article"
      title:
        selector: "h2"
      url:
        selector: "a"
        extractor: "href"

Request specific content types:

    headers:
      Accept: "application/json"
    channel:
      url: "https://api.example.com/posts"
    selectors:
      items:
        selector: "array > object"
      title:
        selector: "title"
      url:
        selector: "url"

Some APIs require specific headers:

    headers:
      X-Requested-With: "XMLHttpRequest"
      X-Custom-Header: "your-value"
      Content-Type: "application/json"
    channel:
      url: "https://api.example.com/posts"
    selectors:
      items:
        selector: "array > object"
      title:
        selector: "title"
      url:
        selector: "url"

You can use dynamic parameters in headers for runtime values:

    headers:
      Authorization: "Bearer %<api_token>s"
      X-User-ID: "%<user_id>s"
    channel:
      url: "https://api.example.com/users/%<user_id>s/posts"
    selectors:
      items:
        selector: "array > object"
      title:
        selector: "title"
      url:
        selector: "url"

See our Dynamic Parameters guide for more details.
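
To make the substitution concrete, here is a minimal shell sketch that emulates replacing `%<name>s` placeholders with runtime values. It only illustrates the idea: `expand_placeholders` is a hypothetical helper, not part of html2rss, and html2rss performs its own substitution internally.

```shell
# Hypothetical helper, NOT part of html2rss: emulates how a %<name>s
# placeholder is replaced by a runtime value.
# Usage: expand_placeholders TEMPLATE name=value [name=value ...]
expand_placeholders() (
  text=$1
  shift
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    value=${pair#*=}   # text after the first "="
    # Caveat: values containing "|", "&" or "\" would need escaping for sed.
    text=$(printf '%s\n' "$text" | sed "s|%<${name}>s|${value}|g")
  done
  printf '%s\n' "$text"
)

# prints: Bearer abc123
expand_placeholders 'Bearer %<api_token>s' api_token=abc123

# prints: https://api.example.com/users/42/posts
expand_placeholders 'https://api.example.com/users/%<user_id>s/posts' user_id=42
```

The same `user_id` value fills both the header and the URL in the config above, which is why a single parameter can keep the two in sync.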
- Header examples that target third-party APIs are illustrative. Authentication requirements, header names, and response shapes can change independently of html2rss.
- For JSON APIs, validate the response structure before assuming selectors like `array > object` or `html_url` will match.
- If you document or share a config for reuse, prefer placeholder values and parameterized headers over embedding real tokens.
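
One quick way to validate the response shape is to check what the JSON document starts with before writing selectors. The sketch below assumes, as the selectors in this guide suggest, that `array > object` only matches when the top level of the response is an array. `json_top_level` is a hypothetical helper, not an html2rss command.

```shell
# Hypothetical helper: reads a response body on stdin and reports whether
# it starts with a JSON array, a JSON object, or something else
# (for example an HTML login or error page).
json_top_level() (
  # Drop all whitespace, then look at the first remaining character.
  first=$(tr -d '[:space:]' | cut -c1)
  case $first in
    '[') echo array ;;
    '{') echo object ;;
    *)   echo other ;;
  esac
)

# Against a live API (placeholder token and URL, as used in this guide):
#   curl -s -H "Authorization: Bearer YOUR_TOKEN" https://api.example.com/posts | json_top_level

# prints: array
printf '[{"title": "Hello", "url": "https://example.com/1"}]' | json_top_level

# prints: object
printf '{"data": {"children": []}}' | json_top_level
```

A result of `other` usually means the server did not return JSON at all, which is worth ruling out before adjusting selectors.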
Test your configuration to ensure headers work correctly:

    # Test with curl first
    curl -H "Authorization: Bearer YOUR_TOKEN" https://api.example.com/posts

    # Then test with html2rss
    html2rss feed your-config.yml

- 401 Unauthorized: Check your authentication headers
- 403 Forbidden: Verify API keys and permissions
- 429 Too Many Requests: Request the feed less often, or set a User-Agent that identifies your client
- Empty responses: Some APIs require specific Accept headers
- Use browser developer tools to see what headers successful requests use
- Test with curl before configuring html2rss
- Check API documentation for required headers
- Enable debug logging to see what headers are being sent
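
The curl check and the status codes listed above can be combined into a small script. This is a sketch under stated assumptions: `status_hint` and `check_status` are hypothetical helper names, the hints loosely mirror the list above, and the URL and token are the placeholders used throughout this guide.

```shell
# Hypothetical helper: map an HTTP status code to a troubleshooting hint.
status_hint() {
  case $1 in
    2??) echo "Request succeeded" ;;
    401) echo "Check your authentication headers" ;;
    403) echo "Verify API keys and permissions" ;;
    429) echo "Reduce request frequency or review your User-Agent" ;;
    *)   echo "Unexpected status $1: check the API documentation" ;;
  esac
}

# Hypothetical helper: request URL ($1) with a single header ($2) and print
# the status code followed by a hint.
#   -s            silence the progress meter
#   -o /dev/null  discard the response body
#   -w '%{http_code}'  print only the numeric status code
check_status() {
  code=$(curl -s -o /dev/null -w '%{http_code}' -H "$2" "$1")
  printf '%s %s\n' "$code" "$(status_hint "$code")"
}

# Example call (not run here; replace the placeholders first):
#   check_status https://api.example.com/posts "Authorization: Bearer YOUR_TOKEN"
```

To see exactly which request headers curl sends, add `-v`; the lines starting with `>` are the outgoing headers, which you can compare with what a browser sends.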

Example: the GitHub issues API

    headers:
      Authorization: "token YOUR_GITHUB_TOKEN"
      Accept: "application/vnd.github.v3+json"
      User-Agent: "html2rss/1.0"
    channel:
      url: https://api.github.com/repos/owner/repo/issues
    selectors:
      items:
        selector: "array > object"
      title:
        selector: "title"
      url:
        selector: "html_url"

Example: a Reddit JSON listing

    headers:
      User-Agent: "html2rss/1.0 by your-username"
      Accept: "application/json"
    channel:
      url: https://www.reddit.com/r/programming.json
    selectors:
      items:
        selector: "data > children > object > data"
      title:
        selector: "title"
      url:
        selector: "url"

- Headers Reference - Complete headers documentation
- Dynamic Parameters - Runtime header values
- Scraping JSON APIs - Working with JSON responses
- Strategy Selection - Choose the right strategy for your needs
- Troubleshooting - Common issues and solutions
- Community Discussions - Ask for help
- Advanced Features - Performance optimization
- Ruby Gem Documentation - Complete API reference