diff --git a/feed-directory/index.html b/feed-directory/index.html
index 2aeef3a8..4d9ff82c 100644
--- a/feed-directory/index.html
+++ b/feed-directory/index.html
@@ -3,6 +3,7 @@
 title: Feed Directory
 nav_order: 2
 noindex: true
+has_children: true
 ---
diff --git a/get-involved/contributing.md b/get-involved/contributing.md
index 40cb43e1..a5306b83 100644
--- a/get-involved/contributing.md
+++ b/get-involved/contributing.md
@@ -25,7 +25,24 @@ Here are some of the ways you can contribute to the `html2rss` project:
 
 Are you missing an RSS feed for a website? You can create your own feed config and share it with the community. It's a great way to get started with `html2rss` and help other users.
 
-[**Learn how to create a feed config**](https://github.com/html2rss/html2rss-configs)
+The html2rss ecosystem is a community project, and we welcome contributions of all kinds: new feed configs, feature suggestions and implementations, bug fixes, documentation improvements, and any other kind of help.
+
+How you add a new feed config is up to you; the manual approach described below is preferred. When your config is ready, please [submit a pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork)!
+
+Once the config is written, you can test it by running `bundle exec html2rss feed lib/html2rss/configs//.yml`.
+
+#### Preferred way: manually
+
+1. Fork the `html2rss-configs` git repository and run `bundle install` (requires Ruby >= 3.3).
+2. Create a new folder and file following this convention: `lib/html2rss/configs//.yml`
+3. Create the feed config in the `.yml` file.
+4. Add a spec file at `spec/html2rss/configs//_spec.rb` with the following content:
+
+```ruby
+RSpec.describe '/' do
+  include_examples 'config.yml', described_class
+end
+```
 
 ### 2. Improve this Website
 
@@ -37,13 +54,13 @@ This website is built with Jekyll and is hosted on GitHub Pages. If you have any
 
 The [`html2rss-web`](https://github.com/html2rss/html2rss-web) project is a web application that allows you to create and manage your RSS feeds through a user-friendly interface. You can host your own public instance to help other users create feeds.
 
-[**Learn how to host a public instance**](https://github.com/html2rss/html2rss-web/wiki/Instances)
+[**Learn how to host a public instance**]({{ '/web-application/how-to/deployment' | relative_url }})
 
 ### 4. Improve the `html2rss` Gem
 
 Are you a Ruby developer? You can help us improve the core `html2rss` gem. Whether you're fixing a bug, adding a new feature, or improving the documentation, your contributions are welcome.
 
-[**Check out the repository on GitHub**](https://github.com/html2rss/html2rss)
+[**Check out the documentation for the `html2rss` Gem**]({{ '/ruby-gem/' | relative_url }})
 
 ### 5. Report Bugs & Discuss Features
 
diff --git a/html2rss-configs/index.md b/html2rss-configs/index.md
new file mode 100644
index 00000000..36559997
--- /dev/null
+++ b/html2rss-configs/index.md
@@ -0,0 +1,173 @@
+---
+layout: default
+title: html2rss-configs
+has_children: false
+nav_order: 5
+---
+
+# Creating Feed Configurations
+
+Welcome to the guide for `html2rss-configs`. This document explains how to create your own configuration files to convert any website into an RSS feed.
+
+You can find a list of all community-contributed configurations in the [Feed Directory]({{ '/feed-directory/' | relative_url }}).
+
+---
+
+## Core Concepts
+
+An `html2rss` config is a YAML file that defines how to extract data from a web page. It consists of two main building blocks: `channel` and `selectors`.
+
+### The `channel` Block
+
+The `channel` block contains metadata about the RSS feed itself, such as its title and the source URL.
+
+**Example:**
+
+```yaml
+channel:
+  url: https://example.com/blog
+  title: My Awesome Blog
+```
+
+For a complete list of all available channel options, please see the [Channel Reference]({{ '/ruby-gem/reference/channel/' | relative_url }}).
+
+### The `selectors` Block
+
+The `selectors` block is the core of the configuration, defining the rules for extracting content. It always contains an `items` selector to identify the list of articles and individual selectors for the data points within each item (e.g., `title`, `link`).
+
+**Example:**
+
+```yaml
+selectors:
+  items:
+    selector: "article.post"
+  title:
+    selector: "h2 a"
+  link:
+    selector: "h2 a"
+```
+
+For a comprehensive guide on all available selectors, extractors, and post-processors, please see the [Selectors Reference]({{ '/ruby-gem/reference/selectors/' | relative_url }}).
+
+---
+
+## Tutorial: Your First Config
+
+This tutorial walks you through creating a basic configuration file from scratch.
+
+### Step 1: Identify the Target Content
+
+First, identify the HTML structure of the website you want to create a feed for. For this example, we'll use a simple blog structure:
+
+```html
+
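+  <article class="post">
+    <h2><a href="...">First Post</a></h2>
+    <p>This is the summary of the first post.</p>
+  </article>
+  <article class="post">
+    <h2><a href="...">Second Post</a></h2>
+    <p>This is the summary of the second post.</p>
+  </article>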
+
+```
+
+### Step 2: Create the Config File and Define the Channel
+
+Create a new YAML file (e.g., `my-blog.yml`) and define the `channel`:
+
+```yaml
+# my-blog.yml
+channel:
+  url: https://example.com/blog
+  title: My Awesome Blog
+  description: The latest news from my awesome blog.
+```
+
+### Step 3: Define the Selectors
+
+Next, add the `selectors` block to extract the content for each post.
+
+```yaml
+# my-blog.yml
+selectors:
+  items:
+    selector: "article.post"
+  title:
+    selector: "h2 a"
+  link:
+    selector: "h2 a"
+  description:
+    selector: "p"
+```
+
+- `items`: This CSS selector identifies the container for each article.
+- `title`, `link`, `description`: These selectors target the specific data points within each item. For a `link` selector, `html2rss` defaults to extracting the `href` attribute from the matched `<a>` tag.
+
+---
+
+## Advanced Techniques
+
+### Handling Pagination
+
+To aggregate content from multiple pages, use the `pagination` option within the `items` selector.
+
+```yaml
+selectors:
+  items:
+    selector: ".post-listing .post"
+    pagination:
+      selector: ".pagination .next-page"
+      limit: 5 # Optional: sets the maximum number of pages to follow
+```
+
+### Dynamic Feeds with Parameters
+
+Use the `parameters` block to create flexible configs. This is useful for feeds based on search terms, categories, or regions.
+
+```yaml
+# news-search.yml
+parameters:
+  query:
+    type: string
+    default: "technology"
+
+channel:
+  url: "https://news.example.com/search?q={query}"
+  title: "News results for '{query}'"
+```
+
+---
+
+## Contributing Your Config
+
+Have you created a config that others might find useful? We strongly encourage you to contribute it to the project! By sharing your config, you make it available to all users of the public `html2rss-web` service and the Feed Directory.
+
+To contribute, please [create a pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request) to the `html2rss-configs` repository.
+
+---
+
+## Usage and Integration
+
+### With `html2rss-web`
+
+Once your pull request is reviewed and merged, your config will become available on the public [`html2rss-web`]({{ '/web-application/' | relative_url }}) instance. You can then access it at the path `/.rss`.
+
+### Programmatic Usage in Ruby
+
+You can also use `html2rss-configs` programmatically in your Ruby applications.
+
+Add this to your Gemfile:
+
+```ruby
+gem 'html2rss-configs', git: 'https://github.com/html2rss/html2rss-configs.git'
+```
+
+And use it in your code:
+
+```ruby
+require 'html2rss/configs'
+
+config = Html2rss::Configs.find_by_name('domainname.tld/whatever')
+rss = Html2rss.feed(config)
+```
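
When working with a hand-written config file rather than one looked up by name, it may help to check its basic shape before handing it to the gem. The sketch below is not part of `html2rss` or `html2rss-configs`: `config_problems` is a hypothetical helper, and the two checks (a `channel` URL and an `items` selector) are assumptions drawn only from the building blocks described in this guide. It uses just Ruby's standard `yaml` library.

```ruby
require 'yaml'

# Hypothetical helper (not an html2rss API): returns a list of human-readable
# problems found in a parsed config, or an empty array if none were found.
def config_problems(config)
  return ['config is not a mapping'] unless config.is_a?(Hash)

  problems = []

  # The guide describes `channel` as carrying the source URL.
  channel = config['channel']
  problems << 'channel.url is missing' unless channel.is_a?(Hash) && channel['url']

  # The guide describes `selectors` as always containing an `items` selector.
  selectors = config['selectors']
  items = selectors.is_a?(Hash) ? selectors['items'] : nil
  problems << 'selectors.items.selector is missing' unless items.is_a?(Hash) && items['selector']

  problems
end

# Sample config mirroring the tutorial's structure.
SAMPLE_YAML = <<~YAML
  channel:
    url: https://example.com/blog
    title: My Awesome Blog
  selectors:
    items:
      selector: "article.post"
    title:
      selector: "h2 a"
    link:
      selector: "h2 a"
YAML

config = YAML.safe_load(SAMPLE_YAML)
problems = config_problems(config)
puts problems.empty? ? 'config looks structurally complete' : problems.join("\n")
```

A check like this only catches structural omissions; whether the selectors actually match the target page can only be confirmed by generating the feed, for example with the `bundle exec html2rss feed` command mentioned in the contributing guide.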