Commit 9421f79

Merge branch 'main' into update-concept-description
2 parents fd0c054 + 745d239 commit 9421f79

10 files changed

Lines changed: 211 additions & 114 deletions

README.md

Lines changed: 15 additions & 6 deletions
@@ -5,22 +5,31 @@ The OpenML documentation in written in MarkDown. The sources are generated by [M
55

66
The overall structure (navigation) of the docs is configured in the `mkdocs.yml` file.
77

8-
Some of the API's use other documentation generators, such as [Sphinx](https://restcoder.readthedocs.io/en/latest/sphinx-docgen.html) in openml-python. This documentation is pulled in via iframes to gather all docs into the same place, but they need to be edited in their own GitHub repo's.
8+
The documentation of other APIs is pulled in using the [multirepo plugin](https://github.com/jdoiro3/mkdocs-multirepo-plugin) to gather all docs into the same place, but it needs to be edited in the respective GitHub repos. This allows the documentation to live closer to the code and follow the conventions of the respective community.
99

1010
## Editing documentation
1111
Documentation can be edited by simply editing the markdown files in the `docs` folder and creating a pull request.
1212

1313
End users can edit the docs by simply clicking the edit button (the pencil icon) on the top of every documentation page. It will open up an editing page on [GitHub](https://github.com/) (you do need to be logged in on GitHub). When you are done, add a small message explaining the change and click 'commit changes'. On the next page, just launch the pull request. We will then review it and approve the changes, or discuss them if necessary.
1414

15+
For other information on how to write and build documentation locally, see our [contributing](./contributing/OpenML-Docs.md#General-Documentation) page.
16+
1517
## Deployment
1618
The documentation is hosted on GitHub pages.
1719

18-
To deploy the documentation, you need to have MkDocs and MkDocs-Material installed, and then run `mkdocs gh-deploy` in the top directory (with the `mkdocs.yml` file). This will build the HTML files and push them to the gh-pages branch of openml/docs. `https://docs.openml.org` is just a reverse proxy for `https://openml.github.io/docs/`.
20+
To deploy the documentation, you need to have MkDocs installed locally, and then run `mkdocs gh-deploy` in the top directory (with the `mkdocs.yml` file). This will build the HTML files and push them to the gh-pages branch of openml/docs. `https://docs.openml.org` is just a reverse proxy for `https://openml.github.io/docs/`.
21+
22+
MkDocs and all required extensions can be installed as follows:
23+
```
24+
pip install -r requirements.txt
25+
```
1926

20-
MKDocs and MkDocs-Material can be installed as follows:
27+
To test the documentation locally, run
2128
```
22-
pip install mkdocs
23-
pip install mkdocs-material
24-
pip install -U fontawesome_markdown
29+
mkdocs serve
2530
```
2631

32+
To deploy to GitHub Pages, run
33+
```
34+
mkdocs gh-deploy
35+
```

docs/concepts/benchmarking.md

Lines changed: 2 additions & 2 deletions
@@ -9,11 +9,11 @@ Collections of tasks can be published as _benchmarking suites_. Seamlessly integ
99
- standardized train-test splits are provided to ensure that results can be objectively compared - results can be shared in a reproducible way through the APIs
1010
- results from other users can be easily downloaded and reused
1111

12-
You can search for <a href="https://www.openml.org/search?type=benchmark&sort=tasks_included&study_type=task" target="_blank">all existing benchmarking suites</a> or create your own. For all further details, see the [benchmarking guide](../benchmark/benchmark.md).
12+
You can search for <a href="https://www.openml.org/search?type=benchmark&sort=tasks_included&study_type=task" target="_blank">all existing benchmarking suites</a> or create your own. For all further details, see the [benchmarking guide](../benchmark/index.md).
1313

1414
<img src="../../img/studies.png" style="width:100%; max-width:800px;"/>
1515

1616
## Benchmark studies
1717
Collections of runs can be published as _benchmarking studies_. They contain the results of all runs (possibly millions) executed on a specific benchmarking suite. OpenML allows you to easily download all such results at once via the APIs, but also visualize them online in the Analysis tab (next to the complete list of included tasks and runs). Below is an example of <a href="https://www.openml.org/search?type=benchmark&study_type=run&id=226" target="_blank">a benchmark study for AutoML algorithms</a>.
1818

19-
<img src="../../img/run_study.png" style="width:100%; max-width:1000px;"/>
19+
<img src="../../img/run_study.png" style="width:100%; max-width:1000px;"/>

docs/contributing/OpenML-Docs.md

Lines changed: 22 additions & 11 deletions
@@ -1,23 +1,34 @@
1+
## Documentation
2+
3+
Documentation of OpenML consists of the general information pages, such as these, that include common concepts.
4+
Additionally, each software package, such as the Python, Java, and R connectors, has its own documentation.
5+
For convenience, those documentation pages are also available through this common documentation portal.
6+
7+
We always value contributions to our documentation. If you notice any mistake in these documentation pages, click the :material-pencil: button (on the top right). It will open up an editing page on [GitHub](https://github.com/) (you do need to be logged in). When you are done, add a small message explaining the change and click 'commit changes'. On the next page, just launch the pull request. We will then review it and approve the changes, or discuss them if necessary.
8+
9+
Below you can find more information about how each set of documentation pages is built.
10+
111
## General Documentation
2-
High-quality and up-to-date documentation are crucial. If you notice any mistake in these documentation pages, click the :material-pencil: button (on the top right). It will open up an editing page on [GitHub](https://github.com/) (you do need to be logged in). When you are done, add a small message explaining the change and click 'commit changes'. On the next page, just launch the pull request. We will then review it and approve the changes, or discuss them if necessary.
312

413
The sources are generated by [MkDocs](http://www.mkdocs.org/), using the [Material theme](https://squidfunk.github.io/mkdocs-material/).
514
Check these docs to see what is possible in terms of styling.
615

7-
OpenML is a big project with multiple repositories. To keep the documentation close to the code, it will always be kept in the relevant repositories (see below), and
16+
OpenML is a big project with multiple repositories.
17+
To keep the documentation close to the code, it will always be kept in the relevant repositories (see below), and
818
combined into these documentation pages using [MkDocs multirepo](https://github.com/jdoiro3/mkdocs-multirepo-plugin/issues/3).
919

10-
!!! note "Developer note"
11-
To work on the documentation locally, do the following:
12-
```
13-
git clone https://github.com/openml/docs.git
14-
pip install -r requirements.txt
15-
```
16-
To build the documentation, run `mkdocs serve` in the top directory (with the `mkdocs.yml` file). Any changes made after that will be hot-loaded.
20+
To build the documentation locally, first make sure all dependencies specified in `requirements.txt` are installed:
21+
22+
```bash
23+
python -m venv .venv
24+
source .venv/bin/activate
25+
python -m pip install uv
26+
uv pip install -r requirements.txt
27+
```
1728

18-
The documentation will be auto-deployed with every push or merge with the master branch of `https://www.github.com/openml/docs/`. In the background, a CI job
19-
will run `mkdocs gh-deploy`, which will build the HTML files and push them to the gh-pages branch of openml/docs. `https://docs.openml.org` is just a reverse proxy for `https://openml.github.io/docs/`.
29+
After installing the dependencies, run `mkdocs serve -f mkdocs-local.yml` in the top directory (with the `mkdocs.yml` file). Any changes made after that will be hot-loaded.
2030

31+
To build the full documentation, including importing the documentation from other repositories, run `mkdocs serve` in the top directory (with the `mkdocs.yml` file). This can take a while to compile, so only use this when needed. You might also need to set `export NUMPY_EXPERIMENTAL_DTYPE_API=1` (or `set NUMPY_EXPERIMENTAL_DTYPE_API=1` on Windows).
2132

2233
## Python API
2334
To edit the tutorial, you have to edit the `reStructuredText` files on [openml-python/doc](https://github.com/openml/openml-python/tree/master/doc). When done, you can do a pull request.
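
The two build modes described in the General Documentation changes (a quick local build with `mkdocs-local.yml`, and a full build that may need `NUMPY_EXPERIMENTAL_DTYPE_API=1`) could be wrapped in a small launcher. The following is only a sketch and not part of this commit: the helper names `build_command` and `build_env` are hypothetical, and it assumes `mkdocs` is on the `PATH` and that the script is run from the top directory.

```python
import os


def build_command(full: bool = False) -> list[str]:
    """Return the mkdocs invocation for a local (default) or full build."""
    cmd = ["mkdocs", "serve"]
    if not full:
        # Quick build: use the local config file named on the contributing page.
        cmd += ["-f", "mkdocs-local.yml"]
    return cmd


def build_env(full: bool = False) -> dict[str, str]:
    """Copy the current environment, adding the NumPy flag for full builds."""
    env = dict(os.environ)
    if full:
        # The contributing page says this variable might be needed.
        env["NUMPY_EXPERIMENTAL_DTYPE_API"] = "1"
    return env
```

Passing the results to `subprocess.run(build_command(full=True), env=build_env(full=True), check=True)` would then start the full build without changing the variables of the calling shell.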

docs/index.md

Lines changed: 5 additions & 46 deletions
@@ -15,56 +15,15 @@ icon: material/creation
1515
<p><i class="fa fa-graduation-cap fa-fw fa-lg"></i>&nbsp; Make your work more visible and reusable</p>
1616
<p><i class="fa fa-bolt fa-fw fa-lg"></i>&nbsp; Built for automation: streamline your experiments and model building</p>
1717

18-
## Installation
18+
## How to use OpenML
1919

20-
The OpenML package is available in many languages and across libraries. For more information about them, see the [Integrations](./ecosystem/index.md) page.<br><br>
20+
OpenML is accessible to a wide range of people:
2121

22-
=== "Python/sklearn"
22+
:computer: <a href="https://www.openml.org" target="_blank">Explore the OpenML website</a> to discover, download and upload ML resources.
2323

24-
- [Python/sklearn repository](https://github.com/openml/openml-python)
25-
- `pip install openml`
24+
:robot: [Install an OpenML library](intro/index.md) to access and share resources programmatically through our APIs. Select one of the detailed guides in the top menu.
2625

27-
=== "Pytorch"
28-
29-
- [Pytorch repository](https://github.com/openml/openml-pytorch)
30-
- `pip install openml-pytorch`
31-
32-
=== "Keras"
33-
34-
- [Keras repository](https://github.com/openml/openml-keras)
35-
- `pip install openml-keras`
36-
37-
=== "TensorFlow"
38-
39-
- [TensorFlow repository](https://github.com/openml/openml-tensorflow)
40-
- `pip install openml-tensorflow`
41-
42-
=== "R"
43-
44-
- [R repository](https://github.com/openml/openml-R)
45-
- `install.packages("mlr3oml")`
46-
=== "Julia"
47-
48-
- [Julia repository](https://github.com/JuliaAI/OpenML.jl/tree/master)
49-
- `using Pkg;Pkg.add("OpenML")`
50-
51-
=== "RUST"
52-
53-
- [RUST repository](https://github.com/mbillingr/openml-rust)
54-
- Install from source
55-
56-
=== ".Net"
57-
58-
- [.Net repository](https://github.com/openml/openml-dotnet)
59-
- `Install-Package openMl`
60-
61-
62-
You might also need to set up the API key. For more information, see [Authentication](http://localhost:8000/concepts/openness/).
63-
64-
## Learning OpenML
65-
66-
Aside from the individual package documentations, you can learn more about OpenML through the following resources:<br>
67-
The core concepts of OpenML are explained in the [Concepts](./concepts/index.md) page. These concepts include the principle behind using Datasets, Runs, Tasks, Flows, Benchmarking and much more. Going through them will help you leverage OpenML even better in your work.<br>
26+
:mortar_board: [Get started](./concepts/index.md) by learning more about the structure and concepts behind OpenML, such as Datasets, Tasks, Flows, Runs, Benchmarking and much more. This will help you leverage OpenML even better in your work.
6827

6928
## Contributing to OpenML
7029

docs/intro/index.md

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
1+
---
2+
icon: material/rocket-launch
3+
---
4+
5+
## :computer: Installation
6+
7+
The OpenML package is available in many languages and has deep integration in many machine learning libraries.
8+
9+
=== "Python/sklearn"
10+
11+
- [Python/sklearn repository](https://github.com/openml/openml-python)
12+
- `pip install openml`
13+
14+
=== "Pytorch"
15+
16+
- [Pytorch repository](https://github.com/openml/openml-pytorch)
17+
- `pip install openml-pytorch`
18+
19+
=== "TensorFlow"
20+
21+
- [TensorFlow repository](https://github.com/openml/openml-tensorflow)
22+
- `pip install openml-tensorflow`
23+
24+
=== "R"
25+
26+
- [R repository](https://github.com/openml/openml-R)
27+
- `install.packages("mlr3oml")`
28+
29+
=== "Julia"
30+
31+
- [Julia repository](https://github.com/JuliaAI/OpenML.jl/tree/master)
32+
- `using Pkg;Pkg.add("OpenML")`
33+
34+
=== "RUST"
35+
36+
- [RUST repository](https://github.com/mbillingr/openml-rust)
37+
- Install from source
38+
39+
=== ".Net"
40+
41+
- [.Net repository](https://github.com/openml/openml-dotnet)
42+
- `Install-Package openMl`
43+
44+
You can find detailed guides for the different libraries in the top menu.
45+
46+
47+
## :key: Authentication
48+
49+
OpenML is entirely open and you do not need an account to access data (rate limits apply). However, <a href="https://www.openml.org" target="_blank">signing up via the OpenML website</a> is easy and free, and it is required to upload new resources to OpenML and to manage them online.
50+
51+
API authentication happens via an **API key**, which you can find in your profile after logging in to openml.org.
52+
53+
```
54+
openml.config.apikey = "YOUR KEY"
55+
```
56+
57+
## :joystick: Minimal Example
58+
59+
:material-database: Use the following code to load the [credit-g](https://www.openml.org/search?type=data&sort=runs&status=active&id=31) [dataset](https://docs.openml.org/concepts/data/) directly into a pandas dataframe. Note that OpenML can automatically load all datasets, separate data X and labels y, and give you useful dataset metadata (e.g. feature names and which ones have categorical data).
60+
61+
```python
62+
import openml
63+
64+
dataset = openml.datasets.get_dataset("credit-g") # or by ID get_dataset(31)
65+
X, y, categorical_indicator, attribute_names = dataset.get_data(target="class")
66+
```
67+
68+
69+
:trophy: Get a [task](https://docs.openml.org/concepts/tasks/) for [supervised classification on credit-g](https://www.openml.org/search?type=task&id=31&source_data.data_id=31).
70+
Tasks specify how a dataset should be used, e.g. including train and test splits.
71+
72+
```python
73+
task = openml.tasks.get_task(31)
74+
dataset = task.get_dataset()
75+
X, y, categorical_indicator, attribute_names = dataset.get_data(target=task.target_name)
76+
# get splits for the first fold of 10-fold cross-validation
77+
train_indices, test_indices = task.get_train_test_split_indices(fold=0)
78+
```
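
The task snippet returns row indices rather than the split data itself. As a sketch of how such indices are applied, the example below uses small stand-in lists instead of real OpenML objects so that it runs offline; `select_rows` is a hypothetical helper. Assuming `get_data` returns pandas objects, the equivalent would be `X.iloc[train_indices]` and `y.iloc[train_indices]`.

```python
def select_rows(rows, indices):
    """Return the rows at the given positions, in the given order."""
    return [rows[i] for i in indices]


# Stand-in data: six samples with one feature each, plus their labels.
X = [[0.1], [0.4], [0.35], [0.8], [0.9], [0.75]]
y = ["bad", "bad", "bad", "good", "good", "good"]

# Stand-ins for the arrays returned by task.get_train_test_split_indices(fold=0).
train_indices = [0, 1, 3, 4]
test_indices = [2, 5]

X_train, y_train = select_rows(X, train_indices), select_rows(y, train_indices)
X_test, y_test = select_rows(X, test_indices), select_rows(y, test_indices)
```

Because the indices come from the task rather than from a local random split, everyone evaluating on the same task and fold uses the same rows.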
79+
80+
:bar_chart: Use an [OpenML benchmarking suite](https://docs.openml.org/concepts/benchmarking/) to get a curated list of machine-learning tasks:
81+
```python
82+
suite = openml.study.get_suite("amlb-classification-all") # Get a curated list of tasks for classification
83+
for task_id in suite.tasks:
84+
task = openml.tasks.get_task(task_id)
85+
```
86+
87+
:star2: You can now benchmark your models easily across many datasets at once. A model training is called a run:
88+
89+
```python
90+
from sklearn import neighbors
91+
92+
task = openml.tasks.get_task(403)
93+
clf = neighbors.KNeighborsClassifier(n_neighbors=5)
94+
run = openml.runs.run_model_on_task(clf, task)
95+
```
96+
97+
:raised_hands: You can now publish your experiment on OpenML so that others can build on it:
98+
99+
```python
100+
myrun = run.publish()
101+
print(f"kNN on {task.get_dataset().name}: {myrun.openml_url}")
102+
```
103+
104+
105+
## Learning more about OpenML
106+
107+
Next, check out the :rocket: [10 minute tutorial](notebooks/getting_started.ipynb) and the :mortar_board: [short description of OpenML concepts](concepts/index.md).

docs/notebooks/getting_started.ipynb

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@
4949
"cell_type": "markdown",
5050
"metadata": {},
5151
"source": [
52-
"# Getting Started\n",
52+
"# OpenML in 10 minutes\n",
5353
"\n",
5454
"This page will guide you through the process of getting started with OpenML. While this page is a good starting point, for more detailed information, please refer to the [integrations section](Scikit-learn/index.md) and the rest of the documentation.\n",
5555
"\n"

mkdocs-local.yml

Lines changed: 10 additions & 3 deletions
@@ -82,6 +82,12 @@ markdown_extensions:
8282
plugins:
8383
- autorefs
8484
- section-index
85+
- mkdocs-jupyter:
86+
ignore: ['temp_dir/**/*','docs/examples/**/*']
87+
theme: light
88+
remove_tag_config:
89+
remove_input_tags:
90+
- hide_code
8591
- redirects:
8692
redirect_maps:
8793
'APIs.md': 'https://www.openml.org/apis'
@@ -98,9 +104,10 @@ plugins:
98104
- git-committers:
99105
repository: openml/docs
100106
nav:
101-
- OpenML:
102-
- Introduction: index.md
103-
- Getting Started: notebooks/getting_started.ipynb
107+
- OpenML: index.md
108+
- Get Started:
109+
- OpenML: intro/index.md
110+
- 10 Minute Tutorial: notebooks/getting_started.ipynb
104111
- Concepts:
105112
- Main concepts: concepts/index.md
106113
- Data: concepts/data.md

mkdocs.yml

Lines changed: 7 additions & 3 deletions
@@ -120,6 +120,8 @@ plugins:
120120
docstring_section_style: table
121121
show_docstring_functions: true
122122
docstring_style: numpy
123+
follow_imports: false
124+
show_submodules: false
123125
- gen-files:
124126
scripts:
125127
- scripts/gen_python_ref_pages.py
@@ -131,9 +133,10 @@ plugins:
131133
- git-committers:
132134
repository: openml/docs
133135
nav:
134-
- OpenML:
135-
- Introduction: index.md
136-
- Getting Started: notebooks/getting_started.ipynb
136+
- OpenML: index.md
137+
- Get Started:
138+
- OpenML: intro/index.md
139+
- 10 Minute Tutorial: notebooks/getting_started.ipynb
137140
- Concepts:
138141
- Main concepts: concepts/index.md
139142
- Data: concepts/data.md
@@ -213,6 +216,7 @@ extra_css:
213216
- css/extra.css
214217
extra_javascript:
215218
- js/extra.js
219+
- js/reset_nav.js
216220
exclude_docs: |
217221
scripts/
218222
old/

requirements.txt

Lines changed: 11 additions & 10 deletions
@@ -5,14 +5,15 @@ mkdocs-redirects==1.2.1
55
mkdocs-jupyter==0.25.0
66
mkdocs-awesome-pages-plugin==2.9.3
77
mkdocs-multirepo-plugin==0.8.3
8-
mkdocs-autorefs
9-
mkdocs-section-index
10-
mkdocs-gen-files
11-
mkdocs-literate-nav
12-
mkdocs-git-committers-plugin-2
13-
mkdocs-git-revision-date-localized-plugin
14-
mkdocstrings
15-
mkdocstrings-python
16-
markdown-include
8+
mkdocs-autorefs==1.2.0
9+
mkdocs-section-index==0.3.9
10+
mkdocs-gen-files==0.5.0
11+
mkdocs-literate-nav==0.6.1
12+
mkdocs-git-committers-plugin-2==2.5.0
13+
mkdocs-git-revision-date-localized-plugin==1.3.0
14+
mkdocstrings==0.26.2
15+
mkdocstrings-python==1.12.1
16+
markdown-include==0.8.1
1717
notebook==6.4.12
18-
tqdm
18+
jupyter_contrib_nbextensions==0.7.0
19+
tqdm
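
After this change, every entry touched by the hunk carries an exact version pin except `tqdm`. The commit does not say whether that is intentional. A small check along the lines of the following sketch could flag such entries; the `unpinned` helper is hypothetical, and the sample is an excerpt of the lines shown above, not the whole file.

```python
import re


def unpinned(requirements: str) -> list[str]:
    """Return requirement names that lack an exact `==` version pin."""
    names = []
    for line in requirements.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in line:
            # Keep only the package name, dropping any other specifier.
            names.append(re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0])
    return names


sample = """\
mkdocs-autorefs==1.2.0
mkdocstrings==0.26.2
notebook==6.4.12
jupyter_contrib_nbextensions==0.7.0
tqdm
"""
print(unpinned(sample))  # ['tqdm']
```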
