Commit 6fc03d5

Merge pull request #682 from splitgraph/docs/update-readme
Rename Splitgraph to `sgr` in the README
2 parents d2147f9 + 07b4a31 commit 6fc03d5

2 files changed

Lines changed: 175 additions & 110 deletions

README.md

Lines changed: 101 additions & 48 deletions
Original file line number | Diff line number | Diff line change
@@ -1,4 +1,5 @@
1-
# Splitgraph
1+
# `sgr`
2+
23
![Build status](https://github.com/splitgraph/splitgraph/workflows/build_all/badge.svg)
34
[![Coverage Status](https://coveralls.io/repos/github/splitgraph/splitgraph/badge.svg?branch=master)](https://coveralls.io/github/splitgraph/splitgraph?branch=master)
45
[![PyPI version](https://badge.fury.io/py/splitgraph.svg)](https://badge.fury.io/py/splitgraph)
@@ -7,95 +8,144 @@
78

89
## Overview
910

10-
**Splitgraph** is a tool for building, versioning and querying reproducible datasets. It's inspired
11-
by Docker and Git, so it feels familiar. And it's powered by [PostgreSQL](https://postgresql.org), so it [works seamlessly with existing tools](https://www.splitgraph.com/connect) in the Postgres ecosystem. Use Splitgraph to package your data into self-contained **data images** that you can [share with other Splitgraph instances](https://www.splitgraph.com/docs/getting-started/decentralized-demo).
12-
13-
[**Splitgraph.com**](https://www.splitgraph.com), or **Splitgraph Cloud**, is a public Splitgraph instance where you can share and discover data. It's a Splitgraph peer powered by the **Splitgraph Core** code in this repository, adding proprietary features like a data catalog, multitenancy, and a distributed SQL proxy.
11+
**`sgr`** is the CLI for [**Splitgraph**](https://www.splitgraph.com), a
12+
serverless API for data-driven Web applications.
1413

15-
You can explore [40k+ open datasets](https://www.splitgraph.com/explore) in the catalog. You can also connect directly to the [Data Delivery Network](https://www.splitgraph.com/connect) and query any of the datasets, without installing anything.
14+
With the addition of the optional [`sgr` Engine](engine/README.md) component, `sgr`
15+
can become a stand-alone tool for building, versioning and querying reproducible
16+
datasets. We use it as the storage engine for Splitgraph. It's inspired by
17+
Docker and Git, so it feels familiar. And it's powered by
18+
[PostgreSQL](https://postgresql.org), so it works seamlessly with existing tools
19+
in the Postgres ecosystem. Use `sgr` to package your data into self-contained
20+
**Splitgraph data images** that you can
21+
[share with other `sgr` instances](https://www.splitgraph.com/docs/getting-started/decentralized-demo).
1622

17-
To install `sgr` (the command line client) or a local Splitgraph Engine, see the [Installation](#installation) section of this readme.
23+
To install the `sgr` CLI or a local `sgr` Engine, see the
24+
[Installation](#installation) section of this readme.
1825

1926
### Build and Query Versioned, Reproducible Datasets
2027

21-
[**Splitfiles**](https://www.splitgraph.com/docs/concepts/splitfiles) give you a declarative language, inspired by Dockerfiles, for expressing data transformations in ordinary SQL familiar to any researcher or business analyst. You can reference other images, or even other databases, with a simple JOIN.
28+
[**Splitfiles**](https://www.splitgraph.com/docs/concepts/splitfiles) give you a
29+
declarative language, inspired by Dockerfiles, for expressing data
30+
transformations in ordinary SQL familiar to any researcher or business analyst.
31+
You can reference other images, or even other databases, with a simple JOIN.
2232

2333
![](pics/splitfile.png)
2434

25-
When you build data with Splitfiles, you get provenance tracking of the resulting data: it's possible to find out what sources went into every dataset and know when to rebuild it if the sources ever change. You can easily integrate Splitgraph into your existing CI pipelines, to keep your data up-to-date and stay on top of changes to upstream sources.
26-
27-
Splitgraph images are also version-controlled, and you can manipulate them with Git-like operations through a CLI. You can check out any image into a PostgreSQL schema and interact with it using any PostgreSQL client. Splitgraph will capture your changes to the data, and then you can commit them as delta-compressed changesets that you can package into new images.
28-
29-
Splitgraph supports PostgreSQL [foreign data wrappers](https://wiki.postgresql.org/wiki/Foreign_data_wrappers). We call this feature [mounting](https://www.splitgraph.com/docs/concepts/mounting). With mounting, you can query other databases (like PostgreSQL/MongoDB/MySQL) or open data providers (like [Socrata](https://www.splitgraph.com/docs/ingesting-data/socrata)) from your Splitgraph instance with plain SQL. You can even snapshot the results or use them in Splitfiles.
30-
31-
### Why Splitgraph?
32-
33-
Splitgraph isn't opinionated and doesn't break existing abstractions. To any existing PostgreSQL application, Splitgraph images are just another database. We have carefully designed Splitgraph to not break the abstraction of a PostgreSQL table and wire protocol, because doing otherwise would mean throwing away a vast existing ecosystem of applications, users, libraries and extensions. This means that a lot of tools that work with PostgreSQL work with Splitgraph out of the box.
35+
When you build data images with Splitfiles, you get provenance tracking of the
36+
resulting data: it's possible to find out what sources went into every dataset
37+
and know when to rebuild it if the sources ever change. You can easily integrate
38+
`sgr` into your existing CI pipelines to keep your data up-to-date and stay on top
39+
of changes to upstream sources.
40+
41+
Splitgraph images are also version-controlled, and you can manipulate them with
42+
Git-like operations through a CLI. You can check out any image into a PostgreSQL
43+
schema and interact with it using any PostgreSQL client. `sgr` will capture your
44+
changes to the data, and then you can commit them as delta-compressed changesets
45+
that you can package into new images.
46+
47+
`sgr` supports PostgreSQL
48+
[foreign data wrappers](https://wiki.postgresql.org/wiki/Foreign_data_wrappers).
49+
We call this feature
50+
[mounting](https://www.splitgraph.com/docs/concepts/mounting). With mounting,
51+
you can query other databases (like PostgreSQL/MongoDB/MySQL) or open data
52+
providers (like
53+
[Socrata](https://www.splitgraph.com/docs/ingesting-data/socrata)) from your
54+
`sgr` instance with plain SQL. You can even snapshot the results or use them in
55+
Splitfiles.
3456

3557
![](pics/splitfiles.gif)
3658

3759
## Components
3860

39-
The code in this repository, known as **Splitgraph Core**, contains:
61+
The code in this repository contains:
4062

41-
- **[`sgr` command line client](https://www.splitgraph.com/docs/architecture/sgr-client)**: `sgr` is the main command line tool used to work with Splitgraph "images" (data snapshots). Use it to ingest data, work with splitfiles, and push data to Splitgraph.com.
42-
- **[Splitgraph Engine](engine/README.md)**: a [Docker image](https://hub.docker.com/r/splitgraph/engine) of the latest Postgres with Splitgraph and other required extensions pre-installed.
43-
- **[Splitgraph Python library](https://www.splitgraph.com/docs/python-api/splitgraph.core)**: All Splitgraph functionality is available in the Python API, offering first-class support for data science workflows including Jupyter notebooks and Pandas dataframes.
63+
- **[`sgr` CLI](https://www.splitgraph.com/docs/architecture/sgr-client)**:
64+
`sgr` is the main command line tool used to work with Splitgraph "images"
65+
(data snapshots). Use it to ingest data, work with Splitfiles, and push data
66+
to Splitgraph.
67+
- **[`sgr` Engine](engine/README.md)**: a
68+
[Docker image](https://hub.docker.com/r/splitgraph/engine) of the latest
69+
Postgres with `sgr` and other required extensions pre-installed.
70+
- **[Splitgraph Python library](https://www.splitgraph.com/docs/python-api/splitgraph.core)**:
71+
All `sgr` functionality is available in the Python API, offering first-class
72+
support for data science workflows including Jupyter notebooks and Pandas
73+
dataframes.
4474

4575
## Docs
4676

47-
Documentation is available at https://www.splitgraph.com/docs, specifically:
48-
49-
- [Installation](https://www.splitgraph.com/docs/getting-started/installation)
50-
- [FAQ](https://www.splitgraph.com/docs/getting-started/frequently-asked-questions)
77+
- [`sgr` documentation](https://www.splitgraph.com/docs/sgr-cli/introduction)
78+
- [Advanced `sgr` documentation](https://www.splitgraph.com/docs/sgr-advanced/getting-started/introduction)
79+
- [`sgr` command reference](https://www.splitgraph.com/docs/sgr/image-management-creation/checkout_)
80+
- [`splitgraph` package reference](https://www.splitgraph.com/docs/python-api/modules)
5181

5282
We also recommend reading our Blog, including some of our favorite posts:
5383

54-
- [Supercharging `dbt` with Splitgraph: versioning, sharing, cross-DB joins](https://www.splitgraph.com/blog/dbt)
84+
- [Supercharging `dbt` with `sgr`: versioning, sharing, cross-DB joins](https://www.splitgraph.com/blog/dbt)
5585
- [Querying 40,000+ datasets with SQL](https://www.splitgraph.com/blog/40k-sql-datasets)
5686
- [Foreign data wrappers: PostgreSQL's secret weapon?](https://www.splitgraph.com/blog/foreign-data-wrappers)
5787

5888
## Installation
5989

6090
Prerequisites:
6191

62-
- Docker is required to run the Splitgraph Engine. `sgr` must have access to Docker. You either need to [install Docker locally](https://docs.docker.com/install/) or have access to a remote Docker socket.
92+
- Docker is required to run the `sgr` Engine. `sgr` must have access to Docker.
93+
You either need to [install Docker locally](https://docs.docker.com/install/)
94+
or have access to a remote Docker socket.
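The Docker prerequisite above can be checked before installing. The following is a minimal sketch, not part of `sgr` itself: `have_cmd` and `docker_status` are hypothetical helper names, and the check assumes that `docker info` succeeding is a good enough sign that the daemon (local or remote, for example via `DOCKER_HOST`) is reachable.

```bash
#!/usr/bin/env bash
# Preflight sketch: is Docker installed and reachable from this shell?
# have_cmd and docker_status are illustrative helpers, not sgr commands.

# Returns 0 if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Prints one word describing whether Docker can be used:
#   missing      the docker client is not installed
#   ready        the client can talk to a Docker daemon
#   unreachable  the client exists but no daemon answered
docker_status() {
  if ! have_cmd docker; then
    echo "missing"
  elif docker info >/dev/null 2>&1; then
    echo "ready"
  else
    echo "unreachable"
  fi
}

echo "Docker: $(docker_status)"
```

If the status is anything other than `ready`, the engine setup step is likely to fail, so it is worth fixing Docker access first.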
6395

64-
For Linux and OSX, once Docker is running, install Splitgraph with a single script:
96+
You can get the `sgr` single binary from
97+
[the releases page](https://github.com/splitgraph/splitgraph/releases).
98+
Optionally, you can run
99+
[`sgr engine add`](https://www.splitgraph.com/docs/sgr/engine-management/engine-add)
100+
to create an engine.
65101

66-
```
102+
For Linux and OSX, once Docker is running, install `sgr` with a single script:
103+
104+
```bash
67105
$ bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
68106
```
69107

70-
This will download the `sgr` binary and set up the Splitgraph Engine Docker container.
71-
72-
Alternatively, you can get the `sgr` single binary from [the releases page](https://github.com/splitgraph/splitgraph/releases) and run [`sgr engine add`](https://www.splitgraph.com/docs/sgr/engine-management/engine-add) to create an engine.
108+
This will download the `sgr` binary and set up the `sgr` Engine Docker
109+
container.
73110

74-
See the [installation guide](https://www.splitgraph.com/docs/getting-started/installation) for more installation methods.
111+
See the
112+
[installation guide](https://www.splitgraph.com/docs/sgr-cli/installation) for
113+
more installation methods.
75114

76115
## Quick start guide
77116

78-
You can follow the [quick start guide](https://www.splitgraph.com/docs/getting-started/five-minute-demo) that will guide you through the basics of using Splitgraph with public and private data.
117+
You can follow the
118+
[quick start guide](https://www.splitgraph.com/docs/sgr-advanced/getting-started/five-minute-demo)
119+
that will guide you through the basics of using `sgr` with Splitgraph or
120+
standalone.
79121

80-
Alternatively, Splitgraph comes with plenty of [examples](examples) to get you started.
122+
Alternatively, `sgr` comes with plenty of [examples](examples) to get you
123+
started.
81124

82-
If you're stuck or have any questions, check out the [documentation](https://www.splitgraph.com/docs/) or join our [Discord channel](https://discord.gg/4Qe2fYA)!
125+
If you're stuck or have any questions, check out the
126+
[documentation](https://www.splitgraph.com/docs/sgr-advanced/getting-started/introduction)
127+
or join our [Discord channel](https://discord.gg/4Qe2fYA)!
83128

84129
## Contributing
85130

86131
### Setting up a development environment
87132

88-
* Splitgraph requires Python 3.6 or later.
89-
* Install [Poetry](https://github.com/python-poetry/poetry): `curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python` to manage dependencies
90-
* Install pre-commit hooks (we use [Black](https://github.com/psf/black) to format code)
91-
* `git clone --recurse-submodules https://github.com/splitgraph/splitgraph.git`
92-
* `poetry install`
93-
* To build the [engine](https://www.splitgraph.com/docs/architecture/splitgraph-engine) Docker image: `cd engine && make`
133+
- `sgr` requires Python 3.7 or later.
134+
- Install [Poetry](https://github.com/python-poetry/poetry):
135+
`curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python`
136+
to manage dependencies
137+
- Install pre-commit hooks (we use [Black](https://github.com/psf/black) to
138+
format code)
139+
- `git clone --recurse-submodules https://github.com/splitgraph/splitgraph.git`
140+
- `poetry install`
141+
- To build the
142+
[engine](https://www.splitgraph.com/docs/architecture/splitgraph-engine)
143+
Docker image: `cd engine && make`
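Before running the setup steps above, it can help to confirm that the local interpreter meets the Python 3.7 minimum. This is a rough sketch under stated assumptions: `version_ge` is a hypothetical helper, the interpreter is assumed to be available as `python3`, and only the major and minor version numbers are compared.

```bash
#!/usr/bin/env bash
# Sketch: check that python3 is at least the version the README asks for.
# version_ge is an illustrative helper, not part of sgr or Poetry.

# Returns 0 if dotted version $1 is >= $2, comparing major and minor only.
version_ge() {
  local IFS=.
  local -a have=($1) want=($2)
  local i h w
  for i in 0 1; do
    h=${have[i]:-0}
    w=${want[i]:-0}
    # 10# forces base 10 so a component like "08" is not read as octal.
    if (( 10#$h > 10#$w )); then return 0; fi
    if (( 10#$h < 10#$w )); then return 1; fi
  done
  return 0
}

required="3.7"
if command -v python3 >/dev/null 2>&1; then
  found=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
  if version_ge "$found" "$required"; then
    echo "Python $found is new enough"
  else
    echo "Python $found is too old; $required or later is required"
  fi
else
  echo "python3 not found on PATH"
fi
```

Comparing the components numerically matters here: a plain string comparison would rank `3.10` below `3.7`.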
94144

95145
### Running tests
96146

97-
The test suite requires [docker-compose](https://github.com/docker/compose). You will also
98-
need to add these lines to your `/etc/hosts` or equivalent:
147+
The test suite requires [docker-compose](https://github.com/docker/compose). You
148+
will also need to add these lines to your `/etc/hosts` or equivalent:
99149

100150
```
101151
127.0.0.1 local_engine
@@ -110,20 +160,23 @@ docker-compose -f test/architecture/docker-compose.core.yml up -d
110160
poetry run pytest -m "not mounting and not example"
111161
```
112162

113-
To run the test suite related to "mounting" and importing data from other databases
114-
(PostgreSQL, MySQL, Mongo), do
163+
To run the test suite related to "mounting" and importing data from other
164+
databases (PostgreSQL, MySQL, Mongo), do
115165

116166
```
117167
docker-compose -f test/architecture/docker-compose.core.yml -f test/architecture/docker-compose.mounting.yml up -d
118168
poetry run pytest -m mounting
119169
```
120170

121-
Finally, to test the [example projects](https://github.com/splitgraph/splitgraph/tree/master/examples), do
171+
Finally, to test the
172+
[example projects](https://github.com/splitgraph/splitgraph/tree/master/examples),
173+
do
122174

123175
```
124176
# Example projects spin up their own engines
125177
docker-compose -f test/architecture/docker-compose.core.yml down -v
126178
poetry run pytest -m example
127179
```
128180

129-
All of these tests run in [CI](https://github.com/splitgraph/splitgraph/actions).
181+
All of these tests run in
182+
[CI](https://github.com/splitgraph/splitgraph/actions).
