Warning
DO NOT COMMIT DATA FILES TO THIS REPOSITORY
setup assumes you are on a mac with mise and homebrew installed:
# update project tooling
mise install
# update python dependencies
uv sync
# (optional) install a sqlite browser
# brew install db-browser-for-sqlite
# setup local .env file, update DATA_ROOT according to where you put the .csv
# dump files
cp sample.env .env
put the .csv files in raw_csv/ or change .env DATA_ROOT to be the path RELATIVE TO THIS PROJECT DIRECTORY where the files are stored.
bin/load-sqlite
rerun when the .csv files change
open data/dump.db using whatever SQLite tooling you like.
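if you'd rather not install a GUI, a quick sanity check is possible with Python's built-in sqlite3 module. this is only a sketch: `list_tables` is a hypothetical helper, not something shipped in this repository, and it assumes the database lives at data/dump.db as described above.

```python
import sqlite3


def list_tables(database_filename="./data/dump.db"):
    """Return {table_name: row_count} for every table in the database.

    Hypothetical helper for illustration; not part of this repository.
    """
    conn = sqlite3.connect(database_filename)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names cannot be bound as parameters, so quote them instead.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

calling `list_tables()` from the project directory after a load should give one entry per .csv file, which makes it easy to spot a file that was skipped.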
bin/load-sqlite loops through every .csv file in the DATA_ROOT folder, generates a table name from the file basename (location.csv becomes location) and loads it into data/dump.db using pandas' default settings.
here's a simple example of what that looks like for one file at a time:
import os
import pandas as pd
import sqlite3
csv_filename = "./raw_csv/location.csv"
table_name = "location"
database_filename = "./data/dump.db"
df = pd.read_csv(csv_filename)
conn = sqlite3.connect(database_filename)
df.to_sql(table_name, conn, if_exists="replace", index=False)
conn.close()
If data/dump.db does not exist, it will be created.
If the table does not exist, it will be created. If it does exist, it will be replaced with the data present in the .csv file.
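extending the single-file example to the whole folder might look roughly like the sketch below. it is not the actual contents of bin/load-sqlite: the function name `load_all_csvs` and its parameters are made up for illustration, and it takes the data folder as an argument rather than reading DATA_ROOT from .env.

```python
import sqlite3
from pathlib import Path

import pandas as pd


def load_all_csvs(data_root="./raw_csv", database_filename="./data/dump.db"):
    """Load every .csv in data_root into its own table; return the table names.

    Hypothetical sketch of the loop described above, not the real script.
    """
    csv_paths = sorted(Path(data_root).glob("*.csv"))
    # Assumption: create the database's parent folder if it is missing.
    Path(database_filename).parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(database_filename)
    loaded = []
    try:
        for csv_path in csv_paths:
            table_name = csv_path.stem  # location.csv -> location
            df = pd.read_csv(csv_path)  # pandas defaults, as in the example
            # "replace" drops and recreates the table on every run.
            df.to_sql(table_name, conn, if_exists="replace", index=False)
            loaded.append(table_name)
    finally:
        conn.close()
    return loaded
```

because every table is replaced on each run, rerunning after a .csv changes leaves the database matching the files, with no leftover rows from the previous load.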