# abachman-dsac/npd-data-quality


> [!WARNING]
> DO NOT COMMIT DATA FILES TO THIS REPOSITORY

## setup

setup assumes you are on a Mac with `mise` and Homebrew installed:

```sh
# update project tooling
mise install

# update python dependencies
uv sync

# (optional) install a sqlite browser
# brew install db-browser-for-sqlite

# set up the local .env file, then update DATA_ROOT according to where you
# put the .csv dump files
cp sample.env .env
```

put the `.csv` files in `raw_csv/`, or change `DATA_ROOT` in `.env` to the path, relative to this project directory, where the files are stored.
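for reference, resolving a relative `DATA_ROOT` might look like this (a sketch assuming a default of `raw_csv`; the actual script may differ):

```python
import os
from pathlib import Path

# DATA_ROOT comes from the environment (populated from .env) and is
# interpreted relative to the project directory; "raw_csv" is the
# default suggested above
data_root = Path.cwd() / os.environ.get("DATA_ROOT", "raw_csv")
print(data_root)
```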

## usage

```sh
bin/load-sqlite
```

rerun it whenever the `.csv` files change.

open `data/dump.db` using whatever SQLite tooling you like.
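if you'd rather not install a browser, a quick way to check what got loaded is to list the tables from Python (a sketch; `list_tables` is a hypothetical helper, not part of this repo):

```python
import sqlite3


def list_tables(database_filename):
    """Return the names of all tables in a SQLite database file."""
    conn = sqlite3.connect(database_filename)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

e.g. `list_tables("data/dump.db")` should show one table per loaded `.csv` file.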

## what it does

`bin/load-sqlite` loops through every `.csv` file in the `DATA_ROOT` folder, generates a table name from the file's basename (`location.csv` becomes `location`), and loads it into `data/dump.db` using Pandas' default settings.

here's a simple example of what that looks like for a single file:

```python
import sqlite3

import pandas as pd

csv_filename = "./raw_csv/location.csv"
table_name = "location"
database_filename = "./data/dump.db"

# read the CSV using Pandas' default parsing and type inference
df = pd.read_csv(csv_filename)

# write the DataFrame to SQLite, replacing the table if it already exists
conn = sqlite3.connect(database_filename)
df.to_sql(table_name, conn, if_exists="replace", index=False)
conn.close()
```

If `data/dump.db` does not exist, it will be created.

If the table does not exist, it will be created. If it does exist, it will be replaced with the data present in the `.csv` file.
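the full loop over `DATA_ROOT` can be sketched as a small function (an illustration of the behavior described above; `load_all_csvs` is a hypothetical name, and the actual script may differ in details):

```python
import sqlite3
from pathlib import Path

import pandas as pd


def load_all_csvs(data_root, database_filename):
    """Load every .csv under data_root into SQLite, one table per file."""
    conn = sqlite3.connect(database_filename)
    try:
        for csv_path in sorted(Path(data_root).glob("*.csv")):
            table_name = csv_path.stem  # location.csv -> location
            df = pd.read_csv(csv_path)
            df.to_sql(table_name, conn, if_exists="replace", index=False)
    finally:
        conn.close()
```

e.g. `load_all_csvs(os.environ.get("DATA_ROOT", "raw_csv"), "./data/dump.db")` mirrors a run of `bin/load-sqlite`.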

## About

Exploring NPD project data