datannurpy is the Python builder for datannur. It scans files and databases, extracts metadata and statistics, then generates a ready-to-use catalog bundled with the datannur app.
Key features:
- Broad format support - CSV, Excel, Parquet, Delta Lake, Iceberg, SAS, SPSS, Stata
- Database introspection - PostgreSQL, MySQL, Oracle, SQL Server, SQLite, DuckDB
- Remote and cloud storage - SFTP, S3, Azure Blob, GCS via fsspec
- Metadata extraction - Schemas, statistics, frequencies, enumerations, auto-tagging
- Incremental scans - Only rescan what changed between runs
- YAML or Python API - Declarative configuration or programmatic control
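The incremental-scan idea can be illustrated with a small, generic sketch. This is not datannurpy's actual implementation, since its change-detection mechanism is not described here. The sketch assumes a file counts as changed when its size or modification time differs from the previous run, and the names `snapshot` and `changed_files` are hypothetical.

```python
from pathlib import Path


def snapshot(folder):
    """Map each file under `folder` to a (size, mtime_ns) fingerprint."""
    root = Path(folder)
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            stat = path.stat()
            # Relative POSIX-style keys keep snapshots comparable across runs.
            result[path.relative_to(root).as_posix()] = (stat.st_size, stat.st_mtime_ns)
    return result


def changed_files(previous, current):
    """Return files that are new or whose fingerprint differs from the previous run."""
    return sorted(name for name, fp in current.items() if previous.get(name) != fp)
```

A scanner built this way would store the snapshot after each run and process only the names returned by `changed_files` on the next one.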
Install with pip:

    pip install datannurpy

Create a configuration file:

    # catalog.yml
    app_path: ./my-catalog
    open_browser: true
    add:
      - folder: ./data
        include: ["*.csv", "*.xlsx", "*.parquet"]
      - database: sqlite:///mydb.sqlite

Build the catalog:

    python -m datannurpy catalog.yml

This command scans the configured sources, generates the catalog files, and opens the datannur app.
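For scripted runs, the same steps can be driven from Python. The sketch below only writes the example configuration and assembles the documented `python -m datannurpy catalog.yml` command; it does not use datannurpy's Python API. The helper names `write_config`, `build_command`, and `run_build` are hypothetical, and nesting `include` under the `folder` entry is an assumption about the intended YAML layout.

```python
import subprocess
import sys
from pathlib import Path

# Same settings as the example catalog.yml (nesting assumed, see above).
CONFIG = """\
app_path: ./my-catalog
open_browser: true
add:
  - folder: ./data
    include: ["*.csv", "*.xlsx", "*.parquet"]
  - database: sqlite:///mydb.sqlite
"""


def write_config(path="catalog.yml"):
    """Write the example configuration to `path` and return it as a Path."""
    config_path = Path(path)
    config_path.write_text(CONFIG, encoding="utf-8")
    return config_path


def build_command(config_path):
    """Assemble the documented CLI call, using the current interpreter."""
    return [sys.executable, "-m", "datannurpy", str(config_path)]


def run_build(config_path):
    """Run the build; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(config_path), check=True)
```

Calling `run_build(write_config())` requires datannurpy to be installed in the same environment as the interpreter running the script.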
📖 Full documentation: docs.datannur.com/builder
🗂️ datannur app: github.com/datannur/datannur
🌐 Website: datannur.com
🚀 Demo: dev.datannur.com
For development documentation and contributing guidelines, see CONTRIBUTING.md.
Licensed under MIT - see LICENSE. All dependencies use MIT, Apache 2.0, or BSD-compatible licenses.