DataComPy

DataComPy is a package for comparing two DataFrames (or tables) from backends such as Pandas, Spark, Polars, and Snowflake. It was originally created as something of a replacement for SAS's PROC COMPARE for Pandas DataFrames, with more functionality than pandas.DataFrame.equals(pandas.DataFrame): it prints summary statistics and lets you tune how closely values must match. Supported types include:

  • Pandas
  • Polars
  • Spark
  • Snowflake
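
To make the contrast with pandas.DataFrame.equals concrete, here is a minimal sketch of the kind of check DataComPy automates, written with pandas and NumPy only. It is not DataComPy's implementation; the frames, the column names, and the 1e-4 tolerance are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Two illustrative frames: one value differs by rounding noise, one by a real amount.
base = pd.DataFrame({"id": [1, 2, 3], "val": [10.0, 20.0, 30.0]})
other = pd.DataFrame({"id": [1, 2, 3], "val": [10.0, 20.0000001, 30.5]})

# DataFrame.equals gives a single yes/no answer with no detail about what differs.
exact_match = base.equals(other)

# Join on the key column, then compare values within an absolute tolerance.
joined = base.merge(other, on="id", suffixes=("_base", "_other"))
within_tol = np.isclose(joined["val_base"], joined["val_other"], rtol=0.0, atol=1e-4)

# Keys of the rows whose values differ by more than the tolerance.
mismatched_ids = joined.loc[~within_tol, "id"].tolist()

print(exact_match)
print(mismatched_ids)
```

The exact comparison fails on both differing rows, while the tolerance-based comparison flags only the row that differs by more than the chosen threshold.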

Important

datacompy has released v1. The v0.19.x line is no longer supported, so users should upgrade to v1. The support/0.19.x branch is archived and will receive only critical security fixes on a best-effort basis; no new features or regular maintenance will be provided. All active development targets main.

Quick Installation

pip install datacompy

or

conda install datacompy

Installing extras

To use Spark or another optional backend, install datacompy with the corresponding extra:

pip install "datacompy[spark]"
pip install "datacompy[snowflake]"

Programmatic Report Access

Every compare object exposes build_report_data(), which returns a typed ReportData object. This is useful for dashboards, JSON export, or custom rendering without relying on the string report:

import pandas as pd
from datacompy import PandasCompare

df1 = pd.DataFrame({"id": [1, 2, 3], "val": [10, 20, 30]})
df2 = pd.DataFrame({"id": [1, 2, 3], "val": [10, 99, 30]})

compare = PandasCompare(df1, df2, join_columns="id")

# Access structured data directly
data = compare.build_report_data()
print(data.row_summary.unequal_rows)        # 1
print(data.mismatch_stats.stats[0].column)  # 'val'

# Render / export: these methods live on ReportData itself
print(data.render())      # same text as compare.report()
data.save("report.html")  # HTML file
data.to_dict()            # JSON-serializable dict

See the Report API documentation for the full reference.
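
For the JSON export use case mentioned above, a small helper along these lines may be enough. This is a sketch under assumptions: it relies only on to_dict() returning a JSON-serializable dict, as stated above. The helper name export_report_json and the StubReportData class are hypothetical; the stub (and its invented dict layout) exists only so the snippet runs without datacompy installed.

```python
import json
import tempfile
from pathlib import Path


def export_report_json(report_data, path):
    """Write report_data.to_dict() to `path` as indented JSON and return the path."""
    payload = report_data.to_dict()
    out = Path(path)
    out.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return out


class StubReportData:
    """Hypothetical stand-in for ReportData; the dict layout is invented for this demo."""

    def to_dict(self):
        return {"row_summary": {"unequal_rows": 1}}


# Write the stub's dict to a temporary file and read it back to confirm the round trip.
with tempfile.TemporaryDirectory() as tmp:
    written = export_report_json(StubReportData(), Path(tmp) / "report.json")
    round_tripped = json.loads(written.read_text(encoding="utf-8"))

print(round_tripped)
```

With datacompy installed, the object returned by compare.build_report_data() would take the place of the stub.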

Contributors

We welcome and appreciate your contributions! Before we can accept any contributions, we ask that you please be sure to sign the Contributor License Agreement (CLA).

This project adheres to the Open Source Code of Conduct. By participating, you are expected to honor this code.
