Commit 5c1e33d

clarify cps
1 parent 09e5e4c commit 5c1e33d

4 files changed

Lines changed: 31 additions & 15 deletions

paper/bibliography/references.bib

Lines changed: 21 additions & 3 deletions
@@ -123,12 +123,30 @@ @article{auerbach2018
 year = {2018}
 }

-@techreport{bryant2022,
+@techreport{bryant2023a,
+title = {General Description Booklet for the 2015 Public Use Tax File},
+author = {Bryant, Victoria},
+institution = {Statistics of Income Division, Internal Revenue Service},
+year = {2023},
+month = {February},
+type = {Technical Documentation},
+url = {https://drive.google.com/file/d/1WoTU70GEjYMO0KHsHvTTH0NwCc-kN5cE/view}
+}
+
+@techreport{bryant2023b,
 title = {General Description Booklet for the 2015 Public Use Tax File Demographic File},
 author = {Bryant, Victoria},
 institution = {Statistics of Income Division, Internal Revenue Service},
-year = {2022},
-month = {September},
+year = {2023},
+month = {February},
 type = {Technical Documentation},
 url = {https://drive.google.com/file/d/1WoTU70GEjYMO0KHsHvTTH0NwCc-kN5cE/view}
 }
+
+@techreport{census2024,
+title = {Current Population Survey, 2024 Annual Social and Economic (ASEC) Supplement},
+author = {{U.S. Census Bureau}},
+institution = {U.S. Census Bureau},
+year = {2024},
+url = {https://www2.census.gov/programs-surveys/cps/datasets/2024/march/asec2024_ddl_pub_full.pdf}
+}

paper/main.pdf

391 Bytes
Binary file not shown.

paper/sections/data.tex

Lines changed: 8 additions & 10 deletions
@@ -2,28 +2,27 @@ \section{Data}\label{sec:data}

 \subsection{Current Population Survey}

-The Current Population Survey Annual Social and Economic Supplement (CPS ASEC) provides comprehensive demographic and economic information for a nationally representative sample of U.S. households. For tax year 2024, our base dataset contains approximately 150,000 households representing the U.S. civilian non-institutional population.
+The Census Bureau administers the Current Population Survey Annual Social and Economic Supplement (CPS ASEC, or hereafter the CPS) each March. In March 2024, they surveyed 89,473 households representing the U.S. civilian non-institutional population about their activities in the 2023 calendar year.

 The CPS's key strengths include:
 \begin{itemize}
 \item Rich demographic detail including age, sex, race, ethnicity, and education
 \item Complete household relationship matrices
 \item Program participation indicators
-\item State and sub-state geographic identifiers
-\item Monthly employment and labor force status
+\item State identifiers, and partial county identifiers
 \end{itemize}

 However, the CPS has known limitations for tax modeling:
 \begin{itemize}
-\item Underreporting of income, particularly at the top of the distribution
+\item Underreporting of income, particularly at the top of the distribution due to top-coding
 \item Limited tax-relevant information (e.g., itemized deductions)
 \item No direct observation of tax units within households
 \item Imprecise measurement of certain income types (e.g., capital gains)
 \end{itemize}

 \subsection{IRS Public Use File}

-The Internal Revenue Service Public Use File (PUF) is a national sample of individual income tax returns, representing the 151.2 million Form 1040, Form 1040A, and Form 1040EZ Federal Individual Income Tax Returns filed for Tax Year 2015. The file contains 119,675 records sampled at varying rates across strata, with 0.07 percent sampling for strata 7 through 13 \cite{bryant2022}. The data are extensively transformed to protect taxpayer privacy while preserving statistical properties.
+The Internal Revenue Service Public Use File (PUF) is a national sample of individual income tax returns, representing the 151.2 million Form 1040, Form 1040A, and Form 1040EZ Federal Individual Income Tax Returns filed for Tax Year 2015. The file contains 119,675 records sampled at varying rates across strata, with 0.07 percent sampling for strata 7 through 13 \cite{bryant2023b}. The data are extensively transformed to protect taxpayer privacy while preserving statistical properties.

 The Public Use Tax Demographic File supplements the PUF with:
 \begin{itemize}
@@ -53,14 +52,15 @@ \subsection{IRS Public Use File}
 \begin{itemize}
 \item Limited demographic information
 \item No household structure beyond the tax unit
-\item Geographic detail limited to state
+\item No geographic information such as state
 \item No program participation information
 \item Privacy protections that mask extreme values
+\item Lag; the latest version as of November 2024 is for the 2015 tax year
 \end{itemize}

 \subsection{External Validation Sources}

-We validate our enhanced dataset against several external sources:
+We validate our enhanced dataset against 570 targets from several external sources:

 \subsubsection{IRS Statistics of Income}

@@ -81,12 +81,11 @@ \subsubsection{CPS ASEC Public Tables}
 \item Age distribution by state
 \item Household size distribution
 \item Program participation rates
-\item Employment status
 \end{itemize}

 \subsubsection{Administrative Program Totals}

-We incorporate official totals from various agencies:
+We incorporate official totals from various agencies, including but not limited to:
 \begin{itemize}
 \item Social Security Administration beneficiary counts and benefit amounts
 \item SNAP participation and benefits from USDA
@@ -98,7 +97,6 @@ \subsection{Variable Harmonization}

 A crucial preparatory step is harmonizing variables across datasets. We develop a detailed crosswalk between CPS and PUF variables, accounting for definitional differences. Key considerations include:
 \begin{itemize}
-\item Income timing (calendar year vs. tax year)
 \item Income classification (e.g., business vs. wage income)
 \item Geographic definitions
 \item Family relationship categories

paper/sections/methodology/overview.tex

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 \section{Methodology}\label{sec:methodology}

-Following \cite{bryant2022}, our procedure enhances the Current Population Survey (CPS) with tax information from the Public Use File (PUF) through four key steps:
+Following \cite{bryant2023a}, our procedure enhances the Current Population Survey (CPS) with tax information from the Public Use File (PUF) through four key steps:
 \begin{enumerate}
 \item Project both CPS and PUF data to the target year
 \item Transfer tax variable distributions from PUF to CPS records
@@ -26,7 +26,7 @@ \subsection{Data Projection}

 \subsection{Demographic Variable Construction}

-Following \cite{bryant2022}, we construct several key demographic variables:
+Following \cite{bryant2023b}, we construct several key demographic variables:

 \subsubsection{Dependent Ages}
 We create three dependent age variables (AGEDP1/2/3) capturing:
