|
1 | 1 | # Mobility Data Specification: Metrics |
2 | 2 |
|
| 3 | +<a href="/metrics/"><img src="https://i.imgur.com/ouijHLj.png" width="120" align="right" alt="MDS Metrics Icon" border="0"></a> |
| 4 | + |
3 | 5 | The Metrics API endpoints are intended to be implemented by regulatory agencies, their third party appointed representatives, or city designated partners for requesting **historical** calculated [core metrics](core_metrics.md) and aggregations of MDS data. The Metrics API allows viewing of aggregate report data derived from some MDS endpoints that may be used for use cases like compliance, program effectiveness, and alignment on counts. The metrics [methodology](/metrics/metrics_methodology.md) definitions may be used by providers and third parties in their own calculations. |
4 | 6 |
|
5 | 7 | [Metrics Examples](examples) are available with sample implementations. |
@@ -96,15 +98,15 @@ Further scopes and requirements may be added at the discretion of the Agency, de |
96 | 98 |
|
97 | 99 | ## Data Redaction |
98 | 100 |
|
99 | | -Some combinations of dimensions, filters, time, and geography may return a small count of trips, which could increase a privacy risk of re-identification. To correct for that, Metrics does not return data below a certain count of results. This is called k-anonymity, and the threshold is set at a k-value of 10. |
| 101 | +Some combinations of dimensions, filters, time, and geography may return a small count of trips, which could increase a privacy risk of re-identification. To correct for that, Metrics does not return data below a certain count of results. This data redaction is called k-anonymity, and the threshold is set at a k-value of 10. For more explanation of this methodology, see our [Data Redaction Guidance document](https://github.com/openmobilityfoundation/mobility-data-specification/wiki/MDS-Data-Redaction). |
100 | 102 |
|
101 | | -**If the query returns less than `10` trips in a count, then that row's count value is returned as "-1".** Note "0" values are also returned as "-1" since the goal is to group low and no count values together for privacy. |
| 103 | +**If the query returns fewer than `10` trips in a count, then that row's count value is returned as "-1".** Note "0" values are also returned as "-1" since the goal is to group both low and no count values together for privacy. |
102 | 104 |
|
103 | 105 | The OMF suggests a k-value of 10 is an appropriate starting point for safe anonymization, absent analysis and a further decision from the agency. As Metrics is in [beta](#beta-feature), this value may be adjusted in future releases and/or may become dynamic to account for specific categories of use cases and users. To improve the specification and to inform future guidance, beta users are encouraged to share their feedback and questions about k-values on this [discussion thread](https://github.com/openmobilityfoundation/mobility-data-specification/discussions/622). |
104 | 106 |
|
105 | 107 | The k-value being used is always returned in the Metrics Query API [response](/metrics#response-1) to provide important context for the data consumer on the data redaction that is occurring. |
106 | 108 |
|
107 | | -Using k-anonymity with this k-value and methodology will reduce, but not necessarily eliminate the risk that an individual could be reidentified in a dataset. Higher k-values have lower re-identification risk, but may result in less complete metrics depending on the duration of time periods and size of geographic areas for which the metrics are calculated. Some use cases (such as sharing metrics with trusted parties who already have access to disaggregated trip data) may not require k-anonymization, while others (such as sharing with less trusted partners or extracts for the public) may require substantial k-anonymization. While metrics with any k-value are likely to be substantially less sensitive than disaggregated trip records, they should still be treated as potentially sensitive unless a more detailed risk analysis is performed by the hosting organization. |
| 109 | +Using k-anonymity will reduce, but not necessarily eliminate the risk that an individual could be re-identified in a dataset, and this data should still be treated as sensitive. This is just one part of good privacy protection practices, which you can read more about in our [MDS Privacy Guide for Cities](https://github.com/openmobilityfoundation/governance/blob/main/documents/OMF-MDS-Privacy-Guide-for-Cities.pdf). |
108 | 110 |
|
109 | 111 | [Top][toc] |
110 | 112 |
|
|
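For readers of this change, the redaction rule described under "Data Redaction" can be illustrated with a minimal Python sketch. It is not taken from the MDS specification or any reference implementation: the function names and the `trip_count` row field are hypothetical, and only the k-value of 10 and the `-1` sentinel come from the text above.

```python
from typing import Dict, List

K_VALUE = 10    # k-value stated in the Data Redaction section
REDACTED = -1   # sentinel returned in place of low or zero counts


def redact_count(count: int, k: int = K_VALUE) -> int:
    """Return the count unchanged if it is at least k, otherwise -1.

    Zero counts are redacted as well, so low and empty rows are
    indistinguishable to the data consumer.
    """
    return count if count >= k else REDACTED


def redact_rows(rows: List[Dict], k: int = K_VALUE) -> List[Dict]:
    """Apply redaction to a list of rows.

    Assumes each row is a dict with a hypothetical "trip_count" field;
    the input rows are not modified.
    """
    return [{**row, "trip_count": redact_count(row["trip_count"], k)} for row in rows]


if __name__ == "__main__":
    sample = [{"geography": "a", "trip_count": 0},
              {"geography": "b", "trip_count": 9},
              {"geography": "c", "trip_count": 10}]
    for row in redact_rows(sample):
        print(row)
```

A response built this way would likely also report the `k` that was applied, since the text says the k-value is always returned to the data consumer.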