Commit 3c841dc

authored
Merge pull request #36133 from dimitri-furman/dfurman/ordered-columnstore
Update ordered columnstore
2 parents 4585f63 + c536926 commit 3c841dc

1 file changed

Lines changed: 116 additions & 108 deletions

docs/relational-databases/indexes/ordered-columnstore-indexes.md

@@ -1,13 +1,13 @@
11
---
2-
title: "Performance Tuning With Ordered Columnstore Indexes"
3-
description: "Learn more about how ordered columnstore indexes can benefit your query performance."
2+
title: "Performance Tuning with Ordered Columnstore Indexes"
3+
description: "Learn more about how ordered columnstore indexes can benefit query performance."
44
author: WilliamDAssafMSFT
55
ms.author: wiassaf
66
ms.reviewer: nibruno; xiaoyul, randolphwest, dfurman
7-
ms.date: 04/14/2025
7+
ms.date: 12/29/2025
88
ms.service: sql
99
ms.subservice: performance
10-
ms.topic: conceptual
10+
ms.topic: article
1111
ms.custom:
1212
- ignite-2025
1313
monikerRange: "=azuresqldb-current || >=sql-server-ver16 || >=sql-server-linux-ver16 || =azuresqldb-mi-current || =fabric-sqldb"
@@ -17,98 +17,66 @@ monikerRange: "=azuresqldb-current || >=sql-server-ver16 || >=sql-server-linux-v
1717

1818
[!INCLUDE [sqlserver2022-asdb-asmi-fabricsqldb](../../includes/applies-to-version/sqlserver2022-asdb-asmi-fabricsqldb.md)]
1919

20-
By enabling efficient segment elimination, ordered columnstore indexes provide faster performance by skipping large amounts of ordered data that don't match the query predicate. Loading data into an ordered columnstore index and keeping it ordered via index rebuilds can take longer than in a non-ordered index because of the data sorting operation, however with ordered columnstore indexes queries can run faster afterwards.
20+
Ordered columnstore indexes can provide faster performance by skipping large amounts of ordered data that don't match the query predicate. Loading data into an ordered columnstore index and maintaining order through index rebuilds take longer than with a non-ordered index, but queries can run faster with an ordered columnstore index.
2121

22-
When users query a columnstore table, the optimizer checks the minimum and maximum values stored in each segment. Segments that are outside the bounds of the query predicate aren't read from disk to memory. A query can finish faster if the number of segments to read and their total size are smaller.
22+
When a query reads a columnstore index, the [!INCLUDE [ssDE](../../includes/ssde-md.md)] checks the minimum and maximum values stored in each column segment. It eliminates segments that fall outside the bounds of the query predicate. In other words, it skips these segments when reading data from disk or memory. A query finishes faster if the number of segments to read and their total size are significantly smaller.
2323

24-
For ordered columnstore index availability, see [Ordered columnstore index availability](columnstore-indexes-overview.md#ordered-columnstore-index-availability).
24+
For ordered columnstore index availability in various SQL platforms and SQL Server versions, see [Ordered columnstore index availability](columnstore-indexes-overview.md#ordered-columnstore-index-availability).
2525

2626
For more information about recently added features for columnstore indexes, see [What's new in columnstore indexes](columnstore-indexes-what-s-new.md).
2727

2828
## Ordered vs. non-ordered columnstore index
2929

30-
In a columnstore index, data in each column of each rowgroup is compressed into a separate segment. Each segment contains metadata describing its minimum and maximum values, so segments that are outside the bounds of the query predicate aren't read from disk during query execution.
30+
In a columnstore index, the data in each column of each rowgroup is compressed into a separate segment. Each segment contains metadata describing its minimum and maximum values, so the query execution process can skip segments that fall outside the bounds of the query predicate.
3131

32-
When a columnstore index is not ordered, the index builder doesn't sort the data before compressing it into segments. That means that segments with overlapping value ranges can occur, causing queries to read more segments from disk and take longer to finish.
32+
When a columnstore index isn't ordered, the index builder doesn't sort the data before compressing it into segments. That means that segments with overlapping value ranges can occur, causing queries to read more segments to obtain the required data. As a result, queries can take longer to finish.
3333

34-
When you create an ordered columnstore index, the [!INCLUDE [ssDE](../../includes/ssde-md.md)] sorts the existing data by the order keys you specify before the index builder compresses them into segments. With sorted data, segment overlapping is reduced or eliminated, allowing queries to have a more efficient segment elimination and thus faster performance because there are fewer segments to read from disk.
34+
When you create an ordered columnstore index, the [!INCLUDE [ssDE](../../includes/ssde-md.md)] sorts the existing data by the order keys you specify before the index builder compresses it into segments. With sorted data, segment overlap is reduced or eliminated. Segment elimination becomes more efficient and queries run faster because there are fewer segments and less data to read.
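
To see the per-segment metadata that segment elimination relies on, you can query `sys.column_store_segments`. The following is a minimal sketch, not a definitive diagnostic. `dbo.Table1` and `Column1` are placeholder names, and the join on `column_id` assumes a clustered columnstore index, where segment column IDs line up with table column IDs. The `min_data_id` and `max_data_id` values are data IDs as stored in the segment metadata. When the ranges of two segments overlap, a predicate that falls in the shared range can't eliminate either segment.

```sql
-- Placeholder names: replace dbo.Table1 and Column1 with your own objects.
SELECT p.partition_number,
       cls.segment_id,
       cls.row_count,
       cls.min_data_id,
       cls.max_data_id
FROM sys.column_store_segments AS cls
     INNER JOIN sys.partitions AS p
         ON cls.partition_id = p.partition_id
     INNER JOIN sys.columns AS c
         ON c.object_id = p.object_id
        AND c.column_id = cls.column_id
WHERE p.object_id = OBJECT_ID(N'dbo.Table1')
      AND c.name = N'Column1'
-- Sorting by the lower bound places overlapping ranges next to each other.
ORDER BY p.partition_number, cls.min_data_id;
```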
3535

36-
Depending on the available memory, the data size, the degree of parallelism, the index type (clustered vs. nonclustered), and the type of index build (offline vs. online), the sort for ordered columnstore indexes might be full (no segment overlap) or partial (some segment overlap). For example, partial sort occurs when the available memory is insufficient for a full sort. Queries using an ordered columnstore index often execute faster than with a non-ordered index even if the ordered index was built using a partial sort.
36+
## Reduce segment overlap
3737

38-
Full sort is provided for ordered clustered columnstore indexes created or rebuilt with both `ONLINE = ON` and `MAXDOP = 1` options. In this case, the sort is not limited by the available memory because it uses the `tempdb` database to spill the data that doesn't fit in memory. This can make the index build process slower due to the additional `tempdb` I/O. However, with an online index rebuild, queries can continue using the existing index while the new ordered index is being rebuilt.
38+
When you build an ordered columnstore index, the [!INCLUDE [ssDE](../../includes/ssde-md.md)] sorts the data on a best-effort basis. Depending on the available memory, the data size, the degree of parallelism, the index type (clustered vs. nonclustered), and the type of index build (offline vs. online), the sort for ordered columnstore indexes might be full with no segment overlap, or partial with some segment overlap.
3939

40-
Full sort might also be provided for ordered clustered and nonclustered columnstore indexes created or rebuilt with both `ONLINE = OFF` and `MAXDOP = 1` options if the amount of data to be sorted is sufficiently small to fully fit in available memory.
40+
The following table describes the resulting sort type when you create or rebuild an ordered columnstore index, depending on index build options.
4141

42-
In all other cases, the sort in ordered columnstore indexes is partial.
42+
| Prerequisites | Sort type |
43+
| --- | --- |
44+
| `ONLINE = ON` and `MAXDOP = 1` | Full |
45+
| `ONLINE = OFF`, `MAXDOP = 1`, and the data to sort fully fits in the query workspace memory | Full |
46+
| All other cases | Partial |
4347

44-
> [!NOTE]
45-
> Currently, ordered columnstore indexes can be created or rebuilt online only in [!INCLUDE [ssazure-sqldb](../../includes/ssazure-sqldb.md)], in [!INCLUDE [ssazure-sqlmi-autd](../../includes/ssazure-sqlmi-autd.md)], and in [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)].
48+
In the first case, when both `ONLINE = ON` and `MAXDOP = 1` are specified, the sort isn't limited by the query workspace memory because it uses the `tempdb` database to spill the data that doesn't fit in memory. This approach can make the index build process slower due to the additional `tempdb` I/O. However, because the index build is performed online, queries can continue using the existing index while the new ordered index is being built.
4649

47-
To check the segment ranges for a column and determine if there is any segment overlap, use the following query, substituting placeholders with your schema, table, and column names:
50+
Similarly, with an offline rebuild of a partitioned columnstore index, the rebuild is done one partition at a time. Other partitions remain available for queries.
4851

49-
```sql
50-
SELECT OBJECT_SCHEMA_NAME(o.object_id) AS schema_name,
51-
o.name AS table_name,
52-
cols.name AS column_name,
53-
pnp.index_id,
54-
cls.row_count,
55-
pnp.data_compression_desc,
56-
cls.segment_id,
57-
cls.column_id,
58-
cls.min_data_id,
59-
cls.max_data_id
60-
FROM sys.partitions AS pnp
61-
INNER JOIN sys.tables AS t
62-
ON pnp.object_id = t.object_id
63-
INNER JOIN sys.objects AS o
64-
ON t.object_id = o.object_id
65-
INNER JOIN sys.column_store_segments AS cls
66-
ON pnp.partition_id = cls.partition_id
67-
INNER JOIN sys.columns AS cols
68-
ON o.object_id = cols.object_id
69-
AND
70-
cls.column_id = cols.column_id
71-
WHERE OBJECT_SCHEMA_NAME(o.object_id) = '<Schema Name>'
72-
AND
73-
o.name = '<Table Name>'
74-
AND
75-
cols.name = '<Column Name>'
76-
ORDER BY o.name, pnp.index_id, cls.min_data_id;
77-
```
78-
79-
For example, the output from this query for a fully sorted columnstore index might look as follows. Note that there is no overlap in the `min_data_id` and `max_data_id` columns for different segments.
80-
81-
```output
82-
schema_name table_name column_name index_id row_count data_compression_desc segment_id column_id min_data_id max_data_id
83-
----------- ---------- ----------- -------- --------- --------------------- ---------- --------- ----------- -----------
84-
dbo Table1 Column1 1 479779 COLUMNSTORE 0 1 -17 1469515
85-
dbo Table1 Column1 1 887658 COLUMNSTORE 1 1 1469516 2188146
86-
dbo Table1 Column1 1 930144 COLUMNSTORE 2 1 2188147 11072928
87-
```
52+
When `MAXDOP` is greater than 1, each thread used for ordered columnstore index build works on a subset of data and sorts it locally. There's no global sorting across data sorted by different threads. Using parallel threads can reduce the time to create the index, but it results in more overlapping segments than when using a single thread.
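
If an online build isn't available, a single-threaded offline build still avoids the per-thread local sorts described above. The following sketch reuses the `OCCI` index and `dbo.Table1` names from the examples in this article as placeholders, and assumes that the index already exists. Because `ONLINE` isn't specified, the build is offline, so the sort is full only if the data to sort fits in the query workspace memory.

```sql
-- Placeholder names: OCCI, dbo.Table1, and Column1.
-- MAXDOP = 1 uses a single thread, so there are no per-thread sorted subsets.
CREATE CLUSTERED COLUMNSTORE INDEX OCCI
ON dbo.Table1
ORDER(Column1)
WITH (DROP_EXISTING = ON, MAXDOP = 1);
```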
8853

89-
> [!NOTE]
90-
> In an ordered columnstore index, the new data resulting from the same batch of DML or data loading operations is sorted within that batch only. There's no global sorting that includes existing data in the table.
54+
> [!TIP]
55+
> Even if the sort in an ordered columnstore index is partial, segments can still be eliminated (skipped). A full sort isn't required to gain query performance benefits if a partial sort avoids many segment overlaps.
9156
>
92-
> To sort data in the index after inserting new data or updating existing data, rebuild the index.
57+
> To find the number of non-overlapping segments in an ordered columnstore index, see the [Determine the sort quality for an ordered columnstore index](#determine-the-sort-quality-for-an-ordered-columnstore-index) example.
9358
94-
For an offline rebuild of a partitioned columnstore index, rebuild is done one partition at a time. Data in the partition that is being rebuilt is unavailable until the rebuild is complete for that partition.
59+
You can create or rebuild ordered columnstore indexes online only in [!INCLUDE [ssazure-sqldb](../../includes/ssazure-sqldb.md)], in [!INCLUDE [ssazure-sqlmi-autd](../../includes/ssazure-sqlmi-autd.md)], and starting with [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)]. In SQL Server, online index operations aren't available in all editions. For more information, see [Editions and supported features of SQL Server 2025](../../sql-server/editions-and-components-of-sql-server-2025.md) and [Perform index operations online](perform-index-operations-online.md).
9560

96-
Data remains available during an online rebuild. For more information, see [Perform index operations online](perform-index-operations-online.md).
61+
### Add new data or update existing data
62+
63+
The new data resulting from a DML batch or a bulk load operation on an ordered columnstore index is sorted within that batch only. There's no global sorting that includes existing data in the table. To reduce segment overlaps after inserting the new data or updating existing data, rebuild the index.
9764

9865
## Query performance
9966

100-
The performance gain from an ordered columnstore index depends on the query patterns, the size of data, how well the data is sorted, the physical structure of segments, and the compute resources available for query execution.
67+
The performance gain from an ordered columnstore index depends on the query patterns, the size of data, the sort quality, and the compute resources available for query execution.
10168

102-
Queries with the following patterns typically run faster with ordered columnstore indexes.
69+
Queries with the following patterns typically run faster with ordered columnstore indexes:
10370

10471
- Queries that have equality, inequality, or range predicates.
10572
- Queries where the predicate columns and the order columns of the clustered columnstore index (CCI) are the same.
10673

10774
In this example, table `T1` has a clustered columnstore index ordered in the sequence of `Col_C`, `Col_B`, and `Col_A`.
10875

10976
```sql
110-
CREATE CLUSTERED COLUMNSTORE INDEX MyOrderedCCI ON T1
111-
ORDER (Col_C, Col_B, Col_A);
77+
CREATE CLUSTERED COLUMNSTORE INDEX MyOrderedCCI
78+
ON T1
79+
ORDER(Col_C, Col_B, Col_A);
11280
```
11381

11482
Queries 1 and 2 can benefit from the ordered columnstore index more than queries 3 and 4, because they reference all the ordered columns.
@@ -117,73 +85,66 @@ The performance of query 1 and 2 can benefit from ordered columnstore index more
11785
-- query 1
11886
SELECT *
11987
FROM T1
120-
WHERE Col_C = 'c' AND Col_B = 'b' AND Col_A = 'a';
88+
WHERE Col_C = 'c'
89+
AND Col_B = 'b'
90+
AND Col_A = 'a';
12191

12292
-- query 2
12393
SELECT *
12494
FROM T1
125-
WHERE Col_B = 'b' AND Col_C = 'c' AND Col_A = 'a';
95+
WHERE Col_B = 'b'
96+
AND Col_C = 'c'
97+
AND Col_A = 'a';
12698

12799
-- query 3
128100
SELECT *
129101
FROM T1
130-
WHERE Col_B = 'b' AND Col_A = 'a';
102+
WHERE Col_B = 'b'
103+
AND Col_A = 'a';
131104

132105
-- query 4
133106
SELECT *
134107
FROM T1
135-
WHERE Col_A = 'a' AND Col_C = 'c';
108+
WHERE Col_A = 'a'
109+
AND Col_C = 'c';
136110
```
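
One way to observe segment elimination for queries like these is to enable `SET STATISTICS IO`, which reports how many columnstore segments were read and how many were skipped. The following sketch assumes the `T1` table and the character-typed order columns used in the preceding queries. The actual counts depend on your data and on the sort quality of the index.

```sql
SET STATISTICS IO ON;

-- All three order columns appear in the predicate, as in query 1.
SELECT COUNT(*)
FROM T1
WHERE Col_C = 'c'
      AND Col_B = 'b'
      AND Col_A = 'a';

SET STATISTICS IO OFF;
```

In the messages output, compare the segment reads and segments skipped values for each query. A higher share of skipped segments suggests that segment elimination is working for that predicate.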
137111

138112
## Data load performance
139113

140-
The performance of data load into a table with an ordered columnstore index is similar to a partitioned table. Loading data can take longer than with a non-ordered columnstore index because of the data sorting operation, however queries can run faster afterwards.
141-
142-
## Reduce segment overlapping
143-
144-
The number of overlapping segments depends on the size of data to sort, the available memory, and the maximum degree of parallelism (`MAXDOP`) setting during ordered columnstore index build. The following strategies reduce segment overlapping, however they can make the index build process take longer.
145-
146-
- If online index build is available, use both `ONLINE = ON` and `MAXDOP = 1` options when creating an ordered clustered columnstore index. This creates a fully sorted index.
147-
- If online index build isn't available, use the `MAXDOP = 1` option.
148-
- Pre-sort the data by the sort keys before the load.
149-
150-
When `MAXDOP` is greater than 1, each thread used for ordered columnstore index build works on a subset of data and sorts it locally. There's no global sorting across data sorted by different threads. Using parallel threads can reduce the time to create the index, but it generates more overlapping segments than when using a single thread. Using a single threaded operation delivers the highest compression quality. You can specify `MAXDOP` with the `CREATE INDEX` command.
114+
The performance of a data load into a table with an ordered columnstore index is similar to that of a data load into a partitioned table. Loading data can take longer than with a non-ordered columnstore index because of the data sorting operation, but queries can run faster afterwards.
151115

152116
## Examples
153117

154-
### Check for ordered columns and order ordinal
155-
156-
```sql
157-
SELECT object_name(c.object_id) AS table_name,
158-
c.name AS column_name,
159-
i.column_store_order_ordinal
160-
FROM sys.index_columns AS i
161-
INNER JOIN sys.columns AS c
162-
ON i.object_id = c.object_id
163-
AND
164-
c.column_id = i.column_id
165-
WHERE column_store_order_ordinal <> 0;
166-
```
167-
168118
### Create an ordered columnstore index
169119

170120
Clustered ordered columnstore index:
171121

172122
```sql
173123
CREATE CLUSTERED COLUMNSTORE INDEX OCCI
174124
ON dbo.Table1
175-
ORDER (Column1, Column2);
125+
ORDER(Column1, Column2);
176126
```
177127

178128
Nonclustered ordered columnstore index:
179129

180130
```sql
181131
CREATE NONCLUSTERED COLUMNSTORE INDEX ONCCI
182-
ON dbo.Table1
183-
(
184-
Column1, Column2, Column3
185-
)
186-
ORDER (Column1, Column2);
132+
ON dbo.Table1(Column1, Column2, Column3)
133+
ORDER(Column1, Column2);
134+
```
135+
136+
### Check for ordered columns and order ordinal
137+
138+
```sql
139+
SELECT OBJECT_SCHEMA_NAME(c.object_id) AS schema_name,
140+
OBJECT_NAME(c.object_id) AS table_name,
141+
c.name AS column_name,
142+
i.column_store_order_ordinal
143+
FROM sys.index_columns AS i
144+
INNER JOIN sys.columns AS c
145+
ON i.object_id = c.object_id
146+
AND c.column_id = i.column_id
147+
WHERE column_store_order_ordinal > 0;
187148
```
188149

189150
### Add or remove order columns and rebuild an existing ordered columnstore index
@@ -193,19 +154,16 @@ Clustered ordered columnstore index:
193154
```sql
194155
CREATE CLUSTERED COLUMNSTORE INDEX OCCI
195156
ON dbo.Table1
196-
ORDER (Column1, Column2)
157+
ORDER(Column1, Column2)
197158
WITH (DROP_EXISTING = ON);
198159
```
199160

200161
Nonclustered ordered columnstore index:
201162

202163
```sql
203164
CREATE NONCLUSTERED COLUMNSTORE INDEX ONCCI
204-
ON dbo.Table1
205-
(
206-
Column1, Column2, Column3
207-
)
208-
ORDER (Column1, Column2)
165+
ON dbo.Table1(Column1, Column2, Column3)
166+
ORDER(Column1, Column2)
209167
WITH (DROP_EXISTING = ON);
210168
```
211169

@@ -214,7 +172,7 @@ WITH (DROP_EXISTING = ON);
214172
```sql
215173
CREATE CLUSTERED COLUMNSTORE INDEX OCCI
216174
ON dbo.Table1
217-
ORDER (Column1)
175+
ORDER(Column1)
218176
WITH (ONLINE = ON, MAXDOP = 1);
219177
```
220178

@@ -223,10 +181,60 @@ WITH (ONLINE = ON, MAXDOP = 1);
223181
```sql
224182
CREATE CLUSTERED COLUMNSTORE INDEX OCCI
225183
ON dbo.Table1
226-
ORDER (Column1)
184+
ORDER(Column1)
227185
WITH (DROP_EXISTING = ON, ONLINE = ON, MAXDOP = 1);
228186
```
229187

188+
### Determine the sort quality for an ordered columnstore index
189+
190+
This example determines the sort quality for all ordered columnstore indexes in the database. In this example, sort quality is defined as a ratio of non-overlapping segments to all segments for each order column, expressed as a percentage.
191+
192+
```sql
193+
WITH ordered_column_segment
194+
AS (SELECT p.object_id,
195+
i.name AS index_name,
196+
ic.column_store_order_ordinal,
197+
cls.row_count,
198+
cls.column_id,
199+
cls.min_data_id,
200+
cls.max_data_id,
201+
LAG(max_data_id) OVER (
202+
PARTITION BY cls.partition_id, ic.column_store_order_ordinal
203+
ORDER BY cls.min_data_id
204+
) AS prev_max_data_id,
205+
LEAD(min_data_id) OVER (
206+
PARTITION BY cls.partition_id, ic.column_store_order_ordinal
207+
ORDER BY cls.min_data_id
208+
) AS next_min_data_id
209+
FROM sys.partitions AS p
210+
INNER JOIN sys.indexes AS i
211+
ON p.object_id = i.object_id
212+
AND p.index_id = i.index_id
213+
INNER JOIN sys.column_store_segments AS cls
214+
ON p.partition_id = cls.partition_id
215+
INNER JOIN sys.index_columns AS ic
216+
ON ic.object_id = p.object_id
217+
AND ic.index_id = p.index_id
218+
AND ic.column_id = cls.column_id
219+
WHERE ic.column_store_order_ordinal > 0)
220+
SELECT OBJECT_SCHEMA_NAME(object_id) AS schema_name,
221+
OBJECT_NAME(object_id) AS object_name,
222+
index_name,
223+
INDEXPROPERTY(object_id, index_name, 'IsClustered') AS is_clustered_column_store,
224+
COL_NAME(object_id, column_id) AS order_column_name,
225+
column_store_order_ordinal,
226+
SUM(row_count) AS row_count,
227+
SUM(is_overlapping_segment) AS overlapping_segments,
228+
COUNT(1) AS total_segments,
229+
(1 - SUM(is_overlapping_segment) / COUNT(1)) * 100 AS order_quality_percent
230+
FROM ordered_column_segment
231+
CROSS APPLY (SELECT CAST (IIF (prev_max_data_id > min_data_id
232+
OR next_min_data_id < max_data_id, 1, 0) AS FLOAT) AS is_overlapping_segment
233+
) AS ios
234+
GROUP BY object_id, index_name, column_id, column_store_order_ordinal
235+
ORDER BY schema_name, object_name, index_name, column_store_order_ordinal;
236+
```
237+
230238
## Related content
231239

232240
- [Columnstore index design guidelines](../sql-server-index-design-guide.md#columnstore_index)
