Commit a48080b

Implement support for revisions and comments.
1 parent 9bbfdd4 commit a48080b

26 files changed

Lines changed: 1987 additions & 31 deletions

HISTORY.rst

Lines changed: 13 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -3,6 +3,19 @@
33
Release History
44
---------------
55

6+
master branch:
7+
++++++++++++++++++
8+
9+
- Add support for tracked insertions and deletions (revisions)
10+
- Add `Paragraph.accepted_text`, `Paragraph.deleted_text`, and tracked-change helpers
11+
- Change `Paragraph.text` to include deleted text and exclude inserted text
12+
- Add paragraph/run comment convenience methods and comment resolution timestamp support
13+
- Add `_Cell.add_comment()` convenience for comments anchored to cell content
14+
- Add `Paragraph.add_comment_range()` for substring comments on accepted-view text
15+
- Add `TrackedChange.add_comment()` for comments on tracked insertions and deletions
16+
- Restrict comment resolution to top-level comments; replies are not independently resolvable
17+
18+
619
1.2.0 (2025-06-16)
720
++++++++++++++++++
821

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -82,6 +82,7 @@ User Guide
8282
user/styles-understanding
8383
user/styles-using
8484
user/comments
85+
user/revisions
8586
user/shapes
8687

8788

docs/user/comments.rst

Lines changed: 72 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -68,10 +68,11 @@ empty string if not configured. *date* is also optional, but always set by Word
6868
UTC date and time the comment was added, with seconds resolution (no milliseconds or
6969
microseconds).
7070

71-
**Additional Features.** Later versions of Word allow a comment to be *resolved*. A
72-
comment in this state will appear grayed-out in the Word UI. Later versions of Word also
73-
allow a comment to be *replied to*, forming a *comment thread*. Neither of these
74-
features is supported by the initial implementation of comments in *python-docx*.
71+
**Additional Features.** Later versions of Word allow a top-level comment to be
72+
*resolved*. A comment in this state will appear grayed-out in the Word UI. Later
73+
versions of Word also allow a comment to be *replied to*, forming a *comment thread*.
74+
*python-docx* supports both replies and resolved-state metadata; the resolved state
75+
applies only to the top-level comment in a thread.
7576

7677
**Applicability.** Note that comments cannot be added to a header or footer and cannot
7778
be nested inside a comment itself. In general the *python-docx* API will not allow these
@@ -109,6 +110,48 @@ A simple example is adding a comment to a paragraph::
109110

110111
The API documentation for :meth:`.Document.add_comment` provides further details.
111112

113+
For convenience, a comment can also be added directly from a paragraph, run, or
114+
table cell::
115+
116+
>>> paragraph = document.add_paragraph("Hello, world!")
117+
>>> comment = paragraph.add_comment("Please reword this.", author="Steve Canny")
118+
>>> run = paragraph.runs[0]
119+
>>> comment = run.add_comment("Comment on just this run.", author="Steve Canny")
120+
>>> table = document.add_table(rows=1, cols=1)
121+
>>> cell = table.cell(0, 0)
122+
>>> cell.text = "Cell text"
123+
>>> comment = cell.add_comment("Comment on this cell.", author="Steve Canny")
124+
125+
Tracked changes also provide a convenience API. A comment can be anchored directly to
126+
an insertion or deletion::
127+
128+
>>> paragraph = document.add_paragraph("Hello")
129+
>>> insertion = paragraph.add_tracked_insertion(" world", author="Editor")
130+
>>> comment = insertion.add_comment("Please justify this insertion.", author="Reviewer")
131+
132+
When added from a cell, the comment is anchored from the first run in the first
133+
paragraph of the cell to the last run in the last paragraph of the cell. This matches
134+
Word's XML model, where a so-called "cell comment" is really a comment range anchored
135+
inside the cell's paragraph content rather than on the cell element itself.
136+
137+
For run-level tracked changes, Word stores the comment as an ordinary comment range
138+
that brackets the ``<w:ins>`` or ``<w:del>`` wrapper itself. For block-level tracked
139+
changes such as inserted or deleted paragraphs and tables, the comment is anchored to
140+
the first and last paragraph content inside the tracked block because comment markers
141+
still have to live on paragraph/run boundaries.
142+
143+
When you need to anchor a comment to only part of a paragraph's text, use
144+
``Paragraph.add_comment_range(start, end, ...)`` with offsets measured against
145+
``paragraph.accepted_text``::
146+
147+
>>> paragraph = document.add_paragraph("South")
148+
>>> comment = paragraph.add_comment_range(1, 3, "Comment on just 'ou'.")
149+
150+
The method will split runs as needed so the comment range lands on proper run
151+
boundaries. In this first pass, range comments are limited to plain paragraph runs;
152+
selections that include deleted text, hyperlinks, or other non-run content raise
153+
``ValueError`` rather than guessing.
154+
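The run-splitting behavior described above can be sketched in plain Python. This is only an illustration of the idea, not the code in this commit: `split_runs_at` is a hypothetical helper, and runs are modeled as plain strings rather than `<w:r>` elements.

```python
def split_runs_at(runs, start, end):
    """Split `runs` (a list of strings) so [start, end) falls on run boundaries.

    Hypothetical helper for illustration only. Returns the new list of run
    texts plus the indices of the first and last run covered by the range.
    """
    total = sum(len(text) for text in runs)
    if not 0 <= start < end <= total:
        raise ValueError("offsets must satisfy 0 <= start < end <= len(text)")

    pieces = []
    pos = 0  # offset of the current run within the paragraph text
    for text in runs:
        # Cut points that fall strictly inside this run, relative to its start.
        cuts = sorted({c - pos for c in (start, end) if 0 < c - pos < len(text)})
        prev = 0
        for cut in cuts:
            pieces.append(text[prev:cut])
            prev = cut
        pieces.append(text[prev:])
        pos += len(text)

    # Find the pieces that lie entirely inside [start, end).
    covered = []
    pos = 0
    for index, piece in enumerate(pieces):
        if piece and start <= pos and pos + len(piece) <= end:
            covered.append(index)
        pos += len(piece)
    return pieces, covered[0], covered[-1]


pieces, first, last = split_runs_at(["South"], 1, 3)
```

In a real document the comment range start marker would go before run `first` and the end marker after run `last`; how the commit places those markers is not shown in this diff.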
112155

113156
Accessing and using the Comments collection
114157
-------------------------------------------
@@ -166,3 +209,28 @@ The author and initials metadata can be updated as desired::
166209
'John Smith'
167210
>>> comment.initials
168211
'JS'
212+
213+
214+
Resolving and reopening comments
215+
--------------------------------
216+
217+
A top-level comment can be marked resolved or reopened using either the ``resolved``
218+
property or the convenience methods ``resolve()`` and ``reopen()``::
219+
220+
>>> comment.resolved
221+
False
222+
>>> comment.resolve()
223+
>>> comment.resolved
224+
True
225+
>>> comment.resolved_at is not None
226+
True
227+
>>> comment.reopen()
228+
>>> comment.resolved
229+
False
230+
231+
``resolved_at`` returns the UTC timestamp stored with the resolved-state metadata, or
232+
``None`` when the document does not record one.
233+
234+
Replies cannot be resolved independently: ``resolved`` reads ``False`` on a reply and
235+
setting it raises ``ValueError``. This matches Word's review UI, which treats
236+
resolution as a property of the thread root rather than of each individual reply.
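The top-level-only resolution rule can be modeled in a few lines. The sketch below is a simplified stand-in, assuming a hypothetical `ThreadComment` class with a `parent` link. It mirrors the behavior described in this commit (replies read as unresolved and reject assignment) but is not the python-docx implementation.

```python
import datetime as dt


class ThreadComment:
    """Hypothetical stand-in for a comment in a thread; not the python-docx class."""

    def __init__(self, text, parent=None):
        self.text = text
        self.parent = parent  # None for the thread root, the root comment for a reply
        self._done = False
        self.resolved_at = None

    @property
    def resolved(self):
        # A reply never reports a resolved state of its own.
        if self.parent is not None:
            return False
        return self._done

    @resolved.setter
    def resolved(self, value):
        if self.parent is not None:
            raise ValueError("reply comments do not support resolved state")
        self._done = bool(value)
        if value:
            # Stamp a timezone-aware UTC timestamp when the thread is resolved.
            self.resolved_at = dt.datetime.now(dt.timezone.utc)


root = ThreadComment("Please reword this.")
reply = ThreadComment("Done.", parent=root)
root.resolved = True
```

Keeping the check in the setter means `resolve()`-style convenience methods that assign to the property inherit the same guard.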

docs/user/revisions.rst

Lines changed: 112 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,112 @@
1+
Working with Revisions
2+
======================
3+
4+
Word's *track changes* feature stores inserted and deleted content in revision wrappers
5+
such as ``<w:ins>`` and ``<w:del>``. *python-docx* exposes those content revisions so
6+
they can be read, created, accepted, rejected, and used in tracked find-and-replace
7+
operations.
8+
9+
10+
Reading paragraph text with revisions
11+
-------------------------------------
12+
13+
Paragraph text now has two primary views:
14+
15+
- ``Paragraph.text`` returns the paragraph's original reading view: normal text plus
16+
deleted text, excluding inserted text.
17+
- ``Paragraph.accepted_text`` returns the visible/accepted view: normal text plus
18+
inserted text, excluding deleted text.
19+
20+
Deleted-only content is also available directly via ``Paragraph.deleted_text``::
21+
22+
>>> from docx import Document
23+
>>> document = Document()
24+
>>> paragraph = document.add_paragraph("Alpha")
25+
>>> paragraph.add_tracked_insertion("Beta", author="Editor")
26+
>>> paragraph.add_tracked_deletion(1, 4, author="Editor")
27+
28+
>>> paragraph.text
29+
'Alpha'
30+
>>> paragraph.accepted_text
31+
'AaBeta'
32+
>>> paragraph.deleted_text
33+
'lph'
34+
35+
36+
Inspecting tracked changes
37+
--------------------------
38+
39+
Tracked insertions and deletions can be accessed from a paragraph using:
40+
41+
- ``paragraph.has_track_changes``
42+
- ``paragraph.insertions``
43+
- ``paragraph.deletions``
44+
- ``paragraph.track_changes``
45+
46+
Each tracked change provides metadata such as author, date, revision id, and text::
47+
48+
>>> change = paragraph.track_changes[0]
49+
>>> change.author
50+
'Editor'
51+
>>> change.text
52+
'Beta'
53+
54+
55+
Adding tracked changes
56+
----------------------
57+
58+
Tracked insertions and deletions can be created directly on ``Paragraph`` and ``Run``
59+
objects::
60+
61+
>>> paragraph = document.add_paragraph("Alpha")
62+
>>> paragraph.add_tracked_insertion("Beta", author="Editor")
63+
>>> paragraph.add_tracked_deletion(1, 4, author="Editor")
64+
65+
Run-level deletion and replacement are also available::
66+
67+
>>> run = paragraph.runs[0]
68+
>>> run.delete_tracked(author="Editor")
69+
70+
71+
Accepting and rejecting changes
72+
-------------------------------
73+
74+
Individual changes can be accepted or rejected using the tracked-change proxy object::
75+
76+
>>> change = paragraph.track_changes[0]
77+
>>> change.accept()
78+
79+
Document-wide operations are also available::
80+
81+
>>> document.accept_all()
82+
>>> document.reject_all()
83+
84+
85+
Tracked find-and-replace
86+
------------------------
87+
88+
Tracked replacement operates on the accepted/visible text view so that inserted text is
89+
searchable and deleted text is ignored::
90+
91+
>>> document.find_and_replace_tracked("Acme", "NewCo", author="Editor")
92+
93+
94+
Interaction with comments
95+
-------------------------
96+
97+
Comment markers do not contribute text to either ``Paragraph.text`` or
98+
``Paragraph.accepted_text``. A paragraph can contain both comments and tracked changes;
99+
text extraction remains based on document text and revision wrappers, not on comment
100+
anchor markers.
101+
102+
Tracked changes can also be commented directly using the tracked-change proxy::
103+
104+
>>> paragraph = document.add_paragraph("Hello")
105+
>>> insertion = paragraph.add_tracked_insertion(" world", author="Editor")
106+
>>> comment = insertion.add_comment("Why was this added?", author="Reviewer")
107+
108+
For run-level revisions, the comment range brackets the ``<w:ins>`` or ``<w:del>``
109+
element externally, matching Word-authored documents. For block-level revisions, such
110+
as inserted paragraphs or deleted tables, the comment is anchored to the first and last
111+
paragraph content inside the tracked block so the markers still land on valid run
112+
boundaries.
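To make the text views concrete, here is a minimal sketch that derives them from raw WordprocessingML using only the standard library. It assumes run-level `<w:ins>` and `<w:del>` wrappers as direct children of the paragraph. The `paragraph_views` function and the sample XML are illustrative, not part of this commit.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hand-written sample: "Alpha" with "lph" deleted and "Beta" inserted at the end.
SAMPLE = (
    f'<w:p xmlns:w="{W}">'
    "<w:r><w:t>A</w:t></w:r>"
    '<w:del w:id="2" w:author="Editor"><w:r><w:delText>lph</w:delText></w:r></w:del>'
    "<w:r><w:t>a</w:t></w:r>"
    '<w:ins w:id="1" w:author="Editor"><w:r><w:t>Beta</w:t></w:r></w:ins>'
    "</w:p>"
)


def _text_of(element):
    """Concatenate the text of every <w:t> and <w:delText> under `element`."""
    wanted = (f"{{{W}}}t", f"{{{W}}}delText")
    return "".join(node.text or "" for node in element.iter() if node.tag in wanted)


def paragraph_views(p_xml):
    """Return (original, accepted, deleted) text for one paragraph's XML."""
    original, accepted, deleted = [], [], []
    for child in ET.fromstring(p_xml):
        text = _text_of(child)
        if child.tag == f"{{{W}}}ins":    # inserted: accepted view only
            accepted.append(text)
        elif child.tag == f"{{{W}}}del":  # deleted: original view and deleted view
            original.append(text)
            deleted.append(text)
        elif child.tag == f"{{{W}}}r":    # plain run: both views
            original.append(text)
            accepted.append(text)
    return "".join(original), "".join(accepted), "".join(deleted)


original, accepted, deleted = paragraph_views(SAMPLE)
```

With this sample the three values line up with the `Paragraph.text`, `Paragraph.accepted_text`, and `Paragraph.deleted_text` results shown in the documentation example above.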

src/docx/blkcntnr.py

Lines changed: 10 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -84,7 +84,11 @@ def paragraphs(self):
8484
8585
Read-only.
8686
"""
87-
return [Paragraph(p, self) for p in self._element.p_lst]
87+
return [
88+
Paragraph(element, self)
89+
for element in self._element.inner_content_elements
90+
if isinstance(element, CT_P)
91+
]
8892

8993
@property
9094
def tables(self):
@@ -94,7 +98,11 @@ def tables(self):
9498
"""
9599
from docx.table import Table
96100

97-
return [Table(tbl, self) for tbl in self._element.tbl_lst]
101+
return [
102+
Table(element, self)
103+
for element in self._element.inner_content_elements
104+
if isinstance(element, CT_Tbl)
105+
]
98106

99107
def _add_paragraph(self):
100108
"""Return paragraph newly added to the end of the content in this container."""

src/docx/comments.py

Lines changed: 33 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
from docx.blkcntnr import BlockItemContainer
99

1010
if TYPE_CHECKING:
11-
from docx.oxml.comments import CT_Comment, CT_CommentEx, CT_Comments
11+
from docx.oxml.comments import CT_Comment, CT_CommentEx, CT_CommentExtensible, CT_Comments
1212
from docx.parts.comments import CommentsPart
1313
from docx.styles.style import ParagraphStyle
1414
from docx.text.paragraph import Paragraph
@@ -214,18 +214,41 @@ def replies(self) -> list[Comment]:
214214

215215
@property
216216
def resolved(self) -> bool:
217-
"""True when this comment is marked resolved/done."""
217+
"""True when this top-level comment is marked resolved/done."""
218+
if self.parent_para_id is not None:
219+
return False
218220
return bool(self._comment_ex_elm.done) if self._comment_ex_elm is not None else False
219221

220222
@resolved.setter
221223
def resolved(self, value: bool):
224+
self._validate_resolution_supported()
222225
comment_ex = (
223226
self._comment_ex_elm
224227
if self._comment_ex_elm is not None
225228
else self._comments_part.ensure_comment_ex(self._comment_elm)
226229
)
227230
comment_ex.done = value
228231
self._comment_ex_elm = comment_ex
232+
if value:
233+
self._comments_part.ensure_comment_extensible(
234+
self._comment_elm, dt.datetime.now(dt.timezone.utc)
235+
)
236+
237+
def resolve(self) -> None:
238+
"""Mark this comment resolved and stamp a resolution timestamp."""
239+
self.resolved = True
240+
241+
def reopen(self) -> None:
242+
"""Mark this comment unresolved."""
243+
self.resolved = False
244+
245+
@property
246+
def resolved_at(self) -> dt.datetime | None:
247+
"""Timestamp associated with resolution metadata, when available."""
248+
if self.parent_para_id is not None:
249+
return None
250+
comment_extensible = self._comment_extensible_elm
251+
return comment_extensible.dateUtc if comment_extensible is not None else None
229252

230253
@property
231254
def initials(self) -> str | None:
@@ -257,3 +280,11 @@ def timestamp(self) -> dt.datetime | None:
257280
This attribute is optional in the XML, returns |None| if not set.
258281
"""
259282
return self._comment_elm.date
283+
284+
def _validate_resolution_supported(self) -> None:
285+
if self.parent_para_id is not None:
286+
raise ValueError("reply comments do not support resolved state")
287+
288+
@property
289+
def _comment_extensible_elm(self) -> CT_CommentExtensible | None:
290+
return self._comments_part.comment_extensible_for(self._comment_elm)
