Commit 96493fc

doc: update readme with v4.0.0 breaking changes
1 parent c8b372b commit 96493fc

1 file changed

Lines changed: 14 additions & 4 deletions


readme.md

@@ -1063,17 +1063,27 @@ In order to support this auto merging capability, text block objects have an add

 **Breaking Changes:**

+- v4.0.0 introduces several important changes that may affect existing implementations:
+
+- **Text encoding removed**: Text in JSON output is no longer URI-encoded (fixes [#385](https://github.com/modesty/pdf2json/issues/385)). Chinese, CJK, and other Unicode characters now display directly as UTF-8 instead of percent-encoded strings. If your application was decoding text with `decodeURIComponent()`, you should remove that step.
+
+- **Text block spacing improvements**: Text block gaps and space widths are now calculated from fontMatrix for more accurate spacing (fixes [#355](https://github.com/modesty/pdf2json/issues/355), [#361](https://github.com/modesty/pdf2json/issues/361), [#319](https://github.com/modesty/pdf2json/issues/319)). This uses actual glyph-based width calculation with proper coordinate system handling and applies textHScale for compressed/expanded text. The spacing in both content.txt and JSON output will be more accurate but may differ from previous versions.
+
+- **Text coordinate fixes**: Text block coordinate calculations have been corrected (fixes [#408](https://github.com/modesty/pdf2json/issues/408)), which may result in slightly different positioning values compared to v3.x.
+
+- **Node.js version requirement**: Minimum Node.js version is now 20.18.0 or higher.
+
+- v3.0.0 converted commonJS to ES Modules, plus dependency update and other minor bug fixes. Please update your project configuration file to enable ES Module before upgrade, ex., in `tsconfig.json`, set `"compilerOptions":{"module":"ESNext"}`
+
+- v2.0.0 output data field, `Agency` and `Id` are replaced with `Meta`, JSON of the PDF's full metadata. (See above for details). Each page object also added `Width` property besides `Height`.
+
 - v1.1.4 unified event data structure: **only when you handle these top level events, no change if you use commandline**

   - event "pdfParser_dataError": {"parserError": errObj}
   - event "pdfParser_dataReady": {"formImage": parseOutput} **note**: "formImage" is removed from v2.0.0, see breaking changes for details.

 - v1.0.8 fixed [issue 27](https://github.com/modesty/pdf2json/issues/27), it converts x coordinate with the same ratio as y, which is 24 (96/4), rather than 8.7 (96/11), please adjust client renderer accordingly when position all elements' x coordinate.

-- v2.0.0 output data field, `Agency` and `Id` are replaced with `Meta`, JSON of the PDF's full metadata. (See above for details). Each page object also added `Width` property besides `Height`.
-
-- v3.0.0 converted commonJS to ES Modules, plus dependency update and other minor bug fixes. Please update your project configuration file to enable ES Module before upgrade, ex., in `tsconfig.json`, set `"compilerOptions":{"module":"ESNext"}`
-
 ## Major Refactoring

 - v2.0.0 has the major refactoring since 2015. Primary updates including:
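
Migration note on the "Text encoding removed" entry in this change: a minimal sketch of how client code might read a text run whether the JSON came from v3.x or from v4.0.0 and later. The helper name `readRunText` and the `outputMajor` argument are hypothetical and not part of pdf2json; the sketch also assumes the text lives in a run's `T` field. Once every input is known to come from v4.0.0 or later, the decode branch can be dropped entirely, as the readme text advises.

```javascript
// Hypothetical helper, not part of pdf2json.
// rawText:     the string stored in a text run (assumed to be the `T` field).
// outputMajor: major version of pdf2json that produced the JSON; tracking
//              this value is the caller's responsibility (assumption).
function readRunText(rawText, outputMajor) {
  if (outputMajor >= 4) {
    // v4.0.0+: text is already plain UTF-8. Decoding it again can throw
    // a URIError or alter text that contains a literal "%".
    return rawText;
  }
  try {
    // v3.x and earlier: text was percent-encoded.
    return decodeURIComponent(rawText);
  } catch {
    // Malformed escape sequence: fall back to the raw string.
    return rawText;
  }
}

console.log(readRunText("%E4%B8%AD%E6%96%87", 3));
console.log(readRunText("100% 中文", 4));
```

Branching on the version rather than always calling `decodeURIComponent()` avoids changing v4 text that happens to contain a percent sign.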
