Commit 463149e

Merge branch 'development' of https://github.com/syncfusion-content/document-processing-docs into Resolve-2026-vol1-Main-release-dev-to-master-branch-merging-conflicts
2 parents f336ebe + 9078f0d commit 463149e

165 files changed

Lines changed: 6808 additions & 1687 deletions

Document-Processing-toc.html

Lines changed: 91 additions & 43 deletions
Large diffs are not rendered by default.
Lines changed: 198 additions & 0 deletions
@@ -0,0 +1,198 @@
1+
---
2+
title: Syncfusion font handling in Office-to-PDF and image conversions
3+
description: Learn how Syncfusion Document Processing handles font management during Office to PDF/Image conversions and PDF processing workflows.
4+
platform: document-processing
5+
documentation: UG
6+
---
7+
8+
# Font Manager for Office-to-PDF/Image and PDF Processing
9+
10+
## Overview
11+
12+
Font creation is a primary cause of excessive memory consumption and performance degradation during Office to PDF/Image conversions and PDF processing workflows. This problem is particularly pronounced in multi-threaded web applications where multiple users perform concurrent operations across different threads or browser tabs.
13+
14+
To address this challenge, Syncfusion Document Processing libraries introduce the **FontManager** class, which provides centralized font management shared across all threads and conversion libraries. This approach eliminates duplicate font objects and significantly reduces memory overhead.
15+
16+
## Key Features
17+
18+
* **Shared font caching:** Stores fonts in a unified cache to prevent repeated loading across operations.
19+
* **Memory reduction:** Eliminates duplicate font objects, reducing memory usage during large-scale or parallel document conversions.
20+
* **Performance optimization:** Enables multiple threads to safely reuse the same font instances, improving processing speed.
21+
* **Automatic cleanup:** Automatically disposes of unused fonts after a configurable delay (`FontManager.Delay`) to maintain efficiency in long-running applications.
22+
* **Manual cache management:** Provides FontManager.ClearCache() to immediately clear all cached fonts when needed (e.g., during server shutdown).
23+
24+
## Supported Conversions and Workflows
25+
26+
FontManager optimizes memory usage across the following Office to PDF/Image conversions and PDF processing scenarios:
27+
28+
<table>
29+
<tr>
30+
<th>Category</th>
31+
<th>Details</th>
32+
</tr>
33+
<tr>
34+
<td><b>Office Document Conversions</b></td>
35+
<td>
36+
<b>Word Library (DocIO)</b>
37+
<ul>
38+
<li>Word to PDF conversion.</li>
39+
<li>Word to Image conversion.</li>
40+
</ul>
41+
<b>Excel Library (XlsIO)</b>
42+
<ul>
43+
<li>Excel to PDF conversion.</li>
44+
<li>Excel to Image conversion.</li>
45+
</ul>
46+
<b>PowerPoint Library (Presentation)</b>
47+
<ul>
48+
<li>PowerPoint to PDF conversion.</li>
49+
<li>PowerPoint to Image conversion.</li>
50+
</ul>
51+
</td>
52+
</tr>
53+
<tr>
54+
<td><b>PDF Processing Workflows</b></td>
55+
<td>
56+
<b>PDF Library Operations</b>
57+
<ul>
58+
<li>PDF creation and manipulation</li>
59+
<li>PDF merging and splitting</li>
60+
<li>PDF form filling and flattening</li>
61+
<li>PDF page extraction and insertion</li>
62+
<li>Adding text, images, and annotations to PDF</li>
63+
<li>PDF redaction and security</li>
64+
<li>PDF/A conformance</li>
65+
<li>OCR text extraction</li>
66+
</ul>
67+
</td>
68+
</tr>
69+
</table>
70+
71+
N> FontManager automatically manages fonts across all these conversion types, whether you're processing a single document or handling thousands of concurrent conversions in a multi-threaded environment.
72+
73+
## Configuring Automatic Font Cleanup
74+
75+
The `FontManager.Delay` property defines the duration (in milliseconds) after which unused font objects are automatically disposed and removed from the cache. When fonts are no longer referenced, an internal `System.Timers.Timer` triggers disposal based on this value.
76+
77+
**Default value:** 30,000 milliseconds (30 seconds).
78+
**Valid range:** 1 to 2,147,483,647 milliseconds.
79+
80+
N> This configuration is optional. By default, unused fonts are automatically cleaned up 30 seconds after their references are released. To customize the delay, set this property at application startup (e.g., in `Startup.cs` or `Program.cs`).
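
To make the delayed-cleanup behavior concrete, here is a minimal C# sketch of a reference-counted cache that evicts an entry only after it has been idle for a configured delay. It illustrates the general technique and is not Syncfusion's implementation: `DelayedCache`, `Acquire`, `Release`, and `Sweep` are hypothetical names, and an explicit `Sweep(now)` call stands in for the internal timer so that the behavior is deterministic.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: NOT the FontManager implementation.
// All type and member names here are hypothetical.
public sealed class DelayedCache<TValue> where TValue : class
{
    private sealed class Entry
    {
        public readonly TValue Value;
        public int RefCount;
        public DateTime? IdleSince; // set when the last reference is released

        public Entry(TValue value) { Value = value; }
    }

    private readonly object _gate = new object();
    private readonly Dictionary<string, Entry> _entries = new Dictionary<string, Entry>();
    private readonly TimeSpan _delay;

    public DelayedCache(int delayMilliseconds)
    {
        if (delayMilliseconds < 1)
            throw new ArgumentOutOfRangeException(nameof(delayMilliseconds));
        _delay = TimeSpan.FromMilliseconds(delayMilliseconds);
    }

    public int Count
    {
        get { lock (_gate) { return _entries.Count; } }
    }

    // Returns the cached value, creating it once on first use.
    public TValue Acquire(string key, Func<TValue> factory)
    {
        lock (_gate)
        {
            if (!_entries.TryGetValue(key, out var entry))
            {
                entry = new Entry(factory());
                _entries[key] = entry;
            }
            entry.RefCount++;
            entry.IdleSince = null; // in use again, so cancel pending eviction
            return entry.Value;
        }
    }

    // Drops one reference; the entry starts idling when the count hits zero.
    public void Release(string key, DateTime now)
    {
        lock (_gate)
        {
            if (_entries.TryGetValue(key, out var entry) && entry.RefCount > 0)
            {
                entry.RefCount--;
                if (entry.RefCount == 0)
                    entry.IdleSince = now;
            }
        }
    }

    // Evicts entries that have been idle for at least the delay.
    // Returns the number of entries removed.
    public int Sweep(DateTime now)
    {
        lock (_gate)
        {
            var expired = new List<string>();
            foreach (var pair in _entries)
            {
                var idleSince = pair.Value.IdleSince;
                if (pair.Value.RefCount == 0 && idleSince.HasValue
                    && now - idleSince.Value >= _delay)
                {
                    expired.Add(pair.Key);
                }
            }
            foreach (var key in expired)
            {
                (_entries[key].Value as IDisposable)?.Dispose();
                _entries.Remove(key);
            }
            return expired.Count;
        }
    }
}
```

In this sketch, an entry released at time `t` survives any sweep before `t + delay`, and acquiring it again during that window cancels the pending eviction. This is the trade-off the delay value controls: a longer delay favors reuse, a shorter one favors memory.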
81+
82+
The following example demonstrates how to configure `FontManager.Delay` to automatically release cached fonts after the specified delay during document conversions.
83+
84+
{% tabs %}
85+
{% highlight C# %}
86+
87+
using Syncfusion.Drawing.Fonts;
88+
89+
// Set disposal delay to 50 seconds
90+
FontManager.Delay = 50000;
91+
92+
{% endhighlight %}
93+
{% endtabs %}
94+
95+
The following example demonstrates how to configure `FontManager.Delay` in an **ASP.NET Core application** to ensure cached fonts are automatically released after the specified delay during document conversions.
96+
97+
{% tabs %}
98+
99+
{% highlight C# %}
100+
101+
var builder = WebApplication.CreateBuilder(args);
102+
103+
// Add services to the container
104+
// ...existing code...
105+
106+
var app = builder.Build();
107+
108+
// Configure FontManager to dispose unused fonts after 50 seconds
109+
// Default: 30000ms | Valid range: 1 to 2,147,483,647 milliseconds
110+
Syncfusion.Drawing.Fonts.FontManager.Delay = 50000;
111+
112+
// Configure middleware
113+
// ...existing code...
114+
115+
app.Run();
116+
117+
{% endhighlight %}
118+
119+
{% endtabs %}
120+
121+
## Immediate Font Cache Cleanup
122+
123+
The `FontManager.ClearCache()` method immediately clears all font caches managed by the FontManager. This method forcefully removes and disposes all font instances maintained in shared caches, allowing you to reclaim memory deterministically without waiting for the automatic cleanup delay.
124+
125+
**Use cases:**
126+
127+
* Application shutdown.
128+
* After completing batch conversions.
129+
* Before maintenance operations.
130+
* When immediate memory reclamation is required.
131+
132+
The following example demonstrates how to immediately clear all cached fonts using `FontManager.ClearCache()`.
133+
134+
{% tabs %}
135+
{% highlight C# %}
136+
137+
using Syncfusion.Drawing.Fonts;
138+
139+
// Immediately clear all cached fonts
140+
FontManager.ClearCache();
141+
142+
{% endhighlight %}
143+
{% endtabs %}
144+
145+
The following example demonstrates how to call `FontManager.ClearCache()` in an **ASP.NET Core application** to clear cached fonts during application shutdown.
146+
147+
{% tabs %}
148+
{% highlight C# %}
149+
150+
// Configure services and middleware
151+
// ...existing code...
152+
153+
// Access the application lifetime service
154+
var lifetime = app.Services.GetRequiredService<IHostApplicationLifetime>();
155+
156+
// Register a callback to clear font cache during application shutdown
157+
lifetime.ApplicationStopping.Register(() =>
158+
{
159+
Syncfusion.Drawing.Fonts.FontManager.ClearCache();
160+
});
161+
162+
// Start the application
163+
app.Run();
164+
165+
{% endhighlight %}
166+
{% endtabs %}
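
For the "after completing batch conversions" use case, one possible pattern is to wrap the batch in `try`/`finally` so the cleanup step runs even when a conversion fails. The helper below is a hedged sketch: `BatchRunner` and `RunThenCleanup` are hypothetical names, and the cleanup step is passed in as a delegate so that the snippet stays self-contained.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any Syncfusion library.
public static class BatchRunner
{
    // Runs each job in order, then always invokes the cleanup delegate,
    // even if a job throws. Returns the number of jobs that completed.
    public static int RunThenCleanup(IEnumerable<Action> jobs, Action cleanup)
    {
        if (jobs == null) throw new ArgumentNullException(nameof(jobs));
        if (cleanup == null) throw new ArgumentNullException(nameof(cleanup));

        int completed = 0;
        try
        {
            foreach (Action job in jobs)
            {
                job();
                completed++;
            }
            return completed;
        }
        finally
        {
            // Runs on success and on failure alike.
            cleanup();
        }
    }
}
```

In an application that references the Syncfusion assemblies, the cleanup argument would presumably be `Syncfusion.Drawing.Fonts.FontManager.ClearCache`, assuming it is a static, parameterless method as the call syntax above suggests.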
167+
168+
## Best Practices
169+
170+
1. Set FontManager.Delay early: If you customize the delay, configure it in your application's startup code before any document processing begins.
171+
172+
2. Use ClearCache() during shutdown: Register a shutdown handler that clears the cache when your application stops, so cached fonts are released deterministically.
173+
174+
3. Consider your workload:
175+
176+
a. For high-frequency, short-lived conversions: Use a shorter delay (e.g., 15-30 seconds).
177+
178+
b. For batch processing with longer intervals: Use a longer delay (e.g., 60+ seconds).
179+
180+
4. Monitor memory usage: Track your application's memory consumption to fine-tune the delay value for optimal performance.
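
When the delay is tuned per environment, it can help to read it from configuration and fall back to the documented default if the value is missing or outside the valid range (1 to 2,147,483,647 milliseconds). The following is a minimal sketch under that assumption; `FontDelaySettings` and `Resolve` are hypothetical names, not part of the Syncfusion API.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for validating a configured delay value.
public static class FontDelaySettings
{
    // Default documented above: 30,000 ms (30 seconds).
    public const int DefaultDelayMs = 30000;

    // Parses a raw setting and returns it only if it is a whole number
    // of at least 1; otherwise returns the default. Values above
    // int.MaxValue fail to parse and therefore also yield the default.
    public static int Resolve(string rawValue)
    {
        bool parsed = int.TryParse(
            rawValue,
            NumberStyles.Integer,
            CultureInfo.InvariantCulture,
            out int milliseconds);

        return parsed && milliseconds >= 1 ? milliseconds : DefaultDelayMs;
    }
}
```

At startup, the result could then be assigned with something like `FontManager.Delay = FontDelaySettings.Resolve(Environment.GetEnvironmentVariable("FONT_CACHE_DELAY_MS"));`. The environment variable name is invented for illustration, and the assignment assumes `Delay` is an `int`, as the documented range suggests.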
181+
182+
## FAQ
183+
184+
**Q: Do I need to configure FontManager for my application?**
185+
186+
A: No, it's optional. The default 30-second cleanup delay works well for most scenarios. Configure it only if you need custom behavior.
187+
188+
**Q: When should I call ClearCache() manually?**
189+
190+
A: Call it during application shutdown, after batch processing, or when you need immediate memory reclamation rather than waiting for automatic cleanup.
191+
192+
**Q: Is FontManager thread-safe?**
193+
194+
A: Yes, FontManager is designed for multi-threaded environments and allows safe font reuse across multiple threads.
195+
196+
**Q: Will FontManager affect my existing document processing code?**
197+
198+
A: No, FontManager works transparently in the background. Your existing code will automatically benefit from improved memory management without modifications.

Document-Processing/DataExtraction/SmartDataExtractor/NET/Assemblies-Required.md renamed to Document-Processing/Data-Extraction/Smart-Data-Extractor/NET/Assemblies-Required.md

Lines changed: 5 additions & 6 deletions
@@ -19,8 +19,8 @@ The following assemblies need to be referenced in your application based on the
1919
<tbody>
2020
<tr>
2121
<td>
22-
{{'[WPF]'| markdownify }},
23-
{{'[Windows Forms]'| markdownify }} and {{'[ASP.NET MVC]'| markdownify }}
22+
{{'WPF'| markdownify }},
23+
{{'Windows Forms'| markdownify }} and {{'ASP.NET MVC'| markdownify }}
2424
</td>
2525
<td>
2626
Syncfusion.Compression.Base<br/>
@@ -35,9 +35,8 @@ The following assemblies need to be referenced in your application based on the
3535
</tr>
3636
<tr>
3737
<td>
38-
{{'[Blazor]'| markdownify }},
39-
{{'[.NET Core]'| markdownify }}
40-
and {{'[.NET Platforms]'| markdownify }}
38+
{{'.NET Core'| markdownify }}
39+
and {{'.NET Platforms'| markdownify }}
4140
</td>
4241
<td>
4342
Syncfusion.Compression.Portable<br/>
@@ -52,7 +51,7 @@ The following assemblies need to be referenced in your application based on the
5251
</tr>
5352
<tr>
5453
<td>
55-
{{'[.NET Multi-platform App UI (.NET MAUI)]'| markdownify }}
54+
{{'.NET Multi-platform App UI (.NET MAUI)'| markdownify }}
5655
</td>
5756
<td>
5857
Syncfusion.Compression.NET<br/>

Document-Processing/DataExtraction/SmartDataExtractor/NET/FAQ/how-to-resolve-the-onnx-file-missing-error-in-smart-data-extractor.md renamed to Document-Processing/Data-Extraction/Smart-Data-Extractor/NET/FAQ/how-to-resolve-the-onnx-file-missing-error-in-smart-data-extractor.md

File renamed without changes.

Document-Processing/DataExtraction/SmartDataExtractor/NET/Features.md renamed to Document-Processing/Data-Extraction/Smart-Data-Extractor/NET/Features.md

Lines changed: 67 additions & 7 deletions
@@ -9,9 +9,9 @@ keywords: Assemblies
99

1010
# Smart Data Extractor Features
1111

12-
## Extract data from a PDF document
12+
## Extract Data from a PDF Document
1313

14-
To extract structured data such as text, form fields, tables and images from an entire PDF document using the **ExtractDataAsPdfDocument** method of the **DataExtractor** class, refer to the following code
14+
To extract structured data such as text, form fields, tables and images from an entire PDF document using the **ExtractDataAsPdfDocument** method of the **DataExtractor** class, refer to the following code example:
1515

1616
{% tabs %}
1717

@@ -129,7 +129,67 @@ using (FileStream stream = new FileStream("Image.png", FileMode.Open, FileAccess
129129

130130
{% endtabs %}
131131

132-
## Extract form data as JSON
132+
## Extract Data as Stream
133+
134+
To extract structured data from a PDF document and return the output as a stream using the **ExtractDataAsPdfStream** method of the **DataExtractor** class, refer to the following example.
135+
136+
{% tabs %}
137+
138+
{% highlight c# tabtitle="C# [Cross-platform]" %}
139+
140+
using System.IO;
141+
using Syncfusion.SmartDataExtractor;
142+
143+
//Open the input PDF file as a stream.
144+
using (FileStream inputStream = new FileStream("Input.pdf", FileMode.Open, FileAccess.Read, FileShare.Read))
145+
{
146+
//Initialize the Smart Data Extractor.
147+
DataExtractor extractor = new DataExtractor();
148+
extractor.EnableFormDetection = true;
149+
extractor.EnableTableDetection = true;
150+
extractor.ConfidenceThreshold = 0.6;
151+
152+
//Extract data and return as a PDF stream.
153+
Stream pdfStream = extractor.ExtractDataAsPdfStream(inputStream);
154+
155+
//Save the extracted PDF stream into an output file.
156+
using (FileStream outputStream = new FileStream("Output.pdf", FileMode.Create, FileAccess.Write))
157+
{
158+
pdfStream.CopyTo(outputStream);
159+
}
160+
}
161+
162+
{% endhighlight %}
163+
164+
{% highlight c# tabtitle="C# [Windows-specific]" %}
165+
166+
using System.IO;
167+
using Syncfusion.SmartDataExtractor;
168+
169+
//Open the input PDF file as a stream.
170+
using (FileStream inputStream = new FileStream("Input.pdf", FileMode.Open, FileAccess.Read, FileShare.Read))
171+
{
172+
//Initialize the Smart Data Extractor.
173+
DataExtractor extractor = new DataExtractor();
174+
extractor.EnableFormDetection = true;
175+
extractor.EnableTableDetection = true;
176+
extractor.ConfidenceThreshold = 0.6;
177+
178+
//Extract data and return as a PDF stream.
179+
Stream pdfStream = extractor.ExtractDataAsPdfStream(inputStream);
180+
181+
//Save the extracted PDF stream into an output file.
182+
using (FileStream outputStream = new FileStream("Output.pdf", FileMode.Create, FileAccess.Write))
183+
{
184+
pdfStream.CopyTo(outputStream);
185+
}
186+
}
187+
188+
{% endhighlight %}
189+
190+
{% endtabs %}
191+
192+
## Extract Data as JSON
133193

134194
To extract form fields across a PDF document using the **ExtractDataAsJson** method of the **DataExtractor** class with form recognition options, refer to the following code example:
135195

@@ -150,7 +210,7 @@ using (FileStream stream = new FileStream("Input.pdf", FileMode.Open, FileAccess
150210

151211
//Enable form detection in the document.
152212
extractor.EnableFormDetection = true;
153-
extractor.EnableTableDetection = false;
213+
extractor.EnableTableDetection = true;
154214
//Set confidence threshold for extraction.
155215
extractor.ConfidenceThreshold = 0.6;
156216
//Configure form recognition options.
@@ -217,7 +277,7 @@ using (FileStream stream = new FileStream("Input.pdf", FileMode.Open, FileAccess
217277

218278
{% endtabs %}
219279

220-
## Extract form data as PDF
280+
## Enable Form Detection
221281

222282
To extract form fields across a PDF document and save them as a PDF output using the **ExtractDataAsPdfDocument** method of the **DataExtractor** class with form recognition options, refer to the following code example:
223283

@@ -319,7 +379,7 @@ using (FileStream stream = new FileStream("Input.pdf", FileMode.Open, FileAccess
319379

320380
{% endtabs %}
321381

322-
## Extract table data as PDF
382+
## Enable Table Detection
323383

324384
To extract tables across a PDF document and save them as a PDF output using the **ExtractDataAsPdfDocument** method of the **DataExtractor** class with table extraction options, refer to the following code example:
325385

@@ -406,7 +466,7 @@ using (FileStream stream = new FileStream("Input.pdf", FileMode.Open, FileAccess
406466

407467
{% endtabs %}
408468

409-
## Apply confidence threshold to extract the data
469+
## Apply Confidence Threshold to Extract the Data
410470

411471
To apply confidence thresholding when extracting data from a PDF document using the ExtractDataAsPdfDocument method of the DataExtractor class, refer to the following code example:
412472
