-<a href="/document-processing/data-extraction/smart-data-extractor/net/FAQ/how-to-resolve-the-onnx-file-missing-error-in-smart-data-extractor">How to resolve the “ONNX file missing” error</a>
+<a href="/document-processing/data-extraction/smart-data-extractor/net/FAQ/how-to-resolve-the-onnx-file-missing-error-in-smart-data-extractor">How to resolve the ONNX file missing error</a>
-<a href="/document-processing/data-extraction/smart-table-extractor/net/FAQ/how-to-resolve-the-onnx-file-missing-error-in-smart-table-extractor">How to resolve the “ONNX file missing” error</a>
+<a href="/document-processing/data-extraction/smart-table-extractor/net/FAQ/how-to-resolve-the-onnx-file-missing-error-in-smart-table-extractor">How to resolve the ONNX file missing error</a>
Document-Processing/Data-Extraction/Smart-Data-Extractor/NET/Features.md
Lines changed: 67 additions & 7 deletions
@@ -9,9 +9,9 @@ keywords: Assemblies

 # Smart Data Extractor Features

-## Extract data from a PDF document
+## Extract Data from a PDF Document

-To extract structured data such as text, form fields, tables and images from an entire PDF document using the **ExtractDataAsPdfDocument** method of the **DataExtractor** class, refer to the following code
+To extract structured data such as text, form fields, tables and images from an entire PDF document using the **ExtractDataAsPdfDocument** method of the **DataExtractor** class, refer to the following code example:

 {% tabs %}

@@ -129,7 +129,67 @@ using (FileStream stream = new FileStream("Image.png", FileMode.Open, FileAccess

 {% endtabs %}

-## Extract form data as JSON
+## Extract Data as Stream
+
+To extract structured data from a PDF document and return the output as a stream using the **ExtractDataAsPdfStream** method of the **DataExtractor** class, refer to the following example.
+
+{% tabs %}
+
+{% highlight c# tabtitle="C# [Cross-platform]" %}
+
+using System.IO;
+using Syncfusion.SmartDataExtractor;
+
+//Open the input PDF file as a stream.
+using (FileStream inputStream = new FileStream("Input.pdf", FileMode.Open, FileAccess.Read, FileShare.Read))
+//Save the extracted PDF stream into an output file.
+using (FileStream outputStream = new FileStream("Output.pdf", FileMode.Create, FileAccess.Write))
+{
+pdfStream.CopyTo(outputStream);
+}
+}
+
+{% endhighlight %}
+
+{% endtabs %}
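The added example above ends by copying `pdfStream` to a file with `CopyTo`. One detail that may matter here: `Stream.CopyTo` in .NET copies from the stream's current position, so a seekable stream that has just been written to needs rewinding first. The diff does not show whether the stream returned by **ExtractDataAsPdfStream** is already positioned at the start, so the following is only a minimal sketch under that uncertainty. A `MemoryStream` filled with placeholder bytes stands in for the extracted stream, and the temp-file output path is an illustrative assumption rather than part of the documented sample.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for the stream returned by an extraction call.
byte[] payload = Encoding.ASCII.GetBytes("%PDF-1.7 placeholder");
using var pdfStream = new MemoryStream();
pdfStream.Write(payload, 0, payload.Length);

// After the write, Position is at the end, so CopyTo would copy zero bytes.
// Rewind seekable streams before copying.
if (pdfStream.CanSeek)
{
    pdfStream.Position = 0;
}

// Illustrative output location; the documented sample writes to "Output.pdf".
string outputPath = Path.Combine(Path.GetTempPath(), "Output.pdf");
using (FileStream outputStream = new FileStream(outputPath, FileMode.Create, FileAccess.Write))
{
    pdfStream.CopyTo(outputStream);
}

// Report how many bytes reached the file.
long written = new FileInfo(outputPath).Length;
Console.WriteLine($"Wrote {written} bytes to {outputPath}");
```

If the rewind is skipped in this sketch, the output file is created but left empty, which is a symptom worth checking for when a saved PDF turns out to be zero bytes.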
+
+## Extract Data as JSON

 To extract form fields across a PDF document using the **ExtractDataAsJson** method of the **DataExtractor** class with form recognition options, refer to the following code example:

@@ -150,7 +210,7 @@ using (FileStream stream = new FileStream("Input.pdf", FileMode.Open, FileAccess

 //Enable form detection in the document.
 extractor.EnableFormDetection = true;
-extractor.EnableTableDetection = false;
+extractor.EnableTableDetection = true;
 //Set confidence threshold for extraction.
 extractor.ConfidenceThreshold = 0.6;
 //Configure form recognition options.
@@ -217,7 +277,7 @@ using (FileStream stream = new FileStream("Input.pdf", FileMode.Open, FileAccess

 {% endtabs %}

-## Extract form data as PDF
+## Enable Form Detection

 To extract form fields across a PDF document and save them as a PDF output using the **ExtractDataAsPdfDocument** method of the **DataExtractor** class with form recognition options, refer to the following code example:

@@ -319,7 +379,7 @@ using (FileStream stream = new FileStream("Input.pdf", FileMode.Open, FileAccess

 {% endtabs %}

-## Extract table data as PDF
+## Enable Table Detection

 To extract tables across a PDF document and save them as a PDF output using the **ExtractDataAsPdfDocument** method of the **DataExtractor** class with table extraction options, refer to the following code example:

@@ -406,7 +466,7 @@ using (FileStream stream = new FileStream("Input.pdf", FileMode.Open, FileAccess

 {% endtabs %}

-## Apply confidence threshold to extract the data
+## Apply Confidence Threshold to Extract the Data

 To apply confidence thresholding when extracting data from a PDF document using the ExtractDataAsPdfDocument method of the DataExtractor class, refer to the following code example: