Conversation
Reviewer's GuideAdds instance methods to PdfReader for setting PDF content from a stream or base64 data, introduces hashing/length-based change detection for stream-sourced PDFs, allows nullable OnGetStreamAsync, and updates the JS interop to use object URLs and properly manage PDF data lifecycle. Sequence diagram for PdfReader SetPdfStreamAsync and JS interopsequenceDiagram
participant Caller
participant PdfReader
participant JSRuntime
participant PdfReaderJsModule as PdfReaderJs
participant Browser
Caller->>PdfReader: SetPdfStreamAsync(stream)
activate PdfReader
PdfReader->>PdfReader: Copy stream to MemoryStream
PdfReader->>PdfReader: pdfBytes = memoryStream.ToArray()
PdfReader->>PdfReader: _lastStreamLength = pdfBytes.Length
alt NET6_0
PdfReader->>PdfReader: _lastStreamHash = ComputerHash(pdfBytes)
else OtherFramework
PdfReader->>PdfReader: _lastStreamHash = ComputerHash(stream)
end
PdfReader->>JSRuntime: InvokeVoidAsync setData(Id, pdfBytes)
activate JSRuntime
JSRuntime->>PdfReaderJs: setData(id, data)
activate PdfReaderJs
PdfReaderJs->>PdfReaderJs: if objectUrl exists revokeObjectURL(objectUrl)
PdfReaderJs->>PdfReaderJs: options.url = createObjectURLFromByte(data)
PdfReaderJs->>PdfReaderJs: options.data = null
PdfReaderJs->>Browser: URL.createObjectURL(blob)
PdfReaderJs-->>JSRuntime: loadPdf resolved
deactivate PdfReaderJs
JSRuntime-->>PdfReader: completed
deactivate JSRuntime
PdfReader-->>Caller: Task completed
deactivate PdfReader
Class diagram for updated PdfReader component and JS moduleclassDiagram
class PdfReader {
+Func~Task~Stream?~~ OnGetStreamAsync
-bool _enableThumbnails
-bool _showToolbar
-PdfReaderFitMode _fitMode
-string _lastStreamHash
-long _lastStreamLength
+Task OnAfterRenderAsync(bool firstRender)
-Task SetPdfStream(Stream? stream)
-Task~byte[]?~ GetPdfStreamDataAsync()
-static Task~byte[]~ GetBytes(Stream stream)
+Task SetPdfStreamAsync(Stream stream)
+Task SetPdfBase64DataAsync(string base64Data)
-static string ComputerHash(byte[] data)
-static Task~string~ ComputerHash(Stream stream)
}
class PdfReaderJsModule {
+Promise setUrl(string id, string url)
+Promise setData(string id, byte[] data)
+string createObjectURLFromBase64(string base64Data)
+string createObjectURLFromByte(byte[] bytes)
}
PdfReader --> PdfReaderJsModule : uses JS interop
Flow diagram for SetPdfStream change detection logicflowchart TD
A["SetPdfStream(Stream? stream)"] --> B{stream is null or Stream.Null}
B -->|Yes| C[Set _lastStreamHash to empty]
C --> D[Set _lastStreamLength to 0]
D --> E[Invoke setData with null]
E --> Z[End]
B -->|No| F[Read all bytes from stream into pdfBytes]
F --> G[Set currentLength = pdfBytes.Length]
G --> H{currentLength != _lastStreamLength}
H -->|Yes| I[Set _lastStreamLength = currentLength]
I --> J[Invoke setData with pdfBytes]
J --> Z
H -->|No| K[Compute currentHash from data or stream]
K --> L{currentHash != _lastStreamHash}
L -->|Yes| M[Set _lastStreamHash = currentHash]
M --> N[Invoke setData with pdfBytes]
N --> Z
L -->|No| Z[End without JS update]
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- The new hashing logic uses the original stream after it has already been copied (e.g., in SetPdfStream and SetPdfStreamAsync for the non-NET6 path), which will leave the stream at the end and can produce incorrect hashes for non-seekable streams; consider computing the hash from the buffered byte[] instead to avoid reliance on stream position/seekability.
- SetPdfStreamAsync and SetPdfBase64DataAsync update the viewer data but only SetPdfStreamAsync updates _lastStreamHash/_lastStreamLength; this inconsistency can cause the later change-detection in SetPdfStream to misbehave—either keep those fields in sync for both entry points or centralize the state update in a shared helper.
- The createObjectURLFromBase64 helper in PdfReader.razor.js is currently unused, and switching from data/object_urls back to a plain URL in setUrl doesn’t revoke any existing objectUrl; consider either removing the unused helper or wiring it up, and ensure previous object URLs are consistently revoked when no longer needed to avoid memory leaks.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new hashing logic uses the original stream after it has already been copied (e.g., in SetPdfStream and SetPdfStreamAsync for the non-NET6 path), which will leave the stream at the end and can produce incorrect hashes for non-seekable streams; consider computing the hash from the buffered byte[] instead to avoid reliance on stream position/seekability.
- SetPdfStreamAsync and SetPdfBase64DataAsync update the viewer data but only SetPdfStreamAsync updates _lastStreamHash/_lastStreamLength; this inconsistency can cause the later change-detection in SetPdfStream to misbehave—either keep those fields in sync for both entry points or centralize the state update in a shared helper.
- The createObjectURLFromBase64 helper in PdfReader.razor.js is currently unused, and switching from data/object_urls back to a plain URL in setUrl doesn’t revoke any existing objectUrl; consider either removing the unused helper or wiring it up, and ensure previous object URLs are consistently revoked when no longer needed to avoid memory leaks.
## Individual Comments
### Comment 1
<location> `src/components/BootstrapBlazor.PdfReader/PdfReader.razor.cs:248-257` </location>
<code_context>
+ }
+ }
+
+ private async Task SetPdfStream(Stream? stream)
+ {
+ if (stream == null || stream == Stream.Null)
+ {
+ _lastStreamHash = string.Empty;
+ _lastStreamLength = 0;
+ await InvokeVoidAsync("setData", Id, null);
+ return;
+ }
+
+ byte[] pdfBytes = await GetBytes(stream);
+
+ var currentLength = pdfBytes.Length;
+ if (_lastStreamLength != currentLength)
+ {
+ _lastStreamLength = currentLength;
+ await InvokeVoidAsync("setData", Id, pdfBytes);
+ return;
+ }
+
+#if NET6_0
+ var currentHash = ComputerHash(pdfBytes);
+#else
+ var currentHash = await ComputerHash(stream);
+#endif
+ if (_lastStreamHash != currentHash)
</code_context>
<issue_to_address>
**issue (bug_risk):** Hashing the original stream in the non-NET6 path is fragile for non-seekable streams and unnecessarily re-reads the stream.
In `SetPdfStream`, the non-NET6 branch hashes the original `stream` after `GetBytes` has fully read it. For non-seekable streams this means `ComputerHash(stream)` runs at end-of-stream and produces an incorrect hash, and for seekable streams it forces a second pass. Since you already have `pdfBytes`, consider hashing `pdfBytes` in both NET6 and non-NET6 builds to avoid dependence on stream seekability and double-reading.
</issue_to_address>
### Comment 2
<location> `src/components/BootstrapBlazor.PdfReader/PdfReader.razor.cs:311-320` </location>
<code_context>
+ /// </summary>
+ /// <param name="stream"></param>
+ /// <returns></returns>
+ public async Task SetPdfStreamAsync(Stream stream)
+ {
+ using var memoryStream = new MemoryStream();
+ await stream.CopyToAsync(memoryStream);
+ var pdfBytes = memoryStream.ToArray();
+ _lastStreamLength = pdfBytes.Length;
+#if NET6_0
+ _lastStreamHash = ComputerHash(pdfBytes);
+#else
+ _lastStreamHash = await ComputerHash(stream);
+#endif
+ await InvokeVoidAsync("setData", Id, pdfBytes);
</code_context>
<issue_to_address>
**issue (bug_risk):** Using the original stream for hashing after copying introduces similar seekability issues and inconsistency with the byte-based path.
In `SetPdfStreamAsync`, you copy `stream` into `memoryStream` but, for non-NET6, you still compute `_lastStreamHash` from the original `stream`. As with `SetPdfStream`, this fails for non-seekable streams (you’ll hash only the remaining tail, often empty) and the hash won’t match the rendered `pdfBytes`. To keep behavior correct and consistent, compute the hash from `pdfBytes` for both NET6 and non-NET6 instead of using the original `stream`.
</issue_to_address>
### Comment 3
<location> `src/components/BootstrapBlazor.PdfReader/PdfReader.razor.js:38` </location>
<code_context>
export async function setUrl(id, url) {
</code_context>
<issue_to_address>
**issue (bug_risk):** Switching from a blob-based data source to a URL does not revoke any existing object URL, which can leak memory.
`setData` revokes any existing `pdf.objectUrl` before creating a new blob URL, but `setUrl` only updates `options.url` and leaves a previous `objectUrl` intact. If a consumer calls `setData` then `setUrl`, the old object URL is never revoked, causing a memory leak. Consider mirroring the cleanup in `setUrl` by revoking `pdf.objectUrl` (if set) before assigning the new URL.
</issue_to_address>
### Comment 4
<location> `src/components/BootstrapBlazor.PdfReader/PdfReader.razor.cs:135` </location>
<code_context>
/// <remarks>优先使用 <see cref="Url"/> 未提供 <see cref="Url"/> 时会尝试调用此回调获得流进行渲染</remarks>
[Parameter]
- public Func<Task<Stream>>? OnGetStreamAsync { get; set; }
+ public Func<Task<Stream?>>? OnGetStreamAsync { get; set; }
[Inject, NotNull]
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the new PDF stream/byte handling into a single helper that takes byte[] input, updates hash/length state, and performs JS interop so all call sites delegate to it.
You can simplify the new logic and reduce duplication by centralizing “bytes in → state + JS out” into a single helper and always hashing on `byte[]`. That also lets you reuse `GetBytes` everywhere and keep `_lastStreamHash` / `_lastStreamLength` management in one place.
### 1. Introduce a canonical “bytes in, state + JS out” helper
```csharp
private async Task SetPdfBytesAsync(byte[]? pdfBytes)
{
if (pdfBytes is null || pdfBytes.Length == 0)
{
_lastStreamHash = string.Empty;
_lastStreamLength = 0;
await InvokeVoidAsync("setData", Id, null);
return;
}
var currentLength = pdfBytes.Length;
if (_lastStreamLength != currentLength)
{
_lastStreamLength = currentLength;
_lastStreamHash = ComputeHash(pdfBytes);
await InvokeVoidAsync("setData", Id, pdfBytes);
return;
}
var currentHash = ComputeHash(pdfBytes);
if (_lastStreamHash != currentHash)
{
_lastStreamHash = currentHash;
await InvokeVoidAsync("setData", Id, pdfBytes);
}
}
```
Then update all call sites to funnel through this:
```csharp
private async Task SetPdfStream(Stream? stream)
{
if (stream == null || stream == Stream.Null)
{
await SetPdfBytesAsync(null);
return;
}
var pdfBytes = await GetBytes(stream);
await SetPdfBytesAsync(pdfBytes);
}
public async Task SetPdfStreamAsync(Stream stream)
{
var pdfBytes = await GetBytes(stream);
await SetPdfBytesAsync(pdfBytes);
}
public async Task SetPdfBase64DataAsync(string base64Data)
{
var pdfBytes = Convert.FromBase64String(base64Data);
await SetPdfBytesAsync(pdfBytes);
}
private async Task<byte[]?> GetPdfStreamDataAsync()
{
if (OnGetStreamAsync is null)
return null;
var stream = await OnGetStreamAsync();
if (stream == null || stream == Stream.Null)
{
await SetPdfBytesAsync(null);
return null;
}
var pdfBytes = await GetBytes(stream);
await SetPdfBytesAsync(pdfBytes);
return pdfBytes;
}
```
This keeps all state changes and JS interop in one place (`SetPdfBytesAsync`), and all stream/base64 APIs just convert to `byte[]` and delegate.
### 2. Remove duplicated conditional compilation for hashing
Instead of `ComputerHash(byte[])` vs `ComputerHash(Stream)` in multiple places, unify on `byte[]`:
```csharp
private static string ComputeHash(byte[] data)
{
var hashBytes = System.Security.Cryptography.SHA256.HashData(data);
return Convert.ToBase64String(hashBytes);
}
```
You can then delete the `#if NET6_0` / `#else` blocks and the `ComputerHash(Stream)` overload entirely, since all code paths now hash on the byte array returned by `GetBytes`.
### 3. Use `GetBytes` consistently
Replace the manual `MemoryStream` copy in `SetPdfStreamAsync` with `GetBytes` (as shown above). This removes the last duplicate stream→byte logic:
```csharp
private static async Task<byte[]> GetBytes(Stream stream)
{
using var memoryStream = new MemoryStream();
await stream.CopyToAsync(memoryStream);
return memoryStream.ToArray();
}
```
### 4. Keep URL/stream state transitions consistent
You already clear `_lastStreamHash` when `Url` changes:
```csharp
if (_url != Url)
{
_url = Url;
_lastStreamHash = string.Empty;
await InvokeVoidAsync("setUrl", Id, _url);
}
```
With `SetPdfBytesAsync` as the single place that resets `_lastStreamHash` and `_lastStreamLength` for stream-based data, you avoid scattering state resets across `SetPdfStream`, `GetPdfStreamDataAsync`, and `SetPdfStreamAsync`, while preserving all current behavior.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| private async Task SetPdfStream(Stream? stream) | ||
| { | ||
| if (stream == null || stream == Stream.Null) | ||
| { | ||
| _lastStreamHash = string.Empty; | ||
| _lastStreamLength = 0; | ||
| await InvokeVoidAsync("setData", Id, null); | ||
| return; | ||
| } | ||
|
|
There was a problem hiding this comment.
issue (bug_risk): Hashing the original stream in the non-NET6 path is fragile for non-seekable streams and unnecessarily re-reads the stream.
In SetPdfStream, the non-NET6 branch hashes the original stream after GetBytes has fully read it. For non-seekable streams this means ComputerHash(stream) runs at end-of-stream and produces an incorrect hash, and for seekable streams it forces a second pass. Since you already have pdfBytes, consider hashing pdfBytes in both NET6 and non-NET6 builds to avoid dependence on stream seekability and double-reading.
| public async Task SetPdfStreamAsync(Stream stream) | ||
| { | ||
| using var memoryStream = new MemoryStream(); | ||
| await stream.CopyToAsync(memoryStream); | ||
| var pdfBytes = memoryStream.ToArray(); | ||
| _lastStreamLength = pdfBytes.Length; | ||
| #if NET6_0 | ||
| _lastStreamHash = ComputerHash(pdfBytes); | ||
| #else | ||
| _lastStreamHash = await ComputerHash(stream); |
There was a problem hiding this comment.
issue (bug_risk): Using the original stream for hashing after copying introduces similar seekability issues and inconsistency with the byte-based path.
In SetPdfStreamAsync, you copy stream into memoryStream but, for non-NET6, you still compute _lastStreamHash from the original stream. As with SetPdfStream, this fails for non-seekable streams (you’ll hash only the remaining tail, often empty) and the hash won’t match the rendered pdfBytes. To keep behavior correct and consistent, compute the hash from pdfBytes for both NET6 and non-NET6 instead of using the original stream.
| @@ -36,7 +36,6 @@ export async function setUrl(id, url) { | |||
|
|
|||
| const { options } = pdf; | |||
| options.url = url; | |||
There was a problem hiding this comment.
issue (bug_risk): Switching from a blob-based data source to a URL does not revoke any existing object URL, which can leak memory.
setData revokes any existing pdf.objectUrl before creating a new blob URL, but setUrl only updates options.url and leaves a previous objectUrl intact. If a consumer calls setData then setUrl, the old object URL is never revoked, causing a memory leak. Consider mirroring the cleanup in setUrl by revoking pdf.objectUrl (if set) before assigning the new URL.
There was a problem hiding this comment.
Pull request overview
This PR adds two new public instance methods (SetPdfStreamAsync and SetPdfBase64DataAsync) to the PdfReader component, allowing programmatic control over PDF content loading. The changes also refactor how PDF data is handled internally, switching from passing raw byte arrays to the JavaScript layer to creating object URLs for memory efficiency. Additionally, the PR introduces stream change detection using SHA256 hashing and length comparison to avoid redundant updates.
Key Changes
- Added
SetPdfStreamAsyncandSetPdfBase64DataAsyncpublic methods for programmatic PDF loading - Refactored JavaScript layer to use object URLs instead of raw data, with proper memory management
- Implemented stream change detection using SHA256 hashing and length comparison
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| PdfReader.razor.cs | Added two public API methods for setting PDF content, implemented hash-based change detection, and updated OnGetStreamAsync to return nullable Stream |
| PdfReader.razor.js | Refactored setData to use object URLs, added helper functions for URL creation, removed obsolete data clearing in setUrl |
| BootstrapBlazor.PdfReader.csproj | Bumped version from 10.0.12 to 10.0.13 |
Critical Issues Found:
- The hash computation logic in both
SetPdfStreamAsyncandSetPdfStreammethods is broken for non-NET6_0 targets, attempting to hash streams that have already been consumed - Memory leak: object URLs created by
URL.createObjectURLare not being revoked in the dispose function - Several documentation and naming issues that should be addressed
Comments suppressed due to low confidence (1)
src/components/BootstrapBlazor.PdfReader/PdfReader.razor.js:61
- Unused variable createObjectURLFromBase64.
const createObjectURLFromBase64 = base64Data => {
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| private static async Task<string> ComputerHash(Stream stream) | ||
| { | ||
| if (stream.CanSeek) | ||
| { | ||
| stream.Seek(0, SeekOrigin.Begin); | ||
| } | ||
| var hashBytes = await System.Security.Cryptography.SHA256.HashDataAsync(stream); | ||
| return Convert.ToBase64String(hashBytes); | ||
| } |
There was a problem hiding this comment.
The method name has a typo: ComputerHash should be ComputeHash. This is the async overload version of the same incorrectly named method.
| var currentHash = ComputerHash(pdfBytes); | ||
| #else | ||
| var currentHash = await ComputerHash(stream); |
There was a problem hiding this comment.
Method name typo: ComputerHash should be ComputeHash.
| var currentHash = ComputerHash(pdfBytes); | |
| #else | |
| var currentHash = await ComputerHash(stream); | |
| var currentHash = ComputeHash(pdfBytes); | |
| #else | |
| var currentHash = await ComputeHash(stream); |
| var pdfBytes = memoryStream.ToArray(); | ||
| _lastStreamLength = pdfBytes.Length; | ||
| #if NET6_0 | ||
| _lastStreamHash = ComputerHash(pdfBytes); |
There was a problem hiding this comment.
Method name typo: ComputerHash should be ComputeHash.
| #if NET6_0 | ||
| _lastStreamHash = ComputerHash(pdfBytes); | ||
| #else | ||
| _lastStreamHash = await ComputerHash(stream); |
There was a problem hiding this comment.
Method name typo: ComputerHash should be ComputeHash.
| #if NET6_0 | ||
| var currentHash = ComputerHash(pdfBytes); | ||
| #else | ||
| var currentHash = await ComputerHash(stream); | ||
| #endif |
There was a problem hiding this comment.
The hash computation logic is incorrect for non-NET6_0 targets. On line 271, you're trying to compute the hash of stream after it has already been fully read and converted to bytes on line 258. The stream position will be at the end, making the hash computation inaccurate or computing the hash of an already-consumed stream.
Since you already have pdfBytes available, you should compute the hash from the byte array instead. You can create a synchronous overload of ComputerHash that accepts byte[] for non-NET6_0 targets, or convert the conditional compilation to use the same byte-based approach for both targets.
| /// <summary> | ||
| /// 设置 PDF 流数据方法 | ||
| /// </summary> | ||
| /// <param name="stream"></param> |
There was a problem hiding this comment.
The XML documentation for the stream parameter is missing. Public API methods should have complete documentation including parameter descriptions. Add a <param name="stream"> tag to describe what the stream parameter represents (e.g., "The PDF file stream to load").
| /// <param name="stream"></param> | |
| /// <param name="stream">The PDF file stream to load.</param> |
| /// <summary> | ||
| /// 设置 Pdf Base64 数据方法 | ||
| /// </summary> | ||
| /// <param name="base64Data"></param> |
There was a problem hiding this comment.
The XML documentation for the base64Data parameter is missing. Public API methods should have complete documentation including parameter descriptions. Add a <param name="base64Data"> tag to describe what the parameter represents (e.g., "The Base64-encoded PDF data").
| /// <param name="base64Data"></param> | |
| /// <param name="base64Data">The Base64-encoded PDF data.</param> |
| #if NET6_0 | ||
| var currentHash = ComputerHash(pdfBytes); | ||
| #else | ||
| var currentHash = await ComputerHash(stream); |
There was a problem hiding this comment.
Method name typo: ComputerHash should be ComputeHash.
| public async Task SetPdfStreamAsync(Stream stream) | ||
| { | ||
| using var memoryStream = new MemoryStream(); | ||
| await stream.CopyToAsync(memoryStream); | ||
| var pdfBytes = memoryStream.ToArray(); | ||
| _lastStreamLength = pdfBytes.Length; | ||
| #if NET6_0 | ||
| _lastStreamHash = ComputerHash(pdfBytes); | ||
| #else | ||
| _lastStreamHash = await ComputerHash(stream); | ||
| #endif | ||
| await InvokeVoidAsync("setData", Id, pdfBytes); |
There was a problem hiding this comment.
The hash computation logic is incorrect for non-NET6_0 targets. On line 320, you're trying to compute the hash of the original stream parameter after it has already been fully read and copied to memoryStream on line 314. At this point, the stream position is at the end, and even if it's seekable, it will compute the hash of an empty or already-consumed stream.
You should either:
- Compute the hash from
pdfBytesinstead ofstream(like you do for NET6_0) - Pass
memoryStreamto the hash function and ensure it's seeked back to the beginning
The simplest fix is to use ComputerHash(pdfBytes) for the non-NET6_0 case as well, since you already have the byte array.
| /// <returns></returns> | ||
| public async Task SetPdfBase64DataAsync(string base64Data) | ||
| { | ||
| var pdfBytes = Convert.FromBase64String(base64Data); |
There was a problem hiding this comment.
The SetPdfBase64DataAsync method doesn't update the _lastStreamHash and _lastStreamLength fields like SetPdfStreamAsync does. This inconsistency means that if the component state tracking logic depends on these fields, calling this method could lead to unexpected behavior or redundant updates. Consider whether these fields should be updated here for consistency, or document why they don't need to be.
| var pdfBytes = Convert.FromBase64String(base64Data); | |
| var pdfBytes = Convert.FromBase64String(base64Data); | |
| _lastStreamLength = pdfBytes.Length; | |
| #if NET6_0 | |
| _lastStreamHash = ComputerHash(pdfBytes); | |
| #else | |
| using (var memoryStream = new MemoryStream(pdfBytes)) | |
| { | |
| _lastStreamHash = await ComputerHash(memoryStream); | |
| } | |
| #endif |
Link issues
fixes #816
Summary By Copilot
Regression?
Risk
Verification
Packaging changes reviewed?
☑️ Self Check before Merge
Summary by Sourcery
Add support for loading and updating PdfReader content from streams or base64 data while avoiding redundant reloads and improving client-side PDF data handling.
New Features:
Enhancements: