Commit f3eb197

💎 fix: Gemini Image Gen Tool Vertex AI Auth and File Storage (danny-avila#11923)
* chore: saveToCloudStorage function and enhance error handling
  - Removed unnecessary parameters and streamlined the logic for saving images to cloud storage.
  - Introduced buffer handling for base64 image data and improved the integration with file strategy functions.
  - Enhanced error handling during local image saving to ensure robustness.
  - Updated the createGeminiImageTool function to reflect changes in the saveToCloudStorage implementation.

* refactor: streamline image persistence logic in GeminiImageGen
  - Consolidated image saving functionality by renaming and refactoring the saveToCloudStorage function to persistGeneratedImage.
  - Improved error handling and logging for image persistence operations.
  - Enhanced the replaceUnwantedChars function to better sanitize input strings.
  - Updated createGeminiImageTool to reflect changes in image handling and ensure consistent behavior across storage strategies.

* fix: clean up GeminiImageGen by removing unused functions and improving logging
  - Removed the getSafeFormat and persistGeneratedImage functions to streamline image handling.
  - Updated logging in createGeminiImageTool for clarity and consistency.
  - Consolidated imports by eliminating unused dependencies, enhancing code maintainability.

* chore: update environment configuration and manifest for unused GEMINI_VERTEX_ENABLED
  - Removed the Vertex AI configuration option from .env.example to simplify setup.
  - Updated the manifest.json to reflect the removal of the Vertex AI dependency in the authentication field.
  - Cleaned up the createGeminiImageTool function by eliminating unused fields related to Vertex AI, streamlining the code.

* fix: update loadAuthValues call in loadTools function for GeminiImageGen tool
  - Modified the loadAuthValues function call to include throwError: false, preventing exceptions on authentication failures.
  - Removed the unused processFileURL parameter from the tool context object, streamlining the code.

* refactor: streamline GoogleGenAI initialization in GeminiImageGen
  - Removed unused file system access check for Google application credentials, simplifying the environment setup.
  - Added googleAuthOptions to the GoogleGenAI instantiation, enhancing the configuration for authentication.

* fix: update Gemini API Key label and description in manifest.json
  - Changed the label to indicate that the Gemini API Key is optional.
  - Revised the description to clarify usage with Vertex AI and service accounts, enhancing user guidance.

* fix: enhance abort signal handling in createGeminiImageTool
  - Introduced derivedSignal to manage abort events during image generation, improving responsiveness to cancellation requests.
  - Added an abortHandler to log when image generation is aborted, enhancing debugging capabilities.
  - Ensured proper cleanup of event listeners in the finally block to prevent memory leaks.

* fix: update authentication handling for plugins to support optional fields
  - Added support for optional authentication fields in the manifest and PluginAuthForm.
  - Updated the checkPluginAuth function to correctly validate plugins with optional fields.
  - Enhanced tests to cover scenarios with optional authentication fields, ensuring accurate validation logic.
1 parent 1d0a4c5 commit f3eb197

8 files changed

Lines changed: 137 additions & 182 deletions

.env.example

Lines changed: 0 additions & 4 deletions
@@ -243,10 +243,6 @@ GOOGLE_KEY=user_provided
 # Option A: Use dedicated Gemini API key for image generation
 # GEMINI_API_KEY=your-gemini-api-key

-# Option B: Use Vertex AI (no API key needed, uses service account)
-# Set this to enable Vertex AI and allow tool without requiring API keys
-# GEMINI_VERTEX_ENABLED=true
-
 # Vertex AI model for image generation (defaults to gemini-2.5-flash-image)
 # GEMINI_IMAGE_MODEL=gemini-2.5-flash-image

api/app/clients/tools/manifest.json

Lines changed: 4 additions & 3 deletions
@@ -161,9 +161,10 @@
     "icon": "assets/gemini_image_gen.svg",
     "authConfig": [
       {
-        "authField": "GEMINI_API_KEY||GOOGLE_KEY||GEMINI_VERTEX_ENABLED",
-        "label": "Gemini API Key (Optional if Vertex AI is configured)",
-        "description": "Your Google Gemini API Key from <a href='https://aistudio.google.com/app/apikey' target='_blank'>Google AI Studio</a>. Leave blank if using Vertex AI with service account."
+        "authField": "GEMINI_API_KEY||GOOGLE_KEY||GOOGLE_SERVICE_KEY_FILE",
+        "label": "Gemini API Key (optional)",
+        "description": "Your Google Gemini API Key from <a href='https://aistudio.google.com/app/apikey' target='_blank'>Google AI Studio</a>. Leave blank to use Vertex AI with a service account (GOOGLE_SERVICE_KEY_FILE or api/data/auth.json).",
+        "optional": true
       }
     ]
   }
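
Review note: the manifest entry now combines a `||`-separated `authField` with `"optional": true`. As a rough illustration of how a validator might treat such an entry, here is a minimal sketch. `isAuthSatisfied` and the sample configs are hypothetical and are not LibreChat's `checkPluginAuth`; the assumption is that any one non-empty alternative satisfies a required field, and that optional fields never block the tool.

```javascript
// Hypothetical helper, not LibreChat's actual checkPluginAuth implementation.
// Assumption: an authField such as "A||B||C" is satisfied when any one of the
// named variables is set to a non-empty string in `env`.
function isAuthSatisfied(authConfig, env) {
  return authConfig.every((entry) => {
    if (entry.optional === true) {
      return true; // optional fields never block the tool
    }
    return entry.authField
      .split('||')
      .some((name) => typeof env[name] === 'string' && env[name].trim() !== '');
  });
}

// Sample entries for illustration only.
const geminiAuthConfig = [
  { authField: 'GEMINI_API_KEY||GOOGLE_KEY||GOOGLE_SERVICE_KEY_FILE', optional: true },
];
const requiredAuthConfig = [{ authField: 'GEMINI_API_KEY||GOOGLE_KEY' }];

console.log(isAuthSatisfied(geminiAuthConfig, {}));
console.log(isAuthSatisfied(requiredAuthConfig, {}));
```

With this reading, the Gemini tool can be attached to an agent without any key stored, and the credential check is deferred to the moment the tool runs.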

api/app/clients/tools/structured/GeminiImageGen.js

Lines changed: 40 additions & 164 deletions
@@ -1,17 +1,11 @@
-const fs = require('fs');
 const path = require('path');
 const sharp = require('sharp');
 const { v4 } = require('uuid');
 const { ProxyAgent } = require('undici');
 const { GoogleGenAI } = require('@google/genai');
 const { tool } = require('@langchain/core/tools');
 const { logger } = require('@librechat/data-schemas');
-const {
-  FileContext,
-  ContentTypes,
-  FileSources,
-  EImageOutputType,
-} = require('librechat-data-provider');
+const { ContentTypes, EImageOutputType } = require('librechat-data-provider');
 const {
   geminiToolkit,
   loadServiceKey,
@@ -59,17 +53,12 @@ const displayMessage =
  * @returns {string} - The processed string
  */
 function replaceUnwantedChars(inputString) {
-  return inputString?.replace(/[^\w\s\-_.,!?()]/g, '') || '';
-}
-
-/**
- * Validate and sanitize image format
- * @param {string} format - The format to validate
- * @returns {string} - Safe format
- */
-function getSafeFormat(format) {
-  const allowedFormats = ['png', 'jpg', 'jpeg', 'webp', 'gif'];
-  return allowedFormats.includes(format?.toLowerCase()) ? format.toLowerCase() : 'png';
+  return (
+    inputString
+      ?.replace(/\r\n|\r|\n/g, ' ')
+      .replace(/"/g, '')
+      .trim() || ''
+  );
 }

 /**
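
Review note: the rewritten `replaceUnwantedChars` no longer strips every character outside a small allow-list; it only flattens line breaks, drops double quotes, and trims. The snippet below copies the added lines so the behavior can be checked in isolation. The sample prompt is made up for illustration.

```javascript
// Copied from the added lines of the hunk above so it can be run on its own.
function replaceUnwantedChars(inputString) {
  return (
    inputString
      ?.replace(/\r\n|\r|\n/g, ' ')
      .replace(/"/g, '')
      .trim() || ''
  );
}

// Illustrative input: quotes, a Windows line break, and a trailing newline.
const cleaned = replaceUnwantedChars('  A "red" fox,\r\njumping: 50% higher\n');
console.log(cleaned); // prints: A red fox, jumping: 50% higher

// Optional chaining short-circuits the whole chain for undefined input.
console.log(replaceUnwantedChars(undefined) === ''); // prints: true
```

The removed regex (`/[^\w\s\-_.,!?()]/g`) would also have deleted the `:` and `%` in this prompt, so the new version preserves more of what the user typed.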
@@ -117,11 +106,8 @@ async function initializeGeminiClient(options = {}) {
     return new GoogleGenAI({ apiKey: googleKey });
   }

-  // Fall back to Vertex AI with service account
   logger.debug('[GeminiImageGen] Using Vertex AI with service account');
   const credentialsPath = getDefaultServiceKeyPath();
-
-  // Use loadServiceKey for consistent loading (supports file paths, JSON strings, base64)
   const serviceKey = await loadServiceKey(credentialsPath);

   if (!serviceKey || !serviceKey.project_id) {
@@ -131,75 +117,14 @@ async function initializeGeminiClient(options = {}) {
     );
   }

-  // Set GOOGLE_APPLICATION_CREDENTIALS for any Google Cloud SDK dependencies
-  try {
-    await fs.promises.access(credentialsPath);
-    process.env.GOOGLE_APPLICATION_CREDENTIALS = credentialsPath;
-  } catch {
-    // File doesn't exist, skip setting env var
-  }
-
   return new GoogleGenAI({
     vertexai: true,
     project: serviceKey.project_id,
     location: process.env.GOOGLE_LOC || process.env.GOOGLE_CLOUD_LOCATION || 'global',
+    googleAuthOptions: { credentials: serviceKey },
   });
 }

-/**
- * Save image to local filesystem
- * @param {string} base64Data - Base64 encoded image data
- * @param {string} format - Image format
- * @param {string} userId - User ID
- * @returns {Promise<string>} - The relative URL
- */
-async function saveImageLocally(base64Data, format, userId) {
-  const safeFormat = getSafeFormat(format);
-  const safeUserId = userId ? path.basename(userId) : 'default';
-  const imageName = `gemini-img-${v4()}.${safeFormat}`;
-  const userDir = path.join(process.cwd(), 'client/public/images', safeUserId);
-
-  await fs.promises.mkdir(userDir, { recursive: true });
-
-  const filePath = path.join(userDir, imageName);
-  await fs.promises.writeFile(filePath, Buffer.from(base64Data, 'base64'));
-
-  logger.debug('[GeminiImageGen] Image saved locally to:', filePath);
-  return `/images/${safeUserId}/${imageName}`;
-}
-
-/**
- * Save image to cloud storage
- * @param {Object} params - Parameters
- * @returns {Promise<string|null>} - The storage URL or null
- */
-async function saveToCloudStorage({ base64Data, format, processFileURL, fileStrategy, userId }) {
-  if (!processFileURL || !fileStrategy || !userId) {
-    return null;
-  }
-
-  try {
-    const safeFormat = getSafeFormat(format);
-    const safeUserId = path.basename(userId);
-    const dataURL = `data:image/${safeFormat};base64,${base64Data}`;
-    const imageName = `gemini-img-${v4()}.${safeFormat}`;
-
-    const result = await processFileURL({
-      URL: dataURL,
-      basePath: 'images',
-      userId: safeUserId,
-      fileName: imageName,
-      fileStrategy,
-      context: FileContext.image_generation,
-    });
-
-    return result.filepath;
-  } catch (error) {
-    logger.error('[GeminiImageGen] Error saving to cloud storage:', error);
-    return null;
-  }
-}
-
 /**
  * Convert image files to Gemini inline data format
  * @param {Object} params - Parameters
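
Review note: the Vertex AI path now passes the parsed service key to the client through `googleAuthOptions` instead of setting `GOOGLE_APPLICATION_CREDENTIALS`. The sketch below isolates the options object from the hunk above so it can be inspected without network access or the SDK installed. `buildVertexClientOptions`, its error message, and the sample key are hypothetical; the result is what would be handed to `new GoogleGenAI(...)`.

```javascript
// Hypothetical helper that mirrors the options object built in the hunk above.
// It deliberately does not import @google/genai.
function buildVertexClientOptions(serviceKey, env = process.env) {
  if (!serviceKey || !serviceKey.project_id) {
    // The real error text is elided in the diff; this message is a placeholder.
    throw new Error('Service key is missing project_id');
  }
  return {
    vertexai: true,
    project: serviceKey.project_id,
    location: env.GOOGLE_LOC || env.GOOGLE_CLOUD_LOCATION || 'global',
    // Credentials travel in memory; no environment variable is mutated.
    googleAuthOptions: { credentials: serviceKey },
  };
}

// Made-up service key fields for illustration.
const exampleKey = {
  project_id: 'demo-project',
  client_email: 'svc@demo-project.iam.gserviceaccount.com',
};
const vertexOptions = buildVertexClientOptions(exampleKey, {
  GOOGLE_CLOUD_LOCATION: 'us-central1',
});
console.log(vertexOptions.project, vertexOptions.location);
```

One practical effect of this approach is that a key loaded from a JSON string or base64 value, which has no file path, can still authenticate.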
@@ -390,34 +315,18 @@ function createGeminiImageTool(fields = {}) {
     throw new Error('This tool is only available for agents.');
   }

-  // Skip validation during tool creation - validation happens at runtime in initializeGeminiClient
-  // This allows the tool to be added to agents when using Vertex AI without requiring API keys
-  // The actual credentials check happens when the tool is invoked
-
-  const {
-    req,
-    imageFiles = [],
-    processFileURL,
-    userId,
-    fileStrategy,
-    GEMINI_API_KEY,
-    GOOGLE_KEY,
-    // GEMINI_VERTEX_ENABLED is used for auth validation only (not used in code)
-    // When set as env var, it signals Vertex AI is configured and bypasses API key requirement
-  } = fields;
+  const { req, imageFiles = [], userId, fileStrategy, GEMINI_API_KEY, GOOGLE_KEY } = fields;

   const imageOutputType = fields.imageOutputType || EImageOutputType.PNG;

   const geminiImageGenTool = tool(
-    async ({ prompt, image_ids, aspectRatio, imageSize }, _runnableConfig) => {
+    async ({ prompt, image_ids, aspectRatio, imageSize }, runnableConfig) => {
       if (!prompt) {
         throw new Error('Missing required field: prompt');
       }

-      logger.debug('[GeminiImageGen] Generating image with prompt:', prompt?.substring(0, 100));
-      logger.debug('[GeminiImageGen] Options:', { aspectRatio, imageSize });
+      logger.debug('[GeminiImageGen] Generating image', { aspectRatio, imageSize });

-      // Initialize Gemini client with user-provided credentials
       let ai;
       try {
         ai = await initializeGeminiClient({
@@ -432,10 +341,8 @@ function createGeminiImageTool(fields = {}) {
         ];
       }

-      // Build request contents
       const contents = [{ text: replaceUnwantedChars(prompt) }];

-      // Add context images if provided
       if (image_ids?.length > 0) {
         const contextImages = await convertImagesToInlineData({
           imageFiles,
@@ -447,28 +354,34 @@ function createGeminiImageTool(fields = {}) {
         logger.debug('[GeminiImageGen] Added', contextImages.length, 'context images');
       }

-      // Generate image
       let apiResponse;
       const geminiModel = process.env.GEMINI_IMAGE_MODEL || 'gemini-2.5-flash-image';
-      try {
-        // Build config with optional imageConfig
-        const config = {
-          responseModalities: ['TEXT', 'IMAGE'],
-        };
-
-        // Add imageConfig if aspectRatio or imageSize is specified
-        // Note: gemini-2.5-flash-image doesn't support imageSize
-        const supportsImageSize = !geminiModel.includes('gemini-2.5-flash-image');
-        if (aspectRatio || (imageSize && supportsImageSize)) {
-          config.imageConfig = {};
-          if (aspectRatio) {
-            config.imageConfig.aspectRatio = aspectRatio;
-          }
-          if (imageSize && supportsImageSize) {
-            config.imageConfig.imageSize = imageSize;
-          }
+      const config = {
+        responseModalities: ['TEXT', 'IMAGE'],
+      };
+
+      const supportsImageSize = !geminiModel.includes('gemini-2.5-flash-image');
+      if (aspectRatio || (imageSize && supportsImageSize)) {
+        config.imageConfig = {};
+        if (aspectRatio) {
+          config.imageConfig.aspectRatio = aspectRatio;
         }
+        if (imageSize && supportsImageSize) {
+          config.imageConfig.imageSize = imageSize;
+        }
+      }
+
+      let derivedSignal = null;
+      let abortHandler = null;

+      if (runnableConfig?.signal) {
+        derivedSignal = AbortSignal.any([runnableConfig.signal]);
+        abortHandler = () => logger.debug('[GeminiImageGen] Image generation aborted');
+        derivedSignal.addEventListener('abort', abortHandler, { once: true });
+        config.abortSignal = derivedSignal;
+      }
+
+      try {
         apiResponse = await ai.models.generateContent({
           model: geminiModel,
           contents,
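
Review note: the config-building lines in this hunk now sit outside the `try` block, which makes them easy to read as a pure function. The sketch below restates that logic in isolation. `buildGenerationConfig` and the model name `example-image-model` are hypothetical; the `imageSize` restriction for `gemini-2.5-flash-image` comes from the comment removed in the diff.

```javascript
// Hypothetical refactoring of the config lines above into a pure function.
function buildGenerationConfig({ model, aspectRatio, imageSize }) {
  const config = { responseModalities: ['TEXT', 'IMAGE'] };
  // Per the diff, models whose name contains this string skip imageSize.
  const supportsImageSize = !model.includes('gemini-2.5-flash-image');
  if (aspectRatio || (imageSize && supportsImageSize)) {
    config.imageConfig = {};
    if (aspectRatio) {
      config.imageConfig.aspectRatio = aspectRatio;
    }
    if (imageSize && supportsImageSize) {
      config.imageConfig.imageSize = imageSize;
    }
  }
  return config;
}

// imageSize is silently dropped for the default model.
const flashConfig = buildGenerationConfig({
  model: 'gemini-2.5-flash-image',
  aspectRatio: '1:1',
  imageSize: '2K',
});
// 'example-image-model' is a placeholder name, not a real model identifier.
const otherConfig = buildGenerationConfig({
  model: 'example-image-model',
  aspectRatio: '16:9',
  imageSize: '2K',
});
console.log(flashConfig.imageConfig, otherConfig.imageConfig);
```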
@@ -480,9 +393,12 @@ function createGeminiImageTool(fields = {}) {
           [{ type: ContentTypes.TEXT, text: `Image generation failed: ${error.message}` }],
           { content: [], file_ids: [] },
         ];
+      } finally {
+        if (abortHandler && derivedSignal) {
+          derivedSignal.removeEventListener('abort', abortHandler);
+        }
       }

-      // Check for safety blocks
       const safetyBlock = checkForSafetyBlock(apiResponse);
       if (safetyBlock) {
         logger.warn('[GeminiImageGen] Safety block:', safetyBlock);
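
Review note: the abort handling added across the two hunks above follows a derive, listen, and clean-up pattern. The sketch below reproduces it against a stand-in for the API call so it can be run locally. `fakeGenerate` and `generateWithAbort` are hypothetical, and the example assumes a runtime that provides `AbortSignal.any` (Node.js 20.3 or later).

```javascript
// Sketch only: fakeGenerate stands in for ai.models.generateContent.
function fakeGenerate({ abortSignal }) {
  return new Promise((resolve, reject) => {
    if (abortSignal?.aborted) {
      reject(new Error('aborted'));
      return;
    }
    const timer = setTimeout(() => resolve('image-bytes'), 50);
    abortSignal?.addEventListener(
      'abort',
      () => {
        clearTimeout(timer);
        reject(new Error('aborted'));
      },
      { once: true },
    );
  });
}

async function generateWithAbort(parentSignal, log = []) {
  let derivedSignal = null;
  let abortHandler = null;
  const config = {};

  if (parentSignal) {
    // Derive a signal so listeners attach to it rather than to the caller's signal.
    derivedSignal = AbortSignal.any([parentSignal]);
    abortHandler = () => log.push('aborted');
    derivedSignal.addEventListener('abort', abortHandler, { once: true });
    config.abortSignal = derivedSignal;
  }

  try {
    return await fakeGenerate(config);
  } catch (error) {
    return `failed: ${error.message}`;
  } finally {
    // Covers the success path, where the once-listener never fired.
    if (abortHandler && derivedSignal) {
      derivedSignal.removeEventListener('abort', abortHandler);
    }
  }
}

const controller = new AbortController();
const abortLog = [];
const pending = generateWithAbort(controller.signal, abortLog);
controller.abort();
pending.then((result) => console.log(result, abortLog));
```

Because the listener is registered with `{ once: true }`, it removes itself after an abort; the `finally` block matters for the case where generation completes normally and the listener would otherwise stay attached.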
@@ -509,46 +425,7 @@ function createGeminiImageTool(fields = {}) {
       const imageData = convertedBuffer.toString('base64');
       const mimeType = outputFormat === 'jpeg' ? 'image/jpeg' : `image/${outputFormat}`;

-      logger.debug('[GeminiImageGen] Image format:', { outputFormat, mimeType });
-
-      let imageUrl;
-      const useLocalStorage = !fileStrategy || fileStrategy === FileSources.local;
-
-      if (useLocalStorage) {
-        try {
-          imageUrl = await saveImageLocally(imageData, outputFormat, userId);
-        } catch (error) {
-          logger.error('[GeminiImageGen] Local save failed:', error);
-          imageUrl = `data:${mimeType};base64,${imageData}`;
-        }
-      } else {
-        const cloudUrl = await saveToCloudStorage({
-          base64Data: imageData,
-          format: outputFormat,
-          processFileURL,
-          fileStrategy,
-          userId,
-        });
-
-        if (cloudUrl) {
-          imageUrl = cloudUrl;
-        } else {
-          // Fallback to local
-          try {
-            imageUrl = await saveImageLocally(imageData, outputFormat, userId);
-          } catch (_error) {
-            imageUrl = `data:${mimeType};base64,${imageData}`;
-          }
-        }
-      }
-
-      logger.debug('[GeminiImageGen] Image URL:', imageUrl);
-
-      // For the artifact, we need a data URL (same as OpenAI)
-      // The local file save is for persistence, but the response needs a data URL
       const dataUrl = `data:${mimeType};base64,${imageData}`;
-
-      // Return in content_and_artifact format (same as OpenAI)
       const file_ids = [v4()];
       const content = [
         {
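
Review note: with the local and cloud save branches removed, the only representation of the image that this hunk keeps is the data URL. The sketch below shows that construction on its own. `toDataUrl` is a hypothetical helper, and the four-byte buffer is just the start of the PNG signature, used here as sample input.

```javascript
// Hypothetical helper mirroring the mimeType and dataUrl lines kept by the hunk above.
function toDataUrl(buffer, outputFormat) {
  const imageData = buffer.toString('base64');
  const mimeType = outputFormat === 'jpeg' ? 'image/jpeg' : `image/${outputFormat}`;
  return `data:${mimeType};base64,${imageData}`;
}

// First four bytes of the PNG file signature, used only as sample data.
const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47]);
const url = toDataUrl(pngSignature, 'png');
console.log(url); // prints: data:image/png;base64,iVBORw==
```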
@@ -567,8 +444,7 @@ function createGeminiImageTool(fields = {}) {
         },
       ];

-      // Record token usage for balance tracking (don't await to avoid blocking response)
-      const conversationId = _runnableConfig?.configurable?.thread_id;
+      const conversationId = runnableConfig?.configurable?.thread_id;
       recordTokenUsage({
         usageMetadata: apiResponse.usageMetadata,
         req,

api/app/clients/tools/util/handleTools.js

Lines changed: 1 addition & 2 deletions
@@ -207,7 +207,7 @@ const loadTools = async ({
     },
     gemini_image_gen: async (toolContextMap) => {
       const authFields = getAuthFields('gemini_image_gen');
-      const authValues = await loadAuthValues({ userId: user, authFields });
+      const authValues = await loadAuthValues({ userId: user, authFields, throwError: false });
       const imageFiles = options.tool_resources?.[EToolResources.image_edit]?.files ?? [];
       const toolContext = buildImageToolContext({
         imageFiles,
@@ -222,7 +222,6 @@ const loadTools = async ({
         isAgent: !!agent,
         req: options.req,
         imageFiles,
-        processFileURL: options.processFileURL,
         userId: user,
         fileStrategy,
       });
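
Review note: passing `throwError: false` lets tool loading continue when no key is stored, so the Vertex AI fallback can be tried at invocation time. The sketch below illustrates that contract with a stand-in loader. `loadAuthValuesSketch` is hypothetical and much simpler than LibreChat's `loadAuthValues`; the assumption is that each auth field is a `||`-separated list of candidate variable names looked up in a plain object.

```javascript
// Hypothetical stand-in for loadAuthValues; the real implementation differs.
function loadAuthValuesSketch({ authFields, env, throwError = true }) {
  const values = {};
  for (const field of authFields) {
    const found = field.split('||').find((name) => env[name]);
    if (found) {
      values[found] = env[found];
    } else if (throwError) {
      throw new Error(`No credentials found for ${field}`);
    }
    // With throwError: false a missing field is skipped silently.
  }
  return values;
}

const fields = ['GEMINI_API_KEY||GOOGLE_KEY||GOOGLE_SERVICE_KEY_FILE'];
const lenient = loadAuthValuesSketch({ authFields: fields, env: {}, throwError: false });
const withKey = loadAuthValuesSketch({ authFields: fields, env: { GOOGLE_KEY: 'k' } });
console.log(lenient, withKey);
```

Under this reading, an empty result is a valid outcome, and the tool itself decides later whether an API key or a service account is available.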
