Telegram bot: analyze images with GPT-4o-Mini/NVIDIA Vila & generate images with Stable Diffusion 3

Created by

Last update

Last update 7 days ago

Introduction

Transform your Telegram bot into an AI vision system using GPT-4o-Mini and NVIDIA Stable Diffusion 3. Perfect for content moderators, researchers, and developers.

Workflow Explanatory

At start: Processes Telegram messages: images→analysis, text→image generation
At Router: Routes by content type
Upper path: Analyzes images using Nvidia Vila + GPT-4o-Mini
Lower path: Generates images from text via Stable Diffusion 3
At Merge: Combines AI results
At Gmail: Emails processed results

How It Works

Telegram Trigger listens for messages (images, text, documents)
Content Router directs images → AI analysis, text → image generation
Image Analysis: Downloads image → GPT-4o-Mini vision analysis → Email results
Image Generation: Text prompt → Stable Diffusion 3 → Email generated image
Gmail Notifications send formatted reports

Prerequisites

Telegram Bot token (via @BotFather)
OpenAI API key (GPT-4 Vision)
NVIDIA API key (free tier available)
Gmail OAuth2 credentials

Setup Steps

** Create Telegram Bot** - Create Telegram bot and obtain token
** Configure API Credentials** - Configure API credentials in HTTP Request nodes
** Set Up Gmail OAuth2** - Set up Gmail OAuth2
** Import and Activate Workflow** - Import workflow, update credentials, and activate

Customization Options

Add more AI models (Anthropic, Gemini)
Route audio/documents to transcription/OCR
Replace Gmail with Slack or Discord
Connect to databases for storage

Benefits

Speed: Seconds per analysis vs. hours manually
Accuracy: AI-powered visual understanding
Intelligence: Historical tracking enables trend analysis