Visionary is a high-end, React-based interface for AI image generation, powered by Google's latest Gemini and Imagen models. It bridges the gap between simple text prompting and professional artistic workflows, offering precision control over composition, style, and perspective in a sleek, dark-mode studio environment.
- Gemini 2.5 Flash Image: Fast, creative generation with multimodal capabilities.
- Gemini 3 Pro Image: High-fidelity generation with support for 2K and 4K resolution.
- Imagen 3 & 4: Specialized text-to-image models for photorealistic results.
Go beyond standard squares. Visionary supports a wide range of aspect ratios mapped to native model capabilities:
- Standard: Square (1:1), Landscape (4:3), Portrait (3:4).
- Modern: Widescreen (16:9), Mobile (9:16).
- Cinematic & Artistic: Ultra-wide Cinematic (21:9) for epic storytelling, Mobile Tall (9:21) for social wallpapers, and Classic Photography (3:2, 2:3).
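To make the ratio options above concrete, here is a minimal sketch that turns a ratio label into pixel dimensions for a preview frame. The names `AspectRatio`, `Size`, and `sizeForRatio`, the `longSide` parameter, and the rounding to multiples of 8 are all illustrative assumptions, not Visionary's actual code.

```typescript
// Hypothetical helper: names and the rounding rule are assumptions for illustration.
type AspectRatio =
  | "1:1" | "4:3" | "3:4" | "16:9" | "9:16"
  | "21:9" | "9:21" | "3:2" | "2:3";

interface Size {
  width: number;
  height: number;
}

// Compute dimensions whose longer side equals `longSide`,
// rounding the shorter side to the nearest multiple of 8.
function sizeForRatio(ratio: AspectRatio, longSide: number): Size {
  const parts = ratio.split(":");
  const w = Number(parts[0]);
  const h = Number(parts[1]);
  const roundTo8 = (n: number): number => Math.round(n / 8) * 8;

  if (w >= h) {
    // Landscape or square: width is the long side.
    return { width: longSide, height: roundTo8((longSide * h) / w) };
  }
  // Portrait: height is the long side.
  return { width: roundTo8((longSide * w) / h), height: longSide };
}
```

For example, `sizeForRatio("16:9", 1024)` yields a 1024 by 576 frame, and `sizeForRatio("9:16", 1024)` yields 576 by 1024.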
The Grid feature uses prompt engineering to instruct the model to generate specific compositions within a single output:
- Character Sheets: Generates front, side, and back views of a character in one generation, ideal for concept art.
- Comic Pages & Storyboards: Creates sequential art layouts with dynamic panels for storytelling.
- Triptychs: Horizontal or vertical three-panel artistic splits.
- Split Screens: Vertical or horizontal splits for comparing variations.
- Standard Grids: 2x2, 3x3, and 4x4 grids for rapid iteration and pattern generation.
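Since the Grid feature works through prompt engineering, one way to picture it is a lookup from layout to an instruction appended to the user's prompt. The sketch below assumes this approach; the `GridLayout` names, the instruction wording, and `applyGrid` are hypothetical and only cover a few of the layouts listed above.

```typescript
// Hypothetical layout keys and instruction text; the app's real wording is not known.
type GridLayout =
  | "none"
  | "character-sheet"
  | "comic-page"
  | "triptych-horizontal"
  | "grid-2x2";

const GRID_INSTRUCTIONS: Record<GridLayout, string> = {
  "none": "",
  "character-sheet":
    "Show front, side, and back views of the same character in one image.",
  "comic-page":
    "Lay out the scene as a comic page with sequential, dynamic panels.",
  "triptych-horizontal":
    "Split the image into three side-by-side panels.",
  "grid-2x2":
    "Arrange the output as a 2x2 grid of four distinct variations.",
};

// Append the layout instruction to the trimmed user prompt.
function applyGrid(prompt: string, layout: GridLayout): string {
  const base = prompt.trim();
  const instruction = GRID_INSTRUCTIONS[layout];
  return instruction ? `${base}. ${instruction}` : base;
}
```

Keeping the instructions in one table makes it easy to adjust the wording per layout without touching the code that builds the prompt.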
Instantly apply complex stylistic modifiers without typing paragraph-long prompts. Categories include:
- Digital & 3D: Cyberpunk, 3D Render (Octane/Unreal), Low Poly, Glitch Art, Pixel Art.
- Traditional: Oil Painting, Watercolor, Pencil Sketch, Impressionism, Ukiyo-e, Art Nouveau.
- Unique Textures: Claymation, Papercraft, Stained Glass, Blueprint, Thermal Vision, Mosaic.
Control the virtual camera lens to dramatically change the mood and composition:
- Angles: Bird's Eye, Worm's Eye, Low Angle, High Angle (Drone).
- Lenses: Fisheye (180°), Macro (Extreme Close-up), Telephoto, Wide Angle.
- Composition: Isometric, Knolling (Flat Lay), Dutch Angle (Canted), Over-the-Shoulder, CCTV.
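Style presets and camera controls can both be thought of as short modifiers added to the subject of a prompt. The following is a minimal sketch under that assumption; `PromptOptions`, `composePrompt`, and the exact phrasing of the modifiers are hypothetical, not the application's implementation.

```typescript
// Hypothetical options object; field names are assumptions for illustration.
interface PromptOptions {
  style?: string;  // e.g. "Watercolor", "Low Poly"
  camera?: string; // e.g. "Bird's Eye", "Dutch Angle"
}

// Join the subject with optional style and camera modifiers.
function composePrompt(subject: string, options: PromptOptions = {}): string {
  const parts: string[] = [subject.trim()];
  if (options.style) {
    parts.push(`${options.style} style`);
  }
  if (options.camera) {
    parts.push(`${options.camera} view`);
  }
  return parts.join(", ");
}
```

With this sketch, `composePrompt("a lighthouse", { style: "Watercolor", camera: "Bird's Eye" })` returns `"a lighthouse, Watercolor style, Bird's Eye view"`.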
Upload up to 4 reference images to guide style and composition (Gemini models only). This allows for style transfer and composition matching.
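The two constraints above (at most 4 images, Gemini models only) could be enforced with a small check before generation. This is a sketch only; `ModelFamily`, `MAX_REFERENCES`, `checkReferences`, and the message text are hypothetical names chosen for illustration.

```typescript
// Limit taken from the text above; identifiers are assumptions.
const MAX_REFERENCES = 4;

type ModelFamily = "gemini" | "imagen";

// Return an error message if the selection is invalid, or null if it is fine.
function checkReferences(family: ModelFamily, count: number): string | null {
  if (count === 0) {
    return null; // No references: valid for every model.
  }
  if (family === "imagen") {
    return "Reference images are only supported by Gemini models.";
  }
  if (count > MAX_REFERENCES) {
    return `At most ${MAX_REFERENCES} reference images are allowed.`;
  }
  return null;
}
```

Returning a message rather than throwing lets the UI show the problem inline next to the upload control.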
Reverse-engineer prompts by using Gemini to describe generated images. Simply click "Full Describe" on any generated image to get a detailed text description that can be used to generate similar images.
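As a rough illustration of the "Full Describe" flow, the sketch below turns a generated image (as a base64 data URL) into an image part plus a text instruction. The part shapes are assumed to mirror the inline-image and text parts used by `@google/genai`; the types, the function `buildDescribeParts`, and the instruction wording are hypothetical, and the actual SDK call is left out.

```typescript
// Local types assumed to mirror the SDK's part shapes; not imported from it.
interface InlineImagePart {
  inlineData: { mimeType: string; data: string };
}
interface TextPart {
  text: string;
}
type Part = InlineImagePart | TextPart;

// Split a base64 data URL into an image part and add a describe instruction.
function buildDescribeParts(dataUrl: string): Part[] {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64 data URL");
  }
  const [, mimeType = "", data = ""] = match;
  return [
    { inlineData: { mimeType, data } },
    { text: "Describe this image in detail so it could be used as a generation prompt." },
  ];
}
```

The returned parts could then be sent as the contents of a `generateContent` request, with the response text offered back to the user as a starting prompt.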
- Infinite Canvas: Pan and zoom capabilities to inspect high-resolution details.
- History Archive: Local session history with quick recall and management.
- Bulk Export: Download your entire session as a ZIP file.
- Responsive Design: A fully immersive experience on both desktop and mobile devices.
- Frontend: React 19, TypeScript
- Styling: Tailwind CSS
- AI SDK: @google/genai
- Utilities: jszip, file-saver
- Node.js (v16 or higher)
- A Google Cloud Project with the Gemini API enabled or access via Google AI Studio.
- Clone the repository:
  git clone https://github.com/yourusername/visionary-art-studio.git
  cd visionary-art-studio
- Install dependencies:
  npm install
- Configure API Key: The application requires a valid Google GenAI API Key.
  - Create a .env file in the root directory.
  - Add your key: API_KEY=your_api_key_here
  - Note: In the Google AI Studio environment, this is often handled automatically.
- Run the application:
  npm start
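For the API key step above, a small guard can catch a missing or placeholder key before any request is made. This is a minimal sketch: `readApiKey` is a hypothetical helper, and how the environment values actually reach the app (bundler, AI Studio, or otherwise) is not specified here, so the function simply takes them as a plain record.

```typescript
// Hypothetical guard; takes the environment as a plain record so it stays bundler-agnostic.
function readApiKey(env: Record<string, string | undefined>): string {
  const key = env["API_KEY"]?.trim();
  // Reject an absent key, an empty key, and the placeholder from the setup steps.
  if (!key || key === "your_api_key_here") {
    throw new Error("API_KEY is missing. Add it to the .env file in the root directory.");
  }
  return key;
}
```

Failing early with a clear message is usually easier to debug than an authentication error from the first generation request.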
- Select a Model: Choose between Flash (Speed), Pro (Quality/Resolution), or Imagen (Photorealism).
  - Note: Reference images are disabled for Imagen models.
- Set Constraints:
  - Aspect Ratio: From Mobile (9:16) to Cinematic (21:9).
  - Grid: Choose "Character Sheet" for consistent character design or "Comic Page" for storytelling.
- Craft your Prompt: Enter a detailed description.
- Add References (Optional): Upload images to influence the generation.
- Generate: Click the generate button and watch the studio render your vision.
- Refine: Double-click the result to zoom, or use the "Full Describe" button to analyze the image and iterate on the prompt.
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
