ArtAgents: Your Creative Assistant for Prompt Engineering and Captioning
ArtAgents is an open-source application that uses large language models (LLMs) to support creative workflows. It helps you create robust, high-quality prompts and captions from natural-language scene descriptions for the next generation of generative models. Powered by Ollama, ArtAgents is aimed at a wide range of creative needs, particularly prompt engineering and captioning experiments.
ArtAgents runs locally and does not depend on any specific target generative model or on a local or cloud platform: it creates and alters text prompts that can be used with any generator. You can adapt it to any generative style, limited only by the capabilities of the chosen LLM.
Agent-Based Chat Interface
At the core of ArtAgents is its agent-based chat interface, which lets users interact with and get help from AI agents tailored to different roles and tasks. Whether you're a designer, artist, fashionista, or colorist, ArtAgents provides specialized agents that understand your specific needs and respond accordingly.
- Agent Role Selection: Choose from a variety of predefined roles such as Designer, Artist, Fashionista, Colorista, Detailer, Photographer, Video, and Styler. Each role is editable and comes with customizable options to fine-tune the AI's responses.
- Custom Agent Roles: Users can define their own custom agent roles with parameters, allowing for greater flexibility when working with various multimodal LLMs.
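As a rough illustration of what a custom role with parameters might look like, the sketch below models a role as a small data structure that is rendered into an instruction for the LLM. The `AgentRole` class, its fields, and `build_prompt` are hypothetical names invented for this example, not the format ArtAgents actually uses; check the repository for the real role definitions.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRole:
    """Hypothetical role definition; ArtAgents' real format may differ."""
    name: str
    description: str
    # Assumed example parameters, such as sampling options passed to the LLM.
    options: dict = field(default_factory=dict)


def build_prompt(role: AgentRole, user_input: str) -> str:
    """Combine the role description with the user's scene description."""
    return (
        f"You are acting as: {role.name}.\n"
        f"{role.description}\n\n"
        f"Scene description: {user_input.strip()}"
    )


# Example custom role (illustrative values only).
typographer = AgentRole(
    name="Typographer",
    description="Describe lettering, typefaces, and layout in the scene.",
    options={"temperature": 0.7},
)

prompt = build_prompt(typographer, "  a neon sign above a rainy street ")
```

Keeping the role text separate from the user input is one way to let the same scene description be viewed from several professional perspectives by swapping only the role.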
Multimodal Input Support
ArtAgents supports multimodal input, enabling users to provide both textual and visual inputs. This feature is particularly useful for tasks that require a combination of textual descriptions and visual references.
- Image Input: Upload images directly from your folder or use a single image input to provide visual context for the AI agents. The application processes these images and incorporates the visual information into the generated responses.
- Textual Input: Enter detailed textual descriptions and prompts to guide the AI agents in generating the desired outputs. You can combine textual and visual inputs, keeping in mind that image information strongly influences the output.
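To make the combination of text and image input more concrete, here is a minimal sketch of how a request with an attached image could be sent to a locally running Ollama server through its `/api/generate` REST endpoint. This is not ArtAgents' own code; the helper names `build_payload` and `generate` are assumptions for illustration, and the URL assumes Ollama's default local port.

```python
import base64
import json
import urllib.request
from typing import Optional

# Default local Ollama endpoint (assumption: server runs on the default port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str,
                  image_bytes: Optional[bytes] = None) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_bytes is not None:
        # Vision-capable models accept a list of base64-encoded images.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


def generate(payload: dict, url: str = OLLAMA_URL) -> str:
    """Send the request and return the text (needs a running Ollama server)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With a vision model installed (for example `llava`, used here only as an example), a call such as `generate(build_payload("llava", "Describe this scene as a prompt", image_bytes))` would return the model's text response.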
Captioning
ArtAgents was originally developed to assist with unusual captioning styles for training generative AI models, and it offers simple captioning of images in a target folder. You can experiment with settings on a single image first, then apply them to a whole folder of images intended for training or fine-tuning models and LoRAs.
Advanced LLM Response Generation
ArtAgents uses LLMs installed via Ollama to generate high-quality, contextually relevant responses. The application supports various models, including those with vision capabilities, to handle a wide range of creative tasks.
Flexibility
You can customize many features of ArtAgents, including agents and their parameters, the list of LLMs to choose from, additional custom prompt limiters, and LLM settings. You can use any current or future model compatible with Ollama.
Comment on the Output
ArtAgents allows you to comment on the LLM outputs to slightly modify them to fit your needs for generative AI images or videos. This simplifies the workflow regardless of whether you are using local or cloud image or video generators.
Installation and How to Use It
You will find the most current release of ArtAgents on GitHub, with installation instructions.
Installation
- Ensure you have Python and Git installed on your system.
- Download and install Ollama from https://ollama.com/ . Ollama is an open-source platform that allows you to run large language models (LLMs) locally on your own hardware.
- Clone the ArtAgents repository into the target folder by running this command in a terminal:
git clone https://github.com/sandner-art/ArtAgents.git
- Run setupvenv.bat (optional, recommended)
- Run setup.bat to set up the Ollama models (optional; if you want to install the models manually, check the ArtAgents GitHub repository for more info)
- Start ArtAgents with govenv.bat (with venv) or go.bat
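Before running the setup scripts, it can help to confirm that the prerequisites are reachable from the command line. The short check below is only a convenience sketch and is not part of ArtAgents; `check_prerequisites` is a hypothetical helper, and the executable names are assumptions (on some systems Python is installed as `python3`).

```python
import shutil


def check_prerequisites(tools=("python", "git", "ollama")) -> dict:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in check_prerequisites().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```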
Creating a Prompt
Select a model, write your input, and select an agent, then click "Submit". If an image is inserted, it will affect the output. You can modify the output using the "Comment" section and its button.
Captioning
Write your input and select an agent. Enter the path to the image folder in "Folder Path" and click "Submit". ArtAgents will generate .txt files with captions for training in the image folder. I recommend reviewing the captions and editing them to suit your needs.
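The folder workflow described above can be sketched roughly as follows: walk the image folder and write a same-named .txt file next to each image. This is an assumption about the general pattern, not ArtAgents' actual implementation; `caption_folder`, the `captioner` callback, and the list of image extensions are hypothetical and chosen for illustration.

```python
from pathlib import Path
from typing import Callable, List

# Assumed set of image formats; the real application may accept others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def caption_folder(folder: str, captioner: Callable[[Path], str]) -> List[Path]:
    """Write a same-named .txt caption file next to each image in `folder`."""
    written = []
    # sorted() materializes the listing before any new files are created.
    for image_path in sorted(Path(folder).iterdir()):
        if not image_path.is_file():
            continue
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_path = image_path.with_suffix(".txt")
        caption_path.write_text(captioner(image_path), encoding="utf-8")
        written.append(caption_path)
    return written
```

Here `captioner` stands in for whatever produces the caption text, for instance a call to a vision-capable model. Because the captions are plain .txt files, they are easy to review and edit by hand afterwards.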
Be Inspired
By extending simple descriptions and parameters, you can customize your prompt generation to explore visual information from several technical viewpoints. You can affect the prompt from the perspective of a designer, typographer, photographer, or any professional aspect you define.
The Goal and Development
This tool began as an experiment in image captioning, but I found it surprisingly useful for my design sketches and for creating prompts and captions for video work. As the tool develops, new features will emerge to fit various workflows.
Conclusion
ArtAgents is an LLM-driven tool designed to expand possibilities, streamline workflows, and support artists, designers, and creatives. With its minimalist user interface and robust yet simple customization options, ArtAgents helps you work with advanced generative models (image and video), which now require a more natural-language approach to achieve the best results. Whether you're a seasoned graphic professional or a budding AI art creator, ArtAgents offers the tools to refine the prompt structure and bring your creative visions to life.
Downloads and References
- ArtAgents on GitHub (current and development versions, instructions).
- Ollama download