PAINTS: A Prompt Framework for Image Generation

In a galaxy of prompts, PAINTS helps educators craft AI-generated images with clarity, purpose, and pedagogical precision.

Earlier this month, I attended a fantastic session by Tadgh Blommerde on generative AI in education, hosted by Anglia Ruskin University. One moment in particular stuck with me: Tadgh mentioned that it currently feels as though there are as many prompt engineering frameworks as there are stars in the sky. It was a poetic exaggeration, but not far off. Yet amid this galaxy of models, I found myself wanting to contribute one of my own: something grounded, accessible, and directly relevant to visual content rather than text generation.

So, I created PAINTS: a prompt engineering framework specifically designed to help educators generate high quality, inclusive, contextually rich, and purposeful AI-generated images. The acronym isn’t there to write the prompt for you; instead, it’s a checklist to help refine what should go into a prompt so the image has greater value.

Let’s say your base prompt is:
“A diverse group of dental care students working together in a clinical training setting.”
That’s a great start, but PAINTS will help direct you to the exact content and detail required.

P – Purpose

Start by identifying why the image is being generated. Is it for a teaching resource? A promotional campaign? A reflective discussion starter? In learning design, this maps to constructive alignment (Biggs & Tang, 2011): the image should support the intended learning outcome. For example, if it’s meant to represent collaborative learning, the visual should depict interaction, discussion, or shared resources, not just students passively standing around.

My example: I wanted an image that could be used as a banner for promoting interprofessional teamwork at a dental open day.

A – Art Style

What visual style best supports the image’s purpose? Realism can support authenticity in clinical simulations, while illustrated or stylised influences may enhance engagement in asynchronous learning materials. This reflects dual coding theory (Paivio, 1986), where visual elements support cognition when aligned with the type of information being taught.

My approach: I chose a photographic style with realistic lighting, similar to a staged editorial shot, because I wanted it to mimic university marketing materials.

Note: I avoid referencing specific artists in prompts to steer clear of copyright issues.

I – Inclusivity

Representation matters. Think actively about how you reflect diversity in your visuals. This includes ethnicity, gender identity, body type, disability, and age. Inclusive visuals align with Universal Design for Learning (UDL) principles, ensuring all students see themselves reflected and valued.

In my prompt: I included “a diverse group of students” and specified features like assistive devices and a range of skin tones, body types, and attire.

N – Necessary Features

What must be included for the image to serve its purpose? Think about equipment, personal protective equipment (PPE), spatial orientation, or even specific actions (like one student holding a chart). This step reflects cognitive load theory (Sweller, 1988): if we’re showing learners something complex (like a dental clinical setting), we want to reduce extraneous details and keep the essentials clear, particularly as some image generators have a tendency to over-clutter. I have also deliberately kept this to items I want featured. Some prompt-writing tips suggest specifying what you don’t want featured; however, I don’t think all image generators are at a place yet where they can confidently distinguish between what you want and what you don’t. Most will read anything listed in the prompt as an itemised list, so just as you want to minimise cognitive overload for viewers, mind you are not overloading the generator as well.

My list included: dental chair, correct PPE, scrubs, and at least one student engaging in a discussion.

T – Tone / Technical Elements

What feeling should the image convey? Seriousness, collaboration, excitement? For additional specificity, some generators will even allow for specific technical attributes such as camera angle, aspect ratio, focus, lighting, and even camera type. This links with affective learning theory: how learners emotionally engage with material. The “tone” helps evoke connection, motivation, or curiosity.

For my image: I asked for a sense of energy, eye contact between learners, and a mid-range camera shot with natural light. (In hindsight, I should have specified gaze direction to avoid the classic AI uncanny-valley dead-eye stare.)

S – Setting

Specify your environment clearly. “Modern dental clinic” is a lot more precise than “indoors.” Don’t limit yourself to just the immediate area; think about the contextual setting as well. Due to its training data, AI will very often give the image an Americanised setting, so specify the country, and even add the time period if you want a more retro feel. Environmental cues support situated learning theory (Lave & Wenger, 1991), where context enhances the authenticity of knowledge.

My prompt setting: “A state-of-the-art dental training suite at the University of Portsmouth”. I was intrigued to see if an AI could replicate the exact logo, and amazingly I’d say ChatGPT managed it with surprising accuracy.
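For anyone building these prompts repeatedly, the six elements of the checklist can be sketched as a small template. This is a minimal illustration in Python, not part of the framework itself; the field names and example values are my own assumptions based on the worked example above:

```python
from dataclasses import dataclass


@dataclass
class PaintsPrompt:
    """Checklist-style container for the six PAINTS elements."""
    purpose: str            # P - why the image is being generated
    art_style: str          # A - visual style supporting that purpose
    inclusivity: str        # I - representation to include
    necessary_features: list  # N - items that must appear
    tone: str               # T - feeling / technical attributes
    setting: str            # S - precise environment and context

    def build(self) -> str:
        """Join the elements into a single prompt string."""
        features = ", ".join(self.necessary_features)
        return (
            f"{self.purpose} {self.art_style} {self.inclusivity} "
            f"Include: {features}. Tone: {self.tone}. Setting: {self.setting}."
        )


prompt = PaintsPrompt(
    purpose="A banner image promoting interprofessional teamwork at a dental open day.",
    art_style="Photographic style with realistic lighting, like a staged editorial shot.",
    inclusivity=("A diverse group of students with a range of skin tones, "
                 "body types, and attire; one student uses an assistive device."),
    necessary_features=["dental chair", "correct PPE", "scrubs",
                        "one student leading a discussion"],
    tone="Energetic and collaborative; mid-range shot with natural light",
    setting="A state-of-the-art dental training suite at a UK university",
)
print(prompt.build())
```

Writing the checklist down like this also makes it easy to reuse the same purpose or setting across a batch of prompts while varying only one element at a time.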

The Result

After running the prompt through several image generators (including ChatGPT’s DALL·E and Ideogram), I got some fantastic results. All platforms picked up on the collaborative dynamic I asked for, though some were better than others, which made for an interesting comparison of image generation quality. Copilot struggled because the prompt was too long for its free version, and so did Adobe Firefly, though that is likely because Firefly does not train on copyrighted media (a massive advantage for unique ideas, but for commonly used images such as promotional materials it’s understandable that it lacked the specificity).

A comparison of different image generation tools using the same prompt built with the PAINTS framework.

Would I change anything? Yes. Next time, I’d add more specificity to where students are looking, to create a better sense of focus and interaction. The point of this framework isn’t to create the perfect image; it’s to remind you of everything you should aim to incorporate. Editing will always be an intrinsic part of the image generation process, ensuring suitability, representation, and accuracy.

Why PAINTS Matters in Teaching

As educators, particularly in digital and clinical education, we’re increasingly responsible for curating or creating the visuals in our materials. We need images that are not just pretty, but pedagogically functional, inclusive, and aligned with real learning goals.

PAINTS gives us a way to do that without getting overwhelmed by AI complexity. It empowers us to build smarter prompts and in turn, better resources.

And yes, I know this is yet another framework in a sea of frameworks. But if it helps you craft one brilliant, inclusive, targeted image - then I hope it earns its place among the stars.

The image generated by ChatGPT (though it still requires editing).
