I wrote this piece about Machine-Learning art a year ago, and set it to self-publish. Let’s see how much of an idiot I look like!
First off, “AI” doesn’t exist. Some marketing team took a load of coke, then called their vision “Open AI”, because being “Open” is good and being “AI” is impressive. The reality is neither, and the technical meaning of this technical term remains unchanged, no matter what us unqualified plebs on the side-lines of the real research might tweet, toot, or post.
The reality is Machine Learning (ML), but ML is still cool.
When ML first managed to produce articles about the incredible art which it spat out (i.e. it produced art, and plausibly some of the articles about the art), every poor RPG designer salivated at the prospect of having actual art in their game.
Unfortunately, it didn’t work for my RPGs, I don’t think it works for almost any (beyond very limited capacities), and I don’t think that will change any time soon.
Initial Attempts
The Vampire Child
I started with an old vampire, turned as a young man, who speaks with ravens. The tool of choice was Midjourney.
To be as fair as possible, I left the image open to Midjourney’s interpretation. My prompt was this:
dark ages boy speaks to raven in the moonlit rain
These are bad, but after many iterations, I had finally lowered my standards enough to find an okay-ish image. If you didn’t mind the moon looking like a damp cloth, it would work… sort of.
Unfortunately, it didn’t really illustrate anything.
The Hunter on Horseback
Next up, the vampire-hunter, who tracks down the characters, and notes their carriage’s tracks veering into a village.
Slavic, of-the-night, noble hunter reading tracks, horse, footprints, village, 1300s
This worked out pretty well, assuming the horse was borrowed from Cthulhu. Otherwise, it’s unusable.
He’s not really “reading tracks”; he’s just standing there in most iterations. And many iterations later, the image had not improved.
The General Problem
The famous ML images generally do X in the style of Y, where “X” is a single face, and “Y” is a prolific artist. It does this well.
However, ML doesn’t seem to combine elements well. “Optimus Prime, enjoying a sandwich”, or “Sartre explaining to Kant why Linux is the superior operating system” will stretch ML’s abilities, because it fundamentally does not know what any of those words mean - it copies, and it doesn’t have many copies of that sort of thing.
RPG images are meant to explain. They exist because explaining a room full of goblins led by a half-ogre with a broad-sword, in a dungeon, isn’t an easy job; but placing an image of the scene next to the writing makes the reader’s job easier.
RPG images exist to explain the spells - the transformations, invocations, and prognostication - without lengthy paragraphs.
Explanation implies novelty, and every kind of ML means blending defaults with a large training set - something which immediately implies the opposite of novelty.
Pole Dancing Zombies
A friend challenged me on this, saying ML could make images with interacting elements, so I asked for an image from one of my modules.
Many zombies are chained to a pillar. When the PCs enter the room, they strain against the pillar, which threatens to pull the dungeon’s roof in.
The results weren’t great - they look like they’re pole-dancing. Asking the computer specifically for “not pole dancing” didn’t help.
General Conclusions
ML cannot display novel interactions, because people build every ML tool from source material which is not novel.
ML remains an excellent tool for:
- making an image wider by a couple of centimetres (it can guess a small amount of an existing image),[1]
- creating technically-novel stock images, and
- increasing the DPI, so small images can be printed larger, without artefacts.
~~~~~~~~
[1] (2025-07-02) I have since been informed that this feature does not work well, as the machines too often guess nonsensical pieces of things, and it’s often easier to just use standard cloning tools.