AI imagery—some thoughts

I’ve been giving some thought to AI image generation technologies (like Adobe Firefly, Midjourney, and DALL-E) and their implications. I just wanted to get these thoughts down, if only for my future reference. They’re kinda random. But I’m trying to remain true to my objective of not overthinking these posts…

Who will be our Witness?

A lot of us groan at Donald Trump’s consistent refrain of ‘fake news’ in relation to factual reporting of his words and actions. And then someone posted actual ‘fake news’: AI-generated photos of him being arrested.

When the blogging explosion started over 20 years ago, one of the promises of the technology was that citizen journalists, whistleblowers, and others could now use (relatively) low-cost equipment and tools to get the message out in places where this was previously very difficult: countries with strong censorship, or run by dictators and despots, for example.

Witness.org was an organisation that emerged during this period to help people with stories to share do so safely, by providing tools, training and equipment, and by helping to get the word out to the media.

It really strikes me that this type of activity, where real wrong-doings are being reported, is under threat, because the accused can now just write things off as ‘fake news’ or AI generated.

Cameras prove their authenticity

I noticed that Adobe are taking steps to work on copyright credit and identifying AI-generated imagery. (I learned this through this video, which is also a pretty good intro to the tech…)

But I think camera manufacturers need to do a similar thing, from the opposite end of the spectrum.

A way of watermarking or otherwise ‘certifying’ an image as having been taken by a physical device, and as not having been adulterated or modified in any way.

I can’t help but think of blockchain-like verification of the bits of an image.

This would help in the ‘Witness’ case above. But it would probably also provide other benefits in terms of verifying authenticity or authorship.

Something like a photographer registers their device (camera) which has some kind of cryptographically secure key/identifier. Then any image that contains a signature generated by that key can be identified as authentic.

This causes some issues with the Witness concept, in that the identifier can’t necessarily be tied to the photographer, as this may put their safety at risk. But it could go some way to ensuring that ‘truth seekers’ are protected from fallacious claims of ‘fake news’ etc.
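To make the registration idea a bit more concrete, here is a rough Python sketch. It is only an illustration under some loud assumptions: the names (REGISTRY, register_device, sign_image, verify_image) are all made up for this post, and it uses a keyed hash (HMAC-SHA256) from the standard library as a stand-in for a proper signature. A real scheme would more likely use public-key signatures, so that whoever verifies an image never holds the camera's secret key.

```python
import hashlib
import hmac
import secrets

# Hypothetical registry: maps a pseudonymous device ID to that device's key.
# Assumption: a real scheme would store a *public* key here, not a shared secret.
REGISTRY: dict[str, bytes] = {}


def register_device() -> tuple[str, bytes]:
    """Create a random device ID and key, and record them in the registry."""
    device_id = secrets.token_hex(8)      # identifies the camera, not the photographer
    device_key = secrets.token_bytes(32)  # would live inside the camera hardware
    REGISTRY[device_id] = device_key
    return device_id, device_key


def sign_image(image_bytes: bytes, device_key: bytes) -> str:
    """Return a hex tag computed over every byte of the image."""
    return hmac.new(device_key, image_bytes, hashlib.sha256).hexdigest()


def verify_image(image_bytes: bytes, device_id: str, tag: str) -> bool:
    """Check the tag against the key registered for this device ID."""
    device_key = REGISTRY.get(device_id)
    if device_key is None:
        return False  # unknown device, so nothing to verify against
    expected = sign_image(image_bytes, device_key)
    # Constant-time comparison avoids leaking how much of the tag matched.
    return hmac.compare_digest(expected, tag)
```

The camera would call sign_image at capture time and store the tag and device ID alongside the image. Anyone with access to the registry could then call verify_image, and changing even a single byte of the image would make the check fail. Because the device ID is just a random string, it doesn't have to reveal who the photographer is.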

Stock photography a dying art

I’m seeing more and more AI-generated imagery appearing in my Pinterest and Vero feeds. Some of it is pretty spectacular, and I could see it being used for magazine layouts etc. in place of illustrations or professional photography.

At the moment, getting a ‘professional’ quality output from AI imagery still requires a lot of work and talent: coming up with the concept, sifting through the various outputs, and touching up and re-combining images to get a finished product.

But it’s not hard to see this is a temporary ‘barrier’: as the AI gets better, the need for touching up, or at least the degree of it, will lessen considerably.

I can see this in the corporate world too. I’m thinking of a recent series of presentations I’ve done. I could easily see myself getting a decent illustration of ‘a person sitting at home using their WiFi’ from AI rather than from stock imagery (which is what I actually used, sourced from iStockphoto). Or using Firefly to generate some text built from elements representing sustainability for a ‘Net Zero’ slide.

This has wide-ranging implications. The stock photo companies have a threatened business model. The photographers that provide the stock to these sites lose an important revenue stream. The number of professional photographic shoots drops sharply, impacting not only photographers, but also the talent, makeup artists, lighting folks, even caterers.

I can’t help but think that stock photo companies are going to search for a new model: either by building (or licensing) AI tech into their platforms, or potentially by offering photographers incentives to ‘license’ their works as inputs into AI training sets, or something like that.

No substitute for creativity

I think the series of episodes from the Corridor Crew about their production of the Anime Rock, Paper, Scissors short, and the reactions to them from professional animators (1, 2), are telling:

The Crew used AI-generated imagery to convert a (relatively) low-budget live-action performance into an anime short. The effect and quality were outstanding in my view (despite the issues that the Crew highlight throughout).

Apart from just being a great piece of work, what is interesting to me about this series is twofold:

The Crew had to do a lot of R&D and trial and error on the process and tooling to get the result. This is a shift in the capabilities required to operate and succeed in this new world: away from the more traditional skills of illustration and towards building a workflow and production pipeline that combines tools.
The Crew are also very talented storytellers, and genuine fans of the anime style. The stylistic decisions, the scripting, the performance, the framing choices—all of these things required a high degree of skill and talent in and of themselves, separate to the specific act and art of illustration.

In one sense this is a democratisation of the production process. Similar to how digital tools mean a lot of talented people in the music world, who couldn’t work out how to wrestle with a patch bay or spend hundreds of thousands of dollars on mixing consoles and outboard gear, can now produce professional quality product for a LOT less money, and without having to learn the ins and outs of the hardware technology.

I understand both worlds (analog audio hardware and digital workflows) and I can honestly say the digital tools are a lot more powerful and overall easier to use than the traditional models. (There’s still a place for the traditional world, IMO, but it is more niche, and digital is definitely more prevalent).

But note that the change of tools and workflow doesn’t negate the need for a creative vision, talent and hard work to get a quality result.

But there will be ‘losers’ in this process, for sure. And it means that creatives that ‘just want to make their art’, like musicians (and many other industries) before them, are going to have to skill up on a whole raft of stuff that they may not have the inclination or propensity for.

It’s going to be a tough transition for a lot of folks…

Tipping point, or slippery slope…

It feels like we’re definitely at a tipping point. A lot of folks are comparing this to the emergence of Napster, and I think that’s fair. But it’s also like the emergence of Photoshop, and Pro Tools, and Premiere. These were industry- and profession-shaking technologies that radically changed the landscape, and impacted livelihoods and the creative process. Not all of that change is good… but not all of it is bad either.

That said, the horse has most definitely bolted. This tech is not going back into the bottle. Companies are not going to stop developing it, even though some significant folks are calling for exactly that.

I think that we need to start thinking about mitigation measures, like some of the things I hint at above. I know that a lot of great minds are on the case. So I’m super-interested to see what emerges in the coming months and years.