The Generator is The Message: What is AI-native media?

An experimental blog post by Fabian Stelzer, built with glif, an all-prompts AI sandbox for everyone.

What does it mean for media to be “AI-native”? As in: if generative AI constitutes the paradigm shift we think it does, what are the "new new" native formats it enables and unlocks? A simple answer here may be that we just let AI generate the content, as in: “hey ChatGPT, write me a brief paragraph about what it means for media to be AI-native”, and we'd get something like this:

AI-native media refers to content that is intentionally created or optimized using artificial intelligence technologies. It involves leveraging AI for content creation, personalization, and immersive experiences, revolutionizing the way we consume media.

ChatGPT's answer already hints that simply generating the content with AI doesn't fully capture what "AI-native" could mean. After all, an LLM-generated article is still simply an article: a bunch of static words, dynamically rendered in the mind of you, the dear reader. The media format itself hasn't really changed: you're still looking at a blog post, no matter how revolutionary my method of generating it may have been. Thus, we're not yet utilizing the fact that we're dealing with a generative form of intelligence, able to create things for us on the fly!

So let's take this a step further: what if we use AI to let you, the reader, co-shape this article's final form? I'm going to use glif, an AI sandbox I'm working on, to generate a few ways for you to continue this article for a bit:

This generates an interesting question:

Who authored this, uhm, article (?) so far? Still me? You and me? Me, you and ChatGPT? Is it still a blog post if it looks different every time someone...runs it?
Arguably, we're now not looking at one, but two intertwined media objects:

1. A collaboratively created dynamic piece of media, different with each run, as it appears to you now. Let's call this -> "The Run".

2. An authored, designed media script that enables a co-creative 'play mode' with the article, made by a single author, i.e. myself. Let's call this -> "The Generator".

This two-fold structure applies to all AI media, including ChatGPT: each user session is a Run, while ChatGPT itself is a highly flexible Generator programmed to work in specific ways (e.g. trying to prevent you from producing harmful runs, or making it easy for you to generate useful outputs).

In the context of "prompts", this means that we're actually always working with two classes of prompts: creator prompts, built into the Generator by its author, and user prompts, which only get injected at the "Run" stage of an interaction.

For example, the Generator box above restricts the latent space of continuations to two pre-provided options - it's choose-your-own-adventure with a fairly limited page count.
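
For readers who think in code, the split between creator prompts and user prompts can be sketched roughly like this. It is a minimal illustration, not how glif works under the hood: the `Generator` class, its fields, the prompt template and the two example options are all made up for this sketch, and the actual model call is left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Generator:
    """Authored once: holds the creator prompt and the guard rails (hypothetical names)."""

    creator_prompt: str  # template with a {user_prompt} slot, written by the author
    allowed_choices: Optional[Sequence[str]] = None  # None would mean free-form input

    def run(self, user_prompt: str) -> str:
        """Build the full prompt for one Run by injecting the user prompt."""
        if self.allowed_choices is not None and user_prompt not in self.allowed_choices:
            raise ValueError(f"not an allowed continuation: {user_prompt!r}")
        return self.creator_prompt.format(user_prompt=user_prompt)


# The author designs the Generator; both options below are invented examples.
article_generator = Generator(
    creator_prompt="Continue the blog post in the author's voice. Direction: {user_prompt}",
    allowed_choices=("Who is the author now?", "Is this still a blog post?"),
)

# The reader triggers a Run by picking one of the pre-provided options.
full_prompt = article_generator.run("Who is the author now?")
print(full_prompt)
```

The author's text and the reader's text only meet inside `run`, and anything outside the pre-provided options is rejected before a model would ever see it. That is the "limited page count" in code form.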

Let's take this a step further: instead of limiting the space of continuations, we can use our generator/glif to let you take this blog post in any direction you want. Just tell me (?) how you'd like the article to continue and conclude, e.g. 'What would McLuhan think about this?', or 'continue this thought in the style of Borges':

This creator-consumer permeability of media isn't restricted to text - we can do the same with images (or videos, or audio) of course! If creation is as seamless as writing a prompt or pushing a button, why wouldn't it become a natural part of consumption?

This has interesting implications for storytelling, IP and even branding. For example, here's a glif that lets you redesign the glif logo in whatever texture you want (try "mahogany wood", or "sushi"). You can view the Stable Diffusion + ControlNet + Prompt chain for this generator in this glif (hit "remix"!). Just enter a simple prompt like "sushi" to reskin the logo, while maintaining its general design and structure:

Both the image and text generators above are now part of a single collaboratively created media object. While I can let you, the reader/player, steer their outputs in many different directions, I, the author, broadly retain control over the guard rails. That strikes me as a new and pretty neat concept of interaction, where the line between creator and consumer gets blurred through generative means. If you wish to interact with this in a more trad way, I'm mostly posting here.

I don't write code - so this entire post was made possible by GPT-4 and my own company, glif, which lets me instantly spin up all types of AI generators in a no code / all prompts environment. Each glif / generator is an API endpoint, so it's easy to integrate anywhere. You should try it!