Depths and Shallows
On AI Generated Images and the History, Theory, and Theology of Human Cultural Production
Within major historical periods, along with changes in the overall mode of being of the human collective, there are also changes in the manner of its sense perception. The manner in which human sense perception is organized, the medium in which it occurs, is dictated not only naturally but also historically.
Walter Benjamin, The Work of Art in the Age of Mechanical Reproduction
Let me begin with a prefatory apology. I am a little hesitant to undertake a critique of AI-generated cultural products, images in particular: I am neither an art historian nor, despite my close association over the last few years with machine-learning projects (albeit not of the flashy generative AI sort), anything like an expert in artificial intelligence. What I am is an inveterate user of the internet, and I have been one for close to three decades now, even as I have tried to moderate my habit (my priest has helped on that front: I’m off Twitter for Lent per his instructions, and might end up disconnecting entirely one of these days). As such I’ve been privy up close to the transformations of those decades, including those of the last few years, and feel that I have some insights to offer. So what follows here are some more or less connected thoughts on generative AI and its place in the history of human cultural production, in particular visual culture, though the textual and sonic will crop up as we go. More than usual, I should say, these are provisional remarks: the technology itself continues to evolve, and, much more importantly, the socio-cultural contexts and repercussions, the ‘changes of sense perception’ to which Walter Benjamin refers in his foundational essay (to which we will have occasional recourse below), are still emerging.
Before generative AI, the most profound revolution in human visual (and textual and aural and musical) culture and communication was of course the long ‘revolution’ of mechanical mass reproduction. ‘Revolution’ is not entirely apt, we now know, given that the introduction of moveable typographic printing (or, more or less roughly comparably in East Asia, the earlier introduction of wooden block printing) was not immediately revolutionary, either in its original West and Central European context or in other milieus into which it would be introduced in subsequent centuries. Yet in other respects it did prove revolutionary, being both a chronologically prior step toward future industrialization and mass production and a genuinely causative first step that laid the ground for future developments. By the middle of the nineteenth century texts and images were being mass reproduced technologically in every corner of the earth, the technologies in question ranging from cheap and straightforward lithographic text prints to enormous steam-powered mechanical typographic presses churning out volumes of text at rates exponentially greater than at any previous period in human history. With the emergence of photography, and, later, film and audio recording, machines became the direct mediators of human perception in a way that printing technologies could not claim.
The implications of this steadily gathering stream of mass reproduction of cultural products (which, if we are being properly capacious, should also include clothing, furniture, household objects, and food, all carriers of cultural significance too, if in different fashion from text, image, and sound) are vast and still coming into focus; they could fill, and have filled, scholarly monographs (most of which do not get mass produced…). The feature I will stress here, though, is reproducibility: if in the pre-modern world every text, image, or sonic performance was at some level irreducibly unique, mass mechanical reproduction created a world of virtually identical objects that could be reproduced ad infinitum with little to no internal transformation.[1] Before mechanization, human cultural productions were in a basic continuity with the wider dynamics of other living things, changing and mutating and adapting to their environments at rates slow and fast, subtle and dramatic. If one of the major thrusts of industrial modernity is the severing, attempted or realized, of human existence from the usual exigencies and dynamics of the wider natural world, we might say that the arresting of textual, image, and sonic development through mass reproduction was a crucial first step in that process.

AI-generated cultural products (if that is the right term, and probably it is not) occupy a sort of uncanny valley between the essentially unique productions typical of most pre-typographic human culture on the one hand and on the other the mass produced, endlessly reproducible and stable products of the modern world.[2] In one sense, the AI-generated digital object is in each instance a unique product which will not be precisely replicated through future prompts or on other platforms; there is a chronological instability, as well as an internal instability (AI can make what appear to be quite random and unanticipated errors that will not reproduce from one instance to another). Yet in terms of the aesthetic vocabulary, the ‘aura’ (or, if we are to remain true to Benjamin’s use of that term, the ‘pseudo-aura’), perhaps better the ‘vibes’ conveyed, there is a striking homogeneity, a uniformity that for now at least is immediately recognizable, even in the most technically proficient AI image generators (or text or sonic generators). If in real human culture, even in mass reproduced culture, every discrete instance stands at the end of a long chain of other productions, of accumulated skill and knowledge and social connectivity embodied in human beings, there is a massive chasm between each instance of AI production and the human culture upon which it is ultimately based.
There is a connection, of sorts, between AI generated imagery and actual human practices of image generation. No one really creates new images out of thin air; every picture, every piece of art, draws upon in some way the world outside of the image, and most images deliberately or otherwise are linked to previous ones, sometimes in very long genealogies indeed. Innovation and change take place in a tumultuous dialogue of past images, present sensory experience, and many other factors internal to the artist, to her milieu, and other things too numerous to list.
Generative AI also draws upon existing imagery (or texts or sonic material). A vast body of training data, accumulated from the nearly infinite archives of the internet, informs the imagery that generative AI produces. But where the human artist has an ultimately unique convergence of forces and factors (no two artists, even in the same workshop and with the same education and social life, will be precisely the same in terms of what they have seen and retained, or in what elements of inner life and emotional state come into play), AI is standardized, regulated by verbal prompts and conditioned by precise metrics. The process is fundamentally, inextricably, quantified. It lacks contingency, to the extent that we might almost say that it is at some level an act of rebellion against the deep structure of the created cosmos, which is predicated upon freedom and contingency, the possibility of unpredictable development and movement.
Ultimately there is nothing really ‘behind’ AI products; they are by necessity shallows, thin gruel. Or rather we might say everything is behind them, and hence nothing: anything produced by a particular human has ‘behind it’ an essentially unique and irreducible world of experiences, referents, ideas, and so forth, all of which contribute to the ‘personality’ or ‘aura,’ the backstory, the context, the connectivity to the human (and the other-than-human) that is embodied to some degree in everything that our hands produce. Mass reproduction might thin this underlying dynamic out, as it were, but it does not utterly negate it. AI production is an almost complete removal of the human and of history and contingency, at least as we have otherwise understood and experienced those things. To put it a bit more baldly: ‘behind’ that image above of hand prints on a cave wall is a real cave wall somewhere with real hand prints upon it, placed there fifteen, twenty thousand years ago by real humans who chose those exact locations based upon a complex array of exigencies, accumulated cultural and religious norms, personal needs, and who knows what else. More on the interplay of place and process below; for now let us stress that no AI image (or text, or sound) can in fact have any such concrete reality lying behind it, but only the similitude, an image that automatically and without human intention makes use of images of the real thing to create an approximation (and, as the below image generated using Grok shows, at the moment not a very good approximation!).

There are many more potential contrasts to be drawn out between AI production and deeply sedimented examples from the far human past such as Paleolithic hand prints and other forms of ‘cave art,’ but the most potent, I think, is that Paleolithic subterranean art in particular was clearly marked by its material location and by its process, by the act of producing the art and of participating in it. As for the first, the actual physical location (not just underground, but often deep underground) was not just or primarily, pace Benjamin’s argument on this front, a way of keeping the imagery away from human eyes, but rather a way of shaping human participation in that imagery. Whatever else Paleolithic cave imagery (some of which we would describe as ‘abstract,’ ‘geometric,’ or ‘figurative’) was supposed to ‘do,’ and for whatever audiences, human and more-than-human, the presence of people at certain times was definitely part of it. Entire communities, children included, made their way into these caves, leaving footprints in the clay and those hand outlines on the walls, traces of men, women, and children. The act of entry and production was itself almost certainly a ritual process, and it was the entirety of these components (difficulty of access, limitations of light and viewability, individual and communal journeys into the ‘galleries,’ and the artwork itself made, viewed, and modified) that mattered. Place of production and the process of production were not incidentals, but fundamental.
This basic ‘package’ in which place and process were intimately connected and constitutive of the power and meaning of ‘art’ (be it visual, auditory, textual, what have you) persisted up until the very recent past (and, to be sure, survives if often in diminished and hybridized forms to the present). The story of how that package has fared in the industrial age has been told elsewhere, including in these digital pages. What the increasing dominance of AI-generated cultural products will mean for the continuation of the story remains to be seen, but I think we can safely hazard some guesses. If mechanical reproduction has, generally speaking, reduced the power and importance of place and process, AI-generated products have accelerated the disintegration of that constitutive package, all but eliminating both discrete places of production and interaction and nearly obliterating any individual or social process of production.
It is not that AI-generated art (or writing, for that matter) has no participatory or social aspect. In terms of participatory process of creation, it loops back behind, as it were, the distancing and massification of industrial image production and reproduction: anyone with an internet connection can now call up an image or text or audio generation engine and ‘create’ cultural products ad infinitum (in theory and payment pending anyway). Of course the actual bounds of participation are sharply limited: they consist of entering textual prompts and adjusting the parameters, then selecting a preferred generated object.
As for the social, while it is certainly too early to argue for a general sociological theory of AI-generated imagery, there are usages that have stood out over the last couple of years. In particular, AI-generated imagery has become something of a fixture of the reactionary right, from the uncanny images of imagined ‘trad’ Catholic parents with a half dozen weirdly appendaged children perched around them to the deluge of images of Donald Trump rescuing cats and dogs from Haitians during the weeks in which his campaign was busy falsely accusing migrants of stealing and eating pets. In recent weeks the world was treated to the spectacle of J.D. Vance shilling for American AI companies in Europe, as part of the administration’s push for ‘AI supremacy.’ And so on. At first glance the embrace by ostensibly ‘conservative’ political tendencies of the most novel and seemingly disruptive technology available right now seems odd. But we ought not be surprised: Benjamin, to return again to his seminal essay, notes the enthusiasm the reactionaries of his day had for the revolutionary medium of film; it would take an additional essay to explore these dynamics. For the moment it is perhaps sufficient to note that today’s reactionaries embrace AI-generated imagery in no small part because they do not have anything else to go on. There were no actual images of Haitians stealing and eating pets because those claims were just flat out false; generative AI however can produce quasi-veridical images of quite literally anything. And so these images have entered into particular social contexts, a single image or style stabilizing long enough to have common currency in online circles, albeit a very ephemeral one.[3]
Despite these social and political uses that have begun to emerge, it remains very much true that the generation and use of AI images lack the kind of tangible, place-based interaction that has typified ‘ritual’ art for so long, and even ‘secular’ art of more recent centuries. The art of Benjamin’s ‘age of mechanical reproduction,’ after all, started life at the literal hand of a particular human or group of humans in a particular place, very often a studio of some sort, and then was disseminated in many different ways, including to places of communal interaction, the movie theatre for instance. In general it was for public consumption, whereas one of the emergent features of generative AI is to produce the sort of material that previously would have been mass produced for broad audiences, only now ‘personalized’ using a particularized body of training data or prompts. You can spin up a podcast about whatever topic, a podcast only you will ever hear. Mass mechanical production realized for the fully atomized individual, we might say.
Let’s not beat around the bush: the diminishing of participation threatens basic human capacity; our abilities to make things, to imagine, to create, to think, are all threatened in one way or another by these technologies. AI imagery depends upon previous human creativity and skill, from the original artists to the people who digitized and distributed the imagery that has become the vast trove of training data for these applications. But as it becomes more ubiquitous it threatens to undermine the very skills and aptitudes that created that training data originally: as with generative text, the set of skills, manual habits, and self-training it requires is much more limited, and its use will almost certainly tend to foster a condition of dependence. Hopefully I am wrong about this, but the likely atrophying of skills and habitus and practice does not bode well for the near future of human cultural production and social life. It is very unfortunate: generative AI is only a subset of the technical possibilities inherent in machine learning advances of recent years, many of which have great promise for the humane arts and sciences, but only insofar as other human skills and knowledge bases remain vital and active. The best and most productive applications of machine learning processes are not shortcuts or technical magic; they require larger robust systems and human skills, in particular the ability to think through things and to ask complex and knowledgeable questions informed by deep study and habituation, all things against which the ocean of generative AI (especially on the textual level) presents an existential threat.
Finally, a sort of postscript for future query: Yesterday was the Sunday of Orthodoxy, a day on the liturgical calendar of Lent in which we commemorate the restoration of the icons after a long period in the Byzantine church’s history in which icons were suppressed. Iconography and its history have been examined from many angles indeed, and I will not attempt to replicate those debates or discussions here. Rather, what interests me is the way in which the now long-established Orthodox theology and practice of the icon intersects with and has been shaped by the historical realities described above.
We tend not to think about it very much, but it is certainly the case that our collective and individual relationship to icons has changed over the last century and a half, in the sense that icons, like pretty much everything else in human cultural production, have been subject to industrial-scale mechanical reproduction and capitalist distribution. I don’t know that anyone has done a quantitative study of it, but I am quite confident that the sheer number of icons (be they lithographs, cheap photocopy prints, more sophisticated productions, and so forth) on the average believer’s wall or in the average parish has exploded over the last couple of hundred years. Mass reproduction has certainly transformed quantity, and integrated markets and technology into iconography more markedly than in the past. That said, I think it is safe to argue that iconography has not been totally captured by either. Icons tend to evade the limitations of the market and of capitalist production and legal norms, continuing to occupy a liminal space in more ways than one.
Still, I do not think that we as Orthodox believers (or, for that matter, any Christians who accept the conclusions of Second Nicaea and who make use of iconography) have really grappled with the implications of mass reproduction, and we certainly have only begun to notice or think about generative AI and the role it should or should not play in iconography and ritual practice. This essay has already stretched on too long (if you’re still reading, thank you for making it this far!), so suffice it to say that what I think we can offer is a theology of the image/Image ‘updated’ in light of these escalating transformations, not necessarily resulting in total rejection but rather in the reworking of these transformations and their products in light of the theology of the icon. In particular one thing we stress in looking at and with icons is the importance of depth: some of this is historical, with icons being produced by humans in intimate contact with artistic and spiritual flows of transmission, motifs and colors and styles being reproduced and modified and enacted over many centuries, accumulating place and process and meaning as they go. Any icon has, as a matter of fact, some iteration of that sedimented depth of human community and spiritual significance. It also possesses in practice and perception a depth of connection, acting as a proverbial ‘window to heaven,’ linking imaginatively and spiritually viewer and the ultimate objects of view, all leading back, finally, to God Himself at the end of the visual-imaginative-participatory chain.
There is more to be said, including the degree to which mass reproduction changes the social and other dynamics of ‘sacred art,’ to say nothing of whether or not we should be using AI-generated icons (I think not, for reasons that by now should be clear). But I’ve hopefully provided enough food for thought already, and look forward to continuing this conversation in the months and years to come. These are fraught times, but they are also good times to better understand the world that has produced them and our place in it, such that we do not simply react to things, but can actively and deliberately shape them for the good.
[1] In the pre-modern, pre-industrial world, there were in some places a scattering of prefigurations of this sort of mass reproducibility, though the differences are striking (pun intended as you will see): coins were in theory all alike, having been struck at an authorized mint with an authorized image; in practice, however, coins were regularly depreciated physically, to say nothing of the deformations that regular usage worked upon coins. Other objects produced with molds or their analogues, such as pots, statues, and sporadic instances of block printing (a type of mold, we might say), tended either to deform rapidly under use or to have highly niche purposes and contexts. None approached the ubiquity of, say, the printed book, and all depended upon matrices that could not be mechanically reproduced in perfect continuity ad infinitum.
[2] Another rabbit trail we cannot pursue at length here but can only mark out for the future: there are occasional break-downs in mechanical perfection, resulting in usually quite small bodies of products that are distinct from the mass; the specific stock of toy cars with different colors or wheels or whatever compared to all the rest, the limited run edition (deliberate but building off of the dynamic of differentiation), and so forth. These are of course the sorts of things valued by collectors…
[3] It is worth contemplating in this regard Benjamin’s argument from nearly a century ago (there are points of divergence, but on the whole much of his analysis can be applied to the intersection of contemporary techno-reaction and AI ‘art’): ‘Fascism seeks to organize the newly emergent proletarianized masses without touching the property relations that those masses are so urgently trying to abolish. Fascism sees its salvation in allowing the masses to find their voice (not, of course, to receive their due). The masses have a right to see the ownership structure changed: Fascism seeks to give them a voice in retaining that structure unaltered. Fascism leads logically to an aestheticization of political life… All efforts to aestheticize politics culminate in one point. That one point is war. War, and war only, makes it possible to give mass movements on a colossal scale a goal, while retaining the traditional ownership structure. That is how the situation looks from a political viewpoint. From the viewpoint of technology it looks like this: Only war makes it possible to mobilize all the technological resources of the present day while retaining the ownership structure.’
Walter Benjamin, The Work of Art in the Age of Mechanical Reproduction, trans. by J. A. Underwood, 35-36.