The Digital Crowd

Curator's Note

Many of the crowds you see in movies and on television today are digitally simulated. In the wake of a global pandemic, these “fake crowds” swell sports stadiums during games, fill empty seats at rock concerts, and populate political rallies. The visual technology of crowd simulation has historically been developed in cinema: the simulated crowd was first used in film (and then video games) to create physically impossible gatherings in fiction genres. As filmmakers today turn to visual effects technicians to supply large numbers of “digital extras” for spectacular sequences, such as this one from S.S. Rajamouli’s RRR (2022, India), crowd simulation has now supplanted the traditional method for summoning numbers: the crowd-casting of “extras” in filmmaking (Image 1 of 2).

Since the early 2000s, digital crowds have spread quickly, helping to realize the generic conventions of historical epics, science fiction, and fantasy film and television without the costs, time, and risks associated with physical filmmaking. From the thronged stands of the Colosseum in Ridley Scott’s Gladiator (2000), to the armies of orcs in Peter Jackson’s The Lord of the Rings (2001–2003), to the zombies of World War Z (2013) and the White Walkers of HBO’s Game of Thrones (2011–2019), the digital crowd seems to emerge spontaneously in response to spectacular story events. But the sheer size of these crowds on-screen means that they, too, function as spectacles: they are impressive and fearsome screen forms for us to gawk at. In her study of computer-generated imagery in contemporary cinema, Kristen Whissel reads the simulated crowd as the “digital multitude,” an emblematic formation that intimates the sublime. Often made up of orcs, zombies, and other non-human monsters, this unindividuated and dark gathering is shown to pose a threat to humanity and historical time. More recently, Drew Ayers has extended Whissel’s reading of the digital crowd as the sublime. Ayers argues that the digital crowd’s effortless appearance on-screen denotes a new stage of historical development in film industries, wherein visual effects software like MASSIVE (Multiple Agent Simulation System in Virtual Environment) proclaims its superiority over traditional crowd work. Together, Whissel and Ayers offer us a way to conceive of digitally simulated crowds as an informational sublime, heralding at once the revolutions of a post-human history within the story world and the revolutionary capacities of digital media technologies.

The digital crowd, such as this one created for RRR by the Hyderabad-based VFX studio Makuta VFX, is rendered by stochastic algorithms and particle systems. In particle systems, each particle is assigned values from pre-defined sets; in the case of the crowd, one of those values comes from a pre-defined set of possible movements, such as ambling, standing still, sitting, turning, or hand-raising. The algorithm assigns these values at random as it runs in real time. Objects created by particle systems therefore seem alive because each small trajectory emerges without human design. As Jordan Schonig puts it, “objects produced by particle systems have the appearance of contingency partly because each tiny trajectory lies outside” human control; “set in motion rather than moved, such objects retain a degree of independence” from that control (Schonig, 2022: 34). Further, such particle systems are agnostic about their objects: the same system that animates digital extras in the crowd can animate dust, snow, rain, or leaves. Thus, Ayers writes: “In contrast to human crowds, digital crowds don’t so much indicate the ability of the [film] industry to manage labor as they indicate the industry’s ability to manage information” (Ayers, 2019: 140).
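
The random assignment described above can be illustrated with a very small sketch. It is not the pipeline used by MASSIVE or by Makuta VFX, whose internals are not documented here; the names MOVEMENTS, Agent, and simulate_crowd are hypothetical, and the menu of movements is simply the list of examples given in the text. The sketch assumes only that each digital agent draws one movement, uniformly at random, from a fixed menu.

```python
import random
from collections import Counter
from dataclasses import dataclass

# Hypothetical menu of movements, taken from the examples named in the text.
MOVEMENTS = ("ambling", "standing still", "sitting", "turning", "hand-raising")


@dataclass
class Agent:
    """One digital extra: an identifier plus the movement assigned to it."""
    agent_id: int
    movement: str


def simulate_crowd(size: int, seed: int = 0) -> list[Agent]:
    """Assign each agent a movement drawn at random from the fixed menu.

    Assumption: a uniform draw per agent. A production system would
    likely weight, blend, and sequence movements over time.
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    return [Agent(i, rng.choice(MOVEMENTS)) for i in range(size)]


crowd = simulate_crowd(1000)
tally = Counter(agent.movement for agent in crowd)
print(tally)  # how many of the 1000 agents received each movement
```

No one decides what any individual agent does; the only human inputs are the menu itself and the size of the crowd. Swapping the menu for a list of drift patterns would let the same function scatter snow or leaves, which is the sense in which such systems are agnostic about their objects.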

A question arises: might we understand the management of information as the management of labor? I said above that RRR’s crowd is made up of so many digital extras: figures with no existence outside their algorithmic implementation by crowd simulation programs at Makuta VFX. This is only partly true. The VFX program needs inputs as well: to assign values to particles at random, it also needs to know what those values will look like when assigned. In other words, to properly produce a visual image, it needs to know what a digital extra looks like walking, standing, sitting, raising their arms, or running. These inputs can take one of two forms: they can be hand-drawn using animation software, or they can be supplied from short movie clips generated using motion capture.

Motion capture is the practice of recording the physical movements of human bodies and using those movements to animate computer-generated bodies, thereby producing virtual movement on the screen. This is how motion capture is used in a film like James Cameron’s Avatar (2009), in which the slightest, even involuntary gestures of performing faces and bodies are captured. Indeed, movie stars engaged in motion capture may be highly compensated and spot-lit in press discourses surrounding a film’s production. Yet the privileging of the movie star in these discourses obscures the way in which motion capture works. No matter how much the final character on-screen may look like the star on whom it is modeled, it is important to remember what the technology does: the process is, as Lisa Purse puts it, “an abstraction of the actor’s movements” (Purse, 2013: 56). A somewhat similar process was deployed for the production of the crowd in RRR.

Movements were filmed at a time and place months and miles removed from the principal filming locations, and they were performed by an extra not in the traditional sense but truly in the Latin sense of the word: outside, exterior, beyond (Image 2 of 2). These movements, in which a single performer in costume modeled the menu of possibilities, from walking to sitting to raising their hands or standing still, were captured on camera and then fed into a program that distributed them at random to digital agents. Generating the crowd here therefore requires not the indifferent indexical capture of multitudes gathered before the camera on the day of filming, but the very careful scanning of a single body in isolation. This abstraction of motion recalls the studies of movement that precipitated the birth of cinema, themselves indebted to extractive technologies by which populations were stilled for the camera on plantations and in penal colonies. As gestures are expropriated from personhood to animate digital persons on-screen, traces of movement are isolated and extended by computational programs that operate outside of, or indifferent to, human temporal and spatial scales. In this way, flesh still matters to the digital, but in ways that do not fully equal the personal, the individual, or the subjective.

What this means is that a single human body, that of a paid performer, has supplied the graphically recorded trace of motion that moves many bodies. Because software multiplies that trace and assigns it to multiple agents, the trace itself becomes a form of distribution, virality, contagion, suggestion, and imitation. In this production logic, the crowd shot may be populated by multiplying just one extra, a spawning with immense and unfolding implications for how film work is counted, measured, and compensated. The basis of the digital extra on-screen, then, is a real extra, but one by definition destined to remain outside the frame.



Whissel, Kristen. Spectacular Digital Effects: CGI and Contemporary Cinema. Durham: Duke University Press, 2014.

Ayers, Drew. Spectacular Posthumanism: The Digital Vernacular of Visual Effects. New York and London: Bloomsbury Publishing, 2019.

Schonig, Jordan. The Shape of Motion: Cinema and the Aesthetics of Movement. New York and London: Oxford University Press, 2022.

Purse, Lisa. Digital Imaging in Popular Cinema. Edinburgh: Edinburgh University Press, 2013.
