Sight and Sound Conspire: Monstrous Audio-Vision in James Whale’s Frankenstein

Creator's Statement

The video essay “Sight and Sound Conspire” offers a meditation, in three acts, on the relations between the visual and the auditory in James Whale’s classic horror film Frankenstein. Introducing the iconic image of the creature (played by Boris Karloff, realized by makeup artist Jack Pierce, and long defended as a visual trademark by Universal Studios), Whale’s film drew its power to frighten audiences from the particular relations that images and sounds had with respect to one another in the early years of sound cinema.

According to film historian Donald Crafton, the transition to the talkies was largely over by 1931. But as Robert Spadoni argues, the horror genre, which emerged after the transition’s completion, captured some of the energies of the transitional period and preserved them into the era of sound. Between about 1926 and 1931, while the transition was still in full swing, technical glitches were common and the sight--and sound--of human figures speaking on the big screen was still more or less novel; unaccustomed audiences, as Spadoni documents, reported experiencing these figures as “ghostly” or “uncanny bodies.” Horror films, beginning in 1931 with Dracula and continuing with Frankenstein later that year, capitalized on these experiences, which were already beginning to fade from audiences’ memories, and transformed them into the literally uncanny bodies of vampires, werewolves, mummies, and other monsters. Films like Dracula, Frankenstein, and The Invisible Man (1933) played centrally with the relations between the visible and the invisible, and the audible and the inaudible, offering them up in a variety of permutations: the vampire’s body (in human form) was uncannily silent and under certain circumstances invisible (e.g. when viewed in a mirror); the Invisible Man, on the other hand, was completely invisible while extremely noisy; and Frankenstein’s monster was preeminently visible (an instant visual icon) but uncannily mute. Horror, in other words, was generated in and through bodies that enacted (or re-enacted, as it were) the clash of silent and sound-era cinema, along with the uncertain, often volatile relations between sight and sound that obtained during the transitional period.

This video essay seeks to uncover something of this sedimented experience of early horror through a concentrated look at one key scene in Whale’s Frankenstein: the scene in which the newly animated creature makes his first onscreen appearance. Just a few minutes prior, the creation scene, with its noisy electrical equipment ramped up to full capacity, drew attention to the sonic dimension of the film while playing self-reflexively on the relation between diegetic and filmic acts of “animation”: the monster, stitched together from corpses and raised up and exposed to the life-giving flashes of lightning, formally resembles the “dead” photographic traces that are routinely animated--infused with life--through the apparatus of cinema itself. But now, in the scene analyzed in the video essay, the monster’s relation to film is problematized. The monster’s approaching footsteps first grow louder and then go silent, as the monster--who, we will soon find out, is incapable of speech (unlike the eloquent creature in Mary Shelley’s novel)--backs awkwardly into the room before his iconic head is thrust towards us in a series of jump cuts. This visual assault fixes the iconic image of the monster seemingly forever, while the creature’s muteness embodies a weird re-entry of silent cinema into the world of the talkies.

If “Act 1” of the video essay uses voiceover to suggest the film’s highly self-reflexive interplay and to recover (something of) the experience of watching Frankenstein in the light of the sound-film transition, “Act 2” shifts gears and reimagines it counterfactually as a silent film. In this way, the video essay seeks to foreground the role of sound, and its relation to the image, in a more direct (if paradoxical) manner, less by telling than by showing--or whatever the aural equivalent of showing might be. Indeed, it is not only image/sound relations in Frankenstein, but also image/sound relations in the video essay itself, that are potentially complicated in this manner. Finally, “Act 3” looks once more at precisely the same scene, but this time as it was re-worked in the 1957 German dubbing of the film. A careful examination of the German soundtrack reveals subtle but important differences that highlight the historical and cultural anchoring of the uncanny experience of transition that was instrumentalized in Whale’s film--and hence motions towards the particular social and material circumstances from which the horror genre emerged. Formally, the comparison between the 1931 original and the 1957 dub poses a challenge to the videographic medium: splitscreen techniques are excellent for making visual differences apparent, but how do we visualize sonic differences? The approach taken here, whereby a digital VU meter supplements the image with a visualization of sound levels in the two versions, offers an admittedly crude solution to the problem. But perhaps the crudeness of the instrument--which is more a blunt axe than a surgical scalpel--will be seen to resonate with the uncanny object offered up here for dissection. In any case, it is hoped that the problematic nature of the visualization can nevertheless function as productively problematic.
Like the first two acts, this one seeks to re-focus visual attention onto the sonic--and vice versa: to displace the aural dimension onto filmic images--so that in the video essay, just as in Whale’s Frankenstein, an uneasy and unsettled relation emerges: for the purposes of a phenomenologically invested film scholarship, as for the purposes of filmic horror under videographic investigation, sight and sound conspire…

Crafton, Donald. The Talkies: American Cinema’s Transition to Sound, 1926-1931. Berkeley: University of California Press, 1999.

Denson, Shane. Postnaturalism: Frankenstein, Film, and the Anthropotechnical Interface. Bielefeld: Transcript-Verlag/Columbia University Press, 2014.

Spadoni, Robert. Uncanny Bodies: The Coming of Sound Film and the Origins of the Horror Genre. Berkeley: University of California Press, 2007.


This video was produced out of the “Scholarship in Sound and Image” workshop at Middlebury College, June 2015, as funded by the National Endowment for the Humanities Office of Digital Humanities. Any views, findings, conclusions, or recommendations expressed in this video do not necessarily reflect those of the National Endowment for the Humanities.

Review by Steven Shaviro

This is a thoughtful and valuable video essay. Denson argues that the process of transition from silent film to talkies, although it had mostly been completed by Hollywood (and for Hollywood audiences) by 1931, still resonates (to use a sonic metaphor) in the construction of James Whale's Frankenstein, made that year. "Silent" films of course always had musical accompaniment; sound film was distinguished by the incorporation both of the human voice (direct and indirect speech) and of sound effects. Denson mentions in his written statement that, according to Robert Spadoni, audiences not yet fully accustomed to sound film "reported experiencing these figures [of human bodies talking] as 'ghostly' or 'uncanny bodies.'" By 1931, this sort of uncanny experience had faded, as a result of audiences becoming more habituated to "talking pictures." But Denson suggests that Universal's early 1930s horror pictures renewed or perpetuated the uncanniness of early sound film by presenting the (more literally) uncanny bodies of monsters (whether Frankenstein's creature, the Invisible Man, or Dracula).

The video essay gives an effective demonstration of this hypothesis by going over the scene in which Karloff's monster appears for the first time. We hear his footsteps before he appears to us in a memorable way, by first coming in the door with his back to us, then slowly turning around while the camera jump cuts to closer and closer shots of his face. Denson makes the point that hearing the footsteps, with the appearance of the figure still unknown, gives us a sense of uncanniness; when Karloff actually does appear, things grow quiet (there are no more footsteps), and his visual presence, filling the screen without speech -- introducing the motif that the monster will be mute throughout the film -- works like a sort of flashback to the silent film era. Denson demonstrates the importance of sound thus "conspiring" with vision by showing the scene three times: the first, with voiceover commentary; the second, with silent-film music on the soundtrack, thus transforming Frankenstein into a silent film; and the third, by juxtaposing the original scene with that in a much later (1957) German-language dubbed version. In this last treatment, the sound of the footsteps continues even as Karloff appears.

Denson's video essay is powerful and accomplished, and very effective in demonstrating its thesis by giving these three versions of the same scene: the initial analysis is usefully supplemented by what are, in effect, counterfactual renditions. The video essay does what it should -- which is to demonstrate something that needs to be seen and heard to be understood, and hence cannot be conveyed as well by a print essay alone. And it is ingenious and, I think, successful in how it specifically foregrounds the use of sound (it seems to me that the genre of video essay in general is at this point much more image-centric than sound-centric; this may be more or less built into how the form works, with spoken or written commentary layered over moving images; but Denson has successfully figured out how to get the video essay to focus on sound instead).

I do have a few reservations, which have to do with further implications of Denson's analysis. One thing is that Spadoni's observation about uncanny bodies in transitional films seems quite important to Denson's overall argument, yet it appears only in the printed explanation and not in the video essay itself. It would not be clear to me, just from the video, how the weirdness of Karloff's monster relates to the general estrangement produced by the transition from silent to sound.

And for another thing, although Denson speaks of the relation between visible and invisible, as well as between audible and inaudible, his video essay doesn't really address the question of how sounds on the one hand, and images on the other, are related. This is of course a central issue for cinema in general. But it is especially important in relation to the transition from silents to talkies (cf. the importance to audiences of the sounds of gunshots in early gangster talkies like Scarface and Public Enemy). And it is also especially crucial to the entire history of the horror genre. In countless horror films, we hear scary sounds whose sources we cannot (or cannot yet) see, including footsteps coming closer and closer. It's a major source of anticipation and dread. Fear of the unknown, as Lovecraft said, is the oldest and strongest sort of fear. In such cases, hearing does not have the same epistemological authority as sight; even when the sounds themselves are definite, they do not convey the nature of the entity making them in the way that images do. It seems to me that the scene analyzed in this video is playing with this dynamic, and it needs some explicit commentary. This is all the more the case in that, as Denson notes, the monster does not and cannot speak, either in this scene or in the movie as a whole.

I am not sure that I would recommend that the video be revised to include my second point, because it would probably require an entire additional video essay to do it justice. So one can take these criticisms not as requests for revision -- I think the video essay is worthwhile as it stands -- but as an example of how Denson's work usefully and powerfully leads us to additional questions.

I am of two minds when it comes to Shane Denson's piece "Sight and Sound Conspire: Monstrous Audio-Vision in James Whale’s Frankenstein." On one hand, the piece productively analyzes one scene through a variety of different lenses: from an introduction to certain formal tropes of the horror genre (the binaries between the visible and invisible, and the audible and inaudible), to a historical analysis of how Frankenstein exemplifies and bridges the gap between silent and sound filmmaking, to an analysis of how the act of dubbing the film--altering the film's sound effects in the process--creates a sensorial shift in the scene's meaning. On the other hand, I could not help but want more. Due to its modest focus on one scene and the brevity of the overall piece (especially when we realize that it's covering the theoretical and historical waterfront), the video feels a bit like an introduction to a larger project.

What I appreciate about this video is the value such a piece can have in the classroom, a venue in which I find that students often crave an absolute interpretation of a film. Denson's piece suggests that we can look at one scene in three profoundly different ways, each of which has its own value. In other words, it showcases that the methodologies used to analyze a film are not in competition with one another, and that is a pedagogical gift. Moreover, Denson depicts the co-existence of these modes of analysis formally within the piece. Like the thesis of his work (that sight and sound conspire to produce an otherworldly sensation in the viewer), Denson engages both senses strategically by achieving the difficult balance between the two channels of expression. While the piece begins rather classically with the use of voice-over narration to break down the scene in conjunction with freeze frames to synchronize the two, his use of silence and theoretical extra-diegetic intertitles illustrates his argument perfectly in the second section. Finally, the use of the side-by-side comparison and the audio level meter helps the viewer comprehend the differences between the original soundtrack and the German dub with a specificity not offered by viewer memory. In short, I appreciated the formal creativity that Denson engages in here.

That being said, I have been thinking a lot about length and the purpose of videographic criticism. Recently, I made a series of videographic works about film noir for Press Play. My motivation was to make a teaching resource that could introduce the casual viewer to a range of films and some scholarly frameworks for analyzing them. As the project progressed, my editors pushed me to make the pieces shorter. Their rationale was pragmatic: the audience--much wider and less likely to be versed in scholarship--was more likely to engage with pieces that have the same length as a pop single. While I initially balked at this suggestion, there was one unforeseen benefit that arose from making short videographic episodes versus long-form, self-contained works: audience feedback. As the series progressed, I was able to address their questions or criticisms. I became aware--like a teacher in the classroom--of what knowledge my general audience needed.

If the audience for this piece is meant to be the uninitiated, it succeeds perfectly in introducing the viewer to the film and three potent and interrelated frameworks of analysis. It is concise and the formal experimentation gives it a certain momentum and energy. However, if Denson imagines his audience to be peers and colleagues, I would love to see him explore the film in the context of larger debates about the horror genre. For instance, how might Denson's analysis of the various binaries inherent in Frankenstein fit into Noël Carroll's account of the various paradoxes of the genre in The Philosophy of Horror? I am not suggesting that Denson needs to revise this piece or that his choice in scope is flawed. However, it did prompt me to reflect upon the idea that if videographic criticism wants both to swing for the fences with regard to philosophical rigor and to appeal to academics as well as a broader, uninitiated audience, perhaps a series of shorts is ideal as it can engage more concretely in a dialog with the viewer.