Audiovisual Parameter Mapping in Music Visualizations

by Tina Frank, Lia

1 Electronic Music in a Live Context

2 Digital Code — The Shared Basis of Sound and Image

3 The Development of Pictorial Language in Live Visuals

4 Software Tools

5 The Live Factor


With the advent of affordable personal computers whose processors and applications were capable of manipulating moving images in real time, sound and image, previously separate phenomena in terms of media technology, could be linked through the algorithmic translation of auditory and visual parameters.

The way to this development was paved by the increasing spread of electronic music, in the course of which the computer established itself as an instrument. The music was created digitally, and the available applications invited experimentation with linking digital sounds and images.

Early programs initially translated techniques from analog video into the digital domain and manipulated image parameters. However, programs were soon developed that contained generative processes and gave rise to an independent digital aesthetic. Sound could now be analyzed, broken down into frequency bands, and used as data input for image-generating systems. Today, direct programmability has given rise to a variety of complex generation methods that continuously expand and refine the spectrum of expressive means.

The initial application areas of digital music visualizations encompassed club visuals and live performances within the scope of concerts with electronic music. Today, they have fanned out to include events of all kinds. With the breakthrough of LED technology, more and more public space is being conquered. Numerous buildings and billboards have playable surfaces and are used for artistic interventions. In addition, contemporary artists are increasingly producing multimedia works on DVD or as installations.


1 Electronic Music in a Live Context

In the mid-1990s, personal computers came on the market that were so small, so easily portable, and at the same time so affordable that they enabled uncomplicated production of digital sounds in the context of live performances. Thus, the laptop became an easily available musical instrument that revolutionized not only performance but also the sound world of live electronic music.

Live performances by the first laptop musicians, such as Peter Rehberg alias Pita, General Magic and Farmers Manual from the Viennese record label Mego, the British duo Autechre, and musicians such as Carl Stone and Zbigniew Karkowski, featured interactive manipulation of sound processes. The visual reception of these live performances was static, however. Watching a performer simply operating a computer made it difficult for audiences to relate to the musical act of sound generation, and this created a visual vacuum.

Following an increase in processor and storage capacities, laptops could now be used to manipulate moving images in real time. Some musicians therefore began to collaborate with visual artists. Others developed their own methods for filling the visual vacuum at concerts.

2 Digital Code — The Shared Basis of Sound and Image

Sounds and images, which in terms of media technology are separate phenomena, are represented in the digital media by a shared binary code and described mathematically by means of numbers. This results in a fundamental transformability, which in contrast to analog transformation allows an algorithmic translation of auditory and visual parameters.

Computer music (music that consists of digital sounds and is generated on a computer) had already become common practice in live contexts in the early 1990s. By 1997, the first visualization programs and adequate processing power had become available, allowing artists to work with image sequences in real time on small, portable personal computers. Based on the principles of live improvisation in electronic music, visual artists began to manipulate and later also generate computer graphics in live settings. A new kind of real-time performance emerged. The audio or control data from the musicians’ controllers were transferred to the image-generating system and used as triggers for visual impulses. Cécile Babiole, who from the outset of her artistic career worked with the transposition and manipulation of images through sounds (and vice versa), is exemplary for her early exploration of these techniques.[1] As early as 1999, audiences became acquainted with her work Reality Dub Bus at the Phonotaktik festival in Vienna, for which Babiole converted a public bus into a rolling performance space. The audience sat in a completely screened-off area of the bus and, while the bus was being driven, experienced a live remix of the images and sounds recorded by cameras and microphones and processed by Babiole and the musician Fred Bigot (a.k.a. Electronicat).[2]

Any number of interrelations between audio and video can be produced using digital means, with sounds frequently controlling the images. The sound is registered using various methods of analysis and translated mathematically into numerical values. Values for volume, pitch, timbre, and sound duration, as well as for the high and low frequencies, which are broken down into a series of frequency bands, are first compiled and then fed into the image-generating system. The originally auditory parameters are translated into visual parameters at the software level. The brightness, speed, size, transparency, position, and rotation of two-dimensional forms and three-dimensional bodies are just some of the parameters that can be influenced. In principle, any value can be translated into a value recognized by the other system without signal loss.
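The translation chain described above can be illustrated with a minimal sketch. It assumes that audio arrives as blocks of samples between -1.0 and 1.0 and that a pitch estimate is already available. The function names (`rms_volume`, `map_range`, `to_visual_params`) and the target ranges for brightness, size, and rotation are hypothetical choices for illustration and are not taken from any of the tools discussed in this text.

```python
import math

def rms_volume(samples):
    """Root-mean-square level of one block of samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def map_range(value, in_min, in_max, out_min, out_max):
    """Linearly rescale value to the output range, clamping at both ends."""
    if in_max == in_min:
        return out_min
    t = (value - in_min) / (in_max - in_min)
    t = max(0.0, min(1.0, t))
    return out_min + t * (out_max - out_min)

def to_visual_params(volume, pitch_hz):
    """Translate two auditory values into three visual ones.

    All ranges here are assumptions made for this sketch.
    """
    return {
        "brightness": map_range(volume, 0.0, 1.0, 0.0, 255.0),
        "size": map_range(volume, 0.0, 1.0, 10.0, 200.0),
        "rotation_deg": map_range(pitch_hz, 20.0, 2000.0, 0.0, 360.0),
    }

# One block of a 440 Hz sine at half amplitude, sampled at 44.1 kHz
block = [0.5 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
params = to_visual_params(rms_volume(block), 440.0)
```

Because both sides are plain numbers, the same `map_range` step could just as well drive transparency, position, or speed; which auditory value is routed to which visual parameter remains an artistic decision.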

At the turn of the millennium, a separate machine was used for each medium, for example one laptop for the generation of sounds, another for the production of image sequences, and additional computers for control protocols or data exchange between the generating systems. Today, both laptops and the available software applications are so powerful that all media can be generated synchronously on a single computer. It is therefore becoming increasingly easy for just one person to control the sound and image levels simultaneously. Artistic personalities have emerged who see themselves neither exclusively as musicians nor as pure visualists. The Japanese artist Ryoichi Kurokawa refers to himself as an audiovisual artist. For his performances, such as Parallel Head (2008) and Rheo (2009), he develops his fragile and complex image and sound worlds in a reciprocal process.

3 The Development of Pictorial Language in Live Visuals

In 1997, Image/ine appeared on the market. It was the first software for commercially available personal computers (in contrast to high-performance machines in professional video and television environments) that enabled live sampling and the continuous processing of previously recorded image sequences in real time.[3]

The means of composition at that time corresponded with the technological possibilities. Short, recorded image sequences were digitally manipulated and overlapped. The reuse and recombination of media elements (remixing) was the dominant aesthetic method in the 1990s. Because processing power was low compared to today, videos could only be processed in real time at a very low resolution (320 × 240 pixels). Enlargement to the actual projection surface resulted in highly pixelated images. This typical ‘pixel look’ of the time did not necessarily reflect the express desire of the artists; in actual fact, they were working and experimenting within the possibilities and limitations of the available software and hardware.

Soon after, the applications Nato.0+55+3d[6] and Jitter appeared.

The motto “generate, don’t make collages” (Jan Rohlf) aptly describes the moment at which the issue of pictorial language was thrown open for debate again.[8] In the design of a digital composition, computation also introduces an element of chance: artists lay down certain rules, the computer executes them and supplies results within the predefined conditions, and the outcome is an infinite number of visual options. Although visual artists maintain control over the process and the process conditions, the chance factor also produces results that are scarcely predictable at the beginning of the process.[9] The artists’ group Ubermorgen[10] consciously incorporates an element of chance in its work The Sound of eBay (2008), in which sound and image are automatically generated by the same external data source, namely by eBay user data.
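The interplay of fixed rules and chance can be sketched in a few lines. The example below is only an illustration under stated assumptions: the rule set (shape count, canvas size, radius range) and the function name `generate_composition` are hypothetical, and the seeded pseudo-random generator stands in for whatever source of chance an artist might choose.

```python
import random

def generate_composition(seed, count=5, width=320, height=240):
    """Apply fixed rules to chance values drawn from a seeded generator.

    The rules (how many shapes, which ranges) are set by the artist;
    the concrete positions and sizes are left to chance.
    """
    rng = random.Random(seed)
    shapes = []
    for _ in range(count):
        shapes.append({
            "x": rng.uniform(0, width),
            "y": rng.uniform(0, height),
            "radius": rng.uniform(5, 40),
            "kind": rng.choice(["circle", "square"]),
        })
    return shapes

first = generate_composition(seed=1)
second = generate_composition(seed=2)
```

The artist controls the conditions, while each new seed yields a different result within them. Reusing a seed reproduces a composition exactly, which can be useful when preparing a set.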

4 Software Tools

Today there is a wealth of diverse software tools for live image generation in association with sound. Individual and original approaches to the visualization of music and the generation of images are being developed with the aid of a wide variety of applications and methods. There are three basic approaches to working with the software, and each application has its own special features:

USER INTERFACE: A graphic user interface allows processing and mixing of images and films. No previous programming knowledge is required (e.g., Modul8, Isadora).[11]

PATCHES AND NODES: A graphic development environment supplies preconfigured modules that can be combined in so-called patches in an object-oriented approach. The individual modules have specific tasks and produce new functionalities when combined with each other (e.g., Max/Jitter, Pure Data, vvvv, QuartzComposer). The program vvvv stands out due to its high speed in the area of 3-D effects in real time. QuartzComposer is included with versions of the Apple operating system OS X 10.4 and upward.

PROGRAMMING: The direct use of a programming language (e.g., Processing/Java, openFrameworks/C++) allows any imaginable linking of sound and image. The open-source program Processing offers quick and simple access to the world of programming.[12] The open-source concept has the advantage that a global community is permanently working on the further development of programs and the expansion of functions. For each program there is a collection of knowledge on the Internet that is accessible to all users and can be extremely helpful in the development of personal applications.

Advanced artists create their own instruments for audiovisual performances by falling back on preconfigured material (modules, libraries) and combining it in new ways. If a function is required that does not yet exist in the form of modules or libraries, they can program it themselves — however, it is essential that they are familiar with a programming language.

The more independently artists intervene in or modify the preset elements of the software — or test them for unusual modalities or even divert them from their intended use — the more authentic and individual the emerging image level will appear. For many of her early videos, the Austrian artist Lia used the multimedia program Director, which was not actually conceived for use in music visualizations, but rather for application in the programming and controlling of interactive CD-ROMs. Although software applications seem to suggest the use of certain effects or esthetics, the unique style of Lia’s works testifies to the possibility of independence from the utilized software.

With the aid of vvvv, the French artist David Dessens developed his own visual stylistic elements, whose basis is the superformula by Johan Gielis (a mathematical modeling of plant shapes).[13] Semiconductor, an artist duo consisting of Ruth Jarman and Joseph Gerhardt,[14] also developed their own live performance software, Sonic Inc., which enables them to generate forms and compositions in real time by means of drawings and manipulations while the computer analyzes the sound space. In their work Inaudible Cities (2002/2003), sound constructs an entire city in this way.
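For readers unfamiliar with the superformula, the following sketch evaluates its commonly published two-dimensional polar form, r(φ) = (|cos(mφ/4)/a|^n2 + |sin(mφ/4)/b|^n3)^(-1/n1). It is not a reconstruction of Dessens’s vvvv patches, which work in three dimensions; the function names and the sampling resolution are assumptions made for illustration, and the exponents are assumed to be positive.

```python
import math

def superformula(phi, m, n1, n2, n3, a=1.0, b=1.0):
    """Radius r(phi) of the Gielis superformula in polar coordinates.

    Assumes positive exponents n1, n2, n3 and nonzero a, b.
    """
    t1 = abs(math.cos(m * phi / 4.0) / a) ** n2
    t2 = abs(math.sin(m * phi / 4.0) / b) ** n3
    return (t1 + t2) ** (-1.0 / n1)

def outline(m, n1, n2, n3, steps=360):
    """Sample the closed curve as a list of (x, y) points."""
    points = []
    for i in range(steps):
        phi = 2.0 * math.pi * i / steps
        r = superformula(phi, m, n1, n2, n3)
        points.append((r * math.cos(phi), r * math.sin(phi)))
    return points

# With m = 4 and all exponents equal to 2 the formula reduces to a circle
circle = outline(m=4, n1=2, n2=2, n3=2)
# Other parameter choices produce star-like or petal-like outlines
star = outline(m=5, n1=0.3, n2=0.3, n3=0.3)
```

In an audio-reactive setting, values derived from the sound analysis could be fed into `m` or the exponents so that the form changes with the music; that routing is a design choice and not part of the formula itself.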

5 The Live Factor

The original area of application for digital real-time visualizations encompassed both club visuals (VJing) and live performances within the scope of electronic music concerts (live cinema). While a club VJ often works on his or her own at the side of a DJ, visual artists explore specific connections between sounds and images in a production collective with audio artists. The close collaboration between musicians and visual artists leads to a favorable alignment of music and the visual system, especially in the live context. The use of visuals has now been extended to events of all kinds, from concerts to installations to the design of professional theater performances.

As a rule, the sets for the musical performance and the visuals are carefully prepared in the run-up to a live performance. The aesthetics of the visual level are determined by a careful choice of source material, such as photo stills, film sequences, text, geometric objects, and abstract elements, with the controllability of the elements and a high degree of interaction playing a special role here. Many artists change their sets for each concert and in so doing compile their personal tool box over time, which in practice and after repeated performances leads to the development of a whole repertoire of effects. This individual method results in a personal and, in the best cases, innovative style and thus has the added value of artistic exploration.

The individual parameters can be controlled via mouse or keyboard (virtual buttons or actuators on the screen, use of the data from the mouse position, or key combinations) or with the aid of external devices (to name just a few of the many possibilities: MIDI controller, joystick, Wii remote control, iPhone applications via OSC, etc.).
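As a rough illustration of how such controller data might reach a visual parameter, the sketch below scales 7-bit MIDI controller values (0 to 127) to parameter ranges. The binding table, the controller numbers, the parameter names, and the functions `cc_to_param` and `apply_cc` are hypothetical; receiving the messages from an actual device is left out and would require a MIDI or OSC library.

```python
def cc_to_param(cc_value, out_min, out_max):
    """Scale a 7-bit MIDI controller value (0-127) to a parameter range."""
    cc_value = max(0, min(127, cc_value))
    return out_min + (cc_value / 127.0) * (out_max - out_min)

# Hypothetical bindings: controller number -> (parameter, minimum, maximum)
BINDINGS = {
    1: ("transparency", 0.0, 1.0),
    7: ("speed", 0.25, 4.0),
}

def apply_cc(state, cc_number, cc_value):
    """Update the visual state if the controller number is bound."""
    if cc_number in BINDINGS:
        name, low, high = BINDINGS[cc_number]
        state[name] = cc_to_param(cc_value, low, high)
    return state

state = {}
apply_cc(state, 1, 127)   # fader fully up sets transparency to its maximum
apply_cc(state, 7, 0)     # knob fully down sets speed to its minimum
apply_cc(state, 99, 64)   # unbound controller, state unchanged
```

Keeping the bindings in a table separate from the scaling logic makes it easy to reassign faders between concerts without touching the rest of the set.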

Not every type of sound lends itself to every visual system and its built-in sound analysis. The complexity of a visual result based on audio data depends on the quantity and quality of the musical parameters analyzed. If only the volume parameter is analyzed, for example, an evenly low, noisy sound with short interposed high-frequency sounds will be difficult to differentiate at the visual level. A more precise frequency analysis partitions the sound into different frequency bands so that data values for high, medium, and low sounds can be determined and used for the generation of images.
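The difference between a volume-only analysis and a band analysis can be shown with a simplified sketch. Instead of the noisy sound described above, it uses two pure test tones of equal loudness, one low and one high. The band edges (200 Hz and 2000 Hz), the sample rate, and the function names are assumptions for illustration, and the plain discrete Fourier transform is chosen for readability, not speed.

```python
import cmath
import math

def rms(samples):
    """Overall volume of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def band_energies(samples, sample_rate, edges=(200.0, 2000.0)):
    """Sum spectral power into low, mid, and high bands (naive DFT)."""
    n = len(samples)
    low = mid = high = 0.0
    for k in range(1, n // 2):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples))
        power = abs(coeff) ** 2
        freq = k * sample_rate / n
        if freq < edges[0]:
            low += power
        elif freq < edges[1]:
            mid += power
        else:
            high += power
    return low, mid, high

RATE = 8000
N = 256
low_tone = [0.5 * math.sin(2 * math.pi * 125 * i / RATE) for i in range(N)]
high_tone = [0.5 * math.sin(2 * math.pi * 3000 * i / RATE) for i in range(N)]

# The volume is the same for both tones, so it cannot tell them apart
same_volume = abs(rms(low_tone) - rms(high_tone)) < 1e-9
# The band analysis places the energy in different bands
low_bands = band_energies(low_tone, RATE)
high_bands = band_energies(high_tone, RATE)
```

A real-time system would typically use a fast Fourier transform and more bands, but the principle is the same: the three band values provide three independent inputs for the image-generating system where the volume alone provides only one.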

During a live performance, the actual audio interpretation does not in fact take place until the performer acts as a filter and interpreter. The decisions on the use of the control options are made spontaneously and thus lead to a result that cannot be repeated and is unique — visual artists improvise live with their set. Audiovisual performances in electronic music call for a high degree of concentration on the part of the recipient and therefore often only last for 35 to 45 minutes.

Classic VJing in a club context is frequently only the first step for artists working in the area of audiovisual design options. An increasing differentiation of genres has occurred in recent years. The term sound sculptures, for instance, is also used to describe artistic works that consist of generated sounds and images.[15]

Not all contemporary artists exploring connections of sound and image perform in live contexts. Others produce multimedia works that are presented at festivals or released as DVDs and/or create installations that develop their effects in a public space. Some institutions and organizations commission generative works of art that are presented on purpose-built objects together with screen and loudspeakers.

An increasing use of synthetic surfaces as potential screens can be detected in particular in public space. With the dissemination of LED technology, today there are numerous buildings and billboards with playable surfaces that are also used for artistic interventions.

Footnotes

[1] “Transcoding Obsession,” an interview with Cécile Babiole conducted on March 10, 2009 by Laurent Catala. Available online at digital art international.


[3] An updated version of Image/ine is currently available for download online. Image/ine allowed the transposition of techniques from the area of video into the digital environment of the personal computer. It was developed by Steina Vasulka and Thomas Demeyer at the Dutch Studio for Electro Instrumental Music (STEIM). It was also in 1997 that Matthew Cohn (aka Matt Black), one half of the music duo Coldcut, wrote the application VJamm.[5] This program corresponded to a digital video mixer with options for manipulation and overlapping and was thus a precursor to a long series of applications that enabled the rapid combination and playback of film clips. Both programs were specifically developed for use in a live context so that performers could play with images in the same flexible way that was previously possible only for musicians playing with sounds.


[5] More information about Coldcut is available online.

[6] Nato.0+55+3d and Jitter were notable because they offered the option of combining and programming different objects for the manipulation and generation of images. Unlike clip-playback applications such as VJamm, these more flexible, object-oriented applications allowed the direct manipulation and generation of image sequences on the basis of sounds. Skot vs. Hecker and End of Skot (both 2000, music by Florian Hecker and Mathias Gmachl, visuals by Skot) and 242.Pilots — Live In Bruxelles (2002) by the artists’ group 242.pilots (USA) are early testimonies to these developments.


[8] Jan Rohlf, “Generieren, nicht Collagieren,” Cinema — unabhängige Schweizer Filmzeitschrift, no. 49: Musik (2004), 121–132; also available online.

[9] Katerina Tryfonidou and Dimitris Gourdoukis, “What comes first: the chicken or the egg? Pattern Formation Models in Biology, Music and Design,” 2009, online publication (accessed July 23, 2009).

[10] (accessed July 23, 2009).

[11] Although Isadora is commercial software, it also supports so-called FreeFrame open-source extensions.

[12] More detailed information about the programs mentioned is available on their respective websites.

[13] See also,, and


[15] The DVD advanced beauty, curated by Matt Pyke (Universal Everything), was released in late 2008. It comprises eighteen audioreactive video-sound sculptures. The videos are configured as physical manifestations of the sound, created by changing the volume, the pitch, or the structure of the corresponding soundtrack.

List of books in this text

Cécile Babiole: Transcoding Obsession: Interview with Cécile Babiole
2009, Author: Catala, Laurent

Generieren, nicht Collagieren
2004, Author: Rohlf, Jan; Publisher: Schüren

What comes first: the chicken or the egg? Pattern Formation Models in Biology, Music and Design (accessed July 23, 2009)
2009, Author: Tryfonidou, Katerina and Gourdoukis, Dimitris
