"Virtual Acoustic Reality"

By Keith Yates

On a wintry day in 1870, Queen Victoria paid a visit to a colossal auditorium then nearing completion in the South Kensington area of London. Nearly ten times the size of the largest concert halls of Europe, the immense structure was destined to take its place as the world’s grandest concert venue. With the scaffolding still in place, the Queen thrilled to the “delicate harmonics” of some violin scales and the voice of a soprano. But at the inaugural concert a few months later, the Prince of Wales’ welcoming address was garbled by “a curious echo bringing the repetition of one sentence as the next one was begun.”

Pity the Queen didn’t have a computer running today’s cutting-edge “virtual acoustic reality” software to hear the problem before construction began. As it turned out, it took a hundred years to tame the rogue echoes blighting The Royal Albert Hall.

Auralization is the newly minted name of a technology that stands to revolutionize the design of any sound-critical space by allowing both designer and client to hear the structure before it is built. You could think of it as the audio cousin of the now-common visualization tools that allow viewers to take a video walk-through of a structure while it’s still in the design stage inside a computer.

The auralization sequence typically goes something like this: The designer constructs a three-dimensional model of the proposed structure in a computer, assigns materials (wood, glass, concrete, gypsum board, carpet, etc.) to the various surfaces, specifies locations for the sound sources (musicians, organs, hi-fi loudspeakers, public-address speakers, etc.), defines a listener location, patches in an audio program (say a movie soundtrack, a CD or a live microphone feed), and lets the computer simulate, in a set of headphones, what it projects the “real McCoy” would sound like at that exact listener position. You could think of it—as marketing guys no doubt will—as the first truly useful foray into virtual acoustic reality.

Sounds too Good to Be True?
To anyone about to build an Olympic stadium, a home theater, or anything in between, it sounds too good to be true. Auralization has been a hot topic in university labs in the US, Japan, Germany, Sweden and elsewhere for a few years now, attracting some of the best and brightest researchers in psychoacoustics, architectural acoustics and virtual reality. Yet the hurdles have been daunting: even the fastest workstation-class computers have had to “chew the numbers” for hours or even days to process or “convolve” barely a minute of program. And who’s to say how closely the replay—usually on headphones—would resemble the live aural experience? I’ve been following the lab developments and listening to experimental prototypes since 1991, and had concluded that, at least for another few years, auralization was more buzzword than business tool.

Chin Music
Bose Corporation recently changed all that with the introduction of a sensational new tool called Auditioner. The result of 15 engineer-years of work, Auditioner is essentially an auralization enhancement to the Bose Modeler software program in use by sound contractors since the late 1980s to predict and graphically display speaker coverage patterns, reverberation time, interference effects, speech intelligibility and other nuts and bolts of commercial sound system design. As helpful as Modeler (and similar products from other vendors) can be, it takes a skilled operator to be able to input and then interpret all the data, let alone mentally translate them into a “sound.” The customer has still had to take the operator’s word for it that the proposed structure will “sound right.”

The hardware and software elements that comprise Auditioner allow the designer to go beyond charts, maps, curves and “trust-me” chin music by actually simulating the sound of the venue through a special set of speakers located just a foot or so from the auditioner’s ears. You simply set your—well, your chin—on the chin-rest and let the system make the music.

Intrigued, I recently spent a day at Bose’s Massachusetts R&D headquarters to have a listen for myself. Using a variety of CDs and live microphone feeds, the demos proved uncannily lifelike, whether the simulated environment was a small local church or a big urban concert hall. Impressed but wary of canned demonstrations, I asked Ken Jacob, the project’s chief engineer, to simulate a known pair of speakers in a known environment: the listening room we were sitting in, which happened to be an IEC-standard listening room. (This is a room that represents the statistically “average” home listening environment.) He opened a new file in Modeler, input the room, speaker and listener particulars, loaded a CD of my choice—the system-busting Telarc recording of Stravinsky’s “Firebird”—and let ’er rip through the Auditioner playback system. The upshot was remarkable agreement between the auralized presentation and the way the real room—which, again, we were sitting in—actually sounded.

Auditioner has been used by a handful of specially trained designers to rescue a few dozen public facilities—including the new German Bundestag (parliament) in Bonn, a 70,000-seat outdoor stadium in Thailand, and the University of Minnesota’s new Mariucci Arena—from acoustic disappointment by allowing the principals to hear an accurate simulation of their facilities before the sound system and acoustical designs (or redesigns) were finalized. With a growing list of tested and “authenticated” public projects under its corporate belt, Bose is now sufficiently confident of Auditioner’s accuracy that it offers a written guarantee that the completed facility and sound system will sound as Auditioner simulates . . . or Bose will rip it out and refund all monies paid.

The guarantee is simple; Auditioner’s inner workings are anything but. First, Modeler looks at all the relevant variables—the geometry of the space, the acoustical characteristics of the surfaces, the locations of the listeners and sound sources—and calculates a system-to-listener transmission response that includes the direct sound, discrete reflections and diffuse reverberation, a sort of acoustical fingerprint unique to that particular arrangement [see Fig. 1]. Move a wall, add a window, reposition a speaker or specify a thicker carpet and that transmission response will change to reflect it.
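For readers who think in code, here is a rough sketch of what such an acoustical fingerprint might look like as a list of numbers. It is a toy illustration, not Bose’s method: the function name, the two reflection delays and gains, the half-second reverberation time and the decaying-noise tail are all assumptions picked for the example, whereas a real program would derive them from the room model.

```python
import random

def toy_transmission_response(sample_rate=8000, rt60=0.5,
                              reflections=((0.012, 0.6), (0.023, 0.4)),
                              seed=0):
    """Build an illustrative impulse response (hypothetical helper).

    reflections: (delay in seconds, gain) pairs, invented for the example.
    rt60: time in seconds for the diffuse tail to decay by 60 dB.
    """
    rng = random.Random(seed)
    n = int(sample_rate * rt60)
    response = [0.0] * n

    # 1. Direct sound: arrives first, at full strength.
    response[0] = 1.0

    # 2. Discrete early reflections: delayed, weaker copies.
    for delay_s, gain in reflections:
        response[round(delay_s * sample_rate)] += gain

    # 3. Diffuse reverberation: noise under an exponential envelope
    #    that falls to 10**-3 (that is, -60 dB) at t = rt60.
    tail_start = round(0.030 * sample_rate)
    for i in range(tail_start, n):
        t = i / sample_rate
        envelope = 10 ** (-3.0 * t / rt60)
        response[i] += 0.2 * envelope * rng.uniform(-1.0, 1.0)

    return response

fingerprint = toy_transmission_response()
```

Change a delay, a gain or the reverberation time and the list changes with it, which is the code-level equivalent of moving a wall or laying a thicker carpet.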

Modeler uses this transmission response to create an extremely complex audio filter. (You might think of this filter as a giant digital graphic equalizer with thousands of knobs, each dialed to a very particular setting. Unlike even the most sophisticated equalizers, however, the filter produced by Modeler includes the dimension of time so as to account for the room’s reverberation. Filtering audio in the time domain, something central to all auralization systems, is called “convolving.”) This filter is so huge, mathematically, that commercially available DSP chips are incapable of applying it to continuous real-time signals (like, for example, a stereo feed from a CD player). Bose uses purpose-built convolution hardware—the guts of which are rumored to come from Lake DSP of Australia—to get the job done. An audio processing unit—basically a powerful digital audio computer—stores and applies the filter to whatever real-time audio source is plugged in, typically a CD or live microphone feed.
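Convolving is easier to grasp in miniature. The sketch below is the plain textbook form of the operation in Python, offered only as an illustration; nothing here is known about how Bose’s hardware does the job. The function names, the tiny “dry” signal and the three-sample “room” are made up for the example, and the cost estimate assumes a two-second response at the CD sampling rate of 44.1 kHz.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution: every input sample launches a scaled
    copy of the room's impulse response, and the copies are summed."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, x in enumerate(signal):
        if x == 0.0:
            continue  # a silent sample contributes nothing
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out

def multiply_adds_per_second(sample_rate, response_seconds):
    """Work needed to run the direct form in real time, per channel."""
    taps = int(sample_rate * response_seconds)
    return sample_rate * taps

dry = [1.0, 0.0, 0.0, 0.5]   # two clicks, the second one quieter
room = [1.0, 0.0, 0.5]       # direct sound plus a single echo
wet = convolve(dry, room)    # each click is now followed by its echo

cost = multiply_adds_per_second(44100, 2.0)
```

Under those assumptions the direct form needs 44,100 x 88,200, or roughly 3.9 billion, multiply-adds every second for each channel, which gives some feel for why an off-the-shelf chip would struggle.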

Importantly, the processing unit allows for instantaneous “A/B” comparisons between different design iterations. This takes the I-forgot-how-the-other-one-sounded “fuzziness” out of the decision-making process.

Home Connection
Auditioner has not yet been used to auralize small proposed playback spaces like home theaters or listening rooms. Yet one naturally wonders how useful the tool might be in the design of purpose-designed residential venues. Jacob and staff initially doubted that home projects justified the use of such an industrial-strength tool; when I related the scope of my current screening room and home theater projects around the world they were temporarily struck speechless. Jacob vowed to “look into” the feasibility of adding a residential component to the program as soon as the mushrooming backlog of major public projects subsides. His fear is that there may not yet be a sufficient number of similarly ambitious residential projects to warrant the considerable expenditures of corporate time, money and talent.

Yet I suspect that there could be a lively residential niche awaiting Auditioner or some similar future product: As much as they may believe they’re in good hands, most end-users would love to audition their new home theater, listening room or home concert hall before the walls go up and the check for all that hardware and engineering is made out. Aside from the obvious prudence of hearing before buying, there’s an almost voyeuristic fascination in being able to experience an acoustically credible world that doesn’t yet exist. My take on it is that when something is sexy, financially sound, and marketed with the kind of savvy Bose is famous for, it’s bound to find an enthusiastic market.

If he decides to take the plunge, Jacob may want to tune Modeler/Auditioner to the specific requirements of high-resolution home theaters and listening rooms. For starters, an enhancement of the speakers themselves may be in order: I suspect that the simulation’s weak link is currently not the Macintosh computer or Bose’s proprietary hardware and software tucked inside it, but the bandwidth and resolution of the little “nearphone” satellite speakers used to present the simulation. Even if they’re enhanced by special compensating circuits in the audio processing unit—something that would make sense, but which Jacob would neither confirm nor deny—it still seems a bit far-fetched to believe that the differences between, say, the Meridian and Genelec speakers that I enthused over in this space a few months back could be laid bare by a little pair of satellite speakers, let alone the differences between various surround processors, amplifiers, D-A converters and so on.

Next, although the auralization demos were astonishingly lifelike, it would seem possible to further increase spatial realism by expanding the presentation system to include a pair of “surround” speakers just behind the auditioner. While this could conceivably tax the computational power of the present DSP chips and software, I think it worth studying: Once experienced, the sensation of envelopment within a soundfield is addicting.

Third, in my view the system would benefit from the ability to accurately predict and portray the bass (from say 60 Hz down to 15 Hz or so) that underpins the home theater experience. This area is tough to get just right in residentially sized environments, with their typically uneven modal responses [see “Well-Tuned Room” installments in the May ’93 and January ’94 issues]. It may not be sufficient to merely hook up a subwoofer to Auditioner’s playback system: The interaction between low-frequency waves and the kinds of objects found in residential spaces—chairs, sofas, display cabinets and so on—is very tough to model accurately. (This is probably not a shortcoming in projects Modeler and Auditioner are presently used for—projects where speech intelligibility is the overriding concern.)
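To see why the bottom octaves are so uneven in a small room, it helps to list where the room’s resonances fall. The sketch below uses the standard textbook formula for an idealized, empty, rigid-walled rectangular room, which is exactly the simplification that real furnished rooms violate. The 6 x 4 x 2.5 meter dimensions, the function name and the 343 m/s speed of sound are assumptions for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # meters per second, assumed (air at about 20 °C)

def room_modes(length, width, height, max_hz, max_index=4):
    """Return sorted (frequency in Hz, (p, q, r)) pairs for the modes of
    an ideal rectangular room, up to max_hz.

    f = (c / 2) * sqrt((p/L)^2 + (q/W)^2 + (r/H)^2)
    """
    modes = []
    for p in range(max_index + 1):
        for q in range(max_index + 1):
            for r in range(max_index + 1):
                if p == q == r == 0:
                    continue  # (0, 0, 0) is not a resonance
                f = (SPEED_OF_SOUND / 2.0) * math.sqrt(
                    (p / length) ** 2 + (q / width) ** 2 + (r / height) ** 2
                )
                if f <= max_hz:
                    modes.append((f, (p, q, r)))
    return sorted(modes)

# Hypothetical home theater, dimensions in meters.
bass_modes = room_modes(6.0, 4.0, 2.5, max_hz=60.0)
for freq, indices in bass_modes:
    print(f"{freq:6.2f} Hz  mode {indices}")
```

In this made-up room only a handful of resonances land below 60 Hz, and none at all below about 28 Hz, so the bass response is carried by a few widely spaced peaks rather than a smooth continuum.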

Fourth, Modeler itself would need to be expanded to include libraries of the standard acoustical products—diffusors, bass traps, absorbers—found in most all serious residential and studio playback spaces. Libraries of non-Bose branded speakers would be—no offense, really—extremely welcome, too. (The latter issue is due not to chauvinism on Bose’s part, but the unspeakable reluctance of most speaker manufacturers to part with the kind of specific performance data that programs like Modeler need before they can construct that all-important system-to-listener transmission response.)

If all this tweaking sounds like too tall an order for a company scrambling just to keep up with demand for big-ticket commercial projects, perhaps a solution would be for Bose to set up one or more third-party developers or “VARs” to target the custom residential market.

From my perspective, the Bose R&D team has developed a startlingly promising tool that could usher us into a new, what-you-hear-is-what-you-get era in home entertainment. For now, though, the music and movie enthusiast dreaming of that riveting room and playback system will have to do what he or she has always done: Rely on the best specialists available . . . and a lot of imagination.

Contributing editor Keith Yates has been professionally designing purpose-built listening rooms and home theaters since 1977. He just completed his first book, with coauthor Ralph Glasgal, on Ambiophonics, a new surround-like playback technology.