Quote:
If you’ve shopped for a new A/V receiver anytime in the past few years, you’ve no doubt seen Audyssey’s logo join the ever-growing collection of silk-screened faceplate emblems. The company’s MultEQ automatic room correction system has quickly become the de facto standard for equalization in home theater, and is included these days as standard equipment on receivers and preamps from Denon, Integra, Onkyo, NAD, and Marantz. Recently, Audyssey introduced two new sound-shaping technologies to the market, dubbed Dynamic EQ and Dynamic Volume, which take the concept of automatic room correction to the next level. To get the skinny on these new features, we sat down with Chris Kyriakakis, founder and chief technology officer of Audyssey, to discuss their origins and the problems they hope to address.
Let’s talk about Audyssey’s Dynamic EQ feature, which we’ve seen popping up on A/V receivers as of late. How does it work?
Well, Dynamic EQ is an extension of our room correction technology, MultEQ, so let’s talk about that for a minute first. MultEQ was designed after lots of research at the University of Southern California, where I teach along with Tom Holman, one of our co-founders. It’s unique because it was the first [room correction solution] to address how you take sound measurements in a room at multiple positions, and how you combine those measurements to make filters that fix audio problems over a large listening area.
The most common way of doing room EQ is that you put a microphone at your main listening position, play some pink noise, and then you look at the frequency response and make an inverse of that…
In other words, if you find a two-decibel spike at a certain frequency, you counter that with a two-decibel cut at that frequency, and vice versa, in an attempt to flatten out the EQ curve in the room—to make sure that no particular frequency is accentuated over any other…
Right. But there are several problems with the way it’s been done before. First off, measuring pink noise means you’re only measuring the frequency response: you’re time blind. In other words, you don’t know what part of the signal was direct sound from the speakers and what part of the signal was reflected off of other surfaces.
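To make the conventional single-position approach concrete, here is a minimal sketch of "measure, then invert." The band frequencies and deviation values are made up for illustration, and `inverse_eq` is a hypothetical helper, not part of any Audyssey product.

```python
def inverse_eq(measured_db, target_db=0.0):
    """Return per-band correction gains (dB) that cancel the measured deviations."""
    return {freq: target_db - level for freq, level in measured_db.items()}

# Hypothetical deviations from flat (dB) at a single microphone position.
measured = {63: 2.0, 125: -1.5, 1000: 0.0, 4000: 3.0}

correction = inverse_eq(measured)
corrected = {freq: measured[freq] + correction[freq] for freq in measured}

print(correction)  # a 2 dB spike gets a 2 dB cut, a 1.5 dB dip gets a 1.5 dB boost
print(corrected)   # every band ends up on the flat target
```

Note that the sketch works on magnitudes only. Nothing in it distinguishes direct sound from reflections, which is the "time blind" limitation described above.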
So to address that problem we came up with a method of measuring impulse responses (bursts of sound), which gives you a lot more information about what’s happening in the room in the time domain. We’re not just adjusting frequencies; we’re adjusting delays, as well. And we come up with the right frequency adjustments and delay settings by taking six or eight measurements and looking at them as patterns: maybe positions one and four give us very similar measurements, for example; that means they have very similar acoustical problems. Positions two, five, and six may also be very similar to each other, but different from one and four. And perhaps the third measuring position is so different that it has a unique response. Based on all of that information, we apply a weighting to the problems. If enough seats have the same problem, then it’s more important than the one seat that has unique issues. And you keep doing that over the entire frequency range, and you come up with this weighted combination, which is better than what others have tried in the past, which was simply taking six measurements and averaging them.
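The difference between a plain average and a weighting that favors shared problems can be illustrated with a toy example at a single frequency. This is only a sketch under stated assumptions: the interview does not describe the actual MultEQ math, so the greedy grouping, the 1 dB tolerance, the squared-size weights, and the measurement values below are all invented for illustration.

```python
def cluster_values(values, tolerance_db=1.0):
    """Greedily group sorted values; a value joins the current group
    if it lies within tolerance_db of that group's first member."""
    clusters = []
    for v in sorted(values):
        if clusters and v - clusters[-1][0] <= tolerance_db:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters

def weighted_combination(values, tolerance_db=1.0):
    """Combine cluster means, weighting each cluster by size squared
    (an arbitrary illustrative choice) so shared problems dominate."""
    clusters = cluster_values(values, tolerance_db)
    weights = [len(c) ** 2 for c in clusters]
    means = [sum(c) / len(c) for c in clusters]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)

# Hypothetical deviation (dB) at one frequency for mic positions 1..6:
# positions 1 and 4 agree, positions 2, 5, and 6 agree, position 3 is unique.
band = [3.0, -2.0, 8.0, 3.2, -2.2, -1.9]

clusters = cluster_values(band)
plain = sum(band) / len(band)
weighted = weighted_combination(band)

print(clusters)
print(plain, weighted)
```

In this toy case the single outlying seat pulls the plain average noticeably upward, while the weighted result stays close to what most seats measured. Repeating the same step for every frequency band would give a full correction curve.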
So how does Dynamic EQ fit into the picture?
Well, we’ve found that when you do all of that and equalize a room to a particular target equalization curve, what you’re really doing is equalizing it at what’s called “reference level,” which is the level the person mixing the content was listening to on the mixing stage. But the level at which mixers listen in Hollywood is much higher than “civilian levels,” as we call them. Our installers tell us that the average home user listens at -10 to -20 on the volume knob, and the minute you turn the sound down that far, your perception changes—the balance between bass and mid and treble is completely changed.
This isn’t a new thought, of course: what’s known as “loudness control” came about to address this, and it’s been around for many years. But the reason loudness controls disappeared is that they really weren’t effective.
If you have a passage of music that has loud and soft parts when you’re at high volumes, and then you lower the volume, the soft parts need more correction than the louder parts. So we had to invent a way of measuring perceived loudness in every channel, running in real time. And that’s what Dynamic EQ does. Every channel has its own loudness meter that looks at the content as it’s coming through and estimates, based on the signal and our human models, how a human will perceive it. So it makes minute adjustments, and of course it doesn’t just rapidly go up and down or you would notice it. We smooth it out. That’s part of the innovation here; that’s why we called it Dynamic.
So how did you develop the human model for this?
We work a lot with professional mixers at the university because of Tom [Holman]’s connections. So we did a number of experiments with them. In one of them we put them in front of the console and turned down the master fader, and the first reaction they had to their own mixes was, “what happened to my surrounds?” It turns out that loudness perception is spatial—it falls off faster behind us than it does in front of us. And we asked these mixers, “OK, you’re down 10dB, what would you do to the surrounds to maintain the surround impression?” And they would move it up, and at different levels they would move it up by different amounts. So if you do that with enough people you can come up with a set of rules that mimic what they’re doing.
So we integrated that into Dynamic EQ—as you turn the volume down, the surround levels go up a little or a lot, depending on how far down you are, to maintain the impression of surround. And the best way to demo that is to turn the volume down 20dB and turn off Dynamic EQ, and all of the sound collapses to the front.
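The "set of rules" idea, where the surround boost depends on how far below reference level you are listening, could be sketched as a simple lookup with interpolation. The table values here are purely hypothetical placeholders; the interview gives no actual figures, and `surround_boost_db` is an illustrative name rather than anything from Audyssey.

```python
# Hypothetical rule table: (dB below reference level, surround boost in dB).
RULES = [(0.0, 0.0), (10.0, 1.5), (20.0, 4.0), (30.0, 6.0)]

def surround_boost_db(db_below_reference):
    """Linearly interpolate the surround boost for a given volume reduction,
    clamping inputs to the range covered by the table."""
    x = min(max(db_below_reference, RULES[0][0]), RULES[-1][0])
    for (x0, y0), (x1, y1) in zip(RULES, RULES[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return RULES[-1][1]

for reduction in (0, 10, 15, 20, 40):
    print(reduction, surround_boost_db(reduction))
```

A table like this would be one plausible way to encode what the mixers did by hand: no boost at reference level, and progressively more as the master volume comes down.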
How does Dynamic Volume fit into all of this?
So, continuing with our story of the mixers—we had probably 20 people who participated in this set of experiments, initiated by Tom Holman, who was puzzled for a long time by one particular problem: why is it that a mix that is done in the dubbing stage doesn’t work as well in our home theaters? Why do we need to keep adjusting the volume—up to hear the dialogue and down when the action gets heavy? Why do we need to do it at home, but not in the movie theater? We have high end equipment. What’s going on?
What we found was that the content is mixed for movie theater systems with extremely high dynamic range—very powerful amps, big speakers, in huge rooms. But when you bring that into the home the sound pressure levels that you perceive change, the perceived dynamic range changes, and it’s affected by the size of the room and the ability of your system. So the first thought was, We have to control the dynamic range, but we can’t do it by putting a compressor in the signal path, because that doesn’t work—it introduces artifacts.
By “compressor,” you mean things like Night-time Mode or Midnight Mode, and similar dynamic range controls on many A/V receivers, which make loud sounds quieter and quiet sounds louder…
Exactly. So to find a better solution, we brought mixers in to work with content that they knew and said, “Pretend your daughter is sleeping next door, and make adjustments to this mix in real time.” And we set up a system to electronically track their movements and in parallel record the content in 5.1. When we were done we had a seven- or eight-terabyte database that we gave to our research team and said, “We need to reverse engineer their decisions. Why did they turn the sound down when they did, and when they did turn it down how fast were they reacting? What were they reacting to? What was the content in every channel?” And we came up with an algorithm that mimicked those decisions. And so Dynamic Volume came out of these experiments. It uses a look-ahead method, with a small buffer, and looks at what’s about to play in each channel…
How far does it look ahead?
That’s adjustable. It depends on who we’re working with. A TV manufacturer will give us a lot because they’re already doing video processing and they don’t care. An A/V receiver manufacturer will not give us as much look-ahead time; it’s about half a frame. The more the better, obviously, but you don’t want to be introducing delay more than you have for video, so we’re limited by that.
So, aside from the look-ahead buffer, what makes Dynamic Volume different from other dynamic range controls?
Well, the average person adjusts the volume level so dialogue sounds right. That’s what’s important to them. They just don’t want to have to run from the room when the explosions come, or at commercial breaks. So we keep dialogue as reference, and then monitor the signal above or below and decide how much to bring it up or down. The trick was to do that without using a compressor, as I mentioned before, because a compressor has a fixed time constant. If things are too soft it boosts, and if things are too loud it cuts. But it doesn’t have any knowledge of how quickly they’re becoming soft or loud. And so by the time it catches most things they’re already changing in the other direction.
Some improvements have been made to compressors over the years, where you take the signal and break it up into different frequency bands, and put a compressor on each of those bands, the idea being that only the frequency bands that have something loud in them will be compressed, and not the whole signal. But if there’s an explosion, which means the 60-70Hz bands have something extremely loud in them and the rest don’t, if you compress those bands and bring them down, you’ve changed the frequency response, because you’re turning down just the bass. And Hollywood mixers get very upset when you do that. They’re okay with you turning the volume down. That’s one thing. But intentionally changing the frequency response is not okay with them. So we don’t use bands.
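A little arithmetic shows why per-band processing alters tonal balance while a single broadband gain does not. The band names, levels, and ceiling are invented for illustration, and both functions are deliberately crude stand-ins (hard limiting rather than real compression).

```python
def multiband_limit(bands_db, ceiling_db):
    """Pull down only the bands that exceed the ceiling."""
    return {name: min(level, ceiling_db) for name, level in bands_db.items()}

def broadband_limit(bands_db, ceiling_db):
    """Apply one gain to all bands so the loudest band sits at the ceiling."""
    cut = min(0.0, ceiling_db - max(bands_db.values()))
    return {name: level + cut for name, level in bands_db.items()}

# Hypothetical band levels (dB) during an explosion: bass dominates.
explosion = {"bass": -3.0, "mid": -20.0, "treble": -25.0}
ceiling = -12.0

multi = multiband_limit(explosion, ceiling)
broad = broadband_limit(explosion, ceiling)

print(explosion["bass"] - explosion["mid"])  # original bass-to-mid balance
print(multi["bass"] - multi["mid"])          # balance after per-band limiting
print(broad["bass"] - broad["mid"])          # balance after one overall gain
```

In both cases the bass ends up at the ceiling, but only the broadband version keeps the bass-to-mid difference the mixer chose.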
Our approach is different. We look at the time domain signal as it’s varying, estimate how loud it is, and make adjustments slowly or rapidly, depending upon what needs to happen.
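The contrast between a fixed time constant and a look-ahead buffer can be sketched with two toy gain computers. This is not Audyssey’s algorithm, which the interview does not specify; it is a simplified look-ahead limiter set against a one-pole reactive smoother, with invented levels, ceiling, and parameters.

```python
def reactive_gains(levels_db, ceiling_db, alpha=0.5):
    """Fixed time constant: each step, the gain moves a fixed fraction
    (alpha) of the way toward the gain that is needed right now."""
    gain, gains = 0.0, []
    for level in levels_db:
        needed = min(0.0, ceiling_db - level)
        gain += alpha * (needed - gain)
        gains.append(gain)
    return gains

def lookahead_gains(levels_db, ceiling_db, lookahead=2):
    """Look-ahead: choose the gain from the loudest level in the current
    block and the next `lookahead` blocks, so cuts arrive before peaks."""
    gains = []
    for i in range(len(levels_db)):
        window = levels_db[i:i + lookahead + 1]
        gains.append(min(0.0, ceiling_db - max(window)))
    return gains

# Hypothetical block levels (dB): quiet dialogue, a sudden loud burst, quiet again.
levels = [-30.0, -30.0, -30.0, -5.0, -5.0, -30.0, -30.0]
ceiling = -15.0

reactive_out = [l + g for l, g in zip(levels, reactive_gains(levels, ceiling))]
lookahead_out = [l + g for l, g in zip(levels, lookahead_gains(levels, ceiling))]

print(max(reactive_out))   # the reactive version lets the onset through
print(max(lookahead_out))  # the look-ahead version holds the ceiling
```

The reactive version is still catching up when the burst hits and is still cutting after it ends, which matches the complaint above about compressors reacting too late. A real system would also ramp the look-ahead gain smoothly rather than stepping it, and the gain here is applied to the whole signal rather than to separate frequency bands.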
What do you say to people who accuse you of tinkering with the artistic intent of a carefully crafted surround sound mix?
Actually, I say that we’re trying to preserve it. Whether it’s the user or the system turning the volume down, Dynamic EQ is tied into Dynamic Volume. So, if I make it softer, all of the things I said earlier about preserving the frequency response and the spatial impression are still in play, whether it’s done automatically or not. You can’t turn on Dynamic Volume without having Dynamic EQ on. So for that reason, we’re actually preserving the artistic intent of the surround sound mix, considering the fact that the user simply isn’t going to play it at the volumes it was mixed for. So, given the reality of real-world listening situations, we’re trying to translate the artistic intent as much as possible.
You say you can’t turn on Dynamic Volume without turning on Dynamic EQ—is the inverse true, as well?
No, you can turn on Dynamic EQ without engaging Dynamic Volume. If you have a dedicated room and you want to listen at reference levels, or for whatever reason you don’t want or need Dynamic Volume on, you can turn it off and Dynamic EQ will stay on.
It seems like a lot of what you’re doing is taking mixes that were intended for large cinematic spaces and modifying them so they work better at home. What about the DVDs and Blu-rays on the market that contain near field mixes, specifically designed for home theater?
Dynamic EQ and Dynamic Volume really aren’t affected by that, but what is affected is the target equalization curve choice in MultEQ. The high-frequency roll-off that is in the Denons and other receivers is called the Audyssey Curve. That assumes that the mix was done for theaters [and therefore mixed with increased high-frequencies to compensate for the sound passing through a large cinema screen]. So if you have a mix that was actually done for the home, we recommend that you switch to the Audyssey Flat Curve. The problem is, you don’t always know. I wish the studios would make it clearer, who does it and who does not. It’s not the majority of them, but there are some—New Line and a few others—who do this. So we have no way of making it automatic, because we have no way of knowing whether the mix was done for theaters or for home. But if you do know you’ve got a near field mix, you should set it to Flat.
Could it be made automatic somehow? Could the sound mixers put a flag in the bitstream?
Yeah, Tom is trying to get them to do that. He knows a lot of the post-production houses, and there are empty flags in Dolby Digital that could be used for things like that, but habits are hard to break. Those guys just don’t use flags. I think that it will happen eventually. I’m just not sure how soon it will happen.
What’s the next big problem you want to solve?
The next big problem relates to MultEQ: in our pro applications, our installer applications, we provide a few more options than what you have as a consumer. We’ve come up with a few preset equalization curves that work for different sized rooms, because different sized rooms require different amounts of high-frequency roll-off, depending on how much acoustical treatment they have. So today we do that with presets and guidelines. But what we’re working on is to make that completely automatic, so that a measurement of the reverberation time and knowledge about the size of the room and the position of the primary listening position—because that affects how much direct sound and how much reverberant sound you’re getting—will give us exactly the right EQ curve for that room. That’s a lofty goal. It’s a really difficult problem. But we have a good team that’s working on it now, and we have a whole series of experiments that we’re doing in the lab to figure this out.
When might we start to see that?
My best guess is that we might start to see models with that appearing… a year from now. That’s just a guess, though.