Though much more goes into how humans hear, we will focus on how we locate sound. To do this, we rely on a combination of specific auditory cues that tell us the azimuth (horizontal position), elevation (vertical position), and range (distance) of the sound we are hearing. There are seven primary auditory cues that contribute to sound localization [4]:
- Interaural Time Differences (ITDs)
- Interaural Intensity Differences (IIDs)
- Monaural spectral cues shaped by the pinna
- Torso reflections and diffractions
- The ratio of direct to reverberant energy
- The changes in cues based on head motion
- The familiarity with the sound
The ITD and IID are the primary cues that determine the azimuth, or the position of the sound in the horizontal plane. ITD refers to the time delay between a sound reaching the right ear and reaching the left ear. If a sound originates from your right, it will reach your right ear first and your left ear second. IID refers to the difference in loudness of a sound between the right and left ears. Again, if a sound originates from your right, it reaches the right ear more directly, while your head attenuates it on the way to the left ear, so it is softer there [1][4].
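To give a sense of the magnitudes involved, the ITD can be approximated with Woodworth's spherical-head formula, ITD ≈ (a / c)(θ + sin θ), where a is the head radius, c is the speed of sound, and θ is the azimuth in radians. This is only a rough sketch and is not taken from the cited sources: the head radius of 8.75 cm, the speed of sound of 343 m/s, and the function name `woodworth_itd` are assumptions chosen for illustration, and the formula assumes a distant source and a rigid spherical head.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees Celsius
HEAD_RADIUS_M = 0.0875      # assumption: a commonly quoted average head radius


def woodworth_itd(azimuth_deg, head_radius_m=HEAD_RADIUS_M,
                  speed_of_sound_m_s=SPEED_OF_SOUND_M_S):
    """Approximate the ITD in seconds for a distant source.

    azimuth_deg is measured from straight ahead, positive to the right,
    and must lie in [-90, 90]. A positive result means the sound reaches
    the right ear first.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between -90 and 90 degrees")
    theta = math.radians(azimuth_deg)
    # Woodworth's spherical-head approximation: (a / c) * (theta + sin(theta))
    return (head_radius_m / speed_of_sound_m_s) * (theta + math.sin(theta))


if __name__ == "__main__":
    for azimuth in (0, 30, 60, 90):
        itd_us = woodworth_itd(azimuth) * 1e6
        print(f"azimuth {azimuth:3d} deg -> ITD of about {itd_us:.0f} microseconds")
```

Under these assumptions, a source directly to one side produces an ITD of well under a millisecond, which illustrates how fine the timing resolution of this cue has to be.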
The main cues for finding the elevation, or vertical position, are the monaural spectral cues created by the complex geometry of the pinnae, which occur above 3 kHz. Torso reflections and diffractions and the ratio of direct to reverberant energy also provide monaural cues, though both are much weaker. In general, monaural cues are more ambiguous than binaural ones and thus can be overridden or confused much more easily.
The range is determined by the ratio of direct to reverberant energy, by loudness in conjunction with familiarity with the source, and by low-frequency IIDs.
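As an illustration of the first of these range cues, the ratio of direct to reverberant energy can be estimated from a room impulse response by comparing the energy in a short window around the direct-path peak with the energy that arrives afterwards. The following is a minimal sketch under stated assumptions, not a method from the cited sources: the function name `direct_to_reverberant_ratio_db`, the 2.5 ms window on either side of the peak, and the toy impulse response are all hypothetical choices made for illustration.

```python
import math


def direct_to_reverberant_ratio_db(impulse_response, sample_rate_hz,
                                   direct_window_ms=2.5):
    """Estimate the direct-to-reverberant energy ratio in decibels.

    The direct sound is taken to be the samples within direct_window_ms
    on either side of the strongest peak (an assumed convention); all
    later samples are counted as reverberant energy.
    """
    if not impulse_response:
        raise ValueError("impulse_response must not be empty")
    # Locate the direct-path arrival as the sample with the largest magnitude.
    peak_index = max(range(len(impulse_response)),
                     key=lambda i: abs(impulse_response[i]))
    half_window = int(round(direct_window_ms * 1e-3 * sample_rate_hz))
    start = max(0, peak_index - half_window)
    end = min(len(impulse_response), peak_index + half_window + 1)
    direct_energy = sum(x * x for x in impulse_response[start:end])
    reverberant_energy = sum(x * x for x in impulse_response[end:])
    if reverberant_energy == 0.0:
        return math.inf  # no reverberant tail at all
    return 10.0 * math.log10(direct_energy / reverberant_energy)


if __name__ == "__main__":
    # Toy impulse response: one strong direct spike followed by a weak tail.
    toy_response = [0.0] * 10 + [1.0] + [0.0] * 100 + [0.05] * 200
    print(direct_to_reverberant_ratio_db(toy_response, sample_rate_hz=1000))
```

In a reverberant room, the direct energy falls off as the source moves away while the reverberant energy stays comparatively constant, so a lower ratio suggests a more distant source.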