Multi-talker Speech As A Distraction In Open-plan Offices?

While ease of collaboration may have been a driving principle in open-plan design philosophy, this design has also been directly responsible for increased speech distraction and lack of speech privacy for workers, when compared to walled offices. Speech distraction typically manifests as the wavering of concentration due to speech from neighboring workstations, including telephone conversations, discussions, etc. At the same time, due to the proximity of colleagues, the possibility of having a private or confidential conversation suffers as well.

The effect of these speech-related issues and similar factors has been noticed in the form of lowered work quality and quantity, increased annoyance, a decline in motivation, and health issues for workers. It must be noted here that speech-related (or, more broadly, acoustic) issues have consistently shown a stronger correlation with overall workplace dissatisfaction than any other salient factor of the indoor environment in an office (heating, lighting, furniture, etc.).

Let’s consider a common scenario in open-plan workplaces, where there are conversation(s) happening nearby, and one cannot help their attention ping-ponging between the conversation(s) and the incomplete report that is due in 5 minutes. Psychologists study this kind of scenario as a case of the irrelevant speech effect[1] (ISE), which involves systematic detriment to task performance due to ambient speech. Note that the ambient speech may even be entirely irrelevant to the task at hand, and the ISE exists even when specific instructions about ignoring the irrelevant speech are given. Obviously, if the ambient speech is interesting, it is likely to exacerbate the ISE. In any case, it is easy to see that the ISE can have many more long-term drawbacks than just momentary lapses in concentration.

From another perspective, one might ask how the acoustics can be treated, so that while the ease of collaboration is maintained, speech (relevant or irrelevant) does not travel between neighboring workstations, and the level of other sounds is reduced as well. Firstly, it must be noted that this may not even be the desired solution, as excessively quiet environments may be discomforting in their own way. Besides, some steady-state or other kinds of noise sources might actually be pleasant and/or mask the distracting speech. Nevertheless, there are other, more intricate reasons why treating offices against speech issues is generally not straightforward.

In this regard, the international standard for measuring the acoustics of open-plan offices (ISO 3382-3: 2012) has developed useful metrics that can be employed for practical solutions to some speech-based issues. These metrics include, for example, the distance from a talker beyond which his/her speech will not be intelligible, and hence, substantially less (or not) distracting (called distraction distance; privacy distance is defined similarly). Achieving distraction distances that are less than typical workstation distances (5 meters or less) with current technology, however, would be inadvisable; simply putting walls between workstations would perhaps be cheaper and more foolproof. Hence, solving speech distraction from neighboring workstations, which are within a radius of around 5 meters, is non-trivial from a physical acoustics standpoint. Again, such short distraction distances may not operate well in practice, and instead, the goal would be to achieve a middle ground between excessive privacy and distraction.
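To make the idea of a distraction distance concrete, here is a minimal Python sketch. It assumes the approach commonly described for ISO 3382-3: 2012, in which the speech transmission index (STI) is measured at several distances from a talker, a straight line is fitted to STI against distance, and the distraction and privacy distances are read off where the fitted line falls to STI values of 0.50 and 0.20, respectively. The measurement values below are invented for illustration, and the function names are hypothetical rather than part of any standard or library.

```python
def fit_line(distances_m, sti_values):
    """Ordinary least-squares fit of STI = slope * distance + intercept."""
    n = len(distances_m)
    mean_x = sum(distances_m) / n
    mean_y = sum(sti_values) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_m)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(distances_m, sti_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def distance_at_sti(slope, intercept, sti_threshold):
    """Distance at which the fitted line reaches the given STI threshold."""
    return (sti_threshold - intercept) / slope


# Hypothetical measurements (assumption, not data from the study):
# STI falls steadily as the listener moves away from the talker.
distances_m = [2.0, 4.0, 6.0, 8.0, 10.0]
sti_values = [0.80, 0.70, 0.60, 0.50, 0.40]

slope, intercept = fit_line(distances_m, sti_values)

# Thresholds assumed from ISO 3382-3: 0.50 (distraction), 0.20 (privacy).
distraction_distance = distance_at_sti(slope, intercept, 0.50)
privacy_distance = distance_at_sti(slope, intercept, 0.20)

print("distraction distance (m):", round(distraction_distance, 1))
print("privacy distance (m):", round(privacy_distance, 1))
```

With these made-up values, the fitted line reaches an STI of 0.50 at about 8 meters, which is well beyond the roughly 5-meter spacing of neighboring workstations and illustrates why short distraction distances are hard to achieve.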

However, another issue that is systemic in ISO 3382-3 is that it seems to misrepresent certain crucial aspects of the psychoacoustics (psychology of hearing) of multi-talker speech environments. Specifically, ISO 3382-3 assumes that the scenario of one person talking at any point in time represents the most distracting scenario for workers who need to concentrate, and most of the metrics in ISO 3382-3 are affected by this assumption. However, there are no studies that have explicitly confirmed this assumption. These issues and similar concerns were highlighted in a recent study by Acoustics and Thermal Comfort researchers at The University of Sydney.

Figure 1. Will the scenario on the left be more distracting than the one on the right? The jury is out on this one [3]. Credit: Manuj Yadav

Do Multi-talker Environments Distract More Than Single-talker Environments?

In the study, the researchers pointed out that the only literature cited in ISO 3382-3 regarding the single-talker assumption is, in fact, equivocal on the topic, and states that a multi-talker environment has the potential to distract more (up to a certain number of talkers), especially if the talkers are spatially separated (Fig 1). More generally, multiple studies have shown that spatially separated talkers are less prone to masking one another, a phenomenon known as spatial release from masking (SRM; unmasking of as much as 20 dB has been reported).

Imagine four talkers that are standing very close together, one behind the other. In this almost collocated state, the combined speech from these talkers may sound scrambled or masked by each other, with individual speech streams hard to follow. Now imagine these talkers moving away from one another. Suddenly, listening to and following individual streams becomes much easier, which is the concept behind SRM (although attending to each talker simultaneously may still be hard, or even distracting in its own right).

In laboratory studies, up to 4 spatially separated talkers have been shown to be still quite distracting, compared to when they are collocated. This represents a typical scenario in open-plan spaces such as offices, libraries, schools, etc., where multiple simultaneous conversations may be occurring, and the speech content from nearby talkers is generally quite intelligible. It must be noted here that the degree of distraction may plateau once the number of talkers becomes high enough (this number is not known). With such a large number of talkers, the intelligibility of individual speech streams may be scrambled, or masked by other streams, like the example of the four collocated talkers in the previous paragraph.

In the study reported here, a realistic simulation of an open-plan office, in terms of both its visual and acoustic characteristics, was used. Four loudspeakers that were hidden from the experiment’s participants were used to simulate four talkers at adjoining workstations. Each loudspeaker was calibrated to play speech that was recorded with a high degree of control over the content, to make it sound like a one-sided conversation. This scenario is similar to phone conversations in offices, which workers have reported as the most distracting. Another set of ceiling-mounted loudspeakers was used for the broadband steady-state noise (-5 dB per octave slope).
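The -5 dB per octave slope can be read as follows: each doubling of frequency lowers the level by 5 dB. The short Python sketch below works this out for a set of octave-band center frequencies. The reference level of 40 dB at 125 Hz is a hypothetical value chosen only for illustration; the absolute levels used in the experiment are not given here, and it is an assumption that the slope applies to octave-band levels.

```python
import math


def sloped_level_db(freq_hz, ref_freq_hz=125.0, ref_level_db=40.0,
                    slope_db_per_octave=-5.0):
    """Level at freq_hz for a spectrum with a constant dB-per-octave slope.

    ref_freq_hz and ref_level_db are hypothetical anchor values (assumption).
    """
    octaves_above_ref = math.log2(freq_hz / ref_freq_hz)
    return ref_level_db + slope_db_per_octave * octaves_above_ref


# Octave-band center frequencies from 125 Hz to 4 kHz.
for freq_hz in [125, 250, 500, 1000, 2000, 4000]:
    print(freq_hz, "Hz:", round(sloped_level_db(freq_hz), 1), "dB")
```

Under these assumptions, the level drops by 25 dB over the five octaves between 125 Hz and 4 kHz, giving the noise a low-frequency emphasis.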

This obviously represents a simplification of the more dynamic and unpredictable sound environment of an open-plan office involving not just speech, but other office sounds. However, this experimental design is well-suited to psychoacoustical inquiries where a high degree of control is desired. Some of the artificialness of the experimental design was mitigated by the relatively sophisticated speech environments, difficult cognitive tasks to be performed, and the general look and feel of an open-plan office for the participants.

Based on the considerations above, the aims of the experiment were to find out how the degree of distraction varies with the number of active talkers, and what happens when some broadband noise is added. The numbers of active talkers tested were 0, 1, 2, and 4.

Association Of Performance In Cognitive Tasks With Number Of Active Talkers

The results showed that for some cognitive tasks, performance decreased (with statistical significance) as the number of talkers increased, with performance also varying with participants’ gender and general noise sensitivity. These effects were somewhat reduced by the additional overhead masking noise, but the trend remained. The participants also rated how distracted they felt while doing the cognitive tasks in the sound environments. These ratings showed increased distraction with an increasing number of talkers (statistically significant), regardless of the HVAC noise (Fig 2).

Figure 2. Auditory distraction as a function of the number of simulated talkers and HVAC noise. Credit: Manuj Yadav

These findings contradict the single-talker assumption in ISO 3382-3 and are more aligned with the extant psychoacoustics literature (the right panel in Fig 1 was shown to be more distracting). To further explain these results within the framework of the ISE, the authors proposed that the changing order of intelligible talkers may, in fact, be another facet of the ISE. Within the extended framework, sound environments with 2 or 4 talkers would involve more changes in the talker order over time (with more unpredictability in the order), and hence would be more distracting than the changing order of a single talker.

In other words, with more than one talker, the listener may hear a denser and more segmented speech stream, with the order of speech ‘segments’ [2] changing over time, which effectively increases the degree of distraction for the multi-talker case. Note that the segmentation referred to here (perhaps with increased distraction) is in the three dimensions of space, time, and frequency, where the individual talker’s speech is segmented in time and frequency, and the speech from many talkers combined is segmented due to spatial separation and the individual traits of the talkers (e.g., gender).

While these results do not provide solutions to the problem of speech distraction from nearby workstations, they do emphasize the importance of considering the crucial details of how humans hear and interact with complicated sound environments. Without such considerations, the scope and effect of any acoustic treatment are likely to be less than expected. There are, in fact, auditory models that incorporate the psychoacoustics of complex multi-talker environments, which may be more suited to open-plan offices. These are being considered in our continuing work in this area.

It is widely recognized that a major proportion of the workforce around the world is in open-plan offices of some kind, and the productivity decline and associated issues translate into losses that run into sizable figures (a crude Australian estimate is around 8 billion per year). Recent times have seen a dramatic improvement in the awareness of acoustics issues among office workers, researchers, and, more importantly, the media, property owners, and businesses. This surge likely means better news for all who are involved, but the results of the study reported here suggest mixing caution with the current momentum. More research in both laboratory and field settings is needed to fully appreciate the basic issues in the paradigm of multi-talker speech distraction in open-plan spaces.

These findings are described in the article entitled Auditory distraction in open-plan office environments: The effect of multi-talker acoustics, recently published in the journal Applied Acoustics. This work was conducted by Manuj Yadav, Jungsoo Kim, Densil Cabrera and Richard de Dear from The School of Architecture, Design and Planning, The University of Sydney.


  1. Actually, research has shown that some tasks, specifically those requiring maintenance of order, such as remembering a phone number, exhibit the ISE more clearly. Since memorizing and manipulating numbers, words, and thoughts is quite general, the ISE seems generalizable. But thus far, only certain established tasks have been used to show an ISE in laboratory studies. Also, any sound that has some order and is changing over time can have an effect similar to that of speech, and can affect performance in tasks that require implicit or explicit maintenance and manipulation of order. Hence, the more general term, the irrelevant sound effect, is sometimes used.
  2. A segment of speech can be seen as having a high degree of self-similarity in some property of the speech (like the talker, content, pitch, etc.), and is sufficiently different from other segments in the same property.
  3. London scientists feel the noise. Nature 551, 542–542 (2017)