
Multi-Speaker Transcription Difficulties

Updated: Jun 15, 2022

Unfortunately, during a meeting, members tend to interrupt or speak over the top of one another. Often one speaker is also louder than the rest, so while you can hear that person clearly, you cannot make out anyone else who is talking at the same time. It is already difficult enough to transcribe multiple speakers, but when one is louder, it becomes virtually impossible.

The most common question transcriptionists ask about multiple-speaker audio is, ‘When did speaker one stop speaking and speaker two start speaking?’ That question is best answered by the speakers themselves, the person they were talking to, or whoever was holding the meeting. Remember that while you may know the speakers and find their voices very distinctive, the transcriptionist working on your file has never met or heard them before, and to the transcriptionist they could all sound like the same person.

Ideally, each person would identify themselves before speaking, although this is often not practical.

If you have multiple speakers vying for attention at the same time, it is always a good idea, if possible, to send your transcriptionist a clear recording of each speaker, identifying them. While this will not guarantee the transcriptionist can tell them apart, it will certainly help. Clients should bear in mind that they may have old recordings in which the same speakers are easy to distinguish individually, and these could help the transcriptionist.

Sometimes it is easy to tell the speakers apart: for example, one may be a man and the other a woman. On the other hand, if they are both of the same sex, have a similar tone of voice and come from a similar area (and therefore have the same accent), it can be impossible for someone who doesn’t know them to tell the difference.

Tip: If you are sending an audio file with multiple speakers to be transcribed, try to list the number of speakers and, if possible, their identities.

At London Transcription our expert team of transcriptionists are trained to deal with multi-speaker audio. Before they start work, they listen to the recordings to familiarise themselves with the different voices. Distinguishing between speakers often means the transcriptionist must listen to the audio multiple times in order to separate the different voices, and this is before they start the job of transcribing.

Background noise can make distinguishing between speakers impossible, so when recording, try to eliminate all other sounds, leaving only the voices.

It may be too much to ask, but if you can limit your staff to one speaker at a time, the work becomes easier and quicker, which cuts down on costs for you too.

