


Automatic Voice Identification

Voice identification technology was pioneered in the 1960s and has since undergone intensive research and development to bring it into the mainstream. The availability of cheap computing power has made it a rapidly developing technology. Present laboratory spectrographic and computer voice comparison systems do not produce conclusive results, but meaningful findings are possible with careful analysis of speech samples collected under forensic conditions.

Voice identification systems require speech samples from the subject. The spoken input is compared with a stored sample of the subject's speech, called a voiceprint. A voiceprint is a plot of frequency density vs. time. Early voice identification systems made matches between sets of such plots. If the voiceprint and the spoken sample match, the person is identified.
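The plot-matching idea described above can be illustrated with a minimal sketch. It is not the method of any particular system: the function names (`spectrogram`, `spectral_distance`), the frame length, the hop size and the Euclidean distance measure are all assumptions chosen for illustration, and the discrete Fourier transform is written out directly so that only the Python standard library is needed.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n, used to taper each frame."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame):
    """Magnitudes of the DFT bins from 0 Hz up to half the sampling rate."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples, frame_len=64, hop=32):
    """Frequency content vs. time: one magnitude spectrum per overlapping frame."""
    window = hann(frame_len)
    rows = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        rows.append(magnitude_spectrum(frame))
    return rows


def spectral_distance(spec_a, spec_b):
    """Mean Euclidean distance between corresponding frames (illustrative only)."""
    n = min(len(spec_a), len(spec_b))
    if n == 0:
        raise ValueError("both spectrograms need at least one frame")
    total = 0.0
    for row_a, row_b in zip(spec_a[:n], spec_b[:n]):
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(row_a, row_b)))
    return total / n
```

A small distance between a stored plot and a new one would suggest a match. A real comparison would also have to handle differences in timing, loudness and recording channel, which this sketch ignores.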

Modern voice identification systems, which have lower error rates even in the presence of noise, depend more heavily on a technique known as feature analysis. A feature is an idiosyncratic element of speech, such as a tell-tale transition between phonemes with different pitches. Such features may not be readily heard by humans, but they can be identified through digital analysis.

Voice identification is possible because every person has a unique set of voice characteristics and speech patterns. Voice identification extracts specific and unique features from a person's speech, such as pitch, tone, cadence, harmonic level and vibrations in the larynx, and stores and uses them to differentiate that person's voice from other voices.
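As an illustration of extracting one such feature, the sketch below estimates pitch by autocorrelation, a common textbook approach. It is an assumption for illustration only, not a description of how any specific system works; the function name `estimate_pitch` and the 60 to 400 Hz search range are hypothetical choices.

```python
import math


def estimate_pitch(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency in Hz by autocorrelation.

    Searches lags corresponding to fmin..fmax and returns the frequency
    of the lag at which the signal correlates best with itself, or None
    if no lag gives a positive correlation (for example, on silence).
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(max(min_lag, 1), max_lag + 1):
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None
```

A single pitch value says little on its own. In practice it would be one entry among many in a stored feature set for the speaker.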

An analog sound spectrograph produces excellent voice spectrograms, especially under noisy recording conditions. It is quickly being replaced by specialized spectrographic software.

Specialized spectrogram software produces digitally calculated spectrograms that have been optimized for the speech and forensic communities. This software should be user-friendly and allow the operator to control all the important time and frequency characteristics of the graphic representation.
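One reason operator control matters is that the analysis window length trades time detail against frequency detail. The short sketch below shows that relationship for a plain short-time Fourier analysis. The function name `resolution` and the example values are assumptions for illustration, not settings taken from any particular product.

```python
def resolution(sample_rate, frame_len):
    """Time span and frequency bin spacing of one analysis frame.

    A longer frame separates frequencies more finely but blurs events
    in time; a shorter frame does the opposite.
    """
    return {
        "time_s": frame_len / sample_rate,   # duration covered by one frame
        "freq_hz": sample_rate / frame_len,  # spacing between frequency bins
    }


# Hypothetical comparison of a short and a long window at 8 kHz sampling.
short_window = resolution(8000, 64)
long_window = resolution(8000, 512)
```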

Specialized forensic voice identification algorithms are presently being developed (Nakasone and Beck 2001; Reynolds et al. 2000). When fully developed, this specialized, computer-based software will allow automated and/or operator-assisted voice comparisons between different voice samples.

Editing software allows two or more recorded voice samples to be selectively isolated and combined into a new recording.
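The isolate-and-combine operation can be sketched on raw sample lists as follows. The function names `isolate` and `combine` and the short silent gap between segments are assumptions for illustration; actual editing packages work on audio files and offer many more controls.

```python
def isolate(samples, sample_rate, start_s, end_s):
    """Return the samples between start_s and end_s (in seconds)."""
    start = int(round(start_s * sample_rate))
    end = int(round(end_s * sample_rate))
    return samples[start:end]


def combine(segments, sample_rate, gap_s=0.25):
    """Join segments into one recording, separated by a short silence."""
    gap = [0] * int(round(gap_s * sample_rate))
    out = []
    for index, segment in enumerate(segments):
        if index > 0:
            out.extend(gap)  # silence between consecutive segments
        out.extend(segment)
    return out
```

Placing the isolated passages back to back in one recording lets an examiner hear them in quick succession.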

A headphone-switching box allows rapid toggling between two input signals containing separate voice samples for aural comparison.


