Frequency detection – DTMF fractions removal

Content

1 Example input
2 Example output
3 Background
4 DTMF remover
4.1 Libmusic
4.2 Filtering
5 Performance
6 Pros & cons

1 Example input


2 Example output


3 Background

In VoIP telephony, DTMF tones sent in-band are usually detected with the Goertzel algorithm. This is the industry standard and has proved to work, but it cannot reliably detect DTMF in blocks shorter than about 100 samples, and it does not work at all for blocks shorter than 40 samples. The limitation follows from the direct relationship between the frequency resolution of the Goertzel algorithm and the number of samples in a block. For a block of length N and a sampling rate Fs, Goertzel divides the spectrum into N bins, each Fs/N wide. If N is small, several DTMF frequencies fall into the same bin and the detector cannot tell them apart. Resolution can be increased only by increasing N, which means buffering more data. If DTMF arrives in fractions shorter than 12.75 ms, a Goertzel-based detector will miss them.
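The effect of block length can be illustrated with a minimal sketch of the textbook Goertzel recurrence. It is not taken from any particular detector: the function names `goertzel_power` and `dtmf_one` are hypothetical, the test signal is a clean, noise-free digit "1", and each target frequency is assumed to be mapped to its nearest integer bin.

```python
import math

def goertzel_power(samples, target_hz, fs):
    """Squared magnitude of the DFT bin nearest to target_hz (Goertzel recurrence)."""
    n = len(samples)
    k = round(n * target_hz / fs)              # nearest integer bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2       # second-order recurrence
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def dtmf_one(n, fs=8000):
    """n samples of a clean DTMF digit '1' (697 Hz + 1209 Hz)."""
    return [math.sin(2 * math.pi * 697 * t / fs) +
            math.sin(2 * math.pi * 1209 * t / fs) for t in range(n)]

if __name__ == "__main__":
    low_group = (697, 770, 852, 941)           # DTMF row frequencies
    for n in (32, 102):
        block = dtmf_one(n)
        powers = {f: goertzel_power(block, f, 8000) for f in low_group}
        print(n, {f: round(p, 1) for f, p in powers.items()})
```

With 32 samples, 697 Hz, 770 Hz and 852 Hz all round to the same bin, so the detector returns identical values for them. With 102 samples each frequency has its own bin and the 697 Hz bin stands out.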

Of course, the DTMF standard requires tones to last much longer than 12.75 ms (ITU-T Q.24 requires a minimum of 40 milliseconds), but in practice a media processing server may receive DTMF that has already been partially removed, leaving fractions 5 to 10 milliseconds long. Such software may then need to remove all the leftovers but, as pointed out above, the standard Goertzel procedure does not work in this case.

This inability of the Goertzel algorithm to detect DTMF fractions is visible in the following pictures, which show the result of running a MATLAB implementation of the Goertzel algorithm on audio data sampled at 8000 Hz and containing DTMF digit “1”. Each picture shows the result for a different data block length, from 32 to 105 samples. DTMF digit “1” is composed of the 697 Hz and 1209 Hz frequencies.

All windows shorter than 4 ms produce bad results. First, the values returned by Goertzel are very similar for all bins, while they should peak at the 697 Hz bin and the 1209 Hz bin. Second, the resolution in the frequency domain is too low: multiple DTMF frequencies land in the same bin (k number), because a shorter window of N samples produces fewer bins (N), each 8000/N Hz wide. The shortest window that could be considered at all (with poor results) is 5 ms (40 samples, 40 bins, each 200 Hz wide).

12.75 ms of data is 102 samples, giving 102 bins of 78.43 Hz each. This block length is used by many DTMF detectors (e.g. spandsp). A window of about 12 ms is the minimum that gives adequate resolution in the frequency domain, which is why Goertzel implementations usually choose 12.75 ms as the best trade-off between robustness and minimum length.
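The bin arithmetic above can be checked directly. The sketch below assumes that each of the eight DTMF frequencies is assigned to its nearest bin, k = round(f·N/Fs), at Fs = 8000 Hz. The names `DTMF_HZ` and `bin_map` are illustrative only.

```python
DTMF_HZ = (697, 770, 852, 941, 1209, 1336, 1477, 1633)

def bin_map(n, fs=8000):
    """Map every DTMF frequency to its nearest DFT bin for a block of n samples."""
    return {f: round(f * n / fs) for f in DTMF_HZ}

if __name__ == "__main__":
    for n in (32, 40, 102):
        bins = bin_map(n)
        distinct = len(set(bins.values()))
        print(f"N={n:3d}  bin width={8000 / n:6.2f} Hz  "
              f"distinct bins={distinct}/8  {bins}")
```

Under these assumptions only the 102-sample block gives each of the eight DTMF frequencies its own bin. At 32 and 40 samples several frequencies share a bin, for example 770 Hz and 852 Hz.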

4 DTMF remover

4.1 Libmusic

Data And Signal created libmusic, an implementation of the MUSIC algorithm for super-resolution frequency detection. MUSIC decomposes the input signal into a signal subspace and a noise subspace using the singular value decomposition (SVD). When the number of components is known in advance, this outperforms simple frequency detection methods, such as picking peaks of a DFT spectrum, in the presence of noise, because it uses that number to separate out and ignore the noise. Also, unlike the DFT, it can estimate frequencies with a resolution finer than one DFT bin, because its estimation function can be evaluated at any frequency, not only at the DFT bin centres. This is a form of super-resolution in signal processing.
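The idea can be sketched in a few lines of NumPy. This is a generic textbook MUSIC estimator, not libmusic's code or API. The function name `music_frequencies`, the window length of 8, the 1 Hz search grid and the noise-free test signal are all assumptions made for illustration.

```python
import numpy as np

def music_frequencies(x, n_tones, fs, window=8, grid_hz=None):
    """Estimate n_tones real sinusoid frequencies in x from the MUSIC pseudospectrum."""
    x = np.asarray(x, dtype=float)
    # Data matrix: each row is a sliding window of `window` consecutive samples.
    rows = len(x) - window + 1
    data = np.array([x[i:i + window] for i in range(rows)])
    # Rows of vh are right singular vectors, ordered by decreasing singular value.
    _, _, vh = np.linalg.svd(data)
    # A real sinusoid is two complex exponentials, so the signal subspace has
    # dimension 2 * n_tones; the remaining vectors span the noise subspace.
    noise = vh[2 * n_tones:]
    if grid_hz is None:
        grid_hz = np.arange(1.0, fs / 2.0, 1.0)
    # Steering vectors a(f)[m] = exp(j*2*pi*f*m/fs), one column per grid frequency.
    m = np.arange(window)[:, None]
    steering = np.exp(2j * np.pi * grid_hz[None, :] * m / fs)
    # The pseudospectrum is large where a(f) is orthogonal to the noise subspace.
    proj = np.sum(np.abs(noise @ steering) ** 2, axis=0)
    pseudo = 1.0 / (proj + 1e-300)
    # Keep the n_tones highest local maxima.
    is_peak = (pseudo[1:-1] > pseudo[:-2]) & (pseudo[1:-1] >= pseudo[2:])
    peak_idx = np.flatnonzero(is_peak) + 1
    best = peak_idx[np.argsort(pseudo[peak_idx])[-n_tones:]]
    return np.sort(grid_hz[best])

if __name__ == "__main__":
    fs = 8000
    t = np.arange(16)                          # 16 samples = 2 ms at 8 kHz
    x = np.sin(2 * np.pi * 697 * t / fs) + np.sin(2 * np.pi * 1209 * t / fs)
    print(music_frequencies(x, n_tones=2, fs=fs))
```

The search grid is independent of the block length, which is why the estimate is not tied to DFT bin centres. With real, noisy audio the peaks shift and a production detector needs additional checks that this sketch omits.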

Because the number of components in a DTMF signal is known (two tones), libmusic outperforms DFT and Goertzel. Our fast frequency detector is a FreeSWITCH module built on libmusic that can detect frequencies in data frames as short as 2 milliseconds. This gives a dramatic improvement in quality (very accurate and sensitive) and speed (decisions are made quickly) over the standard Goertzel algorithm commonly used to detect DTMF tones. Most importantly, it enables DTMF fractions to be removed.

4.2 Filtering

The DTMF remover applies a low-pass filter with a cutoff frequency of 500 Hz before samples are passed to libmusic for detection. This and other optimizations (audio level and variance ratio tests) keep CPU usage low, at about 2 to 4 % for a single detection session.

The final result is a high-quality, fast and reliable DTMF detector/remover whose performance cannot be matched by software running the standard Goertzel algorithm. Thanks to the linear algebra techniques it implements, the detector does not need as many samples as the Goertzel algorithm and makes much more accurate decisions. It detects 100% of DTMF tones (including DTMF fractions as short as 2 milliseconds) and gives no false positives.


5 Performance

CPU usage is given in % of total CPU power (*). "Max" is an instantaneous spike.

| Audio stream description | Expected | CPU max [%] | CPU average [%] |
|---|---|---|---|
| Speech: “Hello… This is a DTMF test. 1 [#dtmf 1], 2 [#dtmf 2], …, # [#dtmf #] Thank you. We wish you all the best. Have a good day, goodbye.” | Speech left intact. DTMF removed. | 4.9 | 2.0 |
| Speech with sine tones in a wave: “Hello… This is a DTMF test [#sin 444]. 1 [#dtmf 1] [#sin 1000], 2 [#dtmf 2] [#sin 1200], …, # [#dtmf #] [#sin 3000] Thank you [#sin 300]. We wish you[#sin 200] all the best [#sin 100]. have a good day, goodbye.” WARNING: Contains high frequencies. May damage your ears. Please play with care. | Speech left intact. Sine tones left intact. DTMF removed. | 15.3** | 6.0** |
| Speech with DTMF fractions of 10-13 ms length: “Hello… This is a DTMF test. 1 [#dtmf 1], 2 [#dtmf 2], …, # [#dtmf #] Thank you. We wish you all the best. Have a good day, goodbye.” | Speech left intact. DTMF fractions removed. | 8.2 | 2.83 |
| Speech with DTMF fractions of 5-6 ms length: “Hello… This is a DTMF test. 1 [#dtmf 1], 2 [#dtmf 2], …, # [#dtmf #] Thank you. We wish you all the best. Have a good day, goodbye.” | Speech left intact. DTMF fractions removed. | 8.5 | 1.92 |
| The music, 4:59 minutes of “D Lete Funk K – Runaway Train” cover. | Nothing removed, very low usage of CPU. | 8.6 | 2.07 |
| The human speech, president Donald Trump. Also some noise present in the file, multiple people shouting, applause, etc. 6 min 25 sec. | Nothing removed, very low usage of CPU. | 9.5 | 6.85 |

* – Measured on a machine running Debian 8 Linux with an Intel Core i7-4790K (4 cores at 4.00 GHz base frequency, 4.40 GHz Max Turbo, 8 threads, 8 MB cache)
** – In this test CPU usage is affected by processing of artificially generated pure sine tones

Data And Signal’s frequency detector is available as a dynamically loadable module integrated with FreeSWITCH as DTMF detector and DTMF remover. Please contact us should you have any questions about this product.

6 Pros & cons

Advantages
1. Fast. The detector works on fractions as short as 2 ms; it needs only 16 samples to accurately detect DTMF frequencies.
2. Minimal buffering. It needs only 16 samples, compared to 102 for the Goertzel algorithm.
3. Very accurate. The detector complies with CCITT Q.24 standards. Although easier to implement, neither the Goertzel nor the NDFET DTMF decoders can satisfy the standards, due to poor frequency resolution and high SNR requirements (50 dB), respectively.

Drawbacks
1. More CPU intensive than Goertzel, but still well within limits (2 to 4 % for a single detection session) thanks to low-pass filtering and other optimization techniques.


Please contact us if you have any questions. Call us on 1444 415 862 or send us an email.
