
One big issue with gathering this kind of data over the phone is the frequency cutoff on voice-only lines—above a certain frequency (I want to say 4kHz? maybe I'm misremembering), the information is lost. It's basically as if you took a Fourier transform, zeroed out everything above the threshold, and then transformed back.[0] For humans (and even computers) trying to interpret the sound as language, that's not a huge problem, although you might lose some of the higher formants. But for an acoustic analysis that's trying to do voiceprinting—in this case to detect Parkinson's—this could be a big problem.
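The "transform, zero out, transform back" picture can be sketched in a few lines. This is only an illustration of an idealized brick-wall filter, not how telephone equipment actually does it; the 16 kHz sample rate, the two test tones, and the helper names are all assumptions made for the example, and the 4 kHz cutoff is just the guess from the comment.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (fine for short example signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def brickwall_lowpass(samples, sample_rate, cutoff_hz):
    """Zero every frequency bin above cutoff_hz, then transform back."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins above n/2 hold the negative frequencies, so fold them back.
        freq = min(k, n - k) * sample_rate / n
        if freq > cutoff_hz:
            spectrum[k] = 0
    return [v.real for v in idft(spectrum)]

def tone_amplitude(samples, sample_rate, freq_hz):
    """Amplitude of the sine component at freq_hz (exact for whole cycles)."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * cmath.pi * freq_hz * t / sample_rate)
              for t, s in enumerate(samples))
    return 2 * abs(acc) / n

# Hypothetical test signal: a 1 kHz tone plus a quieter 6 kHz tone.
fs = 16000
n = 160  # 10 ms, a whole number of cycles for both tones
signal = [math.sin(2 * math.pi * 1000 * t / fs)
          + 0.5 * math.sin(2 * math.pi * 6000 * t / fs) for t in range(n)]

filtered = brickwall_lowpass(signal, fs, 4000)
print(tone_amplitude(signal, fs, 6000), tone_amplitude(filtered, fs, 6000))
```

The 1 kHz tone comes through untouched while the 6 kHz tone disappears entirely, which is the kind of loss that matters more for acoustic analysis than for understanding speech.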

(I'm also irritated by the glib "99 percent success rate" but I just ranted about that on a different HN post so I won't go into it here.)

[0] Why do this? So the phone company can compress and send a lot more data over the same amount of internal bandwidth. Come to think of it, it's kind of related to how wavelet-based compression works.



8 bits, 8 kHz. There's only so much you can do with that.

http://en.wikipedia.org/wiki/DS0

Sure, we can do more now, but this isn't exactly new. Consider the age in which it was conceived and it may seem more reasonable.
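To put numbers on "8 bits, 8 kHz", here is the arithmetic as a tiny sketch. The variable names are made up for the example; the figures are the ones quoted above.

```python
BITS_PER_SAMPLE = 8
SAMPLE_RATE_HZ = 8000

# One DS0 channel: 8 bits per sample, 8000 samples per second.
bit_rate = BITS_PER_SAMPLE * SAMPLE_RATE_HZ   # bits per second
nyquist_hz = SAMPLE_RATE_HZ / 2               # highest representable frequency
levels = 2 ** BITS_PER_SAMPLE                 # distinct amplitude values per sample

print(bit_rate, nyquist_hz, levels)
```

That is 64 kbit/s per channel, a 4 kHz ceiling on frequency content, and 256 amplitude levels per sample.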


The sampling frequency doesn't directly affect the audio frequencies it can encode. Telephones do PCM encoding (meaning the data represents the graph of the sound wave) at 8 kHz. Following the Nyquist sampling theorem (halve your rate), this allows frequencies up to 4 kHz (as you said). It's not a hard cutoff, though: you can still get most of the sounds above that pitch, they'll just sound pretty weird (as if you were talking on the telephone!)


No, components above the Nyquist frequency are filtered out before the ADC because you'd otherwise get aliasing.

https://en.wikipedia.org/wiki/Aliasing
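A quick way to see why the filter has to come first: sampled at 8 kHz, a 5 kHz tone produces exactly the same samples as a (sign-flipped) 3 kHz tone, so once it is digitized there is no telling them apart. A minimal sketch, with the tone frequencies and function names chosen only for illustration:

```python
import math

def sample_tone(freq_hz, sample_rate, n):
    """Sample a unit sine wave with no anti-aliasing filter in front."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]

def alias_frequency(freq_hz, sample_rate):
    """Frequency in [0, sample_rate / 2] that freq_hz shows up as after sampling."""
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)

fs = 8000
above_nyquist = sample_tone(5000, fs, 64)   # 5 kHz is above the 4 kHz Nyquist frequency
alias = sample_tone(3000, fs, 64)           # where it folds back to

# The two sample sequences differ only in sign, so their sum is ~0 everywhere.
max_diff = max(abs(a + b) for a, b in zip(above_nyquist, alias))
print(alias_frequency(5000, fs), max_diff)
```

Without the filter, content above 4 kHz would not survive sounding "weird"; it would land on top of legitimate content below 4 kHz.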


> So the phone company can compress and send a lot more data

Actually, the limits date back to analog phone lines with circuit switching of a century ago. Back then there was a wire going through switches from one phone to another. The quality requirements were that those lines had to pass 300 Hz to 3.4 kHz or so. That often required "pupinization" - adding inductive coils to tune the line's frequency response.

If you look at a spectrogram of your voice, there's very little power above 1.5 or 2 kHz. However, it seems the high frequency part is important to understandability, including perception of emotional overtones.

(Just the other day, playing with modems, we found a weird case - voice being pumped through before the call was considered completed - which I suspect is the persistence in digital protocols of the analog behavior of a century ago.)



