Title: Blood glucose monitoring from voice signal, 5 years later
Abstract:
Non-invasive glucose monitoring is a notoriously difficult problem, with an unfortunate history of several biomarkers that were developed and then withdrawn. Vocal biomarkers are non-intrusive and cost-effective, and are therefore highly desirable. Several systems have been developed for both diabetes diagnostics [2] and blood glucose (BG) monitoring [1]. The former is possible because the associated diseases cause permanent damage to the systems involved in voice production, and consequently the pathological trace is detectable from voice. Regarding the latter, there are two explanations of why voice reflects BG swings:
• a change in BG concentration alters the elastic properties of the tissue of the larynx and vocal cords, which in turn changes the spectral characteristics of the voice, and
• hypoglycemia is often accompanied by anxiety, whereas hyperglycemia is accompanied by lethargy, and emotional states are known to be detectable from voice [4].
The statistical association between voice and BG has been repeatedly confirmed in the literature; as a next step, effective detection has been pursued. I will review the architectures of the successful systems:
• several generations of feature extraction software for the voice signal (Praat, openEAR, convolutional NNs), and
• diverse classification functions (from linear regression to deep neural networks).
The superiority of some newer architectures over the older ones has been claimed, but these claims are doubtful in the light of information leakage and of cheating on, or ignorance of, the state of the art. These defects were not well understood when those research articles were published. I will review the pitfalls of working with glucose and voice, including the sources of nontrivial information leakage. Non-invasive glucose biomarkers based on sources other than voice have been reviewed elsewhere, for example in [3]. For future work, a deep neural architecture is recommended, together with a number of useful practices taken from bioinformatics analysis for transcriptomics, such as integrating a survival analysis framework [5], including {age, sex} features, avoiding information leakage if transferred weights are used, and considering influential deep architectures in the light of the problem.
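
To make the information leakage concern concrete, here is a minimal sketch of one familiar leakage source in voice tasks: recordings of the same speaker appearing in both the training and the test set. The abstract does not say which leakage sources the talk covers, so this choice is an assumption made for illustration. The function name split_by_speaker, its parameters, and the toy speaker labels are hypothetical.

```python
import random
from collections import defaultdict


def split_by_speaker(speaker_ids, test_fraction=0.25, seed=0):
    """Split recording indices so that no speaker is in both train and test.

    speaker_ids[i] is the (hypothetical) speaker label of recording i.
    Whole speakers, not single recordings, are assigned to a side.
    """
    by_speaker = defaultdict(list)
    for idx, speaker in enumerate(speaker_ids):
        by_speaker[speaker].append(idx)

    # Shuffle speakers reproducibly, then hold out a fraction of them.
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))

    test_idx = [i for s in speakers[:n_test] for i in by_speaker[s]]
    train_idx = [i for s in speakers[n_test:] for i in by_speaker[s]]
    return sorted(train_idx), sorted(test_idx)


# Toy data (assumed): four speakers with two recordings each.
speaker_ids = ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4"]
train_idx, test_idx = split_by_speaker(speaker_ids)

train_speakers = {speaker_ids[i] for i in train_idx}
test_speakers = {speaker_ids[i] for i in test_idx}
# The intersection is empty by construction of the split.
print(train_speakers & test_speakers)
```

A per-recording random split would instead let a model recognise the speaker rather than the glucose level, which can inflate reported accuracy. The same grouping idea would apply to any data used to pretrain weights that are later transferred.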

