With the growing popularity of voice assistants, it has become important for the programmers who develop them to separate speech from noise accurately. We split each signal into short frames, which can be treated as approximately stationary, and then applied the discrete Fourier transform (DFT) to each frame to obtain its frequency-domain representation. The DFT is useful here because it decomposes a signal into its individual frequency components. We then applied principal component analysis (PCA) to the stream of DFT frames to project the high-dimensional data onto fewer dimensions, which mitigates the curse of dimensionality, i.e. the amount of data needed to model a space grows rapidly with its number of dimensions. Finally, we fit a Gaussian mixture model (GMM) using expectation maximization (EM) to separate speech from noise.


Marie Roch (Speech Processing Professor at SDSU)