Audio Denoiser: A Speech Enhancement Deep Learning Model

tsram 01 Mar, 2024

8 min read

Introduction

With many of us working or attending school or college virtually, online meetings are routine, and we can’t expect our surroundings to always be quiet. Some of us live in noisy environments with vehicle horns or other people’s voices in the background, and sometimes the earphones themselves are at fault, all of which is unpleasant for the listener at the other end. Wouldn’t it be better to remove the noise at the sender’s end using a deep learning model? Let’s take a look. This article explores audio denoising and how a deep learning model can enhance speech quality by eliminating unwanted noise.

This article was published as a part of the Data Science Blogathon.

What is Audio Denoising?
A Brief History of Different Denoising Methods
Real-Time Speech Enhancement in Waveform Domain – Facebook Denoiser
Setting up the FB Denoiser Model
Frequently Asked Questions

What is Audio Denoising?

The definition is simple and self-explanatory:

Audio denoising is the process of removing noise from speech without affecting the quality of the speech.

Here, noise means any audio segment that is unwanted by the human listener, such as vehicle horns, wind noise, or even static noise.

It is also known as speech enhancement as it enhances the quality of speech. Speech enhancement is an important task and it is used as a preprocessing step in various applications such as audio/video calls, hearing aids, Automatic Speech Recognition (ASR), and speaker recognition. We’ll see how to remove the noises in an audio signal in the rest of this article.

A Brief History of Different Denoising Methods

In this section, let’s summarize the different audio denoising techniques in use:

Spectral Subtraction
Wiener Filter
Spectral Gating
Deep Learning-based models

Spectral Subtraction

The development of speech enhancement methods traces back to 1979, when S. Boll proposed a noise suppression method based on spectral subtraction. But what is spectral subtraction? Let’s understand this.

The first step is to convert the audio signal to the frequency domain. This is done with one of the most influential algorithms in digital signal processing, the Fast Fourier Transform (FFT), or with a variation such as the Short-Time Fourier Transform (STFT), which extracts both time- and frequency-related features. The frequency components of the noise are then simply subtracted from those of the noisy audio to get cleaned/enhanced speech, hence the name spectral subtraction.

But spectral subtraction comes with two major shortcomings:

We have to provide a sample of the noise in the audio signal in order to remove it, which is often impractical.
The noise should be present throughout the entire audio. So, this kind of method doesn’t work well for audio signals with rare noises like car horns, making it ineffective for many real-world applications.
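The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than Boll's exact algorithm: the frame size, hop size, the periodic Hann window, and the synthetic tone-plus-white-noise signal are all assumptions made for the demo, and the function names are hypothetical.

```python
import numpy as np

def stft(x, frame=512, hop=256):
    """Window overlapping frames with a periodic Hann window and FFT each one."""
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame) / frame)
    frames = np.lib.stride_tricks.sliding_window_view(x, frame)[::hop] * window
    return np.fft.rfft(frames, axis=1)

def istft(spec, frame=512, hop=256):
    """Overlap-add the inverse FFTs; periodic Hann windows at 50% overlap sum to one."""
    frames = np.fft.irfft(spec, n=frame, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + frame)
    for i, chunk in enumerate(frames):
        out[i * hop:i * hop + frame] += chunk
    return out

def spectral_subtraction(noisy, noise_sample, frame=512, hop=256):
    noisy_spec = stft(noisy, frame, hop)
    # Average noise magnitude per frequency bin, estimated from the noise-only sample.
    noise_mag = np.abs(stft(noise_sample, frame, hop)).mean(axis=0)
    # Subtract the noise spectrum and clip negative magnitudes to zero.
    clean_mag = np.maximum(np.abs(noisy_spec) - noise_mag, 0.0)
    # Reuse the noisy phase, since only the magnitudes were estimated.
    return istft(clean_mag * np.exp(1j * np.angle(noisy_spec)), frame, hop)

# Synthetic demo (assumed data): a 440 Hz tone buried in white noise.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
clean = 0.5 * np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.1 * rng.standard_normal(sr)
noise_sample = 0.1 * rng.standard_normal(sr)   # separate recording of the noise alone
enhanced = spectral_subtraction(noisy, noise_sample)
```

Note that the sketch needs `noise_sample` as an explicit input and assumes the noise statistics stay the same for the whole recording, which are exactly the two shortcomings listed above.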

Wiener Filter

The next one is Wiener filtering, an industry standard for audio denoising that is widely used in hearing aids, smartphones, and communication devices. This filter also requires both the noisy speech and a sample of the noise present in it. From these, the Wiener filter computes a statistical estimate of the clean speech that minimizes the mean squared error under certain assumptions about the noise.

This requirement is easy to meet in smartphones, which can have two microphones: one records our speech along with the noise and the other records mostly the noise. (In your smartphone, the mic at the bottom records speech and the mic at the top records noise.) So, we’re using some kind of Wiener filter every day without even realizing it. Amazing, isn’t it?
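As a rough illustration, the frequency-domain form of the Wiener filter scales each frequency bin by the gain S / (S + N), where S and N are the estimated speech and noise powers. The sketch below makes several simplifying assumptions: the whole signal is treated as a single frame, the noise is white so one average power value is used for every bin, and `noise_ref` stands in for the hypothetical second microphone. The helper name `wiener_gain` is also hypothetical.

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, eps=1e-12):
    """Per-bin gain S / (S + N), with the speech power S estimated as noisy minus noise."""
    speech_power = np.maximum(noisy_power - noise_power, 0.0)
    return speech_power / (speech_power + noise_power + eps)

# Synthetic demo (assumed data): a 440 Hz tone in white noise, plus a noise-only reference.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
clean = 0.5 * np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.1 * rng.standard_normal(sr)     # primary microphone: speech + noise
noise_ref = 0.1 * rng.standard_normal(sr)         # reference microphone: noise only

noisy_spec = np.fft.rfft(noisy)
# White-noise assumption: use the average noise power for every frequency bin.
noise_power = np.full(noisy_spec.shape, np.mean(np.abs(np.fft.rfft(noise_ref)) ** 2))
gain = wiener_gain(np.abs(noisy_spec) ** 2, noise_power)
enhanced = np.fft.irfft(gain * noisy_spec, n=len(noisy))
```

Bins dominated by speech get a gain close to 1 and bins dominated by noise get a gain close to 0, so the filter attenuates rather than subtracts.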

Spectral Gating

[Figure: audio waveform of a noisy speech signal and of the enhanced speech]

These results were generated with the noisereduce Python module, which uses spectral gating under the hood, another traditional method. Spectral gating estimates a noise threshold for each frequency band and attenuates the parts of the signal that fall below it. Traditional noise filters of this kind work well on static noise but not on rare noises, which is one of the reasons for developing deep learning models for speech enhancement.
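The gating idea can be illustrated on magnitude spectrograms directly. The following is a simplified sketch and not the noisereduce implementation: the function name, the `n_std` parameter, and the toy spectrograms are assumptions for illustration, and a practical gate would usually also smooth the mask to avoid audible artifacts.

```python
import numpy as np

def spectral_gate(noisy_mag, noise_mag, n_std=1.5):
    """Zero out time-frequency bins that do not rise above a per-frequency noise threshold.

    Both inputs are magnitude spectrograms shaped (frames, frequency_bins).
    """
    # Threshold per frequency bin: mean noise level plus a margin of n_std deviations.
    threshold = noise_mag.mean(axis=0) + n_std * noise_mag.std(axis=0)
    mask = noisy_mag > threshold          # boolean gate, broadcast over frames
    return noisy_mag * mask, mask

# Toy spectrograms (assumed data): 4 bins, background noise everywhere, speech in bin 2 only.
rng = np.random.default_rng(0)
noise_mag = np.abs(rng.normal(0.0, 0.1, size=(50, 4)))
noisy_mag = np.abs(rng.normal(0.0, 0.1, size=(50, 4)))
noisy_mag[:, 2] += 5.0
gated, mask = spectral_gate(noisy_mag, noise_mag)
```

Because the threshold is learned from stationary noise statistics, a sudden loud event such as a car horn passes straight through the gate, which matches the limitation described above.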

Deep Learning Model

Deep Learning models are popular these days because of their ability to learn and solve a task end to end, without the hassle of feature engineering. This includes audio denoising too, and there are some good models that even work in real time! Deep Learning models for audio denoising can be divided into two categories: mask-based and mapping-based.

Mask-based models compute masks (binary or soft-valued arrays) in the time-frequency domain from the input noisy speech to attenuate the noise in the signal. Mapping-based methods, on the other hand, aim to produce the cleaned speech directly from the noisy speech, given plenty of paired noisy and clean speech (plenty of training data!).

Next, let’s discuss Facebook Denoiser, one of the State Of The Art (SOTA) models for speech enhancement!

Real-Time Speech Enhancement in Waveform Domain – Facebook Denoiser

This model was proposed by Alexandre Défossez et al. at Facebook AI Research (FAIR) in 2020. The speciality of the model is that it can run in real time on a laptop CPU. Bear in mind that the architecture was kept as simple as possible so that it can work in real-time systems. I’ll show how to use this model in real time at the end of this article.

The model has an encoder-decoder U-Net architecture with skip-connections and a sequence modelling network applied to the encoder’s output. But why does the encoder-decoder part have skip-connections? They pass the encoder’s fine-grained features directly to the corresponding decoder layers, which helps the decoder reconstruct fine detail. You can also refer to this article to learn more about skip-connections. Below is the overall architecture of the model.

[Figure: The architecture of Facebook Denoiser]

Each layer in the encoder has a 1D convolution followed by a ReLU activation. This is followed by a 1×1 convolution that doubles the number of channels and a GLU (Gated Linear Unit) activation, which halves them again. Each layer in the decoder has a similar structure, except that it uses a 1D transposed convolution (deconvolution) in place of the convolution layer.
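To see why the 1×1 convolution doubles the channel count, it helps to look at what GLU does: it splits its input channels into two halves and uses the sigmoid of one half to gate the other, so the output has half as many channels as the input. The NumPy sketch below illustrates only this channel bookkeeping. The sizes, the random weights, and the helper names are hypothetical, and this is not the authors' PyTorch code.

```python
import numpy as np

def glu(x, axis=0):
    """Gated Linear Unit: split channels in half and gate one half by the sigmoid of the other."""
    a, b = np.split(x, 2, axis=axis)
    return a * (1.0 / (1.0 + np.exp(-b)))

def encoder_tail(features, weight):
    """1x1 convolution (a per-time-step matrix multiply) to 2H channels, then GLU back to H."""
    doubled = weight @ features      # (2H, H) @ (H, T) -> (2H, T)
    return glu(doubled, axis=0)      # (2H, T) -> (H, T)

rng = np.random.default_rng(0)
hidden, steps = 4, 10                # hypothetical sizes, chosen only for the demo
# Stand-in for the ReLU output of the preceding 1D convolution.
features = np.maximum(rng.standard_normal((hidden, steps)), 0.0)
weight = rng.standard_normal((2 * hidden, hidden))   # 1x1 conv kernel, bias omitted
out = encoder_tail(features, weight)
print(features.shape, "->", out.shape)
```

The gate lets the network learn, per channel and per time step, how much of each feature to let through.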

The sequence modelling layer in the middle of the architecture can be either a unidirectional LSTM (for the causal, real-time model) or a bidirectional LSTM (for the non-causal model). The architecture used in the paper is an adaptation of DEMUCS. Pretty simple, right?

Unlike most models in audio processing, this one works on raw waveforms in the time domain, hence the name speech enhancement in the waveform domain. But the model is optimized in both the time and frequency domains. Here is how.

The authors use an L1 loss over the waveform, which is the absolute difference between the model’s enhanced output and the clean reference audio (this works in the time domain). They also compute the Short-Time Fourier Transform (STFT) of the enhanced output and the clean audio and derive a loss from the STFT magnitudes (this works in the time-frequency domain).
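A combined objective of this kind can be sketched as follows. The STFT term here is a spectral convergence term plus a log-magnitude L1 term, averaged over a few STFT resolutions. The particular resolutions, the equal weighting of the two parts, the `eps` constant, and the function names are assumptions for illustration, so treat this as a sketch of the idea rather than the paper's exact loss.

```python
import numpy as np

def stft_mag(x, frame=512, hop=128):
    """Magnitude STFT using a Hann window over overlapping frames."""
    window = np.hanning(frame)
    frames = np.lib.stride_tricks.sliding_window_view(x, frame)[::hop] * window
    return np.abs(np.fft.rfft(frames, axis=1))

def l1_waveform_loss(estimate, clean):
    """Time-domain term: mean absolute difference between the two waveforms."""
    return np.mean(np.abs(estimate - clean))

def stft_loss(estimate, clean, frame=512, hop=128, eps=1e-8):
    """Time-frequency term: spectral convergence plus log-magnitude L1."""
    est_mag = stft_mag(estimate, frame, hop)
    ref_mag = stft_mag(clean, frame, hop)
    spectral_convergence = np.linalg.norm(ref_mag - est_mag) / (np.linalg.norm(ref_mag) + eps)
    log_magnitude = np.mean(np.abs(np.log(ref_mag + eps) - np.log(est_mag + eps)))
    return spectral_convergence + log_magnitude

def total_loss(estimate, clean, resolutions=((512, 128), (1024, 256), (2048, 512))):
    """Waveform L1 plus the STFT loss averaged over several (frame, hop) resolutions."""
    spectral = sum(stft_loss(estimate, clean, f, h) for f, h in resolutions) / len(resolutions)
    return l1_waveform_loss(estimate, clean) + spectral

# Synthetic demo (assumed data): the noisy input stands in for a poor model output.
rng = np.random.default_rng(0)
sr = 16000
clean = 0.5 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
poor_estimate = clean + 0.1 * rng.standard_normal(sr)
print(total_loss(clean, clean), total_loss(poor_estimate, clean))
```

A perfect estimate gives a loss of zero in both terms, and any deviation in either the waveform or the spectrogram raises it.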

The model works really well in practice; the authors claim that it can be considered one of the SOTA models for speech enhancement. The paper reports its results using several evaluation metrics.

I think it is enough to know the scale of each evaluation metric used in the paper rather than the complex theory behind them. PESQ ranges from 0.5 to 4.5, STOI from 0 to 100, and the MOS measures from 1 to 5. In each case, higher is better. Let’s set up and run this model on our local system. I’ll guide you step by step.

Setting up the FB Denoiser Model

First of all, we need to install the denoiser module from PyPI. If you’re on a Linux distro or macOS, you’re in luck! The authors don’t provide official support for other OSes like Windows as of now.

This is the GitHub repository of the denoiser model we’re going to use. If you run into any problems during the installation, feel free to leave a comment below.

Step 1: Installing the Denoiser Module

The model is available on PyPI, so a simple “pip install denoiser” will work. If you don’t want to disturb the Python libraries you already have, create a virtual environment (venv) and install denoiser there. Here is how.

python3 -m venv denoiser      # you can use whichever name you want instead of denoiser
source denoiser/bin/activate  # activate the venv we just created
pip install denoiser          # install the denoiser library in the venv

You can skip the first and second lines if you don’t want to create a separate environment for the installation of denoiser.
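Before moving on, it can be handy to confirm that the package is visible to the Python interpreter you are actually using (for example, the one inside the venv). The small check below uses only the standard library; the helper name `missing_modules` is hypothetical.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(["denoiser"])
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("denoiser is importable")
```

If the module is reported as missing even though pip succeeded, the usual cause is that pip and python point to different environments.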

Step 2: Checking PulseAudio Installation (for Linux Users)

Linux users should check whether PulseAudio is installed on their system. If it isn’t, install it with the command,

apt-get install pulseaudio-utils    # For Ubuntu, Debian, Kali Linux users
dnf install pulseaudio-utils        # For Fedora systems
# Prefix the command with "sudo" if it fails with a permission error

Also install PulseAudio Volume Control with the command,

apt-get install pavucontrol # For Ubuntu, Debian, Kali Linux users
dnf install pavucontrol     # For Fedora systems
# Prefix the command with "sudo" if it fails with a permission error

macOS users must have SoundFlower installed on their system to use this model. Follow this link to install SoundFlower; you can then skip Step 3.

Step 3: Creating Virtual Streams

Run the following commands to create the virtual stream we’re going to use,


pacmd load-module module-null-sink sink_name=denoiser
pacmd update-sink-proplist denoiser device.description=denoiser

This will add a “Monitor of Null Output” to the list of microphones to use. Select it as input in your software. Open another terminal and run “pavucontrol”. This will open the volume controller window.
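If you prefer to script this step, the two pacmd commands can be wrapped in a small Python helper. This is only a convenience sketch: the function names and the `dry_run` flag are hypothetical, and by default the helper just prints the commands instead of running them.

```python
import shutil
import subprocess

def virtual_sink_commands(sink_name="denoiser"):
    """Build the two pacmd invocations from Step 3 as argument lists."""
    return [
        ["pacmd", "load-module", "module-null-sink", f"sink_name={sink_name}"],
        ["pacmd", "update-sink-proplist", sink_name, f"device.description={sink_name}"],
    ]

def create_virtual_sink(sink_name="denoiser", dry_run=True):
    """Print the commands, and run them only when dry_run is False and pacmd is available."""
    commands = virtual_sink_commands(sink_name)
    for command in commands:
        print(" ".join(command))
        if not dry_run and shutil.which("pacmd"):
            subprocess.run(command, check=True)
    return commands

commands = create_virtual_sink()   # dry run: prints the commands without executing them
```

Call `create_virtual_sink(dry_run=False)` on a Linux machine with PulseAudio to actually create the sink.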

Step 4: Running the Denoiser Model


Mac users can run the command,

python -m denoiser.live   # This will use the installed SoundFlower

This will load the model and add a microphone. Select it as the input in your software, and the other end can enjoy your enhanced speech!

For Linux users, run the command,


python -m denoiser.live --out default   # "default" means it'll use the default loopback interface

The model will start denoising the input audio from the mic. I tested it on a laptop with an 8th-gen Intel i5 and 8 GB of RAM, and the model showed a noticeable delay in producing the denoised output. You can also use this model to denoise at the receiver’s end; refer to the model’s repository to learn how.

Conclusion

Audio denoising, essential for enhancing speech quality by removing unwanted noise, has evolved from traditional methods like spectral subtraction and Wiener filtering to advanced deep learning models. Facebook Denoiser, a state-of-the-art real-time model, employs an encoder-decoder U-Net architecture with skip-connections and sequence modeling. Operating in the waveform domain, it is optimized in both the time and frequency domains, effectively eliminating noise while preserving speech integrity. With its simplicity and efficiency, the model offers practical solutions for applications like audio/video calls and automatic speech recognition, promising clearer communication in noisy environments. Continued advancements in deep learning will likely drive further innovations in audio denoising.

Frequently Asked Questions

Q1. Why is audio denoising important?

A. Audio denoising is crucial for improving the quality of speech by removing background noise, ensuring clearer communication in various applications such as online meetings, video calls, and speech recognition.

Q2. What are the common sources of noise in audio?

A. Noise can originate from various sources including environmental factors like traffic noise, wind, or background conversations, as well as technical issues such as faulty earphones or audio equipment.

Q3. What are the main challenges in audio denoising?

A. One challenge is effectively removing noise without degrading the quality of the speech signal. Additionally, addressing diverse types of noise and ensuring real-time processing are significant considerations.

Q4. How does audio denoising contribute to speech enhancement?

A. Audio denoising techniques aim to enhance speech quality by attenuating or eliminating background noise while preserving the clarity and intelligibility of the speech signal.

Q5. How do deep learning models revolutionize audio denoising?

A. Deep learning models offer a promising approach to audio denoising by leveraging neural networks to learn complex patterns directly from data, eliminating the need for handcrafted features and achieving superior performance in various scenarios.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
