DataHour: Demystifying Clustering in Topic Modeling Algorithms like BERTopic
18 Jan 2023, 13:01 - 18 Jan 2023, 14:01
About the Event
In a world where customer reviews drive product sales and business decisions, it is crucial to pick out relevant topics from the vast corpus of textual feedback that customers provide, whether on public platforms like Twitter, TrustPilot, and Google reviews or through internally collected feedback from email campaigns and surveys. Sifting through this volume of text manually to find out what customers are talking about is not feasible. Enter BERTopic, an unsupervised machine learning topic modeling technique that learns a set of topics from a collection of documents (here, customer reviews), where each topic is represented as a distribution over words, and each document is represented as a distribution over topics. It identifies the underlying themes in the collection and represents each document in terms of the proportions of those themes that it contains.
Clustering is an integral part of this topic identification process. It groups words into different topics based on their statistical co-occurrence patterns in the documents. Using dimensionality reduction techniques like SVD and clustering algorithms like K-means, the model identifies groups of words that are strongly associated with specific themes and uses these groups to define the set of topics it learns.
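To make the clustering step concrete, here is a minimal sketch in plain Python. It is not BERTopic's implementation. As a simplifying assumption it clusters TF-IDF vectors directly with a small K-means, leaving out both the transformer embeddings and the dimensionality reduction step, and then reads off the highest-weighted words in each cluster as a rough topic description. The reviews, the stopword list, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy reviews (not the Amazon Alexa Reviews dataset).
REVIEWS = [
    "great sound quality and loud speaker",
    "speaker sound is great and clear",
    "sound quality of the speaker is clear",
    "alexa cannot understand my voice commands",
    "voice commands fail and alexa ignores my voice",
    "alexa misses voice commands often",
]
STOPWORDS = {"and", "is", "the", "of", "my"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return the vocabulary and one L2-normalised TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    doc_freq = Counter(w for toks in tokenized for w in set(toks))
    n_docs = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[w] * (math.log((1 + n_docs) / (1 + doc_freq[w])) + 1) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20):
    """Small K-means with deterministic farthest-point initialisation."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each document goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def top_words(vocab, vectors, labels, cluster, n_words=3):
    """Words with the largest summed TF-IDF weight inside one cluster."""
    members = [v for v, lab in zip(vectors, labels) if lab == cluster]
    scores = [sum(col) for col in zip(*members)]
    ranked = sorted(zip(scores, vocab), reverse=True)
    return [word for _, word in ranked[:n_words]]


vocab, vectors = tfidf_vectors(REVIEWS)
labels = kmeans(vectors, k=2)
for cluster in sorted(set(labels)):
    print(cluster, top_words(vocab, vectors, labels, cluster))
```

On this toy input the two groups of reviews share no vocabulary, so the clusters separate cleanly; real review data is noisier, which is one reason dense embeddings and a dimensionality reduction step are normally applied before clustering.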
In this DataHour, Abhiram will break down how clustering works in BERTopic with the help of the Amazon Alexa Reviews dataset.
Prerequisites: Basic understanding of NLP and a curiosity to learn data science.
- Best articles get published on Analytics Vidhya’s Blog Space
Who is this DataHour for?
About the Speaker
