Alyona Medelyan 26 November 2018
Categories B2B, Data & Analytics

3 learnings from the Deep Learning for Sentiment Analysis workshop

I want to share what I learned when our team at Thematic organized and sponsored a free 3-hour workshop on Deep Learning for Sentiment Analysis, run by Dr. Felipe Bravo-Marquez. The workshop booked out within 24 hours of being announced, and more than 40 people attended. Felipe was joined by co-host Dr. Edison Marrese. Both Felipe and Edison are postdocs working in Deep Learning with a focus on NLP.

They walked us through a huge amount of material, explaining the maths behind approaches such as distributional word similarity, word embeddings, and convolutional and recurrent neural networks.

They also shared some practical tips on models, projects and experimental techniques. The content was adapted from a 3-day course, but they managed to cover the most interesting parts within the 3 hours of the workshop.

Here are my 3 key learnings:

1. Easily use and adopt models created by other researchers – instead of creating your own

Creating a Deep Learning model requires a huge amount of work: pulling together the right kind and the right amount of data, setting up the learning environment, and running the algorithms on a server.

Researchers often publish their results on GitHub and other websites. Models shared as part of those results can be adopted for similar use cases, often without the need for retraining.

“Fine-tuning the network with your own data is usually the best approach” – Felipe Bravo

For example, the winning solution in a recent sentiment analysis competition was an ensemble of Deep Learning models trained on various data representations.

One model that we can’t wait to experiment with at Thematic comes from the DeepMoji project, which can tell apart the specific sentiment of phrases such as “this movie was shit” and “this movie was the shit”.

2. Big trend to use character-based NLP models

Typically, NLP models are trained by splitting text into sequences of words. This works particularly well for English.

Compared to languages like Finnish or Russian, English has a very small number of suffixes and endings. Unlike in German, words aren’t typically combined to form new words. Unlike in Japanese and Chinese, words are always separated by whitespace.

In that sense, English is one of the easiest languages to analyze.

Interestingly, Deep Learning models trained on character sequences don’t rely on language-specific methods for dealing with special language characteristics, such as tokenization for splitting text into words.
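To make the difference concrete, here is a minimal sketch of the two ways of turning text into model input. The function names `word_tokens` and `char_ngrams` are my own illustrative choices, not something shown at the workshop, and real character-based models typically feed single characters or learned sub-word units into a neural network rather than plain n-gram lists.

```python
def word_tokens(text):
    """Split on whitespace; only useful when words are whitespace-separated."""
    return text.split()


def char_ngrams(text, n=3):
    """Return all overlapping character sequences of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


english = "the room was clean"
japanese = "部屋はきれいでした"  # roughly "the room was clean", written without spaces

print(word_tokens(english))         # whitespace splitting finds the words
print(word_tokens(japanese))        # the whole sentence comes back as one "word"
print(char_ngrams(japanese, n=2))   # character sequences need no tokenizer
```

Whitespace splitting only works for the English sentence, while the character-based view treats both languages in exactly the same way.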

English usually has the greatest amount of training data and has been the easiest language to build word-based models for. With character-based models, other languages can now benefit as well.

Similarly, in the 1990s, one of the first usable language detection algorithms used the most common character sequence patterns to detect a language. There are hidden linguistic properties of languages that most people aren’t aware of, but Deep Learning models can capture them.
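The idea behind character-sequence language detection can be sketched in a few lines. This is a simplified toy of my own, not the original algorithm: it builds a frequency profile of character trigrams per language from two tiny made-up sample sentences (`SAMPLES` is purely illustrative) and picks the language whose profile best matches the input.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Overlapping character n-grams, padded so word boundaries count too."""
    text = " " + text.lower() + " "
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def profile(text):
    """Relative frequency of each character trigram in the text."""
    counts = Counter(char_ngrams(text))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


# Tiny illustrative samples; a usable profile needs far more text per language.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and then "
               "the cat sat on the mat with the other cats",
    "german": "der schnelle braune fuchs springt über den faulen hund und "
              "die katze sitzt auf der matte mit den anderen katzen",
}
PROFILES = {lang: profile(text) for lang, text in SAMPLES.items()}


def detect_language(text):
    """Pick the language whose trigram profile best matches the text."""
    grams = char_ngrams(text)
    scores = {
        lang: sum(prof.get(gram, 0.0) for gram in grams)
        for lang, prof in PROFILES.items()
    }
    return max(scores, key=scores.get)


print(detect_language("the dog and the cat"))
print(detect_language("der hund und die katze"))
```

No dictionary or tokenizer is involved: frequent sequences such as " th" and "the" in English or "der" and "und" in German are enough to separate the two samples.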

3. Customer feedback analysis is one of the hardest NLP tasks

Our post-event discussions with Felipe and Edison about what we do at Thematic were thought-provoking. In the academic world, thematic analysis of people’s reviews is called “aspect detection”.

For example, “room cleanliness”, “breakfast quality”, “location”, “price”, “check out” are all aspects of hotel reviews. Hotel owners need to know which sentiment is attached to each aspect.
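As a rough illustration of what "sentiment attached to each aspect" means, here is a toy keyword-based sketch. It is not how Thematic or any Deep Learning model works; the keyword lists, the word lists `POSITIVE` and `NEGATIVE`, and the function `aspect_sentiment` are all hypothetical and hand-written for this one example.

```python
# Hypothetical, hand-picked keywords per aspect (illustration only).
ASPECT_KEYWORDS = {
    "room cleanliness": {"clean", "dirty", "spotless", "dusty"},
    "breakfast quality": {"breakfast"},
    "location": {"location", "central"},
    "price": {"price", "expensive", "cheap"},
}
POSITIVE = {"great", "clean", "spotless", "good", "cheap", "central"}
NEGATIVE = {"bad", "dirty", "dusty", "expensive", "cold"}


def aspect_sentiment(review):
    """Return a sentiment score per aspect mentioned in the review."""
    results = {}
    for sentence in review.lower().split("."):
        words = set(sentence.replace(",", " ").split())
        # Sentence score: positive words minus negative words.
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if words & keywords:
                results[aspect] = results.get(aspect, 0) + score
    return results


review = "The room was spotless. Breakfast was cold and bad. Great location."
print(aspect_sentiment(review))
```

Even this toy shows why the task is hard: every new business would need its own keyword lists, and a single overall score for the review would hide that the room and location were liked while the breakfast was not.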

At Thematic, we deal with a variety of businesses, and each one has its own unique set of aspects. In fact, our customers don’t just want to know the sentiment of top-level aspects; they need an in-depth understanding of what is actually driving that sentiment.

We solve this problem by automatically extracting up to hundreds of aspects or themes that are the most common in customer responses. The same algorithm automatically groups these themes into broader categories for easy analysis.

When we explained this process, Felipe and Edison agreed that it’s an extremely hard task to solve across a variety of datasets.

Because most businesses don’t have training data, Deep Learning algorithms can’t easily help, and an approach specifically crafted for this task (as we’ve done at Thematic) is required.

In the academic world, researchers tend to compete on clearly defined tasks and datasets that can be shared. While it’s possible to design a task around hotel reviews, a cross-domain approach is much harder, particularly given how subjective this task is.

I believe that the best ideas come from such interactions, and I’m sure the workshop attendees have benefited from the knowledge shared at this workshop. A huge thanks to GridAKL for sponsoring the venue, and Felipe and Edison for running it.

This article was first published here.

 
