Alyona Medelyan 26 November 2018
Categories B2B, Data & Analytics

3 learnings from the Deep Learning for Sentiment Analysis workshop

Our team at Thematic organized and sponsored a free 3-hour workshop on Deep Learning for Sentiment Analysis, run by Dr. Felipe Bravo-Marquez, and I want to share what I learned from it. The workshop booked out within 24 hours of being announced, and more than 40 people attended. Felipe was joined by co-host Dr. Edison Marrese. Both Felipe and Edison are postdocs working in Deep Learning with a focus on NLP.

They walked us through a huge amount of material, explaining the maths behind approaches such as distributional word similarity, word embeddings, and convolutional and recurrent neural networks.

They also shared some practical tips on models, projects and experimental techniques. The content was adapted from a 3-day course, but they managed to cover the most interesting parts within the 3 hours of the workshop.

Here are my 3 key learnings:

1. Easily use and adopt models created by other researchers – instead of creating your own

Creating a Deep Learning model requires a huge amount of work: pulling together the right kind and the right amount of data, setting up the learning environment and running the algorithms on a server.

Researchers often publish their results on GitHub and other websites. Models shared as part of those results can be reused for similar use cases, often without the need for retraining.

“Fine-tuning the network with your own data is usually the best approach” – Felipe Bravo

For example, the winning solution in a recent sentiment analysis competition was an ensemble of Deep Learning models trained on various data representations.

One model that we can’t wait to experiment with at Thematic is the DeepMoji project, which can tell apart the specific sentiment of phrases such as “this movie was shit” and “this movie was the shit”.
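To make the ensemble idea mentioned above concrete, here is a minimal sketch of one common way to combine several models: averaging their class probabilities ("soft voting"). This is an assumption for illustration, not a description of how the winning solution combined its models. The label set and the three probability lists are hypothetical stand-ins for real model outputs.

```python
def soft_vote(probabilities, weights=None):
    """Average per-class probabilities from several models.

    probabilities: one list of class probabilities per model.
    weights: optional per-model weights (defaults to equal weights).
    """
    n_models = len(probabilities)
    weights = weights or [1.0] * n_models
    total = sum(weights)
    n_classes = len(probabilities[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]


LABELS = ["negative", "neutral", "positive"]

# Hypothetical outputs of three models for the same review,
# each trained on a different representation of the text.
model_outputs = [
    [0.2, 0.3, 0.5],  # e.g. a word-level model
    [0.1, 0.2, 0.7],  # e.g. a character-level model
    [0.4, 0.4, 0.2],  # e.g. a model using lexicon features
]

averaged = soft_vote(model_outputs)
prediction = LABELS[max(range(len(averaged)), key=averaged.__getitem__)]
print(prediction)  # positive
```

Averaging tends to smooth out the mistakes of any single model, which is one reason ensembles do well in competitions.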

2. Big trend to use character-based NLP models

Typically, NLP models are trained by splitting text into sequences of words. This works particularly well for the English language.

Compared to languages like Finnish or Russian, English has a very small number of suffixes and endings. Unlike in German, words aren’t typically combined to form new words. Unlike in Japanese and Chinese, words are always separated by whitespace.

English is one of the easiest languages to analyze.

Interestingly, Deep Learning models trained on character sequences don’t rely on language-specific methods for dealing with special language characteristics, such as tokenization for splitting text into words.
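The difference between the two views of text can be sketched in a few lines. The example below is only an illustration under simple assumptions: whitespace splitting stands in for word tokenization, and the function names and the integer-id vocabulary are hypothetical, not taken from any particular library.

```python
def word_tokens(text):
    # Whitespace tokenization: a reasonable first cut for English only.
    return text.split()


def char_tokens(text):
    # Character-level view: no language-specific segmentation is needed.
    return list(text)


def build_vocab(sequences):
    # Map each distinct symbol to an integer id; 0 is reserved for unknowns.
    vocab = {}
    for seq in sequences:
        for symbol in seq:
            if symbol not in vocab:
                vocab[symbol] = len(vocab) + 1
    return vocab


def encode(seq, vocab):
    # Turn a sequence of symbols into the integer ids a model would consume.
    return [vocab.get(symbol, 0) for symbol in seq]


english = "this movie was great"
chinese = "这部电影很棒"  # roughly "this movie is great"

print(len(word_tokens(english)))  # 4
print(len(word_tokens(chinese)))  # 1, whitespace splitting finds no words
print(len(char_tokens(chinese)))  # 6

char_vocab = build_vocab([char_tokens(english), char_tokens(chinese)])
print(encode(char_tokens(chinese), char_vocab))
```

The same character-level code path handles both sentences, whereas the word-level path would need a dedicated segmenter for Chinese.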

English also usually has the greatest amount of training data. Character-based models, which need no language-specific preprocessing, mean that other languages can now benefit as well.

Similarly, in the 90s, one of the first usable language detection algorithms used the most common character sequence patterns to detect a language. There are hidden linguistic properties of languages that most people aren’t aware of, but Deep Learning models can capture them.
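As a rough sketch of how character patterns can identify a language, the code below builds a ranked profile of the most frequent character n-grams for each language and picks the language whose profile is closest to that of the input. The rank-difference distance is an assumption about the kind of algorithm meant here, and the three one-sentence samples are toy data written for this example; a real profile would be built from far more text.

```python
from collections import Counter


def ngram_profile(text, n_max=3, top_k=300):
    # Rank the most frequent character n-grams (lengths 1..n_max) of a text.
    text = " " + " ".join(text.lower().split()) + " "
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    ranked = [gram for gram, _ in counts.most_common(top_k)]
    return {gram: rank for rank, gram in enumerate(ranked)}


def out_of_place(doc_profile, lang_profile):
    # Sum of rank differences; n-grams missing from the language
    # profile receive the maximum penalty.
    max_penalty = len(lang_profile)
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else max_penalty
        for gram, rank in doc_profile.items()
    )


def detect_language(text, profiles):
    # Choose the language whose profile is closest to the text's profile.
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))


# Toy training samples (hypothetical, far too small for real use).
SAMPLES = {
    "english": "the workshop was great and the people were friendly and the room was clean",
    "german": "der workshop war sehr gut und die leute waren freundlich und das zimmer war sauber",
    "spanish": "el taller fue muy bueno y la gente era amable y la habitación estaba limpia",
}
PROFILES = {lang: ngram_profile(text) for lang, text in SAMPLES.items()}

print(detect_language("the room was clean and the people were great", PROFILES))
```

Note that nothing in this code knows what a word is: the profiles are built purely from character sequences.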

3. Customer feedback analysis is one of the hardest NLP tasks

Our post-event discussions with Felipe and Edison about what we do at Thematic were thought-provoking. In the academic world, thematic analysis of people’s reviews is called “aspect detection”.

For example, “room cleanliness”, “breakfast quality”, “location”, “price”, “check out” are all aspects of hotel reviews. Hotel owners need to know which sentiment is attached to each aspect.
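To show what "sentiment attached to each aspect" means in practice, here is a deliberately naive sketch. It is not how Thematic or any academic system works: the aspect keywords and the positive and negative word lists are hand-written assumptions, and each clause is scored simply by counting matching words.

```python
import re

# Hypothetical keyword lists, for illustration only.
ASPECT_KEYWORDS = {
    "room cleanliness": {"clean", "dirty", "spotless", "dusty"},
    "breakfast quality": {"breakfast"},
    "location": {"location", "central"},
    "price": {"price", "expensive", "cheap"},
    "check out": {"checkout"},
}
POSITIVE = {"clean", "spotless", "great", "delicious", "central", "cheap", "quick"}
NEGATIVE = {"dirty", "dusty", "cold", "expensive", "slow", "noisy"}


def aspect_sentiment(review):
    """Return a score per aspect: positive words minus negative words
    found in the same clause as one of the aspect's keywords."""
    results = {}
    # Split into clauses on punctuation and on the words "but" and "and".
    for clause in re.split(r"[.,;!?]|\bbut\b|\band\b", review.lower()):
        words = set(re.findall(r"[a-z]+", clause))
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if words & keywords:
                results[aspect] = results.get(aspect, 0) + score
    return results


review = "The room was spotless, but breakfast was cold and the price was expensive."
print(aspect_sentiment(review))
# {'room cleanliness': 1, 'breakfast quality': -1, 'price': -1}
```

A single overall score for this review would hide the fact that the room was praised while breakfast and price were criticized. The hand-written lists also hint at the difficulty: they only fit hotels and would have to be rebuilt for every new business.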

At Thematic, we deal with a variety of businesses, and each one has its own unique set of aspects. In fact, our customers don’t just want to know the sentiment of top-level aspects; they need an in-depth understanding of what is actually driving that sentiment.

We solve this problem by automatically extracting up to hundreds of aspects or themes that are the most common in customer responses. The same algorithm automatically groups these themes into broader categories for easy analysis.

When we explained this process, Felipe and Edison agreed that it’s an extremely hard task to solve across a variety of datasets.

Because most businesses don’t have training data, Deep Learning algorithms can’t easily help, and an approach specifically crafted for this task (as we’ve done at Thematic) is required.

In the academic world, researchers tend to compete on clearly defined tasks and datasets that can be shared. While it’s possible to design a task around hotel reviews, a cross-domain approach is much harder, particularly given how subjective this task is.

I believe that the best ideas come from such interactions, and I’m sure the workshop attendees have benefited from the knowledge shared at this workshop. A huge thanks to GridAKL for sponsoring the venue, and Felipe and Edison for running it.

This article was first published here.

 
