Generating Cafe Suggestions from Yelp Reviews

August 25, 2017

Abstract

Reviews for businesses and restaurants on Yelp are typically used by consumers to determine where to best spend their money. However, reviews also offer important insights for businesses to leverage. Reviews can show what a business is doing right and what it needs to improve upon. Naive approaches to discovering insights from reviews require an impractical amount of text to iterate through, often skewed by text outliers that would distract a business from implementing meaningful changes. With hopes of circumventing these issues, this project aims to use Latent Dirichlet Allocation (LDA) to pinpoint areas of improvement relevant to a business. Along with topic extraction, we incorporate sentiment analysis to generate recommendations for cafes.

Introduction

Motivation

Yelp currently offers “tips” that users write to help other users. Reviews, however, also contain information that is useful to the restaurants themselves. More specifically, features or subtopics can be extracted from reviews to discover the distinct topics most relevant to a restaurant (i.e. what its customers are saying about it).

The reviews of a business on Yelp can yield key ideas for business improvements, and discovering these core concepts can help a business improve its ratings. Without the ability to extract actionable insights from review text, however, businesses are left guessing about their flaws. The goal of this project’s analysis is to give business owners clear targets for improving their ratings.

Dataset

This project uses the Yelp Academic Dataset, which includes 4.1 million user reviews of businesses and 1.1 million business attributes. For this project, the businesses were filtered by the business category “Coffee & Tea”, i.e. cafes. Businesses were narrowed down to cafes so that we could find meaningful distinctions within a single category of restaurants, rather than extract generic topics across all restaurants (e.g. vague topics like “lunch” or “dinner”). The initial hypothesis was that inspecting the cafe category would reveal influential aspects specific to cafe businesses, such as barista wait time, specialty drinks, and bakery items.
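The category filter described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the dataset is stored as JSON lines, that business records carry `business_id` and `categories` fields, and that review records carry `business_id` and `text`. The function names are hypothetical.

```python
import json

def is_cafe(business, category="Coffee & Tea"):
    """Return True if the business record lists the given category."""
    cats = business.get("categories") or []
    # Assumption: categories may be a list of strings or a single
    # comma-separated string, depending on the dataset release.
    if isinstance(cats, str):
        cats = [c.strip() for c in cats.split(",")]
    return category in cats

def cafe_ids(business_lines):
    """Collect the business_id of every cafe from JSON-lines input."""
    ids = set()
    for line in business_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if is_cafe(record):
            ids.add(record["business_id"])
    return ids

def cafe_reviews(review_lines, ids):
    """Yield the text of each review that belongs to one of the cafes."""
    for line in review_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("business_id") in ids:
            yield record["text"]
```

Both helpers accept any iterable of lines, so an open file handle can be passed in directly and the full dataset never has to be loaded into memory.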

Methods

The model devised to help businesses improve their service incorporates two main components:

Latent Dirichlet Allocation (LDA): LDA is used to lower the dimensionality of the large body of review text and extract its latent subtopics. The trained LDA model can then be used to determine the subtopics contained in input reviews.

Sentiment Treebank: The next step involves analyzing the reviews’ sentiment in order to assess how reviewers feel about the topics extracted from their reviews. The ranking of the topics based on the sentiment of the reviews can then be used to suggest improvements that businesses can make.

“Improving Restaurants by Extracting Subtopics from Yelp Reviews” highlights the applications of using an LDA model to extract subtopics from text data. That paper focused mainly on the correlation of individual topics with the overall review rating, and on how the ratings of topics correlated with each other. Our implementation strives to improve upon its model by identifying topics which a business can improve upon. Furthermore, we focus on a specific category, which allows our model to better identify subtle yet important topics.

“Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank” describes the building of the Sentiment Treebank in the Stanford NLP package. This model of sentiment analysis strives to give a better understanding of sentiment in longer phrases compared to traditional sentiment scores. The sentiment scores from the trees will be used to give recommendations on the review subtopics.

Experiments

Experiment 1: Training a Feature/Topic Extraction Model

Latent Dirichlet Allocation was chosen as the method for extracting subtopics, rather than only primary topics, from a text corpus. LDA is an unsupervised learning algorithm, so its output requires later interpretation. Training data was collected from the Yelp Academic Dataset, specifically the reviews of businesses in the category “Coffee & Tea”. The reviews were stemmed, transformed into the input format the library expects, and then fed into the GenSim Python library’s LDA function to create our topic model. After reviewing multiple iterations of the model, we settled on 25 topics. The model could then be used to predict the subtopics contained in a new review and to characterize the topic distribution of a cafe.
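The preprocessing step can be illustrated with a small standard-library sketch. It is an assumption about the shape of the pipeline rather than the project's code: the stop-word list is a tiny placeholder, `crude_stem` is a stand-in for a real stemmer, and all function names are hypothetical. The output is a bag-of-words corpus in which each review is a list of (token id, count) pairs, the form that gensim's LDA implementation takes as input.

```python
import re
from collections import Counter

# Placeholder stop-word list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "was", "it", "in", "for", "i"}

def crude_stem(token):
    """Strip a few common suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]

def build_vocab(docs):
    """Assign each distinct token an integer id in first-seen order."""
    vocab = {}
    for doc in docs:
        for tok in doc:
            vocab.setdefault(tok, len(vocab))
    return vocab

def to_bow(doc, vocab):
    """Convert a tokenized review to sorted (token_id, count) pairs."""
    counts = Counter(vocab[t] for t in doc if t in vocab)
    return sorted(counts.items())
```

With gensim installed, `gensim.corpora.Dictionary` and its `doc2bow` method perform the same mapping, and a call along the lines of `gensim.models.LdaModel(corpus, num_topics=25, id2word=dictionary)` would train a 25-topic model on the result. Any other training parameters are not given in this write-up.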

Coffee - General | Atmosphere   | Wait Time / Service | Baked Bread Items
coffee (20.4%)   | place (8.9%) | time (4.5%)         | 787 (8.6%)
shop (3.9%)      | staff (5.2%) | service (3.1%)      | pastry (6.4%)
bean (1.3%)      | music (2.1%) | order (3.4%)        | bread (2.3%)
espresso (%)     | fun (1.3%)   | wait (1.3%)         | baguette (1.6%)
Fig. 1 - Example subtopic distribution for a review.

Experiment 2: Interpretation, Ranking, and Prioritization of Topics

The LDA model returned a list of around 10 words that defined each subtopic. This output was then interpreted in order to come up with general topic labels. For example, the label ’Seating’ could be interpreted from the words ’chair’, ’table’, and ’space’. Given an input review, the model also returned a weight for each subtopic, i.e. the percentage that the subtopic contributed to the overall categorization. Using these weights, it was possible to determine which topics were most important to a given cafe, allowing our model to filter out topics of little value to the business.
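One way to turn per-review topic weights into a cafe-level ranking is sketched below. The aggregation rule (a plain average over reviews) and the `min_weight` cutoff of 0.05 are assumptions made for illustration; the write-up does not state how the weights were combined or what threshold was used.

```python
from collections import defaultdict

def cafe_topic_profile(review_topics, min_weight=0.05):
    """Average per-review topic weights and drop low-weight topics.

    review_topics: one entry per review, each a list of
    (topic_label, weight) pairs as returned by a topic model.
    Returns (topic_label, mean_weight) pairs, highest weight first.
    """
    if not review_topics:
        return []
    totals = defaultdict(float)
    for topics in review_topics:
        for label, weight in topics:
            totals[label] += weight
    n = len(review_topics)
    # A topic absent from a review contributes zero to that review.
    profile = [(label, total / n) for label, total in totals.items()]
    kept = [(label, w) for label, w in profile if w >= min_weight]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Topics that fall below the cutoff are treated as noise for that cafe and left out of any later recommendation.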

“Went in here for the first time today to just grab a cup of coffee and do some studying. Great atmosphere. Not a lot of tables to sit down and people tend to stay there for a long time but that’s typical of most coffee shops especially when their coffee is as good as theirs. I actually just got their regular drip coffee and it was some of the best standard coffee around..”

Fig. 2 - Correlation Scatter Plot: CitiBike and Subway Usage.

Experiment 3: Sentiment Analysis

The Stanford NLP Sentiment Treebank was used on reviews as their subtopics were being extracted. This allowed our model to assign a Sentiment Score to topics. This Sentiment Score could then be used to rank the topics, giving insights into which topics were viewed favorably and which were viewed more negatively. Star Rating and Sentiment Score were combined to determine the recommendations to give a business.
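A possible way to attach a sentiment score to each topic is sketched below. It assumes each review has already been given a numeric sentiment score (for example on the 0 to 4 scale of the Stanford sentiment classes) and a list of weighted topics, and it scores a topic by the topic-weighted mean of review sentiment. How the star rating was folded into the final recommendation is not described in this write-up, so the sketch uses sentiment alone. The function and field names are hypothetical.

```python
def topic_sentiment(reviews):
    """Rank topics by topic-weighted mean review sentiment.

    reviews: list of dicts with keys
      'sentiment' - numeric sentiment score of the review
      'topics'    - list of (topic_label, weight) pairs
    Returns (topic_label, score) pairs, least favorable first.
    """
    weighted_sum = {}
    weight_total = {}
    for review in reviews:
        for label, weight in review["topics"]:
            weighted_sum[label] = weighted_sum.get(label, 0.0) + weight * review["sentiment"]
            weight_total[label] = weight_total.get(label, 0.0) + weight
    scores = {
        label: weighted_sum[label] / weight_total[label]
        for label in weight_total
        if weight_total[label] > 0
    }
    # Lowest-scoring topics come first: candidates for improvement.
    return sorted(scores.items(), key=lambda pair: pair[1])
```

Topics at the head of the returned list are the ones reviewers discuss least favorably, which makes them natural candidates for a recommendation.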

Experiment 4: Star Rating Prediction

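The Sentiment Treebank score could also be used to predict the star rating of a review, using the formula fitted by linear regression. The sentiment score correlated strongly with the star rating that users gave on their reviews, which suggests that the Sentiment Treebank worked well in assigning its score. Note, however, that the small sample reported in Fig. 3 shows only a modest fit (r_squared of 0.144 and p_value of 0.061 over 20 reviews), so the relationship is best judged over a larger set of reviews.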

slope 0.920
intercept 1.898
r_squared 0.144
p_value 0.061
std_err 0.469
Fig. 3 - Linear regression of sentiment score on Yelp star rating, run on the first 20 reviews of Pavement Coffeehouse.
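The coefficients in Fig. 3 can be applied as in the sketch below. It assumes the fitted line has the form stars = slope * sentiment + intercept, which is one reading of the figure, and it clamps predictions to Yelp's 1 to 5 star range. The `fit_line` helper shows an ordinary least-squares fit for reference; it is not the project's code, and the function names are hypothetical.

```python
SLOPE = 0.920      # from Fig. 3
INTERCEPT = 1.898  # from Fig. 3

def predict_stars(sentiment, slope=SLOPE, intercept=INTERCEPT):
    """Predict a star rating from a sentiment score, clamped to [1, 5]."""
    raw = slope * sentiment + intercept
    return min(5.0, max(1.0, raw))

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared). Requires at least two
    distinct x values and non-constant y values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

Under these assumptions, a review with a sentiment score of 2 would be predicted at about 3.7 stars.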

Technical Issues and Challenges

Results and Discussion

Prevalent Topics for Cafes Across Yelp Dataset

Although a few of the topic labels were assigned tentatively, many topics were clear: Atmosphere, Service, Doughnuts, Seating, Bakery Items, Coffee, Tea, and Specialty Drinks. These topics point to concrete items that a business can improve or be proud of. Based on individual inspection of the reviews and the returned topics, the model seems to work well for identifying subtopics.

The relationship between star rating and review sentiment shows a high correlation over many different cafe reviews. The combination of the two appears to be a good indicator of how reviewers feel overall about a topic.

Potential Improvements

Implications for Yelp and Listed Restaurants

Topic modeling using LDA, combined with the more nuanced sentiment analysis approach of a Sentiment Treebank, is a step in the right direction to help businesses realize their strengths and improve on their deficiencies. Although this model has some drawbacks in terms of breadth, it still allows cafes to gain simple yet valuable insights from their review data. Lastly, it would be interesting if Yelp adopted a similar idea and added tools to help businesses understand how to become better through their consumers’ reviews.

The code for this project can be found here.