SMU Research Data Repository (RDR)
IMDB_random_1000.mdb (520 kB)
Douban_random_1000.mdb (460 kB)
IMDB_Douban_data.mdb (504 kB)
3 files

Data from: The Valuation of User-Generated Content: A Structural, Stylistic and Semantic Analysis of Online Reviews

dataset
posted on 2011-12-01, 00:00 authored by Sian KOH Noi

This record contains the underlying research data for the publication "The Valuation of User-Generated Content: A Structural, Stylistic and Semantic Analysis of Online Reviews". The full text is available at: https://ink.library.smu.edu.sg/etd_coll/78

The ease with which users can create and publish content has produced a vast number of online product reviews. However, the data is overwhelmingly large and unstructured, which makes the information difficult to quantify. This creates a challenge in understanding how online reviews affect consumers’ purchase decisions. In my dissertation, I explore the structural, stylistic and semantic content of online reviews. First, I present a measurement that quantifies sentiments on a multi-point scale and conduct a systematic study of the impact of online reviews on product sales. Using the sentiment metrics generated, I estimate the weight that customers place on each segment of a review and examine how these segments affect the sales of a given product. The results empirically verify that sentiments influence sales in ways that ratings alone do not capture. Second, I propose a method to detect online review manipulation using writing style analysis and assess how consumers respond to such manipulation. Finally, I find that societal norms influence posting behavior and that significant differences exist across cultures. Users should therefore exercise care in interpreting the information in online reviews. This dissertation advances our understanding of the consumer decision-making process and sheds light on the relevance of online review ratings and sentiments over a sequential decision-making process. Having tapped into the abundant supply of online review data, this work bases its results on large-scale datasets that extend beyond the scale of traditional word-of-mouth research.

History