Data from: The Valuation of User-Generated Content: A Structural, Stylistic and Semantic Analysis of Online Reviews
This record contains the underlying research data for the publication "The Valuation of User-Generated Content: A Structural, Stylistic and Semantic Analysis of Online Reviews". The full text is available at: https://ink.library.smu.edu.sg/etd_coll/78
The ease with which users can create and publish content has produced a vast amount of online product reviews. However, this data is overwhelmingly large and unstructured, making the information difficult to quantify. This creates a challenge in understanding how online reviews affect consumers’ purchase decisions. In my dissertation, I explore the structural, stylistic and semantic content of online reviews. Firstly, I present a measurement that quantifies sentiments with respect to a multi-point scale and conduct a systematic study of the impact of online reviews on product sales. Using the sentiment metrics generated, I estimate the weight that customers place on each segment of the review and examine how these segments affect the sales of a given product. The results empirically verify that sentiments influence sales in ways that ratings alone do not capture. Secondly, I propose a method to detect online review manipulation using writing style analysis and assess how consumers respond to such manipulation. Finally, I find that societal norms influence posting behavior and that significant differences exist across cultures. Users should therefore exercise care in interpreting the information from online reviews. This dissertation advances our understanding of the consumer decision-making process and sheds light on the relevance of online review ratings and sentiments over a sequential decision-making process. Having tapped into the abundant supply of online review data, the results in this work are based on large-scale datasets that extend beyond the scale of traditional word-of-mouth research.