Data from: A Convolution Kernel Approach to Identifying Comparisons in Text
This record contains the underlying research data for the publication "A Convolution Kernel Approach to Identifying Comparisons in Text". The full text is available from: https://ink.library.smu.edu.sg/sis_research/2891
Comparisons in text, such as in online reviews, serve as useful decision aids. In this paper, we focus on the task of identifying whether a comparison exists between a specific pair of entity mentions in a sentence. This formulation is transformative, as previous work only seeks to determine whether a sentence is comparative, which is presumptuous in the event the sentence mentions multiple entities and is comparing only some, not all, of them. Our approach leverages not only lexical features such as salient words, but also structural features expressing the relationships among words and entity mentions. To model these features seamlessly, we rely on a dependency tree representation, and investigate the applicability of a series of tree kernels. This leads to the development of a new context-sensitive tree kernel: Skip-node Kernel (SNK). We further describe both its exact and approximate computations. Through experiments on real-life datasets, we evaluate the effectiveness of our kernel-based approach for comparison identification, as well as the utility of SNK and its approximations.