File(s) under embargo: 9 month(s), 13 day(s) until file(s) become available
Ground truth and raw data for conceptual dependency knowledge graphs
The data used for this research is scraped from TVTropes.org, specifically from the “Playing With A Trope” subwiki and the trope description pages. The “Playing With A Trope” subwiki provides a detailed guide on the various ways tropes can be manipulated or altered in creative works. It categorizes the methods of using tropes into several distinct approaches:
- Played Straight: The trope is used in its typical manner without alteration.
- Justified: A logical in-universe reason is given for the trope’s occurrence.
- Inverted: The trope is reversed or its expected elements are flipped.
- Subverted: The trope is set up to happen but then deliberately avoided or contradicted.
- Double Subverted: The trope is subverted, but then ultimately occurs anyway in an unexpected way.
- Parodied: The trope is exaggerated humorously to mock or satirize it.
- Deconstructed: The trope is analyzed critically, exposing its flaws or unrealistic elements.
- Reconstructed: After deconstruction, the trope is pieced back together in a way that addresses previous criticisms but retains its essence.
- Zig Zagged: The use of the trope is inconsistent, combining several of the above methods.
- Averted: The trope is completely absent from the work.
- Enforced: The trope is used because of external pressures or requirements.
- Implied: The trope is hinted at but occurs off-screen.
- Played for Laughs/Drama/Horror: The trope is used specifically to evoke humor, drama, or horror.
- Exploited: Characters in the story leverage the trope for their own benefits.
- Defied: Characters actively avoid the trope’s occurrence.
- Discussed: Characters talk about the trope, acknowledging its presence or typicality.
- Conversed: The trope is discussed in relation to other works, often breaking the fourth wall.
- Lampshaded: The presence of the trope is pointed out within the work by the characters.
This subwiki describes the permutations available for each trope based on these categories, illustrating the versatility and depth with which tropes can be “played.” Combined with the structured relationships among tropes as indicated on their description pages, this provides a valuable dataset for constructing knowledge graphs that capture the complex interplay of narrative elements.
To test the effectiveness of the CD-based parser, 15 tropes are randomly sampled from TVTropes.org. For each trope, sentences containing information about trope-to-trope relationships are selected from the trope description page and manually annotated for triples. This results in a data sample of 63 sentences and 941 triples, which is used as the ground truth for testing the CD concept extractor against ClauCy, a spaCy implementation of ClausIE (Chourdakis and Reiss, 2018).
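As an illustration of how extracted triples might be compared with the manually annotated ground truth, the following minimal sketch computes precision, recall, and F1 over sets of triples. The exact-match criterion, the case-insensitive normalization, and the names `normalize` and `score_triples` are assumptions made for this example; the scoring procedure used in the research is not described here and may differ.

```python
def normalize(triple):
    """Lower-case and strip each element so trivial differences do not count as errors."""
    return tuple(part.strip().lower() for part in triple)


def score_triples(predicted, gold):
    """Return (precision, recall, f1) using exact matching on normalized triples.

    Exact matching is an assumption of this sketch, not the documented method.
    """
    pred = {normalize(t) for t in predicted}
    ref = {normalize(t) for t in gold}
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical triples, invented for illustration only.
gold = [
    ("ChekhovsGun", "is a", "plot device"),
    ("RedHerring", "contrasts", "ChekhovsGun"),
]
predicted = [
    ("chekhovsgun", "is a", "plot device"),
    ("RedHerring", "is", "clue"),
]
precision, recall, f1 = score_triples(predicted, gold)
```

Working with sets means duplicate triples are counted once; a looser criterion such as partial overlap of the elements would need a different matching function.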
To explore the idea recommendation approach, the original dataset is expanded from 15 to 123 tropes by scraping a selection of tropes from the lists of tropes cited by the original 15, resulting in 3,153 sentences. Generating triples as per the procedure in Section 3.5 yielded 160,959 triples over 26,993 unique concepts. The dataset contains information about the tropes and their relationships with other concepts in the English language. Trope names are written in PascalCase and are embedded within the text using the markup language of PmWiki. Because TVTropes is a crowdsourced online knowledge base, its use of natural language is largely informal, though the TVTropes community shares a common jargon for explicating trope relationships.
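Since trope names appear in the text as PascalCase words, a simple way to locate them and turn them into readable labels is sketched below. The regular expression, which treats any word made of two or more capitalized segments as a trope name, and the helper names `find_trope_names` and `split_pascal_case` are assumptions for illustration. The sketch does not handle other PmWiki link forms, and it is not the preprocessing used to build the dataset.

```python
import re

# Assumption: a trope name is a word made of two or more capitalized segments,
# e.g. "ChekhovsGun". Single capitalized words such as "Compare" are ignored.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b")


def find_trope_names(text):
    """Return PascalCase words in order of first appearance, without duplicates."""
    seen = []
    for match in WIKI_WORD.finditer(text):
        name = match.group(0)
        if name not in seen:
            seen.append(name)
    return seen


def split_pascal_case(name):
    """Insert a space at each lower-case-to-upper-case boundary."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)


# Hypothetical sentence, invented for illustration only.
sentence = "Compare ChekhovsGun, which often overlaps with RedHerring. See ChekhovsGun."
names = find_trope_names(sentence)
labels = [split_pascal_case(n) for n in names]
```

Here `names` holds the raw identifiers as they occur in the markup, while `labels` holds the spaced forms that are easier to read in a knowledge graph.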