A typical document contains an identifier, a piece of text (in the full version), a set of annotations (e.g. love, satisfaction), the brand referred to, the sector, and other named entities.
For a number of documents, extended information is provided that links the data to external datasets (Thomson Reuters' PermID, etc.).
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix sabd: <http://sabcorpus.linkeddata.es/data/> .
@prefix sabv: <http://sabcorpus.linkeddata.es/vocab/> .
@prefix sioc: <http://rdfs.org/sioc/ns#> .
@prefix marl: <http://purl.org/marl/ns#> .
@prefix onyx: <http://www.gsi.dit.upm.es/ontologies/onyx/ns#> .
@prefix permid: <https://permid.org/> .
@prefix org: <http://www.w3.org/TR/vocab-org/> .
@prefix gr: <http://purl.org/goodrelations/v1#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

sabd:826812979421257730 a sioc:Post ;
    sioc:id "826812979421257730" ;
    sioc:content "Ya me quede sin credito?? Hace 3 dias tengo credito nomas... Movistar y la concha de tu hermana 😒"@es ;
    marl:describesObject sabd:Movistar ;
    sabd:isInPurchaseFunnel sabv:postPurchase ;
    sabd:hasMarketingMix sabv:price ;
    onyx:hasEmotion sabv:hate, sabv:dissatisfaccion ;
    marl:hasPolarity marl:negative ;
    marl:forDomain "TELCO" .

Information on companies, brands and emotions is also given.
sabd:Movistar a gr:Brand ;
    rdfs:seeAlso <http://dbpedia.org/resource/Movistar> ;
    rdfs:label "Movistar" .

sabd:1-5000062703 a gr:Business ;
    rdfs:label "Telefonica de Espana, S.A.U." ;
    rdfs:seeAlso <https://opencorporates.com/companies/es/82018474> ;
    owl:sameAs permid:1-5000062703 .
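For readers who prefer to work with the annotations as plain objects, the record shape shown in the Turtle example can be mirrored in a small data structure. The following is only a sketch: the class name `AnnotatedPost` and its field names are hypothetical, chosen here to match the properties in the sample, and are not part of the corpus or its vocabulary.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AnnotatedPost:
    """Hypothetical in-memory view of one sioc:Post from the corpus."""
    post_id: str                           # value of sioc:id
    brand: str                             # local name of marl:describesObject
    sector: str                            # value of marl:forDomain
    polarity: str                          # local name of marl:hasPolarity
    emotions: List[str] = field(default_factory=list)  # onyx:hasEmotion values
    purchase_funnel: Optional[str] = None  # sabd:isInPurchaseFunnel
    marketing_mix: Optional[str] = None    # sabd:hasMarketingMix
    text: Optional[str] = None             # sioc:content, absent from the public files


# The sample post from the Turtle example, without its text.
post = AnnotatedPost(
    post_id="826812979421257730",
    brand="Movistar",
    sector="TELCO",
    polarity="negative",
    emotions=["hate", "dissatisfaccion"],
    purchase_funnel="postPurchase",
    marketing_mix="price",
)
print(post.brand, post.polarity, post.emotions)
```

The `text` field defaults to `None` because the downloadable files do not include the tweet texts.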
These datasets do not include the Twitter texts, for copyright reasons. The texts can be retrieved from Twitter using each post's ID.
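As a starting point for retrieving the texts, the post identifiers first have to be collected from the files. The sketch below does this with a regular expression over the Turtle text. It assumes the identifiers are serialized as `sioc:id "<digits>"`, as in the example on this page, and the function name `extract_post_ids` is hypothetical. A full RDF parser would be more robust against other serializations, and fetching the texts from Twitter is left out here.

```python
import re
from typing import List

# Assumption: every post carries a literal of the form  sioc:id "826812979421257730"
SIOC_ID = re.compile(r'sioc:id\s+"(\d+)"')


def extract_post_ids(turtle_text: str) -> List[str]:
    """Return the post IDs found in a Turtle dump, in order, without duplicates."""
    seen = set()
    ids = []
    for match in SIOC_ID.finditer(turtle_text):
        post_id = match.group(1)
        if post_id not in seen:
            seen.add(post_id)
            ids.append(post_id)
    return ids


sample = '''
sabd:826812979421257730 a sioc:Post ;
    sioc:id "826812979421257730" ;
    marl:hasPolarity marl:negative .
'''
post_ids = extract_post_ids(sample)
print(post_ids)
```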
Download
The corpus contains only sentiment tags, assigned following a set of criteria and using an ad-hoc vocabulary.
This work was presented at SPECOM17.
For copyright reasons, the text is not available for download (but requests sent to vrodriguez.AT.fi.upm.es will be considered). The annotations, however, are the work of María Navas, Víctor Rodríguez and Idafen Santana, and are freely downloadable under a CC-BY 4.0 license.
Spanish Corpus for Sentiment Analysis Towards Brands, M. Navas-Loro, V. Rodríguez-Doncel, I. Santana-Pérez, A. Sánchez, in Int. Conf. on Speech and Computer (pp. 680-689). Springer ISBN: 978-3-319-66428-6 (2017).
Alba Fernández Izquierdo has actively participated in this endeavour.