TY - CONF
T1 - A prototype deep learning paraphrase identification service for discovering information cascades in social networks
AU - Kasnesis, Panagiotis
AU - Heartfield, Ryan
AU - Toumanidis, Lazaros
AU - Liang, Xing
AU - Loukas, George
AU - Patrikakis, Charalampos
N1 - Published in: IEEE International Conference on Multimedia and Expo Workshops (ICMEW) 2020. Piscataway, U.S.: Institute of Electrical and Electronics Engineers, Inc. ISBN 9781728114859
N1 - This work was supported by the EUNOMIA project [Grant Number: 825171].
N1 - Organising Body: Institute of Electrical and Electronics Engineers (IEEE)
PY - 2020/7/7
Y1 - 2020/7/7
AB - Identifying the provenance of information posted on social media and how this information may have changed over time can be very helpful in assessing its trustworthiness. Here, we introduce a novel mechanism for discovering "post-based" information cascades, including the earliest relevant post and how its information has evolved over subsequent posts. Our prototype leverages multiple innovations in the combination of dynamic data sub-sampling and multiple natural language processing and analysis techniques, benefiting from deep learning architectures. We evaluate its performance on EMTD, a dataset that we have generated from our private experimental instance of the decentralised social network Mastodon, as well as the benchmark Microsoft Research Paraphrase Corpus, reporting no errors in sub-sampling based on clustering, and an average accuracy of 92% and F1 score of 93% for paraphrase identification.
KW - Information cascade
KW - Clustering
KW - Deep learning
KW - Paraphrase Identification
KW - Computer science and informatics
U2 - 10.1109/ICMEW46912.2020.9106044
DO - 10.1109/ICMEW46912.2020.9106044
M3 - Paper
T2 - International Conference on Multimedia and Expo Workshops (ICMEW)
Y2 - 6 July 2020 through 10 July 2020
ER -