Approach

Two main lines of technology currently address the detection of disinformation. The first, driven by the needs of fact-checkers, focuses on the processing and analysis of individual messages. The second, aimed at detecting disinformation campaigns organised to influence a social network, relies on social network analysis: highly similar behaviour of different user accounts over time is an indication of a coordinated campaign. These two lines remain separate research fields, even though each gives context to the other. In fact, current AI models for misinformation detection are limited in their ability to represent and exploit contextual information; this is the research frontier we want to address.

HAMiSoN's main breakthrough goal is the integration of technologies at both the message and network levels into a single system. Although we plan to take advantage of the hidden variable the two levels share, namely intentionality, many research questions remain open. A straightforward approach would be to run all the involved systems separately and then compare and combine their outputs. However, in that setting the systems cannot leverage each other's signals, and the current state of the art achieves rather low performance. The alternative we want to explore is what we call a "holistic" approach, in which all tasks are addressed simultaneously by one integrated system.
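
To make the contrast concrete, the sketch below shows one plausible realisation of the holistic approach as joint multi-task learning: a shared fusion layer over message-level and network-level representations, with one lightweight head per subtask, trained with a summed loss so that each task's signal shapes the common representation. The task names, dimensions, and fusion scheme are illustrative assumptions, not the project's committed design.

```python
# Illustrative sketch (PyTorch): a holistic multi-task model in which
# message-level and network-level evidence share one representation and
# all subtasks are trained jointly rather than combined post hoc.
# Task names, dimensions, and the fusion layer are assumptions.
import torch
import torch.nn as nn

class HolisticDisinformationModel(nn.Module):
    def __init__(self, tasks, text_dim=768, graph_dim=128, hidden_dim=256):
        super().__init__()
        # Fuse message-level (text) and network-level (graph) embeddings.
        self.fusion = nn.Sequential(
            nn.Linear(text_dim + graph_dim, hidden_dim),
            nn.ReLU(),
        )
        # One classification head per subtask over the shared space.
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden_dim, n_classes)
             for name, n_classes in tasks.items()})

    def forward(self, text_emb, graph_emb):
        shared = self.fusion(torch.cat([text_emb, graph_emb], dim=-1))
        return {name: head(shared) for name, head in self.heads.items()}

# Hypothetical subtasks with their number of classes.
tasks = {"check_worthiness": 2, "stance": 3,
         "coordinated_behaviour": 2, "intent": 8}
model = HolisticDisinformationModel(tasks)

text_emb = torch.randn(4, 768)    # e.g. from a sentence encoder
graph_emb = torch.randn(4, 128)   # e.g. from a graph neural network
logits = model(text_emb, graph_emb)

# Joint objective: summing per-task losses lets every task's gradient
# shape the shared representation, instead of merging separate outputs.
targets = {t: torch.randint(0, n, (4,)) for t, n in tasks.items()}
loss = sum(nn.functional.cross_entropy(logits[t], targets[t]) for t in tasks)
loss.backward()
```

The key design choice this sketch captures is that the tasks interact through a shared space during training, which is precisely what running separate systems and merging their outputs cannot provide.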



This holistic integration is an important stepping stone for modelling and detecting organised disinformation campaigns. Such campaigns consist of multiple messages and multiple related claims spread in a coordinated way through multiple pathways in social networks. We will articulate the integration of evidence from the message and network levels around the idea of disinformation intentionality: the agents who create and introduce disinformation into social media networks carefully select narratives aimed at a concrete impact, such as influencing the outcome of elections by discrediting political adversaries, manipulating financial markets, polarising and destabilising society, generating distrust, or destroying reputations. In the end, this adversarial game has agents who benefit and agents who are harmed. Our hypothesis is that, given a scenario (e.g. political elections), the set of intentions is finite (e.g. destroying an opponent's reputation), and the narratives used to pursue each intention (e.g. X has money overseas) are limited and predictable according to a general taxonomy. Once agents of disinformation have selected the narratives that serve their goal, they create sets of different messages with the same hidden intent, and finally coordinate their spreading in the social networks through troll farms.

Our methodology will start with the reverse path: gather evidence at the message and network levels and try to infer the hidden intent. Then, once the hidden intent is detected, return to the message and network levels with the aggregated evidence from all three levels and try to recover the items that the local approaches missed in the first round. The identification of harmful networks will surface new misleading messages, and the newly identified disinformation content can be collected, enabling improved identification of disinformation propagation paths in a virtuous loop, as sketched below.
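
The following sketch, in plain Python, shows the shape of this two-pass loop: local detectors run first, their evidence is aggregated to infer a hidden intent, and the inferred intent is fed back as context for a second local pass that also ingests newly surfaced messages. Every function body here is a hypothetical placeholder standing in for the project's actual components.

```python
# Illustrative sketch of the "virtuous loop": local detection, intent
# inference over the aggregated evidence, then a context-aware second
# pass. All components are hypothetical placeholders, not HAMiSoN modules.
import random

def score_message(message, intent):
    # Stand-in for message-level analysis (claim detection, stance,
    # propaganda techniques), optionally conditioned on an intent.
    base = random.random()
    return {"text": message, "score": min(1.0, base + (0.2 if intent else 0.0))}

def detect_coordination(network, intent):
    # Stand-in for network-level analysis: flag accounts whose activity
    # time series are suspiciously similar (here, simply the long ones).
    return [acct for acct, activity in network.items() if len(activity) > 2]

def infer_intent(msg_evidence, flagged_accounts):
    # Stand-in for intent inference over a finite scenario taxonomy.
    return "discredit_opponent" if flagged_accounts else None

def collect_new_messages(flagged_accounts):
    # Stand-in for crawling messages spread by newly flagged accounts.
    return [f"message from {acct}" for acct in flagged_accounts]

def virtuous_loop(messages, network, rounds=2):
    intent = None  # the first pass runs without intent context
    for _ in range(rounds):
        # Local, initially independent detectors.
        msg_evidence = [score_message(m, intent) for m in messages]
        flagged = detect_coordination(network, intent)
        # Reverse path: infer the hidden intent from aggregated evidence.
        intent = infer_intent(msg_evidence, flagged)
        # Newly flagged accounts surface new candidate messages,
        # closing the loop for the next, context-aware pass.
        messages = messages + collect_new_messages(flagged)
    return msg_evidence, flagged, intent

evidence, accounts, intent = virtuous_loop(
    ["claim A", "claim B"],
    {"acct1": [1, 2, 3, 4], "acct2": [1, 2]},
)
```

The point of the loop structure is that each pass enriches the context available to the next one: intent evidence improves the local detectors, and the detectors' new findings refine the intent estimate.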

