"Machine Learning" (ML) is a sub-field of computer science related to artificial intelligence that is concerned with the construction and study of systems that can “learn” from data. Essentially, ML involves programming a “learning machine” to extract meaning from data processed automatically - i.e. data not seen by a human analyst - after the machine has been “trained” through exposure to a learning data set.
"Natural Language Processing" (NLP) is an associated sub-field of computer science which applies the principles of ML to detect the meaning of “natural” language in a given data set.
ML and the associated NLP methods hold great potential for counterterrorism. Because they apply artificial intelligence principles, these tools can be programmed to work through massive amounts of open source data (including social media) and look for signs of interest to the counterterrorism community. Some examples are:
“Sentiment Analysis”: training a learning machine to recognize sentiments expressed in the natural language of the data set. This can be useful in the counterterrorism context, for example, to recognize “violent” messages in social media data.
“Latent Insight”: training a learning machine to use certain features of a given piece of data (in both the content and metadata) to probabilistically determine certain “latent” features about the author (such as age, gender, geographic location, ethnicity, religion, political inclination, etc.). This can be useful in the counterterrorism context, for example, to help officials profile content of interest.
Triage: using ML and NLP to sift vast amounts of data down to a volume that human analysts can review more manageably. This can be useful in the counterterrorism context as a way to spot content of interest for intelligence analysts and/or public safety officials.
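The train-then-apply idea behind these examples can be sketched in a few lines. The sketch below is illustrative only: it assumes a naive Bayes bag-of-words classifier (one common technique, not necessarily the one used in any resource listed here), and the training messages, the labels "benign" and "flagged", and the function names are all invented for the example. It trains on a tiny labelled set, then triages unseen messages so that only those classified as "flagged" are passed on for human review.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count words per label from (text, label) pairs (the "learning data set")."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def log_score(model, text, label):
    """Log prior plus Laplace-smoothed log likelihood of the text under a label."""
    word_counts, label_counts, vocab = model
    score = math.log(label_counts[label] / sum(label_counts.values()))
    total_words = sum(word_counts[label].values())
    for word in text.lower().split():
        count = word_counts[label][word]  # 0 for words never seen with this label
        score += math.log((count + 1) / (total_words + len(vocab)))
    return score


def classify(model, text):
    """Return the label with the highest score for an unseen message."""
    return max(model[1], key=lambda label: log_score(model, text, label))


def triage(model, messages, target="flagged", top_k=2):
    """Keep messages classified as `target`, most confident first."""
    def margin(text):
        others = [log_score(model, text, l) for l in model[1] if l != target]
        return log_score(model, text, target) - max(others)

    hits = [m for m in messages if classify(model, m) == target]
    return sorted(hits, key=margin, reverse=True)[:top_k]


# Invented toy data; a real system would need a large, carefully labelled corpus.
TRAINING = [
    ("the match was great fun", "benign"),
    ("lovely weather for a walk today", "benign"),
    ("we will attack them tonight", "flagged"),
    ("they deserve to be hurt", "flagged"),
]
INBOX = ["lovely walk today", "we will attack them", "fun match"]

model = train(TRAINING)
print(triage(model, INBOX))  # ['we will attack them']
```

With realistic data, the same structure would let analysts review a short ranked list rather than the full stream, although the quality of the ranking depends entirely on the training data and the chosen features.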
This paper addresses several of the key issues facing creation of a classifier for hate-speech on forums, blogs, or other areas of web discourse.
This article studies a selection of right-wing extremist (RWE) groups on Twitter. The authors examine particular language-based networks as case studies, collecting Twitter data for groups across eight countries.
This 2012 article focuses on predictive analytics and in particular crime prediction using events and data extracted from Twitter posts. The authors present a preliminary investigation of Twitter-based criminal incident prediction.
This article presents methods for identifying the recruitment activities of violent groups within extremist social media websites.
This paper examines approaches for analyzing Twitter messages to distinguish between those covering real-world events and non-event messages. To validate this work, the authors applied their process to 2.6 million Twitter messages.
This paper explores crawling global social networking platforms to uncover previously unknown radicalized individuals. To demonstrate the utility of this process, the authors collect a YouTube dataset from a group that potentially has a radicalizing agenda.
This article focuses on the use of linguistic “weak signals”—digital traces of intent—in social media as a tool of counterterrorism aimed at preventing lone-wolf attacks.
This paper presents a theoretical computerized system to detect social polarization and to estimate the related chances of violent radicalization. Existing technologies are analyzed to determine how they can be integrated into the proposed system to fulfil the authors’ objectives.
This article addresses the degree to which geolocation prediction is vital to geospatial applications like localised search and local event detection.
This article deals with a supervised machine learning text classifier, trained and tested to distinguish between hateful and/or antagonistic responses (with a focus on race, ethnicity or religion) and more general responses.
This resource contains the description and data analysis from a research project conducted by Fifth Tribe into ISIS’s use of Twitter in the aftermath of the 2015 Paris Attacks.
This resource is an informative, visual and descriptive infographic by Fifth Tribe focused on ISIS’s use of Twitter. The infographic and descriptions in the article were written following the Paris attacks in November 2015.
This 2013 article focuses on Natural Language Processing (NLP) and particularly social media monitoring of Twitter for purposes of national security.
This article focuses on challenges to law enforcement when dealing with massive amounts of data in criminal investigations.
The authors of this 2011 article hypothesize that the results of Google Trends, given its daily and weekly reports on queries related to various industries, may be correlated with the current level of economic activity in these industries.
This 2012 paper develops a novel methodology for modeling cyber-collective social movements (CSMs) from individual, community, and transnational perspectives. The authors do this by drawing on existing collective action theories and computational approaches for social network analysis.
This article examines the use of Machine Learning (ML) algorithms for predicting key nodes for targeting within terrorist organizational structures.
This paper concerns the creation of a prototype for sentiment analysis, capable of discerning key aspects of an entity under review, and the type of polarity in the response associated with it.
This is a foundational report and a seminal work in the study of social media intelligence and open source research. The paper reviews 245 papers in a semi-systematic literature review of how information and insight can be drawn from open social media sources.
This 2014 article focuses on sentiment analysis and proposes a new model for analyzing user sentiments and opinions online.
In this paper, the authors argue that despite the widespread use of social media in various domains (e.g.
This article reveals just how vital geographical location is to geospatial applications like local search and event detection. In this paper, the research team investigates and seeks to improve on the task of text-based geolocation prediction of Twitter users.
This article is a seminal piece and a foundational resource in the field of social media analytics and open source intelligence by some of the field’s leading authors.
This paper describes the methodology that the authors have developed for the collection and sampling of conversational threads, as well as the tools they have developed to identify rumour-based threads.
This article focuses on the problem of identifying influential Twitter users using Machine Learning (ML) techniques and Natural Language Processing (NLP).
This 44-page resource is the comprehensive final report of the Umati Project, which focused on monitoring hate speech online.
Copyright © 2014 - 2017 • The SecDev Foundation