Fake News Challenge

Why did you choose the stance detection task rather than the task of labeling a claim, headline or story True/False, which seems to be what the fake news problem is all about?

Answer:

There are several reasons Stance Detection makes for a good first task for the Fake News Challenge:

Our extensive discussions with journalists and fact checkers made it clear both how difficult “truth labeling” of claims really is, and that they’d rather have a reliable semi-automated tool to help them do their job better than a fully-automated system whose performance will inevitably fall far short of 100% accuracy.
Truth labeling also poses several large technical / logistical challenges for a contest like the FNC:
- There exists very little labeled training data of fake vs. real news stories.
- The data that does exist (e.g. fact checker website archives) is almost all copyright protected.
- The data that does exist is extremely diverse and unstructured, making it hard to train on.
- Any dataset containing claims with associated “truth” labels is going to be contested as biased.
Together these make the truth labeling task virtually impossible with existing AI / NLP. In fact, even people have trouble distinguishing fake news from real news.
The dataset we are using to support the Stance Detection task for FNC-1 was created by accredited journalists, making it both high quality and credible. It is also in the public domain.
Variants of the FNC-1 Stance Detection task have already been explored and proven feasible but far from trivial by Andreas Vlachos & his students from U. of Sheffield. Cite: Ferreira & Vlachos (2016) & Augenstein et al. (2016).
We considered targeting the truth labeling task for the FNC-1, but without giving teams any labeled training data. We decided against it both because we thought a competition with a more traditionally structured machine learning task would appeal to more teams, and because such an open-ended truth labeling competition, the Fast & Furious Fact Check Challenge, was recently completed.
Our discussions with human fact checkers led us to believe that a solution to the stance detection problem could form the basis of a useful tool for real-life human fact checkers. Also see the next question/answer.

OK, but what does stance detection have to do with detecting fake news?

Answer:

There are two important ways the Stance Detection task is relevant for fake news.

From our discussions with real-life fact checkers, we realized that gathering the relevant background information about a claim or news story, including all sides of the issue, is a critical initial step in a human fact checker’s job. One goal of the Fake News Challenge is to push the state-of-the-art in assisting human fact checkers, by helping them quickly gather the information they need to make their assessment.

In particular, a good Stance Detection solution would allow a human fact checker to enter a claim or headline and instantly retrieve the top articles that agree, disagree or discuss the claim/headline in question. They could then look at the arguments for and against the claim, and use their human judgment and reasoning skills to assess the validity of the claim in question. Such a tool would enable human fact checkers to be fast and effective.
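As a rough illustration of the retrieval step described above, here is a minimal sketch that groups already-classified articles by stance and keeps the most relevant ones for each. The function name, the tuple layout and the relevance scores are all assumptions made for this example; the stance and relevance values would come from a stance detection system and a search component that are not shown.

```python
from collections import defaultdict


def top_articles_by_stance(scored_articles, k=3):
    """Group articles by stance and keep the k most relevant per stance.

    scored_articles: iterable of (title, stance, relevance) tuples, where
    stance is one of "agree", "disagree", "discuss", "unrelated" and
    relevance is a hypothetical score (higher means more relevant).
    """
    groups = defaultdict(list)
    for title, stance, relevance in scored_articles:
        if stance != "unrelated":  # unrelated articles are not useful to the checker
            groups[stance].append((relevance, title))
    return {
        stance: [title for _, title in sorted(items, reverse=True)[:k]]
        for stance, items in groups.items()
    }


# Toy input: titles and scores are made up for illustration.
articles = [
    ("Outlet A debunks claim", "disagree", 0.9),
    ("Outlet B repeats claim", "agree", 0.4),
    ("Outlet C weighs the evidence", "discuss", 0.7),
    ("Outlet D sports report", "unrelated", 0.1),
    ("Outlet E also debunks claim", "disagree", 0.6),
]
result = top_articles_by_stance(articles, k=1)
```

With k=1, the fact checker would see only the single most relevant article on each side of the claim.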
It should be possible to build a prototype post-facto “truth labeling” system from a “stance detection” system. Such a system would tentatively label a claim or story as true/false based on the stances taken by various news organizations on the topic, weighted by their credibility.

For example, if several high-credibility news outlets run stories that Disagree with a claim (e.g. “Denmark Stops Issuing Travel Visas to US Citizens”) the claim would be provisionally labeled as False. Alternatively, if a highly newsworthy claim (e.g. “British Prime Minister Resigns in Disgrace”) only appears in one very low-credibility news outlet, without any mention by high-credibility sources despite its newsworthiness, the claim would be provisionally labeled as False by such a truth labeling system.

In this way, the various stances (or lack of a stance) news organizations take on a claim, as determined by an automatic stance detection system, could be combined to tentatively label the claim as True or False. While crude, this type of fully-automated approach to truth labeling could serve as a starting point for human fact checkers, e.g. to prioritize which claims are worth further investigation.
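The idea of combining stances weighted by credibility can be sketched as follows. This is only an illustration under stated assumptions: the credibility scores, the thresholds (min_credible, margin), the "newsworthy" flag and the three output labels are hypothetical choices made for this example, not part of the FNC-1 task.

```python
from dataclasses import dataclass

AGREE, DISAGREE, DISCUSS, UNRELATED = "agree", "disagree", "discuss", "unrelated"


@dataclass
class Report:
    outlet: str
    stance: str         # output of a stance detection system
    credibility: float  # assumed score in [0, 1]; how to obtain it is out of scope


def provisional_label(reports, newsworthy=False, min_credible=0.7, margin=0.5):
    """Tentatively label a claim as 'true', 'false' or 'unverified'."""
    support = sum(r.credibility for r in reports if r.stance == AGREE)
    oppose = sum(r.credibility for r in reports if r.stance == DISAGREE)
    credible_coverage = any(
        r.credibility >= min_credible and r.stance != UNRELATED for r in reports
    )
    # A newsworthy claim that no high-credibility outlet mentions is suspicious.
    if newsworthy and not credible_coverage:
        return "false"
    if oppose - support >= margin:
        return "false"
    if support - oppose >= margin:
        return "true"
    return "unverified"


# Several high-credibility outlets disagree with the claim.
visa_claim = [
    Report("Outlet A", DISAGREE, 0.9),
    Report("Outlet B", DISAGREE, 0.8),
    Report("Outlet C", AGREE, 0.2),
]
# A newsworthy claim that appears only in one low-credibility outlet.
resignation_claim = [Report("Outlet X", AGREE, 0.1)]

visa_label = provisional_label(visa_claim)
resignation_label = provisional_label(resignation_claim, newsworthy=True)
```

Both toy claims come out as "false", mirroring the two examples above. Anything that does not clear the margin stays "unverified", which keeps the final call with a human fact checker.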

Is there any prior work on the stance detection task that would be useful background?

Answer:

Yes! Here are two recent papers on stance detection tasks related to the one we’re using for FNC-1:

There is also a very good whitepaper on the state-of-the-art in automated fact checking available from the UK fact-checking organization FullFact.org.

Can you elaborate on the restriction against using auxiliary training data?

Answer:

Participants are free to use any unlabeled data (as pretrained embeddings or for manifold regularization), but no direct or indirect supervision is allowed other than the labels the Fake News Challenge provides.

Do you plan to have an auto-scoring submission system, so teams can assess their performance before the deadline?

Answer:

We will be providing an evaluation script, but other than that there will be no auto-scoring system or leaderboard.
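Teams that want to assess themselves locally could hold out part of the training data and score their predictions on it. The sketch below assumes a two-level weighted scheme: partial credit for getting the related/unrelated decision right, and further credit for the exact stance. The function name and the weights are assumptions for illustration only; the evaluation script provided by the organizers is the authoritative scorer.

```python
RELATED = {"agree", "disagree", "discuss"}


def weighted_score(gold, predicted, w_related=0.25, w_stance=0.75):
    """Score predictions against gold labels with assumed two-level weights."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    score = 0.0
    for g, p in zip(gold, predicted):
        # Level 1: was the related/unrelated decision correct?
        if (g in RELATED) == (p in RELATED):
            score += w_related
            # Level 2: for related pairs, was the exact stance correct?
            if g in RELATED and g == p:
                score += w_stance
    return score


# Toy held-out labels, made up for illustration.
gold = ["agree", "unrelated", "discuss", "disagree"]
pred = ["agree", "unrelated", "agree", "unrelated"]

achieved = weighted_score(gold, pred)
best_possible = weighted_score(gold, gold)
relative = achieved / best_possible
```

Dividing by the score of a perfect prediction gives a relative figure that is easier to compare across held-out splits of different sizes.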

Why is the window between release of the testing dataset and the end of the competition so short (48H)? What if my team can’t complete the challenge in that time window?

Answer:

We are limiting the duration of the testing phase of FNC-1 to make it extremely difficult for teams to cheat by labeling the test set manually. Given the test set size, this would be very difficult to do in the two-day window we are providing. We apologize to teams who cannot work with the timeline we’ve outlined.

Will the distribution of output labels (agree, disagree, discuss, unrelated) be the same in the test set as the training set?

Answer:

No, not necessarily. The real world doesn’t make i.i.d. assumptions :)
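One practical consequence is that it can be worth checking how sensitive a model is to a change in label proportions. The sketch below computes the label distribution of two sets of labels and the total variation distance between them. The helper names and the toy label lists are assumptions for illustration; in practice the second list might be a resampled validation split or a model's own predictions on new data.

```python
from collections import Counter

LABELS = ("agree", "disagree", "discuss", "unrelated")


def label_distribution(labels):
    """Return the fraction of each of the four stance labels."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return {label: counts[label] / len(labels) for label in LABELS}


def total_variation(p, q):
    """Total variation distance between two label distributions (0 = identical)."""
    return 0.5 * sum(abs(p[label] - q[label]) for label in LABELS)


# Toy label lists, made up for illustration.
train_labels = ["unrelated"] * 6 + ["discuss"] * 2 + ["agree"] + ["disagree"]
other_labels = ["unrelated"] * 5 + ["discuss"] * 5

train_dist = label_distribution(train_labels)
other_dist = label_distribution(other_labels)
shift = total_variation(train_dist, other_dist)
```

A large distance between the training distribution and a model's predicted distribution on new data is a hint, not proof, that the label proportions have shifted.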

How did the Fake News Challenge originate?

Answer:

Shortly after the uproar over fake news and its potential impact on the US elections, Dean Pomerleau proposed using artificial intelligence to address the problem as a casual bet / dare to his friends and colleagues in the machine learning community on Twitter. The initial idea was inspired by the fact that AI-based filtering techniques have been quite effective at conquering email spam - a problem that seems on the surface to be quite similar to fake news. Why can’t we address fake news the same way?

Dean was certainly not the first to have this idea. He quickly learned from others who joined the effort to organize the FNC that much fundamental research in AI, ML and NLP has been happening in recent years. The convergence of this groundbreaking research and the widespread recognition that fake news is an important real-world problem resulted in an explosion of interest in our efforts by volunteers, teams and the technology press. The FNC has grown dramatically since that initial bet between friends, to the point where it now includes over 100 volunteers and 72 teams from around the world. While the details of the challenge have evolved from that initial (rather naive) wager, the goal has always remained the same - foster the use of AI, machine learning and natural language processing to help solve the fake news problem.

Do you think that automated fact checking is really possible? If yes, in the next 10 years?

Answer:

The answer depends on what kind of facts/statements you want to fact check. Well-defined, narrowly scoped statements like:

“US Unemployment went up during the Obama years”

could be fact checked (debunked) automatically now with a reasonable amount of additional research.

But a statement like:

“The Russians under Putin interfered with the US Presidential Election”

won’t be possible to fact check automatically until we’ve achieved human-level artificial intelligence capable of understanding subtle and complex human interactions, and conducting investigative journalism.

That’s why round 1 of the Fake News Challenge (FNC-1) focuses on the stance detection task, which is tractable now (we think) and could serve as a useful tool for human fact checkers today if we had it.

A great source about the state-of-the-art in automated fact checking and what the future holds is this 36-page white paper from FullFact.org.

How can we hope to solve the problem of fake news when people can’t even agree on the definition of “fake news”?

Answer:

In the eyes of some, ‘fake news’ means “whatever I don’t agree with.” This is not the definition adopted for the FNC. We’ve extensively investigated the various ways credible media experts have defined ‘fake news’ and have boiled it down to what they virtually all share in common. For the purposes of the FNC, we are defining fake news as follows:

Fake News: “A completely fabricated claim or story created with an intention to deceive, often for a secondary gain.”

The “secondary gain” is most often monetary (i.e. to capture clicks and/or ‘eyeballs’ to generate ad revenue), but sometimes the secondary gain may be political.

However, several important distinctions need to be made when it comes to the definition of fake news.

First, claims made by newsworthy individuals, even demonstrably false claims, are by definition newsworthy and therefore not considered fake news for the FNC. This is opposed to fabricated claims about newsworthy individuals made by obscure sources seeking to make money and/or a political statement, which are considered fake news by our definition.

Second, our operative definition of fake news explicitly excludes humorous or satirical stories designed to entertain rather than deceive. The same goes for opinion pieces or editorials - they too are excluded from the category of fake news. To qualify for these exemptions, these types of stories must be clearly labeled as such in the story itself, and not, for example, buried somewhere else on the website where the story appears.

From a practical perspective, we guarantee none of the headlines or stories in the FNC-1 task will consist of recent controversial claims made by well-known individuals. Nor will they be humor, satire or OpEd pieces.

Now that FNC-1 is done, what’s next?

Answer:

Fake News Challenge was conceived to inspire AI researchers and practitioners to work on fact-checking related problems. We are in touch with our journalist and fact-checker colleagues to understand what other problems they encounter in their day-to-day work and how that can inform FNC-2. Stay tuned for the next challenge. If you have suggestions, please stop by our Slack and leave a comment. We would love to hear from you!