Fact Check Review Methodology



The goal of the RCP Fact Check Review project is to understand how the flagship fact-checking organizations operate in practice, from their claim and verification sourcing to their topical focus to just what even constitutes a “fact.” To answer these questions, we have created a centralized searchable database, updated weekly, that codifies key characteristics of all fact checks bearing on issues of civic and public concern published by six major fact-checking organizations: FactCheck, the New York Times, Politifact, Snopes, the Weekly Standard, and the Washington Post. These fact checkers were selected due to their outsize influence in the fact-checking landscape and the reliance of major internet platforms such as Facebook on their decisions. Each relevant fact check is recorded using a dual coder reconciliation workflow that codifies several key attributes.

We rely on a human review workflow due to the nuanced nature of some of the attributes we compile about each fact check. While some of the basic attributes could be extracted using automated tools, many of the fields are more resistant to high-quality, automated extraction. For example, we break each fact check into the discrete claims it evaluates; summarize each claim using the fact checker’s own words; separate the list of sources to associate each source with the specific claim(s) it was used to confirm or refute; assess each claim as “fact” or “opinion”; record whether the fact checker specifically notes that their determination was based on a lack of evidence or a belief that the claim is misleading; and classify each source into a type taxonomy.

Detailed Look At Coding Workflow

Each week, our two human reviewers review all six fact-checking sites and compile a list of all new fact checks published over the previous week. At the same time, they determine whether each fact check is relevant to our project or not.

This project reviews only those fact checks bearing on civic and public concern. In practice, we define this as any topic that relates to the political or social environment. Issues involving past or present public office holders or topics of national or international discussion are all included under this heading, while run-of-the-mill urban legends are not. Thus, a fact check about how many votes a U.S. senator has missed would be included, but a fact check about whether Bigfoot was sighted in Northern Virginia last week would not. If in doubt, our coders are instructed to err on the side of inclusion.

Once they have compiled the list of new fact checks and determined whether each is relevant, both reviewers send their lists to a reconciler. The reconciler merges the two lists, produces a final list of the new relevant fact checks from the previous week, and sends that list back to both reviewers. Irrelevant fact checks are recorded in our database, but no further action is taken on them.
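The merge step above can be sketched in a few lines of Python. This is only an illustration: the text does not say how the reconciler resolves disagreements, so the tie-break used here (include a fact check if either reviewer marked it relevant, echoing the "err on the side of inclusion" instruction given to coders) is an assumption, and the names `Listing` and `reconcile` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """One reviewer's entry for a newly published fact check (hypothetical type)."""
    url: str
    relevant: bool


def reconcile(list_a, list_b):
    """Merge two reviewers' weekly lists.

    Returns (final, disagreements): `final` maps each URL to a relevance
    decision, and `disagreements` lists URLs where the reviewers differed
    or where one reviewer missed the fact check entirely.
    """
    votes_a = {item.url: item.relevant for item in list_a}
    votes_b = {item.url: item.relevant for item in list_b}
    final = {}
    disagreements = []
    for url in sorted(votes_a.keys() | votes_b.keys()):
        a = votes_a.get(url)  # None means this reviewer did not list the URL
        b = votes_b.get(url)
        if a != b:
            disagreements.append(url)
        # Assumed tie-break: relevant if either reviewer said so.
        final[url] = bool(a) or bool(b)
    return final, disagreements
```

In practice a human reconciler would presumably look at each entry in `disagreements` rather than rely on an automatic rule; the sketch only shows where such cases surface.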

At this point, the actual coding process begins.

The first step is to break the fact check into the distinct individual “claims” being investigated. A given fact check may investigate a single claim or may examine multiple claims. For example, a June 2017 Politifact fact check examined a statement by Ivanka Trump, splitting it into two core claims: first, that there are six million unfilled jobs, and second, that the positions remained unfilled due in part to the “skills gap.” Fact checks with multiple claims may assign different rulings to each and rely on different sets of evidence to evaluate them. It is important that we record each claim separately, as this allows us to understand the fact-checking environment at its native resolution: evaluating individual claims, rather than lumping together all claims found in a single fact check web page.
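To make the difference between counting claims and counting pages concrete, here is a minimal sketch. The row layout, the placeholder URLs, and the function name `claims_per_fact_check` are illustrative assumptions, not the project's actual storage format.

```python
from collections import Counter

# Each row is one claim; rows from the same fact check share a URL.
# URLs and claim text are placeholders, not real records.
rows = [
    {"url": "https://example.org/fact-check-1", "claim": "first claim"},
    {"url": "https://example.org/fact-check-1", "claim": "second claim"},
    {"url": "https://example.org/fact-check-2", "claim": "only claim"},
]


def claims_per_fact_check(rows):
    """Count how many claim rows each fact check page contributes."""
    return Counter(row["url"] for row in rows)


counts = claims_per_fact_check(rows)
print(len(rows))    # 3 claims in total
print(len(counts))  # 2 fact check pages
```

Analyses run over `rows` see every claim and its own ruling, while analyses run over `counts` see only pages.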

For each individual claim we record a number of pieces of information, using the fields below:

  • URL. This is the URL of the fact check. A single fact check may contain multiple claims.
  • Date. This is the date the fact check was published. It may be in the byline at top, at the bottom, in the URL, or indicated elsewhere in the page. Left blank if not available.
  • Fact Checkers. The list of names of all of the fact checkers involved with this fact check. We include anyone listed in the byline of the fact check and anyone called out by name – this includes “researchers,” “editors,” “assistants,” etc.
  • Claim. A short, summarized version of the “claim” being evaluated by the fact checker. In some cases, the fact check provides this at the top, otherwise the reviewers identify and copy the key sentences outlining the claim.
  • Claim Fact/Opinion. For each claim, we assess whether the claim being reviewed by the fact checker is a statement of fact that can be definitively evaluated or a statement of opinion that rests in the eye of the beholder. The line between these may sometimes be blurry. If the claim’s context makes it appear to be a figurative, rather than literal, statement, we list it as Opinion. For example, a statement like “there are too many immigrants in this country” would be Opinion, while “there are 10 million immigrants” would not. The idea of this field is to see whether fact checkers are focusing on factual claims that can be definitively proven or disproven or on opinions for which there is not a definitive “right” or “wrong” answer. Note that this field only evaluates whether the claim is one of fact that can be definitively evaluated, not whether the claim was determined to be true. Thus, “Obama was the 25th president” would be listed as Fact, since it can be definitively proven or disproven, even though it is a false statement.
  • Sources of Claim. Most fact checks indicate the source(s) of the claim they are investigating: in other words, where they first encountered the claim. For example, they might list a news article, web page, blog post, social media post, etc. Here we record the URL(s) of each source the fact checker cites as the place where they found the claim. Sometimes the fact check may simply say something like “NBC Interview” and not include a URL or further detail. Sources are typically cataloged by the entity making the statement, not the medium through which the statement was published. Thus a CNN clip of a President Trump speech would be classified as Government, rather than Media, since the actual source of the statement is Donald Trump; CNN is simply the clip selected by the fact checker. However, if the source itself is the focus of the fact check, as with a doctored clip where the fact check examines whether the video had been modified, then that source would be listed. We review each source and classify it into one of several categories, including the following:
    • Business. A commercial business not covered by the other categories.
    • Campaign. A political campaign. This includes statements made by the reelection campaign of a sitting politician. If a candidate is elected to office, future statements by the official or their government office are classified as Government, while statements from their campaign staff are classified as Campaign. A sitting politician who is campaigning for reelection is still classified as Government, as they remain a government official even when campaigning. Statements made directly by a political candidate who does not hold office are listed as Candidate.
    • Fact Checker. Fact checkers may cite or review the work of other fact checkers. If a fact checker uses data from a Government, Business or other source, but substantially interprets that data into a graph, it may be cited as the Fact Checker, rather than the original source, due to the interpretation.
    • Government. A governmental source at any level. This includes statements from government agencies, sitting officials and political candidates who currently hold governmental office.
    • Media. Any news media source.
    • Nonprofit. Non-profit entities that don’t fall into one of the other categories, such as Think Tank or University. This category includes activist and advocacy groups such as the ACLU, trade unions, etc.
    • Reference. A dedicated general reference resource, such as an encyclopedia, Wikipedia, etc.
    • Think Tank. A self-identified think tank organization that focuses on providing advice and guidance on issues. For example, the Brookings Institution.
    • University. An institution of higher education. A think tank that is part of a university will typically be listed here.
  • Verdict. The fact check’s verdict regarding the claim. We also add two secondary verdict labels that we assess from the fact checker’s own language:
    • Verdict Basis: Misleading. Some fact checks conclude that the claim is factually correct, but rate it false because the fact checker believes it is “misleading.” For example, the White House might claim that government spending is increasing; the fact checker finds that this is correct, but that spending is increasing less than it otherwise might. In such a case the fact checker might list the claim as False, stating that the basis for this determination is that the claim is “misleading” or similar. In such cases we add a secondary verdict label of “Verdict Basis: Misleading” in addition to the official verdict. This label is only applied in cases where the fact-checking verdict is that the claim is false and the fact check explicitly states that the verdict is based on the claim being “misleading” or “contorted logic” or “a stretch” or similar language. In short, the reviewers are instructed not to decide on their own that a false verdict was based on a determination that the claim is misleading, but rather to apply this label only in cases where the fact checker explicitly states that the verdict is because they believe the claim is misleading.
    • Verdict Basis: Lack Evidence. Some fact checks conclude that a claim is false on the basis of the fact checker’s inability to locate evidence that could either prove or refute it. For example, a fact check might examine a claim that the White House had back channel communications with a foreign power, determine that it cannot locate evidence either for or against the claim, and therefore list it as false. Again, this label is not based on our own evaluation of whether the fact checker’s verdict rests on a lack of evidence. It is applied only in cases where a fact checker assigns a false verdict and states explicitly that this verdict is due to a lack of evidence to assess the claim further and that they erred on the side of rejecting it rather than leaving it as Undecided.
  • Sources for Verdict. This is the list of sources the fact checker relied upon to render their verdict on the claim. In essence, one can think of these as the sources of “truth” for the fact-checking community. Some fact checks provide a list of sources in an inset box on the page, but list all sources used for all claims in the fact check. For our database, we list only the sources used to evaluate each claim. Thus, a fact check which examines two separate claims and relies upon 10 sources in total, 2 for the first claim and the remaining 8 for the second claim, will have those sources properly separated in our database. All sources are classified using the same taxonomy as we use for Sources of Claims.
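The fields above map naturally onto a small record type. The following Python sketch is one possible shape for a per-claim record, not the project's actual schema: all class, field, and function names are hypothetical. The `SourceType` values are the categories named in the text (which describes its list as non-exhaustive), and `verdict_is_false` is an assumed reviewer-supplied flag, since each fact checker uses its own verdict wording.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class SourceType(Enum):
    BUSINESS = "Business"
    CAMPAIGN = "Campaign"
    CANDIDATE = "Candidate"
    FACT_CHECKER = "Fact Checker"
    GOVERNMENT = "Government"
    MEDIA = "Media"
    NONPROFIT = "Nonprofit"
    REFERENCE = "Reference"
    THINK_TANK = "Think Tank"
    UNIVERSITY = "University"


class ClaimKind(Enum):
    FACT = "Fact"
    OPINION = "Opinion"


@dataclass
class Source:
    description: str            # e.g. "NBC Interview" when no URL is given
    source_type: SourceType
    url: Optional[str] = None


@dataclass
class ClaimRecord:
    url: str                    # shared by all claims in one fact check
    claim: str
    claim_kind: ClaimKind
    verdict: str                # the fact checker's own verdict wording
    verdict_is_false: bool      # assumed flag set by the reviewer
    date: Optional[str] = None  # None when no publication date is available
    fact_checkers: List[str] = field(default_factory=list)
    sources_of_claim: List[Source] = field(default_factory=list)
    sources_for_verdict: List[Source] = field(default_factory=list)
    basis_misleading: bool = False
    basis_lack_evidence: bool = False

    def __post_init__(self) -> None:
        # Both secondary labels are described as applying only to false verdicts.
        if (self.basis_misleading or self.basis_lack_evidence) and not self.verdict_is_false:
            raise ValueError("Verdict Basis labels apply only to false verdicts")


def classify_political_source(holds_office: bool, is_campaign_staff: bool) -> SourceType:
    """Apply the Campaign / Government / Candidate rule described in the list."""
    if is_campaign_staff:
        return SourceType.CAMPAIGN
    if holds_office:
        return SourceType.GOVERNMENT
    return SourceType.CANDIDATE
```

The check in `__post_init__` covers only the mechanical half of the rule. Whether the fact checker explicitly stated the basis for a false verdict remains a human judgment that the two flags merely record.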

Once all claims have been reviewed, both reviewers submit their results to our tech team, who prepare the results to be uploaded onto the website. Any new claim or verification sources that have not been seen before are categorized into our source taxonomy and the final set of new records is then made available.
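Spotting sources that have not been seen before amounts to a lookup against the existing taxonomy. A minimal sketch follows, assuming the taxonomy is kept as a mapping from a lowercased source name to its category. That storage format, the function name `find_uncategorized`, and the sample name "Example Daily" are all hypothetical; the three taxonomy entries use categories the text itself assigns.

```python
def find_uncategorized(source_names, taxonomy):
    """Return source names, in first-seen order, that are not yet in the taxonomy.

    `taxonomy` maps a lowercased source name to its category label.
    """
    seen = set()
    pending = []
    for name in source_names:
        cleaned = name.strip()
        key = cleaned.lower()
        if key not in taxonomy and key not in seen:
            seen.add(key)
            pending.append(cleaned)
    return pending


taxonomy = {
    "brookings institution": "Think Tank",
    "aclu": "Nonprofit",
    "wikipedia": "Reference",
}
new_sources = find_uncategorized(
    ["ACLU", "Example Daily", "example daily ", "Wikipedia"], taxonomy
)
print(new_sources)  # ['Example Daily']
```

Each name in `new_sources` would then be assigned a category by a person before the week's records are published.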
