UK Newspaper Headlines

Discussion

While print circulation of newspapers in the UK continues to fall, they retain a physical presence which can be hard to avoid. You can be exposed to their front pages every time you go to a supermarket or pay for petrol, so even if you would never consider buying a particular paper you might still be familiar with its front pages. They act as mini-billboards, with the main headline being all that most people actually read.

I wanted to see if I could quantify the biases in their choice of lead headline. In particular I wanted to check whether the Express was really as obsessed with Europe as it appeared, and the Daily Mail as preoccupied with migrants. To that end it was a success.

Technical details

All the data was extracted from The Paperboy website. I didn’t have to do anything fancy like text recognition, because the front-page headlines themselves are available on the site. However, the data isn’t perfect. Particularly with broadsheets, which carry multiple headlines on their front page, the headline recorded can be a minor one rather than the main story. Furthermore, some recorded headlines just seem to be inaccurate, or at the very least refer to earlier editions. In aggregate the data isn’t too bad though, particularly for the Mail and the Express.

I was constrained by the data available. That’s why the Mirror, Sun and Times don’t feature, which is a pity as they would have made the comparison rather more interesting.

The code was written in Python, and I made use of the natural language processing library nltk.

After extracting the headlines I did the following:

  • Split the headline into words.
  • Generated their stems so that, for example, migrant and migrants, or shock and shocking, would be grouped together.
  • Removed common “stopwords” which don’t carry any meaning on their own. There are just over a hundred of them, such as the, of, but and so forth.
  • After that I aggregated to get the results.
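As a rough illustration of the steps listed above, here is a minimal sketch of the split, stem, filter and aggregate pipeline. It is not the original code: to stay self-contained it uses only the standard library, so `crude_stem` and the short `STOPWORDS` set are hypothetical stand-ins for a proper stemmer and stopword list (for instance nltk's `PorterStemmer` and `stopwords` corpus). The paper names and headlines in `SAMPLE` are made up for the example.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption for this sketch);
# a real run would use a fuller list of a hundred or so words.
STOPWORDS = {"the", "of", "but", "a", "an", "and", "in", "on",
             "to", "for", "as", "is", "at", "by"}


def crude_stem(word):
    """Very rough suffix stripper, standing in for a real stemmer."""
    for suffix in ("ing", "s"):
        # Only strip if at least three characters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def headline_stems(headline):
    """Split a headline into words, stem them, then drop stopwords."""
    words = re.findall(r"[a-z']+", headline.lower())
    stems = [crude_stem(w) for w in words]
    return [s for s in stems if s not in STOPWORDS]


def count_stems(headlines):
    """Aggregate stem counts per paper from (paper, headline) pairs."""
    counts = defaultdict(Counter)
    for paper, headline in headlines:
        counts[paper].update(headline_stems(headline))
    return counts


# Made-up headlines for hypothetical papers, purely for illustration.
SAMPLE = [
    ("Paper A", "Migrants shock at the border"),
    ("Paper A", "Shocking rise in migrant numbers"),
    ("Paper B", "Europe vote looms"),
]

counts = count_stems(SAMPLE)
for paper, counter in counts.items():
    print(paper, counter.most_common(3))
```

Because stemming happens before counting, "Migrants" and "migrant" end up under the same key, which is what makes the per-paper totals comparable.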

    2 Responses to UK Newspaper Headlines

    1. Ben says:

      “ever present physical presence”

      Very interesting Nick. I got the Mail and Express the wrong way around but apart from that was right.
