Back in December, Google News was granted its second continuation patent in ten years, giving the layman some insight into the search giant’s algorithm, and how it chooses which articles to feature.
This news was reported favorably a few months later in outlets like Computerworld and The Nation, which singled out parts of the patent that gauge a news source’s importance by how many bureaus a publication has, or what its print subscription numbers look like. But, somewhat ironically given the subject matter, these outlets missed the most important part of the story.
When the US Patent and Trademark Office issues a patent (or in this case, a continuation patent – Google News filed its first in 2003), it’s the Claims section that comes under scrutiny, because that section lays out what is actually being patented. The Description section is substantially the same in the original 2003 version and in the two continuation patents.
A continuation patent takes the filing date of the original patent, but clarifies or narrows the claims that were in the first version. So December’s Google News continuation patent essentially updated specific aspects of the original patent filed in 2003, adding more focused claims.
The Claims section of the latest version of Google News’ patent shows a shift away from the traditional signals of a news source’s importance. Now, rather than where a story is being reported, it is how a story is being reported, both online and off, that makes a difference in ranking. These signals, per the patent, include:
- How quickly an article was published after the event it covers took place,
- The “usage pattern regarding traffic associated with the source,” which means the search engine monitors traffic to a source’s articles to see how often people click (or don’t click) on links to them. The patent tells us: “Well known sites, such as CNN, tend to be preferred to less popular sites, such as Unknown Town News, which users may avoid. The traffic measured may be normalized by the number of opportunities readers had of visiting the link to avoid biasing the measure due to the ranking preferences of the news search engine.”
- How many “entities” – that is, proper nouns, or people, places and things – are mentioned in an article compared to similar articles within the same “cluster” of related articles (this promotes more original and more in-depth writing), and
- The “international diversity of the source,” which may look at the audience for the source and where that audience comes from, and/or where links to the source come from. The patent tells us that Google may measure this “international diversity” by the number of different countries that traffic to the source comes from, or by the number of different countries that link to the source. So, if traffic from more countries tends to go to the Guardian than to the New York Post, that’s one way of measuring international diversity. If links from more countries tend to point to the New York Post than to the Guardian, that could be another way to measure international diversity.
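To make these four signals a little more concrete, here is a rough sketch of how each one might be computed. The patent does not publish formulas or weights, so everything here is an assumption made for illustration: the function names, the inputs, and the choice to use simple ratios and counts are all hypothetical, and this is not Google's actual implementation.

```python
from datetime import datetime
from statistics import mean


def publication_delay_hours(event_time: datetime, publish_time: datetime) -> float:
    """Hours between an event and the article about it (hypothetical measure).

    A smaller value means the source reported the story sooner.
    """
    return (publish_time - event_time).total_seconds() / 3600


def normalized_traffic(clicks: int, impressions: int) -> float:
    """Clicks divided by the number of opportunities readers had to click.

    Dividing by impressions is one way to avoid rewarding a source simply
    because the search engine already shows its links more often.
    """
    return clicks / impressions if impressions else 0.0


def relative_entity_count(article_entities: set, cluster_entities: list) -> float:
    """Distinct entities in one article relative to the cluster average.

    `cluster_entities` is a list of entity sets, one per article in the
    same cluster of related stories (assumed to include this article).
    """
    average = mean(len(entities) for entities in cluster_entities)
    return len(article_entities) / average if average else 0.0


def international_diversity(countries: set) -> int:
    """Number of distinct countries that traffic (or links) comes from."""
    return len(countries)


# Made-up sample data, for illustration only.
delay = publication_delay_hours(datetime(2013, 1, 1, 12), datetime(2013, 1, 1, 15))
ctr = normalized_traffic(clicks=50, impressions=1000)
entity_ratio = relative_entity_count(
    {"Obama", "Congress", "Washington", "Boehner"},
    [{"Obama", "Congress"}, {"Obama", "Congress", "Washington", "Boehner"}],
)
diversity = international_diversity({"US", "GB", "IN"})
print(delay, ctr, entity_ratio, diversity)
```

How a real ranking system would combine or weight numbers like these is not something the patent spells out, so the sketch stops at computing each signal separately.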
It’s your turn now. Does the analysis above make sense to you? What has been your experience with Google News? Please share your feedback via comments. Thank you.