Google can now filter by "reading level"


This is a really neat advanced search feature just released by Google! You can now filter by “reading level,” including within posts for a certain site. How do they determine what’s considered Basic, Intermediate, and Advanced?

The feature is based primarily on statistical models we built with the help of teachers. We paid teachers to classify pages for different reading levels, and then took their classifications to build a statistical model. With this model, we can compare the words on any webpage with the words in the model to classify reading levels. We also use data from Google Scholar, since most of the articles in Scholar are advanced.
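To make that a bit more concrete, here is a minimal sketch of what "compare the words on a page with the words in the model" could look like. This is not Google's actual code. It assumes a simple naive Bayes style word-count model with equal weight for each level, and every name in it (`train`, `classify`, `SAMPLE_PAGES`) is made up for illustration. The tiny training set stands in for the teacher-labeled pages; Scholar articles would just be more examples labeled "advanced."

```python
import math
import re
from collections import Counter

LEVELS = ("basic", "intermediate", "advanced")

# Hypothetical stand-in for pages that teachers labeled by reading level.
SAMPLE_PAGES = [
    ("the cat sat on the mat", "basic"),
    ("we like to play with the dog", "basic"),
    ("the committee reviewed the budget proposal for next year", "intermediate"),
    ("phylogenetic analysis reveals conserved genomic regulatory elements", "advanced"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_pages):
    """Count how often each word appears in pages of each level."""
    counts = {level: Counter() for level in LEVELS}
    for text, level in labeled_pages:
        counts[level].update(tokenize(text))
    return counts


def classify(text, counts):
    """Return the level whose word counts best explain the page's words."""
    vocab_size = len(set().union(*counts.values()))
    words = tokenize(text)
    scores = {}
    for level, level_counts in counts.items():
        total = sum(level_counts.values())
        # Add-one smoothing keeps an unseen word from ruling out a level entirely.
        scores[level] = sum(
            math.log((level_counts[w] + 1) / (total + vocab_size)) for w in words
        )
    return max(scores, key=scores.get)


if __name__ == "__main__":
    model = train(SAMPLE_PAGES)
    print(classify("the dog sat on the mat", model))
```

A real system would need far more labeled pages and probably more than single-word counts, but the basic idea of scoring a page against each level's vocabulary is the same.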

Okay, I’m sure that’s not a perfect method, but it’s still nifty. For example, nature.com comes up very advanced:

I wonder what my blog looks like? Oh…um… well, I’m sure that’s just a result of blogging in general, right? We’re all a bit more informal around here.
…Well, PZ does a lot more reviews of scientific articles than me, I’m sure that helps his score. Surely I must be better than something like

I’m going to interpret this as “I write in a way that’s easily accessible to the general public,” rather than “I write like a goddamn moron.”*

*I should note that my ex-boyfriend pointed all of this out to me, along with this. I think this is payback for my quip about engineers being bad in bed. Internet karma, indeed.

Comments

  1. says

    It is reasons like this that I refuse to blog… that, and my blogs inevitably end up being me whining about the weather or other really uninteresting stuff to read.

  2. says

    Thanks for that quip about engineers, by the way. I found it quite amusing to forward on to an engineer friend of mine to tease her. Who then got to gripe about it to me in the pub in front of Rebecca Watson. It’s like a sentence worth of amusing story.

  3. says

    Woohoo! My site is mostly “intermediate” (85%) with the remainder being “advanced” (14%) – I’m not sure where the 1% went (it said 0% for basic)…I think I now know why I don’t get many hits on my site (oh darn)…

  4. John says

    Years ago, the word processor WordPerfect introduced a feature that supposedly would rate a piece of writing according to the level of education required to comprehend it. It rated most everything I wrote as high-school level. To test it, I created a document composed of gibberish – long strings of letters, meaningless “words” – which the software graded as the most advanced, college-level writing. It apparently did its grading based on nothing more than word length. I’m hoping Google put a little more thought into it.

  5. Ratshag says

    I feels cheated. I only got 80% “basic”. Is a poor showing fer a simple orc with a silly little blog. And how the hey did I hit 1% “advanced”?

  6. says

    I got 100% intermediate, although I’m not sure if this is really an accurate assessment of my writing. I only have a few entries up on my blog, and in them, I’ve included quotes from other people whose writing is much better than mine.

  7. says

    I got 19% basic, 57% intermediate and 23% advanced. Not too shabby for an engineer. ;-) (I wasn’t going to say anything about your comment before, but since you brought it up…)

  8. says

    First you remind us engineers that we have fewer women than any other faculty. Now we’re bad in bed? I’m losing the will to live here!

  9. chicagodyke says

    i’m always disturbed by the idea that people trust google to do, well, much of anything. they don’t even make the most useful search engine; it’s just “built in” to most browsers these days and people use it because it’s only one click away. google is a giant for-profit shareholder owned corporation with tentacles in about everything, from libraries to politics to music and culture, and is the farthest thing from a “neutral, reason-based evaluator” as can be imagined. but so often, i read stuff that proclaims, “google has it all figured out now, let’s believe them!” (not saying you’re doing that Jen, obviously you’re not) but seriously, google is taking propaganda and mind control and marketing technique to a new level. a scary new level.

  10. says

    I assume they use vocabulary as the way to measure reading levels. If your blog uses words that are commonly known, then it will fall into the basic category.

  11. says

    It’s apparently an algorithm based on teachers grading samples from Google Scholar and such. There’s a Q&A that explains on the link Jen provided.
