They like me! They really like me!


When we switched over to the new server last week, there were some glitches with comments. Marcus Ranum alerted me to the fact that some comments by long-time commenters were going straight to the spam folder and not even being queued for moderation, so I went into that folder and rescued them. While doing so, I had to read a lot of genuine spam, and two things struck me. One is the sheer volume of spam, and the other is the quirky nature of so many of the comments: their peculiar choice of words (‘fastidious’ and ‘peer’ seem to be popular for some reason) and their general lack of grammar and coherence.

Here’s a sample:

Hello my loved one! I wish to say that this post is amazing, great written and come with almost all significant infos. I would like to peer extra posts like this .

Tremendous issues here. I’m very satisfied to peer your article. Thanks a lot and I’m having a look forward to contact you. Will you please drop me a mail?

Pretty component of content. I simply stumbled upon your site and in accession capital to say that I get in fact enjoyed account your blog posts. Anyway I will be subscribing in your augment or even I success you get admission to constantly quickly.

Hi to every , for the reason that I am actually eager of reading this web site’s post to be updated on a regular basis. It carries fastidious stuff.

What i don’t understood is in reality how you’re no longer actually much more smartly-favored than you might be right now. You’re so intelligent. You understand therefore considerably when it comes to this matter, produced me individually imagine it from numerous varied angles. Its like women and men aren’t interested unless it is something to do with Lady gaga! Your personal stuffs nice. Always handle it up!

I was quite flattered by the last one and will definitely try to handle it up more in future, once I figure out what that means.

These comments are obviously generated by some bot somewhere to try and dupe spam filters, but I am at a loss as to why the bots think that these seemingly random aggregations of words and phrases have a better chance of getting through than a coherent message.

Comments

  1. Friendly says

    When I peer at these comments, I think that they must have been generated by stupid entities acting quickly (in other words, fastidiots).

  2. says

    Well, I find it rather obvious that “peer” is showing up due to something lost in translation when the proper word choice would have been “see.” I’m guessing, from that first example, “extra” should have been “more,” leading to “I would like to see more posts like this.” This would suggest English is not the primary language (and likely not even a secondary language) of those writing the bots. It therefore may be the bots (and those writing the bots) don’t even realize that these are “random aggregations of words and phrases.” Oh, and all your base are belong to us.

  3. says

    their peculiar choice of words

    They are trying to circumlocute around the obvious vocabulary of their concerns, in hope of avoiding Bayesian spam filters. Once the spam filter is trained to detect word pairs like “special offer” or “in confidence”, they have to come up with new pairings. Plan B is to use new character encodings, but the spam filters pretty quickly included visual character mappings so that various encodings that look like ‘b’ are treated like ‘b’. And the fight goes on!

  4. Chiroptera says

    I am definitely going to be using “always handle it up!” a lot more often now.

  5. Pierce R. Butler says

    I really miss the older anti-Bayesian-filter tactic of stringing together really random pieces of text.

    From my collection:

    Tory moved  Buy Lincoln Center next two party house kept playing Red Red Wine UB40 means Unemployment Benefit  & Black Finger Nails Band Kidnaping Song

    Congressman’s Corn detasseling Girls live across Ice Hockey Stadium

    inquiry. Section: Wedding
    Articles Compare Prices
    Contents courtesy
    VA. Elizabeth castle queen
    executed cycles. cycle. directly
    Splat bugs

    They just don’t write poetry like that any more…

  6. Owlmirror says

    What I suspect happened is that they started with a Markov chain bot, which was given a large corpus of positive phrases from blog comments. But then too many phrases started appearing similar enough that spam filters were able to spot them. So the next phase was to blindly substitute synonyms, possibly by using machine translations/backtranslations of the individual words. Except, of course, the synonym for one sense of a particular word isn’t the same as the sense in which the word was used in the original sentence, and there are all sorts of different connotations that are brought in or dropped by different words.

    So “peer” was “see”, of course, “fastidious” was “careful” (or maybe “thoughtful”, or maybe there was a synonym chain from “thoughtful” to “careful” to “fastidious”), “augment” was “feed” (completely wrong sense of the word!) and “produced me individually” was probably originally “made me uniquely” or something like that.

    Perhaps “accession capital” might have been “taking stock”?

    In parsing “success you get admission to constantly quickly”, I am guessing that it might have been “gain access to daily posts”, or something like that.

  7. Owlmirror says

    And one sense of “keep” is synonymous with “hold”, and one sense of “hold” is synonymous with “handle”…

  8. timberwoof says

    I have made Markov chain word-garbage generators and I have played with Google’s automated translation service. My garbage generators were never clever enough to generate anything like diagrammable strings of words (I try to be somewhat cleverer than that), but the examples here have all the usual parts of speech. The diction and word order, however, are peculiar in the same sort of way that translations through Google can be. If I loosen up on my diction and grammar tightness requirements (I work with people of many different first languages, so this is a useful skill), those odd posts do make sense. So my guess is that these were written by Larry Skywalker and his weird bear.
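The guess in comment 6 (a Markov chain generator followed by blind synonym substitution) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any actual spam bot is known to work: the tiny corpus, the synonym table, and the function names `build_chain`, `generate` and `substitute` are all invented here.

```python
import random
from collections import defaultdict


def build_chain(corpus):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, start, max_words, rng):
    """Walk the chain from `start`, picking a random successor each step."""
    words = [start]
    # Stop at the length limit or when the last word has no known successor.
    while len(words) < max_words and chain.get(words[-1]):
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)


def substitute(text, synonyms):
    """Blindly swap each word for its listed 'synonym', ignoring word sense."""
    return " ".join(synonyms.get(word, word) for word in text.split())


# Hypothetical corpus and synonym table, made up for this illustration.
CORPUS = [
    "I would like to see more posts like this",
    "I would like to keep it up",
]
SYNONYMS = {"see": "peer", "more": "extra", "keep": "handle"}

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree
    chain = build_chain(CORPUS)
    print(substitute(generate(chain, "I", 9, rng), SYNONYMS))
```

Feeding `substitute` the sentence “I would like to see more posts like this” with this table gives “I would like to peer extra posts like this”, which is the tail of the first spam sample, and “keep it up” becomes “handle it up”. Because the substitution never checks which sense of a word is meant, the wrong-sense swaps described in comment 6 fall out naturally.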
