Interesting People mailing list archives

IP: Yahoo's editing word corruption


From: Dave Farber <dave () farber net>
Date: Sun, 14 Jul 2002 11:47:00 -0400


http://www.ntk.net/2002/07/12/


                                 >> HARD NEWS <<
                                in powers of two

          Nice to see, in the midst of all these scandals, Yahoo
          turning a healthy profit. But as other companies fiddle the
          figures, Yahoo's been busy instead with fiddling its own
          users' private correspondence. In a fantastically clumsy
          attempt to prevent cross-site scripting attacks, the free
          e-mail wing of the sprawling giant has long been replacing
          complete English words in the text of HTML mail sent to its
          users. Mention "mocha" in an HTML mail to a friend with an
          @yahoo.com account, and your choice in coffee will be
          silently switched to "espresso". Talk about "free
          expression", and your recipient will think you said "free
          statement". Here's the full list of swaperoos:
          http://www.ntk.net/2002/07/12/yahoo.txt
                                  - try not to mail it to your friends

          This fiddling has been going on now for over a year
          (the ever vigilant RISKS digest noted it back in March
          2001). But because of Yahoo's underhand methods, very few
          people have spotted the turnabout - certainly far fewer than
          if Yahoo had done the sensible thing and, say, "**"'ed out
          the vowels in the word, or, God forbid, written a smarter
          parser. But the sneakier you are, the wider the damage
          spreads. The word "medieval" (since it contains the
          JavaScript function "eval") is converted in Yahoo mail to
          "medireview". Google now shows over 640 sites (and 1,150
          separate instances) of the word "medireview" being used as a
          synonym for medieval. University papers, bibliographies and
          book reviews, Indian newspaper columnists, and endless
          enthusiast sites drop it unseen into texts. People have
          begun to ask where it originally came from, and does it have
          a subtler meaning beyond "medieval"? Is Yahoo ever going to
          fix its filters? Or is it time we pushed to get the first
          regexp-obfuscated word into the Oxford English Dictionary?
          http://catless.ncl.ac.uk/Risks/21.34.html
            - does anyone still at Yahoo even know how to turn it off?
          http://www.google.com/search?q=medireview
                           - NTK now entirely filled with google links
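
The "medieval" to "medireview" corruption described above is what a plain
substring replacement would produce. The following is a minimal sketch of
that failure mode, not Yahoo's actual filter, whose implementation is not
shown here. The SWAPS table holds only the three substitutions mentioned in
the text, and the function names are made up for illustration.

```python
import re

# Only the swaps quoted in the article; the real list and the real
# filter's implementation are unknown, so this table is an assumption.
SWAPS = {"mocha": "espresso", "expression": "statement", "eval": "review"}


def naive_filter(text):
    """Replace every occurrence of a listed string, even inside other words."""
    for bad, good in SWAPS.items():
        text = text.replace(bad, good)
    return text


def whole_word_filter(text):
    """Replace a listed string only when it stands alone as a whole word."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, SWAPS)) + r")\b")
    return pattern.sub(lambda m: SWAPS[m.group(1)], text)


print(naive_filter("medieval history"))       # medireview history
print(whole_word_filter("medieval history"))  # medieval history
print(whole_word_filter("free expression"))   # free statement
```

Matching on word boundaries avoids the "medireview" damage but still
rewrites ordinary prose such as "free expression". It only isolates the
substring mistake; the smarter parser the article asks for would presumably
look at script contexts in the HTML rather than at words in the message text.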


For archives see:
http://www.interesting-people.org/archives/interesting-people/

