Full Disclosure mailing list archives
What Lexical Analysis Became in The Web-Slave New World
From: M.B.Jr. <marcio.barbado () gmail com>
Date: Tue, 7 Oct 2008 14:30:38 -0300
The point here is XSS, but rather than discussing the Internet weaknesses it exposes, this text takes aim at the poor algorithms being used to "detect" and/or avoid it.

Hazardous XSS. Hazardous low-quality XSS filtering.

These are critical times for Internet users, undoubtedly. We face negligence-oriented services with each new click. It is a contradiction to see so much effort (RFCs) being made while the only "user-friendly" (oh yeah, that expression) place the industry offers regular end users remains the same application layer, the tip of the iceberg.

But regular end users don't know that. To paraphrase Josh Homme, they just "go with the flow", victimized by a doctrine that makes them believe those practices and technologies are the only ones available, and so they form the new industry-led slave mass.

And the issue becomes more severe the moment one realizes that this commercially named "Web 2.0" and its risks disclose, more than vulnerabilities, laziness in web application programming, also known as XP or Agile methodology. Hail, Kent Beck!

One way or another, users are presented with a jungle at the highest layer, and concerns rise faster as indolent techniques are applied to XSS filtering.

So, let's discuss it.

You know Google? Well, check this out: this Google corporation states that its BETA releases represent a new, web-based BETA concept. As if its web apps weren't client-server software.

Two of its free BETA services, Google Calendar and Orkut, are discussed here, along with an eager-to-follow-bad-examples Brazilian company, Locaweb, and its paid web-based e-mail service, Locamail.

The worst case analyzed here involves using the characters "<" and ">" (without the quotation marks) to delimit some information. The way these services handle those characters can cause users' data to be lost. Readers will be able to test it easily, at least on Google's services.
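To make the worst case concrete, here is a minimal Python sketch. It is only a model of the externally observed behavior: the services' real code is not public, so the names naive_strip and escape_for_display and the regular expression are assumptions made for illustration. It contrasts a filter that deletes everything from a "<" to the next ">" with plain HTML escaping, which neutralizes the same characters without destroying the user's text.

```python
import html
import re

# Hypothetical model of the observed filter: remove every span that starts
# with "<" and ends at the next ">", including any newlines in between.
TAG_LIKE = re.compile(r"<[^>]*>")


def naive_strip(text: str) -> str:
    """Delete anything that merely looks like a tag (lossy)."""
    return TAG_LIKE.sub("", text)


def escape_for_display(text: str) -> str:
    """Escape HTML metacharacters so the text renders literally (lossless)."""
    return html.escape(text, quote=True)


if __name__ == "__main__":
    samples = [
        "<parameter>",
        "<parameter with spaces in the same line>",
        "<parameter with spaces\nand newline>",
        "usage: cp <source> <dest>",
    ]
    for sample in samples:
        stripped = naive_strip(sample)
        escaped = escape_for_display(sample)
        # The escaped form can always be turned back into the original input.
        recovered = html.unescape(escaped)
        print(repr(sample))
        print("  stripped :", repr(stripped))
        print("  escaped  :", repr(escaped))
        print("  lossless :", recovered == sample)
```

With the stripping filter, the first three samples come back as empty strings, while the escaped versions still carry every character the user typed. Escaping at output time is one common alternative, not necessarily the only adequate one.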
In contrast to the once-vulnerable Google Documents, which used to accept HTML tags, Google Calendar, Orkut and Locamail simply discard anything that might resemble a tag. Their input analysis goes like:

"Oh, did you see that less-than character and that other greater-than, ten lines below? Trim'em. Oh, wait! I just had a better idea. Delete them and all the content they enclose as well. I'm one helluva genius!"

Which is worse? A cross-site scripting attack, or an "Extreme Programming" team deploying such simplistic anti-XSS mechanisms?

Why spend time writing careful lexical analysis algorithms? Why struggle to seek out and/or forecast specific hazardous strings?

Is it laziness? Perhaps Google doesn't have the processing grid guts for it:

http://blog.managednetworks.co.uk/it-support/googles-20-petabytes/

Not yet. At least for Google, it does seem like some sort of indolence-guided programming technique.

Now, Google Calendar specifically. It has two basic views: a broad view of one's schedule and an event-specific view. The latter is where one goes to enter, say, the points to be discussed at a meeting.

Let's start with its lighter problems: incoherent functions/methods.

In there, scheduling something means creating an "Event". When creating an "Event", one is given the option to name it, like a reminder that will appear in the broad view. If the event's name ends with a semicolon, that character is simply trimmed. Hey! That's bad for a start, isn't it?

The incoherence comes with the algorithm that edits an already created "Event".

PoC-1: creating an "Event" and editing the "What" field

When creating an "Event", if one writes something in the "What:" field and ends it with a semicolon, this last character disappears as soon as the "Create Event" button is activated.
Example:

know your enemy;

becomes

know your enemy

Then, with the event already created and the semicolon lost, if one corrects (edits) it, adding the missing semicolon back in the "What" field, and saves it:

know your enemy;

there you go, incoherent XP: this time the semicolon remains intact.

Well, let's go for it. The worst case.

PoC-2: "less-than" and "greater-than" delimiting information

Let's keep playing in this very same situation. Suppose one encloses the Event's name between less-than and greater-than characters:

<know your enemy;>

This time, clicking the "Save" button sends it all to hell. All is lost.

In the event-specific view, there is also a "Description" field for associated details. Google Calendar's behavior when a user saves that sort of content in this view deserves emphasis. As soon as he clicks the "Save" button, the web app automatically switches to the broad view, stating that the user's stuff was saved:

"Your event was updated."

Everything looks just fine.

Bad Google! That is so nasty because, as a matter of fact, stuff sometimes gets lost without so much as a warning. It is a deceiving trap that makes the user confident about the integrity of information that no longer exists. And the time to act in data loss situations matters.

Put simply: if some information is placed inside inequality signs in the "Description" field, clicking the "Save" button apparently produces regular behavior. "Apparently", I wrote, because none of it will be saved. It doesn't matter whether your input is:

<parameter>

or

<parameter with spaces in the same line>

or even

<parameter with spaces and newline>

Trying to save it causes all of it (the characters and the information they enclose) to be lost.

Obs.: concerning this last PoC, the same problem applies to some fields within Orkut and Locamail.

Real-world example: suppose a Unix/Linux researcher is also a Google Calendar user (!!!).
What the hell, if he likes unstable, testing and evaluation packages, why not use third-party BETA applications?

Continuing, let's say he has scheduled a meeting for the next day to debate some man page changes he wants to propose. Even though he has the original text files backed up in his lab, he chooses to paste their content into Google Calendar's "Description" field. His drafts, as expected, contain some XML tags and some parameters enclosed by "<" and ">".

When he saves his "Event" creation and/or changes, everything looks normal, but Google Calendar simply gets rid of the tags and the parameters enclosed by less-than and greater-than characters. They all disappear without the researcher's knowledge.

The following morning, the researcher decides to save one more thing, his time, by printing the event's content at home before going straight to the meeting.

Yeah, right. Print what?

I wonder what Google would say about it:

"... That's correct, if specific characters are used in specific conditions..."

Well, what to do? Their BETA concept is different from the world's.

So, users of Google Calendar, Orkut and/or the Brazilian web-based Locamail, beware! Your worries must now reside enclosed by inequality signs.

A philosophical and profound warning of a potential catastrophe.

--
Marcio Barbado, Jr.

"And if any Agile methodology shall combine itself with BETA releases, men are screwed."
Revelations 22:22

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/