Bugtraq mailing list archives
RE: Webtrends HTTP Server %20 bug
From: Glynn Clements <glynn.clements () virgin net>
Date: Fri, 8 Jun 2001 04:51:57 +0100
Eric Hacker wrote:
Unicode is a superset of ASCII and thus all ASCII characters are Unicode. UTF8 is a way of encoding Unicode code points for transport over the internet in a restricted character set. Conveniently, UTF8 uses the same values as ASCII for ASCII representation. Above the standard ASCII 127 character representation, UTF8 uses multi-byte strings beginning with 0xC1.
No; the sequences for codes 128 to 255 begin with 0xC2 and 0xC3 (128-191 and 192-255 respectively). 0xC0 and 0xC1 indicate (illegal) overlong encodings of 0-63 and 64-127 respectively.

In general, the two-byte sequences have the (binary) form:

    110xxxxx 10xxxxxx

The range 0-127 (which must use the single-byte form instead) corresponds to:

    1100000x 10xxxxxx

Hence, any sequence beginning with 11000000 (0xC0) or 11000001 (0xC1) is illegal.

-- 
Glynn Clements <glynn.clements () virgin net>
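The bit layout described in the reply can be checked with a minimal Python sketch. The function names encode_two_byte and is_overlong_two_byte are hypothetical, chosen only for this illustration; they are not from the thread or from any library. The sketch covers only the two-byte form (code points 128 to 2047) and is not a full UTF-8 codec.

```python
def encode_two_byte(code_point):
    """Encode a code point in 128..2047 as a two-byte UTF-8 sequence."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("the two-byte form covers 128..2047 only")
    lead = 0b11000000 | (code_point >> 6)    # 110xxxxx: upper five bits
    cont = 0b10000000 | (code_point & 0x3F)  # 10xxxxxx: lower six bits
    return bytes([lead, cont])


def is_overlong_two_byte(seq):
    """True if seq is a two-byte sequence led by 0xC0 or 0xC1.

    Such a sequence would decode to a value in 0..127, which must use
    the single-byte form, so it is an illegal overlong encoding.
    """
    return (
        len(seq) == 2
        and seq[0] in (0xC0, 0xC1)
        and (seq[1] & 0xC0) == 0x80  # second byte is a continuation byte
    )


if __name__ == "__main__":
    # Codes 128-191 get lead byte 0xC2; codes 192-255 get lead byte 0xC3.
    for cp in (128, 191, 192, 255):
        print(cp, encode_two_byte(cp).hex())

    # 0xC0 0xA0 is an overlong spelling of 0x20 (space).
    print(is_overlong_two_byte(b"\xc0\xa0"))
```

For comparison, Python's own strict decoder also refuses such input: b"\xc0\xa0".decode("utf-8") raises UnicodeDecodeError rather than returning a space.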
Current thread:
- Webtrends HTTP Server %20 bug Auriemma Luigi (Jun 04)
- Re: Webtrends HTTP Server %20 bug Michael Grice (Jun 04)
- Re: Webtrends HTTP Server %20 bug H D Moore (Jun 05)
- RE: Webtrends HTTP Server %20 bug Eric Hacker (Jun 07)
- RE: Webtrends HTTP Server %20 bug Glynn Clements (Jun 08)
- Re: Webtrends HTTP Server %20 bug (UTF-8) Peter W (Jun 10)
- Re: Webtrends HTTP Server %20 bug (UTF-8) zsn (Jun 11)
- RE: Webtrends HTTP Server %20 bug Eric Hacker (Jun 07)