Why u_char* not char*

Maxim Dounin mdounin at mdounin.ru
Wed Jul 15 05:41:14 MSD 2009


Hello!

On Tue, Jul 14, 2009 at 11:56:27PM +0300, Marcus Clyne wrote:

> Hi,
>
> Why are strings in Nginx stored as u_char*'s and not char*'s pointers?   
> What's the advantage?

I'm not sure why Igor chose it, but there are at least several 
reasons to use 'unsigned char' (aka u_char) instead of 'char' 
(which may be either signed or unsigned):

- Constructs like

    u_char   map[] = { 0, 0, 0, 1, 1, ... };
    u_char  *p;

    ...

    if (map[*p]) { ... }

  work as expected for all possible character values without any 
  extra typecasting.

- Comparison works in a predictable way.  And you will get (mostly) 
  reasonable sorting on any arbitrary data even without collation 
  support.

- Overflow behaviour is undefined for signed types, and shift 
  operators are undefined (left shift) or implementation-defined 
  (right shift) for negative values.
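
To illustrate the first two points, here is a minimal sketch. It is not 
nginx code: the names high_bit_map, init_map, lookup and cmp_bytes are 
made up for the example. On a platform where plain 'char' is signed, the 
byte 0xE9 read through a 'char *' is -23, so it would index before the 
start of the table and would sort before 'a'. Read through an 
'unsigned char *' it is 233, and both operations behave as expected.

```c
#include <stddef.h>

/* Hypothetical 256-entry table: nonzero marks bytes with the high bit set. */
static unsigned char high_bit_map[256];

/* Fill the table once before use. */
static void init_map(void)
{
    int i;

    for (i = 128; i < 256; i++) {
        high_bit_map[i] = 1;
    }
}

/* *p is always in 0..255 here, so the index is always inside the table. */
static int lookup(const unsigned char *p)
{
    return high_bit_map[*p];
}

/* Compare n bytes as unsigned values; returns -1, 0 or 1. */
static int cmp_bytes(const unsigned char *a, const unsigned char *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            return a[i] < b[i] ? -1 : 1;
        }
    }

    return 0;
}
```

With this ordering 'a' (97) sorts before 0xE9 (233) regardless of how the 
compiler treats plain 'char'.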

So basically if you deal with arbitrary byte streams in some 
arbitrary way as nginx does - 'unsigned char' is the better choice.
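
A typical case where the shift problem shows up is assembling an integer 
from raw bytes. The sketch below is only an illustration (be16 is a 
hypothetical helper, not something from nginx). With signed bytes, a 
value like 0xFF would be sign-extended to -1 before the shift and the 
OR, which either corrupts the upper bits or hits undefined behaviour. 
With 'unsigned char' each byte is a plain 0..255 value.

```c
#include <stdint.h>

/* Hypothetical helper: read a 16-bit big-endian value from two bytes. */
static uint32_t be16(const unsigned char *p)
{
    /* Widen each byte to an unsigned type before shifting and combining. */
    return ((uint32_t) p[0] << 8) | (uint32_t) p[1];
}
```

The explicit widening to uint32_t is a design choice for the example: it 
keeps the whole expression in unsigned arithmetic instead of relying on 
promotion to 'int'.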

Maxim Dounin

p.s. It's a really good idea to start a new thread for unrelated 
questions.




