International characters and serving files

David Connors david at connors.com
Sun Feb 11 02:07:11 UTC 2024


On Sun, 11 Feb 2024 at 00:24, Maxim Dounin <mdounin at mdounin.ru> wrote:

> File names on Unix systems are typically stored as bytes, and it
> is the user's responsibility to interpret them according to a
> particular character set.
>
> Since nginx returns 404, it appears that you don't have a file
> whose name contains the UTF-8 bytes C3 BC: there is something
> different instead.  My best guess is that you are using Latin1
> as the charset for your terminal, and the name contains an FC byte.
> To see what is actually there, consider looking at the raw bytes in
> the file name with something like "ls | hd".
>
> Also, you can use the nginx autoindex module - it will generate a
> page with properly escaped links, so it will be possible to access
> files regardless of the charset used in the file names.
>

You were spot on, Maxim. Thank you so much. I fixed it with
"mv Aliinale-Für-Alina.pdf Aliinale-Für-Alina.pdf", where the first
name came from the shell's autocompletion and the second was the UTF-8
name pasted from WordPress. The two names render identically here but
differ in their underlying bytes.
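
For anyone hitting the same problem, here is a minimal Python sketch of
the diagnosis and fix described above. It is not taken from the thread.
The function names (describe_names, latin1_to_utf8, fix_directory) are
hypothetical, and it assumes that any name which is not valid UTF-8 was
written as Latin-1, which was the case here but may not be the case
elsewhere. describe_names plays the role of "ls | hd" and also shows
the percent-escaped form of each name, which is roughly what an escaped
link to the file would contain.

```python
import os
from urllib.parse import quote


def describe_names(directory: bytes = b".") -> list[tuple[str, str]]:
    """Return (hex dump, percent-encoded form) for each file name.

    Passing a bytes path makes os.listdir return the raw bytes of each
    name, with no character-set interpretation applied.
    """
    rows = []
    for name in sorted(os.listdir(directory)):
        rows.append((name.hex(" "), quote(name)))
    return rows


def latin1_to_utf8(name: bytes) -> bytes:
    """Re-encode a file name as UTF-8, ASSUMING it is Latin-1 if it
    is not already valid UTF-8. Valid UTF-8 (including plain ASCII)
    is returned unchanged."""
    try:
        name.decode("utf-8")
        return name
    except UnicodeDecodeError:
        return name.decode("latin-1").encode("utf-8")


def fix_directory(directory: bytes = b".") -> None:
    """Rename every non-UTF-8 name in `directory` to its UTF-8 form.

    Does not check whether the target name already exists; on Unix an
    existing file with that name would be replaced.
    """
    for name in os.listdir(directory):
        fixed = latin1_to_utf8(name)
        if fixed != name:
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, fixed))
```

With a name containing a single FC byte, latin1_to_utf8 yields the
C3 BC sequence that a UTF-8 request for the file would look for. Run
describe_names first and check the hex output before renaming anything,
since the Latin-1 assumption is only a guess for other setups.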

