Character sets on the web server
pjb at informatimago.com
Mon Sep 1 23:41:00 PDT 2003
Alexander E. Patrakov writes:
> Pascal J.Bourguignon wrote:
> > Alexander E. Patrakov writes:
> >> I tried to enforce koi8-r by printing this requirement (and others) on
> >> paper and distributing this letter, but everyone (including my boss)
> >> violates that and uses cp1251 because Notepad in Windows has no support
> >> for koi8-r and MS Word has no drop-down list to select the character set
> >> of the exported document. I told him to install Aditor, he didn't.
> > What about converting the files on the server?
> I am afraid that I have no complete understanding of your words. Either you
> mean (1) "Let's store documents only in koi8-r, but automate the process of
> conversion from cp1251", or (2) "Let's store two copies of each document,
> one for humans and one (converted) for htdig".
> Variant (1) has a drawback that when a user views a page in MSIE and then
> selects "View HTML Source" from the menu, the result in Notepad is
> Variant (2) is probably impossible since htdig is a bot that makes requests
> to the actual web server.
Variant (3): Let the user use whatever encoding they want, but when
they upload their edited copy of the document, they do it in a place
that is not directly used by the web server. Then a daemon takes these
whatever-encoding copies and converts them to the official koi8-r
encoding for the web server and the ht://Dig engine.
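The conversion step of such a daemon could be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and the encoding guess is a simple heuristic (Russian text is mostly lowercase, and koi8-r and cp1251 roughly swap letter case when one is misread as the other), not a reliable detector.

```python
def guess_cyrillic_encoding(data: bytes) -> str:
    """Guess whether data is koi8-r or cp1251 (hypothetical helper).

    Heuristic: decode with both candidates and keep the one that yields
    more lowercase Cyrillic letters. Ties (e.g. pure ASCII) go to koi8-r.
    """
    candidates = ("koi8_r", "cp1251")

    def lowercase_count(encoding: str) -> int:
        text = data.decode(encoding, errors="replace")
        return sum(1 for ch in text
                   if ch.islower() and "\u0400" <= ch <= "\u04ff")

    return max(candidates, key=lowercase_count)


def to_koi8r(data: bytes) -> bytes:
    """Return data re-encoded as koi8-r, leaving koi8-r input untouched.

    Raises UnicodeError if a character cannot be converted, so that
    problem files are reported instead of being silently damaged.
    """
    source = guess_cyrillic_encoding(data)
    if source == "koi8_r":
        return data
    return data.decode(source).encode("koi8_r")
```

A daemon would call to_koi8r() on each uploaded file and write the result into the directory the web server actually serves.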
I could suggest using another web browser, but the users should
either keep their own "original" copy of the documents to be further
edited, or you could set up another system. For example, you could
have these documents in a CVS server, and CVS can be configured to
process the documents on check-in or check-out so you could do a
different conversion for each user.
Alternatively, if you insist on the users being able to edit
documents fetched from an HTTP server, then you could have two servers
(virtual servers): one www.example.com (or koi8-r.www.example.com) and
one cp1251.www.example.com. Then your users could browse
cp1251.www.example.com and be happy, and you would have a daemon that
would convert the encoding from one web virtual server to the other,
and have you and the public be happy with the standard encoding.
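As a rough sketch of the daemon between the two virtual servers, assuming each one is served from its own directory tree and that only *.html files need converting (the function name, the directory layout and the file pattern are all assumptions made for the example):

```python
from pathlib import Path


def mirror_tree(src_root: Path, dst_root: Path,
                src_enc: str = "cp1251", dst_enc: str = "koi8_r") -> list:
    """Copy every *.html file under src_root into dst_root, re-encoded.

    Files whose converted copy is at least as new as the source are
    skipped, so the function can be run repeatedly (e.g. from cron).
    Returns the list of destination paths that were (re)written.
    """
    converted = []
    for src in sorted(src_root.rglob("*.html")):
        dst = dst_root / src.relative_to(src_root)
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue  # converted copy is up to date
        dst.parent.mkdir(parents=True, exist_ok=True)
        # Strict decoding/encoding: an unconvertible character raises
        # UnicodeError instead of being dropped silently.
        text = src.read_bytes().decode(src_enc)
        dst.write_bytes(text.encode(dst_enc))
        converted.append(dst)
    return converted
```

Running it with the cp1251 tree as src_root and the koi8-r tree as dst_root would keep the public server in the standard encoding while the users keep editing in theirs.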
> Anyway, at least on other sites people like to insert &#xx; for opening and
> closing "French" quotes, and the codes for them are different in koi8-r and
> cp1251. Also there are some unconvertible characters in both directions
> (like em-dash), so the conversion script must be rather elaborate to find
> these cases.
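One possible way to handle the characters that exist in cp1251 but not in koi8-r (the em dash and the angle quotes among them) is sketched below. It assumes the documents are HTML, so that a numeric character reference is an acceptable replacement; the function names are made up for the example.

```python
def unconvertible_chars(text: str, encoding: str = "koi8_r") -> list:
    """List, in order of first appearance, the characters of text
    that cannot be represented in the given encoding."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


def cp1251_to_koi8r_html(data: bytes) -> bytes:
    """Convert cp1251 bytes to koi8-r for use in HTML.

    Characters missing from koi8-r become numeric character references
    (e.g. the em dash, U+2014, becomes &#8212;) via Python's standard
    "xmlcharrefreplace" error handler.
    """
    text = data.decode("cp1251")
    return text.encode("koi8_r", errors="xmlcharrefreplace")
```

The first function can be used to report problem files; the second only makes sense for HTML, since plain-text files would end up with literal "&#...;" sequences.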
> BTW, if I could adapt to having the documents stored in cp1251, there would
> be no problem. The incorrect sorting order in PHP can be fixed by setting
> LC_ALL=ru_RU.cp1251. But still, if I ssh from home (Linux, koi8-r) to the
> server, my local encoding is koi8-r, and fonts expect koi8-r. The same
> applies to everyone using PuTTY.
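The sorting problem can be illustrated with a deliberately simplified model: the same cp1251 bytes are sorted once under the wrong charset assumption and once under the right one. Real locale collation uses collation tables rather than plain code point order, and the word list is invented for the example, so this only shows why the charset part of the locale matters.

```python
# Words as they would be stored in cp1251 documents (sample data).
words = ["вишня", "арбуз", "банан"]
stored = [w.encode("cp1251") for w in words]


def sort_assuming(byte_strings, assumed_encoding):
    """Sort byte strings by the text they represent under the
    assumed encoding (simplified stand-in for locale-aware sorting)."""
    return sorted(byte_strings, key=lambda b: b.decode(assumed_encoding))


# Treating the cp1251 bytes as koi8-r maps each letter to a different
# one, so the resulting order is not alphabetical.
wrong = [b.decode("cp1251") for b in sort_assuming(stored, "koi8_r")]

# With the matching charset the order comes out as expected.
right = [b.decode("cp1251") for b in sort_assuming(stored, "cp1251")]

print(wrong)  # ['банан', 'вишня', 'арбуз']
print(right)  # ['арбуз', 'банан', 'вишня']
```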
> Something internal still tells me that it is very wrong to use Linux with
> such windowsish people.
That's the reverse: it's very wrong to use MS-Windows and MS software
Do not adjust your mind, there is a fault in reality.
More information about the lfs-chat