[OT] Re: Commit 6803 contains invalid char on commit log

Alexander E. Patrakov patrakov at ums.usu.ru
Sat Oct 1 21:10:19 PDT 2005

Anderson Lizardo wrote:

> I understand, but in this case "svn log --xml" should have output "ü" as
> some strange character (maybe not even a valid UTF-8 sequence) and not
> just keep it in "plain" ISO-8859-1 (which is what happens), right?

Almost correct. "svn log --xml" should output \xc3 \xbc in the log. What 
happens is that it dumps the invalid log as-is, because it assumes that 
the log is well-formed. "svn commit" should have automatically converted 
\xfc to \xc3 \xbc or refused to commit at all with the "invalid 
character" message.
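
To illustrate the conversion meant here, a small sketch using iconv. This 
is only an illustration of the byte-level result, not what Subversion 
runs internally, and the function name latin1_to_utf8_hex is made up for 
the example. It converts the single ISO-8859-1 byte \xfc to UTF-8 and 
prints the result in hex. Octal escapes are used because not every 
printf understands \x.

```shell
# Convert an ISO-8859-1 byte stream on stdin to UTF-8 and print it as hex.
# (Hypothetical helper name; iconv, od and tr are standard tools.)
latin1_to_utf8_hex() {
    iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n'
}

# \374 is octal for 0xfc, i.e. "ü" in ISO-8859-1.
printf '\374' | latin1_to_utf8_hex; echo    # prints c3bc
```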

> >>Notice that the XML header says the content is in utf-8, but the
> >>content itself contains the "ü" char (the &uuml; HTML entity). That
> >>confuses the XML::Parser module used on the script. As a workaround, I
> >>had to force the script to always interpret its input as ISO-8859-1.
> >Is it possible to undo the hack now? It would prevent "normal"
> >non-ASCII log messages from working properly.
> I'd suggest we keep the hack for now while the real issue (Subversion
> accepting invalid input on the commit message) is fixed. Otherwise we
> would risk the website script breaking again...

But that already prevented me from adding example Unicode characters (to 
demonstrate what scim-input-pad is useful for) to the log of r813 in the 
livecd repository.

> BTW, does anyone know whether it's possible to change the commit message
> of a specific revision on Subversion? That's because even when we fix
> the issue, that invalid char will still be there...

I suggest dumping and restoring. With my toy repository at 
file:///home/patrakov/svn-test, the sequence of actions is:

1. Reproduce my toy repository:

svnadmin create /home/patrakov/svn-test
svn co file:///home/patrakov/svn-test working-copy
cd working-copy
echo 123 >file1
svn add file1
svn commit -m "First commit"
echo 456 >>file1
svn commit -m "Second commit: edit ths message" # should be ths -> this
echo 789 >>file1
svn commit -m "Third commit, to make things complicated"

2. Tell the lists not to commit anything.

3. Dump the repository:

svnadmin dump /home/patrakov/svn-test >dump

4. Edit the dump. Look for:

V 31
Second commit: edit ths message

The number after "V" is the number of bytes in the message. Replace 
"ths" with "this", and add 1 to 31 because one byte was added. Result:

V 32
Second commit: edit this message
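
The byte counting in step 4 can be checked, and the edit itself 
automated, with something like the sketch below. The helper names 
byte_len and fix_log are made up for this example. It assumes a 
single-line log message without backslashes, and a dump small and 
textual enough to pass through awk safely. The dump also carries 
Prop-content-length and Content-length headers for each revision; 
whether those need the same adjustment is not covered above, so check 
that before trusting the result on a real repository.

```shell
# Number of bytes (not characters) in a string, as used on the "V <n>" line.
byte_len() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# fix_log OLD NEW: read a dump on stdin, write the edited dump to stdout.
# A "V <len(OLD)>" line immediately followed by OLD is replaced with
# "V <len(NEW)>" followed by NEW; everything else passes through unchanged.
fix_log() {
    awk -v old="$1" -v new="$2" \
        -v oldlen="$(byte_len "$1")" -v newlen="$(byte_len "$2")" '
        have_held {
            have_held = 0
            if ($0 == old) { print "V " newlen; print new; next }
            print held                  # not our message: emit the held line
        }
        $0 == ("V " oldlen) { held = $0; have_held = 1; next }
        { print }
        END { if (have_held) print held }
    '
}

byte_len "Second commit: edit ths message"     # prints 31
byte_len "Second commit: edit this message"    # prints 32

# Try it on just the two lines from step 4:
printf 'V 31\nSecond commit: edit ths message\n' |
    fix_log "Second commit: edit ths message" "Second commit: edit this message"
```

On the real dump this would be used as "fix_log OLD NEW <dump >dump.new", 
keeping the original dump around until the load has been verified.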

5. Remove the original repository or, better, move it away to some safe 
place.

6. Recreate the repository and restore it from the edited dump:

svnadmin create /home/patrakov/svn-test
svnadmin load /home/patrakov/svn-test <dump

7. Verify that the issue is fixed:

svn log -r2 file1
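
For the case that started this thread (an invalid byte rather than a 
typo), the same verification step could also check that the log output 
is well-formed UTF-8. A minimal sketch, assuming iconv is installed and 
that it rejects malformed input with a non-zero exit status (glibc's 
does); check_utf8 is a made-up name:

```shell
# Read stdin and report whether it is well-formed UTF-8.
check_utf8() {
    if iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
        echo "valid UTF-8"
    else
        echo "invalid UTF-8"
        return 1
    fi
}

# Intended use against the toy repository (needs svn, so left commented out):
#   svn log --xml file:///home/patrakov/svn-test | check_utf8

printf 'f\303\274r\n' | check_utf8        # 0xc3 0xbc: well-formed
printf 'f\374r\n' | check_utf8 || true    # lone 0xfc, as in the broken log
```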

8. Tell the lists that commits are allowed again.

I understand that with fsfs this sequence of actions may be suboptimal.

Alexander E. Patrakov

More information about the lfs-dev mailing list