[cvsnt] Problems with text/binary *.rtf files

Flávio Etrusco flavio.etrusco at gmail.com
Mon Sep 25 23:09:32 BST 2006


On 9/25/06, Thomas Muller <ttm at online.no> wrote:
> |  > I have some problems with *.rtf files - when I add them as text they
> |  > become corrupt - MS Word fails to read the files. If I try to force them
> |  > into CVS as binary (-kb), they are still showing up as encoding 'text' and
> |  > MS Word still can't read them.

What do you mean by "force them into cvs"?
Once you've imported them as text you can't fix them by just
changing the expansion mode (actually, maybe you can if you check out
with Unix LF line endings before you commit a second revision...).

You can 'update -k+L' it, overwrite it with a good copy and then 'commit
-f'; but you'd be better off just deleting the file from the repository...


> |  rtf files are text files, not binary files.
>
> I know (hence my statement below "insert as text (which rtf in fact is)"),

I'm not quite sure about this. TortoiseCVS added '*.rtf' to its
default list of binary file extensions quite some time ago when
somebody argued it's not really plain-text "compatible".
I thought it could use non-DOS line breaks, which got mangled on checkout.
Then I read this on MSDN:

"A carriage return (character value 13) or linefeed (character value
10) will be treated as a \par control if the character is preceded by
a backslash. You must include the backslash; otherwise, RTF ignores
the control word. (You may also want to insert a
carriage-return/linefeed pair without backslashes at least every 255
characters for better text transmission over communication lines.)"
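
If that passage is right, converting CRLF to LF alone should not change
what a 7-bit RTF file means. Here is a rough Python sketch of the rule as
quoted; the function name is made up for illustration, and it only models
the newline handling, not a real RTF parser:

```python
def normalize_rtf_newlines(data: bytes) -> bytes:
    """Apply the quoted rule: backslash + CR or LF acts as \\par,
    any other CR or LF is ignored. Hypothetical helper, not a parser."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b == 0x5C and i + 1 < len(data):  # backslash with a following byte
            nxt = data[i + 1]
            if nxt in (0x0D, 0x0A):
                out += b"\\par "             # escaped CR/LF behaves like \par
            else:
                out += data[i:i + 2]         # keep \\, \{, control words as-is
            i += 2
        elif b in (0x0D, 0x0A):
            i += 1                           # bare CR/LF is ignored
        else:
            out.append(b)
            i += 1
    return bytes(out)


dos = b"{\\rtf1 Hello\\\r\nWorld\r\n}"
unix = dos.replace(b"\r\n", b"\n")
# Both variants reduce to the same content under the quoted rule.
print(normalize_rtf_newlines(dos) == normalize_rtf_newlines(unix))
```

So for plain 7-bit RTF the text-mode line-ending translation looks harmless,
at least as far as this rule goes.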

I was about to scrap my reply when I read this:

"Unicode RTF

Word 2000 is a Unicode-enabled application. Text is handled using the
16-bit Unicode character encoding scheme. Expressing this text in RTF
requires a new mechanism, because until this release (version 1.6),
RTF has only handled 7-bit characters directly and 8-bit characters
encoded as hexadecimal. The Unicode mechanism described here can be
applied to any RTF destination or body text."

Are you sure Word isn't saving the file with 16-bit chars (UTF-16,
UCS-2, whatever)?
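
One quick way to check is to look at the raw bytes of a known-good copy.
This is only a heuristic sketch in Python (the function name and the
4096-byte window are my own choices): it looks for a UTF-16 byte order
mark, or for NUL bytes, which a 7-bit RTF file should not contain.

```python
import codecs


def looks_like_16bit_text(data: bytes) -> bool:
    """Heuristic only: True if the bytes look like UTF-16/UCS-2 text."""
    # A UTF-16 byte order mark at the start is the clearest sign.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return True
    # Without a BOM, ASCII-range text stored as 16-bit chars is full of NULs.
    return b"\x00" in data[:4096]


print(looks_like_16bit_text("{\\rtf1 hi}".encode("utf-16")))  # BOM present
print(looks_like_16bit_text(b"{\\rtf1 hi}"))                  # plain 7-bit RTF
```

To try it on a real file, read it in binary mode, e.g.
open("good-copy.rtf", "rb").read() with whatever your file is called. If it
reports 16-bit text, the line-ending translation in text mode would explain
the corruption.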

Regards,
Flávio

