[cvsnt] Re: check CVS repository integrity

Tony Hoyle tony.hoyle at march-hare.com
Fri May 19 18:24:46 BST 2006


Michael Wojcik wrote:
> I believe that's highly dubious.  CVS has to rewrite the whole file for
> most operations, since the RCS file format is plain-text (and

Actually for the most part it's simply a fast copy routine.  It's designed to 
be *very* fast.  At no time is the entire file or anything resembling it in 
memory.
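
To illustrate what "never the entire file in memory" means, here is a minimal sketch of a chunked copy loop. It is not CVSNT's actual routine; the name `stream_copy` and the 64 KiB buffer size are assumptions chosen for illustration.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical buffer size, not a CVSNT setting


def stream_copy(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst one chunk at a time; return the number of bytes copied.

    Only one chunk is held in memory at any moment, so memory use does not
    grow with the size of the file being copied.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty read means end of input
            break
        dst.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 200_000)
    target = io.BytesIO()
    copied = stream_copy(source, target, chunk_size=4096)
    print(copied, target.getvalue() == source.getvalue())
```

The same loop works on real files opened in binary mode; the in-memory streams are only there to keep the example self-contained.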

Checksumming the individual revisions won't work because of keyword expansion. 
You'd have to find some way of computing a checksum that took expansion into 
account but didn't involve processing every revision twice. Again, revision 
generation is a fast single-pass operation, and adding linear processing to 
verify a checksum would murder the performance.
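
To make the keyword problem concrete, here is a rough sketch of a checksum that collapses expanded keywords back to their bare form before hashing. The keyword list, the regular expression, the choice of SHA-1, and the name `normalized_checksum` are all assumptions for illustration, not anything CVSNT does. Note that the normalization is exactly the kind of extra pass over each revision that costs performance.

```python
import hashlib
import re

# Hypothetical keyword list for illustration; not CVSNT's actual set.
# $Log$ is left out because its expansion spans multiple lines.
KEYWORDS = ("Id", "Revision", "Date", "Author", "Header",
            "Source", "RCSfile", "State", "Locker", "Name")

# Matches an expanded keyword such as "$Id: foo.c,v 1.2 ... $".
_EXPANDED = re.compile(
    rb"\$(" + b"|".join(k.encode() for k in KEYWORDS) + rb"):[^$\n]*\$"
)


def normalized_checksum(data: bytes) -> str:
    """Hash data after collapsing expanded keywords to "$Keyword$"."""
    collapsed = _EXPANDED.sub(rb"$\1$", data)  # the extra pass over the data
    return hashlib.sha1(collapsed).hexdigest()


if __name__ == "__main__":
    a = b"/* $Id: foo.c,v 1.2 2006/05/19 tony Exp $ */\n"
    b = b"/* $Id: foo.c,v 1.3 2006/05/20 mike Exp $ */\n"
    # The raw hashes differ, the normalized ones agree.
    print(hashlib.sha1(a).hexdigest() == hashlib.sha1(b).hexdigest())
    print(normalized_checksum(a) == normalized_checksum(b))
```

Two checkouts of the same content with different expanded values hash differently as raw bytes but identically once normalized, which is the property a per-revision checksum would need.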

The only revisions you could checksum would be binary ones, and even then 
you'd need a size threshold - the calculation is very CPU intensive compared 
to everything else.

Really it's not worth it.  The only thing that could corrupt an RCS file is 
actual hardware failure - and then you routinely recover from backups anyway. 
I wouldn't trust a file stored on such a device, checksummed or not.

Tony


