[cvsnt] Ann: XML schema for cvs log output

Tue Apr 15 17:25:55 BST 2003

On Tue, 15 Apr 2003 18:11:42 +0200, "Bernhard Weichel"
<from-support.cvsnt at b-weichel.7to.de> wrote:

>One of the major problems is the extraction of the descriptions.
>So the production rules below will fail if a user comment starts with
>"--" or with "==".

..shouldn't be a problem.  There's always at least 1 line of comment followed
by a line of '-------' which is fixed length (30 characters IIRC).  The last
line of the log is a line of '===' which is also fixed length (78 I think).
If your parser is doing a bit of lookahead it should be able to make sure that
the line of '---' is followed by a 'revision ' line.  It's still possible to
fool it, but you'd have to be trying to deliberately break it in that case.
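
The lookahead idea can be sketched roughly as follows.  This is only an
illustration, not how cvsnt or ViewCVS does it: the function name
split_revisions is made up, and the patterns assume that a separator is any
line made up solely of 20 or more '-' or '=' characters, so the exact lengths
don't have to be known.  The body of each revision still contains the
'date: ...' line followed by the comment.

```python
import re

# Assumption: separators are lines consisting only of '-' or '=' characters.
# The exact lengths are not relied on; 20 is an arbitrary lower bound.
DASHES = re.compile(r"^-{20,}$")
EQUALS = re.compile(r"^={20,}$")
REVISION = re.compile(r"^revision (\d+(?:\.\d+)+)")


def split_revisions(lines):
    """Return a list of (revision, body_lines) tuples.

    A dashed line only starts a new revision when the NEXT line looks like
    'revision x.y'; otherwise it is kept as part of the current comment.
    """
    revisions = []
    current = None
    i = 0
    n = len(lines)
    while i < n:
        line = lines[i]
        nxt = lines[i + 1] if i + 1 < n else ""
        match = REVISION.match(nxt)
        if DASHES.match(line) and match:
            # One line of lookahead confirmed a real separator.
            current = (match.group(1), [])
            revisions.append(current)
            i += 2  # skip the separator and the 'revision' line
            continue
        if EQUALS.match(line) and (i + 1 == n or not nxt.strip()):
            # End-of-log marker: last line, or followed by a blank line.
            current = None
            i += 1
            continue
        if current is not None:
            current[1].append(line)
        i += 1
    return revisions


sample = [
    "head: 1.2",
    "description:",
    "-" * 28,
    "revision 1.2",
    "date: 2003/04/03 16:41:58;  author: tmh;  state: Exp;  lines: +1 -0",
    "-- a comment that starts with dashes",
    "-" * 28,  # part of the comment: not followed by a 'revision ' line
    "still the same comment",
    "-" * 28,
    "revision 1.1",
    "date: 2003/04/01 10:00:00;  author: tmh;  state: Exp;",
    "== initial",
    "=" * 77,
]

revs = split_revisions(sample)
```

As noted above, a comment that contains a dashed line immediately followed by
a line starting with 'revision ' would still fool this.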

If you know any Python look at how ViewCVS does this - it parses the log
output (from RCS, but they're basically the same) and puts it into local
structures.  Mostly it just uses PCRE for this.

Really you want to parse to a tag per line as in:

<Log RcsFile="f:/data/projects/cvs/develop/test2.txt,v"  
        WorkingFile="test2.txt" Head="1.57" Branch="" Locks="Strict"
        DefaultKopt="kv" TotalRevisions="66" SelectedRevisions="66"
        Description="">
  <AccessList />
  <SymbolicNames>
    <Tag Name="r_7431_Build_13" Revision="1.33" />
  </SymbolicNames>
  <LogRevision Revision="1.33.8.1" Date="2003/04/03 16:41:58" Author="tmh"
                        State="Exp" Lines="+1 -0" Kopt="kv"
                        Commitid="1f63e8c66ad0000" >
    Test file
  </LogRevision>
</Log>

That way you aren't using too many tags, which would only add cruft and slow
down your parsing.
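
For what it's worth, producing that shape from already-parsed data is short in
Python.  The following is a minimal sketch using the standard
xml.etree.ElementTree module; the helper name revision_element is made up for
illustration, and only a few of the attributes from the sample above are
filled in.  Letting the library build the output also means that comments
containing '<' or '&' get escaped for you.

```python
import xml.etree.ElementTree as ET


def revision_element(parent, attrs, comment):
    """Append one <LogRevision> element: metadata as attributes, comment as text."""
    element = ET.SubElement(parent, "LogRevision", attrs)
    element.text = comment
    return element


# Hypothetical, hand-filled data; a real tool would take it from the parsed log.
log = ET.Element("Log", {"WorkingFile": "test2.txt", "Head": "1.57"})
revision_element(
    log,
    {"Revision": "1.33.8.1", "Author": "tmh", "State": "Exp", "Lines": "+1 -0"},
    "Test file <with> & special characters",
)

xml_text = ET.tostring(log, encoding="unicode")
```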

Of course 'log' is the easy one...  The output from 'cvs commit' can be a lot
more varied - for that you'd really want to be tapping the client/server
protocol so you can get the entry data as it is generated.

Tony