[cvsnt] Re: Migrating to new hardware

Mon Feb 6 12:59:20 GMT 2006

> From: cvsnt-bounces at cvsnt.org 
> [mailto:cvsnt-bounces at cvsnt.org] On Behalf Of Bo Berglund
> Sent: Sunday, 05 February, 2006 09:53
> 
> On Sun, 5 Feb 2006 15:00:54 +0100, "Marcel Stör"
> <marcel at frightanic.com> wrote:
> 
> >>> - can we expect to gain a performance boost if using a server with
> >>> multiple CPUs?
> >>
> >> I think I remember that Tony has stated that CVSNT does not benefit
> >> from dual processors. OTOH I have a test server with a dual core
> >> Pentium processor, which mostly appears as a dual CPU. It works well

I wouldn't expect SMP to pose any problems for CVSNT, any more than it does for most applications.  So "work[ing] well" is to be expected.  However (and I assume this is what Bo meant), that says nothing about whether it *benefits* from SMP.

> >...which means that we might give it a shot.

Considering how common SMP systems are these days, they're difficult to avoid if you need a high-capacity machine for your CVS server for other reasons (e.g. you want a big, fast RAID array).

> >Well, I look at it pragmatically: would a dual-CPU based CVSNT server be
> >able to process more requests/operations than the same system with a single
> >CPU? Your answer suggests "yes, it would".
> 
> Well, that is not really what I said. I just stated that in my PC
> CVSNT works well with a dualcore Pentium. But I have no clue as to its
> ability to divide the load between the cores, maybe it is just using
> one core processor and the other does other tasks as controlled by
> Windows? As I said I don't know...

Indeed.  I suspect under typical use CVSNT would rarely benefit noticeably from multiple CPUs.

Consider that most CVS activity is I/O-bound.  Very little that CVS does is CPU-bound.  Computing diffs takes some CPU, as does parsing the RCS file, but those are also both heavily I/O-oriented activities.

I've just taken a quick look at the sources for diff in 2.0.51d (which is what I happen to have checked out here).  It does appear to read entire files at once and then hash their lines sequentially.  For large files that could mean more than a single timeslice of pure CPU work between I/O operations (barring paging), so there could possibly be some CPU contention.  But that wouldn't apply to most CVS operations other than diff.
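
To illustrate the read-everything-then-hash pattern, here's a minimal Python sketch.  It is not the CVSNT diff code; the function name and the choice of CRC-32 as the line hash are my own assumptions, and the "files" are in-memory byte strings so the example is self-contained.  The point is only that once the data is in memory, the hashing loop is pure CPU work with no I/O in between.

```python
import zlib

def hash_lines(data):
    """Return one hash per line; identical lines get identical hashes.

    Hypothetical helper: CRC-32 stands in for whatever hash diff really uses.
    This loop does no I/O, so for a large input it is purely CPU-bound.
    """
    return [zlib.crc32(line) for line in data.splitlines()]

# In a real run each of these would come from a single read of a whole file.
old = b"alpha\nbeta\ngamma\n"
new = b"alpha\nBETA\ngamma\n"

old_hashes = hash_lines(old)
new_hashes = hash_lines(new)

# Cheap first pass: compare hashes position by position (1-based line numbers).
differing = [n for n, (h1, h2) in enumerate(zip(old_hashes, new_hashes), start=1)
             if h1 != h2]
print(differing)
```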

Also remember that two CVS operations can only be done in parallel if they have no locking conflicts.  So in general, of N pending CVS operations, the ones that can be performed in parallel are those with no lock targets in common.  If most of your CVS clients are trying to operate on the same CVS files, they're going to be serialized by locking anyway.
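
A rough way to picture the locking constraint, as a Python sketch.  Everything here is hypothetical (the function, the operation names, the directory names), and it treats every lock as exclusive, which is a simplification: real CVS distinguishes read locks from write locks, so concurrent readers conflict less than this suggests.

```python
def parallel_batches(ops):
    """Greedily group operations so no two in a batch share a lock target.

    ops: list of (name, set_of_lock_targets), in arrival order.
    Returns a list of batches; operations in one batch could run in parallel,
    while the batches themselves run one after another.
    """
    batches = []  # each entry: (names in the batch, targets locked by the batch)
    for name, targets in ops:
        for names, locked in batches:
            if locked.isdisjoint(targets):
                names.append(name)
                locked |= targets  # in-place update of the batch's lock set
                break
        else:
            # Conflicts with every existing batch, so it has to wait its turn.
            batches.append(([name], set(targets)))
    return [names for names, _ in batches]

# Hypothetical workload: three of the four operations touch proj/src.
ops = [
    ("alice: commit", {"proj/src"}),
    ("bob: update", {"proj/src"}),
    ("carol: commit", {"proj/doc"}),
    ("dave: tag", {"proj/src", "proj/doc"}),
]
for batch in parallel_batches(ops):
    print(batch)
```

With this workload only alice and carol end up in the same batch; bob and dave each wait for a later one, so a second CPU would have something to do in just one of the three rounds.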

One place where I can see CVS benefiting indirectly from SMP is if clients have an encrypted network connection directly to the CVS server (e.g. over IPsec, without offloading the crypto).  Then the network stack will be burning CPU cycles encrypting and decrypting traffic, and having more cores will help with that.  Again, though, that's not the typical situation.

For similar reasons, if you typically run CVS with the "-z9" flag to compress data over the network, you might see some CPU contention there, particularly with large files.

In my case, for example, when I'm running against our corporate repository (rather than the one I keep locally for my own files), the vast majority of the time for most CVS operations is spent transferring data over the network, even with "-z9".  (Server-side operations like "rtag" are an exception, of course.)  Because I'm using compression I might have slightly less server-side work time with more CPU power available there, but any benefit would likely be negligible compared to the network latency.
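
If you want a feel for that trade-off on your own data, something like the following Python sketch compares the CPU time spent compressing with the time needed to ship the result.  The assumptions are mine: that "-z9" behaves like zlib level 9, a made-up link speed of 1,000,000 bytes per second, and a synthetic, highly repetitive sample that compresses far better than real source files would.  Treat the numbers as illustrative only.

```python
import time
import zlib

def compress_cost(data, level=9):
    """Return (CPU seconds spent compressing, compressed size in bytes)."""
    start = time.process_time()
    packed = zlib.compress(data, level)
    cpu_seconds = time.process_time() - start
    return cpu_seconds, len(packed)

def transfer_seconds(nbytes, bytes_per_second):
    """Idealized transfer time: ignores latency and protocol overhead."""
    return nbytes / bytes_per_second

# Synthetic stand-in for a large text file (assumption, not real repository data).
sample = b"int main(void) { return 0; }\n" * 200_000
link_speed = 1_000_000  # assumed bytes per second

cpu_seconds, packed_size = compress_cost(sample)
print("raw bytes:        ", len(sample))
print("compressed bytes: ", packed_size)
print("compress CPU (s): ", round(cpu_seconds, 4))
print("send raw (s):     ", round(transfer_seconds(len(sample), link_speed), 4))
print("send packed (s):  ", round(transfer_seconds(packed_size, link_speed), 4))
```

On a slow link the transfer figures should dwarf the compression time, which is the situation I described; on a fast LAN the compression CPU time becomes a larger share.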

I'd emphasize fast I/O on the server, and plenty of RAM (particularly if you have a lot of users and/or large files), rather than CPU power.

-- 
Michael Wojcik
Principal Software Systems Developer, Micro Focus