[cvsnt] Distributed Teams and Multisite

Rahul Bhargava coderobo at gmail.com
Tue Jan 25 15:46:46 GMT 2005


WANdisco (http://www.wandisco.com)
has just released a multisite solution for
CVS repositories accessed using the standard CVS protocol.
It works with any backend CVS server (including CVSNT)
as long as the client speaks the standard protocol.

Here is a short description of the approach:

If you look at existing solutions there is a preponderance
of two patterns:

1. Branching at each site - which is akin to partitioning
your "data-set" (in this case, the files in the CM system) to avoid
collisions as updates arrive at local sites. ClearCase
Multisite is an example.

2. Centralized approaches whereby all transactions are sent
to a single master.  CVSup is an example of this.

Both these approaches have been around for a while. They
address part of the problem but do not offer a seamless,
non-intrusive way for the CM admin to deal with the distributed
development paradigm.

In particular -

- Branching-based solutions punt the difficult, often impossible,
problem of conflict resolution back into user-land - it is far from
transparent to the end user. Automatic merging can help to an extent,
but there will always be conflicts that the user has to manually resolve.

- In master/slave solutions, the passive server and associated resources
(hardware & software) sit like a rock gathering dust, waiting for
disaster/downtime to happen. They are primarily used for backup.
Resource utilization is poor. Existing solutions tend to be
unidirectional. Often, manual steps are required to have the primary
catch up with the secondary after a failure. Distance limitations exist
when synchronous replication is desired.

What is needed is a completely new way of looking at the problem and
addressing it in a manner that ensures:

1. Write-Anywhere - Also known as Active-Active. Replicas should be able
to process updates concurrently without propagating updates to a master
or a primary node first.

2. WAN Scale - Offshore development is forcing enterprises to access
their CM servers over a WAN. The solution needs to scale on a wide area
network. It should deal with all the pitfalls of a long-haul network
connection.

3. Performance - Typical CM clients should perceive significant
performance improvement. The overall responsiveness of a replicated
solution should be far better than a vanilla deployment of CM.

4. One-Copy Equivalence - The system of CM replicas should be
functionally identical to a single copy of a CM repository.

5. Reliability -
* The consistency of the CM Repository AND
* One-Copy Equivalence should not be compromised regardless of arbitrary
failures in the network, or arbitrary crashes of CM clients, or the CM
server itself.

6. Availability - CM server access should remain available in the
presence of all the above failures. In particular, there should be no
single point of failure. In the event of network segmentation, the
split-brain problem should not occur.

7. Transparency - The CM client should be unaware that it is using a
replicated CM server. In particular, no changes to the CM client should
be required to prepare it to use a replicated CM server.
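
To make the split-brain requirement in point 6 concrete: one generic way
to rule it out is a strict-majority rule, where a group of replicas may
accept writes only if it can reach more than half of all replicas. The
sketch below is an illustration of that general idea only. It is not a
description of how our Replicator works, and the site names and function
names are made up for the example.

```python
def can_accept_writes(reachable, total):
    """True if this side of a partition holds a strict majority of replicas."""
    return reachable > total // 2


def writable_partitions(partitions):
    """Given a list of disjoint sets of replicas (the sides of a network
    partition), return the sides that may still accept writes."""
    total = sum(len(side) for side in partitions)
    return [side for side in partitions if can_accept_writes(len(side), total)]


# Hypothetical five-site deployment split 3/2 by a network failure.
sides = [{"london", "boston", "tokyo"}, {"bangalore", "sydney"}]
print(writable_partitions(sides))

# An even 2/2 split leaves no side with a majority, so nobody writes.
print(writable_partitions([{"a", "b"}, {"c", "d"}]))
```

Because two disjoint groups cannot both hold more than half of the
replicas, at most one side ever accepts writes, which is the property
that keeps the copies from diverging.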

If you look at our approach (http://www.wandisco.com/cvs) you can see we
avoid all the pitfalls of centralization, master/slave or branching. This
is made possible by our patent-pending distributed coordination
technology.

Right now we have applied our technology to CVS.

From the CM user's perspective, the benefits include:

1. Local Reads

CVS commands like checkout, update, status, log are read commands. They
always pass through the Replicator to the local CVS server with no WAN
traffic.

CVS users typically do a lot of repository browsing, be it checkouts or
looking at change logs. Replicator always executes CVS read commands
locally ensuring low latencies for the end user. For an offshore site,
this is a huge win.
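
As a rough illustration of the read/write split described above, a proxy
in front of the CVS server could classify each command and keep the read
commands local. This is a minimal sketch under stated assumptions: the
function name and the "local"/"replicated" labels are invented for the
example, and treating every unlisted command as a write is a conservative
choice made here, not something taken from the product.

```python
# Read commands named in this post; they never need to leave the site.
READ_COMMANDS = {"checkout", "update", "status", "log"}


def route(command):
    """Return 'local' for read commands and 'replicated' for anything else.

    Unknown commands are treated as writes so that a command is never
    mistakenly kept local when it might change the repository.
    """
    return "local" if command in READ_COMMANDS else "replicated"


for cmd in ("checkout", "log", "commit", "tag"):
    print(cmd, "->", route(cmd))
```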

2. Rotating Quorum

Quorum is a minimal set of CVS nodes that must be available for a write
command. This amazing feature makes it possible to implement a
follow-the-sun development model. Offsite developers, even when they
execute CVS write commands like check-in, tag, or branch, do not see WAN
latencies! It's as if they have a local autonomous CVS server.
Patent-pending Rotating Quorum technology ensures that repositories
remain in sync.
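
This post does not spell out how Rotating Quorum works internally, so
the following is only one possible reading of the follow-the-sun idea:
the set of nodes that must be available for a write changes with the
time of day, so the site that is currently at work can form the quorum
without crossing the WAN. The schedule, the site names, and both
functions are hypothetical.

```python
# Hypothetical schedule: (start hour, end hour, quorum nodes), hours in UTC.
QUORUM_SCHEDULE = [
    (0, 8, {"tokyo"}),
    (8, 16, {"london"}),
    (16, 24, {"san_jose"}),
]


def quorum_at(utc_hour):
    """Return the set of nodes forming the quorum at the given UTC hour."""
    for start, end, nodes in QUORUM_SCHEDULE:
        if start <= utc_hour < end:
            return nodes
    raise ValueError("utc_hour must be in the range 0..23")


def write_allowed(utc_hour, reachable):
    """A write may proceed only if every quorum node is reachable."""
    return quorum_at(utc_hour) <= set(reachable)


# At 09:00 UTC the quorum sits in London, so a London write needs no WAN hop.
print(write_allowed(9, {"london"}))
# Tokyo alone cannot complete a write at that hour in this sketch.
print(write_allowed(9, {"tokyo"}))
```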

3. Multisite, Active-Active Architecture

Replicator is the only game in town that lets you set up
"write-anywhere" active CVS replicas without using half-way solutions
like branching and merging.

Load on any given CVS server is reduced, as read commands can be scaled
to multiple CVS servers. This provides better responsiveness to the end
CVS user.

There is no upper limit on how many active replicas you can set up. You
could set up one for each R&D site.

4. Reduced TCP Chatter

The CVS protocol is text-based and verbose. A typical write command like
check-in incurs at least 4.5 network round trips. Our Replicators talk
via an optimized binary protocol.

We reduce the WAN traffic to 1 round trip. That's how long a CVS client
may have to wait to complete a write with our Replicator. Of course with
Rotating Quorum the CVS client's WAN latency can be reduced to 0!!
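
To put numbers on the round-trip figures above, here is a small
back-of-the-envelope calculation. The 200 ms WAN round-trip time is an
assumed value chosen purely for illustration, and the function name is
made up; only the 4.5 and 1 round-trip counts come from this post.

```python
def wan_wait_ms(round_trips, rtt_ms):
    """Time a client spends waiting on the WAN for one write command."""
    return round_trips * rtt_ms


RTT_MS = 200  # assumed WAN round-trip time, for illustration only

plain_cvs = wan_wait_ms(4.5, RTT_MS)   # at least 4.5 round trips per check-in
replicated = wan_wait_ms(1, RTT_MS)    # 1 round trip between Replicators

print("plain CVS over WAN: ", plain_cvs, "ms")
print("with Replicator:    ", replicated, "ms")
print("saved per check-in: ", plain_cvs - replicated, "ms")
```

The saving scales linearly with the round-trip time, so the longer the
link, the larger the gain per write.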


Regards,
Rahul Bhargava
CTO, WANdisco
http://www.wandisco.com



