Introduction to Source Code
Version Control using CVS

Overview

An important problem in the development and maintenance of large programs is version control.  This is the task of organizing the files for a software system consisting of many parts, each of which can have different versions and may be worked on by several people in geographically diverse locations.  There are many tools that assist with this task.  On Unix systems the most common are SCCS (Source Code Control System), RCS (Revision Control System) and CVS (Concurrent Versions System).  These tools help manage revisions of program code, documentation, and test data by automating the storing, retrieval, logging and identification of revisions.

Even in the small programs (20,000 and fewer lines) that you'll be writing while you're at RIT, revision control can be valuable.  As one example of how the effort of using CVS to maintain your code often pays off for a student, suppose you have a program partially completed, and have verified through testing that the parts you've implemented are all working correctly.  You then add another feature to the program.  This feature modifies a large number of files and suddenly previous functionality no longer works.  If your code were under version control, you could return to the earlier working version of the program with a single command and start from that point again.  You could also produce a listing of the differences showing what changes you made to the working version, which might help you locate the problem.  Without version control, you would have to rely on your memory to find the changes that broke the program and manually remove them.  This is a very time-consuming and error-prone process that is readily avoided by using an appropriate tool, such as CVS.  It is not uncommon for a student to be working on a project and realize that the current approach will not work and several hours of coding must be undone.  CVS can help if you use it regularly.  Your motto when using CVS should be: Commit often.  You'll understand what this means when you read through the rest of this document.

Introduction

This document provides an introduction to the use of CVS.  It will highlight its usage by an individual programmer rather than a team.  One advantage that CVS has over other source code management systems is that it easily works over the network.  This allows you to maintain a single repository in your Unix account for all your code and have access to it from any machine on the Internet.

There are many CVS features not described here.  Most aspects associated with team usage of CVS have been left out.  More information on team usage or other features not described here can be found from the links at the end of this document.

Getting Started with CVS

Preparatory steps

CVS stores the files it controls in a directory called a repository.  You can set up a separate repository for each project, lab, or program, but it is probably better to create a single repository in your Unix account.  In this repository you will keep all your source code separated into different areas, known as projects in CVS.  A good name for the directory would be repository and a good place for it would be in your home directory.  Create this directory now.  In the rest of this document we assume that it is called repository.
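As a minimal sketch, the directory could be created as shown here.  The chmod step is an assumption of this example rather than something CVS requires; it simply keeps other users out of the directory from the start.

```shell
# Create the repository directory in your home directory.
# -p makes the command harmless if the directory already exists.
mkdir -p "$HOME/repository"

# Assumption of this sketch: restrict the directory to its owner.
chmod 700 "$HOME/repository"

# Show the result so you can confirm the name and permissions.
ls -ld "$HOME/repository"
```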

You need to configure CVS so that it creates files accessible only by you.  CVS is meant for use in a collaborative environment and by default it creates files that are readable by everyone.  You need to change this by adding the following line to your .cshrc file:

setenv CVSUMASK 077

You can also save yourself from having to specify your repository each time you run CVS by defining the environment variable CVSROOT.  Assuming that you used repository as the directory name, add something similar to the following in your .cshrc file:

setenv CVSROOT ~login-id/repository

where login-id is your Unix login.

After you have edited and saved your changes to your .cshrc file, log out and back in to make sure that the environment changes take effect.
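If your login shell is a Bourne-style shell such as sh or bash rather than csh, the same two settings might be written as follows.  This goes beyond what this document covers: the startup file to edit would then be something like .profile rather than .cshrc, and $HOME/repository is assumed to be the same location as ~login-id/repository.

```shell
# Bourne-style equivalents of the two setenv lines (an assumption of this
# sketch; the rest of this document uses csh syntax).
export CVSUMASK=077
export CVSROOT="$HOME/repository"

# Print both settings to confirm they are visible to new commands.
echo "CVSUMASK=$CVSUMASK"
echo "CVSROOT=$CVSROOT"
```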

Now initialize the repository with the command:

cvs init

or if you did not set the CVSROOT variable:

cvs -d ~login-id/repository init

In all CVS operations you can use the -d option to specify the CVSROOT or override the environment variable setting.  For the rest of this document, we assume that the CVSROOT environment variable is correctly set.

cd into your repository directory and you will notice that CVS created an administrative directory CVSROOT.  You rarely will need to deal with this directory.  Right now though you do need to change permissions on the files in this directory.  For some reason, CVS ignores the CVSUMASK setting when creating these files.  From within your repository directory execute the following command:

chmod -R go-rwx CVSROOT

The information in the CVSROOT files really does not tell an intruder very much about the files in your repository but this guarantees that nothing is divulged.

You are now ready to use CVS to control all your source code.

CVS terminology

There are three terms associated with directories that you hear or read when using CVS.  You will need to keep in mind the distinction between these three items:

CVS repository root
It is recommended that you have one of these to hold all of your source code.  You can define the CVSROOT environment variable to point here or use the -d option with any CVS operation to specify the location of the repository.
 
CVS project name
There is one of these for each "project" under CVS control.  Each project should have only one of these.  The word project here does not mean only a course project.  Each of your labs will also be considered a "project" in CVS terminology.
 
Project or CVS working directory
There will be at least one of these for each project.  This is the directory in which you do all of your development work.  You might have several for one project.  For example, you will most likely have one somewhere in your department Unix account and you could also have one on a machine in your room so you can do development there.  When you work on a team project each person on the team will have a working directory.

Starting a CVS Project

CVS keeps track of your code in units known as projects.  It will work best if you place each of your lab or project assignments in a separate CVS project directory.  Each project directory resides somewhere in the CVS repository.  The project name matches the name of the directory, under the repository root, where the project's files are stored.

Starting in an empty directory

You will want to use CVS to control all of your source code for your class work.  To facilitate that you should get in the habit of creating a CVS working directory as soon as you create a new directory for another assignment.  It is much easier to control what is put into CVS if you start immediately when the assignment directory is created.

In the empty directory execute the following command:

cvs import -m "comment" project-name login-id initial

The command line options specify the following:

import
Perform a CVS import operation.

-m "comment"
Use this as the identification message for this import.  It can be multiple words long if quoted.

project-name
Name of the CVS project.  It is recommended that you separate out "projects" by course and then assignment, such as se362/project1 or se441/transit.  The name can have directory components like the two examples shown here.  The files are stored in the repository in a directory hierarchy with the same names.

login-id
This is the vendor tag, which identifies where the imported files came from.  You should always use your login-id for this.

initial
This is the release tag.  You must provide some value but, for now, you can ignore exactly what purpose it serves.

Executing the import command creates the project in the repository.  Next you must transform your current directory into a CVS working copy of this project.  Execute the following command to accomplish this:

cvs checkout -d . project-name

The command line options specify the following:

checkout
Perform a CVS checkout operation.

-d .
Normally the checkout creates a directory in the current directory with the same name as the project, and all the checked-out files are placed in that directory.  If your current directory location matches the project name hierarchy, you could cd up to the correct directory and do the checkout without this option.  This option specifies that you want the current directory to be the working directory for this project.  Note that this is different from the -d option that overrides a setting of CVSROOT.  That option must appear on the command line before the CVS operation name.

project-name
This should match the name of the project you used when the import was executed.

When the check-out is completed you will see that a CVS directory was created in your working directory.  This CVS directory distinguishes the directory as a working directory.  After the first checkout, you no longer need to specify the project name or the name of the working directory.  This information is obtained from the files in the CVS directory in your working directory.
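The import and checkout steps described above can be combined into a small shell function.  This is only a sketch for a Bourne-style shell: start_project is a hypothetical name, not a CVS command, and it assumes that CVSROOT is set and that the LOGNAME environment variable holds your login-id.

```shell
# start_project is a hypothetical helper, not part of CVS.
# usage: start_project project-name "import comment"
start_project() {
    project=$1
    comment=$2
    # Create the project in the repository from the current directory.
    # LOGNAME is assumed to hold your login-id.
    cvs import -m "$comment" "$project" "$LOGNAME" initial || return 1
    # Turn the current directory into a working copy of the new project.
    cvs checkout -d . "$project"
}
```

From an empty assignment directory you would then run, for example, start_project se362/project1 "Initial import".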

As you create new project files in your working directory you can add them to the CVS repository.  Refer to "Adding new files" for the procedure.

Starting in a directory with files

If you did not make the CVS working directory initially and you now want to check in a new project, you will have an extra step to perform.  When you do the import operation as specified above, CVS will automatically add everything in the current directory and any subdirectories.  This includes class files, executables, image files and a lot of other things that you might not want in the CVS repository.  The power of the import operation comes at the expense of CVS possibly going wild and importing a large number of files you do not want imported.  CVS does automatically ignore some files during import and you can specify other files to ignore.  For more information on importing files refer to one of the resources below.
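One way to specify additional files to ignore is a .cvsignore file in the directory being imported, which CVS consults as it traverses directories.  The sketch below is illustrative only: the patterns are assumptions about what a project might generate, and you should check the result against the CVS documentation before relying on it.

```shell
# Write example ignore patterns, one per line.  The patterns are
# illustrative; adjust them to the files your own project generates.
cat > .cvsignore <<'EOF'
*.class
*.png
a.out
EOF

# Before importing, list every regular file under the current directory
# so you can see what an import would otherwise sweep up.
find . -type f | sort
```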

After you have imported the files you still do not have a working copy of the files.  You have to manually do a checkout but a problem exists.  CVS will not let you overwrite an existing file during the checkout.  CVS will generate an error message specifying that you have to move the original files out of the way before you can create the working copy.  This is the second reason why it is much easier to create a new project starting with an empty directory.  The approach to take would be to rename the directory you are in and then do the checkout to create a clean working copy of the files that you just imported.  If all looks correct you can delete the old copy.

An alternative approach is to create a new empty directory and do the initial import and checkout in that empty directory.  Then copy all the files into this directory, which is now a CVS working directory, and manually add them to the CVS project.

cvs_newproj: a helper script

There is a helper script that you can use to create a new project.  The script is named cvs_newproj and is located in xxx.  This script correctly handles the creation of a project if the project starts in an empty directory or one that has files in it.  The program performs the import and checkout operations.  If files are present in the current directory, they are moved to a temporary directory while the program executes the import and checkout operations.  After completion of these CVS operations, cvs_newproj moves the files back to the current directory and deletes the temporary directory.  You can now add to the repository whatever files you desire.

Execute cvs_newproj in the directory you would like to hold your working copy.  The syntax for the command is:

cvs_newproj [-d repository] project-name

The -d option is required to specify the location of your repository unless you have defined the environment variable CVSROOT to point to your CVS repository.

Working with CVS

Adding new files

New files can be added to a CVS project in two steps.  First, you use the add operation to tell CVS about the new files.  The command would be

cvs add files

where files are all of the new files to be added to the project.  This step does not actually add the files.  It prepares CVS to get the files with the next commit operation, which is the second step.
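The two steps can be sketched as a single shell function.  The name add_and_commit is hypothetical, and the -m option of cvs commit is used here to supply the log message on the command line instead of in an editor.

```shell
# add_and_commit is a hypothetical helper, not part of CVS.
# usage: add_and_commit "log message" file...
add_and_commit() {
    message=$1
    shift
    # Step 1: tell CVS about the new files.
    cvs add "$@" || return 1
    # Step 2: commit them so they actually reach the repository.
    cvs commit -m "$message" "$@"
}
```

For example, add_and_commit "Add parser sources" Parser.java Lexer.java would schedule both files for addition and then commit them with one log message.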

Committing a project

The commit operation puts the modifications that you made to any files back under CVS control.  Until you do this step any changes you made are not reflected in the repository including files you added or removed.  The syntax for the commit is

cvs commit [files]

If a file list is specified only those files are committed.  If you omit a file list CVS will commit changes to all files in the current directory and, by default, will recursively commit files in all CVS working subdirectories.  During the commit CVS brings up an editor window to allow you to enter a description of the commit.  This will happen for each file unless you specify that the same log message be used for all files committed.  If you perform check-ins on a regular basis after completing small incremental additions then a common log message for all files is appropriate.

Even if you are working on an individual project it is worth doing a status check or update before attempting to do a commit.  This will ensure that your local working copy is up-to-date with any modifications you committed from another working copy.  In fact, CVS performs a status check before doing the commit.  If any files fail the status check because there is a newer version in the repository that has not been checked-out or merged into the working copy CVS will abort the commit.
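A minimal sketch of the update-then-commit habit follows.  The function name commit_after_update is hypothetical, and the sketch assumes you are inside a CVS working directory.

```shell
# commit_after_update is a hypothetical helper, not part of CVS.
# usage: commit_after_update "log message"
commit_after_update() {
    # Bring the working copy up to date first; stop if the update fails.
    cvs update || return 1
    # Then put your own changes into the repository.
    cvs commit -m "$1"
}
```

If the update reports that it merged repository changes into files you had modified, it is worth running the update by hand and reviewing those files before committing, rather than using a helper like this one.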

If you have uncommitted changes that are sitting on your machine at home and you want to work from the department now, you are out of luck.  Only changes committed to the repository can be retrieved from any machine on the Internet.  Remember the motto: commit often.

Checking out and updating a project

Checkout and update are similar operations.  Both commands will update the local working copy by

  1. leaving untouched the files that have not been modified,

  2. overwriting/patching files modified in the repository that you have not modified, and

  3. attempting to merge changes in the repository with local modifications you have made to files.

The primary difference between the two operations is that checkout will create the local working copy directories if they do not exist.  Usually you will use checkout to get the initial copy of a project.  This command requires you to name the project that you want to check out.  Subsequent synching with the repository will be done with update, which looks in the CVS directory for information about the project.  Remember that update does not put any of your modifications into the repository.  You must do that with a subsequent commit.

When executing the operation you can specify individual files to check out or update.  If no files are specified CVS performs the operation on all files in the project and, by default, it recursively checks out or updates files in all subdirectories of the project.

Checking project status

With only you as the single developer, status is not as important as when you are working on a team project.  However, if you work on campus and on a machine at home the status operation will be useful to know when you may need to update the local set of files to match the latest version in the repository.  You could also just do the checkout to get the latest set of files and bypass the status completely.  If all you are interested in is the status of each file, the following command will be useful:

cvs status | grep Status

The following are the various status values a file can have: 

Status                       Meaning
Up-to-date                   Repository and working copy match.
Locally Modified             Local uncommitted modifications to the working copy are the only changes since this file was checked-out.
Locally Added                Local working copy has been added but not committed.
Locally Removed              Local working copy has been removed but not committed.
Needs Checkout               Local working copy will be overwritten with a newer version during checkout or update.
Needs Patch                  Local working copy will be patched to bring it up-to-date during checkout or update.
Needs Merge                  Modified local working copy and an updated repository copy must be merged.
File had conflicts on merge  Merge of local changes and updated repository copy was attempted and conflicts resulted.  In a team environment, the programmer doing the commit should resolve any conflicts.
Unknown                      CVS repository knows nothing about this file.  It may be a new file that has not been added to the project yet.
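To see at a glance how many files are in each state, the output of the grep command shown earlier can be tallied.  The following is a sketch for a Bourne-style shell; summarize_status is a hypothetical helper name, and it assumes each status line contains the text "Status: " followed by one of the values listed above.

```shell
# summarize_status is a hypothetical helper, not part of CVS.
# It reads cvs status output on standard input and prints one line
# per status value, preceded by the number of files in that state.
summarize_status() {
    grep 'Status:' | sed 's/.*Status: *//' | sort | uniq -c
}

# Usage, from inside a working directory:
#   cvs status 2>/dev/null | summarize_status
```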

Other operations

There are many more features in CVS that have not been covered in this introduction.  For information about these other features see the references in For More Information.

Access from Outside the Department

If you want to access your repository from a machine outside the SE department, you must do it using ssh.  This mechanism provides secure transmission of data to and from the SE department machine.  Setting up CVS for this is rather straightforward.  Two environment variables must be set on the remote machine where cvs is run as a client.  The mechanism for setting an environment variable depends on the operating environment you are using.  The information below gives the name of each variable and the value to which it should be set.

Variable   Value
CVS_RSH    ssh
CVSROOT    :ext:login-id@linus.se.rit.edu:/full-path-to-repository

The first value specifies that ssh should be used as the connection mechanism.  The second specifies that the repository is on an external machine.  Your login-id is given and normal password validation occurs for authentication.  The first time you connect to the machine using ssh you may be asked to accept the signature for the host you are connecting to.  Accept the offered signature unless for some reason you think the system is compromised.  After setting these environment variables on your remote system, all of the CVS commands operate just as they do when you are working with a local repository.  You can find more information about using CVS with ssh on the WinCVS web pages.
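On a remote machine with a Bourne-style shell such as sh or bash, the two variables might be set as sketched here.  The shell syntax is an assumption about your environment; with csh you would use setenv as elsewhere in this document.  The login-id and /full-path-to-repository parts are placeholders that you must replace with your own values.

```shell
# Use ssh as the connection mechanism.
export CVS_RSH=ssh

# Placeholder values: substitute your own login-id and the full path
# to your repository on the department machine.
export CVSROOT=":ext:login-id@linus.se.rit.edu:/full-path-to-repository"
```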

Using CVS for your Course Work

You are required to place all the source code you develop for your courses under version control.  CVS provides repository access to any machine on the network.  There are native versions of CVS available on all popular platforms.  Do not waste your own or your instructor's time pleading:

"I couldn't keep my program under source code control because I worked on my machine in my dorm room and transferred the files to my Unix account when it was finished."

Plainly said, make sure that you regularly use your CVS repositories for all your course work!

For More Information

There are many open source projects that use CVS as their code management system.  This introduction has touched on only a few of the more common CVS operations.  If you want more information about using CVS for managing your source code, refer to the following additional resources:


Revision: $Revision: 1.2 $, $Date: 2013-02-02 19:55:10 -0500 (Sat, 02 Feb 2013) $