To define structure in an SCM repository, you need to distinguish
between four dimensions:
- Directory structure
- Branching structure
- Product structure
- Workspace structure
In principle these structures are independent dimensions, which means
that one structure can be defined and changed without affecting the
others. But in practice they are often dependent.
Typical dependencies are:
- Directory & Branching
- Directory & Product
- Workspace & Directory
Directory & Branching dependencies are common in certain tools such as
Subversion or Perforce. You have a 'trunk' branch which is reflected as
a 'trunk' directory, plus a directory for each branch. The tool makes
a "smart" copy across branches/directories to optimize disk usage:
different copies are stored as a single instance in the repository.
Tools like ClearCase and CM Synergy support independence between
directories and branches.
With a Dir & Branch dependency, the top level directory typically
reflects the branching and the lower level directories reflect the
(branch independent) directory structure.
Another common dependency is Workspace & Directory structure. You need
a workspace to access the repository. Typically, the workspace is the
top directory level. In combination with Dir & Branch dependency, the
directory structure looks like this:
workspace/branch/directory/.../filename
(directory/... reflects a tree structure of directories)
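
To make the layout above concrete, here is a minimal Python sketch that
splits such a path into its workspace, branch, and directory parts. The
names RepoPath and split_repo_path, and the example path, are made up
for illustration; they are not part of any SCM tool.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Tuple


class RepoPath(NamedTuple):
    """Parts of a workspace/branch/directory/.../filename path."""
    workspace: str
    branch: str
    directories: Tuple[str, ...]
    filename: str


def split_repo_path(path: str) -> RepoPath:
    """Split a path, assuming the first level is the workspace and the
    second level is the branch (the Dir & Branch dependency)."""
    parts = PurePosixPath(path).parts
    if len(parts) < 3:
        raise ValueError("expected at least workspace/branch/filename")
    return RepoPath(
        workspace=parts[0],
        branch=parts[1],
        directories=tuple(parts[2:-1]),  # branch-independent structure
        filename=parts[-1],
    )


# Hypothetical example path.
example = split_repo_path("ws_frank/trunk/gui/widgets/button.c")
```

Only the 'directories' part is independent of workspace and branch,
which is why the first two levels get in the way when you compare or
move code between branches.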
A common way of creating a directory structure independent of the
workspace and branch is to use a mounted filesystem (possibly a
virtual filesystem) for the workspace, and workspace settings to
specify the branch (e.g. in ClearCase: configspec, or in CM Synergy:
reconfigure properties). The directory structure then no longer
contains a workspace or branch identification.
Instead of a mounted filesystem you can have a snapshot (a copy from
the repository on a local file system). The latter may be optimized for
disk storage by using symbolic links (not supported on Windows) to a
shared cache.
Now the final dependency is Dir & Product. This is a very common
dependency: the directory structure reflects the product structure. For
example, when the product (or system) is decomposed into subsystems,
modules and components, then the directory structure may be:
subsystemname/modulename/componentname/dir/.../filename
(dir/... indicates other subdirectory structures)
A disadvantage of this approach is that a component that belongs to
different modules or subsystems will occur in multiple directory
locations. This may be overcome by symbolic links (not supported on
Windows) or by "smart" copies (not supported by many SCM tools). But
the approach may confuse developers or (worse) configuration managers.
A way to overcome this is by using a single level structure:
componentname/dir/.../filename
But where do the subsystem and module go? Of course they occur in the
architectural model, but from an SCM point of view they are used in the
build structure. Commonly, the build structure is reflected in
makefiles and build scripts (which is yet another SCM structure to
consider).
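
As a rough sketch of that idea: the subsystem and module levels can
live in a small build description instead of in the directory tree. The
Python below is only an illustration; the subsystem, module, and
component names are invented, and a real project would more likely
express this in its makefiles or build scripts.

```python
# Hypothetical product structure, kept outside the directory tree.
# Each component has exactly one top-level directory, even when it is
# used by several modules (comp_y below).
BUILD_STRUCTURE = {
    "subsystem_a": {"module_1": ["comp_x", "comp_y"]},
    "subsystem_b": {"module_2": ["comp_y", "comp_z"]},
}


def component_dirs(structure, subsystem):
    """Return the component directories needed to build a subsystem."""
    dirs = set()
    for components in structure[subsystem].values():
        dirs.update(components)
    return sorted(dirs)


def users_of(structure, component):
    """Return every (subsystem, module) pair that uses a component."""
    return sorted(
        (subsystem, module)
        for subsystem, modules in structure.items()
        for module, components in modules.items()
        if component in components
    )
```

With this split, a component that is shared between modules still has
one location in the repository, and the sharing shows up only in the
build description.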
Speaking of build structure, we enter the arena of interfaces and
libraries. Typically, a build of one unit (subsystem, module,
component, or whatever you define) makes use of objects of other
units. These may be source definitions (e.g. header files) or generated
objects (e.g. object files, DLLs, or libraries of sources and objects).
Typically, these external definitions are grouped into a separate
directory, such as 'ext' or 'lib'. And to separate them from the
internal definitions (only used within the unit), there is another
directory called 'src' or 'int'. The directory structure will then be:
componentname/ext/filename (for external interfaces)
componentname/lib/filename (for libraries)
componentname/int/filename (for internal interfaces)
componentname/src/dir/.../filename (for internal code)
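
The convention above is easy to check mechanically. Here is a small
Python sketch, assuming the four directory names listed above; the
function names and the example paths are made up for illustration.

```python
from pathlib import PurePosixPath

# Second path level -> role, following the layout described above.
ROLES = {
    "ext": "external interface",
    "lib": "library",
    "int": "internal interface",
    "src": "internal code",
}

# Only these directories are meant to be used by other units.
EXPORTED = {"ext", "lib"}


def classify(path):
    """Return (component, role) for a componentname/<kind>/... path."""
    parts = PurePosixPath(path).parts
    if len(parts) < 3 or parts[1] not in ROLES:
        raise ValueError("path does not follow the component layout: " + path)
    return parts[0], ROLES[parts[1]]


def visible_to_other_units(path):
    """True if the file sits in a directory that other units may use."""
    parts = PurePosixPath(path).parts
    return len(parts) >= 3 and parts[1] in EXPORTED
```

A build script could use a check like visible_to_other_units to flag
one unit reaching into the 'int' or 'src' directory of another.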
So...
The question of how to structure your SCM repository is not a simple
one. I welcome any comments, corrections and/or additions to the above
discussion.
Frank.