ClearCase Branching and Labeling Best Practices for Parallel Development

Brad Appleton, Motorola Network Solutions

Outline • Professional background • Intended audience • Branching & labeling concepts & issues • Branching & labeling “best practices” • Using the “best practices” • General advice and recurring themes • Q&A

Professional background • Brad Appleton, Lead Software Engineer, Motorola NSS • SCM tool development (commercial & “in house”) since 1988 • SCM implementation using ClearCase at Motorola since 1995 • Have been conducting a multi-year study of “best practices” for using SCM tools like ClearCase • Presentation material comes from my ACME project at: http://www.enteract.com/~bradapp/acme (see “Streamed Lines” and “Common ClearCase Practices”) • Goal is to document & disseminate SCM “best practices” • You can help by sharing your war stories & best practices

Intended audience • This is an intermediate-level presentation - I assume you already have at least some branching experience: • You know what a branch is, how to create it and work on it • You know how to merge, and you’ve used findmerge • You’ve seen branches used for parallel maintenance and development of more than one release at the same time • For a more introductory-level presentation on branching, please see Doug Fierro’s presentation from RUC’98

ClearCase branching and labeling Concepts and issues

Project-oriented vs file-oriented branching File-oriented branching (low-level) • Branches are organized & viewed in the context of a single file and its version tree (vtree) for that one file • Focuses primarily on physical modifications to individual files Project-oriented branching (high-level) • Branches are organized & viewed in the context of a project (or product) and its version tree (vtree) for the whole system • Focuses primarily on the flow of logical changes/activities to entire components and (sub)systems Project-oriented branching is more conceptually powerful • Project-wide vtree depicts the evolution of the whole system • Better conceptual fit for how we try to plan, manage & track projects and workflow!

Dimensions of branching The main uses of branches can be categorized as follows: • Physical - branching for components and subsystems • Functional - branching for patches, releases, fixes, features & enhancements • Environmental - branching different build & runtime platforms: platform, OS, tools, GUI [often undesirable!] • Organizational - branching for work activities & projects, roles, and teams • Procedural - branching for work behaviors to support policies, processes, and states These are not mutually exclusive! (e.g., a bugfix-task)

Branching risks and competing forces • Teamwork • Communication, visibility, roles & responsibilities, workflow organization • Reusability (of changes) • Baseline reproducibility, change separability & traceability • Safety • Baseline consistency, reliability, integrity & stability • Avoid lost changes & reappearing bugs • Liveness • increased productivity, coordination & contention • merge complexity, build-time

Tool support issues for branching & baselining • Branching support • Meaningful names & hierarchical structure • Merging support • Merge ancestry & graphical, multi-way merge assistance • Labeling support • Labeling performance, label renaming • Logical change-grouping • Change “sets” and “transactions” • Derived object reuse • Extensibility (hooks & triggers & UI)

ClearCase branching and labeling Recurringly successful “best practices”

ClearCase branching & labeling “best practices” Private development branch Shared development branch Early vs lazy branching Push vs pull integration Shared integration branch Pull-push integration-task branch Push-pull “docking” branch Integration lock Partial integration lock Rebase before merge-back Checkpoint after merge-back Checkpoint before rebase “Reserved” rebase Floating label Pre-build vs post-build label/checkpoint Simulated “-mklabel rule” Baseline promotion via label attribute Baseline promotion via label renaming

Private development branch • Branch-type created per development change/activity (just like a “private branch” with a “view profile” on NT) • Branch-access restricted exclusively to the developer responsible for implementing the fix or feature • Checkout/checkin as desired on the “private branch” (automatically grouping together all modifications made) • Other developers insulated from seeing changes until the branch is “merged back” to integration branch • Branch “lock down” when change/activity is completed

[Figure: Private development branch. A developer works on a private branch off the integration branch, then merges back.]

Shared development branch • 2-3 developers working together on one “private branch” working on the same change or tightly coupled changes • Also good for mentor/apprentice pairings • Can work in the same or separate views • Separate - if you don’t want to see others’ checked-out files • Same - if you need to reuse the others’ view-private files • Close communication & coordination is essential! • Not the usual advice, but can work well in above conditions • Can avoid some otherwise nasty interdependencies between branches for close-knit changes • Cost is less separability of changes, more coordination effort

[Figure: Shared development branch. Several developers work on one shared branch off the integration branch, then merge back.]

Early vs Lazy Branching • Early Branching (a.k.a. “up front” or “just in case”) • Branch-off a new line of work after it is planned, but before it actually begins • Branches for each release • Creates separate development & maintenance lines early on • Lazy Branching (a.k.a. “deferred” or “just in time”) • Wait to branch-off a new line of work until real work actually begins on that line • Branches for major releases, & for latest-and-greatest work • Creates maintenance lines very late, starting them off as release-engineering work to allow LAG work to continue

[Figure: Early branching vs lazy branching. Lines shown: Latest & Greatest mainline, R1 maint, R1.1 dev, R1.2 dev, R2.0 dev. Releases shown: R1.0, R1.1, R1.2, R2.0.]

Push vs Pull integration • Developer-Push Integration • Developers merge (push) their own changes to the int-branch • Trades some stability risk for greater productivity • Helps avoid dysfunctional “throw it over the wall” scenarios • Requires greater trust and proficiency of the developers • Integrator-Pull Integration • Developers submit completed changes to an integrator • Integrator merges (pulls) dev. changes to the int-branch • Offers more reliable stability/integrity, but less up-to-date • Can often turn into a dysfunctional “throw it over the wall” scenario

Shared integration branch • Multiple developers need to share their changes, but still need their own private branch (no shared development branches or views) • Create a shared “mini-integration” branch for them to see the latest “good” status of each other’s changes • Work still happens on private branches, but changes get pushed to the shared integration branch when needed, and also when completed • Private branches merge to/from the collaborative int-branch instead of each other (prevents ugly branch dependencies) • After all private branches are pushed to it, the shared “mini-integration” branch is merged to the int-branch

[Figure: Shared integration branch. Two developers, each on a private branch, merge to a shared int-branch, which is then merged back to the integration branch.]

Pull-push integration-task branch • Instead of merging development branches directly to the integration branch, the integrator creates a “task branch” solely for integrating a set of development branches • The integrator then pulls the changes from the development branches to the integration-task branch • After some testing, the integrator “pushes” the changes on the integration-task branch to the primary int-branch • Keeps the primary int-branch in a pristine & stable state, at the cost of additional trivial branching & merging

[Figure: Pull-push integration-task branch. The integrator pulls from the developers’ private branches to an integration-task branch, then pushes from it to the integration branch.]

Push-pull “docking” branch • Tries to “balance” the push-vs-pull tension by combining the best of each: • Developers base their changes off of a stable int-branch, and push (“dock”) their changes to an intermediate int-branch • Integrator regularly builds & tests the state of the docking branch and pulls its changes to the stable int-branch • Allows more choices for productivity vs stability risk • Allows developers a choice of which branch to base changes on and how stable they need their baselevel to be • Allows integrators the choice of pulling from the docking-line, or the individual dev-branches on a case-by-case basis • Helps avoid “throw it over the wall” without sacrificing stability

[Figure: Push-pull “docking” branch. Developers push from their private branches to the docking branch; the integrator pulls from the docking branch to the integration branch.]

Integration lock • If you have multiple developers “pushing” changes, or multiple integrators “pulling” changes, you need to worry about overlapping/inconsistent changes getting merged back at the same time • One solution is to ensure that only one change is merged-back at any given time • But locking the int-branch can only be done by the branch-owner or VOB owner • Instead of locking, use an attribute or hyperlink on the brtype, and have a pre-op trigger prevent co/ci unless the executing user “holds” the “integration lock” attr/hlink • Merge-back happens as an atomic “commit” transaction • Don’t forget to “relinquish” the lock when done merging!
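The holder check described above can be sketched in a few lines. This is only an illustrative in-memory model, not ClearCase API: the class `IntegrationLock` and its methods are hypothetical names, and the `holder` field stands in for the attribute or hyperlink on the branch type. In practice the `pre_op_check` logic would live in a pre-op checkout/checkin trigger script.

```python
class IntegrationLock:
    """In-memory stand-in for an 'integration lock' attribute on a branch type."""

    def __init__(self, branch_type):
        self.branch_type = branch_type
        self.holder = None  # user currently holding the lock, or None

    def acquire(self, user):
        """Take the lock if it is free (or already ours); return True on success."""
        if self.holder in (None, user):
            self.holder = user
            return True
        return False

    def release(self, user):
        """Relinquish the lock; only the current holder may do so."""
        if self.holder != user:
            raise PermissionError(f"{user} does not hold the lock on {self.branch_type}")
        self.holder = None

    def pre_op_check(self, user):
        """What a pre-op co/ci trigger would enforce: only the holder may proceed."""
        return self.holder == user


lock = IntegrationLock("int_branch")
alice_got_it = lock.acquire("alice")
bob_got_it = lock.acquire("bob")          # refused while alice is merging
alice_may_checkin = lock.pre_op_check("alice")
bob_may_checkin = lock.pre_op_check("bob")
lock.release("alice")                     # relinquish when done merging
bob_got_it_later = lock.acquire("bob")
```

Because only one user can hold the lock at a time, each merge-back behaves like a single serialized transaction on the int-branch.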

Partial integration lock • What if “integration lock” is too coarse-grained? • May have many developers pushing their change-sets at the same time, but will very rarely overlap or conflict • Serializing merge-back to the entire int-branch may defeat the purpose of parallel branching in such cases • Only lock “integration” for the set of elements to merge • Use a hyperlink as the “integration” lock (can attach multiple hlinks to a brtype) and check for overlap between the to-be-merged change-set, and the ones currently being merged • AND/OR do a reserved checkout for each element to merge before doing any merges; abort if you can’t reserve all of them
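The overlap check can be illustrated with a small set-based model. This is a sketch under assumptions: `PartialIntegrationLock` and the element paths are hypothetical, and the `in_progress` dictionary stands in for the hyperlinks attached to the branch type. The reservation is all-or-nothing, mirroring the "abort if you can't reserve all of them" rule.

```python
class PartialIntegrationLock:
    """Tracks which elements each in-progress merge-back has reserved."""

    def __init__(self):
        self.in_progress = {}  # user -> frozenset of element paths being merged

    def try_reserve(self, user, change_set):
        """Reserve the whole change-set, or nothing if any element overlaps."""
        wanted = frozenset(change_set)
        for other, held in self.in_progress.items():
            if other != user and wanted & held:
                return False  # overlap with a merge already under way
        self.in_progress[user] = wanted
        return True

    def release(self, user):
        """Drop the user's reservation once the merge-back is finished."""
        self.in_progress.pop(user, None)


plock = PartialIntegrationLock()
alice_ok = plock.try_reserve("alice", {"src/a.c", "src/b.c"})
bob_ok = plock.try_reserve("bob", {"src/c.c"})               # disjoint: proceeds in parallel
carol_ok = plock.try_reserve("carol", {"src/b.c", "src/d.c"})  # overlaps alice: refused
plock.release("alice")
carol_retry_ok = plock.try_reserve("carol", {"src/b.c", "src/d.c"})
```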

Checkpoint after merge-back • Backing-out already merged changes should be rare, but can be painful and time consuming • Tagging merge-results makes it easier to find & fix them • First - merge the changes to the integration branch • Then - mark only the newly merged versions (the merge-set) using a label or attribute (label is most visible in vtrees) • Tag merge-set using “simulated -mklabel” or cleartool find • Gives you the option of doing a subtractive merge on the unwanted merge-set AND/OR you can still correct the build “on line” or on a new branch • Marking the merge-set makes a useful reference point in diffs and vtrees

[Figure: Checkpoint after merge-back. A developer merges a private branch back to the integration branch; the merged versions are tagged MRG_CSET.]
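One way to picture "mark only the newly merged versions" is as a difference between the branch state before and after the merge. The sketch below is a simplified model, not how ClearCase stores versions: the `merge_set` helper, the element names, and the integer version numbers are all hypothetical, and `MRG_CSET` is simply the tag name used in the figure.

```python
def merge_set(before, after):
    """Return {element: version} for versions that are new on the branch after the merge."""
    return {elem: ver for elem, ver in after.items() if before.get(elem) != ver}


# Latest version of each element on the integration branch, before and after merge-back.
before_merge = {"a.c": 3, "b.c": 7, "c.c": 1}
after_merge = {"a.c": 4, "b.c": 7, "c.c": 2, "d.c": 1}

merged = merge_set(before_merge, after_merge)

# Tag only the merge-set, leaving untouched versions (b.c here) unlabeled.
checkpoint = {(elem, ver): "MRG_CSET" for elem, ver in merged.items()}
```

With the merge-set tagged, a later back-out only has to consider the versions carrying that tag.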

Rebase before merge-back • When employing the “developer-push” model, you want to minimize risk & complexity of merges to int-branch • Common solution is to “rebase” before merging back • First - edit cspec to replace old base-selector with the new • Then - merge from the new baselevel to the dev-branch • Don’t forget step one or you will merge too much! • Ensures that merging to the int-branch is as trivial as possible (the “heavy lifting” occurs on the dev-branch) • Forces recent changes to be reconciled together sooner & keeps developers more aware of others’ activities • Possibly increased risk of merging in “tainted” changes (but it happens on dev-branch instead of the int-branch)

[Figure: Rebase before merge-back. The developer rebases the private branch from the integration branch, then merges back; the merged versions are tagged MRG_CSET.]
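Step one of the rebase (replacing the old base-selector in the cspec) can be sketched as a small text transformation. The cspec below is an assumed, typical private-branch layout; the labels `BLD1` and `BLD2`, the branch name `fix_123`, and the helper `rebase_cspec` are hypothetical. The merge from the new baselevel (step two) is not shown.

```python
def rebase_cspec(cspec, old_base, new_base):
    """Replace the version selector old_base with new_base in 'element' rules."""
    out = []
    for line in cspec.splitlines():
        parts = line.split()
        # An element rule reads: element <pattern> <version-selector> [options]
        if len(parts) >= 3 and parts[0] == "element" and parts[2] == old_base:
            parts[2] = new_base
            line = " ".join(parts)
        out.append(line)
    return "\n".join(out)


old_cspec = """element * CHECKEDOUT
element * .../fix_123/LATEST
element * BLD1 -mkbranch fix_123"""

new_cspec = rebase_cspec(old_cspec, "BLD1", "BLD2")
```

Only the base-selector rule changes; the rules that select checked-out versions and the latest versions on the dev-branch are left as they were.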

Checkpoint before rebase • What if a rebase completely disrupts the stability of my branch/view at the time of the merge? • Checkpoint before rebasing so you can easily “rollback” to the last “good state” of the branch/view • Several possible ways of checkpointing: • Label the state of the development branch • Label the state of the entire view • Use an attribute or hyperlink to record a timestamp for the branch, or for the entire cspec

[Figure: Checkpoint before rebase. The private branch is tagged CKPT_CSET before the developer rebases from the integration branch; after merge-back, the merged versions are tagged MRG_CSET.]

“Reserved” rebase • What if, after I rebase, someone else merges-back while I’m rebuilding & testing? Now I’m no longer up-to-date! • Can “reserve” your merge-back just prior to rebasing • Acquire an “integration lock” before you rebase • Rebase, then retest as quickly as feasible • Now merge back your changes & release the integration lock • Closes the window between rebase & merge-back • Voluntarily “unreserve” the lock if rebase/retest is complex • Peer pressure helps minimize the “reservation time” • Don’t reserve the integration lock for too long or you’ll have an angry mob at your door!

Floating label • How do I “stay aware” of which baselevel is the latest? • If builds and baselevels are very frequent, it can be hard to keep track of what’s the latest “good” baselevel • Selecting …/branch/LATEST isn’t stable enough to use • Agree on a name for a “floating label” for the int-branch • Create a baselevel-label with the same name when you want those using the “LAST_GOOD_BUILD” to refresh their views • May want to do this in addition to (rather than instead of) a unique baselevel-label name • Can also be one of the “steps” in a promotion model scheme • Users of dynamic views get the new LAST_GOOD_BUILD whether they want it or not! (unless they use a “time rule”)

[Figure: Floating label. Before the new build, the integration branch carries BLD1 and BLD2, with LAST_GOOD_BUILD on the BLD2 baselevel. After the new build, BLD3 is added and LAST_GOOD_BUILD moves to it.]
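The difference between a unique baselevel label and a floating one can be modeled in a few lines. This is an illustrative sketch only: `LabeledBranch`, `apply`, and `move` are hypothetical names, and integer version numbers stand in for real baselevels.

```python
class LabeledBranch:
    """Minimal model of labels on an integration branch."""

    def __init__(self):
        self.labels = {}  # label name -> baselevel (version number) it marks

    def apply(self, label, version):
        """Apply a unique baselevel label; it never moves once set."""
        if label in self.labels:
            raise ValueError(f"{label} is already applied")
        self.labels[label] = version

    def move(self, label, version):
        """Apply or re-apply a floating label at a newer baselevel."""
        self.labels[label] = version


int_branch = LabeledBranch()
int_branch.apply("BLD1", 1)
int_branch.apply("BLD2", 2)
int_branch.move("LAST_GOOD_BUILD", 2)   # state before the new build

int_branch.apply("BLD3", 3)
int_branch.move("LAST_GOOD_BUILD", 3)   # state after the new build
```

Keeping the unique labels alongside the floating one preserves each baselevel even after LAST_GOOD_BUILD has moved on.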

Pre-build vs post-build label/checkpoint • Pre-build label (integrate then label then build) • Handy initial reference to have for later comparison • But if the build breaks and changes are needed, you have to apply a new label or “bump up” the existing one • Not needed if you have already labeled after the last merge • Post-build label (integrate then build then label) • No need to hassle with re-applying before/after labels • But you lose the handy initial reference if you wanted it (could use an hlink/attr to checkpoint instead!) • May not be needed if you have to do trivial copymerges to a mainline before applying the official baseline label

Simulated “-mklabel” rule • When using “pre-build label”, how can I keep the label up to date without relabeling everything? • Can simulate a “-mklabel” rule for your cspec: • Use an hlink/attr to record a “-mklabel” target for int-branch (this could even be part of an hlink-based integration lock) • Have a post-op ci trigger look for the “-mklabel” target and, if found, automatically do a mklabel on the checked-in version • Automatically updates an “in progress” baselevel • Don’t forget to remove the “-mklabel” target after the baseline is created and the label is “locked down”
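The trigger behavior described above can be sketched as a plain function. Everything here is a simplified stand-in: the `mklabel_target` dictionary plays the role of the hlink/attr on the branch type, `post_checkin` plays the role of the post-op checkin trigger, and the branch, element, and label names are hypothetical.

```python
# Branch type -> label to keep up to date (stands in for the hlink/attr target).
mklabel_target = {"int_branch": "BLD3"}

# (element, label) -> version currently carrying that label.
labels = {}


def post_checkin(branch, element, version):
    """Post-op checkin hook: label the new version if the branch has a target."""
    target = mklabel_target.get(branch)
    if target is not None:
        labels[(element, target)] = version


post_checkin("int_branch", "a.c", 5)   # labeled: in-progress baselevel updated
post_checkin("dev_branch", "b.c", 2)   # no target on this branch: not labeled

# Baseline created and locked down: remove the target so the label stops moving.
del mklabel_target["int_branch"]
post_checkin("int_branch", "a.c", 6)   # no longer labeled
```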

Baseline promotion via label attribute • How do I indicate a progression of quality (promotion) levels for a baseline without having to relabel the world? • Use an attribute or hyperlink on the lbtype to record a formal promotion level • Create a PromotionLevel attribute type with your desired set of promotion levels (e.g., INITIAL, BUILT, TESTED, RELEASED) • When the in-progress baseline passes a new quality “gate”, bump up the promotion level to the next one in the sequence • Triggers can be used to take special actions or precautions based on the promotion level associated with the label-type

[Figure: Baseline promotion via label attribute. Label BLD21 on the integration branch has Promotion Level = BUILT before promotion and Promotion Level = TESTED after promotion.]
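The promotion sequence can be sketched as an ordered list plus a "bump to the next level" step. The level names come from the slide; the `promote` helper and the `label_attrs` dictionary (standing in for the PromotionLevel attribute on each label type) are hypothetical.

```python
LEVELS = ["INITIAL", "BUILT", "TESTED", "RELEASED"]


def promote(label_attrs, label):
    """Advance the label's promotion level to the next one in the sequence."""
    idx = LEVELS.index(label_attrs[label])
    if idx == len(LEVELS) - 1:
        raise ValueError(f"{label} is already at the highest level")
    label_attrs[label] = LEVELS[idx + 1]
    return label_attrs[label]


label_attrs = {"BLD21": "BUILT"}
new_level = promote(label_attrs, "BLD21")   # the baseline passed its test gate
```

Only the attribute value changes; no versions are relabeled.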

Baseline promotion via label renaming • How do I ensure no one tries to base their changes off of an in-progress (unfinished) build/baselevel label? • Use a different name for in-progress vs final labels • Create the lbtype with a name like TMP_BLD2.1.2 that clearly reflects its transient nature • Rename lbtype to BLD2.1.2 when it is ready for “prime time” • Also useful when you don’t know the release/build number until the last minute (start with an internal name, rename to the official release/build ID at the end) • Anyone (mis)using the “in progress” label should quickly notice something is different in their view configuration!

[Figure: Baseline promotion via label renaming. The label on the integration branch is TMP_BLD2.1.2 before promotion and BLD2.1.2 after promotion.]
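The effect of the rename on anyone still selecting the in-progress label can be shown with a toy model. This is a sketch under assumptions: `label_types`, `rename_label_type`, and `select` are hypothetical, and the element versions are made up for illustration.

```python
# Label type name -> {element: labeled version}
label_types = {"TMP_BLD2.1.2": {"a.c": 4, "b.c": 9}}


def rename_label_type(old, new):
    """Rename a label type; every labeled version follows the new name."""
    if new in label_types:
        raise ValueError(f"{new} already exists")
    label_types[new] = label_types.pop(old)


def select(label, element):
    """Version a rule naming this label would select, or None if nothing matches."""
    return label_types.get(label, {}).get(element)


before = select("TMP_BLD2.1.2", "a.c")
rename_label_type("TMP_BLD2.1.2", "BLD2.1.2")
stale = select("TMP_BLD2.1.2", "a.c")    # the old name no longer selects anything
final = select("BLD2.1.2", "a.c")
```

In this model a view still naming the temporary label stops matching after the rename, which is the "something is different" signal the slide describes.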

ClearCase branching and labeling Using the “best practices”

Determine your risk tolerance • Which risk factors apply to your environment? • Trade-off in favor of safety or liveness? • How critical is the impact of an “unwanted” change that “breaks” the build • How much time and overhead to back out a change? • How much can you trust your developers?

Select an appropriate branching style • Early or lazy? • Use early branching when you have larger teams with multiple frequent parallel releases at once • Use lazy branching when you have smaller teams that usually have at most one development release and a few maintenance-only lines • Can use both at once for medium-large project teams • Early branching at the top to isolate major changes early • Lazy branching at the lower levels to keep the vtree from getting too wide and unwieldy

Select an appropriate merging style • Developer-push or Integrator-pull? • Use developer-push with less than a dozen or so developers per integration branch and the developers know how to do the merge • Use integrator-pull when the branch must be highly stable, or highly volatile and accommodates more than a dozen or so developers • Can use both at once for medium-large project teams • Integrator-pull at top-branches to isolate for safety at broader scope and impact • Developer-push at the lower-levels to promote liveness and visible communication

Select an appropriate labeling style • Pre-build labeling or post-build labeling? • Use pre-build labeling when you usually have very little tweaking to do after a build, and need to know what you started with as much as what you end up with • Use post-build labeling when you typically have to modify files after the build, or already know what you started with • Label-promotion via attributes or renaming? • Use attributes when you need >2 formal promotion levels • Use renaming when you only need ~2 formal promotion levels (the rest can be informal if needed) • Can use both to get desired number of levels while ensuring no one relies upon build-labels of questionable stability

General advice and recurring themes • Use meaningful names • Prefer branching to freezing • Encapsulate change - isolate the thing that varies • Communicate stability - integrate/build/baseline early & regularly • Branch on incompatibility • Add a level of indirection by adding a “line” of integration • KISS (keep it simple stupid!) • Preserve integrity /consistency • Isolate work, not people!

Support for the Best Practices in UCM • Implemented directly in UCM • Private branch, Push integration [remote-pull only], Partial integration lock, Rebase before merge-back, Baseline promotion via attribute • Supported (but not directly implemented) by UCM • Shared branches, Integration lock, Merge-set checkpoint, Change-set checkpoint, Early/Lazy branching, Pre/Post build baselining • Not Supported by UCM (but manual workarounds exist) • Local pull integration, Reserved rebase, Floating label, Pull-push branch, Push-pull branch, Simulated mklabel (but see incremental baselines), Baseline promotion via renaming

Questions?

Brad Appleton, Motorola Network Solutions <bradapp@enteract.com> • Learn more about my ACME project at http://www.enteract.com/~bradapp/acme • Thank You! • This presentation will be posted by tomorrow to http://www.rational.com/ruc
