Needing a New Workflow for Contributing to eLxr Git Packaging Repository

Table of Contents

  1. Introduction
  2. What We Have
  3. What We Need to Address
    3.1 Issues with “quilt” Git packaging repositories
    3.1.1 Issues with upgrading a package
    3.1.2 Issues with publishing a package for a new eLxr release or due to addition/deprecation
    3.2 Issues with “native” Git packaging repositories

1. Introduction

We need a new workflow for contributing to the eLxr Git packaging repositories. The current contribution workflow (explained in section 2) for the eLxr GitLab projects does not work well with the eLxr Git packaging repositories.

An eLxr Git packaging repository is a GitLab project from which eLxr-maintained packages are built and published. One example is the cri-o package offered by eLxr: eLxr / Cloud / packages / cri-o · GitLab.

To understand why we need a new workflow, we need to look at the contribution workflow eLxr already has and the gap introduced by the distinct characteristics of Git packaging repositories.

2. What We Have

Currently, for all GitLab projects under eLxr, we use a “merge request from fork” workflow for contribution:

  1. The contributor forks the GitLab project they wish to contribute to (including the branch they wish to contribute to). We shall call that branch the target branch.
  2. The contributor creates a development branch from the target branch.
  3. The contributor commits their work to the development branch.
  4. The contributor submits a merge request from their development branch in their fork into the target branch.
  5. The contributor iterates on their development branch in their fork based on the comments from the merge request.
  6. The merge request eventually gets merged or closed.
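
The steps above can be sketched with plain Git commands. This is only an illustration under stated assumptions: the GitLab project and the fork are simulated by local bare repositories, and the names project.git, fork.git, main and dev/fix-typo are hypothetical. Forking and opening the merge request happen in the GitLab UI and are only marked by comments.

```shell
#!/bin/sh
# Sketch of the "merge request from fork" steps using local stand-ins.
# Hypothetical names: project.git, fork.git, main, dev/fix-typo.
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=contributor GIT_AUTHOR_EMAIL=contributor@example.com
export GIT_COMMITTER_NAME=contributor GIT_COMMITTER_EMAIL=contributor@example.com

# Stand-in for the GitLab project, with one commit on the target branch "main".
git init -q --bare "$work/project.git"
git -C "$work/project.git" symbolic-ref HEAD refs/heads/main
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/main
git -C "$work/seed" commit -q --allow-empty -m "initial commit"
git -C "$work/seed" push -q "$work/project.git" main

# Step 1: fork the project (a button in GitLab; a bare clone here).
git clone -q --bare "$work/project.git" "$work/fork.git"

# Step 2: create a development branch from the target branch.
git clone -q "$work/fork.git" "$work/checkout"
cd "$work/checkout"
git checkout -q -b dev/fix-typo origin/main

# Step 3: commit the work (an empty commit keeps the sketch short).
git commit -q --allow-empty -m "fix typo in README"

# Step 4: push the development branch to the fork.
# The merge request itself is then opened in the GitLab UI.
git push -q origin dev/fix-typo

# The merge request shows the commits on the development branch
# that are not yet on the target branch.
mr_commits=$(git rev-list --count origin/main..dev/fix-typo)
echo "merge request would contain $mr_commits commit(s)"
```

For an ordinary project, the merge request contains only the contributor's own commits, which is what makes it easy to review.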

3. What We Need to Address

The “merge request from fork” workflow does not work well with the Git packaging repositories. One can categorize Git packaging repositories into two types - “quilt” or “native” - based on whether there is an upstream for the software being packaged. Here, I shall point out the issues this “merge request from fork” workflow has with “quilt” and “native” Git packaging repositories separately.

3.1 Issues with “quilt” Git packaging repositories

“Quilt” Git packaging repositories are meant for software that has an upstream.

In these packaging repositories, the Git history from the software upstream is mixed with all the packaging work. All packaging work is done on release branches and upstream branches. An upstream branch contains commits that merge the targeted new release from the software upstream we wish to package, plus any repackaging commits. A release branch contains commits that modify debian/*, the packaging instructions. For details of the structure of such a packaging repository, please refer to DEP-14: Recommended layout for Git packaging repositories.

Some explanation of repackaging commits on an upstream branch is in order, since they may be confusing when debian/patches/* already provides a way to modify the source tree. Typically, only functional changes to the source tree are captured by debian/patches/*, while non-functional modifications are made directly as commits to the source tree. These commits on the upstream branch cause the “upstream version” part of the version string to be suffixed with “+dfsg.N” or “+ds.N” (see PackagingFAQ - What does “dfsg” or “ds” in the version string mean?). The Debian Policy Manual also gives some examples of reasons for repackaging the upstream source tree.
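
To make the suffix convention concrete, here is a small sketch that splits an upstream version into the original version and the repack suffix. The function name split_repack and the sample versions are hypothetical, and it only recognizes the plain “+dfsg” and “+ds” forms mentioned above, not distribution-specific variants.

```shell
#!/bin/sh
# Split an upstream version into "<original version> <repack suffix>".
# split_repack is a hypothetical helper; only the "+dfsg"/"+ds" convention
# comes from the text above.
split_repack() {
    version=$1
    case "$version" in
        *+dfsg*) base=${version%+dfsg*} ;;  # strip the trailing "+dfsg..." part
        *+ds*)   base=${version%+ds*} ;;    # strip the trailing "+ds..." part
        *)       base=$version ;;           # not repackaged
    esac
    suffix=${version#"$base"}
    printf '%s %s\n' "$base" "${suffix:-none}"
}

split_repack "1.2.3+dfsg.1"
split_repack "1.2.3+ds.2"
split_repack "1.2.3"
```

A version without a repack suffix means the upstream source tree was used unmodified.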

Unlike most GitLab projects where the contribution is almost always done to a single target branch, upgrading a package or publishing a package for a new eLxr release requires contribution to two target branches - the aforementioned upstream branch and release branch.

3.1.1 Issues with upgrading a package

The upstream branches are used to pull in target versions for packaging from the commits belonging to the software upstream. The upstream branches are also where source tree repackaging is done. As a result, if eLxr ever needs to upgrade a package, the upstream branch needs updating, and a merge request against that upstream branch will bring hundreds or thousands of software upstream commits into the merge request.

A similar issue happens with merge requests into the release branch. After the repackaging on the upstream branch is done, we tag the target commit that contains the finalized repackaged source tree and merge that commit into the release branch. This merge causes the merge request to the release branch to contain the same hundreds or thousands of software upstream commits.
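
The size of such a merge request can be estimated before it is opened with git rev-list. The following sketch builds a throwaway repository to show the effect; the branch software-upstream, the tags v1.0.0 and v1.1.0, the development branch name and the figure of 25 upstream commits are all made up for illustration.

```shell
#!/bin/sh
# Count the commits a merge request into the upstream branch would carry.
# All names and numbers are hypothetical stand-ins.
set -eu
repo=$(mktemp -d)
cd "$repo"
export GIT_AUTHOR_NAME=contributor GIT_AUTHOR_EMAIL=contributor@example.com
export GIT_COMMITTER_NAME=contributor GIT_COMMITTER_EMAIL=contributor@example.com
git init -q .

# Stand-in for the software upstream history.
git symbolic-ref HEAD refs/heads/software-upstream
git commit -q --allow-empty -m "release 1.0.0"
git tag v1.0.0

# The packaging upstream branch currently holds version 1.0.0.
git branch elxr/upstream/bianca v1.0.0

# The software upstream moves on by 25 commits and tags 1.1.0.
i=1
while [ "$i" -le 25 ]; do
    git commit -q --allow-empty -m "upstream change $i"
    i=$((i + 1))
done
git tag v1.1.0

# The contributor merges the new upstream tag on a development branch.
git checkout -q -b dev/upgrade-1.1.0 elxr/upstream/bianca
git merge -q --no-ff -m "New upstream version 1.1.0" v1.1.0

# Commits reachable from the development branch but not from the target.
mr_commits=$(git rev-list --count elxr/upstream/bianca..dev/upgrade-1.1.0)
echo "merge request would contain $mr_commits commit(s)"
```

Only one of those commits (the merge) is packaging work by the contributor; the rest are upstream commits the reviewer is not expected to read.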

3.1.2 Issues with publishing a package for a new eLxr release or due to addition/deprecation

Every time we have a new eLxr release, or every time we add or deprecate a package, we need to create a brand new pair of upstream and release branches. Merge requests cannot be used to create branches.
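
As a rough sketch of why this falls outside the merge request flow: new branches only come into existence on the server through a direct push by someone with the right permissions. The codename nextrelease, the local packaging.git stand-in, and the choice to start the new branches from the Bianca ones are assumptions made purely for illustration.

```shell
#!/bin/sh
# Creating a new upstream/release branch pair requires a direct push.
# packaging.git and the codename "nextrelease" are hypothetical.
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=maintainer GIT_AUTHOR_EMAIL=maintainer@example.com
export GIT_COMMITTER_NAME=maintainer GIT_COMMITTER_EMAIL=maintainer@example.com

# Stand-in for the packaging repository with the existing Bianca branches.
git init -q --bare "$work/packaging.git"
git init -q "$work/checkout"
cd "$work/checkout"
git symbolic-ref HEAD refs/heads/elxr/upstream/bianca
git commit -q --allow-empty -m "New upstream version 1.0.0"
git branch elxr/bianca
git remote add origin "$work/packaging.git"
git push -q origin elxr/upstream/bianca elxr/bianca

# New release: create the branch pair locally (assumed to start from Bianca).
codename=nextrelease
git branch "elxr/upstream/$codename" elxr/upstream/bianca
git branch "elxr/$codename" elxr/bianca

# There is no target branch to open a merge request against yet,
# so the branches have to be pushed directly.
git push -q origin "elxr/upstream/$codename" "elxr/$codename"

remote_heads=$(git ls-remote --heads origin | wc -l | tr -d ' ')
echo "remote now has $remote_heads branches"
```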

3.2 Issues with “native” Git packaging repositories

“Native” packages are packages whose upstream is eLxr. Since eLxr is the upstream of these packages, there are no upstream branches. There is only a release branch. However, the issue with publishing a package for a new eLxr release or due to addition/deprecation in the “quilt” case still applies here.

If I understand correctly, for the issue in 3.1: using gbp import-orig instead of directly importing an upstream branch or tag when importing an upstream project on GitLab may allow creating only a clean import commit without preserving the upstream Git history.

This indeed works.

If a project is hosted on GitHub, one can easily download an upstream tarball from the release tags in the GitHub repository.

In the following hypothetical scenario, I am adding helm v4.1.1 into the upstream branch elxr/upstream/4.0.x/bianca.

$ gbp import-orig ../helm-4.1.1.tar.gz --upstream-branch elxr/upstream/4.0.x/bianca --no-merge
What will be the source package name? [helm] 
What is the upstream version? [4.1.1] 
gbp:info: Importing '../helm-4.1.1.tar.gz' to branch 'elxr/upstream/4.0.x/bianca'...
gbp:info: Source package is helm
gbp:info: Upstream version is 4.1.1
gbp:info: Successfully imported version 4.1.1 of ../helm-4.1.1.tar.gz

$ git log --decorate --oneline --graph
* 8145702a0 (HEAD -> elxr/upstream/4.0.x/bianca, tag: upstream/4.1.1) New upstream version 4.1.1
* 14b5dad22 (tag: upstream/4.0.5+elxr13ds1, elxr-ssh/elxr/upstream/4.0.x/bianca) remove vendor content already in software repository
* 3b67babdb add back vendor for air-gapped build
*   6c2feebe8 New upstream version 4.0.5
|\    
| * 1b6053d48 (tag: v4.0.5, upstream-http/release-4.0) fix(upgrade): pass --server-side flag to install when using upgrade --install
| * 1e3ee1d2b fix(cli): handle nil config in EnvSettings.Namespace()                      
| * 31bd995ce fix(getter): pass settings environment variables  

Alternatively, if the repackaging of the upstream tarball is not too complicated, one can even use the pristine-tar branch (see runc) or debian/watch (see crun).

Still, merging from a software upstream tag while preserving the upstream history is a standard and accepted workflow in Debian (see busybox). I think eLxr’s contribution workflow should be able to support all common practices accepted by Debian.

The following is an excerpt from the Debian busybox packaging repository on the Salsa server:

* c01ddf84d (tag: debian/1%1.37.0-1) update changelog; upload version 1.37.0-1 to unstable
* c68214d18 update changelog
* aab969dc9 d/config/pkg/ *: update configs
* 4a860e58c refresh patches
*   7279a0aee Update upstream source from tag 'upstream/1.37.0'
|\  
| *   f8171d0e0 (tag: upstream/1.37.0, origin/upstream) New upstream version 1.37.0
| |\  
| | * be7d1b7b1 (tag: 1_37_0) Bump version to 1.37.0
| | * a667a7f02 wget: fix compile warnings when WGET_FTP is not selected
| | * 371fe9f71 ash: move hashvar() calls into findvar()
| | * e4b5ccd13 timeout: allow fractional seconds in timeout values
| | * b20b3790b powertop: code shrink
| | * 23da5c4b7 hush: do not exit interactive shell on some redirection errors
| | * 14e28c18c hush: fix "exec 3>FILE" aborting if 3 is exactly the next free fd
| | * 6c38d0e9d hush: avoid duplicate fcntl(F_SETFD, FD_CLOEXEC) during init
| | * 08fb86726 ash: remove limitation on fd# length

Thanks for your extensive write-up and for bringing up this discussion.

I am still a bit confused about which part of the workflow is not supported by the eLxr Project. I took away the following issues in summary; please let me know which issues you see in relation to them.

3.1.1 Issues with upgrading a package

The issue you’re describing in this paragraph is that merge requests potentially contain thousands of commits, for example when pulling in a new version of an upstream package. Why exactly is that an issue with our workflow today? Merge requests are allowed to be big in that case; I don’t think we limit them in that sense.

I think common sense can be expected from reviewers here as well, that no changes have been made, just a new upstream version is being pulled into the repository?

3.1.2 Issues with publishing a package for a new eLxr release or due to addition/deprecation

Here I believe you describe the creation of branches as a potential issue. But a package maintainer should be fully capable and allowed to create a new branch for a release?

The workflow should be heavily based on Debian Salsa CI, as described in DEP-18.

Please let me know if I misunderstood something.

Thank you, marcel, for referencing DEP-18, which I was not aware of at the time of writing these posts.

DEP-18 recommends Merge Requests, and I have not found any stated limit on the number of commits in a Merge Request. The following excerpt from DEP-18 states the recommendation of Merge Requests as well as their value.

Debian maintainers should make reasonable efforts to publish planned changes as Merge Requests on Salsa, and solicit feedback and reviews. While pushing changes directly on the main git branch is the fastest workflow, second only to uploading all changes directly to Debian repositories, it is not an inclusive way to develop software. Even packages that are maintained by a single maintainer should at least occasionally publish Merge Requests to allow for new contributors to step up and participate.

PROPOSED RESOLUTION (part 1)

Based on the discussion, for issue 3.1.1 - “issues with upgrading a package (quilt package)” - the following describes the new contribution workflow (largely the same but with more guidelines and clarifications):

For a quilt package - a package with an upstream - we shall follow the steps below. Steps marked EXISTING are the same as in the old “merge request from fork” workflow, while steps marked NEW were not clarified in the old “merge request from fork” workflow:

  1. EXISTING: Create a fork of the packaging repository.
  2. NEW: Identify two target branches - the upstream branch and the release branch. Typically, the upstream branch is elxr/upstream/bianca and the release branch is elxr/bianca. In general, the upstream branch is elxr/upstream/{elxr release codename}, while the release branch is elxr/{elxr release codename}.
  3. NEW: Merge the commit from the software upstream targeted for packaging into the upstream branch. Alternatively, as aptwinner suggested, download the upstream tarball from the software upstream and use gbp import-orig {downloaded upstream tarball} --upstream-branch {the upstream branch identified in step 2} --no-merge to put the new source tree into the upstream branch. The alternative is useful if the contributor does not want a lot of upstream commits. Note that both options are acceptable and there is no preference for one over the other.
  4. EXISTING: Perform upstream source repackaging.
  5. NEW: Tag the final repackaged commit as something like upstream/1.2.3+elxr13ds1.
  6. NEW: On the other target branch - the release branch - merge the tag from step 5.
  7. EXISTING: Update the Debian packaging instructions - the debian/* - to make the package build for the new upstream source tree. Notably, the contributor should pay extra attention to any changes made to the upstream build system (Makefile for example), as a new version of the source tree from the upstream may configure the toolchain differently.
  8. NEW: Sometimes, the package build fails and the contributor needs to redo the upstream repackaging. This means repeating steps 3 to 7. That is why one should not create the merge requests for the two target branches - the upstream branch and the release branch - before one gets a successful build. One may ask if this is wrong, since fixes to the source tree are typically done using quilt patches. For Go packages, eLxr aims to respect the upstream build dependencies as much as possible. As a result, when packaging in eLxr, we are very conservative when it comes to replacing the vendored Go modules with Debian Golang source code packages (not to be confused with Debian source packages; these are binary packages named golang-*-dev).
  9. EXISTING: Create two merge requests for the two target branches - the upstream branch and the release branch. IMPORTANT NOTE: The merge request must use “fast-forward” or “ff” mode, otherwise, the Git history will be messed up.
  10. NEW: Make sure that the release branch is merged after the upstream branch is merged. When the upstream branch is merged, push the tag created in step 5. This tag is critical for the eLxr Salsa pipeline to function. Only run the pipeline on the release branch after it is merged.
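
The Git side of steps 2 to 10 can be sketched as follows, using the “merge the upstream commit” option from step 3. This is a local simulation, not the eLxr tooling: the software-upstream branch, the tags v1.2.2 and v1.2.3, the dev/* branch names and the empty commits are hypothetical, and the two fast-forward merges at the end stand in for what GitLab does when the merge requests are accepted.

```shell
#!/bin/sh
# Local simulation of the quilt-package upgrade workflow (steps 2 to 10).
# Branch and tag names other than elxr/upstream/bianca, elxr/bianca and
# upstream/1.2.3+elxr13ds1 are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
export GIT_AUTHOR_NAME=contributor GIT_AUTHOR_EMAIL=contributor@example.com
export GIT_COMMITTER_NAME=contributor GIT_COMMITTER_EMAIL=contributor@example.com
git init -q .

# Stand-in for the software upstream and the current packaging state.
git symbolic-ref HEAD refs/heads/software-upstream
git commit -q --allow-empty -m "release 1.2.2"
git tag v1.2.2
git branch elxr/upstream/bianca v1.2.2   # step 2: upstream target branch
git branch elxr/bianca v1.2.2            # step 2: release target branch
git commit -q --allow-empty -m "release 1.2.3"
git tag v1.2.3

# Step 3: merge the targeted upstream release into the upstream branch.
git checkout -q -b dev/upstream elxr/upstream/bianca
git merge -q --no-ff -m "New upstream version 1.2.3" v1.2.3

# Step 4: upstream source repackaging (placeholder commit).
git commit -q --allow-empty -m "repackage upstream source"

# Step 5: tag the final repackaged commit.
git tag upstream/1.2.3+elxr13ds1

# Step 6: merge that tag into the release branch.
git checkout -q -b dev/release elxr/bianca
git merge -q --no-ff -m "Update upstream source from tag 'upstream/1.2.3+elxr13ds1'" \
    upstream/1.2.3+elxr13ds1

# Step 7: update debian/* (placeholder commit).
git commit -q --allow-empty -m "debian: update packaging for 1.2.3"

# Steps 9 and 10: fast-forward only, upstream branch first, then release.
git checkout -q elxr/upstream/bianca
git merge -q --ff-only dev/upstream
git checkout -q elxr/bianca
git merge -q --ff-only dev/release

git log --oneline --graph --decorate elxr/bianca
```

Because both merges are fast-forwards, the commit tagged in step 5 ends up on the upstream branch unchanged, and the tag remains reachable from the release branch.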

REMAINING ISSUE

For 3.1.2 as well as 3.2, consider that, for example, all Cloud native packages for eLxr Bianca did not exist before. Tom and I, being the temporary maintainers, completed the packaging, then created and pushed the branches. How does the work of this initial packaging for a new eLxr release get properly reviewed?