| .. SPDX-License-Identifier: GPL-2.0 |
| |
| =================================== |
| Backporting and conflict resolution |
| =================================== |
| |
| :Author: Vegard Nossum <vegard.nossum@oracle.com> |
| |
| .. contents:: |
| :local: |
| :depth: 3 |
| :backlinks: none |
| |
| Introduction |
| ============ |
| |
| Some developers may never really have to deal with backporting patches, |
| merging branches, or resolving conflicts in their day-to-day work, so |
| when a merge conflict does pop up, it can be daunting. Luckily, |
| resolving conflicts is a skill like any other, and there are many useful |
| techniques you can use to make the process smoother and increase your |
| confidence in the result. |
| |
| This document aims to be a comprehensive, step-by-step guide to |
| backporting and conflict resolution. |
| |
| Applying the patch to a tree |
| ============================ |
| |
| Sometimes the patch you are backporting already exists as a git commit, |
| in which case you just cherry-pick it directly using |
| ``git cherry-pick``. However, if the patch comes from an email, as it |
| often does for the Linux kernel, you will need to apply it to a tree |
| using ``git am``. |
| |
| If you've ever used ``git am``, you probably already know that it is |
| quite picky about the patch applying perfectly to your source tree. In |
| fact, you've probably had nightmares about ``.rej`` files and trying to |
| edit the patch to make it apply. |
| |
| It is strongly recommended to instead find an appropriate base version |
| where the patch applies cleanly and *then* cherry-pick it over to your |
| destination tree, as this will make git output conflict markers and let |
| you resolve conflicts with the help of git and any other conflict |
| resolution tools you might prefer to use. For example, if you want to |
| apply a patch that just arrived on LKML to an older stable kernel, you |
| can apply it to the most recent mainline kernel and then cherry-pick it |
| to your older stable branch. |
| |
| It's generally better to use the exact same base as the one the patch |
| was generated from, but it doesn't really matter that much as long as it |
| applies cleanly and isn't too far from the original base. The only |
| problem with applying the patch to the "wrong" base is that it may pull |
| in more unrelated changes in the context of the diff when cherry-picking |
| it to the older branch. |
| |
| A good reason to prefer ``git cherry-pick`` over ``git am`` is that git |
| knows the precise history of an existing commit, so it will know when |
| code has moved around and changed the line numbers; this in turn makes |
| it less likely to apply the patch to the wrong place (which can result |
| in silent mistakes or messy conflicts). |
| |
| If you are using `b4`_. and you are applying the patch directly from an |
| email, you can use ``b4 am`` with the options ``-g``/``--guess-base`` |
| and ``-3``/``--prep-3way`` to do some of this automatically (see the |
| `b4 presentation`_ for more information). However, the rest of this |
| article will assume that you are doing a plain ``git cherry-pick``. |
| |
| .. _b4: https://people.kernel.org/monsieuricon/introducing-b4-and-patch-attestation |
| .. _b4 presentation: https://youtu.be/mF10hgVIx9o?t=2996 |
| |
| Once you have the patch in git, you can go ahead and cherry-pick it into |
| your source tree. Don't forget to cherry-pick with ``-x`` if you want a |
| written record of where the patch came from! |
| |
| Note that if you are submiting a patch for stable, the format is |
| slightly different; the first line after the subject line needs tobe |
| either:: |
| |
| commit <upstream commit> upstream |
| |
| or:: |
| |
| [ Upstream commit <upstream commit> ] |
| |
| Resolving conflicts |
| =================== |
| |
| Uh-oh; the cherry-pick failed with a vaguely threatening message:: |
| |
| CONFLICT (content): Merge conflict |
| |
| What to do now? |
| |
| In general, conflicts appear when the context of the patch (i.e., the |
| lines being changed and/or the lines surrounding the changes) doesn't |
| match what's in the tree you are trying to apply the patch *to*. |
| |
| For backports, what likely happened was that the branch you are |
| backporting from contains patches not in the branch you are backporting |
| to. However, the reverse is also possible. In any case, the result is a |
| conflict that needs to be resolved. |
| |
| If your attempted cherry-pick fails with a conflict, git automatically |
| edits the files to include so-called conflict markers showing you where |
| the conflict is and how the two branches have diverged. Resolving the |
| conflict typically means editing the end result in such a way that it |
| takes into account these other commits. |
| |
| Resolving the conflict can be done either by hand in a regular text |
| editor or using a dedicated conflict resolution tool. |
| |
| Many people prefer to use their regular text editor and edit the |
| conflict directly, as it may be easier to understand what you're doing |
| and to control the final result. There are definitely pros and cons to |
| each method, and sometimes there's value in using both. |
| |
| We will not cover using dedicated merge tools here beyond providing some |
| pointers to various tools that you could use: |
| |
| - `Emacs Ediff mode <https://www.emacswiki.org/emacs/EdiffMode>`__ |
| - `vimdiff/gvimdiff <https://linux.die.net/man/1/vimdiff>`__ |
| - `KDiff3 <http://kdiff3.sourceforge.net/>`__ |
| - `TortoiseMerge <https://tortoisesvn.net/TortoiseMerge.html>`__ |
| - `Meld <https://meldmerge.org/help/>`__ |
| - `P4Merge <https://www.perforce.com/products/helix-core-apps/merge-diff-tool-p4merge>`__ |
| - `Beyond Compare <https://www.scootersoftware.com/>`__ |
| - `IntelliJ <https://www.jetbrains.com/help/idea/resolve-conflicts.html>`__ |
| - `VSCode <https://code.visualstudio.com/docs/editor/versioncontrol>`__ |
| |
| To configure git to work with these, see ``git mergetool --help`` or |
| the official `git-mergetool documentation`_. |
| |
| .. _git-mergetool documentation: https://git-scm.com/docs/git-mergetool |
| |
| Prerequisite patches |
| -------------------- |
| |
| Most conflicts happen because the branch you are backporting to is |
| missing some patches compared to the branch you are backporting *from*. |
| In the more general case (such as merging two independent branches), |
| development could have happened on either branch, or the branches have |
| simply diverged -- perhaps your older branch had some other backports |
| applied to it that themselves needed conflict resolutions, causing a |
| divergence. |
| |
| It's important to always identify the commit or commits that caused the |
| conflict, as otherwise you cannot be confident in the correctness of |
| your resolution. As an added bonus, especially if the patch is in an |
| area you're not that famliar with, the changelogs of these commits will |
| often give you the context to understand the code and potential problems |
| or pitfalls with your conflict resolution. |
| |
| git log |
| ~~~~~~~ |
| |
| A good first step is to look at ``git log`` for the file that has the |
| conflict -- this is usually sufficient when there aren't a lot of |
| patches to the file, but may get confusing if the file is big and |
| frequently patched. You should run ``git log`` on the range of commits |
| between your currently checked-out branch (``HEAD``) and the parent of |
| the patch you are picking (``<commit>``), i.e.:: |
| |
| git log HEAD..<commit>^ -- <path> |
| |
| Even better, if you want to restrict this output to a single function |
| (because that's where the conflict appears), you can use the following |
| syntax:: |
| |
| git log -L:'\<function\>':<path> HEAD..<commit>^ |
| |
| .. note:: |
| The ``\<`` and ``\>`` around the function name ensure that the |
| matches are anchored on a word boundary. This is important, as this |
| part is actually a regex and git only follows the first match, so |
| if you use ``-L:thread_stack:kernel/fork.c`` it may only give you |
| results for the function ``try_release_thread_stack_to_cache`` even |
| though there are many other functions in that file containing the |
| string ``thread_stack`` in their names. |
| |
| Another useful option for ``git log`` is ``-G``, which allows you to |
| filter on certain strings appearing in the diffs of the commits you are |
| listing:: |
| |
| git log -G'regex' HEAD..<commit>^ -- <path> |
| |
| This can also be a handy way to quickly find when something (e.g. a |
| function call or a variable) was changed, added, or removed. The search |
| string is a regular expression, which means you can potentially search |
| for more specific things like assignments to a specific struct member:: |
| |
| git log -G'\->index\>.*=' |
| |
| git blame |
| ~~~~~~~~~ |
| |
| Another way to find prerequisite commits (albeit only the most recent |
| one for a given conflict) is to run ``git blame``. In this case, you |
| need to run it against the parent commit of the patch you are |
| cherry-picking and the file where the conflict appared, i.e.:: |
| |
| git blame <commit>^ -- <path> |
| |
| This command also accepts the ``-L`` argument (for restricting the |
| output to a single function), but in this case you specify the filename |
| at the end of the command as usual:: |
| |
| git blame -L:'\<function\>' <commit>^ -- <path> |
| |
| Navigate to the place where the conflict occurred. The first column of |
| the blame output is the commit ID of the patch that added a given line |
| of code. |
| |
| It might be a good idea to ``git show`` these commits and see if they |
| look like they might be the source of the conflict. Sometimes there will |
| be more than one of these commits, either because multiple commits |
| changed different lines of the same conflict area *or* because multiple |
| subsequent patches changed the same line (or lines) multiple times. In |
| the latter case, you may have to run ``git blame`` again and specify the |
| older version of the file to look at in order to dig further back in |
| the history of the file. |
| |
| Prerequisite vs. incidental patches |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Having found the patch that caused the conflict, you need to determine |
| whether it is a prerequisite for the patch you are backporting or |
| whether it is just incidental and can be skipped. An incidental patch |
| would be one that touches the same code as the patch you are |
| backporting, but does not change the semantics of the code in any |
| material way. For example, a whitespace cleanup patch is completely |
| incidental -- likewise, a patch that simply renames a function or a |
| variable would be incidental as well. On the other hand, if the function |
| being changed does not even exist in your current branch then this would |
| not be incidental at all and you need to carefully consider whether the |
| patch adding the function should be cherry-picked first. |
| |
| If you find that there is a necessary prerequisite patch, then you need |
| to stop and cherry-pick that instead. If you've already resolved some |
| conflicts in a different file and don't want to do it again, you can |
| create a temporary copy of that file. |
| |
| To abort the current cherry-pick, go ahead and run |
| ``git cherry-pick --abort``, then restart the cherry-picking process |
| with the commit ID of the prerequisite patch instead. |
| |
| Understanding conflict markers |
| ------------------------------ |
| |
| Combined diffs |
| ~~~~~~~~~~~~~~ |
| |
| Let's say you've decided against picking (or reverting) additional |
| patches and you just want to resolve the conflict. Git will have |
| inserted conflict markers into your file. Out of the box, this will look |
| something like:: |
| |
| <<<<<<< HEAD |
| this is what's in your current tree before cherry-picking |
| ======= |
| this is what the patch wants it to be after cherry-picking |
| >>>>>>> <commit>... title |
| |
| This is what you would see if you opened the file in your editor. |
| However, if you were to run ``git diff`` without any arguments, the |
| output would look something like this:: |
| |
| $ git diff |
| [...] |
| ++<<<<<<<< HEAD |
| +this is what's in your current tree before cherry-picking |
| ++======== |
| + this is what the patch wants it to be after cherry-picking |
| ++>>>>>>>> <commit>... title |
| |
| When you are resolving a conflict, the behavior of ``git diff`` differs |
| from its normal behavior. Notice the two columns of diff markers |
| instead of the usual one; this is a so-called "`combined diff`_", here |
| showing the 3-way diff (or diff-of-diffs) between |
| |
| #. the current branch (before cherry-picking) and the current working |
| directory, and |
| #. the current branch (before cherry-picking) and the file as it looks |
| after the original patch has been applied. |
| |
| .. _combined diff: https://git-scm.com/docs/diff-format#_combined_diff_format |
| |
| |
| Better diffs |
| ~~~~~~~~~~~~ |
| |
| 3-way combined diffs include all the other changes that happened to the |
| file between your current branch and the branch you are cherry-picking |
| from. While this is useful for spotting other changes that you need to |
| take into account, this also makes the output of ``git diff`` somewhat |
| intimidating and difficult to read. You may instead prefer to run |
| ``git diff HEAD`` (or ``git diff --ours``) which shows only the diff |
| between the current branch before cherry-picking and the current working |
| directory. It looks like this:: |
| |
| $ git diff HEAD |
| [...] |
| +<<<<<<<< HEAD |
| this is what's in your current tree before cherry-picking |
| +======== |
| +this is what the patch wants it to be after cherry-picking |
| +>>>>>>>> <commit>... title |
| |
| As you can see, this reads just like any other diff and makes it clear |
| which lines are in the current branch and which lines are being added |
| because they are part of the merge conflict or the patch being |
| cherry-picked. |
| |
| Merge styles and diff3 |
| ~~~~~~~~~~~~~~~~~~~~~~ |
| |
| The default conflict marker style shown above is known as the ``merge`` |
| style. There is also another style available, known as the ``diff3`` |
| style, which looks like this:: |
| |
| <<<<<<< HEAD |
| this is what is in your current tree before cherry-picking |
| ||||||| parent of <commit> (title) |
| this is what the patch expected to find there |
| ======= |
| this is what the patch wants it to be after being applied |
| >>>>>>> <commit> (title) |
| |
| As you can see, this has 3 parts instead of 2, and includes what git |
| expected to find there but didn't. It is *highly recommended* to use |
| this conflict style as it makes it much clearer what the patch actually |
| changed; i.e., it allows you to compare the before-and-after versions |
| of the file for the commit you are cherry-picking. This allows you to |
| make better decisions about how to resolve the conflict. |
| |
| To change conflict marker styles, you can use the following command:: |
| |
| git config merge.conflictStyle diff3 |
| |
| There is a third option, ``zdiff3``, introduced in `Git 2.35`_, |
| which has the same 3 sections as ``diff3``, but where common lines have |
| been trimmed off, making the conflict area smaller in some cases. |
| |
| .. _Git 2.35: https://github.blog/2022-01-24-highlights-from-git-2-35/ |
| |
| Iterating on conflict resolutions |
| --------------------------------- |
| |
| The first step in any conflict resolution process is to understand the |
| patch you are backporting. For the Linux kernel this is especially |
| important, since an incorrect change can lead to the whole system |
| crashing -- or worse, an undetected security vulnerability. |
| |
| Understanding the patch can be easy or difficult depending on the patch |
| itself, the changelog, and your familiarity with the code being changed. |
| However, a good question for every change (or every hunk of the patch) |
| might be: "Why is this hunk in the patch?" The answers to these |
| questions will inform your conflict resolution. |
| |
| Resolution process |
| ~~~~~~~~~~~~~~~~~~ |
| |
| Sometimes the easiest thing to do is to just remove all but the first |
| part of the conflict, leaving the file essentially unchanged, and apply |
| the changes by hand. Perhaps the patch is changing a function call |
| argument from ``0`` to ``1`` while a conflicting change added an |
| entirely new (and insignificant) parameter to the end of the parameter |
| list; in that case, it's easy enough to change the argument from ``0`` |
| to ``1`` by hand and leave the rest of the arguments alone. This |
| technique of manually applying changes is mostly useful if the conflict |
| pulled in a lot of unrelated context that you don't really need to care |
| about. |
| |
| For particularly nasty conflicts with many conflict markers, you can use |
| ``git add`` or ``git add -i`` to selectively stage your resolutions to |
| get them out of the way; this also lets you use ``git diff HEAD`` to |
| always see what remains to be resolved or ``git diff --cached`` to see |
| what your patch looks like so far. |
| |
| Dealing with file renames |
| ~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| One of the most annoying things that can happen while backporting a |
| patch is discovering that one of the files being patched has been |
| renamed, as that typically means git won't even put in conflict markers, |
| but will just throw up its hands and say (paraphrased): "Unmerged path! |
| You do the work..." |
| |
| There are generally a few ways to deal with this. If the patch to the |
| renamed file is small, like a one-line change, the easiest thing is to |
| just go ahead and apply the change by hand and be done with it. On the |
| other hand, if the change is big or complicated, you definitely don't |
| want to do it by hand. |
| |
| As a first pass, you can try something like this, which will lower the |
| rename detection threshold to 30% (by default, git uses 50%, meaning |
| that two files need to have at least 50% in common for it to consider |
| an add-delete pair to be a potential rename):: |
| |
| git cherry-pick -strategy=recursive -Xrename-threshold=30 |
| |
| Sometimes the right thing to do will be to also backport the patch that |
| did the rename, but that's definitely not the most common case. Instead, |
| what you can do is to temporarily rename the file in the branch you're |
| backporting to (using ``git mv`` and committing the result), restart the |
| attempt to cherry-pick the patch, rename the file back (``git mv`` and |
| committing again), and finally squash the result using ``git rebase -i`` |
| (see the `rebase tutorial`_) so it appears as a single commit when you |
| are done. |
| |
| .. _rebase tutorial: https://medium.com/@slamflipstrom/a-beginners-guide-to-squashing-commits-with-git-rebase-8185cf6e62ec |
| |
| Gotchas |
| ------- |
| |
| Function arguments |
| ~~~~~~~~~~~~~~~~~~ |
| |
| Pay attention to changing function arguments! It's easy to gloss over |
| details and think that two lines are the same but actually they differ |
| in some small detail like which variable was passed as an argument |
| (especially if the two variables are both a single character that look |
| the same, like i and j). |
| |
| Error handling |
| ~~~~~~~~~~~~~~ |
| |
| If you cherry-pick a patch that includes a ``goto`` statement (typically |
| for error handling), it is absolutely imperative to double check that |
| the target label is still correct in the branch you are backporting to. |
| The same goes for added ``return``, ``break``, and ``continue`` |
| statements. |
| |
| Error handling is typically located at the bottom of the function, so it |
| may not be part of the conflict even though could have been changed by |
| other patches. |
| |
| A good way to ensure that you review the error paths is to always use |
| ``git diff -W`` and ``git show -W`` (AKA ``--function-context``) when |
| inspecting your changes. For C code, this will show you the whole |
| function that's being changed in a patch. One of the things that often |
| go wrong during backports is that something else in the function changed |
| on either of the branches that you're backporting from or to. By |
| including the whole function in the diff you get more context and can |
| more easily spot problems that might otherwise go unnoticed. |
| |
| Refactored code |
| ~~~~~~~~~~~~~~~ |
| |
| Something that happens quite often is that code gets refactored by |
| "factoring out" a common code sequence or pattern into a helper |
| function. When backporting patches to an area where such a refactoring |
| has taken place, you effectively need to do the reverse when |
| backporting: a patch to a single location may need to be applied to |
| multiple locations in the backported version. (One giveaway for this |
| scenario is that a function was renamed -- but that's not always the |
| case.) |
| |
| To avoid incomplete backports, it's worth trying to figure out if the |
| patch fixes a bug that appears in more than one place. One way to do |
| this would be to use ``git grep``. (This is actually a good idea to do |
| in general, not just for backports.) If you do find that the same kind |
| of fix would apply to other places, it's also worth seeing if those |
| places exist upstream -- if they don't, it's likely the patch may need |
| to be adjusted. ``git log`` is your friend to figure out what happened |
| to these areas as ``git blame`` won't show you code that has been |
| removed. |
| |
| If you do find other instances of the same pattern in the upstream tree |
| and you're not sure whether it's also a bug, it may be worth asking the |
| patch author. It's not uncommon to find new bugs during backporting! |
| |
| Verifying the result |
| ==================== |
| |
| colordiff |
| --------- |
| |
| Having committed a conflict-free new patch, you can now compare your |
| patch to the original patch. It is highly recommended that you use a |
| tool such as `colordiff`_ that can show two files side by side and color |
| them according to the changes between them:: |
| |
| colordiff -yw -W 200 <(git diff -W <upstream commit>^-) <(git diff -W HEAD^-) | less -SR |
| |
| .. _colordiff: https://www.colordiff.org/ |
| |
| Here, ``-y`` means to do a side-by-side comparison; ``-w`` ignores |
| whitespace, and ``-W 200`` sets the width of the output (as otherwise it |
| will use 130 by default, which is often a bit too little). |
| |
| The ``rev^-`` syntax is a handy shorthand for ``rev^..rev``, essentially |
| giving you just the diff for that single commit; also see |
| the official `git rev-parse documentation`_. |
| |
| .. _git rev-parse documentation: https://git-scm.com/docs/git-rev-parse#_other_rev_parent_shorthand_notations |
| |
| Again, note the inclusion of ``-W`` for ``git diff``; this ensures that |
| you will see the full function for any function that has changed. |
| |
| One incredibly important thing that colordiff does is to highlight lines |
| that are different. For example, if an error-handling ``goto`` has |
| changed labels between the original and backported patch, colordiff will |
| show these side-by-side but highlighted in a different color. Thus, it |
| is easy to see that the two ``goto`` statements are jumping to different |
| labels. Likewise, lines that were not modified by either patch but |
| differ in the context will also be highlighted and thus stand out during |
| a manual inspection. |
| |
| Of course, this is just a visual inspection; the real test is building |
| and running the patched kernel (or program). |
| |
| Build testing |
| ------------- |
| |
| We won't cover runtime testing here, but it can be a good idea to build |
| just the files touched by the patch as a quick sanity check. For the |
| Linux kernel you can build single files like this, assuming you have the |
| ``.config`` and build environment set up correctly:: |
| |
| make path/to/file.o |
| |
| Note that this won't discover linker errors, so you should still do a |
| full build after verifying that the single file compiles. By compiling |
| the single file first you can avoid having to wait for a full build *in |
| case* there are compiler errors in any of the files you've changed. |
| |
| Runtime testing |
| --------------- |
| |
| Even a successful build or boot test is not necessarily enough to rule |
| out a missing dependency somewhere. Even though the chances are small, |
| there could be code changes where two independent changes to the same |
| file result in no conflicts, no compile-time errors, and runtime errors |
| only in exceptional cases. |
| |
| One concrete example of this was a pair of patches to the system call |
| entry code where the first patch saved/restored a register and a later |
| patch made use of the same register somewhere in the middle of this |
| sequence. Since there was no overlap between the changes, one could |
| cherry-pick the second patch, have no conflicts, and believe that |
| everything was fine, when in fact the code was now scribbling over an |
| unsaved register. |
| |
| Although the vast majority of errors will be caught during compilation |
| or by superficially exercising the code, the only way to *really* verify |
| a backport is to review the final patch with the same level of scrutiny |
| as you would (or should) give to any other patch. Having unit tests and |
| regression tests or other types of automatic testing can help increase |
| the confidence in the correctness of a backport. |
| |
| Submitting backports to stable |
| ============================== |
| |
| As the stable maintainers try to cherry-pick mainline fixes onto their |
| stable kernels, they may send out emails asking for backports when when |
| encountering conflicts, see e.g. |
| <https://lore.kernel.org/stable/2023101528-jawed-shelving-071a@gregkh/>. |
| These emails typically include the exact steps you need to cherry-pick |
| the patch to the correct tree and submit the patch. |
| |
| One thing to make sure is that your changelog conforms to the expected |
| format:: |
| |
| <original patch title> |
| |
| [ Upstream commit <mainline rev> ] |
| |
| <rest of the original changelog> |
| [ <summary of the conflicts and their resolutions> ] |
| Signed-off-by: <your name and email> |
| |
| The "Upstream commit" line is sometimes slightly different depending on |
| the stable version. Older version used this format:: |
| |
| commit <mainline rev> upstream. |
| |
| It is most common to indicate the kernel version the patch applies to |
| in the email subject line (using e.g. |
| ``git send-email --subject-prefix='PATCH 6.1.y'``), but you can also put |
| it in the Signed-off-by:-area or below the ``---`` line. |
| |
| The stable maintainers expect separate submissions for each active |
| stable version, and each submission should also be tested separately. |
| |
| A few final words of advice |
| =========================== |
| |
| 1) Approach the backporting process with humility. |
| 2) Understand the patch you are backporting; this means reading both |
| the changelog and the code. |
| 3) Be honest about your confidence in the result when submitting the |
| patch. |
| 4) Ask relevant maintainers for explicit acks. |
| |
| Examples |
| ======== |
| |
| The above shows roughly the idealized process of backporting a patch. |
| For a more concrete example, see this video tutorial where two patches |
| are backported from mainline to stable: |
| `Backporting Linux Kernel Patches`_. |
| |
| .. _Backporting Linux Kernel Patches: https://youtu.be/sBR7R1V2FeA |