
\chapter{Managing change with Mercurial Queues}
\label{chap:mq}

\section{The patch management problem}
\label{sec:mq:patch-mgmt}

Here is a common scenario: you need to install a software package from
source, but you find a bug that you must fix in the source before you
can start using the package. You make your changes, forget about the
package for a while, and a few months later you need to upgrade to a
newer version of the package. If the newer version of the package
still has the bug, you must extract your fix from the older source
tree and apply it against the newer version. This is a tedious task,
and it's easy to make mistakes.

This is a simple case of the ``patch management'' problem. You have
an ``upstream'' source tree that you can't change; you need to make
some local changes on top of the upstream tree; and you'd like to be
able to keep those changes separate, so that you can apply them to
newer versions of the upstream source.

The patch management problem arises in many situations. Probably the
most visible is that a user of an open source software project will
contribute a bug fix or new feature to the project's maintainers in
the form of a patch. Distributors of operating systems that include
open source software often need to make changes to the packages they
distribute so that they will build properly in their environments.

When you have few changes to maintain, it is easy to manage a single
patch using the standard \texttt{diff} and \texttt{patch} programs
(see section~\ref{sec:mq:patch} for a discussion of these tools).
Once the number of changes grows, it starts to make sense to maintain
patches as discrete ``chunks of work,'' so that for example a single
patch will contain only one bug fix (the patch might modify several
files, but it's doing ``only one thing''), and you may have a number
of such patches for different bugs you need fixed and local changes
you require.
In this situation, if you submit a bug fix patch to the upstream
maintainers of a package and they include your fix in a subsequent
release, you can simply drop that single patch when you're updating
to the newer release.

Maintaining a single patch against an upstream tree is a little
tedious and error-prone, but not difficult. However, the complexity
of the problem grows rapidly as the number of patches you have to
maintain increases. With more than a tiny number of patches in hand,
understanding which ones you have applied and maintaining them moves
from messy to overwhelming.

Fortunately, Mercurial includes a powerful extension, Mercurial Queues
(or simply ``MQ''), that massively simplifies the patch management
problem.

\section{The prehistory of Mercurial Queues}
\label{sec:mq:history}

During the late 1990s, several Linux kernel developers started to
maintain ``patch series'' that modified the behaviour of the Linux
kernel. Some of these series were focused on stability, some on
feature coverage, and others were more speculative.

The sizes of these patch series grew rapidly. In 2002, Andrew Morton
published some shell scripts he had been using to automate the task of
managing his patch queues. Andrew was successfully using these
scripts to manage hundreds (sometimes thousands) of patches on top of
the Linux kernel.

\subsection{A patchwork quilt}
\label{sec:mq:quilt}

In early 2003, Andreas Gruenbacher and Martin Quinson borrowed the
approach of Andrew's scripts and published a tool called ``patchwork
quilt''~\cite{web:quilt}, or simply ``quilt''
(see~\cite{gruenbacher:2005} for a paper describing it). Because
quilt substantially automated patch management, it rapidly gained a
large following among open source software developers.

Quilt manages a \emph{stack of patches} on top of a directory tree.
To begin, you tell quilt to manage a directory tree; it stores away
the names and contents of all files in the tree.
To fix a bug, you create a new patch (using a single command), edit
the files you need to fix, then ``refresh'' the patch. The refresh
step causes quilt to scan the directory tree; it updates the patch
with all of the changes you have made. You can create another patch
on top of the first, which will track the changes required to modify
the tree from ``tree with one patch applied'' to ``tree with two
patches applied''.

You can \emph{change} which patches are applied to the tree. If you
``pop'' a patch, the changes made by that patch will vanish from the
directory tree. Quilt remembers which patches you have popped,
though, so you can ``push'' a popped patch again, and the directory
tree will be restored to contain the modifications in the patch. Most
importantly, you can run the ``refresh'' command at any time, and the
topmost applied patch will be updated. This means that you can, at
any time, change both which patches are applied and what
modifications those patches make.

Quilt knows nothing about revision control tools, so it works equally
well on top of an unpacked tarball or a Subversion repository.

\subsection{From patchwork quilt to Mercurial Queues}
\label{sec:mq:quilt-mq}

In mid-2005, Chris Mason took the features of quilt and wrote an
extension that he called Mercurial Queues, which added quilt-like
behaviour to Mercurial.

The key difference between quilt and MQ is that quilt knows nothing
about revision control systems, while MQ is \emph{integrated} into
Mercurial. Each patch that you push is represented as a Mercurial
changeset. Pop a patch, and the changeset goes away.

This integration makes understanding patches and debugging their
effects \emph{enormously} easier. Since every applied patch has an
associated changeset, you can use
\hgcmdargs{log}{\emph{filename}} to see which changesets and patches
affected a file. You can use the \hgext{bisect} extension to
binary-search through all changesets and applied patches to see where
a bug got introduced or fixed.
You can use the \hgcmd{annotate} command to see which changeset or
patch modified a particular line of a source file. And so on.

Because quilt does not care about revision control tools, it is still
a tremendously useful piece of software to know about for situations
where you cannot use Mercurial and MQ.

\section{Getting started with Mercurial Queues}
\label{sec:mq:start}

Because MQ is implemented as an extension, you must explicitly enable
it before you can use it. (You don't need to download anything; MQ
ships with the standard Mercurial distribution.) To enable MQ, edit
your \tildefile{.hgrc} file, and add the lines in
figure~\ref{ex:mq:config}.

\begin{figure}[ht]
  \begin{codesample4}
    [extensions]
    hgext.mq =
  \end{codesample4}
  \caption{Contents to add to \tildefile{.hgrc} to enable the MQ extension}
  \label{ex:mq:config}
\end{figure}

Once the extension is enabled, it will make a number of new commands
available. To verify that the extension is working, you can use
\hgcmd{help} to see if the \hgcmd{qinit} command is now available; see
the example in figure~\ref{ex:mq:enabled}.

\begin{figure}[ht]
  \interaction{mq.qinit-help.help}
  \caption{How to verify that MQ is enabled}
  \label{ex:mq:enabled}
\end{figure}

You can use MQ with \emph{any} Mercurial repository, and its commands
only operate within that repository. To get started, simply prepare
the repository using the \hgcmd{qinit} command (see
figure~\ref{ex:mq:qinit}). This command creates an empty directory
called \sdirname{.hg/patches}, where MQ will keep its metadata. As
with many Mercurial commands, the \hgcmd{qinit} command prints nothing
if it succeeds.

\begin{figure}[ht]
  \interaction{mq.tutorial.qinit}
  \caption{Preparing a repository for use with MQ}
  \label{ex:mq:qinit}
\end{figure}

\begin{figure}[ht]
  \interaction{mq.tutorial.qnew}
  \caption{Creating a new patch}
  \label{ex:mq:qnew}
\end{figure}

\subsection{Creating a new patch}

To begin work on a new patch, use the \hgcmd{qnew} command.
This command takes one argument, the name of the patch to create. MQ
will use this as the name of an actual file in the
\sdirname{.hg/patches} directory, as you can see in
figure~\ref{ex:mq:qnew}.

Also newly present in the \sdirname{.hg/patches} directory are two
other files, \sfilename{series} and \sfilename{status}. The
\sfilename{series} file lists all of the patches that MQ knows about
for this repository, with one patch per line. Mercurial uses the
\sfilename{status} file for internal book-keeping; it tracks all of the
patches that MQ has \emph{applied} in this repository.

\begin{note}
  You may sometimes want to edit the \sfilename{series} file by hand;
  for example, to change the sequence in which some patches are
  applied. However, manually editing the \sfilename{status} file is
  almost always a bad idea, as it's easy to corrupt MQ's idea of what
  is happening.
\end{note}

Once you have created your new patch, you can edit files in the
working directory as you usually would. All of the normal Mercurial
commands, such as \hgcmd{diff} and \hgcmd{annotate}, work exactly as
they did before.

\subsection{Refreshing a patch}

When you reach a point where you want to save your work, use the
\hgcmd{qrefresh} command (figure~\ref{ex:mq:qrefresh}) to update the
patch you are working on. This command folds the changes you have
made in the working directory into your patch, and updates its
corresponding changeset to contain those changes.

\begin{figure}[ht]
  \interaction{mq.tutorial.qrefresh}
  \caption{Refreshing a patch}
  \label{ex:mq:qrefresh}
\end{figure}

You can run \hgcmd{qrefresh} as often as you like, so it's a good way
to ``checkpoint'' your work. Refresh your patch at an opportune
time; try an experiment; and if the experiment doesn't work out,
\hgcmd{revert} your modifications back to the last time you refreshed.
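
As a rough sketch of this checkpointing habit, the sequence of
commands might look like the following. The comments are only
explanatory, and the options may vary: I am assuming a version of
Mercurial in which \hgcmd{revert} accepts the \hgopt{revert}{--all}
option to restore every modified file.

\begin{codesample4}
    hg qrefresh        # checkpoint: fold current edits into the patch
    # (edit files and try the experiment)
    hg diff            # show what has changed since the last refresh
    hg revert --all    # abandon the experiment, back to the checkpoint
\end{codesample4}

If the experiment works out instead, run \hgcmd{qrefresh} again in
place of \hgcmd{revert} to record the new changes in the patch.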
\begin{figure}[ht]
  \interaction{mq.tutorial.qrefresh2}
  \caption{Refresh a patch many times to accumulate changes}
  \label{ex:mq:qrefresh2}
\end{figure}

\subsection{Stacking and tracking patches}

Once you have finished working on a patch, or need to work on another,
you can use the \hgcmd{qnew} command again to create a new patch.
Mercurial will apply this patch on top of your existing patch. See
figure~\ref{ex:mq:qnew2} for an example. Notice that the patch
contains the changes in our prior patch as part of its context (you
can see this more clearly in the output of \hgcmd{annotate}).

\begin{figure}[ht]
  \interaction{mq.tutorial.qnew2}
  \caption{Stacking a second patch on top of the first}
  \label{ex:mq:qnew2}
\end{figure}

So far, with the exception of \hgcmd{qnew} and \hgcmd{qrefresh}, we've
been careful to only use regular Mercurial commands. However, there
are ``more natural'' commands you can use when thinking about patches
with MQ, as illustrated in figure~\ref{ex:mq:qseries}:

\begin{itemize}
\item The \hgcmd{qseries} command lists every patch that MQ knows
  about in this repository, from oldest to newest (most recently
  \emph{created}).
\item The \hgcmd{qapplied} command lists every patch that MQ has
  \emph{applied} in this repository, again from oldest to newest (most
  recently applied).
\end{itemize}

\begin{figure}[ht]
  \interaction{mq.tutorial.qseries}
  \caption{Understanding the patch stack with \hgcmd{qseries} and
    \hgcmd{qapplied}}
  \label{ex:mq:qseries}
\end{figure}

\subsection{Manipulating the patch stack}

The previous discussion implied that there must be a difference
between ``known'' and ``applied'' patches, and there is. MQ can
manage a patch without it being applied in the repository.

An \emph{applied} patch has a corresponding changeset in the
repository, and the effects of the patch and changeset are visible in
the working directory. You can undo the application of a patch using
the \hgcmd{qpop} command.
MQ still \emph{knows about}, or manages, a popped patch, but the patch
no longer has a corresponding changeset in the repository, and the
working directory does not contain the changes made by the patch.
Figure~\ref{fig:mq:stack} illustrates the difference between applied
and tracked patches.

\begin{figure}[ht]
  \centering
  \grafix{mq-stack}
  \caption{Applied and unapplied patches in the MQ patch stack}
  \label{fig:mq:stack}
\end{figure}

You can reapply an unapplied, or popped, patch using the \hgcmd{qpush}
command. This creates a new changeset to correspond to the patch, and
the patch's changes once again become present in the working
directory. See figure~\ref{ex:mq:qpop} for examples of \hgcmd{qpop}
and \hgcmd{qpush} in action. Notice that once we have popped a patch
or two patches, the output of \hgcmd{qseries} remains the same, while
that of \hgcmd{qapplied} has changed.

\begin{figure}[ht]
  \interaction{mq.tutorial.qpop}
  \caption{Modifying the stack of applied patches}
  \label{ex:mq:qpop}
\end{figure}

MQ does not limit you to pushing or popping one patch. You can have
no patches, all of them, or any number in between applied at some
point in time.

\subsection{Working on several patches at once}

The \hgcmd{qrefresh} command always refreshes the \emph{topmost}
applied patch. This means that you can suspend work on one patch (by
refreshing it), pop or push to make a different patch the top, and
work on \emph{that} patch for a while.

Here's an example that illustrates how you can use this ability.
Let's say you're developing a new feature as two patches. The first
is a change to the core of your software, and the second--layered on
top of the first--changes the user interface to use the code you just
added to the core. If you notice a bug in the core while you're
working on the UI patch, it's easy to fix the core. Simply
\hgcmd{qrefresh} the UI patch to save your in-progress changes, and
\hgcmd{qpop} down to the core patch.
Fix the core bug, \hgcmd{qrefresh} the core patch, and \hgcmd{qpush}
back to the UI patch to continue where you left off.

\section{Mercurial Queues and GNU patch}
\label{sec:mq:patch}

MQ uses the GNU \command{patch} command to apply patches. Because MQ
doesn't hide its patch-oriented nature, it is helpful to understand
the data that MQ and \command{patch} work with, and a few aspects of
how \command{patch} operates.

The \command{diff} command generates a list of modifications by
comparing two files. The \command{patch} command applies a list of
modifications to a file. The kinds of files that \command{diff} and
\command{patch} work with are referred to as both ``diffs'' and
``patches;'' there is no difference between a diff and a patch.

A patch file can start with arbitrary text; MQ uses this text as the
commit message when creating changesets. It treats the first line
that starts with the string ``\texttt{diff~-}'' as the separator
between header and content.

MQ works with \emph{unified} diffs (\command{patch} can accept several
other diff formats, but MQ doesn't). A unified diff contains two
kinds of header. The \emph{file header} describes the file being
modified; it contains the name of the file to modify. When
\command{patch} sees a new file header, it looks for a file with that
name to start modifying.

After the file header comes a series of \emph{hunks}. Each hunk
starts with a header; this identifies the range of line numbers within
the file that the hunk should modify. Following the header, a hunk
starts and ends with a few (usually three) lines of text from the
unmodified file; these are called the \emph{context} for the hunk.
Each unmodified line begins with a space character. Within the hunk,
a line that begins with ``\texttt{-}'' means ``remove this line,''
while a line that begins with ``\texttt{+}'' means ``insert this
line.'' For example, a line that is modified is represented by one
deletion and one insertion.
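
To make this anatomy concrete, here is a small hand-written patch of
the kind described above. The file name \filename{hello.txt} and its
contents are invented purely for illustration.

\begin{codesample4}
    Fix a typo in the greeting

    diff -u a/hello.txt b/hello.txt
    --- a/hello.txt
    +++ b/hello.txt
    @@ -1,5 +1,5 @@
     line one
     line two
    -helo, world
    +hello, world
     line four
     line five
\end{codesample4}

The first line is the free-form text that MQ would use as the commit
message, and the line beginning with ``\texttt{diff~-}'' ends it. The
lines starting with \texttt{---} and \texttt{+++} form the file
header. The line starting with \texttt{@@} is the hunk header: it
says that the hunk covers five lines starting at line~1 in both the
old and new versions of the file. Only two lines of context appear on
each side of the change, because this hypothetical file is just five
lines long.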

The \command{diff} command runs hunks together when there's not enough
context between modifications to justify separate hunks.

When \command{patch} applies a hunk, it tries a handful of
successively less accurate strategies to try to make the hunk apply.
This falling-back technique often makes it possible to take a patch
that was generated against an old version of a file, and apply it
against a newer version of that file.

First, \command{patch} tries an exact match, where the line numbers,
the context, and the text to be modified must apply exactly. If it
cannot make an exact match, it tries to find an exact match for the
context, without honouring the line numbering information. If this
succeeds, it prints a line of output saying that the hunk was applied,
but at some \emph{offset} from the original line number.

If a context-only match fails, \command{patch} removes the first and
last lines of the context, and tries a \emph{reduced} context-only
match. If the hunk with reduced context succeeds, it prints a message
saying that it applied the hunk with a \emph{fuzz factor} (the number
after the fuzz factor indicates how many lines of context
\command{patch} had to trim before the patch applied).

When none of these techniques works, \command{patch} prints a
message saying that the hunk in question was rejected. It saves
rejected hunks to a file with the same name, and an added
\sfilename{.rej} extension. It also saves an unmodified copy of the
file with a \sfilename{.orig} extension; the copy of the file without
any extensions will contain any changes made by hunks that \emph{did}
apply cleanly. If you have a patch that modifies \filename{foo} with
six hunks, and one of them fails to apply, you will have: an
unmodified \filename{foo.orig}, a \filename{foo.rej} containing one
hunk, and \filename{foo}, containing the changes made by the five
successful hunks.
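
The messages that report these outcomes look roughly like the
following. This is an illustrative transcript rather than captured
output: the file name, hunk numbers, and line numbers are made up, and
the exact wording differs between versions of \command{patch}.

\begin{codesample4}
    patching file foo
    Hunk #1 succeeded at 12 (offset 2 lines).
    Hunk #2 succeeded at 40 with fuzz 1.
    Hunk #3 FAILED at 97.
    1 out of 3 hunks FAILED -- saving rejects to file foo.rej
\end{codesample4}

Here the first hunk applied at an offset, the second needed a fuzz
factor, and the third was rejected and saved to \filename{foo.rej}.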
\subsection{Beware the fuzz}

While applying a hunk at an offset, or with a fuzz factor, will often
be completely successful, these inexact techniques naturally leave
open the possibility of corrupting the patched file. The most common
cases typically involve applying a patch twice, or at an incorrect
location in the file. If \command{patch} or \hgcmd{qpush} ever
mentions an offset or fuzz factor, you should make sure that the
modified files are correct afterwards.

It's often a good idea to refresh a patch that has applied with an
offset or fuzz factor; refreshing the patch generates new context
information that will make it apply cleanly. I say ``often,'' not
``always,'' because sometimes refreshing a patch will make it fail to
apply against a different revision of the underlying files. In some
cases, such as when you're maintaining a patch that must sit on top of
multiple versions of a source tree, it's acceptable to have a patch
apply with some fuzz, provided you've verified the results of the
patching process in such cases.

\subsection{Handling rejection}

If \hgcmd{qpush} fails to apply a patch, it will print an error
message and exit. If it has left \sfilename{.rej} files behind, it is
usually best to fix up the rejected hunks before you push more patches
or do any further work.

If your patch \emph{used to} apply cleanly, and no longer does because
you've changed the underlying code that your patches are based on,
Mercurial Queues can help; see section~\ref{sec:mq:merge} for details.

Unfortunately, there aren't any great techniques for dealing with
rejected hunks. Most often, you'll need to view the \sfilename{.rej}
file and edit the target file, applying the rejected hunks by hand.

If you're feeling adventurous, Neil Brown, a Linux kernel hacker,
wrote a tool called \command{wiggle}~\cite{web:wiggle}, which is more
vigorous than \command{patch} in its attempts to make a patch apply.

Another Linux kernel hacker, Chris Mason (the author of Mercurial
Queues), wrote a similar tool called \command{rej}~\cite{web:rej},
which takes a simple approach to automating the application of hunks
rejected by \command{patch}. \command{rej} can help with four common
reasons that a hunk may be rejected:

\begin{itemize}
\item The context in the middle of a hunk has changed.
\item A hunk is missing some context at the beginning or end.
\item A large hunk might apply better--either entirely or in
  part--if it was broken up into smaller hunks.
\item A hunk removes lines with slightly different content than those
  currently present in the file.
\end{itemize}

If you use \command{wiggle} or \command{rej}, you should be doubly
careful to check your results when you're done.

\section{Updating your patches when the underlying code changes}
\label{sec:mq:merge}

XXX.

\section{Managing patches in a repository}

Because MQ's \sdirname{.hg/patches} directory resides outside a
Mercurial repository's working directory, the ``underlying'' Mercurial
repository knows nothing about the management or presence of patches.

This presents the interesting possibility of managing the contents of
the patch directory as a Mercurial repository in its own right. This
can be a useful way to work. For example, you can work on a patch for
a while, \hgcmd{qrefresh} it, then \hgcmd{commit} the current state of
the patch. This lets you ``roll back'' to that version of the patch
later on.

In addition, you can then share different versions of the same patch
stack among multiple underlying repositories. I use this when I am
developing a Linux kernel feature. I have a pristine copy of my
kernel sources for each of several CPU architectures, and a cloned
repository under each that contains the patches I am working on.
When I want to test a change on a different architecture, I push my
current patches to the patch repository associated with that kernel
tree, pop and push all of my patches, and build and test that kernel.

Managing patches in a repository makes it possible for multiple
developers to work on the same patch series without colliding with
each other, all on top of an underlying source base that they may or
may not control.

\subsection{MQ support for managing a patch repository}

MQ helps you to work with the \sdirname{.hg/patches} directory as a
repository; when you prepare a repository for working with patches
using \hgcmd{qinit}, you can pass the \hgopt{qinit}{-c} option to
create the \sdirname{.hg/patches} directory as a Mercurial repository.

\begin{note}
  If you forget to use the \hgopt{qinit}{-c} option, you can simply go
  into the \sdirname{.hg/patches} directory at any time and run
  \hgcmd{init}. Don't forget to add an entry for the
  \filename{status} file to the \filename{.hgignore} file, though
  (\hgopt{qinit}{-c} does this for you automatically); you
  \emph{really} don't want to manage the \filename{status} file.
\end{note}

As a convenience, if MQ notices that the \dirname{.hg/patches}
directory is a repository, it will automatically \hgcmd{add} every
patch that you create and import.

Finally, MQ provides a shortcut command, \hgcmd{qcommit}, that runs
\hgcmd{commit} in the \sdirname{.hg/patches} directory. This saves
some cumbersome typing.

\subsection{A few things to watch out for}

MQ's support for working with a repository full of patches is limited
in a few small respects.

MQ cannot automatically detect changes that you make to the patch
directory. If you \hgcmd{pull}, manually edit, or \hgcmd{update}
changes to patches or the \sfilename{series} file, you will have to
\hgcmdargs{qpop}{-a} and then \hgcmdargs{qpush}{-a} in the underlying
repository to see those changes show up there.
If you forget to do this, you can confuse MQ's idea of which patches
are applied.

\section{Commands for working with patches}

Once you've been working with patches for a while, you'll find
yourself hungry for tools that will help you to understand and
manipulate the patches you're dealing with.

The \command{diffstat} command~\cite{web:diffstat} generates a
histogram of the modifications made to each file in a patch. It
provides a good way to ``get a sense of'' a patch--which files it
affects, and how much change it introduces to each file and as a
whole. (I find that it's a good idea to use \command{diffstat}'s
\texttt{-p} option as a matter of course, as otherwise it will try to
do clever things with prefixes of file names that inevitably confuse
at least me.)

The \package{patchutils} package~\cite{web:patchutils} is invaluable.
It provides a set of small utilities that follow the ``Unix
philosophy;'' each does one useful thing with a patch. The
\package{patchutils} command I use most is \command{filterdiff}, which
extracts subsets from a patch file. For example, given a patch that
modifies hundreds of files across dozens of directories, a single
invocation of \command{filterdiff} can generate a smaller patch that
only touches files whose names match a particular glob pattern.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: