I mix tabs and spaces.
We are all familiar with the three endless debates. In increasing order of importance, they are:
- faith vs. empiricism
- vim vs. emacs
- tabs vs. spaces
Today I will tackle the most important debate: tabs vs. spaces. In most languages, the decision essentially boils down to an aesthetic one, and many, many paragraphs have been spent discussing the aesthetic tradeoff. However, some languages, like Haskell, are also whitespace sensitive, which adds some technical and social reasons to choose one or the other. Many people believe that tabs are fundamentally incompatible with pretty, functioning code in languages that are whitespace sensitive.
In this post, I will describe a whitespace style that is compatible with whitespace sensitivity; hence, the goal of this post is not to resolve the debate but merely to return it to its previous status as an aesthetic tradeoff on which reasonable people can disagree*.
- Existing tooling: the style should be usable with modern but already-existing text editors.
- Compatible with user-specified tab widths: one of the main draws of tabs is that they have some semantic content, and their display width can therefore be configured per user. The style must therefore "work" no matter how the user has configured their tabstops.
- Usable in Haskell source: the style should "work" when the compiler chooses 8-wide tabstops, and the style should keep Haskell's flexibility about where blocks can start (i.e. either on the same line as or on the line after a block herald).
- Obvious errors: it should be easy -- both for a human and, ideally, for a compiler -- to spot code that does not fit in the style, so that maintainers can reject or correct badly styled patches and so that compilers can warn about bad style.
In particular, while making errors obvious is a goal, catering to the lazy or inattentive is a non-goal: I will be assuming throughout this document that all core collaborators to a project will know and adhere to whatever whitespace convention is chosen for that project.
Examples
The simplest example looks like this:

    main = do
    	foo
    	bar

A related example is this one, where we are interested in aligning, rather than indenting, our code:

    main = print [ foo
                 , bar
                 , baz
                 ]

If we want to align things on indented lines, we will need to have some tabs for the indentation followed by spaces for the alignment.

    main = print answer where
    	answer = [ foo
    	         , bar
    	         , baz
    	         ]

In fact, we may have arbitrary alternations of the two kinds of whitespace; for a more exciting example, suppose we wanted to introduce a deeper indentation level in our aligned thing.

    main = print answer where
    	answer = [ foo
    	         , case bar of
    	         	0 -> quux
    	         	n -> bismarck
    	         , baz
    	         ]

You could also choose to use two more spaces on the `bismarck` lines if you felt the patterns should be indented from the `case` rather than the `,`. Here is a very similar example, which includes a feature none of the others did: beginning a block on the same line as the block herald.

    main = do foo
              case bar of
              	0 -> quux
              	n -> bismarck
              baz

The Guiding Principle on Which Starfleet is Based
First and foremost: always fit in. Do not try to push this style on an existing project with another style. Do not advise switching a project to this style. Do not submit patches to an existing project with a chunk of code that uses this style in the middle of swathes of code that do not use this style. Just generally don't be an ass.
Secondly and middlemost†: tabs for indentation, spaces for alignment. In other words, use a tab when you want things to be a bit farther to the right, but don't care how much, and use spaces when you want to move a bit farther to the right and care that it be exactly a certain number of characters. This idea is covered well in other places on the web, with the one caveat that you should ignore implications that there should never be tabs after spaces. That advice works well for languages whose syntax does not nest in interesting ways; but since Haskell allows almost any part of the language's grammar to appear as part of an expression, it is possible (and even somewhat common) to need to mix indentation and alignment.
Thirdly and hindmost: tabs and spaces are of incomparable width. A tab is not eight spaces. A tab is not four spaces. A tab is not two spaces. A tab does not jump to the next eight-character tabstop. There is not some number n such that tabs jump to the next column that is a multiple of n. Tabs are not at least one space wide. Tabs do not even necessarily move to the right! There is simply nothing you can do with spaces to correctly align yourself with a tab character‡. If you want to align things on two separate lines, and one line has a tab character as the nth character in the line, well, the other line will simply have to have a tab character as the nth character in the line, too.
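One way to make "incomparable" precise is to order runs of leading whitespace by prefix: one run is deeper than another exactly when the other is a prefix of it. The following is only a minimal sketch of that idea; the names `compareIndent` and `leadingWhitespace` are made up for illustration and do not come from any existing tool.

```haskell
import Data.List (isPrefixOf)

-- | Compare two runs of leading whitespace by the prefix rule.
--   'Nothing' means neither run is a prefix of the other, so the
--   two runs are incomparable.
compareIndent :: String -> String -> Maybe Ordering
compareIndent s t
  | s == t           = Just EQ
  | t `isPrefixOf` s = Just GT  -- s is deeper than t
  | s `isPrefixOf` t = Just LT  -- s is shallower than t
  | otherwise        = Nothing

-- | The leading whitespace of a source line (tabs and spaces only).
leadingWhitespace :: String -> String
leadingWhitespace = takeWhile (`elem` " \t")
```

With these definitions, `compareIndent "\t" "    "` is `Nothing`: no number of spaces is ever comparable to a tab. By contrast, `compareIndent "\t  " "\t"` is `Just GT`, because a tab followed by two spaces extends a lone tab.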
Editor Configuration
I recommend making tabs visible. I like using a simple `>` to mark the beginning of a tab, which can be done in vim by setting `listchars` and turning on the `list` option. You may also like to turn on `preserveindent`. The relevant parts of my `.vimrc` file look like this:

    set noexpandtab
    set copyindent
    set preserveindent
    set listchars=tab:>\ ,trail:#,extends:>
    set list
    set shiftwidth=4
    set softtabstop=0
    set tabstop=4
    set autoindent

David Christiansen suggests the following emacs configuration settings:

    (setq whitespace-style '(face trailing tabs space-before-tab tab-mark lines-tail))
    (whitespace-mode 1)

I am always interested in including other configuration advice, so please feel free to send me information about how you configure your favorite editor when using this style.
Objections to Tabs
You must configure your text editor to align to 8 space multiples if you're going to use tabs. If you don't, your text editor will display some code as being different from how the compiler will read it.
It's true that if not all of your tabstops are at eight-space multiples, some code will be displayed in a way incompatible with the way the compiler sees it. However, if the difference matters, this code is broken, and, assuming you've configured your editor well, it will be obviously, visibly broken. Code written in the tabstop-agnostic style may be viewed slightly differently by the compiler than it is rendered in your editor -- but not importantly so; one need not force his editor to use the compiler's tabstops.
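To see why the compiler's tabstops need not match the editor's, it can help to compute columns under several tab widths. The sketch below models only the conventional renderer in which a tab jumps to the next multiple of some width `w`; columns are counted from 0 here, every non-tab character is assumed to be one column wide, and the names `columnAfter` and `sameColumnAt` are hypothetical helpers invented for this illustration.

```haskell
-- | Column reached after laying out a string from column 0 when tabstops
--   sit every @w@ columns (w must be positive). Every other character is
--   assumed to be exactly one column wide.
columnAfter :: Int -> String -> Int
columnAfter w = foldl step 0
  where
    step col '\t' = (col `div` w + 1) * w  -- jump to the next tabstop
    step col _    = col + 1

-- | Do two line prefixes end in the same column at every listed tab width?
sameColumnAt :: [Int] -> String -> String -> Bool
sameColumnAt widths a b =
  all (\w -> columnAfter w a == columnAfter w b) widths
```

A tab followed by nine spaces lines up with `"\tanswer = "` at every width, so `sameColumnAt [1 .. 16] "\tanswer = " "\t         "` is `True`. Four tabs line up with `"\tanswer =\t"` only by coincidence: `sameColumnAt [4]` is `True` for that pair, but `sameColumnAt [8]` is `False`.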
You're not interested in indentation in Haskell code. You're interested in precisely which column things start in.
In fact, one is interested in different things at different times: aesthetic indentation (as in the indentation of list elements in a multiline expression), alignment necessary for correct parsing of blocks (as for the patterns of a case statement), and aesthetic alignment (for example, aligning equals signs in a sequence of equations). For the aesthetic indentation and alignment, how the compiler views tabstops doesn't matter, so we should focus on block alignment. One can choose various tab styles; in the most liberal style that still guarantees acceptance by a Haskell compiler, it is true that one is concerned with the exact column number when considering block alignment. This consideration depends very much on how the compiler chooses its tabstops. However, in the more restrictive style described here, one is concerned instead with the sequence of "tab" and "not-tab" characters that have appeared in the line so far, and this consideration is not a function of the compiler's tabstops.
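The "tab" versus "not-tab" view of a line can be sketched in a few lines. This is an illustration only, covering alignment and not indentation by extra tabs; it assumes every non-tab character is as wide as a space, and `tabShape` and `alignsUnder` are names invented here rather than part of any existing checker.

```haskell
import Data.List (isPrefixOf)

-- | Forget everything about a character except whether it is a tab.
tabShape :: String -> [Bool]
tabShape = map (== '\t')

-- | Does the leading whitespace of the lower line retrace, character for
--   character, the tab/not-tab pattern at the start of the upper line?
--   If so, the lower line's text starts directly under the same character
--   position of the upper line regardless of where the tabstops are.
alignsUnder :: String -> String -> Bool
alignsUnder upper lower =
  tabShape (takeWhile (`elem` " \t") lower) `isPrefixOf` tabShape upper
```

For instance, `alignsUnder "\tanswer = [ foo" "\t         , bar"` is `True`, while `alignsUnder "\tanswer =\t[ foo" "\t\t\t\t, bar"` is `False`, because the second character of the upper line is not a tab but the second character of the lower line is. Nothing in the check mentions a tab width.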
It is not easy to do by hand, nor is it easy to instruct an editor to do it for you automatically.
Your editor isn't smart enough to adjust the indentation of Haskell code correctly, unless you force yourself into a somewhat awkward style, or add a lot of mental burden to pressing some combination of space and tab in an appropriate way at the start of each line, and then editing those spaces and tabs just-so when modifying the code.
Empirically, I have not found this to be so. Check out my tips in the Editor Configuration section.
The argument for spaces-only is that it works everywhere.
I like "works everywhere". That is why I argue for a style of tab use that works everywhere.
I don't think about indentation any more.
No time like the present! Perhaps you will conclude that you were doing it right before... or perhaps, like me, you will try an experiment on a few personal projects.
There are existing tools for checking whether a patch to my project uses tabs, but not to check whether a patch is tabstop-agnostic.
For the moment, this is true, and perhaps the most damning objection of any listed here. Proper editor configuration can make it easy to spot mistakes, but it's an unfortunately manual process for now.
The most popular alternative is to rule out tabs entirely and use only spaces. This approach does allow you to align things on lines with different indentation levels, and allows you to be sure you are seeing the code in the exact same way it was originally written. However, it does take away a useful tool: without the semantic information provided by tabs, editors cannot reliably change the size of indentations when rendering the text. There can be good reasons to avoid certain tools, especially when they are dangerous. One must balance the danger and use of a tool; the discussions above are intended to defang tabs while retaining their usefulness. In terms of setup, neither spaces-only nor tabs-and-spaces seems to be a clear winner: the major editors both need some tweaking to be comfortable to use in either way.
A second alternative is to have the leading whitespace on any given line contain either only spaces or only tabs. This is more relaxed than "only use spaces", but still too restrictive for everyday Haskell usage. Superficially, it appears that we can align things with tabs.

    main = print answer where
    	answer =	[ foo
    				, bar
    				, baz
    				]

However, this isn't reliable. Somebody who has configured differently sized tabstops will see something mangled; for example, with larger tabstops, you might see this:

    main = print answer where
            answer =        [ foo
                                    , bar
                                    , baz
                                    ]

It is possible to be tabstop-agnostic; however, to do this, we must put in an additional newline whenever we want to do alignment:

    main = print answer where
    	answer =
    		[ foo
    		, bar
    		, baz
    		]

Since block heralds signify an alignment point, this means in particular that this style requires a newline after every block herald. The only way to avoid these undesirable stylistic gymnastics is to mix tabs and spaces.
There is a Haskell' proposal discussing changes to the whitespace rules. It includes the correct solution outlined here; unfortunately, it is hidden between the two previously discussed solutions, namely, ruling out tabs entirely and forcing each line to use either only spaces or only tabs. Additionally, the correct solution has a complaint attached that the rule is difficult to specify and explain. However, I think this is untrue: whitespace s is at a deeper indentation level than t when t is a prefix of s. (As usual with such ordering relations, we can define a "same indentation level" relation simply by saying s and t are at the same level when they are deeper than each other, that is, when s = t.)
One style is to demand that all users set their editors to eight space tabstops (or four, or two, or whatever). In this style, you can be certain that all users are seeing the same thing, even if you badly mangle what you do with tabs and spaces; and, with a judicious choice of dictator, you might even be sure that your compiler sees it the way you do. This style even allows some very interesting things, like aligning things on lines with different indentation levels.
Unfortunately, without the advantage of flexible rendering, the only argument supporting the use of tabs is as a disk space optimization. Especially given the price of disk space today, I consider this option in roughly the same league as "only use spaces": okay, but we can do better.
Perhaps one of my favorite proposed alternatives is elastic tabstops. There are at least two drawbacks to this approach, however. First, it requires some nontrivial tool support, and consequently documents prepared with an editor that supports elastic tabstops will look distorted in editors without that support. This makes collaboration difficult (you can pry my editor of choice from my cold, dead hands, are you with me?!). Second, the tool must choose tabs to align with each other before choosing appropriate tabstops; because tabs currently do not contain enough semantic information to reliably pair them with each other, the tool must use some heuristics to guess which tabs the user meant to align. The heuristic proposed there is very good, but experience shows it is not perfect: it is common to wish for adjacent lines with tabstops that don't align, and also common to wish for tabstops to align even across lines that do not have a tab. Using the mixed tabs and spaces strategy discussed above has neither drawback: no sophisticated tool support is necessary for rendering, and sophisticated alignment strategies are easily attainable.
I think in an ideal world, we would use something very like elastic tabstops, but with some user interface that would allow us to choose how to merge tabs, eliminating the need for a heuristic. In this world, tabs would have a minimum (rather than maximum) width -- e.g. tabs might reasonably go one em unless other connected tabs forced them to be wider. Unfortunately, this probably means scrapping all existing tools, right down to the level of file format, and starting over building new compilers, text editors, shells, etc. This is probably not feasible.