1 ====
2 YAPF
3 ====
4
5 .. image:: https://badge.fury.io/py/yapf.svg
6 :target: https://badge.fury.io/py/yapf
7 :alt: PyPI version
8
9 .. image:: https://travis-ci.org/google/yapf.svg?branch=master
10 :target: https://travis-ci.org/google/yapf
11 :alt: Build status
12
13 .. image:: https://coveralls.io/repos/google/yapf/badge.svg?branch=master
14 :target: https://coveralls.io/r/google/yapf?branch=master
15 :alt: Coverage status
16
17
18 Introduction
19 ============
20
Most current formatters for Python (e.g., autopep8 and pep8ify) are made to
remove lint errors from code. This has some obvious limitations. For instance,
code that already conforms to the PEP 8 guidelines may not be reformatted at
all, but that doesn't mean the code looks good.
25
YAPF takes a different approach. It's based on 'clang-format', developed by
Daniel Jasper. In essence, the algorithm takes the code and reformats it to the
best formatting that conforms to the style guide, even if the original code
didn't violate the style guide. The idea is also similar to the 'gofmt' tool for
the Go programming language: end all holy wars about formatting. If the whole
codebase of a project is simply piped through YAPF whenever modifications are
made, the style remains consistent throughout the project and there's no point
arguing about style in every code review.
34
35 The ultimate goal is that the code YAPF produces is as good as the code that a
36 programmer would write if they were following the style guide. It takes away
37 some of the drudgery of maintaining your code.
38
39 Try out YAPF with this `online demo <https://yapf.now.sh>`_.
40
41 .. footer::
42
    YAPF is not an official Google product (experimental or otherwise); it is
    just code that happens to be owned by Google.
45
46 .. contents::
47
48
49 Installation
50 ============
51
52 To install YAPF from PyPI:
53
54 .. code-block:: shell
55
56 $ pip install yapf
57
58 (optional) If you are using Python 2.7 and want to enable multiprocessing:
59
60 .. code-block:: shell
61
62 $ pip install futures
63
YAPF is still considered to be in the "alpha" stage, and the released version
may change often; therefore, the best way to keep up to date with the latest
development is to clone this repository.
67
68 Note that if you intend to use YAPF as a command-line tool rather than as a
69 library, installation is not necessary. YAPF supports being run as a directory
70 by the Python interpreter. If you cloned/unzipped YAPF into ``DIR``, it's
71 possible to run:
72
73 .. code-block:: shell
74
75 $ PYTHONPATH=DIR python DIR/yapf [options] ...
76
77
78 Python versions
79 ===============
80
81 YAPF supports Python 2.7 and 3.6.4+. (Note that some Python 3 features may fail
82 to parse with Python versions before 3.6.4.)
83
84 YAPF requires the code it formats to be valid Python for the version YAPF itself
85 runs under. Therefore, if you format Python 3 code with YAPF, run YAPF itself
86 under Python 3 (and similarly for Python 2).
87
88
89 Usage
90 =====
91
92 Options::
93
94 usage: yapf [-h] [-v] [-d | -i] [-r | -l START-END] [-e PATTERN]
95 [--style STYLE] [--style-help] [--no-local-style] [-p]
96 [-vv]
97 [files [files ...]]
98
99 Formatter for Python code.
100
101 positional arguments:
102 files
103
104 optional arguments:
105 -h, --help show this help message and exit
106 -v, --version show version number and exit
107 -d, --diff print the diff for the fixed source
108 -i, --in-place make changes to files in place
109 -r, --recursive run recursively over directories
110 -l START-END, --lines START-END
111 range of lines to reformat, one-based
112 -e PATTERN, --exclude PATTERN
113 patterns for files to exclude from formatting
114 --style STYLE specify formatting style: either a style name (for
115 example "pep8" or "google"), or the name of a file
116 with style settings. The default is pep8 unless a
117 .style.yapf or setup.cfg file located in the same
118 directory as the source or one of its parent
119 directories (for stdin, the current directory is
120 used).
121 --style-help show style settings and exit; this output can be saved
122 to .style.yapf to make your settings permanent
123 --no-local-style don't search for local style definition
124 -p, --parallel Run yapf in parallel when formatting multiple files.
125 Requires concurrent.futures in Python 2.X
126 -vv, --verbose Print out file names while processing
127
128
129 ------------
130 Return Codes
131 ------------
132
133 Normally YAPF returns zero on successful program termination and non-zero otherwise.
134
135 If ``--diff`` is supplied, YAPF returns zero when no changes were necessary, non-zero
136 otherwise (including program error). You can use this in a CI workflow to test that code
137 has been YAPF-formatted.
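As a rough sketch of such a CI check, the Python helper below runs
``yapf --diff --recursive`` through ``subprocess`` and treats any non-zero exit
status as "needs formatting, or YAPF failed". The helper name
``check_formatted`` and its ``runner`` parameter are illustrative assumptions
and not part of YAPF; the only YAPF behavior relied on is the return code
described above.

```python
import subprocess
import sys


def check_formatted(paths, runner=subprocess.run):
    """Return True if `yapf --diff` reports that `paths` need no changes.

    `runner` defaults to subprocess.run. It is a parameter only so the helper
    can be exercised without YAPF installed (an assumption of this sketch,
    not a YAPF feature).
    """
    cmd = ["yapf", "--diff", "--recursive", *paths]
    result = runner(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Non-zero means a diff was produced or YAPF itself hit an error.
        sys.stderr.write(result.stdout or result.stderr or "")
        return False
    return True
```

A CI job could then call ``sys.exit(0 if check_formatted(["."]) else 1)`` so
that the build fails whenever the diff is not empty.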
138
139
140 Formatting style
141 ================
142
143 The formatting style used by YAPF is configurable and there are many "knobs"
144 that can be used to tune how YAPF does formatting. See the ``style.py`` module
145 for the full list.
146
147 To control the style, run YAPF with the ``--style`` argument. It accepts one of
148 the predefined styles (e.g., ``pep8`` or ``google``), a path to a configuration
149 file that specifies the desired style, or a dictionary of key/value pairs.
150
151 The config file is a simple listing of (case-insensitive) ``key = value`` pairs
152 with a ``[style]`` heading. For example:
153
154 .. code-block:: ini
155
156 [style]
157 based_on_style = pep8
158 spaces_before_comment = 4
159 split_before_logical_operator = true
160
161 The ``based_on_style`` setting determines which of the predefined styles this
162 custom style is based on (think of it like subclassing).
163
164 It's also possible to do the same on the command line with a dictionary. For
165 example:
166
167 .. code-block:: shell
168
169 --style='{based_on_style: chromium, indent_width: 4}'
170
This takes the ``chromium`` base style and changes it to use four-space
indentation.
173
174 YAPF will search for the formatting style in the following manner:
175
1. Specified on the command line.
2. In the ``[style]`` section of a ``.style.yapf`` file in either the current
   directory or one of its parent directories.
3. In the ``[yapf]`` section of a ``setup.cfg`` file in either the current
   directory or one of its parent directories.
4. In the ``~/.config/yapf/style`` file in your home directory.
182
183 If none of those files are found, the default style is used (PEP8).
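The file-based part of this lookup (steps 2 to 4) can be sketched with the
standard-library ``configparser``, which reads the ``[style]`` layout shown
earlier and lower-cases option names, matching the case-insensitive keys. This
is only an illustration, not YAPF's own code: the names ``read_section`` and
``find_style`` are hypothetical, the sketch assumes that both file names are
checked in each directory before moving to its parent, and it assumes that the
per-user file also uses a ``[style]`` section.

```python
import configparser
from pathlib import Path


def read_section(path, section):
    """Return the options in `section` of the INI file at `path`, or None."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(str(path))  # a missing file is silently skipped
    if parser.has_section(section):
        # ConfigParser lower-cases option names, so keys are case-insensitive.
        return dict(parser[section])
    return None


def find_style(start_dir, home_dir=None):
    """Return (path, settings) for the first style file found, else None."""
    start = Path(start_dir).resolve()
    for directory in [start, *start.parents]:
        # Assumption: check both file names here before moving to the parent.
        for name, section in ((".style.yapf", "style"), ("setup.cfg", "yapf")):
            candidate = directory / name
            settings = read_section(candidate, section)
            if settings is not None:
                return candidate, settings
    # Per-user fallback; the [style] section name is assumed here.
    home = Path(home_dir) if home_dir is not None else Path.home()
    candidate = home / ".config" / "yapf" / "style"
    settings = read_section(candidate, "style")
    if settings is not None:
        return candidate, settings
    return None  # the caller would fall back to the default style
```

The values come back as strings (for example ``"4"`` or ``"true"``); turning
them into integers or booleans is left out of the sketch.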
184
185
186 Example
187 =======
188
Here is an example of the type of formatting that YAPF can do. It will take this
ugly code:
191
192 .. code-block:: python
193
    x = {  'a':37,'b':42,

    'c':927}

    y = 'hello ''world'
    z = 'hello '+'world'
    a = 'hello {}'.format('world')
    class foo  (     object  ):
      def f    (self   ):
        return       37*-+2
      def g(self, x,y=42):
          return y
    def f  (   a ) :
      return      37+-+a[42-x :  y**3]
208
209 and reformat it into:
210
211 .. code-block:: python
212
    x = {'a': 37, 'b': 42, 'c': 927}

    y = 'hello ' 'world'
    z = 'hello ' + 'world'
    a = 'hello {}'.format('world')


    class foo(object):
        def f(self):
            return 37 * -+2

        def g(self, x, y=42):
            return y


    def f(a):
        return 37 + -+a[42 - x:y**3]
230
231
232 Example as a module
233 ===================
234
The two main APIs for calling yapf are ``FormatCode`` and ``FormatFile``. They
share several arguments, which are described below:
237
238 .. code-block:: python
239
240 >>> from yapf.yapflib.yapf_api import FormatCode # reformat a string of code
241
242 >>> FormatCode("f ( a = 1, b = 2 )")
243 'f(a=1, b=2)\n'
244
245 A ``style_config`` argument: Either a style name or a path to a file that contains
246 formatting style settings. If None is specified, use the default style
247 as set in ``style.DEFAULT_STYLE_FACTORY``.
248
249 .. code-block:: python
250
    >>> FormatCode("def g():\n  return True", style_config='pep8')
    'def g():\n    return True\n'
253
254 A ``lines`` argument: A list of tuples of lines (ints), [start, end],
255 that we want to format. The lines are 1-based indexed. It can be used by
256 third-party code (e.g., IDEs) when reformatting a snippet of code rather
257 than a whole file.
258
259 .. code-block:: python
260
    >>> FormatCode("def g( ):\n    a=1\n    b = 2\n    return a==b", lines=[(1, 1), (2, 3)])
    'def g():\n    a = 1\n    b = 2\n    return a==b\n'
263
A ``print_diff`` argument (bool): Instead of returning the reformatted source,
return a diff that turns the original source into the reformatted source.
266
267 .. code-block:: python
268
269 >>> print(FormatCode("a==b", filename="foo.py", print_diff=True))
270 --- foo.py (original)
271 +++ foo.py (reformatted)
272 @@ -1 +1 @@
273 -a==b
274 +a == b
275
Note: the ``filename`` argument for ``FormatCode`` is what is inserted into
the diff. The default is ``<unknown>``.
278
279 ``FormatFile`` returns reformatted code from the passed file along with its encoding:
280
281 .. code-block:: python
282
283 >>> from yapf.yapflib.yapf_api import FormatFile # reformat a file
284
285 >>> print(open("foo.py").read()) # contents of file
286 a==b
287
288 >>> FormatFile("foo.py")
289 ('a == b\n', 'utf-8')
290
The ``in_place`` argument saves the reformatted code back to the file:
292
293 .. code-block:: python
294
295 >>> FormatFile("foo.py", in_place=True)
296 (None, 'utf-8')
297
298 >>> print(open("foo.py").read()) # contents of file (now fixed)
299 a == b
300
301
302 Knobs
303 =====
304
305 ``ALIGN_CLOSING_BRACKET_WITH_VISUAL_INDENT``
306 Align closing bracket with visual indentation.
307
308 ``ALLOW_MULTILINE_LAMBDAS``
309 Allow lambdas to be formatted on more than one line.
310
311 ``ALLOW_MULTILINE_DICTIONARY_KEYS``
312 Allow dictionary keys to exist on multiple lines. For example:
313
314 .. code-block:: python
315
316 x = {
317 ('this is the first element of a tuple',
318 'this is the second element of a tuple'):
319 value,
320 }
321
322 ``ALLOW_SPLIT_BEFORE_DICT_VALUE``
323 Allow splits before the dictionary value.
324
325 ``BLANK_LINE_BEFORE_NESTED_CLASS_OR_DEF``
326 Insert a blank line before a ``def`` or ``class`` immediately nested within
327 another ``def`` or ``class``. For example:
328
329 .. code-block:: python
330
331 class Foo:
332 # <------ this blank line
333 def method():
334 pass
335
336 ``BLANK_LINE_BEFORE_MODULE_DOCSTRING``
337 Insert a blank line before a module docstring.
338
339 ``BLANK_LINE_BEFORE_CLASS_DOCSTRING``
340 Insert a blank line before a class-level docstring.
341
342 ``BLANK_LINES_AROUND_TOP_LEVEL_DEFINITION``
343 Sets the number of desired blank lines surrounding top-level function and
344 class definitions. For example:
345
346 .. code-block:: python
347
348 class Foo:
349 pass
350 # <------ having two blank lines here
351 # <------ is the default setting
352 class Bar:
353 pass
354
355 ``COALESCE_BRACKETS``
356 Do not split consecutive brackets. Only relevant when
357 ``DEDENT_CLOSING_BRACKETS`` is set. For example:
358
359 .. code-block:: python
360
361 call_func_that_takes_a_dict(
362 {
363 'key1': 'value1',
364 'key2': 'value2',
365 }
366 )
367
368 would reformat to:
369
370 .. code-block:: python
371
372 call_func_that_takes_a_dict({
373 'key1': 'value1',
374 'key2': 'value2',
375 })
376
377
378 ``COLUMN_LIMIT``
    The column limit (or maximum line length).
380
381 ``CONTINUATION_ALIGN_STYLE``
382 The style for continuation alignment. Possible values are:
383
    - SPACE: Use spaces for continuation alignment. This is the default
      behavior.
    - FIXED: Use a fixed number (CONTINUATION_INDENT_WIDTH) of columns
      (i.e., CONTINUATION_INDENT_WIDTH/INDENT_WIDTH tabs) for continuation
      alignment.
    - VALIGN-RIGHT: Vertically align continuation lines with indent
      characters. If a continuation line cannot be aligned exactly, shift it
      slightly right (by one more indent character).

    The ``FIXED`` and ``VALIGN-RIGHT`` options are only available when
    ``USE_TABS`` is enabled.
394
395 ``CONTINUATION_INDENT_WIDTH``
396 Indent width used for line continuations.
397
398 ``DEDENT_CLOSING_BRACKETS``
399 Put closing brackets on a separate line, dedented, if the bracketed
400 expression can't fit in a single line. Applies to all kinds of brackets,
401 including function definitions and calls. For example:
402
403 .. code-block:: python
404
405 config = {
406 'key1': 'value1',
407 'key2': 'value2',
408 } # <--- this bracket is dedented and on a separate line
409
410 time_series = self.remote_client.query_entity_counters(
411 entity='dev3246.region1',
412 key='dns.query_latency_tcp',
413 transform=Transformation.AVERAGE(window=timedelta(seconds=60)),
414 start_ts=now()-timedelta(days=3),
415 end_ts=now(),
416 ) # <--- this bracket is dedented and on a separate line
417
418 ``DISABLE_ENDING_COMMA_HEURISTIC``
419 Disable the heuristic which places each list element on a separate line if
420 the list is comma-terminated.
421
422 ``EACH_DICT_ENTRY_ON_SEPARATE_LINE``
423 Place each dictionary entry onto its own line.
424
425 ``I18N_COMMENT``
426 The regex for an internationalization comment. The presence of this comment
427 stops reformatting of that line, because the comments are required to be
428 next to the string they translate.
429
430 ``I18N_FUNCTION_CALL``
431 The internationalization function call names. The presence of this function
432 stops reformatting on that line, because the string it has cannot be moved
433 away from the i18n comment.
434
435 ``INDENT_DICTIONARY_VALUE``
436 Indent the dictionary value if it cannot fit on the same line as the
437 dictionary key. For example:
438
439 .. code-block:: python
440
441 config = {
442 'key1':
443 'value1',
444 'key2': value1 +
445 value2,
446 }
447
448 ``INDENT_WIDTH``
449 The number of columns to use for indentation.
450
451 ``JOIN_MULTIPLE_LINES``
452 Join short lines into one line. E.g., single line ``if`` statements.
453
454 ``SPACES_AROUND_POWER_OPERATOR``
455 Set to ``True`` to prefer using spaces around ``**``.
456
457 ``NO_SPACES_AROUND_SELECTED_BINARY_OPERATORS``
458 Do not include spaces around selected binary operators. For example:
459
460 .. code-block:: python
461
462 1 + 2 * 3 - 4 / 5
463
464 will be formatted as follows when configured with ``*,/``:
465
466 .. code-block:: python
467
468 1 + 2*3 - 4/5
469
470 ``SPACES_AROUND_DEFAULT_OR_NAMED_ASSIGN``
471 Set to ``True`` to prefer spaces around the assignment operator for default
472 or keyword arguments.
473
474 ``SPACES_BEFORE_COMMENT``
475 The number of spaces required before a trailing comment.
476
477 ``SPACE_BETWEEN_ENDING_COMMA_AND_CLOSING_BRACKET``
478 Insert a space between the ending comma and closing bracket of a list, etc.
479
480 ``SPLIT_ARGUMENTS_WHEN_COMMA_TERMINATED``
481 Split before arguments if the argument list is terminated by a comma.
482
483 ``SPLIT_ALL_COMMA_SEPARATED_VALUES``
    If a comma-separated list (dict, list, tuple, or function def) is on a
    line that is too long, split such that each element is on a separate line.
486
487 ``SPLIT_BEFORE_BITWISE_OPERATOR``
488 Set to ``True`` to prefer splitting before ``&``, ``|`` or ``^`` rather
489 than after.
490
491 ``SPLIT_BEFORE_CLOSING_BRACKET``
492 Split before the closing bracket if a list or dict literal doesn't fit on
493 a single line.
494
495 ``SPLIT_BEFORE_DICT_SET_GENERATOR``
496 Split before a dictionary or set generator (comp_for). For example, note
497 the split before the ``for``:
498
499 .. code-block:: python
500
501 foo = {
502 variable: 'Hello world, have a nice day!'
503 for variable in bar if variable != 42
504 }
505
506 ``SPLIT_BEFORE_EXPRESSION_AFTER_OPENING_PAREN``
507 Split after the opening paren which surrounds an expression if it doesn't
508 fit on a single line.
509
510 ``SPLIT_BEFORE_FIRST_ARGUMENT``
511 If an argument / parameter list is going to be split, then split before the
512 first argument.
513
514 ``SPLIT_BEFORE_LOGICAL_OPERATOR``
515 Set to ``True`` to prefer splitting before ``and`` or ``or`` rather than
516 after.
517
518 ``SPLIT_BEFORE_NAMED_ASSIGNS``
519 Split named assignments onto individual lines.
520
521 ``SPLIT_COMPLEX_COMPREHENSION``
    For list comprehensions and generator expressions with multiple clauses
    (e.g., multiple ``for`` clauses or ``if`` filter expressions) that need to
    be reflowed, split each clause onto its own line. For example:
525
526 .. code-block:: python
527
528 result = [
529 a_var + b_var for a_var in xrange(1000) for b_var in xrange(1000)
530 if a_var % b_var]
531
532 would reformat to something like:
533
534 .. code-block:: python
535
536 result = [
537 a_var + b_var
538 for a_var in xrange(1000)
539 for b_var in xrange(1000)
540 if a_var % b_var]
541
542 ``SPLIT_PENALTY_AFTER_OPENING_BRACKET``
543 The penalty for splitting right after the opening bracket.
544
545 ``SPLIT_PENALTY_AFTER_UNARY_OPERATOR``
546 The penalty for splitting the line after a unary operator.
547
548 ``SPLIT_PENALTY_BEFORE_IF_EXPR``
549 The penalty for splitting right before an ``if`` expression.
550
551 ``SPLIT_PENALTY_BITWISE_OPERATOR``
552 The penalty of splitting the line around the ``&``, ``|``, and ``^``
553 operators.
554
555 ``SPLIT_PENALTY_COMPREHENSION``
556 The penalty for splitting a list comprehension or generator expression.
557
558 ``SPLIT_PENALTY_EXCESS_CHARACTER``
559 The penalty for characters over the column limit.
560
561 ``SPLIT_PENALTY_FOR_ADDED_LINE_SPLIT``
562 The penalty incurred by adding a line split to the unwrapped line. The more
563 line splits added the higher the penalty.
564
565 ``SPLIT_PENALTY_IMPORT_NAMES``
566 The penalty of splitting a list of ``import as`` names. For example:
567
568 .. code-block:: python
569
570 from a_very_long_or_indented_module_name_yada_yad import (long_argument_1,
571 long_argument_2,
572 long_argument_3)
573
574 would reformat to something like:
575
576 .. code-block:: python
577
578 from a_very_long_or_indented_module_name_yada_yad import (
579 long_argument_1, long_argument_2, long_argument_3)
580
581 ``SPLIT_PENALTY_LOGICAL_OPERATOR``
582 The penalty of splitting the line around the ``and`` and ``or`` operators.
583
584 ``USE_TABS``
585 Use the Tab character for indentation.
586
587 (Potentially) Frequently Asked Questions
588 ========================================
589
590 --------------------------------------------
591 Why does YAPF destroy my awesome formatting?
592 --------------------------------------------
593
594 YAPF tries very hard to get the formatting correct. But for some code, it won't
595 be as good as hand-formatting. In particular, large data literals may become
596 horribly disfigured under YAPF.
597
The reasons for this are manifold. In short, YAPF is simply a tool to help
with development. It will format things to coincide with the style guide, but
that may not equate with readability.
601
602 What can be done to alleviate this situation is to indicate regions YAPF should
603 ignore when reformatting something:
604
605 .. code-block:: python
606
607 # yapf: disable
608 FOO = {
609 # ... some very large, complex data literal.
610 }
611
612 BAR = [
613 # ... another large data literal.
614 ]
615 # yapf: enable
616
617 You can also disable formatting for a single literal like this:
618
619 .. code-block:: python
620
621 BAZ = {
622 (1, 2, 3, 4),
623 (5, 6, 7, 8),
624 (9, 10, 11, 12),
625 } # yapf: disable
626
627 To preserve the nice dedented closing brackets, use the
628 ``dedent_closing_brackets`` in your style. Note that in this case all
629 brackets, including function definitions and calls, are going to use
630 that style. This provides consistency across the formatted codebase.
631
632 -------------------------------
633 Why Not Improve Existing Tools?
634 -------------------------------
635
636 We wanted to use clang-format's reformatting algorithm. It's very powerful and
637 designed to come up with the best formatting possible. Existing tools were
638 created with different goals in mind, and would require extensive modifications
639 to convert to using clang-format's algorithm.
640
641 -----------------------------
642 Can I Use YAPF In My Program?
643 -----------------------------
644
645 Please do! YAPF was designed to be used as a library as well as a command line
646 tool. This means that a tool or IDE plugin is free to use YAPF.
647
648
649 Gory Details
650 ============
651
652 ----------------
653 Algorithm Design
654 ----------------
655
The main data structure in YAPF is the ``UnwrappedLine`` object. It holds a list
of ``FormatToken``\s that we would want to place on a single line if there were
no column limit. The exception is a comment in the middle of an expression
statement, which forces the line to be formatted on more than one line. The
formatter works on one ``UnwrappedLine`` object at a time.
661
662 An ``UnwrappedLine`` typically won't affect the formatting of lines before or
663 after it. There is a part of the algorithm that may join two or more
664 ``UnwrappedLine``\s into one line. For instance, an if-then statement with a
665 short body can be placed on a single line:
666
667 .. code-block:: python
668
669 if a == 42: continue
670
671 YAPF's formatting algorithm creates a weighted tree that acts as the solution
672 space for the algorithm. Each node in the tree represents the result of a
673 formatting decision --- i.e., whether to split or not to split before a token.
674 Each formatting decision has a cost associated with it. Therefore, the cost is
675 realized on the edge between two nodes. (In reality, the weighted tree doesn't
676 have separate edge objects, so the cost resides on the nodes themselves.)
677
678 For example, take the following Python code snippet. For the sake of this
679 example, assume that line (1) violates the column limit restriction and needs to
680 be reformatted.
681
682 .. code-block:: python
683
    def xxxxxxxxxxx(aaaaaaaaaaaa, bbbbbbbbb, cccccccc, dddddddd, eeeeee):  # 1
        pass                                                               # 2
686
For line (1), the algorithm will build a tree where each node (a
``FormatDecisionState`` object) is the state of the line at that token given
the decision to split before the token or not. Note: the ``FormatDecisionState``
objects are copied by value, so each node in the tree is unique and a change in
one doesn't affect other nodes.
692
693 Heuristics are used to determine the costs of splitting or not splitting.
694 Because a node holds the state of the tree up to a token's insertion, it can
695 easily determine if a splitting decision will violate one of the style
696 requirements. For instance, the heuristic is able to apply an extra penalty to
697 the edge when not splitting between the previous token and the one being added.
698
699 There are some instances where we will never want to split the line, because
700 doing so will always be detrimental (i.e., it will require a backslash-newline,
701 which is very rarely desirable). For line (1), we will never want to split the
702 first three tokens: ``def``, ``xxxxxxxxxxx``, and ``(``. Nor will we want to
703 split between the ``)`` and the ``:`` at the end. These regions are said to be
704 "unbreakable." This is reflected in the tree by there not being a "split"
705 decision (left hand branch) within the unbreakable region.
706
707 Now that we have the tree, we determine what the "best" formatting is by finding
708 the path through the tree with the lowest cost.
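The search for the cheapest path can be illustrated with a toy model. The
sketch below is not YAPF's implementation: tokens are plain strings, the only
costs are a flat penalty per split and a penalty per character past the column
limit, and the names ``best_splits``, ``SPLIT_PENALTY`` and
``EXCESS_CHAR_PENALTY`` (with their values) are made up for this example. Each
state carries its accumulated cost, as described above, and a priority queue
always expands the cheapest state first, so the first complete layout that
comes off the queue has the lowest total cost.

```python
import heapq
import itertools

SPLIT_PENALTY = 10          # made-up cost of adding one line split
EXCESS_CHAR_PENALTY = 100   # made-up cost per character past the column limit


def excess(line, column_limit):
    """Number of characters by which `line` exceeds the column limit."""
    return max(0, len(line) - column_limit)


def best_splits(tokens, column_limit, indent=4, unbreakable=frozenset()):
    """Return (cost, lines) for the cheapest layout of `tokens`.

    A token whose index is in `unbreakable` may never start a new line.
    """
    if not tokens:
        return 0, []
    tie = itertools.count()  # tie-breaker so the heap never compares lists
    first = [tokens[0]]
    start_cost = EXCESS_CHAR_PENALTY * excess(first[0], column_limit)
    heap = [(start_cost, next(tie), 1, first)]
    while heap:
        cost, _, index, lines = heapq.heappop(heap)
        if index == len(tokens):
            return cost, lines  # cheapest complete state
        token = tokens[index]
        # Decision "no split": append the token to the current line.
        joined = lines[-1] + " " + token
        extra = excess(joined, column_limit) - excess(lines[-1], column_limit)
        heapq.heappush(heap, (cost + EXCESS_CHAR_PENALTY * extra, next(tie),
                              index + 1, lines[:-1] + [joined]))
        # Decision "split": only offered outside unbreakable regions.
        if index not in unbreakable:
            new_line = " " * indent + token
            step = (SPLIT_PENALTY +
                    EXCESS_CHAR_PENALTY * excess(new_line, column_limit))
            heapq.heappush(heap, (cost + step, next(tie),
                                  index + 1, lines + [new_line]))
    raise AssertionError("unreachable: the no-split branch always exists")


cost, lines = best_splits(["def", "f(aaaa,", "bbbb,", "cccc,", "dddd):"],
                          column_limit=20, unbreakable={1})
print(cost)               # 10
print("\n".join(lines))   # def f(aaaa, bbbb,
                          #     cccc, dddd):
```

With a column limit of 20, the single split before ``cccc,`` costs 10, while
every other layout either overflows a line or needs more splits. Index 1 is
marked unbreakable, so no state ever splits between ``def`` and the function
name.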
709
710 And that's it!
711