RE2 regular expression syntax reference
---------------------------------------

Single characters:
.	any character, possibly including newline (s=true)
[xyz]	character class
[^xyz]	negated character class
\d	Perl character class
\D	negated Perl character class
[:alpha:]	ASCII character class
[:^alpha:]	negated ASCII character class
\pN	Unicode character class (one-letter name)
\p{Greek}	Unicode character class
\PN	negated Unicode character class (one-letter name)
\P{Greek}	negated Unicode character class

Composites:
xy	x followed by y
x|y	x or y (prefer x)

Repetitions:
x*	zero or more x, prefer more
x+	one or more x, prefer more
x?	zero or one x, prefer one
x{n,m}	n or n+1 or ... or m x, prefer more
x{n,}	n or more x, prefer more
x{n}	exactly n x
x*?	zero or more x, prefer fewer
x+?	one or more x, prefer fewer
x??	zero or one x, prefer zero
x{n,m}?	n or n+1 or ... or m x, prefer fewer
x{n,}?	n or more x, prefer fewer
x{n}?	exactly n x
x{}	(== x*) NOT SUPPORTED vim
x{-}	(== x*?) NOT SUPPORTED vim
x{-n}	(== x{n}?) NOT SUPPORTED vim
x=	(== x?) NOT SUPPORTED vim

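The "prefer more" and "prefer fewer" rows above can be checked with a short program. This is only a sketch: it assumes Go's standard regexp package (which documents that it accepts RE2 syntax) rather than the RE2 C++ library itself, and firstMatch is a helper name invented for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" if
// there is none. Hypothetical helper, not part of any library.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	fmt.Println(firstMatch(`<.+>`, "<a><b>"))  // greedy, prints <a><b>
	fmt.Println(firstMatch(`<.+?>`, "<a><b>")) // non-greedy, prints <a>
	fmt.Println(firstMatch(`a{2,3}`, "aaaa"))  // prefer more, prints aaa
	fmt.Println(firstMatch(`a{2,3}?`, "aaaa")) // prefer fewer, prints aa
}
```
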
Possessive repetitions:
x*+	zero or more x, possessive NOT SUPPORTED
x++	one or more x, possessive NOT SUPPORTED
x?+	zero or one x, possessive NOT SUPPORTED
x{n,m}+	n or ... or m x, possessive NOT SUPPORTED
x{n,}+	n or more x, possessive NOT SUPPORTED
x{n}+	exactly n x, possessive NOT SUPPORTED

Grouping:
(re)	numbered capturing group
(?P<name>re)	named & numbered capturing group
(?<name>re)	named & numbered capturing group NOT SUPPORTED
(?'name're)	named & numbered capturing group NOT SUPPORTED
(?:re)	non-capturing group
(?flags)	set flags within current group; non-capturing
(?flags:re)	set flags during re; non-capturing
(?#text)	comment NOT SUPPORTED
(?|x|y|z)	branch numbering reset NOT SUPPORTED
(?>re)	possessive match of re NOT SUPPORTED
re@>	possessive match of re NOT SUPPORTED vim
%(re)	non-capturing group NOT SUPPORTED vim

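As an illustration of (?P<name>re) and (?:re), the sketch below reads named groups and counts capturing groups. It assumes Go's standard regexp package as a stand-in for RE2; namedGroups and groupCount are hypothetical helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// namedGroups returns the text captured by each named group of the
// leftmost match, keyed by group name (hypothetical helper).
func namedGroups(pattern, text string) map[string]string {
	re := regexp.MustCompile(pattern)
	out := map[string]string{}
	m := re.FindStringSubmatch(text)
	if m == nil {
		return out
	}
	for i, name := range re.SubexpNames() {
		if name != "" {
			out[name] = m[i]
		}
	}
	return out
}

// groupCount reports how many numbered capturing groups a pattern has.
func groupCount(pattern string) int {
	return regexp.MustCompile(pattern).NumSubexp()
}

func main() {
	g := namedGroups(`(?P<year>\d{4})-(?P<month>\d{2})`, "released 2024-05")
	fmt.Println(g["year"], g["month"]) // prints 2024 05

	// (?:b) groups without capturing, so only two groups are numbered.
	fmt.Println(groupCount(`(a)(?:b)(?P<c>c)`)) // prints 2
}
```
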
Flags:
i	case-insensitive (default false)
m	multi-line mode: ^ and $ match begin/end line in addition to begin/end text (default false)
s	let . match \n (default false)
U	ungreedy: swap meaning of x* and x*?, x+ and x+?, etc (default false)
Flag syntax is xyz (set) or -xyz (clear) or xy-z (set xy, clear z).

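The flag table can be exercised as follows. This is a sketch under the assumption that Go's standard regexp package behaves like RE2 for these flags; matches and firstMatch are helper names made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// firstMatch returns the leftmost match of pattern in text, or "".
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	fmt.Println(matches(`hello`, "HeLLo"))     // false
	fmt.Println(matches(`(?i)hello`, "HeLLo")) // true: i set

	fmt.Println(matches(`a.b`, "a\nb"))     // false: . skips \n by default
	fmt.Println(matches(`(?s)a.b`, "a\nb")) // true: s set

	fmt.Println(firstMatch(`(?U)a+`, "aaa"))  // a: U makes x+ prefer fewer
	fmt.Println(firstMatch(`(?U)a+?`, "aaa")) // aaa: and x+? prefer more

	// (?flags:re) applies the flags only inside the group.
	fmt.Println(matches(`(?i:a)b`, "Ab")) // true
	fmt.Println(matches(`(?i:a)b`, "AB")) // false

	// xy-z form: set i and s, clear m.
	fmt.Println(matches(`(?is-m)a.b`, "A\nB")) // true
}
```
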
Empty strings:
^	at beginning of text or line (m=true)
$	at end of text (like \z not \Z) or line (m=true)
\A	at beginning of text
\b	at word boundary (\w on one side and \W, \A, or \z on the other)
\B	not a word boundary
\G	at beginning of subtext being searched NOT SUPPORTED pcre
\G	at end of last match NOT SUPPORTED perl
\Z	at end of text, or before newline at end of text NOT SUPPORTED
\z	at end of text
(?=re)	before text matching re NOT SUPPORTED
(?!re)	before text not matching re NOT SUPPORTED
(?<=re)	after text matching re NOT SUPPORTED
(?<!re)	after text not matching re NOT SUPPORTED
re&	before text matching re NOT SUPPORTED vim
re@=	before text matching re NOT SUPPORTED vim
re@!	before text not matching re NOT SUPPORTED vim
re@<=	after text matching re NOT SUPPORTED vim
re@<!	after text not matching re NOT SUPPORTED vim
\zs	sets start of match (= \K) NOT SUPPORTED vim
\ze	sets end of match NOT SUPPORTED vim
\%^	beginning of file NOT SUPPORTED vim
\%$	end of file NOT SUPPORTED vim
\%V	on screen NOT SUPPORTED vim
\%#	cursor position NOT SUPPORTED vim
\%'m	mark m position NOT SUPPORTED vim
\%23l	in line 23 NOT SUPPORTED vim
\%23c	in column 23 NOT SUPPORTED vim
\%23v	in virtual column 23 NOT SUPPORTED vim

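The supported anchors above (^, $, \A, \z, \b, \B) can be compared by counting matches. The sketch assumes Go's standard regexp package as an RE2-syntax implementation; count is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// count returns the number of non-overlapping matches of pattern in text.
func count(pattern, text string) int {
	return len(regexp.MustCompile(pattern).FindAllString(text, -1))
}

func main() {
	// \b needs \w on one side and \W or a text edge on the other.
	fmt.Println(count(`\bcat\b`, "cat concat cats cat.")) // 2
	fmt.Println(count(`\Bcat`, "cat concat"))             // 1

	// Without m, ^ and $ match only at the text edges.
	fmt.Println(count(`^\w+$`, "one\ntwo"))     // 0
	fmt.Println(count(`(?m)^\w+$`, "one\ntwo")) // 2

	// $ behaves like \z, not \Z: a trailing newline blocks it.
	fmt.Println(count(`a$`, "a\n"))     // 0
	fmt.Println(count(`(?m)a$`, "a\n")) // 1

	// \z stays "end of text" even in multi-line mode.
	fmt.Println(count(`(?m)one\z`, "one\ntwo")) // 0
	fmt.Println(count(`(?m)two\z`, "one\ntwo")) // 1
}
```
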
Escape sequences:
\a	bell (== \007)
\f	form feed (== \014)
\t	horizontal tab (== \011)
\n	newline (== \012)
\r	carriage return (== \015)
\v	vertical tab character (== \013)
\*	literal *, for any punctuation character *
\123	octal character code (up to three digits)
\x7F	hex character code (exactly two digits)
\x{10FFFF}	hex character code
\C	match a single byte even in UTF-8 mode
\Q...\E	literal text ... even if ... has punctuation

\1	backreference NOT SUPPORTED
\b	backspace NOT SUPPORTED (use \010)
\cK	control char ^K NOT SUPPORTED (use \001 etc)
\e	escape NOT SUPPORTED (use \033)
\g1	backreference NOT SUPPORTED
\g{1}	backreference NOT SUPPORTED
\g{+1}	backreference NOT SUPPORTED
\g{-1}	backreference NOT SUPPORTED
\g{name}	named backreference NOT SUPPORTED
\g<name>	subroutine call NOT SUPPORTED
\g'name'	subroutine call NOT SUPPORTED
\k<name>	named backreference NOT SUPPORTED
\k'name'	named backreference NOT SUPPORTED
\lX	lowercase X NOT SUPPORTED
\ux	uppercase x NOT SUPPORTED
\L...\E	lowercase text ... NOT SUPPORTED
\K	reset beginning of $0 NOT SUPPORTED
\N{name}	named Unicode character NOT SUPPORTED
\R	line break NOT SUPPORTED
\U...\E	upper case text ... NOT SUPPORTED
\X	extended Unicode sequence NOT SUPPORTED

\%d123	decimal character 123 NOT SUPPORTED vim
\%xFF	hex character FF NOT SUPPORTED vim
\%o123	octal character 123 NOT SUPPORTED vim
\%u1234	Unicode character 0x1234 NOT SUPPORTED vim
\%U12345678	Unicode character 0x12345678 NOT SUPPORTED vim

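A few of the escapes above, tried out in a sketch. It assumes Go's standard regexp package, which documents RE2 syntax with the exception of \C, so \C is left out here; matches and compiles are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// compiles reports whether the pattern is accepted by the parser.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// \Q...\E turns off the special meaning of . and *.
	fmt.Println(matches(`^\Qa.b*c\E$`, "a.b*c")) // true
	fmt.Println(matches(`^\Qa.b*c\E$`, "aXbbc")) // false

	// Hex (two digits), octal, and braced hex forms of a code point.
	fmt.Println(matches(`^\x41\101\x{263A}$`, "AA\u263A")) // true

	// Backslash before punctuation yields the literal character.
	fmt.Println(matches(`^\*\.$`, "*.")) // true

	// Unsupported escapes are rejected when the pattern is compiled.
	fmt.Println(compiles(`\033`))  // true: octal form of escape
	fmt.Println(compiles(`\e`))    // false
	fmt.Println(compiles(`(a)\1`)) // false: backreference
}
```
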
Character class elements:
x	single character
A-Z	character range (inclusive)
\d	Perl character class
[:foo:]	ASCII character class foo
\p{Foo}	Unicode character class Foo
\pF	Unicode character class F (one-letter name)

Named character classes as character class elements:
[\d]	digits (== \d)
[^\d]	not digits (== \D)
[\D]	not digits (== \D)
[^\D]	not not digits (== \d)
[[:name:]]	named ASCII class inside character class (== [:name:])
[^[:name:]]	named ASCII class inside negated character class (== [:^name:])
[\p{Name}]	named Unicode property inside character class (== \p{Name})
[^\p{Name}]	named Unicode property inside negated character class (== \P{Name})

Perl character classes:
\d	digits (== [0-9])
\D	not digits (== [^0-9])
\s	whitespace (== [\t\n\f\r ])
\S	not whitespace (== [^\t\n\f\r ])
\w	word characters (== [0-9A-Za-z_])
\W	not word characters (== [^0-9A-Za-z_])

\h	horizontal space NOT SUPPORTED
\H	not horizontal space NOT SUPPORTED
\v	vertical space NOT SUPPORTED
\V	not vertical space NOT SUPPORTED

ASCII character classes:
[:alnum:]	alphanumeric (== [0-9A-Za-z])
[:alpha:]	alphabetic (== [A-Za-z])
[:ascii:]	ASCII (== [\x00-\x7F])
[:blank:]	blank (== [\t ])
[:cntrl:]	control (== [\x00-\x1F\x7F])
[:digit:]	digits (== [0-9])
[:graph:]	graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
[:lower:]	lower case (== [a-z])
[:print:]	printable (== [ -~] == [ [:graph:]])
[:punct:]	punctuation (== [!-/:-@[-`{-~])
[:space:]	whitespace (== [\t\n\v\f\r ])
[:upper:]	upper case (== [A-Z])
[:word:]	word characters (== [0-9A-Za-z_])
[:xdigit:]	hex digit (== [0-9A-Fa-f])

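The Perl and ASCII class tables above are ASCII-only, and \s differs from [:space:] by the vertical tab. The sketch below shows both points, assuming Go's standard regexp package follows these tables; firstMatch and matches are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "".
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

func main() {
	// ASCII classes are written inside a bracket expression.
	fmt.Println(firstMatch(`[[:alpha:]]+`, "abc123"))  // abc
	fmt.Println(firstMatch(`[[:^alpha:]]+`, "abc123")) // 123
	fmt.Println(firstMatch(`[^[:alpha:]]+`, "abc123")) // 123
	fmt.Println(firstMatch(`[[:xdigit:]]+`, "x1Fg"))   // 1F

	// Double negation: [^\D] is the same set as \d.
	fmt.Println(firstMatch(`[^\D]+`, "abc123")) // 123

	// \w covers only [0-9A-Za-z_], so it stops at the accented letter.
	fmt.Println(firstMatch(`\w+`, "h\u00e9llo")) // h

	// \s omits \v, while [:space:] includes it.
	fmt.Println(matches(`\s`, "\v"))          // false
	fmt.Println(matches(`[[:space:]]`, "\v")) // true
}
```
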
Unicode character class names--general category:
C	other
Cc	control
Cf	format
Cn	unassigned code points NOT SUPPORTED
Co	private use
Cs	surrogate
L	letter
LC	cased letter NOT SUPPORTED
L&	cased letter NOT SUPPORTED
Ll	lowercase letter
Lm	modifier letter
Lo	other letter
Lt	titlecase letter
Lu	uppercase letter
M	mark
Mc	spacing mark
Me	enclosing mark
Mn	non-spacing mark
N	number
Nd	decimal number
Nl	letter number
No	other number
P	punctuation
Pc	connector punctuation
Pd	dash punctuation
Pe	close punctuation
Pf	final punctuation
Pi	initial punctuation
Po	other punctuation
Ps	open punctuation
S	symbol
Sc	currency symbol
Sk	modifier symbol
Sm	math symbol
So	other symbol
Z	separator
Zl	line separator
Zp	paragraph separator
Zs	space separator

Unicode character class names--scripts:
Arabic	Arabic
Armenian	Armenian
Balinese	Balinese
Bengali	Bengali
Bopomofo	Bopomofo
Braille	Braille
Buginese	Buginese
Buhid	Buhid
Canadian_Aboriginal	Canadian Aboriginal
Carian	Carian
Cham	Cham
Cherokee	Cherokee
Common	characters not specific to one script
Coptic	Coptic
Cuneiform	Cuneiform
Cypriot	Cypriot
Cyrillic	Cyrillic
Deseret	Deseret
Devanagari	Devanagari
Ethiopic	Ethiopic
Georgian	Georgian
Glagolitic	Glagolitic
Gothic	Gothic
Greek	Greek
Gujarati	Gujarati
Gurmukhi	Gurmukhi
Han	Han
Hangul	Hangul
Hanunoo	Hanunoo
Hebrew	Hebrew
Hiragana	Hiragana
Inherited	inherit script from previous character
Kannada	Kannada
Katakana	Katakana
Kayah_Li	Kayah Li
Kharoshthi	Kharoshthi
Khmer	Khmer
Lao	Lao
Latin	Latin
Lepcha	Lepcha
Limbu	Limbu
Linear_B	Linear B
Lycian	Lycian
Lydian	Lydian
Malayalam	Malayalam
Mongolian	Mongolian
Myanmar	Myanmar
New_Tai_Lue	New Tai Lue (aka Simplified Tai Lue)
Nko	Nko
Ogham	Ogham
Ol_Chiki	Ol Chiki
Old_Italic	Old Italic
Old_Persian	Old Persian
Oriya	Oriya
Osmanya	Osmanya
Phags_Pa	'Phags Pa
Phoenician	Phoenician
Rejang	Rejang
Runic	Runic
Saurashtra	Saurashtra
Shavian	Shavian
Sinhala	Sinhala
Sundanese	Sundanese
Syloti_Nagri	Syloti Nagri
Syriac	Syriac
Tagalog	Tagalog
Tagbanwa	Tagbanwa
Tai_Le	Tai Le
Tamil	Tamil
Telugu	Telugu
Thaana	Thaana
Thai	Thai
Tibetan	Tibetan
Tifinagh	Tifinagh
Ugaritic	Ugaritic
Vai	Vai
Yi	Yi

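General-category and script names from the two lists above plug into \p{...}, \pX, and their negated forms. The sketch assumes Go's standard regexp package and its Unicode tables; firstMatch is a hypothetical helper, and the sample strings are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "".
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	// Script name: a run of Greek letters (alpha, beta, gamma).
	fmt.Println(firstMatch(`\p{Greek}+`, "abc \u03b1\u03b2\u03b3 def"))

	// Two-letter general category: uppercase letters.
	fmt.Println(firstMatch(`\p{Lu}+`, "abCDe")) // CD

	// One-letter category N covers non-ASCII digits; \d does not.
	text := "x\u06634" // ARABIC-INDIC DIGIT THREE followed by ASCII 4
	fmt.Println(firstMatch(`\pN+`, text) == "\u06634") // true
	fmt.Println(firstMatch(`\d+`, text))               // 4

	// Negated one-letter form: a run of non-letters.
	fmt.Println(firstMatch(`\PL+`, "ab12cd")) // 12

	// A Unicode class used as a character class element.
	fmt.Println(firstMatch(`[\p{Greek}\d]+`, "id \u03b17")) // Greek alpha, then 7
}
```
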
Vim character classes:
\i	identifier character NOT SUPPORTED vim
\I	\i except digits NOT SUPPORTED vim
\k	keyword character NOT SUPPORTED vim
\K	\k except digits NOT SUPPORTED vim
\f	file name character NOT SUPPORTED vim
\F	\f except digits NOT SUPPORTED vim
\p	printable character NOT SUPPORTED vim
\P	\p except digits NOT SUPPORTED vim
\s	whitespace character (== [ \t]) NOT SUPPORTED vim
\S	non-white space character (== [^ \t]) NOT SUPPORTED vim
\d	digits (== [0-9]) vim
\D	not \d vim
\x	hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
\X	not \x NOT SUPPORTED vim
\o	octal digits (== [0-7]) NOT SUPPORTED vim
\O	not \o NOT SUPPORTED vim
\w	word character vim
\W	not \w vim
\h	head of word character NOT SUPPORTED vim
\H	not \h NOT SUPPORTED vim
\a	alphabetic NOT SUPPORTED vim
\A	not \a NOT SUPPORTED vim
\l	lowercase NOT SUPPORTED vim
\L	not lowercase NOT SUPPORTED vim
\u	uppercase NOT SUPPORTED vim
\U	not uppercase NOT SUPPORTED vim
\_x	\x plus newline, for any x NOT SUPPORTED vim

Vim flags:
\c	ignore case NOT SUPPORTED vim
\C	match case NOT SUPPORTED vim
\m	magic NOT SUPPORTED vim
\M	nomagic NOT SUPPORTED vim
\v	verymagic NOT SUPPORTED vim
\V	verynomagic NOT SUPPORTED vim
\Z	ignore differences in Unicode combining characters NOT SUPPORTED vim

Magic:
(?{code})	arbitrary Perl code NOT SUPPORTED perl
(??{code})	postponed arbitrary Perl code NOT SUPPORTED perl
(?n)	recursive call to regexp capturing group n NOT SUPPORTED
(?+n)	recursive call to relative group +n NOT SUPPORTED
(?-n)	recursive call to relative group -n NOT SUPPORTED
(?C)	PCRE callout NOT SUPPORTED pcre
(?R)	recursive call to entire regexp (== (?0)) NOT SUPPORTED
(?&name)	recursive call to named group NOT SUPPORTED
(?P=name)	named backreference NOT SUPPORTED
(?P>name)	recursive call to named group NOT SUPPORTED
(?(cond)true|false)	conditional branch NOT SUPPORTED
(?(cond)true)	conditional branch NOT SUPPORTED
(*ACCEPT)	make regexps more like Prolog NOT SUPPORTED
(*COMMIT)	NOT SUPPORTED
(*F)	NOT SUPPORTED
(*FAIL)	NOT SUPPORTED
(*MARK)	NOT SUPPORTED
(*PRUNE)	NOT SUPPORTED
(*SKIP)	NOT SUPPORTED
(*THEN)	NOT SUPPORTED
(*ANY)	set newline convention NOT SUPPORTED
(*ANYCRLF)	NOT SUPPORTED
(*CR)	NOT SUPPORTED
(*CRLF)	NOT SUPPORTED
(*LF)	NOT SUPPORTED
(*BSR_ANYCRLF)	set \R convention NOT SUPPORTED pcre
(*BSR_UNICODE)	NOT SUPPORTED pcre
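
Constructs marked NOT SUPPORTED are rejected when the pattern is parsed, so they can be detected before any text is searched. The sketch below assumes Go's standard regexp package rejects the same constructs as RE2 for the patterns shown; rejected is a hypothetical helper and the pattern list is only a sample.

```go
package main

import (
	"fmt"
	"regexp"
)

// rejected returns the patterns from the input that fail to compile.
func rejected(patterns []string) []string {
	var bad []string
	for _, p := range patterns {
		if _, err := regexp.Compile(p); err != nil {
			bad = append(bad, p)
		}
	}
	return bad
}

func main() {
	sample := []string{
		`(?:a|b)+`,   // supported: non-capturing group
		`a*+`,        // possessive repetition
		`(?=a)b`,     // lookahead
		`(?>a)`,      // possessive group
		`(?R)`,       // recursion
		`(?P=name)`,  // named backreference
		`(?(1)a|b)`,  // conditional branch
		`(*FAIL)`,    // backtracking control verb
		`(?#note)a`,  // comment
	}
	for _, p := range rejected(sample) {
		fmt.Println("not supported:", p)
	}
}
```

Only the first pattern in the sample is expected to compile; the remaining eight are reported.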