Ruby Regular Expressions

A regular expression is a special sequence of characters that uses a dedicated pattern syntax to match or find sets of strings.

A regular expression combines ordinary characters with a set of predefined special characters into a "rule string" that expresses the filtering logic to apply to strings.


A regular expression literal is a pattern written between slashes, or between arbitrary delimiters following %r, as follows:

/pattern/im    # options can be specified
%r!/usr/local! # regular expression with a custom delimiter



line1 = "Cats are smarter than dogs"
line2 = "Dogs also like meat"

if line1 =~ /Cats(.*)/
  puts "Line1 contains Cats"
end
if line2 =~ /Cats(.*)/
  puts "Line2 contains Cats"
end

Running the example above produces:

Line1 contains Cats
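
As a small sketch of what the =~ operator gives back: it returns the index where the match starts, or nil when nothing matches, and the text captured by the parentheses is available in $1 afterwards. The helper name text_after_cats is hypothetical and exists only for illustration.

```ruby
# Hypothetical helper: returns the text that follows "Cats",
# or nil when the line does not contain "Cats".
def text_after_cats(line)
  # =~ returns the index where the match starts, or nil if there is no match.
  return nil unless line =~ /Cats(.*)/

  # After a successful match, $1 holds the first capture group.
  $1
end

line1 = "Cats are smarter than dogs"
line2 = "Dogs also like meat"

p(line1 =~ /Cats(.*)/)     # 0
p(line2 =~ /Cats(.*)/)     # nil
p text_after_cats(line1)   # " are smarter than dogs"
p text_after_cats(line2)   # nil
```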

Regular expression modifiers

A regular expression literal may include an optional modifier that controls various aspects of matching. The modifier is specified after the second slash character, as shown in the example above. The following table lists the possible modifiers:

Modifier      Description
i             Ignores case when matching text.
o             Performs #{} interpolation only once, the first time the regular expression literal is evaluated.
x             Ignores whitespace and allows comments within the expression.
m             Matches across multiple lines, treating the newline character as a normal character.
u, e, s, n    Interprets the regular expression as Unicode (UTF-8), EUC, SJIS, or ASCII. If none of these modifiers is specified, the regular expression is assumed to use the source encoding.
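
The following sketch shows the i, m, and x modifiers in action. The constant names are illustrative, and Regexp#match? assumes Ruby 2.4 or later.

```ruby
# i: ignore case.
CASE_INSENSITIVE = /ruby/i

# m: let "." match a newline as well.
DOT_MATCHES_NEWLINE = /start.*end/m

# x: whitespace and comments inside the pattern are ignored.
PHONE_PREFIX = /
  \d{3}   # first group of digits
  -
  \d{4}   # second group of digits
/x

p CASE_INSENSITIVE.match?("I like RUBY")       # true
p((/start.*end/).match?("start\nend"))         # false: "." stops at the newline
p DOT_MATCHES_NEWLINE.match?("start\nend")     # true
p PHONE_PREFIX.match?("138-3453")              # true
```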

Just as string literals can be delimited with %Q, Ruby lets you begin a regular expression with %r followed by a delimiter of your choice. This is very useful when the pattern contains many slash characters that you do not want to escape.

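# The following matches a single slash character, no escape required
%r|/|

# Flag characters are allowed with this syntax, too
%r[</(.*)>]i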
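
To make the benefit concrete, here is a minimal sketch that writes one path pattern in both literal forms. The constant names and the sample path are assumptions chosen for illustration.

```ruby
# The same pattern written with both literal forms.
WITH_SLASHES   = /\/usr\/local\/(\w+)/    # every "/" must be escaped
WITH_PERCENT_R = %r{/usr/local/(\w+)}     # no escaping needed

path = "/usr/local/bin"

p WITH_SLASHES.match(path)[1]     # "bin"
p WITH_PERCENT_R.match(path)[1]   # "bin"
```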

Regular expression pattern

Except for the control characters (+ ? . * ^ $ ( ) [ ] { } | \), all characters match themselves. You can escape a control character by preceding it with a backslash.

The following table lists the regular expression syntax available in Ruby.

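Pattern        Description
.              Matches any single character except newline. With the m option, it matches newline as well.
re{n}          Matches exactly n occurrences of the preceding expression.
re{n,}         Matches n or more occurrences of the preceding expression.
re{n,m}        Matches at least n and at most m occurrences of the preceding expression.
a|b            Matches either a or b.
(?imx)         Temporarily turns on the i, m, or x option within a regular expression. Inside parentheses, only the part within the parentheses is affected.
(?-imx)        Temporarily turns off the i, m, or x option within a regular expression. Inside parentheses, only the part within the parentheses is affected.
(?:re)         Groups regular expressions without remembering the matched text.
(?imx:re)      Temporarily turns on the i, m, or x option within the parentheses.
(?-imx:re)     Temporarily turns off the i, m, or x option within the parentheses.
(?=re)         Specifies a position using a pattern. Does not have a range.
(?!re)         Specifies a position using a negated pattern. Does not have a range.
(?>re)         Matches an independent pattern without backtracking.
\s             Matches a whitespace character. Equivalent to [ \t\n\r\f].
\d             Matches a digit. Equivalent to [0-9].
\n, \t, etc.   Matches newlines, carriage returns, tabs, and so on.
\1...\9        Matches the nth grouped subexpression.
\10            Matches the nth grouped subexpression if it has already matched. Otherwise refers to the octal representation of a character code.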
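
Two rows of the table are easier to grasp with a short sketch: the difference between a capturing group and (?:re), and the no-backtracking behavior of (?>re). The constant names are illustrative, and Regexp#match? assumes Ruby 2.4 or later.

```ruby
CAPTURING     = /rub(y|e)/
NON_CAPTURING = /rub(?:y|e)/

# A capturing group remembers its text; (?:...) only groups.
p "ruby".match(CAPTURING).captures       # ["y"]
p "ruby".match(NON_CAPTURING).captures   # []

BACKTRACKING = /a+ab/
ATOMIC       = /(?>a+)ab/

# a+ can give back one "a" so that the trailing "ab" still matches.
p BACKTRACKING.match?("aaab")   # true
# The atomic group keeps every "a" it consumed, so "ab" can never match.
p ATOMIC.match?("aaab")         # false
```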

Examples of regular expressions


Example    Description
/ruby/     Matches "ruby".
¥          Matches the yen sign. Multibyte characters are supported in Ruby 1.9 and Ruby 1.8.

Character Classes

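/[Rr]uby/    Matches "Ruby" or "ruby".
/rub[ye]/    Matches "ruby" or "rube".
/[0-9]/      Matches any single digit; same as /[0123456789]/.
/[a-z]/      Matches any single lowercase ASCII letter.
/[A-Z]/      Matches any single uppercase ASCII letter.
/[^aeiou]/   Matches any single character other than a lowercase vowel.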

Special character classes

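/./     Matches any character except newline.
/./m    In multi-line mode, matches newline as well.
/\d/    Matches a digit; same as /[0-9]/.
/\D/    Matches a non-digit; same as /[^0-9]/.
/\s/    Matches a whitespace character; same as /[ \t\r\n\f]/.
/\S/    Matches a non-whitespace character; same as /[^ \t\r\n\f]/.
/\w/    Matches a word character; same as /[A-Za-z0-9_]/.
/\W/    Matches a non-word character; same as /[^A-Za-z0-9_]/.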
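
As a quick illustration of the shorthand classes, the sketch below uses String#scan to collect every run of characters from one class. The sample sentence is made up for the example.

```ruby
SAMPLE = "Room_12 costs 99 dollars"

p SAMPLE.scan(/\d+/)   # ["12", "99"]
p SAMPLE.scan(/\D+/)   # ["Room_", " costs ", " dollars"]
p SAMPLE.scan(/\w+/)   # ["Room_12", "costs", "99", "dollars"]
p SAMPLE.scan(/\s/)    # [" ", " ", " "]
```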


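/ruby?/      Matches "rub" or "ruby": the y is optional.
/ruby*/      Matches "rub" followed by zero or more y characters.
/ruby+/      Matches "rub" followed by one or more y characters.
/\d{3}/      Matches exactly 3 digits.
/\d{3,}/     Matches 3 or more digits.
/\d{3,5}/    Matches 3, 4, or 5 digits.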
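
The repetition operators can be compared side by side with String#[], which returns the first matched substring or nil. The helper first_match is a hypothetical name used only in this sketch.

```ruby
# Hypothetical helper: the first substring of text matched by pattern, or nil.
def first_match(text, pattern)
  text[pattern]
end

p first_match("rubyyy", /ruby?/)      # "ruby"
p first_match("rubyyy", /ruby*/)      # "rubyyy"
p first_match("rub", /ruby*/)         # "rub"
p first_match("rub", /ruby+/)         # nil
p first_match("1234567", /\d{3}/)     # "123"
p first_match("1234567", /\d{3,5}/)   # "12345"
p first_match("12", /\d{3,}/)         # nil
```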

Non-greedy repetition

This matches the smallest number of repetitions.

/<.*>/     Greedy repetition: matches "<ruby>perl>".
/<.*?>/    Non-greedy repetition: matches "<ruby>" in "<ruby>perl>".
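
The two patterns above can be checked directly against the same input. The constant names in this sketch are illustrative.

```ruby
GREEDY = /<.*>/
LAZY   = /<.*?>/

html = "<ruby>perl>"

p html[GREEDY]   # "<ruby>perl>"  (.* runs to the last ">")
p html[LAZY]     # "<ruby>"       (.*? stops at the first ">")
```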

Grouping with parentheses

/\D\d+/             No group: + repeats \d.
/(\D\d)+/           Grouped: + repeats the \D\d pair.
/([Rr]uby(, )?)+/   Matches "Ruby", "Ruby, ruby, ruby", and so on.


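This matches a previously matched group again.

/([Rr])uby&\1ails/       Matches ruby&rails or Ruby&Rails.
/(['"])(?:(?!\1).)*\1/   Matches a single-quoted or double-quoted string. \1 matches whatever the first group matched, \2 matches whatever the second group matched, and so on.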
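
A short sketch of both backreference patterns follows. The constant names and sample strings are assumptions made for illustration, and Regexp#match? assumes Ruby 2.4 or later.

```ruby
RAILS_PAIR = /([Rr])uby&\1ails/
QUOTED     = /(['"])(?:(?!\1).)*\1/

# \1 must repeat exactly what the first group captured.
p RAILS_PAIR.match?("ruby&rails")   # true
p RAILS_PAIR.match?("Ruby&Rails")   # true
p RAILS_PAIR.match?("Ruby&rails")   # false

# The closing quote must be the same character as the opening quote.
balanced   = %q(say "it's fine" now)
unbalanced = %q(say 'hello" there)

p balanced[QUOTED]     # "\"it's fine\""
p unbalanced[QUOTED]   # nil
```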


/ruby|rube/      Matches "ruby" or "rube".
/rub(y|le)/      Matches "ruby" or "ruble".
/ruby(!+|\?)/    Matches "ruby" followed by one or more ! or by a single ?.
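
The alternation patterns can be exercised with a few sample inputs. The constant names below are illustrative only.

```ruby
RUBY_OR_RUBLE    = /rub(y|le)/
BANG_OR_QUESTION = /ruby(!+|\?)/

p "ruble"[RUBY_OR_RUBLE]         # "ruble"
p "ruby!!!"[BANG_OR_QUESTION]    # "ruby!!!"
p "ruby?"[BANG_OR_QUESTION]      # "ruby?"
p "ruby"[BANG_OR_QUESTION]       # nil: neither "!" nor "?" follows
```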


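These patterns specify the position at which a match must occur.

/^Ruby/        Matches "Ruby" at the start of a string or line.
/Ruby$/        Matches "Ruby" at the end of a string or line.
/\ARuby/       Matches "Ruby" at the start of a string.
/Ruby\Z/       Matches "Ruby" at the end of a string.
/\bRuby\b/     Matches "Ruby" at a word boundary.
/\brub\B/      \B is a non-word boundary: matches "rub" in "rube" and "ruby", but not "rub" on its own.
/Ruby(?=!)/    Matches "Ruby" if it is followed by an exclamation mark.
/Ruby(?!!)/    Matches "Ruby" if it is not followed by an exclamation mark.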
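
The difference between line anchors and string anchors shows up only on multi-line input, so this sketch uses a two-line string. The sample strings are made up for the example.

```ruby
text = "Perl\nRuby!"

p(text =~ /^Ruby/)     # 5: ^ also matches at the start of the second line
p(text =~ /\ARuby/)    # nil: \A matches only at the very start of the string

# Lookahead checks what follows without including it in the match.
p "Ruby!"[/Ruby(?=!)/]   # "Ruby"
p "Ruby!"[/Ruby(?!!)/]   # nil
p "Ruby?"[/Ruby(?!!)/]   # "Ruby"

# \B requires "rub" to be followed by another word character.
p "rube ruby rub".scan(/\brub\B/)   # ["rub", "rub"]
```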

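Special syntax with parentheses

/R(?#comment)/   Matches "R". All the remaining characters are a comment.
/R(?i)uby/       Case-insensitive while matching "uby".
/R(?i:uby)/      Same as above.
/rub(?:y|le)/    Groups only, without creating a \1 backreference.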
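
A brief sketch of the inline forms: the leading "R" stays case-sensitive while the rest of the pattern ignores case. The constant names are illustrative, and Regexp#match? assumes Ruby 2.4 or later.

```ruby
INLINE_FLAG  = /R(?i)uby/
SCOPED_FLAG  = /R(?i:uby)/
WITH_COMMENT = /R(?#the first letter)uby/

p INLINE_FLAG.match?("RUBY")    # true
p INLINE_FLAG.match?("rUBY")    # false: "R" itself is still case-sensitive
p SCOPED_FLAG.match?("RuBy")    # true
p WITH_COMMENT.match?("Ruby")   # true: the comment is ignored
```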

Search and Replace

sub and gsub, along with their in-place variants sub! and gsub!, are among the most important String methods that use regular expressions.

All of these methods perform a search-and-replace operation using a regular expression pattern. sub and sub! replace the first occurrence of the pattern, while gsub and gsub! replace all occurrences.

sub and gsub return a new string and leave the original unmodified, whereas sub! and gsub! modify the string on which they are called.
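
The differences among the four methods can be sketched as follows. One detail worth noting is that sub! and gsub! return nil when no substitution was made, so their return value should not be assigned blindly. The variable names and sample string are illustrative.

```ruby
original = "rails on rails"

# sub replaces only the first occurrence; gsub replaces all of them.
first_only = original.sub(/rails/, "Rails")
all        = original.gsub(/rails/, "Rails")

p first_only   # "Rails on rails"
p all          # "Rails on Rails"
p original     # "rails on rails" (unchanged)

# The bang variants modify the receiver in place.
copy = original.dup
copy.gsub!(/rails/, "Rails")
p copy         # "Rails on Rails"

# Nothing is left to replace, so sub! returns nil.
no_change = copy.sub!(/rails/, "Rails")
p no_change    # nil
```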

Here is an example:

# -*- coding: UTF-8 -*-

phone = "138-3453-1111 #This is a phone number"

# Delete the Ruby-style comment
phone = phone.sub(/#.*$/, "")
puts "Phone number : #{phone}"

# Remove everything other than digits
phone = phone.gsub(/\D/, "")
puts "Phone number : #{phone}"

Running the example above produces:

Phone number : 138-3453-1111 
Phone number : 13834531111

Here is another example:

# -*- coding: UTF-8 -*-

text = "rails is rails,  Ruby on Rails is a great Ruby framework"

# Change every "rails" to "Rails"
text.gsub!("rails", "Rails")

# Capitalize the word "rails" throughout
text.gsub!(/\brails\b/, "Rails")

puts text

Running the example above produces:

Rails is Rails,  Ruby on Rails is a great Ruby framework
