IT Questions and Answers :)

Thursday, January 23, 2020

This question is categorized "General Linux" because it has been tested on Linux with "grep". Which of the following strings is NOT matched by the regular expression 'ca*t'?

  • cart
  • cat
  • caat
  • ct 

EXPLANATION

cart does NOT match the regular expression ca*t because the "r" in "cart" is not matched: the pattern allows only zero or more "a" characters between the "c" and the "t".

caat matches the "c", the "a" (two times), and the "t" in the regular expression.
cat matches the "c", the "a" (one time), and the "t" in the regular expression.
ct matches the "c", the "a" (zero times), and the "t" in the regular expression.
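
The four answers can be checked in one pass. This is a minimal sketch that assumes any POSIX-style grep; the variable name "matches" is only for illustration.

```shell
# Feed the four candidate strings to grep, one per line.
# The pattern is single-quoted so the shell leaves the asterisk alone.
matches=$(printf '%s\n' cart cat caat ct | grep 'ca*t')

# Prints cat, caat and ct (in input order); cart is filtered out.
printf '%s\n' "$matches"
```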

From:  https://linux.die.net/man/1/grep
"Repetition
A regular expression may be followed by one of several repetition operators:

?
The preceding item is optional and matched at most once.

*
The preceding item will be matched zero or more times.

+
The preceding item will be matched one or more times. "
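
As a rough illustration of the three operators quoted above, the sketch below runs each one against the same three words. It assumes grep's extended syntax (-E), because in basic syntax an unescaped ? or + is an ordinary character. The ^ and $ anchors force the whole line to match, and the variable names are only for illustration.

```shell
words=$(printf '%s\n' ct cat caat)

# ?  zero or one "a":   ct, cat
zero_or_one=$(printf '%s\n' "$words" | grep -E '^ca?t$')

# *  zero or more "a":  ct, cat, caat
zero_or_more=$(printf '%s\n' "$words" | grep -E '^ca*t$')

# +  one or more "a":   cat, caat
one_or_more=$(printf '%s\n' "$words" | grep -E '^ca+t$')

printf '? matches:\n%s\n' "$zero_or_one"
printf '* matches:\n%s\n' "$zero_or_more"
printf '+ matches:\n%s\n' "$one_or_more"
```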
Note that "bash" uses * on the command line to match any string of characters in filenames.

To match any string of characters with a regex, use: .*
The period is a regex metacharacter that matches any single character except newline, and the asterisk repeats it zero or more times, so the pair matches a string of any length.
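
To see the difference between a* and .* in practice, the following sketch runs both patterns over the same six words used in this post's test file. The variable names are only for illustration.

```shell
words=$(printf '%s\n' ct cat caat cannot cart chat)

# 'ca*t' allows only "a" characters between the c and the t.
only_a=$(printf '%s\n' "$words" | grep 'ca*t')

# 'c.*t' allows any characters between the c and the t.
anything=$(printf '%s\n' "$words" | grep 'c.*t')

printf 'ca*t matches:\n%s\n' "$only_a"     # ct, cat, caat
printf 'c.*t matches:\n%s\n' "$anything"   # all six words
```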

Since the asterisk is a globbing character to "bash", it should be quoted or escaped when it is part of a "grep" pattern entered on the bash command line.

Note that the asterisk is quoted in the command below so that bash does not expand it:

$ grep ca'*'t regex_test
ct
cat
caat

The asterisk could also be protected from bash with 'ca*t' or ca\*t. Either form passes the asterisk to grep, instead of bash using it to glob filenames in the current directory.
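
One way to confirm that the three spellings are equivalent is to print the arguments the shell actually hands to a command. This sketch uses printf in place of grep; the variable name "args" is only for illustration.

```shell
# printf repeats its format once per argument, so each argument
# appears on its own line, wrapped in angle brackets.
args=$(printf '<%s>\n' ca'*'t 'ca*t' ca\*t)

# Prints <ca*t> three times: every form delivers the same literal pattern.
printf '%s\n' "$args"
```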

Absent escaping, you could get unexpected results if the asterisk globs a filename:

$ ls
cannot  regex_test  test4
 
$ cat regex_test
ct
cat
caat
cannot
cart
chat

Below, the asterisk is NOT escaped, so bash expanded it to a filename:
$ grep ca*t regex_test
cannot

In the unescaped command above, bash globbed the filename "cannot" from ca*t and passed it to grep, so the command that actually ran was "grep cannot regex_test".
Since the file "regex_test" contains the string "cannot" in its data, the results of "grep" are correct for the given command, but that command was probably not the one intended.
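
A handy way to preview what the shell will do is to put "echo" in front of the command. The sketch below rebuilds the directory from the listing above in a scratch location (it assumes mktemp -d is available) and compares the unquoted and quoted commands. The variable names are only for illustration.

```shell
# Recreate the example directory in a throwaway location.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch cannot test4
printf '%s\n' ct cat caat cannot cart chat > regex_test

# echo shows the command line after globbing has been applied.
expanded=$(echo grep ca*t regex_test)
echo "$expanded"              # grep cannot regex_test

unquoted=$(grep ca*t regex_test)     # really runs: grep cannot regex_test
quoted=$(grep 'ca*t' regex_test)     # grep receives the pattern ca*t

printf 'unquoted:\n%s\n' "$unquoted" # cannot
printf 'quoted:\n%s\n' "$quoted"     # ct, cat, caat

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```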

Compare the output of the unescaped * above to the output of the quoted 'ca*t' in the command below:
$ grep 'ca*t' regex_test
ct
cat
caat

If bash parses an unescaped asterisk but the pattern does not match any filename, then by default bash passes the pattern unchanged to grep (or whatever other command was entered on the command line). But don't count on bash not globbing a filename when that's not what you want: always quote or escape characters that are special to bash if you need them passed to your command.
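
The sketch below shows why relying on an unmatched glob is fragile: the same unquoted command line changes meaning as soon as a matching file appears. It assumes the shell's default settings and a scratch directory from mktemp -d; the variable names are only for illustration.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf '%s\n' ct cat caat cannot cart chat > regex_test

# No filename matches ca*t yet, so the pattern is passed through unchanged.
unmatched=$(echo ca*t)
echo "$unmatched"     # ca*t

# Once a matching file exists, the identical command line expands differently.
touch cannot
matched=$(echo ca*t)
echo "$matched"       # cannot

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```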

From https://www.tldp.org/LDP/abs/html/globbingref.html
"Bash itself cannot recognize Regular Expressions. Inside scripts, it is commands and utilities -- such as sed and awk -- that interpret RE's.
Bash does carry out filename expansion [1] -- a process known as globbing -- but this does not use the standard RE set. Instead, globbing recognizes and expands wild cards. Globbing interprets the standard wild card characters [2] -- * and ?, character lists in square brackets, and certain other special characters (such as ^ for negating the sense of a match). There are important limitations on wild card characters in globbing, however. Strings containing * will not match filenames that start with a dot, as, for example, .bashrc. [3] Likewise, the ? has a different meaning in globbing than as part of an RE."
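
To make the last point of the quote concrete, here is a small sketch contrasting ? as a wildcard with ? as a regex operator. It relies on the shell's case statement, which uses the same wildcard syntax as filename globbing. The helper functions glob_match and regex_match are hypothetical names invented for this example.

```shell
# Wildcard: ? stands for exactly one character of any kind.
glob_match() {
    case $1 in
        c?t) echo yes ;;
        *)   echo no ;;
    esac
}

# Extended regex: ? makes the preceding "a" optional.
regex_match() {
    if printf '%s\n' "$1" | grep -Eq '^ca?t$'; then
        echo yes
    else
        echo no
    fi
}

for word in ct cat cut; do
    printf '%s  glob c?t: %s  regex ca?t: %s\n' \
        "$word" "$(glob_match "$word")" "$(regex_match "$word")"
done
# ct   glob: no   regex: yes
# cat  glob: yes  regex: yes
# cut  glob: yes  regex: no
```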

Lastly, this writeup on regular expressions is old (but maintained) and very good:
http://www.grymoire.com/Unix/Regular.html#TOC


SOURCE

https://linux.die.net/man/1/grep