Ravaan Techky

Ravaan Techky Group invites all Techkies.

Python

Regular Expression

  • Python regular expression module available in re package.

  • Special characters used in regular expression, -

Character Description Example pattern code Example match
\d A digit file_\d\d file_24
\w Aplhanumeric (Also include underscore characters in match) \w-\w\w\w A-b_1
\s Whilespace a\sb\s\c a b c
\D A non digit \D\D\D\D aBcD
\W Non-alphanumeric \W\W\W\W\W *_+=)
\S Non-whitespace \S\S\S\S Yoyo
  • Occurences related special characters, -
Character Description Example pattern code Example match
+ Occurs one or more times Version \w-\w+ Version A-B_1
{3} Occurs three times \D{3} 123
{2,4} Occurs 2 to 4 times \d{2,4} 123
{3,} Occurs 3 or more times \w{3,} sdaw3rfwe4
* Occurs zero or more times ABC* AAACCC
? Occur Once or None plurals? plural
  • Condition based regular expression characters, -
Character Description Example pattern code Example match  
pipe sign OR condition in regular expression r’cat dog’ Statement 1: This is a dog!
Statement 2: This is a cat!
. (wild char) period sign indicate wild char for get given position r’…at’ Statement : The cat in the hat went splat
Result will be - [‘e cat’, ‘e hat’, ‘splat’]
 
^ (power sign) First occurence should match with pattern r’^\d’ Statement : 2 is Number.
Result will be - [2]
 
$ (dollar sign) Last occurence should match with pattern r’$\d’ Statement : The Number is 2
Result will be - [2]
 

IN keyword example:

Input:

my_var = 'bhushan' in 'bhushan is my name'
print(my_var)

Output:

True

Simple Expression search function example:

Input:

import re
my_statement = "Agent Hitman has phone Number (999)-999-9999 You can call at any time on this phone number!!!"
pattern = 'NO PATTERN SET'
re.search(pattern, my_statement)

Output: No output will be available. If we try to print output it will be None type

Input:

import re
my_statement = "Agent Hitman has phone Number (999)-999-9999 You can call at any time on this phone number!!!"

#return first find object from statement.
pattern = 'phone'
re.search(pattern, my_statement)

Output:

<re.Match object; span=(17, 22), match='phone'>

Methods from Match object

Method Description
span() return start and end attribute of first match in tuple
start() return start index
end() return end index
group() return search text

Input:

import re
my_statement = "Agent Hitman has phone Number (999)-999-9999 You can call at any time on this phone number!!!"

#return first find object from statement.
pattern = 'phone'
match = re.search(pattern, my_statement)
print(match.span())
print(match.start())
print(match.end())
print(match.group())

Output:

(17, 22)
17
22
phone

Simple Expression findall function example:

Input:

import re
my_statement = "Agent Hitman has phone Number (999)-999-9999 You can call at any time on this phone number!!!"

#return first find object from statement.
pattern = 'phone'
match_list = re.findall(pattern, my_statement)
print(f'Total search count : {len(match_list)}')
print(f'Search Text        : {match_list}')
for (index, match) in enumerate(match_list):
    print(f'Search Number {index}')
    print(f'\t\t found at: {match}')

Output:

Total search count : 2
Search Text        : ['phone', 'phone']
Search Number 0
		 found at: phone
Search Number 1
		 found at: phone

Simple Expression finditer function example:

Input:

import re
my_statement = "Agent Hitman has phone Number (999)-999-9999 You can call at any time on this phone number!!!"

#return first find object from statement.
pattern = 'phone'
for match in re.finditer(pattern, my_statement):
    print(f'{match.group()} found at: {match.start()}, {match.end()}')

Output:

phone found at: 17, 22
phone found at: 78, 83

Note: finditer function iterate over match object while findall function return only occurences

Regular Expression search function example:

Input:

import re
my_statement = "Agent Hitman has phone Number (999)-999-9999 You can call at any time on this phone number!!!"

#return first find object from statement.
pattern = r'\W\d{3}\W-\d{3}-\d{4}'
for match in re.finditer(pattern, my_statement):
    print(f'{match.group()} found at: {match.start()}, {match.end()}')

Output:

(999)-999-9999 found at: 30, 44

Regular Expression compile and search function example:

compile function from re package compile regular expression and divide it into groups. For this purpose it uses open and closing brackets.

Input:

import re
my_statement = "Agent Hitman has phone Number (999)-999-9999 You can call at any time on this phone number!!!"

#return first find object from statement.
pattern = re.compile('\W(\d{3})\W-(\d{3})-(\d{4})') #Each open and closing bracket represent group.
result = re.search(pattern, my_statement)
print(result.span())
print(f'Group 1 {result.group(1)}')
print(f'Group 2 {result.group(2)}')
print(f'Group 3 {result.group(3)}')
print(f'Group 4 {result.group(4)}') # ==> Generate Error as there is no 4th group available

Output:

(30, 44)
Group 1 999
Group 2 999
Group 3 9999

Multiple Regular Expression with OR condition (PIPE separator) example:

Input:

import re
friend_name = 'Harshal'
statement = f'{friend_name} is my friend and He is my best friend and his contact detail is 838-009-3898'
pattern = re.compile('Abhay|Harshal|Sanjay|Abhijeet')
search_result = re.search(pattern, statement)
if search_result != None:
    print(f'Result found with start index {search_result.start()} and end index {search_result.end()}')
    print(f'Available friend is {search_result.group()}')
else:
    print('No result found!')

Output:

Result found with start index 0 and end index 7
Available friend is Harshal

Wild char Regular Expression (period sign) example:

Input:

import re
friend_name = 'Sanket'
statement = f'{friend_name} is my friend and He is my good friend and his contact detail is 838-009-3898'
pattern = re.compile('my......friend')
search_result = re.search(pattern, statement)
if search_result != None:
    print(f'Result found with start index {search_result.start()} and end index {search_result.end()}')
    print(f'Available friend is {search_result.group()}')
else:
    print('No result found!')

Output:

Result found with start index 30 and end index 44
Available friend is my good friend

Exclusion operation with Regular Expression example:

import re
#exclusion syntax with the help of regular expression
statement = 'there are 3 numbers 34 inside 5 this sentence'
pattern = r'[^\d]'  # Generate list of characters from above statement
pattern = r'[^\d]+' # Generate list of words which gets separated with number
re.findall(pattern, statement)



Back


python-documentation is maintained by ravaan-techky.