In daily life we run into many pattern search problems, such as finding all the mobile numbers or email addresses in a given web page or file.
Writing manual code for this is inefficient and messy. Regular expressions are a very popular technique for pattern search (compilers and interpreters commonly use them to recognize tokens), and they are easy to use and efficient for most patterns.
I suggest you write code to extract all the emails from a web page without using regular expressions, and test it.
Emails found on a web page may follow a number of different patterns.
Then run it on any big web page and you will realize that your hours of hard work and hundreds of lines of code have produced an inefficient mess.
Now let's learn a little about regular expressions in Python. I hope you are familiar with the IPython Notebook. If not, take a few minutes to look at it; it is very simple.
IPython Notebook: Pattern Search. Follow this link to see how it works step by step. You can download it and run it on your local PC for a better experience. If you have any problem with that link, use this one: [Pattern_Search [Github_link]]
The three main steps are:
- import the re module
- write a regular expression for your pattern
- search
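The three steps above can be sketched as follows. This is only a minimal illustration, not the code from the linked notebook: the sample text is made up, and it assumes that a "mobile number" is simply a run of exactly 10 digits.

```python
import re  # step 1: import the re module

# Step 2: write a regular expression for your pattern.
# Assumption for illustration: a mobile number is exactly 10 digits.
mobile_pattern = re.compile(r"\b\d{10}\b")

# Step 3: search.
text = "Call 9876543210 or 9123456780, not 12345."
first = mobile_pattern.search(text)          # first match object, or None
all_numbers = mobile_pattern.findall(text)   # list of every match

print(first.group())
print(all_numbers)
```

search() stops at the first match, while findall() collects every match in the text, which is usually what you want when harvesting.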
The notebook linked above provides enough information to understand the basics.
Now let's move on to our email extractor regular expression. Email_harvester: open this link to see the IPython notes and a demonstration.
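To give a rough idea of what such an extractor can look like, here is a small sketch. The pattern below is a simplified one that I am assuming for illustration; it is not necessarily the expression used in the Email_harvester notebook, and it will not cover every valid email address. The names EMAIL_RE and extract_emails are hypothetical.

```python
import re

# Simplified pattern (an assumption, not the notebook's exact expression):
# local part, "@", one or more domain labels, then a top-level domain
# made of at least two letters.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)

def extract_emails(text):
    """Return the unique email-like strings in text, in order of first appearance."""
    # dict.fromkeys drops duplicates while keeping the original order.
    return list(dict.fromkeys(EMAIL_RE.findall(text)))

sample = "Contact alice@example.com, bob.smith@mail.example.org or alice@example.com."
print(extract_emails(sample))
```

A few lines of pattern replace the hundreds of lines of manual parsing mentioned earlier, and the same function works on the text of any page you have already downloaded.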
Note: For more detailed information, see the book Automate the Boring Stuff.
You can use this technique to harvest any information from the web or from any file. Here are a few examples or models of email harvesters: Email_Collection
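As a small sketch of the same idea applied to files, the helper below reads a text file and returns every unique match of whatever pattern you pass in. The function name harvest_from_file and the file name in the usage note are hypothetical, and the file is assumed to be plain text.

```python
import re
from pathlib import Path

def harvest_from_file(path, pattern):
    """Return the unique matches of pattern found in the text file at path."""
    # errors="ignore" skips bytes that are not valid UTF-8 (assumption:
    # losing a few odd characters is acceptable when harvesting).
    text = Path(path).read_text(encoding="utf-8", errors="ignore")
    # finditer + group(0) always yields the whole match, even if the
    # pattern contains capturing groups.
    matches = (m.group(0) for m in re.finditer(pattern, text))
    return list(dict.fromkeys(matches))  # drop duplicates, keep order
```

For example, harvest_from_file("page.html", r"\b\d{10}\b") would list the 10-digit numbers in a saved page, and passing an email pattern instead would turn it into an email harvester.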