Python - Videos and Questions

Regular Expressions in Python

As part of this session, we will introduce you to Regular Expressions

Slides

Code Repository for the course on GitHub

235 Comments

Tripti Pandey

7 months ago

#Regular Expressions
#. Matches any character
import re
print('This is AA and this is AAA and this is AAAA.My email id is XYZ@gmail.com')

y=re.findall('.','This is AA and this is AAA and this is AAAA.My email id is XYZ@gmail.com')
#l1=list()
#l1=y
print(y)
#print(l1)

Basically, in the above code, is the findall function returning all the characters of the string as a list?

Upvote Share

Shubh Tripathi

7 months ago

Yes, you are correct!

The re.findall() function with the pattern . matches any character except a newline. Since the pattern . is used, it will match each individual character in the string, and the findall() function returns a list of all these matches.

For the given string, the function returns all characters, including spaces and punctuation, as separate elements in a list.

1 Upvote Share

Rajat Mishra

2 years ago

Hey Admin,

I'm getting the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'mbox-short.txt'

I tried cloning the repository into my home folder; however, I'm getting a "write error" while cloning. I've attached a snippet of the same above.

Please help resolve the issue.

Upvote Share

Shubh Tripathi

2 years ago

Hi Rajat,

To resolve that,

you can clone the repo in the /tmp directory and then move it to your home directory. You can move to the tmp directory by using:

cd /tmp

After that, create a directory with your username to avoid ownership issues, such as:

mkdir <<your_username>>

After that, clone the repo inside that directory as described in the tutorial. Once the repo is cloned, move the cloned directory to your home directory with:

mv source destination

where source is the path of the directory you want to move, and destination is the path of the directory you want to move it into.

That should resolve the issue.

Upvote Share

Suel Ahmed

2 years ago

I don't see the ml folder and data in my home directory. Please help?

Upvote Share

Shubh Tripathi

2 years ago

Hi,

Can you elaborate on your doubt?

Upvote Share

Nirav Raj

3 years ago

"[0-9]"

what is used for??

Upvote Share

Deekshitha Pawar

3 years ago

Hi,

[0-9] is used to match a single character in the range 0 - 9.

Thanks.

Upvote Share

Ramanesh Iyer

4 years ago

where can i get the python programs which Sandeep is showing in his video of Regular expression .I am currently using Lab and i cant find anywhere the file which he has all the programs he is showing to teach and asking to practice .

Upvote Share

Vagdevi K

4 years ago

Hi,

For your reference: https://github.com/cloudxlab/ml/tree/master/python

Thanks.

Upvote Share

Mayank Chaubey

4 years ago

#Alphanumeric Number
re.findall("[0-9A-Za-z]", "My 2aFrZaB favorite numbers are 19.2 ,4x2")

I want to find the exact '2aFrZaB' from the above line. What code should I use for it.

Upvote Share

Abhinav Singh

4 years ago

Did the above code work? I will say try and experiment. You will find the working code yourself.

If not, then please let us know.
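
One possible approach, sketched under the assumption that the goal is to pull out the whole token rather than single characters: adding + after the character class groups consecutive letters and digits, and \b lets you search for one specific token between word boundaries.

```python
import re

text = "My 2aFrZaB favorite numbers are 19.2 ,4x2"

# [0-9A-Za-z] alone matches one character at a time;
# adding + groups consecutive letters/digits into tokens.
tokens = re.findall(r"[0-9A-Za-z]+", text)
print(tokens)

# To look for one specific token, search for it literally
# between word boundaries (\b).
exact = re.findall(r"\b2aFrZaB\b", text)
print(exact)
```

The first call returns every alphanumeric token in the line; the second returns only the exact token asked about.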

Upvote Share

Mayank Chaubey

4 years ago

How to solve this error?

Upvote Share

Abhinav Singh

4 years ago

Please clone this repository in your home folder using the below command. There you will find the above-mentioned file

git clone https://www.github.com/cloudxlab/ml ~/ml

Upvote Share

Mayank Chaubey

4 years ago

Even after cloning it, the error is the same.

Upvote Share

Vagdevi K

4 years ago

Hi,

After cloning, we also need to give the correct path to be able to use the file.

So try the following path:

ml/python/mbox-short.txt

Thanks.

Upvote Share

Mayank Chaubey

4 years ago

Am I missing something?

Upvote Share

Mayank Chaubey

4 years ago

Which Permission I have to give?

Upvote Share

Vagdevi K

4 years ago

Hi,

The ml folder is outside your current folder. So use: '../ml/python/mbox-short.txt'. This should work.

Thanks.

Upvote Share

Vagdevi K

4 years ago

Hi,

Please try:

open('../ml/python/mbox-short.txt')

This should work.

Thanks.

Upvote Share

Mayank Chaubey

4 years ago

It worked. But didn't understand this problem.

Upvote Share

Vagdevi K

4 years ago

Hi,

By running git clone https://www.github.com/cloudxlab/ml ~/ml, you cloned the repo into a folder called ml in your home directory, which is outside your present working directory. Your present working directory and the ml folder are at the same level, but you are inside your present working directory. Using ../ moves one level up, to where the ml folder is, and the rest of the path follows from there. Feel free to go through the Linux course for more understanding: https://cloudxlab.com/assessment/displayslide/12/the-directory-structure?course_id=124&playlist_id=2

Thanks.
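
A small sketch that may help when debugging this kind of FileNotFoundError. It assumes the repository was cloned to ~/ml as in the command above; building an absolute path from the home directory avoids depending on which folder the notebook happens to run in.

```python
import os
from pathlib import Path

# Assumption: the repository was cloned to ~/ml, as in the command above.
data_file = Path.home() / "ml" / "python" / "mbox-short.txt"

# Relative paths such as 'mbox-short.txt' are resolved against this directory.
print("Current working directory:", os.getcwd())
print("Looking for:", data_file)

if data_file.exists():
    with open(data_file) as hand:
        # Print the first line as a quick sanity check.
        print(hand.readline().rstrip())
else:
    print("File not found; check where the repository was cloned.")
```

If the printed working directory is not the folder you expected, that usually explains why a short relative path fails.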

Upvote Share

Zanbaz Ahmed Khan

4 years ago

Hey admin, could you just notice 1 and 2 and answer the query

# 3 dots

1) re.findall('t...' , 'this is a and that is aa')

['this', 'that']

# 2 dots

2) re.findall('t..' , 'this is a and that is aa')

['thi', 'tha', 't i']

Shouldn't the output for 1 be ['this', 'that', 't is']?

Thanks!

Upvote Share

Vagdevi K

4 years ago

Hi,

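. means any character. Space is also a character, and hence it is matched, which is why 't i' appears for 't..'. For 't...', the match 'that' already consumes the final 't', and findall returns only non-overlapping matches, so 't is' cannot be matched again.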

Thanks.
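
To make the non-overlapping behaviour concrete, here is a minimal sketch using the same string. The lookahead version at the end is not from the lecture; it is one common way to see overlapping candidates, shown only for comparison.

```python
import re

text = 'this is a and that is aa'

# Each match consumes its characters, so the final 't' of 'that'
# is not available to start another match.
four = re.findall('t...', text)
print(four)

# With only two dots, 'tha' leaves the final 't' free, so 't i' is found.
three = re.findall('t..', text)
print(three)

# A lookahead (?=...) matches without consuming characters,
# so overlapping candidates are reported as well.
overlapping = re.findall('(?=(t...))', text)
print(overlapping)
```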

Upvote Share

Zanbaz Ahmed Khan

4 years ago

hello admin,

Is there a command in linux by which we can straight away find the path of a file be it in any directory ?

Upvote Share

Vagdevi K

4 years ago

Hi,

Feel free to look at https://superuser.com/a/327764

Thanks.

Upvote Share

Amit Kumar

4 years ago

here is my file

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

Try the following path:

~/cloudxlab_jupyter_notebooks/mbox-short.txt

Thanks.

1 Upvote Share

Amit Kumar

4 years ago

Hi,

Please guide me: why am I not able to open the file?

Upvote Share

Sandeep Akode

4 years ago

You have given the URL. Please give the file path instead.

Upvote Share

Anagha Pawar

4 years ago

Whats difference between * and + regular expressions?

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

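* matches zero or more repetitions of the preceding element, while + matches one or more. For example, a period followed by an asterisk (.*) matches zero or more characters, while a period followed by a plus (.+) matches one or more characters.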

Thanks.
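
A short illustration of the difference, using a made-up string: with +, at least one 'a' must be present before the 'b'; with *, a bare 'b' also qualifies.

```python
import re

text = 'b ab aab'

# a+b: one or more 'a' followed by 'b'
plus = re.findall('a+b', text)
print(plus)

# a*b: zero or more 'a' followed by 'b', so the lone 'b' matches too
star = re.findall('a*b', text)
print(star)
```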

Upvote Share

Ravimohan Kaivar

4 years ago

I am getting an error "No such file or directory" when I execute "fhand = open('/home/rmmk212922/ml/python/mbox-short.txt')".

Also tried "ml/python/mbox-short.txt" but getting the same error. I tried dir command and could not find the ml directory.

Command "!dir ml/python/" gives error - dir: cannot access ml/python/: No such file or directory

I am able to open mbox.txt but not mbox-short.txt

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

Please clone our GitHub repository to get access to that file by using the following command in a web console:

git clone https://www.github.com/cloudxlab/ml ~/ml

Thanks.

1 Upvote Share

Ravimohan Kaivar

4 years ago

Thank you Rajtilak, it is working now..

Upvote Share

Yashin Mehta

4 years ago

I COULD NOT SEE FILE mbox-short.txt.

WHEN I executed the command hand=open('mbox-short.txt') gave me error .

FileNotFoundError: [Errno 2] No such file or directory: 'mbox-short.txt'

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

Try the following path instead:

ml/python/mbox-short.txt

Thanks.

Upvote Share

Chitra Bhatia

4 years ago

The file shows stephen.marquard@uct.ac.za and louis@media.berkeley.edu email addresses twice. However the below command returns these names only once.

import re
hand = open('/home/chits154575/ml/python/mbox-short.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^From:',line) :
        print(line)

From: stephen.marquard@uct.ac.za
From: louis@media.berkeley.edu
From: zqian@umich.edu
From: rjlowe@iupui.edu
From: zqian@umich.edu
From: rjlowe@iupui.edu
From: cwen@iupui.edu
From: cwen@iupui.edu
From: gsilver@umich.edu
From: gsilver@umich.edu
From: zqian@umich.edu
From: gsilver@umich.edu
From: wagnermr@iupui.edu
From: zqian@umich.edu
From: antranig@caret.cam.ac.uk
From: gopal.ramasammycook@gmail.com
From: david.horwitz@uct.ac.za
From: david.horwitz@uct.ac.za
From: david.horwitz@uct.ac.za

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

There is a project related to this in the next topic, I would urge you to try your hands on that.

Thanks.
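
A likely explanation, sketched with made-up sample lines and assuming the file follows the usual mbox layout: each message has an envelope line starting with 'From ' (no colon) and a header line starting with 'From:'. The pattern '^From:' matches only the header line, so each address is printed once per message even though it appears twice in the file.

```python
import re

# Hypothetical excerpt in mbox style; not copied from the course file.
sample = """From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008
Return-Path: <postmaster@collab.sakaiproject.org>
From: stephen.marquard@uct.ac.za
Subject: [sakai] svn commit
"""

lines = sample.splitlines()

# Only the header line contains the colon right after 'From'.
with_colon = [line for line in lines if re.search('^From:', line)]

# Dropping the colon matches both the envelope line and the header line.
any_from = [line for line in lines if re.search('^From', line)]

print(len(with_colon), len(any_from))
```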

Upvote Share

Rohit Kumar

4 years ago

Why I am getting a blank list while compiling the below regular expression

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

I cannot see the complete string, so please ensure the expression you have provided matches the string. If it does not, it will show a blank list like above.

Thanks.

Upvote Share

Vikas Gupta

4 years ago

Hi Rohit,

If you are looking for all the words that start with 't', your pattern should be 't...'. The '^' in your pattern anchors the match to the start of the line variable, so it matches only if the line itself starts with 't'.

To find every 't' followed by three more characters, see the example below.

import re
line = "I have two phones this that +rohit"

re.findall('t...', line)

Hope this helps you.

Happy learning

1 Upvote Share

Rohit Kumar

4 years ago

The output is not giving email ID separating with @. can you please help

Upvote Share

Vikas Gupta

4 years ago

Hi Rohit,

You can think through a couple of things for this, and then you will get the result.

1. Consider only the lines that start with 'From'.

2. For the email pattern, there should be only one @. The part before it should contain alphanumeric characters, underscores, and dots, and it should not start with a dot.

3. The part after the @ can contain alphanumeric characters, underscores, and dots, and it should not end with a dot. Also, in the general case, only one dot should be present.

You can try your regex at the links below; both show what your regex is doing, and you can add different test cases as well.

https://regex101.com/

https://regexr.com/

Hope this helps

Happy coding :)

Warm regards
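
The rules above could be turned into a pattern roughly like the one below. This is only a sketch: the sample lines are made up, the pattern is deliberately simple and not a full email validator, and it does not enforce the "only one dot" suggestion because addresses such as uct.ac.za contain more than one.

```python
import re

# Hypothetical sample lines in the style of the course file.
lines = [
    'From: stephen.marquard@uct.ac.za',
    'Subject: hello',
    'From: louis@media.berkeley.edu',
]

# Group 1: part before @ (must not start with a dot).
# Group 2: part after @ (must not end with a dot).
pattern = r'^From: ([A-Za-z0-9_][A-Za-z0-9_.]*)@([A-Za-z0-9_.]*[A-Za-z0-9_])$'

pairs = []
for line in lines:
    match = re.search(pattern, line)
    if match:
        # group(1) is the user name, group(2) is the domain.
        pairs.append((match.group(1), match.group(2)))

print(pairs)
```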

1 Upvote Share

Rahul Seth

4 years ago

Sir,

Why is it considering the 'a' of 'and' in the program written below?

import re
re.findall("a+", "this is a aa and that is a aaa")

Even if it is finding it, why is it showing "a" and not "and"?

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

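This is because "a+" looks for one or more consecutive "a" characters anywhere in the string, not for whole words. The "a" inside "and" (and inside "that") therefore matches, and only the matched "a" is returned, not the rest of the word.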

Thanks.
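
A quick sketch of the difference, using the same string. The second pattern is an assumption about what was wanted (whole words beginning with 'a'); it is not from the lecture.

```python
import re

text = "this is a aa and that is a aaa"

# a+ returns every run of consecutive 'a' characters,
# including the ones inside 'and' and 'that'.
runs = re.findall("a+", text)
print(runs)

# \b marks a word boundary and \w* continues to the end of the word,
# so this returns whole words that start with 'a'.
words = re.findall(r"\ba\w*", text)
print(words)
```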

Upvote Share

Anuradha .

4 years ago

Hi,

what is the exact meaning of non whitespace here ? Is it the character itself?

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

Good question!

In regular expressions, \s matches a single whitespace character and \S matches a single non-whitespace character, that is, any character other than a space, tab, newline, and so on.

string.whitespace is a pre-initialized string constant. In Python, string.whitespace gives the characters space, tab, linefeed, return, formfeed, and vertical tab.

In Python, isspace() is a built-in string method. It returns True if all characters in the string are whitespace characters (and the string is not empty); otherwise, it returns False.

Thanks.
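
For the regex side of the question, a small sketch with made-up strings: \s picks out the whitespace characters, \S picks out everything else, and \S+ groups consecutive non-whitespace characters.

```python
import re

text = "a b\tc\nd"

# \s: space, tab, newline and other whitespace
whitespace = re.findall(r"\s", text)
print(whitespace)

# \S: any single character that is not whitespace
non_whitespace = re.findall(r"\S", text)
print(non_whitespace)

# \S+: runs of non-whitespace characters, roughly "words"
words = re.findall(r"\S+", "this is  aa")
print(words)
```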

Upvote Share

Anuradha .

4 years ago

Kindly extend my course validity..I had corona that's why couldn't complete the course.

Upvote Share

Jayateertha Rao D

4 years ago

import re

hand=open('mbox-short.txt')
for line in hand:
    line=line.rstrip()
    if re.search('From:',line):
        print(line)

output

No such file or directory: 'mbox-short.txt'

pl help

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

First, check if the file exists in that location. If it does not, please clone our GitHub repository from the link given below the lecture slides.

Thanks.

Upvote Share

Jayateertha Rao D

4 years ago

pl guide how to clone

thanks

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

Please use the following command in your Jupyter notebook:

!git clone https://www.github.com/cloudxlab/ml ~/ml

Thanks.

Upvote Share

Sanny Jain

4 years ago

It is still giving the same message after performing the clone command.

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi Sanny,

First, check if the file exists after the cloning. If it does, provide the complete path to the file.

Thanks.

Upvote Share

Sanny Jain

4 years ago

File does exist after cloning but it is still giving the same error.

Upvote Share

Sanny Jain

4 years ago

path: root folder / ml / python

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

Please share a screenshot of the location of the file, and your code.

Thanks.

Upvote Share

Dhruvang Suthar

4 years ago

first

second

Upvote Share

Vagdevi K

4 years ago

Hi,

There is no screenshot under second. Is there any issue?

Thanks.

Upvote Share

Vipin Sharma

4 years ago

One question - why we applied rstrip in the code below:

for line in hand:
    line = line.rstrip()
    if line.startswith('From:'):
        print(line)

Please advise. Thanks

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

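The rstrip() method removes trailing characters (characters at the end of a string). By default it removes all trailing whitespace, including the newline at the end of each line read from a file. Without it, print() would add a second newline and the output would be double-spaced.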

Thanks.
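
A minimal sketch of what rstrip() does, using a made-up line instead of the course file:

```python
# A line as it typically comes out of a file: it ends with a newline.
raw_line = "From: someone@example.com\n"

# With no argument, rstrip() removes all trailing whitespace,
# including the newline.
stripped = raw_line.rstrip()
print(repr(raw_line))
print(repr(stripped))

# Leading whitespace is left alone; only the right side is trimmed.
padded = "  padded \t\n".rstrip()

# With an argument, only those trailing characters are removed.
custom = "value;;;".rstrip(";")
print(repr(padded), repr(custom))
```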

Upvote Share

Manjushri Thorve - Kapse

4 years ago

# print integer or decimals
import re
x = 'this is 34a5 a aa 35.0 and that is a12 aaa'
re.findall('[0-9]+\\.?[0-9]*',x)

['34', '5', '35.0', '12']

Upvote Share

Bharadwaz Sripada

4 years ago

Hi,
Is it possible to get the jupyter notebook which is being explained in the video?

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

Yes, absolutely! You can get access to these notebooks from our GitHub repository at the below link:

https://github.com/cloudxlab/ml/tree/master/python

Thanks.

1 Upvote Share

Shashwat Verma

4 years ago

Some problem uploading the screenshot. I am pasting the code and the o/p below:

*************************************************************************************************************

import re
# to capture end of line ($)
re.findall("...a$", "this is a aa and aaa test")

Out [5] : [ ]

Upvote Share

Sandeep Akode

4 years ago

I would suggest you look again at what $ does in regex in the slides. Let me know if you still have a doubt.
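
As a hint, here is a small sketch with the same string: $ anchors the match to the end of the string, so "...a$" can only match when the string ends in 'a'. The third call uses a shortened string purely for illustration.

```python
import re

text = "this is a aa and aaa test"

# The string ends with 'test', so nothing ending in 'a' sits at the end.
ends_with_a = re.findall("...a$", text)
print(ends_with_a)

# The last four characters do end with 't'.
ends_with_t = re.findall("...t$", text)
print(ends_with_t)

# If the string itself ends in 'a', the original pattern matches.
shorter = re.findall("...a$", "this is a aa and aaa")
print(shorter)
```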

Upvote Share

Shashwat Verma

4 years ago

Got it Sandeep. Thanks!

Upvote Share

Shashwat Verma

4 years ago

Attaching the screenshot again:

Upvote Share

Shashwat Verma

4 years ago

Hello,

Could you please let me know why I am not getting any value in the o/p, as shown in the screenshot, when using '...$a'?

Upvote Share

Shashwat Verma

4 years ago

What is the difference between 'fhandle' and 'hand' command used for to open and read file? When and in what use case do we use these alternatively? Thanks!

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

These are not commands but variable names. We usually use open() function to open the file.

Thanks.

Upvote Share

Shashwat Verma

4 years ago

Got it.....thanks!

Upvote Share

Shashwat Verma

4 years ago

Hello,

I am unable to open the file. I have tried to open the file using the 2 approaches below, but it did not work. Could you please help locate the file?

hand = open( '../ml/python/mbox-shor.txt')

hand = open( '../mbox-shor.txt')

Upvote Share

Rajtilak Bhattacharjee

4 years ago

Hi,

You need to clone our GitHub repository first to get access to this file. Please use the following command in a console for the same:

git clone https://github.com/cloudxlab/ml

Once done, you will find the file under the following path:

../ml/python/

Please check where you have cloned the file, and then use the path accordingly.

Thanks.

Upvote Share

Shashwat Verma

4 years ago

Done it...Thanks!

Upvote Share

Nagamani Sharmila

5 years ago

hi,

If I try the regular expression 'a+', I get the o/p as expected. But if I try "t+", why is an extra t coming in the o/p? Please find the o/p below for reference.

import re
re.findall('a+','this is a aa and that is a aaa')

o/p ['a', 'aa', 'a', 'a', 'a', 'aaa']

import re
re.findall('t+','this is a aa and that is a aaa')

o/p ['t', 't', 't']

Upvote Share

Sachin Giri

5 years ago

Hi Nagamani,

Because there are two 't' in 'that' with other characters ("ha") in between, the two 't' are returned as separate matches in the output.

Upvote Share

Nagamani Sharmila

5 years ago

understood. Thanks

Upvote Share

BINDU SWETHA PASULURI

5 years ago

Hi, I didn't understand how the ' .* ' operator works. Could you explain how it works? Also, I don't understand why a null character is generated at the end of the list.

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

. means any character, * means zero or more occurrences. Could you tell me more about the null character, maybe share a screenshot or give an example?

Thanks.

Upvote Share

BINDU SWETHA PASULURI

5 years ago

in the below one , the last element of the list is null, why it is being generated ?

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

This is because of the newline character present in the end.

Thanks.

Upvote Share

BINDU SWETHA PASULURI

5 years ago

But in the above example there is no newline charechter at the end of the string?

Upvote Share

Sandeep Akode

5 years ago

* means zero or more occurrences, so .* can also match the empty string. After .* has consumed the whole string, findall tries once more at the end of the string and finds an empty match there. If you want to match the string without the empty string, you can try ".+"
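
A tiny sketch with a made-up string shows the effect:

```python
import re

text = "abc"

# .* first matches 'abc', then matches the empty string at the very end.
star = re.findall(".*", text)
print(star)

# .+ needs at least one character, so the empty match disappears.
plus = re.findall(".+", text)
print(plus)
```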

1 Upvote Share

Mahesh Sharma

5 years ago

Hello,

I am getting the below error as the file 'mbox-short.txt' is not available. Can you assist?

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-13-70ab0e4db102> in <module>
      1 import re
----> 2 hand = open('mbox-short.txt')
      3 for line in hand:
      4     line=line.rstrip()
      5     if re.search('From:',line):

FileNotFoundError: [Errno 2] No such file or directory: 'mbox-short.txt'

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

Please provide the relative path to the file.

Thanks.

Upvote Share

BINDU SWETHA PASULURI

5 years ago

Hi, I too have the same issue. I have the mbox-short.txt file in ml/python, but when I run the code I get an error that no file is found. Could you explain in brief how to specify the path to the file? Thanks in advance.

Upvote Share

Sachin Giri

5 years ago

Hi Bindu,

The default notebooks are in "cloudxlab_jupyter_notebooks", so you have to go one directory up, which gives '../ml/python/mbox-short.txt'.

Upvote Share

BINDU SWETHA PASULURI

5 years ago

ok, thank you

Upvote Share

Nagamani Sharmila

5 years ago

hi,

i tried with the same path you mentioned. i am still facing problem

f=open( '../ml/python/sample2.txt')

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-6-f3f5b2f60237> in <module>
----> 1 f=open( '../ml/python/sample2.txt')

FileNotFoundError: [Errno 2] No such file or directory: '../ml/python/sample2.txt'

Upvote Share

Sachin Giri

5 years ago

Hi Nagamani,

Path might be different for you, please check the path using file browser present in jupyter.

Upvote Share

Nagamani Sharmila

5 years ago

how to use file browser.

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

I just replied to your mail, please check. Also you might want to go over the lectures on relative file paths before trying this.

Thanks.

Upvote Share

kranti sanglam

5 years ago

Hi Raj,

Can U pls expln slide number 31, which is not explained in the video...

similar code as shown in the slide...

## Spam Confidence

import re
hand = open('mbox-short.txt')
numlist = list()
for line in hand:
    line = line.rstrip()
    stuff = re.findall('^X-DSPAM-Confidence: ([0-9.]+)', line)
    if len(stuff) != 1 : continue
    print(line)
    num = float(stuff[0])
    numlist.append(num)
print('Maximum:', max(numlist))
print('AVG:', sum(numlist)/len(numlist))

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

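Here we are opening a file and finding all the lines that start with X-DSPAM-Confidence. The parentheses in the pattern extract the number from each such line; we convert it to a float, collect the values in numlist, and finally print their maximum and average.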

Thanks.
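
To try the same logic without the file, here is a sketch that runs on a few made-up header lines (the values are invented, not taken from mbox-short.txt):

```python
import re

# Hypothetical sample lines standing in for the file contents.
sample_lines = [
    "X-DSPAM-Result: Innocent",
    "X-DSPAM-Confidence: 0.8475",
    "X-DSPAM-Probability: 0.0000",
    "X-DSPAM-Confidence: 0.6178",
]

numlist = []
for line in sample_lines:
    # The parentheses capture only the number after the header name.
    stuff = re.findall(r"^X-DSPAM-Confidence: ([0-9.]+)", line)
    if len(stuff) != 1:
        continue  # skip lines that are not confidence headers
    numlist.append(float(stuff[0]))

maximum = max(numlist)
average = sum(numlist) / len(numlist)
print("Maximum:", maximum)
print("AVG:", average)
```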

1 Upvote Share

kranti sanglam

5 years ago

00:33:50

why not 'a a' is counted/ shown multiple times, pls explain?

00:38:20 to 00:38:40

'a +' & 'a\s' both are not the same, 1st results all number of spaces present after 'a', it can be up to n numbers if such thing present but on the other hand second results 'a' but only single space,

isn't it , pls explain?

Upvote Share

Sachin Giri

5 years ago

Hi Kranti,
At 00:30:50, the 'a' after 'is ' is already counted, and the findall function continues scanning from the next character, so that 'a' won't be counted again.
Please execute the following code to get a better idea:
import re
re.findall(r".\s.","this is a aa and that is aa aaa")

And yes, there is a difference between 'a +' and 'a\s': 'a +' matches 'a' followed by one or more spaces, while 'a\s' matches 'a' followed by exactly one whitespace character.

1 Upvote Share

kranti sanglam

5 years ago

Just tried the code again it skipped the one 'a a'

Upvote Share

Sachin Giri

5 years ago

Hi Kranti,

No, it left nothing out. The single 'a' is already consumed by the match 's a', so the search resumes from the next position, which is the whitespace. re.findall cannot use that whitespace as the middle of a new match, because the character before it has already been consumed, and matches returned by findall never overlap.

1 Upvote Share

kranti sanglam

5 years ago

it means as per u, there are only 7 whitespaces present in the string, m I rt?

Upvote Share

Sachin Giri

5 years ago

Hi Kranti,
No i didn't mean that, to print only white spaces we have to use '\s' only then you would be able to see whitespaces only.

Here we are printing ".\s." which means any character followed by whitespace followed by any character.

Please try executing different variations such as ".\s" , "\s." , "\s" , ".\s.*" .

1 Upvote Share

kranti sanglam

5 years ago

then the purpose of mine to count the whitespaces is failed, and I got the wrong answer,,, I did multiple combination in my notebook, I've this confusion which is not clear yet so I'm dicussing it here. That is , if I want to count the whitespaces in the line I'll get wrong answer if above situation arises.

Upvote Share

Sandeep Giri

5 years ago

Very Good point

1 Upvote Share

Sandeep Giri

5 years ago

If you do this:

import re
re.findall('.\s.', 'this is aa aa and that is aa aaa')

You will get:

['s i', 's a', 'a a', 'a a', 'd t', 't i', 's a', 'a a']

Upvote Share

kranti sanglam

5 years ago

Okay, so this is how it is calculating,,, don't you feel it is an error? Or is it universal like that only, if yes then how can I calculate this missed space?

Upvote Share

kranti sanglam

5 years ago

yeah, this is okay. but in my original string the code is failed... so what's the remedy for counting all the spaces in the original string, any help?

Upvote Share

Sachin Giri

5 years ago

I think this will solve the purpose.

re.findall("\s.*","<string>")

Upvote Share

kranti sanglam

5 years ago

Sorry Sachin,,, this code means different

Upvote Share

kranti sanglam

5 years ago

pic

Upvote Share

Sachin Giri

5 years ago

Hi kranti,

I think the picture shows all the whitespaces. Let me know if i am missing something.

Upvote Share

kranti sanglam

5 years ago

i think , u r missing a lot , pls call me so that we can discuss

Upvote Share

Sandeep Giri

5 years ago

Kranti,

If you are trying to count the number of spaces, you can use the following approach:

import re
len(re.findall('\s', 'this is aa aa and that is aa aaa'))

1 Upvote Share

kranti sanglam

5 years ago

Yeah, Gotcha. Thanks. the result is 8 now, it's okay but in my string, it was 7.

I'm still concerned because whether in our practical cases do we required to count spaces or not, if yes then I've to check my code again that why not it is valid for universal strings, why do we have to change the string to get the desired result or this is the only option to get the desired result or we can do something to get the desired result?

Upvote Share

Rajtilak Bhattacharjee

5 years ago

It depends on what problem you are working on. Yes, at times we do require to count spaces.

Thanks.

Upvote Share

kranti sanglam

5 years ago

Okay, great, so when I'm trying to count the spaces in my string then the code is counting only 7 instead of 8 so here, is this the limitation of the code?

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

Can you share a screenshot of your code which is counting space?

Thanks.

Upvote Share

kranti sanglam

5 years ago

it's already in the long trails of comments : -), well here u go .. the last one , below encircled...

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

Here you are not counting spaces but identifying three-character matches around them. There are better methods for counting spaces, and you can use one of them. For example, you can run a for-loop till the end of the line and count " " characters.

Thanks.
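
For reference, a few ways to count the spaces in the string discussed in this thread. The lookahead variant is an addition for illustration; it shows why '.\s.' comes up one short when matches are not allowed to overlap.

```python
import re

text = 'this is a aa and that is aa aaa'

# Count whitespace characters directly.
by_regex = len(re.findall(r'\s', text))

# Plain string method; counts only the space character.
by_count = text.count(' ')

# '.\s.' consumes the characters around each space, so the space
# after the single 'a' cannot start or join another match.
triples = re.findall(r'.\s.', text)

# A lookahead does not consume characters, so every
# character-space-character triple is reported.
overlapping = re.findall(r'(?=(.\s.))', text)

print(by_regex, by_count, len(triples), len(overlapping))
```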

1 Upvote Share

kranti sanglam

5 years ago

00:27:58 to 00:28:48

in python, we're putting a single slash to escape the meaning of special character so why 2 slashes here, pls explain?

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

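In a normal Python string literal, \\ stands for a single backslash. So, for example, '\\.' reaches the regular expression engine as \. where the backslash removes the special meaning of the dot, and the pattern matches a literal dot.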

Thanks.

Upvote Share

kranti sanglam

5 years ago

please have look of result,, I didn't find any difference,,,,

Upvote Share

kranti sanglam

5 years ago

pic

Upvote Share

kranti sanglam

5 years ago

Raj, any suggestions?

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

Regular expressions use the backslash character ('\') to indicate special forms or to allow special characters to be used without invoking their special meaning. This collides with Python’s usage of the same character for the same purpose in string literals; for example, to match a literal backslash, one might have to write '\\\\' as the pattern string, because the regular expression must be \\, and each backslash must be expressed as \\ inside a regular Python string literal.

The solution is to use Python’s raw string notation for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with 'r'. So r"\n" is a two-character string containing '\' and 'n', while "\n" is a one-character string containing a newline. Usually patterns will be expressed in Python code using this raw string notation.

You can find more from the below link:

https://docs.python.org/3.4/library/re.html

Thanks.
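
A short sketch of the same point in code. The last line reuses the decimal-number pattern posted earlier in this thread, written as a raw string.

```python
import re

# In a normal string literal, "\\." is two characters: a backslash and a dot.
normal = "\\."
# In a raw string, the backslash is kept as typed.
raw = r"\."
same = normal == raw
print(same)

# "\n" is one newline character; r"\n" is a backslash followed by 'n'.
newline_len = len("\n")
raw_len = len(r"\n")
print(newline_len, raw_len)

# With a raw string, a single backslash is enough to escape the dot.
numbers = re.findall(r"[0-9]+\.?[0-9]*", "this is 34a5 a aa 35.0 and that is a12 aaa")
print(numbers)
```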

Upvote Share

kranti sanglam

5 years ago

00:26:57 to 00:27:22

why the code is returning empty line as well as the entire line, pls explain?

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

Every line comprises of that line along with a newline character. This is returning both of them since they are treated as separate lines.

Thanks.

Upvote Share

kranti sanglam

5 years ago

Sorry , didn't get the logic behind it ,, if code has given the entire line the what is the requirement of blank/empty line and I as a coder don't want that which is trash for me,,, can you pls suggest some code to get only line , and a different code to get the empty line separately, don't want together?

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

This is a feature of Python. Also, it is not a blank/empty line. It is a newline character, which makes each line print on its own separate line. Without it, everything would get printed in a single paragraph.

If you want it without that, simply tell your code to ignore the newline character.

Thanks.

1 Upvote Share

kranti sanglam

5 years ago

Okay, I'll try to find that code that tells my code to ignore the newline character, and then I'll try again. Thanks

Upvote Share

Habib Rajbar

5 years ago

I am still facing the problem on reading the txt file created at my files

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-5-82c8dff31b9f> in <module>
      1 import re
----> 2 hand = open('python.txt')
      3 for line in hand:
      4     line = line.rstrip()
      5     if line.find('From:') >= 0:

FileNotFoundError: [Errno 2] No such file or directory: 'python.txt'

Please guide me here.

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

Did you clone our GitHub repository?

Thanks.

Upvote Share

Habib Rajbar

5 years ago

No ,can you let me know how to do it ? if you send instructions i will do it

here

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

Use the below command in your Jupyter notebook:

!git clone https://github.com/cloudxlab/ml

Thanks.

Upvote Share

Abhinav Singh

5 years ago

Hi,

Aren't '*' and '?' doing the same thing?

Thanks.

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

* matches zero or more repetitions of the preceding character, while ? matches zero or one occurrence of it.

Thanks.
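
A small sketch with a made-up string shows where the two differ:

```python
import re

text = 'a ab abb'

# ab*: 'a' followed by any number of 'b' (including none)
star = re.findall('ab*', text)
print(star)

# ab?: 'a' followed by at most one 'b'
question = re.findall('ab?', text)
print(question)
```

The results agree for 'a' and 'ab' and differ only for 'abb', where ? stops after the first 'b'.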

Upvote Share

Dr. Ritu Jain

5 years ago

Why I am not getting ['s i', 's a', 'a a', 'a a', 'd t', 't i', 's a', 'a a'] for the above string search operation?

Upvote Share

Anupam Singh Vishal

5 years ago

Hi Ritu,

The output you're expecting is possible only if matches could share characters, which is not the case. See the second, third and fourth elements in your expected output: 's a', 'a a', 'a a'. They could all appear only if the 'a' between 'is' and 'aa' in the string were matched twice. The same applies to the 'a' between 'is' and 'aaa' at the end.

Hope this clears your doubt.

Thanks

Upvote Share

Govind Raj

5 years ago

Hi, please clone the Github repository for my access, as i could not access the files such as mbox-short.txt

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

Please use the following command in your Jupyter notebook and not the console to clone the repository:

!git clone https://github.com/cloudxlab/ml

Thanks.

Upvote Share

apratimkumar pandey

5 years ago

why i am getting 't i' in the output. if the whole sentence is having only two words with 't' ?

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

The regular expression you are using matches characters, not whole words, and . also matches a space. So in "...that is..." the last 't' of 'that', the space, and the 'i' of 'is' together match as 't i'.

Thanks.

Upvote Share

DORAISWAMY R HARSHAVARDHAN

5 years ago

In re.findall('.\s.','this is a aa and that is a aaa'). Why didn't it return (a aaa)??

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

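This is because the match 's a' (from 'is a') already consumes the letter 'a', and findall returns non-overlapping matches. The search resumes at the space before 'aaa', and from there no character-whitespace-character sequence remains.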

Thanks.

Upvote Share

Vishal Bachchan

5 years ago

Hi Team,

I am not able to find the git hub location and not able to pull the slides. please support.

regards.

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

Below is the link to our repository:

https://github.com/cloudxlab/ml/tree/master/python

Thanks.

Upvote Share

Kartik Enumula

5 years ago

Hello,

How do I get the mbox-short.txt file? I do not see it in my folder

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

These files are located at our GitHub repository, you need to clone them using the following command in your web console:

git clone https://github.com/cloudxlab/ml

Thanks.

Upvote Share

Dr. Ritu Jain

5 years ago

I have enrolled in Course on Machine Learning Specialization. But, in my server tab, I can see only one folder i.e., cloudxlab_jupyter_notebooks. I am not able to find the other folders as well as files (e.g. mbox-short.txt, puthon_sample_file) as shown in the video . Kindly look into the matter.

My Lab username is ritujainmca8920.

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

Those files and folders are inside that folder. You can check using the ls command in web console.

Let me know if you still face any challenges.

Thanks.

Upvote Share

Dr. Ritu Jain

5 years ago

Dear Sir,

I have checked the folder cloudxlab_jupyter_notebooks. I have checked using ls command in web console also.

For your reference, I am pasting the screenshots. Kindly help me.

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

The files that you are looking for are available in our GitHub repository, and I have cloned it for you. You will find them in the cloudxlab_jupyter_notebooks/ml/python folder. Please check.

Thanks.

Upvote Share

Aditya Goyat

5 years ago

In slides 8 and 9, why use re.search() instead of line.startswith()?

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

For a fixed prefix the two give the same result. re.search() is used so that the prefix can be a regular expression pattern rather than only a literal string.

Thanks.
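
A short sketch of the difference follows. The sample lines and the '^X-\S+:' pattern are assumptions made up for illustration; the slides themselves are not reproduced here.

```python
import re

# Hypothetical sample lines, loosely in the style of a mail header.
lines = [
    "From: alice@example.com",
    "X-DSPAM-Confidence: 0.8475",
    "X-Plane is fast",
    "Subject: From: here",
]

# Fixed prefix: startswith() and an anchored regex agree.
by_method = [l for l in lines if l.startswith("From:")]
by_regex = [l for l in lines if re.search(r'^From:', l)]
print(by_method == by_regex)   # True

# Pattern prefix: 'X-', then one or more non-space characters, then ':'.
# startswith() cannot express this with a single literal prefix.
header_like = [l for l in lines if re.search(r'^X-\S+:', l)]
print(header_like)   # ['X-DSPAM-Confidence: 0.8475']
```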

Upvote Share

Sagan Gupta

5 years ago

Do the expressions only return the value in the parentheses ()?

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

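It depends on the expression. With re.findall(), if the pattern contains parentheses (a capturing group), only the text matched inside the parentheses is returned. If the pattern has no parentheses, the whole match is returned.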

Thanks.
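
As a small illustration of how parentheses change what re.findall() returns (the sample text is made up for this sketch):

```python
import re

text = "x=10, y=20"   # hypothetical sample text

# No group: the whole match is returned.
whole = re.findall(r'\w=\d+', text)
print(whole)      # ['x=10', 'y=20']

# One group: only the part inside the parentheses is returned.
one_group = re.findall(r'\w=(\d+)', text)
print(one_group)  # ['10', '20']

# Two groups: a tuple per match, one item per group.
two_groups = re.findall(r'(\w)=(\d+)', text)
print(two_groups) # [('x', '10'), ('y', '20')]

# With re.search(), group(0) is the whole match and group(1) the first group.
m = re.search(r'\w=(\d+)', text)
print(m.group(0), m.group(1))  # x=10 10
```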

Upvote Share

Sagan Gupta

5 years ago

'@([^ ]*)' I didn't understand this expression

Upvote Share

Anupam Singh Vishal

5 years ago

Hi Sagan,

'@' - matches a literal @; this is typically used when extracting email addresses.

'(' and ')' - mark the beginning and end of the part of the match to extract.

'[^ ]' - matches any single character except a space.

'*' - repeats the preceding item zero or more times.

So in total the expression @([^ ]*) means: find an @ character, then extract everything after it up to (but not including) the next space or the end of the line.

Hope this clears your doubt.

Thanks
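
The breakdown above can be tried out with a short sketch. The sample line is an assumed example in the style of the lines in mbox-short.txt, not a quotation checked against the file.

```python
import re

# Assumed sample line in the style of mbox-short.txt.
line = "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008"

# With parentheses: only the part after '@' is extracted.
host = re.findall(r'@([^ ]*)', line)
print(host)        # ['uct.ac.za']

# Without parentheses: the '@' is part of the result.
with_at = re.findall(r'@[^ ]*', line)
print(with_at)     # ['@uct.ac.za']

# '*' allows zero repetitions, so '@' followed directly by a space
# still matches and yields an empty string.
empty = re.findall(r'@([^ ]*)', 'a@ b')
print(empty)       # ['']
```

If an empty result is not wanted, replacing * with + requires at least one non-space character after the @.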

Upvote Share

Prasenjit Basu

5 years ago

I am not able to log in to the Lab. After every attempt, in every possible way, it says invalid username and password. Please help as early as possible. I am wasting my lab hours as well as my time.

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

I already replied to your mail, please check.

Thanks.

Upvote Share

Amarnath Chakladar

5 years ago

Not able to sign in to Jupyter notebook.

Upvote Share

Rajtilak Bhattacharjee

5 years ago

Hi,

Your Jupyter notebook is working fine. Please type in the password instead of copy-pasting it.

Thanks.

Upvote Share

CloudxLab

5 years ago

Hi Harmeet,

It was nice talking to you. Thank you for all your feedback. As discussed, if you have any issues with the topic of the course, you can reach out to us either by commenting here, or using the email id or at the discussion forum listed below:

https://discuss.cloudxlab.com/
reachus@cloudxlab.com

Thanks and happy learning!

-- Rajtilak Bhattacharjee

Upvote Share

Harmeet Randhawa

5 years ago

The professor is struggling to make the concepts understood. I didn't get it after the whole lecture. Really disappointed.

Upvote Share

CloudxLab

5 years ago

Hi,

If you can point out which part of the lecture you were unable to understand, we can clarify the same for you.

Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Harmeet Randhawa

5 years ago

I didn't get a single concept concretely. I searched on YouTube and got better explanations.

Upvote Share

CloudxLab

5 years ago

Hi,

Would request you to share your contact details by mailing us so that we can understand your issues better.

Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Sagan Krishna Gupta

5 years ago

My Jupyter is not working. It shows a message that:
jupyter.f.cloudxlab.com took too long to respond.
And I am getting that frequently.
Please help.

Upvote Share

Sachin Giri

5 years ago

Hi Sagan,
Sorry for the inconvenience. We were upgrading the machine and there was downtime for a few minutes.
It should be working fine now.

Upvote Share

Sagan Krishna Gupta

5 years ago

I did not understand the * expression

Upvote Share

CloudxLab

5 years ago

Hi,

You can find an explanation of all the regex operators at the link below:
https://docs.python.org/3/l...

Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Rakhi Bajpayi

5 years ago

I couldn't find the file "mbox-short.txt". Also, I couldn't download the slides and your help file for copying code.

Upvote Share

CloudxLab

5 years ago

Hi,

The file "mbox-short.txt" is present in /cxldata/datasets/project/

For slides, you can click on the pop-up button appearing on the top-right of the slide content.

I hope this helps. For any further queries feel free to contact us.

Thanks.

-- Mayank Sharma

Upvote Share

Shipra Kumari

5 years ago

I am not getting mbox-short.txt in file list, how to find it?

Upvote Share

CloudxLab

5 years ago

Hi Shipra,
It is present in /cxldata/datasets/project/

-- Sachin Giri

Upvote Share

Rachana Sharma

5 years ago

print(re.findall('@([^ ]*)',line)) How does this regular expression work?

Upvote Share

CloudxLab

5 years ago

Rachana,

I suggest you try this command in Jupyter for different values of line. Or continue the conversation on discuss.cloudxlab.com.

Praveen

-- Praveen Pavithran

Upvote Share

Souvik Biswas

5 years ago

Hello,
I'm running this code but no output is coming. Please help me.

Upvote Share

CloudxLab

5 years ago

Hi,

Would request you to let us know which assessment you are trying to attempt, also you need to mention the path of the file.

Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Souvik Biswas

5 years ago

Hello,
This is Regular Expressions session. The path of the file is given below

https://jupyter.f.cloudxlab...

Upvote Share

Souvik Biswas

5 years ago

I have uploaded this file ('mbox-short.txt') to my Jupyter notebook and then run this code, but no result is found.

Upvote Share

CloudxLab

5 years ago

Hi,

Can you try this path while opening the file:

'../cloudxlab_jupyter_notebooks/ml/python/mbox-short.txt'

Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Souvik Biswas

5 years ago

Hello Sir,
I tried this but the same problem occurred.
'../cloudxlab_jupyter_notebooks/ml/python/mbox-short.txt'

Upvote Share

CloudxLab

5 years ago

Hi Souvik,

From your previous screenshot I can see that the file is in the home folder. Could you please set the correct path to the file? You can use the pwd command to check your current directory.
Thanks.

-- Rajtilak Bhattacharjee
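
For anyone hitting the same FileNotFoundError, a minimal sketch of checking the working directory and a few candidate locations from inside the notebook is given below. The helper name first_existing is hypothetical, and the candidate paths are simply the ones mentioned in this thread; your own location may differ.

```python
from pathlib import Path


def first_existing(candidates):
    """Return the first candidate that exists as a file, or None."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file():
            return path
    return None


# Locations mentioned in this thread; adjust them to your own setup.
candidates = [
    "mbox-short.txt",
    "../cloudxlab_jupyter_notebooks/ml/python/mbox-short.txt",
    "/cxldata/datasets/project/mbox-short.txt",
]

# Relative paths are resolved against this directory (same idea as pwd).
print("Working directory:", Path.cwd())

found = first_existing(candidates)
if found is None:
    print("mbox-short.txt was not found in any candidate location")
else:
    with open(found) as hand:
        print("Opened", found, "- first line:", hand.readline().rstrip())
```

A relative name such as 'mbox-short.txt' only works when the notebook's working directory is the folder that contains the file.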

Upvote Share

Harsh Bhagat

5 years ago

Sir, I am not able to run code in the Jupyter notebook.
What can I do to run it?

Upvote Share

CloudxLab

5 years ago

Hi,

I am so sorry to hear that. Could you please tell us whether you are getting any errors, or are you not able to open a Jupyter notebook? Do share a screenshot if possible.

Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Ashutosh Mishra

5 years ago

The file is present in Files, yet I get this error:

---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-20-6b7c8beec896> in <module>
----> 1 hand = open('mbox-short.txt')

FileNotFoundError: [Errno 2] No such file or directory: 'mbox-short.txt'

Upvote Share

CloudxLab

5 years ago

Hi,

Could you please share a screenshot of your code and the error that you are getting.

Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Ashutosh Mishra

5 years ago

Not able to see python getting started file in github.

Upvote Share

CloudxLab

5 years ago

Hi,

That would be the Python - Part I notebook in our GitHub repository.

Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Shivam Srivastava

5 years ago

Hello sir,
There is one extra "t" in the output, and the example a few seconds before this one also had an extra "t". Why is that?
Can you please help me out with this?
Thanks

Upvote Share

CloudxLab

5 years ago

Hi,

Can you tell me which findall code you are talking about in this example?
Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Shivam Srivastava

5 years ago

In output 30 of the image shared sir.

Upvote Share

CloudxLab

5 years ago

Hi,

In output 30 there are three codes, could you please tell me which code you are referring to?

Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Shivam Srivastava

5 years ago

Upvote Share

CloudxLab

5 years ago

Hi,

The last t comes from the third findall. The expression there is t. which matches a 't' followed by any single character except a line break.

Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Mohini Singhal

5 years ago

import re

# Print every line of the file that contains 'New'
hand = open('radha.txt')
for line in hand:
    line = line.rstrip()
    if re.search('New', line):
        print(line)
When I run this code, I get every line that contains New, not only the lines starting with New.
Can you tell me why?

Upvote Share

CloudxLab

5 years ago

Hi Mohini,

Could you please tell me which assignment this is a part of?

Thanks.

-- Rajtilak Bhattacharjee

Upvote Share

Mohini Singhal

5 years ago

Sir, this is part of the regular expressions session.

Upvote Share

CloudxLab

5 years ago

Hi, Mohini.

If you want to find only the lines that start with New, you can either anchor the pattern with ^, as in re.search('^New', line), or use the startswith() method available on Python strings.

This is the syntax.
str.startswith(prefix[, start[, end]])

re.search() returns a match object if the pattern is found anywhere in the string, regardless of its position.

All the best!

-- Satyajit Das
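
A quick sketch of the difference, using made-up lines held in a list instead of the contents of radha.txt:

```python
import re

# Hypothetical lines standing in for the contents of the file.
lines = [
    "New York is large",
    "I moved to New Delhi",
    "Newton was a physicist",
    "Old news",
]

# re.search() without an anchor finds 'New' anywhere in the line.
anywhere = [l for l in lines if re.search('New', l)]
print(anywhere)
# ['New York is large', 'I moved to New Delhi', 'Newton was a physicist']

# Three ways to keep only the lines that begin with 'New'.
anchored = [l for l in lines if re.search('^New', l)]
matched = [l for l in lines if re.match('New', l)]    # match() starts at position 0
prefixed = [l for l in lines if l.startswith('New')]
print(anchored)
# ['New York is large', 'Newton was a physicist']
print(anchored == matched == prefixed)   # True
```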

Upvote Share

Shubham Purohit

5 years ago

Hello sir,

How can I access the files that you uploaded on GitHub?

Upvote Share

Rachit Shah

5 years ago

at 30:30 in the video , Just a suggestion:
this can be used
import re

re.findall("^[a-z]+", "this is a aa and that is a aaa")

instead of

re.findall("^t.....", "this is a aa and that is a aaa")

to find the first word in the line :)
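
A small sketch comparing the two patterns on the sentence from the session. The last two calls use a made-up input to show one limitation of [a-z], with \w+ as a possible alternative; that alternative is an addition and not part of the original suggestion.

```python
import re

s = "this is a aa and that is a aaa"

# One or more lowercase letters at the start of the string: the first word.
first_word = re.findall("^[a-z]+", s)
print(first_word)      # ['this']

# A 't' at the start followed by exactly five more characters.
fixed_width = re.findall("^t.....", s)
print(fixed_width)     # ['this i']

# [a-z] does not match uppercase letters, so a capitalised first word is missed.
capitalised = re.findall("^[a-z]+", "Hello world")
print(capitalised)     # []

# \w+ (letters, digits and underscore) handles that case.
with_w = re.findall(r"^\w+", "Hello world")
print(with_w)          # ['Hello']
```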

Upvote Share

Ritesh Singh

5 years ago

Hi, I am getting an error while opening a new Python 3 notebook:
"Unexpected error while saving file: Untitled8.ipynb disk I/O error"

Upvote Share

Sachin Giri

5 years ago

Hi Ritesh,
Are you still facing the error?
If yes, can you please post a screenshot?

Upvote Share

Pruthvi Chaitanya Chinnapillai

5 years ago

The sessions look pretty old, around 2 years. Can you please put the latest videos and notebooks on GitHub?

Upvote Share

Manish Bhoge

5 years ago

Hi, is it possible to send the slides of the Python programming course?

Upvote Share

Anuj Singh

5 years ago

is it possible to download the slide?

Upvote Share

Satyajit Das

5 years ago

Hi, Anuj.

As of now the Python slides are not available for download; we are working on this.
But you can always download the .ipynb file from our GitHub repository here: https://github.com/cloudxla...

All the best!

Upvote Share

Prashant Borkar

5 years ago

There is no mbox-short.txt file present in the Files section.
Where can I get this file?

Upvote Share

Cheshta Pandita

5 years ago

Hi, I am also facing the same problem.
Please could you let me know where to find this file?

Upvote Share

Bharath V N

5 years ago

The file is in the GitHub repository. You can clone the entire repository or download the zip file.
1. To download the zip file, go to the Linux console and enter the command
wget https://github.com/cloudxla...
2. Unzip the file with the command unzip master.zip
3. You can find mbox-short.txt inside the python folder.

Upvote Share

Swaraj

5 years ago

Thank You !!

Upvote Share

Aamir Karim

6 years ago

I saved a file in my root directory in Jupyter but I cannot access it. How do I access it to perform RE operations on it?

Upvote Share

Tom T

6 years ago

Your Jupyter files are stored under /cloudxlab_jupyter_notebooks. You can navigate to that location from the Files tab. There should be some .ipynb files saved under that directory. Please check whether your files are present in that location.

Upvote Share

Priya Ranjan

6 years ago

re.findall('[0-9]+.[0.9]+', "i am 19.5 and u r 56.8 kg in weight")
output is:
Out[59]:
['19.', '56.']
expected is: ['19.5', '56.8']

Upvote Share

Tom T

6 years ago

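The second character class has a typo: [0.9] matches only the characters 0, . and 9, whereas [0-9] matches any digit. Try this:
re.findall(r'[0-9]+\.[0-9]+', "i am 19.5 and u r 56.8 kg in weight")
The dot is escaped as \. so that it matches only a literal decimal point. That should give you the expected results that you mentioned.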
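
For comparison, here is a short sketch that runs the pattern from the question next to a stricter one with an escaped dot. The "room 12a3" input is made up to show why escaping the dot matters.

```python
import re

text = "i am 19.5 and u r 56.8 kg in weight"

# Pattern from the question: [0.9] is a set of three characters (0, ., 9),
# not the digit range 0-9.
from_question = re.findall('[0-9]+.[0.9]+', text)
print(from_question)   # ['19.', '56.']

# Digits, a literal dot, then digits.
strict = re.findall(r'[0-9]+\.[0-9]+', text)
print(strict)          # ['19.5', '56.8']

# An unescaped dot matches any character, not only a decimal point.
loose = re.findall('[0-9]+.[0-9]', "room 12a3")
escaped = re.findall(r'[0-9]+\.[0-9]', "room 12a3")
print(loose, escaped)  # ['12a3'] []
```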

Upvote Share
