

Python Programming - Project - Part 1

Objective: The aim of this project is to extract all the emails from your inbox.

The end goal is to be able to find and query emails. Most of the work lies in extracting the emails from Gmail (or any other service provider).

For this example, we are going to use GYB (https://github.com/jay0lee/got-your-back). You are free to use any other tool to download the data. Please note that this will download your emails to your CloudxLab home directory, so use a non-sensitive email account or create a temporary one.

INSTRUCTIONS
  • Follow the installation instructions given here: https://github.com/jay0lee/got-your-back. The install command is:

    bash <(curl -s -S -L https://git.io/gyb-install)

  • The emails get downloaded to a folder called /home/sandeepgiri9034/GYB-GMail-Backup-YOUREMAILID

  • Write a function having the following signature:

    def listemails(path, select, where, matches):
        return [(field1, field2), (field1, field2)]

Description:

path is the folder containing the emails. In my case, it was /home/sandeepgiri9034/GYB-GMail-Backup-YOUREMAILID

select is a list of field names to select, e.g. ['to', 'from', 'subject']. This parameter cannot be empty or None.

where is a single string naming the field to be matched, such as 'to', 'subject' or 'from'. If this parameter is empty or None, don't apply any filter; return the select fields of every email.

matches is the regular expression that the field named in where should match.
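
The where/matches rule can be sketched as a small helper. This is only an illustration, not the required solution: the name passes_filter and the dict of already-extracted fields are hypothetical, and it assumes re.search (a match anywhere in the value) is the intended meaning of "match".

```python
import re

def passes_filter(fields, where, matches):
    """Return True if an email should be kept.

    fields  -- dict mapping a field name (e.g. 'subject') to its text value
    where   -- field name to test, or None/'' to keep every email
    matches -- regular expression applied to the value of fields[where]
    """
    if not where:
        # No filter requested: every email passes.
        return True
    # A missing field is treated as an empty string.
    value = fields.get(where) or ''
    # Assumption: re.search is used, so the pattern may match anywhere.
    return re.search(matches, value) is not None
```

If the whole field has to match the pattern, re.fullmatch could be swapped in for re.search.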

This function should return a list of tuples, where each tuple holds the values of the fields named in the select parameter.

Example function call is:

listemails('/home/sandeepgiri9034/GYB-GMail-Backup-sandeepg12817@gmail.com', select=['to', 'from', 'subject'], where='subject', matches='.+')

This should return a list of tuples where each tuple has three values, 'to', 'from' and 'subject', taken from each email whose subject matches '.+' (which means at least one character).
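
To build such a list, the function first has to visit every email file under path. A minimal sketch follows, assuming the backup stores each message as a .eml file in nested year/month/day folders, as the example path further down suggests. The helper name iter_eml_files is hypothetical.

```python
import os

def iter_eml_files(path):
    """Yield the full path of every .eml file found under path."""
    for dirpath, dirnames, filenames in os.walk(path):
        # Sorting in place makes os.walk visit sub-folders in a stable order.
        dirnames.sort()
        for name in sorted(filenames):
            if name.lower().endswith('.eml'):
                yield os.path.join(dirpath, name)
```

Any other files the backup tool leaves in the folder are skipped because only the .eml extension is accepted.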

You can use the eml_parser module. Here is an example use:

import eml_parser
with open('/home/sandeepgiri9034/GYB-GMail-Backup-sandeepg12817@gmail.com/2014/10/1/148cdc0e51a2d5fe.eml', 'rb') as fhdl:
    raw_email = fhdl.read()

parsed_eml = eml_parser.eml_parser.decode_email_b(raw_email)
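
If eml_parser is not available, the header values can also be read with Python's standard email package. The sketch below is an alternative under that assumption, not the eml_parser API. The helper name select_fields and the sample message are made up for illustration.

```python
from email import message_from_bytes, policy

def select_fields(raw_email, select):
    """Return a tuple with the header values named in select, in that order."""
    if not select:
        raise ValueError('select must be a non-empty list of field names')
    msg = message_from_bytes(raw_email, policy=policy.default)
    # Header lookup is case-insensitive; a missing header gives None,
    # which is replaced by an empty string here.
    return tuple(str(msg[name] or '') for name in select)

# A made-up message standing in for the bytes read from a .eml file.
raw = (b'From: alice@example.com\r\n'
       b'To: bob@example.com\r\n'
       b'Subject: Hello\r\n'
       b'\r\n'
       b'Body text\r\n')
print(select_fields(raw, ['to', 'from', 'subject']))
```

Each tuple produced this way has the shape listemails is expected to return for one email.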

Once you have completed both Project1 and Project2 (the next one), please send your submissions to reachus@cloudxlab.com.

Note: Only those who complete these two Projects and all the exercises of this course will be eligible to get the course completion certificate from CloudxLab.