
You are to write code that will extract text from the URL http://www.gutenberg.org/cache/epub/1041/pg1041.txt, process it using regular expressions, and print out a JSON-formatted string. Below is an example of some of the JSON, and the complete JSON is available at https://it630.netlify.app/sonnets.json. You are to use regular expressions to extract the title, author, last_update, and the sonnet text.


import json
import re

import requests

# Download the Project Gutenberg text.
r = requests.get('http://www.gutenberg.org/cache/epub/1041/pg1041.txt')

# Placeholder for the sonnets: for now the list holds a single empty dict.
list1 = []
dict1 = {}
dictionary_copy = dict1.copy()
list1.append(dictionary_copy)

# Header fields. The trailing \r assumes the file uses CRLF line endings.
match = re.search(r'Title:\s(.*)\r', r.text)
match2 = re.search(r'Author:\s(.*)\r', r.text)
match3 = re.search(r'Last Updated:\s(.*)\r', r.text)
#match4 = re.search(r'',r.text)  # sonnet pattern still to be written

response = {
    "title": match.groups()[0],
    "author": match2.groups()[0],
    'last updated': match3.groups()[0],
    "sonnets": list1,
}
print(json.dumps(response))
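The attempt above leaves the sonnet pattern (match4) empty. What follows is a minimal sketch of one way to fill that gap, not a confirmed solution. It assumes each sonnet in the file is headed by a Roman numeral alone on its own line, and that the license footer begins with a line starting "*** END OF"; both should be checked against the actual text. The function name parse_sonnets and the "number"/"lines" keys are hypothetical, so adjust them to match the structure in sonnets.json.

```python
import json
import re

# Assumption: a sonnet heading is a Roman numeral alone on a line,
# possibly indented and possibly followed by a period.
ROMAN_HEADING = re.compile(r'^[ \t]*([IVXLC]+)\.?[ \t]*$', re.MULTILINE)

# Assumption: the Project Gutenberg footer starts with "*** END OF".
END_MARKER = re.compile(r'^\*\*\* ?END OF', re.MULTILINE)


def parse_sonnets(text):
    """Split the text into sonnets keyed by their Roman-numeral heading."""
    text = text.replace('\r\n', '\n')  # normalise CRLF line endings

    # Drop the footer so it is not swallowed into the last sonnet.
    end = END_MARKER.search(text)
    if end:
        text = text[:end.start()]

    headings = list(ROMAN_HEADING.finditer(text))
    sonnets = []
    for i, heading in enumerate(headings):
        # A sonnet's body runs from its heading to the next heading (or the end).
        stop = headings[i + 1].start() if i + 1 < len(headings) else len(text)
        body = text[heading.end():stop]
        lines = [ln.strip() for ln in body.split('\n') if ln.strip()]
        sonnets.append({"number": heading.group(1), "lines": lines})
    return sonnets


# Small made-up sample in the assumed layout (not the real file contents).
sample = (
    "Title: Example\r\n\r\n"
    "  I\r\n\r\n  First line,\r\n  Second line.\r\n\r\n"
    "  II\r\n\r\n  Third line.\r\n"
    "*** END OF THE EBOOK ***\r\nfooter text\r\n"
)
print(json.dumps(parse_sonnets(sample), indent=2))
```

To use it with the attempt above, pass r.text to parse_sonnets and put the result under the "sonnets" key in place of list1. Working on a plain string rather than on the HTTP response keeps the parsing testable without a network call.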
Oct 16, 2021