Description:In this course, you have worked with relational and non-relational databases and learned the difference between them with regard to design, structure, and how you access data from them. In this final project you will develop a python program that will query the data from the Classic Models database you worked with in module 13, and convert all of its data into to be stored in a MongoDB database. You will then have to answer the queries asked in module 13 along with a few more.
Purpose:To gain further experience building software applications that interact with databases and to demonstrate knowledge of the differences between relational and non-relational databases.
Challenge Tasks
- Review the schema for the Classic ModelDatabase and develop a strategy to query the data then import it into MongoDB database.
- Develop two python programs:
- The first program will connect to your MySQL and MongoDB servers to query and export the SQL data and insert it into your classicmodels MongoDB database. Your program will also write the MongoDB data to four JSON files called "[collection].json", one fore each collection.
- The second program is where you will implement the queries for the classicmodels MongoDB database.
- Package the python files, images, and json files to a zip file and submit.
Step 1. Understanding the Non Relational Schema for the MongoDB Classic Models Database
Your MongoDB Classic Models database will have four (4) collections.
- Employees
- Customers
- Products
- Orders
The data from your Mongo Database collections will map to your SQL tables in the following manner:
- employees collection
- employees table: All columns except officeCode. employeeNumber will be the _id field.
- offices table: All columns will appear in a nested document in a field called “office”.
{ "employeeNumber": , "lastName": , "firstName": , "extension": , "email" : , "reportsTo": , "jobTitle": , "office": { "officeCode": , "city":, "phone":, "addressLine1":, "addressLine2":, "state":, "country":, "postalCode":, "territory":, } }
- customer collection
- customers table: All columns from the customers table except salesRepEmployeeNumber. customerNumber will be the _id field.
- payments table: All columns except customerNumber will appear in a nested documents in a field called “payments”.
Note that a customer can have multiple payments
{ "customerNumber": , "customerName": , "contactLastName" :, "contactFirstName": , "phone": , "addressLine1" : , "addressLine2" : , "city": , "state": , "postalCode": , "country": , "salesRepEmployeeNumber": , "creditLimit": , "payments:{[ { "checkNumber": , "paymentDate": , "amount" }, { "checkNumber": , "paymentDate": , "amount": , }] } }
- products collection
- products table: All columns from the products table except productLine. productCode will be the _id field.
- productlines table: All columns will appear in a nested document in a field called “productLine”. The value from the productLine column will be held in a field called "line" in the nested document.
{ "productCode": , "productName": , "productScale": , "productVendor": , "productDescription": , "quantityInStock": , "buyPrice": , "MSRP": , "productLine": [{
"line":, "textDescription": , "htmlDescription": , "image": , }] }
- orders collection
- orders table: All columns from the orders table but customerNumber will be replaced with the customer’s first and last names. orderNumber will be the _id field
- orderDetails: all columns from orderDetails will appear in a nested document with the field name “orderDetails”. productCode will be replaced with productName.
- products table: productName
- customer table: firstName and lastName
- employees table: firstName and lastName
{ orderNumber: , orderDate: , requiredDate: , shippeddDate: , status: , comments: , customerName: , employeeName: , orderDetais: { productName: , quantityOrdered: , priceEach: , } }
Step 2. Set Up Your Environment
Create a folder called “finalproject” and a sub folder called images in your environments folder on you DigitalOcean server. From your environments folder type the following commands
mkdir finalproject
mkdir finalproject/images
You should create your python files in your finalproject folder.
Step 3. Plan and Develop Your Queries
You will need to develop queries to pull data from the SQL database to write into JSON format based on the document format above. Your queries will need to connect tables in the database. This will be accomplished:
- in WHERE Clauses, and
- using JOINS
Step 4. Write Your Python Code and Get Help Along the Way
Incrementally build and develop your code. My suggestion is to begin with the employees and products collections first since they are the simplest. You can then move on to the customer and orders collections.
As you are developing this project reach out to the PLAs or the professor for assictance.
Creating the Employees Collection
The code below provides an example of how to query data from your MySQL database and then import that data into a MongoDB database. The code queries the MySQL classicmodels database, creates MongoDB documents using python dictionaries, and imports the documents into the MongoDB finalproject database. You may use the code as a guide and template.
finalproject.py
#import mysql connector and pygal import mysql.connector import pymongo import pygal import json ##################### SETUP DATABASE CONNECTIONS ############### #connect to your mysql datbase mydb = mysql.connector.connect( host="localhost", user="user", passwd="pass", database="classicmodels" ) #connect to the mongo database mongo_client = pymongo.MongoClient(host = "mongodb://localhost:27017/", serverSelectionTimeoutMS = 3000, username="user", password='pass') #Select mongo datacbase and collections mongo_db = mongo_client["finalproject"] employee_collection = mongo_db["employees"] #The cursor will execute queries on your MySql Datbase mycursor = mydb.cursor() ################# QUERIES TO POPULATE EMPLOYEE COLLECTION #################### #sql query generates a list of managers. People who have other employees report to them sql_query = '''select employeeNumber, firstName, lastName from employees where employeeNumber IN (select reportsTo from employees);''' #Execute the query mycursor.execute(sql_query) #Get the query result query_result = mycursor.fetchall() #Create managers dictionary managers = {} '''query_result is a list of tuples. The loop below converts the list of tuples into a dictionary where the dictionary key is the employeeNumber and the value is the first and last name''' for result in query_result: managers[result[0]] = result[1] + " " + result[2] #Select all data from the employees and offices tables sql_query = '''SELECT * FROM employees e, offices o WHERE e.officeCode = o.officeCode;''' #Execute the query mycursor.execute(sql_query) #Get the query results query_result = mycursor.fetchall() #################### WRITE DOCUMENTS TO INSERT INTO EMPLOYEE COLLECTIONS ################ #list to store employee documents employees = [] #loop through tuples in query_result. Write data to json format to store in our json file for record in query_result: office_document = { "officeCode": record[8], "city": record[9], "phone": record[10], "addressLine1": record[11], "addressLine2": record[12], "state": record[13], "country" : record[14], "postalCode": record[15], "territory": record[16] }
'''for employees who don't report to anyone, set the reportsTo value to N/A
for other employees with a manager get the manager name from the managers dictionary
using the the employeeNumber as the dictionary key''' if record[6] == None: manager = "N/A" else: manager = managers[int(record[6])] employee_document = { "_id": record[0], "lastName": record[1], "firstName": record[2], "extension": record[3], "email": record[4], "reportsTo": manager, "jobTitle": record[7], "office": office_document } employees.append(employee_document) #insert employee documents into the employees collection in the mongo database employee_collection.drop() employee_collection.insert_many(employees) #write to mongo formated data to a json file json_file = open("employees.json", "w") #open the file json_file.write("[\n") #write the opening bracket forthe list counter = 1 #use counter to determine id the last document is being written #loop through the list of employee documents for employee_doc in employees: json_file.write(json.dumps(employee_doc)) #convert a python dictionary to a json object #write a comma after the document if it is not the last in the list if counter != len(employees): json_file.write(",\n") counter += 1 else: #don't write a comma after the document if it is the last in the list json_file.write("\n") json_file.write("]\n") json_file.close() print("\nScript executed successfully")
Step 5. Questions
You should create charts for the following questions
- Create a bar chart to display the number of customers each employee has
- Create a bar chart to show the top 10 products by name that appear in the most orders. Not units ordered, but number of orders. For example, if product A had orders of 5, 5, and 10 units, that’s three orders because three orders were placed.
- Create a bar chart to show the top 10 products by name with have the highest dollar value in orders?
- Create a Bar Chart showing the order numbers with the 10 highest totals?
- Create a horizontal bar graph showing which customers (by name) placed the most orders.
- Create a pie chart showing payments collected by each year
- Create a bar graph showing payments collected each by month in 2004
- Create a line chart showing payments received by day in December 2004. Note, there can be multiple payments received in a day
- Create a bar graph showing the names of the 10 customers Classic Model Company has collected the most payments from.
- Create a pie chart showing customers by state
- Create a bar chart showing which employees manages the most employees
- Show the top 7 employee names (first and last) who have taken the most orders
- Create a bar chart to display total payments received by country.
- Display in a bar chart the employee names and the total dollar value of orders they accounted for.
You do not need to create charts for the following questions. You only need to develop the queries and submit them in a file named project-queries.js
- How many employees report Mary Patterson and William Patterson?
- Calculate the total payments made by Marseille Mini Autos
- Show the dollar value (buyPrice * quantityInStock)of each product in inventory
- Show the total dollar value (buyPrice * quantityInStock)of all products in inventory
- Dislpay the customer name and the contact person's first and last name for customers with total payments of $100,000 or greater
- Display the customer name, employee name, and order amount for orders of $30,000 or greater
- Display the names of employees who work in offices outside of the United States.
- Display the customer name and order number for orders that shipped within 5 days of the required date
Step 6. What to Submit
Submit a zip folder of your python files, image files, json files, and .js files to Canvas. Your zip folder should contain:
- Four JSON files. These files should contain the data for each mongodb collection.
- employees.json, customers.json, products.json, orders.json
- Two python files
- finalproject.py: This file should contain the python code that coverts the data from your MySQL database into MongoDB format and inserts it into the MongoDB database
- finalcharts.py: This file should contain the python code to implement the queries and produce the charts for questions 1 to 14
- One javascript file
- project-queries.js: This file should contain the MongoDB queries to answer the second group of eight questions that charts are no required for.
- Fourteen svg files
- These files represent the charts that you would have produced to answer questions 1 to 14.