Part I :
SQL Review
Complete the exercises in the following lab (see attached file)
Part II:
Dimensions and Facts
Scenario:A car rental company needs to improve its business operations over the next 5 years. They would like to determine the profitability of the organization by assessing their busiest time period for rental of their vehicles.
- Design a data mart with the associated dimension and fact tables to measure potential rental profitability based on specified time period (month, quarter,year)
- The grain of the data should be at the customer level
I needed the complete Visual Studio ETL package for the solution.
CIS 345 Java Programming BCS 425 Business Intelligence& Data Warehousing - Week 2 – LAB 1 Today’s Lecture Create new schemas, load data into schemas REVIEW – Joins, Primary/Foreign Keys Views Backup/Restore an SQL Server Database LAB - Create Full Database Review - Create Tables The schemas for the tables based on the schemas defined in the next slides… SQL Server Components Create a table that will hold customer data. Name the table “Customer”. The table should use the following schema: Column NameData TypeAllow Nulls CustomerNumIntNo FirstNameNvarchar(50)No LastNameNvarchar(50)No DVD Invoice Invoice Detail Column NameData TypeAllows Nulls InvoiceDetailNumIntNo InvoiceNumIntNo DVDSKUIntNo QuantityIntNo PriceOfDVDIntNo TotalAmountIntNo Column NameData TypeAllows Nulls SKUIntNo NameNvarchar(200)No PriceDecimal(18,2)No Column NameData TypeAllows Nulls InvoiceNumIntNo CustomerNumIntNo DateDatetimeNo Load the Database Run the following sql: SELECT * FROM InvoiceDetail What’s wrong with the data in this table? Load the Database The PriceOfDVD and TotalAmount columns in the InvoiceDetail table were created as INTEGER columns, so the cents after the dollar were lost. 7.99 became 7 40.47 became 40 To fix this, uncomment and run the ALTER statements at the bottom of the CREATE script, then rerun the LOAD script. Load the Database Create Two additional tables using the following schemas: Load the tables with data (see Slide 10) EMPLOYEES HOURS WORKED Column NameData TypeAllows Nulls EmpIdIntNo DateDateTimeNo HoursIntNo Column NameData TypeAllows Nulls EmpIdIntNo FirstNameNvarchar(200)No LastNameNvarchar(200)No Joins - REVIEW INNER Joins A inner join selects only rows with common values in their common columns. For example… What would be the result of an inner join of these two tables? Which column would you join on? Employees HoursWorked Joins - REVIEW EmpIdDateHours 1009/12/20138 1009/13/201310 1019/12/20139 1019/13/20137 EmpIdFirstNameLastName 100RyanWilson 101JohnSmith 102JamesDoe Joins - REVIEW Employees INNER JOIN HoursWorked EmpIdFirstNameLastNameEmpIdDateHours 100RyanWilson1009/12/20138 100RyanWilson1009/13/201310 101JohnSmith1019/12/20139 101JohnSmith1019/13/20137 Joins - REVIEW Employees INNER JOIN Grades Go to the first row of the left hand side table. Match that row against every row of the right hand side table. Output every row where there is a match in the common columns. Employees HoursWorked Match against all rows! EmpIdDateHours 1009/12/20138 1009/13/201310 1019/12/20139 1019/13/20137 EmpIdFirstNameLastName 100RyanWilson 101JohnSmith 102JamesDoe Joins - REVIEW Employees INNER JOIN Grades Go to the second row of the left hand side table. Match that row against every row of the right hand side table. Output every row where there is a match in the common columns. Employees HoursWorked Match against all rows! EmpIdFirstNameLastName 100RyanWilson 101JohnSmith 102JamesDoe EmpIdDateHours 1009/12/20138 1009/13/201310 1019/12/20139 1019/13/20137 Joins - REVIEW Employees INNER JOIN Grades Go to the third row of the left hand side table. Match that row against every row of the right hand side table. Output every row where there is a match in the common columns. Done!!! Employees HoursWorked Match against all rows! EmpIdDateHours 1009/12/20138 1009/13/201310 1019/12/20139 1019/13/20137 EmpIdFirstNameLastName 100RyanWilson 101JohnSmith 102JamesDoe Joins - REVIEW Employees INNER JOIN HoursWorked Question Why doesn’t the employee with EmpID = 102 appear in the output? EmpIdFirstNameLastNameEmpIdDateHours 100RyanWilson1009/12/20138 100RyanWilson1009/13/201310 101JohnSmith1019/12/20139 101JohnSmith1019/13/20137 Joins - REVIEW No match for EmpId= 102. What if we wanted the output to show all employees, even if they didn’t work any hours? Is there a join for that case? Employees HoursWorked EmpIdDateHours 1009/12/20138 1009/13/201310 1019/12/20139 1019/13/20137 EmpIdFirstNameLastName 100RyanWilson 101JohnSmith 102JamesDoe Joins - REVIEW Yes! You can use a Left Outer Join. SQL Queries - REVIEW How do we write a query that will show a students’ names and grades (all columns may appear)? Problem: Students’ names and grades are in different tables. What can we do? SQL Queries - REVIEW Need to use an inner join between the Students and Grades tables. You need to join on the StudentId field. SELECT * FROM Students s INNER JOIN Grades g ON s.StudentId = g.StudentId SQL Queries Write a query that will show the total number of classes offered by the Computer Systems department. Hint: How do you count the number of items using SQL? SQL Queries The COUNT aggregate operator can be used to find the number of items in a column. SELECT COUNT(ClassId) AS ‘Total' FROM Classes SQL Queries Write a query that will show the total number of classes that are 300 level or above offered by the Computer Systems department. SQL Queries Only those rows that meet the “WHERE” condition are included in the count. SELECT COUNT(c.ClassId) AS ‘Total' FROM Classes c WHERE c.ClassId >= 300 SQL Queries Write a query that will count the number of classes taken by each student. SQL Queries “GROUP BY” separates the classes into groups using the StudentId. COUNT will calculate a total for each group. SELECT g.StudentId, COUNT(g.ClassId) AS 'Total Courses Taken' FROM Grades g GROUP BY g.StudentId SQL Queries Write a query that will count the number of classes taken by each student showing the student’s last and first name. SQL Queries Looks complicated but all we did was add an INNER JOIN between Students and Grades. Need that INNER JOIN to get the student’s first and last names otherwise we only have the StudentID. SELECT s.FirstName, s.LastName, COUNT(g.ClassId) as 'Total' FROM Grades c INNER JOIN Students s ON g.StudentId = s.StudentId GROUP BY s.FirstName, s.LastName Primary and Foreign Keys - REVIEW Joins are often done on Key columns. Primary Key – A column (or set of columns) that uniquely identifies any given row in the table. What do we mean by “uniquely”? Primary and Foreign Keys - REVIEW What should the primary key be for the following table? Employees EmpIdFirstNameLastName 100RyanWilson 101JohnSmith 102JamesDoe Primary and Foreign Keys - REVIEW EmpId is the primary key. Employees Could LastName be the primary key in this table? Why don’t we use the SocSecNumber as the key? PK stands for “Primary Key” SocSecNumber is a “natural” key EmpId (PK)FirstNameLastNameSocSecNumber 100RyanWilson123-55-6789 101JohnSmith133-42-3987 102JamesDoe245-38-2090 Primary and Foreign Keys What should the primary key be for this table? Classes ClassIdNameCredits 101Program…3 260Database3 345Java…3 425Decision…3 Primary and Foreign Keys ClassId is the primary key. Classes ClassId (PK)NameCredits 101Program…3 260Database3 345Java…3 425Decision…3 Primary and Foreign Keys - REVIEW What should the primary key be for this table? HoursWorked EmpIdDateHours 1009/12/20138 1009/13/201310 1019/12/20139 1019/13/20137 Primary and Foreign Keys - REVIEW (EmpId, Date) is a composite primary key. HoursWorked Together they uniquely identify a row EmpId (PK)Date (PK)Hours 1009/12/20138 1009/13/201310 1019/12/20139 1019/13/20137 Primary and Foreign Keys - REVIEW May choose to use a SURROGATE primary key. HoursWorked Used instead of EmpId + Date HoursWorkedSk (PK)EmpIdDateHours 11009/12/20138 21009/13/201310 31019/12/20139 41019/13/20137 Primary and Foreign Keys You have been asked to put together a table to store student grades. What would be the primary key in the following table? Would this design meet the need? Grades StudentIdClassIdGrade 1101A 1260C+ 1345B 1425A 2101B+ 2260A Primary and Foreign Keys (StudentId, ClassId) is the primary key. Grades Do they uniquely identify a row? StudentIdClassIdGrade 1101A 1260C+ 1345B 1425A 2101B+ 2260A Primary and Foreign Keys (StudentId, ClassId) is the primary key. Grades What happens if a student retakes a class? Will the PK still work? Retake? StudentIdClassIdGrade 1101A 1260C+ 1345B 1425A 2101B+ 2260A Primary and Foreign Keys (StudentId, ClassId) as the primary key will not work if a student retakes a class! Grades Now what do we do? PK no longer uniquely identifies a row! StudentId(PK)ClassId(PK)Grade 1101A 1260D+ 1260A 1345B 1425A 2101B+ 2260A Primary and Foreign Keys (StudentId, ClassId, Semester, Year) is the primary key. Grades Add new columns to the table. StudentId(PK)ClassId(PK)Semester(PK)Year(PK)Grade 1101Spring2013A 1260Spring2013D+ 1260Fall2013A 1345Spring2013B 1425Spring2013A 2101Fall2012B+ 2260Fall2012A Primary and Foreign Keys - REVIEW Foreign Key – A column whose values match the primary key values in another table. Foreign Key Constraint – You can only insert values in a foreign key column if there is a matching value in the table containing the primary key column. Note: Foreign key columns may contain duplicate values. Primary and Foreign Keys Which table has a foreign key and which column is it? StudentsGrades StudentId (PK)FirstNameLastName 1JohnDoe 2JaneSmith 3DerekJeter StudentId (PK)ClassId (PK)Grade 1101A 1260D+ 1345B 1425A 2101B+ 2260A Primary and Foreign Keys StudentId is a foreign key into the Students table. Any other foreign keys? StudentsGrades StudentId in Grades is a Foreign Key (FK) StudentId (PK)FirstNameLastName 1JohnDoe 2JaneSmith 3DerekJeter StudentId (PK,FK)ClassId (PK)Grade 1101A 1260D+ 1345B 1425A 2101B+ 2260A Primary and Foreign Keys ClassId is also a foreign key. StudentsGrades Classes StudentId (PK)FirstNameLastName 1JohnDoe 2JaneSmith 3DerekJeter StudentId (PK,FK)ClassId (PK,FK)Grade 1101A 1260D+ 1345B 1425A 2101B+ 2260A ClassId (PK)NameCredits 101Prog…3 260Database3 345Java…3 425Decision…3 Questions Can you add the following row to the Students table? Why or why not? (1, ‘Maria’, ‘Lees’) Students StudentId (PK)FirstNameLastName 1JohnDoe 2JaneSmith 3DerekJeter Questions No. This would violate the primary key constraint. No duplicates are allowed in the primary key column. (1, ‘Maria’, ‘Lees’) Students StudentId (PK)FirstNameLastName 1JohnDoe 2JaneSmith 3DerekJeter Primary and Foreign Keys - REVIEW Which table has a foreign key and which column is it? HoursWorked Employees HoursWorkedSk (PK)EmpIdDateHours 11009/12/20138 21009/13/201310 31019/12/20139 41019/13/20137 EmpId (PK)FirstNameLastName 100RyanWilson 101JohnSmith 102JamesDoe Primary and Foreign Keys - REVIEW HoursWorked Employees EmpId in HoursWorked is a Foreign Key (FK) from the Employees table. EmpId (PK)FirstNameLastName 100RyanWilson 101JohnSmith 102JamesDoe HoursWorkedSk (PK)EmpIdDateHours 11009/12/20138 21009/13/201310 31019/12/20139 41019/13/20137 Primary and Foreign Keys - REVIEW Questions HoursWorked Employees Can we insert the row (5, 102, 9/12/2013, 8) into the HoursWorked table? Can we insert the row (6, 103, 9/12/2013, 8) into the HoursWorked table? EmpId (PK)FirstNameLastName 100RyanWilson 101JohnSmith 102JamesDoe HoursWorkedSk (PK)EmpIdDateHours 11009/12/20138 21009/13/201310 31019/12/20139 41019/13/20137 Primary and Foreign Keys - REVIEW Questions HoursWorked Employees Yes. (5, 102, 9/12/2013, 8) 102 exists in the PK column. No. (6, 103, 9/12/2013, 8) 103 does NOT exist in the PK column. EmpId (PK)FirstNameLastName 100RyanWilson 101JohnSmith 102JamesDoe HoursWorkedSk (PK)EmpIdDateHours 11009/12/20138 21009/13/201310 31019/12/20139 41019/13/20137 Primary and Foreign Keys - REVIEW HoursWorked Employees The foreign key constraint makes sure that certain data exists in other tables. Make sure that the employee exists before listing hours for him/her. EmpId (PK)FirstNameLastName 100RyanWilson 101JohnSmith 102JamesDoe HoursWorkedSk (PK)EmpIdDateHours 11009/12/20138 21009/13/201310 31019/12/20139 41019/13/20137 Foreign Key To create a Foreign Key Constraint in SQL Server Management Studio: Right-click the table with the foreign key(s) Click Design On the menu bar click Table Designer Relationships… Click to the right of “Tables and Columns Specification” then click the … button Select the Primary key table on the left and indicate the primary key column, then select the Foreign key table on the right and indicate the foreign key column (delete any unneeded columns) Click OK, Close, and save your table design SQL Queries - Aggregates How would you write a query that shows the total amount paid by customers for all invoices (add up all the invoice amounts)? Hint: How do you add up a column using SQL? SQL Queries - Aggregates The SUM aggregate operator can be used to find the total in a column. SELECT SUM(TotalAmount) AS 'Total' FROM InvoiceDetail Insert Data Into Tables InvoiceDetail The aggregate SUM adds up all of the values in the TotalAmount column. The Total Amount column adds up to $368.80 InvoiceDetailNumInvoice NumDVDSKUQuantityPriceOf DVDTotal Amount 11517.997.99 12517.997.99 131211.4922.98 14527.9915.98 153113.9913.99 212534.99174.95 233213.9927.98 242134.9934.99 254313.4940.47 334113.4913.49 35517.997.99 SQL Queries - Aggregates Write a query that will show the total amount for each customer by First and Last Name. SQL Queries -