For this assignment, you will use the following three data sets: US_airlines.csv, US_airports.csv, and US_airrecord.csv.

Using R, you will prepare and explore the datasets with data cleaning and analysis techniques, and discuss the trends and points of interest you discover.

The steps you complete should include:

Inspect and summarise your data.

Clean and combine datasets where appropriate

· Check for and handle missing values

· Remove any unnecessary variables

· Transform any variables that you would like to use in a different form

Plot data and identify trends and/or points of interest

Perform data analysis to investigate

· The airlines which experience the most delays

· The busiest routes

· The relationship between the distance between airports and flying time

Predicting flying time based on distance

Discuss your findings, comment your code and prepare explanatory visualisations.
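
The analysis steps above could be sketched in base R roughly as follows. This is only an illustration on a small made-up data frame: the column names AIRLINE, ORIGIN_AIRPORT, DESTINATION_AIRPORT, DISTANCE and ARRIVAL_DELAY are assumptions and may not match the real files, and "most delays" is interpreted here as the highest mean arrival delay, which is just one possible reading.

```r
# Toy stand-in for the flight records. All column names and values here are
# assumptions for illustration, not the real contents of US_airrecord.csv.
flights <- data.frame(
  AIRLINE             = c("AA", "AA", "DL", "DL", "UA", "UA"),
  ORIGIN_AIRPORT      = c("JFK", "JFK", "ATL", "ATL", "ORD", "JFK"),
  DESTINATION_AIRPORT = c("LAX", "LAX", "MIA", "MIA", "DEN", "LAX"),
  DISTANCE            = c(2475, 2475, 594, 594, 888, 2475),
  AIR_TIME            = c(330, 325, 85, 90, 125, 335),
  ARRIVAL_DELAY       = c(20, 5, -3, 40, 0, 12),
  stringsAsFactors    = FALSE
)

# Airlines with the most delays: mean arrival delay per airline, worst first
delay_by_airline <- aggregate(ARRIVAL_DELAY ~ AIRLINE, data = flights, FUN = mean)
delay_by_airline <- delay_by_airline[order(-delay_by_airline$ARRIVAL_DELAY), ]

# Busiest routes: count flights per origin-destination pair, busiest first
route <- paste(flights$ORIGIN_AIRPORT, flights$DESTINATION_AIRPORT, sep = "-")
route_counts <- sort(table(route), decreasing = TRUE)

# Relationship between distance and flying time
distance_time_cor <- cor(flights$DISTANCE, flights$AIR_TIME)

# Predicting flying time from distance with a simple linear regression
fit <- lm(AIR_TIME ~ DISTANCE, data = flights)
predicted <- predict(fit, newdata = data.frame(DISTANCE = 1000))

print(delay_by_airline)
print(route_counts)
print(distance_time_cor)
print(predicted)
```

On the real data, plot(flights$DISTANCE, flights$AIR_TIME) followed by abline(fit) would give a quick visual check of how well the straight line fits before relying on the predictions.
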
Some observations I have made:

· The times are as per the 24-hour clock, so 10 is 00:10 and 1542 is 15:42.

· Any time differences are in minutes.

· There are 19 flights that have a wheels-off time but also have a cancellation reason. What do we do with them? All of the reasons relate to the weather and airline.

· TAIL_NUMBER: one value, 7819A, needs an N put in front of it.

· NA values in elapsed time need to be calculated as AIR_TIME + TAXI_IN + TAXI_OUT. First, however, the NA values in AIR_TIME need to be replaced with the time duration between WHEELS_OFF and WHEELS_ON.

· Need to add relevant data from US_airlines.csv and US_airports.csv using the IATA_COD

· For the variables below, NA values mean there was no delay, and 0 values mean there was a delay but not for that reason. Any other value is how long the delay was for that reason. Some delays can be for more than one reason, e.g. air system and airline delay. These need some transformation.

o AIR_SYSTEM_DELAY

o SECURITY_DELAY

o AIRLINE_DELAY

o LATE_AIRCRAFT_DELAY

o WEATHER_DELAY
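
The cleaning steps in these observations could look something like the base R sketch below. It runs on a tiny made-up data frame, so several things are assumptions: the column name ELAPSED_TIME, the lookup column AIRLINE_NAME, the helper hhmm_to_minutes, and the flag columns HAS_DELAY_REASON and N_DELAY_REASONS are all hypothetical, and the join key is spelled IATA_COD only because that is how it appears in the notes above. The wheels-off to wheels-on duration also assumes both times are in the same time zone, which may not hold for the real records.

```r
# Toy stand-in for the flight records (values are made up for illustration).
flights <- data.frame(
  AIRLINE = c("AA", "DL", "UA"),
  TAIL_NUMBER = c("N123AA", "7819A", "N456UA"),
  WHEELS_OFF = c(10, 1542, 2350),   # hhmm on the 24-hour clock
  WHEELS_ON = c(130, 1700, 45),
  TAXI_OUT = c(15, 12, 20),
  TAXI_IN = c(5, 8, 6),
  AIR_TIME = c(80, NA, NA),
  ELAPSED_TIME = c(100, NA, NA),    # assumed name for the elapsed time column
  AIR_SYSTEM_DELAY = c(NA, 0, 10),
  SECURITY_DELAY = c(NA, 0, 0),
  AIRLINE_DELAY = c(NA, 25, 5),
  LATE_AIRCRAFT_DELAY = c(NA, 0, 0),
  WEATHER_DELAY = c(NA, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert hhmm clock values (10 = 00:10) to minutes past midnight
hhmm_to_minutes <- function(x) (x %/% 100) * 60 + x %% 100

# Put the missing N in front of the one affected tail number
idx <- which(flights$TAIL_NUMBER == "7819A")
flights$TAIL_NUMBER[idx] <- paste0("N", flights$TAIL_NUMBER[idx])

# Fill AIR_TIME from WHEELS_OFF and WHEELS_ON; %% 1440 handles flights that
# cross midnight. Assumes both clock times share a time zone.
air_minutes <- (hhmm_to_minutes(flights$WHEELS_ON) -
                hhmm_to_minutes(flights$WHEELS_OFF)) %% 1440
na_air <- is.na(flights$AIR_TIME)
flights$AIR_TIME[na_air] <- air_minutes[na_air]

# Then fill elapsed time as AIR_TIME + TAXI_IN + TAXI_OUT
na_elapsed <- is.na(flights$ELAPSED_TIME)
flights$ELAPSED_TIME[na_elapsed] <- with(flights, AIR_TIME + TAXI_IN + TAXI_OUT)[na_elapsed]

# One possible transformation of the delay columns: record whether any delay
# reason was logged, then turn NA (no delay) into 0 minutes
delay_cols <- c("AIR_SYSTEM_DELAY", "SECURITY_DELAY", "AIRLINE_DELAY",
                "LATE_AIRCRAFT_DELAY", "WEATHER_DELAY")
flights$HAS_DELAY_REASON <- rowSums(!is.na(flights[delay_cols])) > 0
flights[delay_cols][is.na(flights[delay_cols])] <- 0
flights$N_DELAY_REASONS <- rowSums(flights[delay_cols] > 0)

# Add airline details; the lookup table and its columns are hypothetical
airlines <- data.frame(
  IATA_COD = c("AA", "DL", "UA"),
  AIRLINE_NAME = c("American Airlines", "Delta Air Lines", "United Airlines"),
  stringsAsFactors = FALSE
)
flights_named <- merge(flights, airlines, by.x = "AIRLINE", by.y = "IATA_COD",
                       all.x = TRUE)
print(flights_named)
```

Keeping HAS_DELAY_REASON before overwriting the NA values preserves the distinction between "no delay" and "delayed, but not for this reason", which would otherwise be lost once both are 0.
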

the-assignment-s45nkgam-ixzhxb0u.pdf usairrecords14-v4kg14ke-joloy0ix.csv

Answered Same Day, Feb 15, 2021

Rohith answered on Feb 18 2021
