You will be provided a data set that will create a full data pipeline in Python. This will include ingest, storage, cleansing, preprocessing, and utilization. The utilization of these data may be in a...




You will be provided a data set and will use it to create a full data pipeline in Python. The pipeline will include ingest, storage, cleansing, preprocessing, and utilization. The data may be utilized in a descriptive manner (e.g., a simple report) or a predictive manner (e.g., a machine learning algorithm).




During your final IP, you will utilize the data set provided by Data.gov. This data set provides demographics by zip code and is available in many formats, such as CSV, JSON, XML, and RDF. For this assignment, complete a Python program that covers the following:







  • Ingest the data through 1 of the provided formats.



  • Create a data structure to store the data.



  • Conduct any cleansing and preprocessing needed.



  • Create either a descriptive report or a predictive algorithm using the data.
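
The four steps above can be sketched with only the standard library. This is a minimal illustration under stated assumptions, not the assignment's required solution: the `ZipRecord` class, the helper function names, and the small inline sample are hypothetical, and only the column names are taken from the Data.gov CSV.

```python
import csv
import io
from dataclasses import dataclass

# Inline stand-in for the CSV file (assumption for this sketch);
# a real run would open the downloaded file instead.
SAMPLE_CSV = """JURISDICTION NAME,COUNT PARTICIPANTS,COUNT FEMALE,COUNT MALE
10001,44,22,22
10002,35,19,16
10003,1,1,0
10004,0,0,0
10005,2,2,0
"""


@dataclass
class ZipRecord:
    """Hypothetical storage structure: one record per zip code."""
    zip_code: str
    participants: int
    female: int
    male: int


def ingest(text):
    """Ingest: parse CSV text into a list of ZipRecord objects."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        ZipRecord(
            zip_code=row["JURISDICTION NAME"],
            participants=int(row["COUNT PARTICIPANTS"]),
            female=int(row["COUNT FEMALE"]),
            male=int(row["COUNT MALE"]),
        )
        for row in reader
    ]


def cleanse(records):
    """Cleanse: drop zip codes that report no participants."""
    return [r for r in records if r.participants > 0]


def report(records):
    """Utilize: a simple descriptive summary across zip codes."""
    total = sum(r.participants for r in records)
    female = sum(r.female for r in records)
    return {
        "zip_codes": len(records),
        "participants": total,
        "female_share": round(female / total, 2) if total else 0.0,
    }


records = cleanse(ingest(SAMPLE_CSV))
summary = report(records)
print(summary)
```

Keeping each stage in its own function makes it easy to swap the descriptive report for a predictive model later without touching ingest or cleansing.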






Once the Python program is complete, create a Microsoft Word document that describes the data pipeline and how this code could be used in the future, such as in utility planning, safety services, or many other viable options.

Answered Same Day Nov 27, 2022


Baljit answered on Nov 28 2022
Reading the File into a Data Frame

In [7]:
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('Demographic_Statistics_By_Zip_Code.csv')
data.head()

Out[7]:
   JURISDICTION NAME  COUNT PARTICIPANTS  COUNT FEMALE  PERCENT FEMALE  COUNT MALE  PERCENT MALE  COUNT GENDER UNKNOWN  PERCENT GENDER UNKNOWN
0  10001              44                  22            0.50            22          0.50          0                     0
1  10002              35                  19            0.54            16          0.46          0                     0
2  10003              1                   1             1.00            0           0.00          0                     0
3  10004              0                   0             0.00            0           0.00          0                     0
4  10005              2                   2             1.00            0           0.00          0                     0
5 rows × 46 columns (only the first 8 columns are shown)

Different Data Types in the Data Set
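
As a hedged variant of the ingest step, the read can be wrapped in a small function that fails early when expected columns are absent. The function name `load_demographics` and the inline sample are assumptions made so the sketch runs without the downloaded file; in practice the path `'Demographic_Statistics_By_Zip_Code.csv'` would be passed instead.

```python
import io

import pandas as pd

# Small inline sample (assumption): a few rows and columns in the
# same layout as the Data.gov CSV, so the sketch runs on its own.
SAMPLE_CSV = """JURISDICTION NAME,COUNT PARTICIPANTS,COUNT FEMALE,PERCENT FEMALE,COUNT MALE,PERCENT MALE
10001,44,22,0.50,22,0.50
10002,35,19,0.54,16,0.46
10003,1,1,1.00,0,0.00
10004,0,0,0.00,0,0.00
10005,2,2,1.00,0,0.00
"""


def load_demographics(source):
    """Read the CSV and raise if the columns used later are missing."""
    frame = pd.read_csv(source)
    required = {"JURISDICTION NAME", "COUNT PARTICIPANTS"}
    missing = required - set(frame.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return frame


# A file path or any file-like object works as the source.
data = load_demographics(io.StringIO(SAMPLE_CSV))
print(data.shape)
print(data.head())
```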
In [11]:
Out[11]:
JURISDICTION NAME                 int64
COUNT PARTICIPANTS                int64
COUNT FEMALE                      int64
PERCENT FEMALE                    float64
COUNT MALE                        int64
PERCENT MALE                      float64
COUNT GENDER UNKNOWN              int64
PERCENT GENDER UNKNOWN            int64
COUNT GENDER TOTAL                int64
PERCENT GENDER TOTAL              int64
COUNT PACIFIC ISLANDER            int64
PERCENT PACIFIC ISLANDER          float64
COUNT HISPANIC LATINO             int64
PERCENT HISPANIC LATINO           float64
COUNT AMERICAN INDIAN             int64
PERCENT AMERICAN INDIAN           float64
COUNT ASIAN NON HISPANIC          int64
PERCENT ASIAN NON HISPANIC        float64
COUNT WHITE NON HISPANIC          int64
PERCENT WHITE NON HISPANIC        float64
COUNT BLACK NON HISPANIC          int64
PERCENT BLACK NON HISPANIC        float64
COUNT OTHER ETHNICITY             int64
PERCENT OTHER ETHNICITY           float64
COUNT ETHNICITY UNKNOWN           int64
PERCENT ETHNICITY UNKNOWN         float64
COUNT ETHNICITY TOTAL             int64
PERCENT ETHNICITY TOTAL           int64
COUNT PERMANENT RESIDENT ALIEN    int64
PERCENT...

Data Cleansing and Preprocessing
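
The answer is cut off before its cleansing code appears, so the following is only a sketch of what a cleansing, preprocessing, and descriptive-report step might look like with pandas. The choices made here (dropping zip codes with zero participants, renaming columns to snake_case, recomputing a `female_share` column) are assumptions for illustration, not the original author's method, and the inline sample stands in for the full file.

```python
import io

import pandas as pd

# Inline sample (assumption) in the same layout as the Data.gov CSV.
SAMPLE_CSV = """JURISDICTION NAME,COUNT PARTICIPANTS,COUNT FEMALE,PERCENT FEMALE,COUNT MALE,PERCENT MALE
10001,44,22,0.50,22,0.50
10002,35,19,0.54,16,0.46
10003,1,1,1.00,0,0.00
10004,0,0,0.00,0,0.00
10005,2,2,1.00,0,0.00
"""

data = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Inspect column types and missing values before changing anything.
print(data.dtypes)
print(data.isna().sum())

# Cleansing (assumed rules): remove duplicate rows and zip codes
# that report no participants, since their percentages carry no signal.
clean = data.drop_duplicates()
clean = clean[clean["COUNT PARTICIPANTS"] > 0].copy()

# Preprocessing: snake_case column names are easier to work with,
# and the share is recomputed from counts rather than trusted as given.
clean.columns = [c.lower().replace(" ", "_") for c in clean.columns]
clean["female_share"] = clean["count_female"] / clean["count_participants"]

# Descriptive report: totals across the remaining zip codes.
report = clean[["count_participants", "count_female", "count_male"]].sum()
print(report)
```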