I need to complete the following using Python; create a script that is going to do the following: 1. The script should be able to read files stored in a shared drive location extract them and process...

1 answer below »

I need to complete the following using Python; create a script that is going to do the following:


1. The script should be able to read files stored in a shared drive location extract them and process them. These files will likely be in the hundred on a daily basis and it will need to automatically read and process them every few hours.


2. Once extracted and processes I need this script to append them all together .


3. Once the files are appended I need them to go through some checks to validate the information on the files are correct with what is in our system .(PO validation, date validation, etc) We have an open orders report that is stored in a location we can leverage it join the data and find matches .


4. The Purchase Orders or request that do match with what's in the system we can save all together in a new excel file in the share drive .


5. The ones that don't match we would need to store in a separate excel document .




Sheet2 Customer PO Number (or):Niagara Pickup Number (or):Pickup Appointment Date:Pickup Appointment Time (Early):Pickup Appointment Time (Late):Carrier Name:Carrier Email:Ref Value (If Needed):Preload / Non Preload: 1234561234562021-03-28 Windows User: yyyy-mm-dd 14:00:00 Windows User: hh:mm:ss18:00:00 Windows User: hh:mm:ssNon Preload Instructions 1. for this point when I type read files I mean that there will be saved excel docs xlsx in a designated folder I need this script to process and read all these docs and show me what are the value in these cell 2. Once it extracts all these xlsx docs it will have to append them all together 3. Once appended the data found in the documents will have to go through some validation checks. I need it to check if the Purchase Order number is in my system or the delivery number which is the first two columns. I also need it to check if column 3,4,5 is in the correct format. the correct format for column 3 is (yyyy-mm-dd) then 4 & 5 is (hh:mm:ss) 4. Once we go through those checks I will send you the open orders data report that has this information to check it against. I need the ones that pass saved in an excel document separately. I only need the ones that passed the checks saved pretty much if that information on that request matches with what is in the open orders sheet then that is a pass and I would need it saved and appended in a single excel document 5. Anything that did not match as in the PO number or pickup number with what we have in the open orders then that would get saved in a separate excel file so we can communicate the customers back saying hey the information you gave to us is incorrect
Answered 63 days AfterMar 29, 2021

Answer To: I need to complete the following using Python; create a script that is going to do the following: 1....

Sandeep Kumar answered on Apr 15 2021
151 Votes
py_xls/append excel files.ipynb
{
"cells": [
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import glob \n",
"import os\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"2"
]
},
"metadata": {},
"execution_count": 29
}
],
"source": [
"path, dirs, files = next(os.walk(\"orders\"))\n",
"file_count = len(files)\n",
"file_count"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"path = 'orders'\n",
"files = glob.glob(path + \"/*.xlsx\")"
]
},
{
"cell_type": "code",
"execu
tion_count": 31,
"metadata": {},
"outputs": [],
"source": [
"files1 = []"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"output_type": "error",
"ename": "KeyboardInterrupt",
"evalue": "",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mxlsm\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mpd\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mread_excel\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'orders/open-orders-data.xlsm'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0msheet_name\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mNone\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[1;32m~\\anaconda3\\lib\\site-packages\\pandas\\util\\_decorators.py\u001b[0m in \u001b[0;36mwrapper\u001b[1;34m(*args, **kwargs)\u001b[0m\n\u001b[0;32m 294\u001b[0m )\n\u001b[0;32m 295\u001b[0m \u001b[0mwarnings\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mwarn\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mmsg\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mFutureWarning\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mstacklevel\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mstacklevel\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 296\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mfunc\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m*\u001b[0m\u001b[0margs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 297\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 298\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mwrapper\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\anaconda3\\lib\\site-packages\\pandas\\io\\excel\\_base.py\u001b[0m in \u001b[0;36mread_excel\u001b[1;34m(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols)\u001b[0m\n\u001b[0;32m 302\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 303\u001b[0m \u001b[1;32mif\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mio\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mExcelFile\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 304\u001b[1;33m \u001b[0mio\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mExcelFile\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mio\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mengine\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mengine\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 305\u001b[0m \u001b[1;32melif\u001b[0m \u001b[0mengine\u001b[0m \u001b[1;32mand\u001b[0m \u001b[0mengine\u001b[0m \u001b[1;33m!=\u001b[0m \u001b[0mio\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mengine\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 306\u001b[0m raise ValueError(\n",
"\u001b[1;32m~\\anaconda3\\lib\\site-packages\\pandas\\io\\excel\\_base.py\u001b[0m in \u001b[0;36m__init__\u001b[1;34m(self, path_or_buffer, engine)\u001b[0m\n\u001b[0;32m 865\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_io\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mstringify_path\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mpath_or_buffer\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 866\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 867\u001b[1;33m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_reader\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_engines\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mengine\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_io\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 868\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 869\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0m__fspath__\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\anaconda3\\lib\\site-packages\\pandas\\io\\excel\\_xlrd.py\u001b[0m in \u001b[0;36m__init__\u001b[1;34m(self, filepath_or_buffer)\u001b[0m\n\u001b[0;32m 20\u001b[0m \u001b[0merr_msg\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;34m\"Install xlrd >= 1.0.0 for Excel support\"\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 21\u001b[0m \u001b[0mimport_optional_dependency\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"xlrd\"\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mextra\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0merr_msg\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 22\u001b[1;33m \u001b[0msuper\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m__init__\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 23\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 24\u001b[0m \u001b[1;33m@\u001b[0m\u001b[0mproperty\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\anaconda3\\lib\\site-packages\\pandas\\io\\excel\\_base.py\u001b[0m in \u001b[0;36m__init__\u001b[1;34m(self, filepath_or_buffer)\u001b[0m\n\u001b[0;32m 351\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mbook\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mload_workbook\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 352\u001b[0m \u001b[1;32melif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mstr\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 353\u001b[1;33m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mbook\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mload_workbook\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 354\u001b[0m \u001b[1;32melif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mbytes\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 355\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mbook\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mload_workbook\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mBytesIO\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\anaconda3\\lib\\site-packages\\pandas\\io\\excel\\_xlrd.py\u001b[0m in \u001b[0;36mload_workbook\u001b[1;34m(self, filepath_or_buffer)\u001b[0m\n\u001b[0;32m 35\u001b[0m \u001b[1;32mreturn\u001b[0m...
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here