I need to complete the following using Python; create a script that is going to do the following:1....

Question

I need to complete the following using Python; create a script that is going to do the following:

1. The script should be able to read files stored in a shared drive location extract them and process them. These files will likely be in the hundred on a daily basis and it will need to automatically read and process them every few hours.

2. Once extracted and processes I need this script to append them all together .

3. Once the files are appended I need them to go through some checks to validate the information on the files are correct with what is in our system .(PO validation, date validation, etc) We have an open orders report that is stored in a location we can leverage it join the data and find matches .

4. The Purchase Orders or request that do match with what's in the system we can save all together in a new excel file in the share drive .

5. The ones that don't match we would need to store in a separate excel document .

Sheet2 Customer PO Number (or):Niagara Pickup Number (or):Pickup Appointment Date:Pickup Appointment Time (Early):Pickup Appointment Time (Late):Carrier Name:Carrier Email:Ref Value (If Needed):Preload / Non Preload: 1234561234562021-03-28 Windows User: yyyy-mm-dd 14:00:00 Windows User: hh:mm:ss18:00:00 Windows User: hh:mm:ssNon Preload Instructions 1. for this point when I type read files I mean that there will be saved excel docs xlsx in a designated folder I need this script to process and read all these docs and show me what are the value in these cell 2. Once it extracts all these xlsx docs it will have to append them all together 3. Once appended the data found in the documents will have to go through some validation checks. I need it to check if the Purchase Order number is in my system or the delivery number which is the first two columns. I also need it to check if column 3,4,5 is in the correct format. the correct format for column 3 is (yyyy-mm-dd) then 4 & 5 is (hh:mm:ss) 4. Once we go through those checks I will send you the open orders data report that has this information to check it against. I need the ones that pass saved in an excel document separately. I only need the ones that passed the checks saved pretty much if that information on that request matches with what is in the open orders sheet then that is a pass and I would need it saved and appended in a single excel document 5. Anything that did not match as in the PO number or pickup number with what we have in the open orders then that would get saved in a separate excel file so we can communicate the customers back saying hey the information you gave to us is incorrect

cpu-request-qbjnjuyy.xlsx inst-kghj13hq-32xxhqe1.docx

Sandeep Kumar · Accepted Answer

py_xls/append excel files.ipynb
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd
",
    "import glob 
",
    "import os
",
    "
"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "metadata": {},
     "execution_count": 29
    }
   ],
   "source": [
    "path, dirs, files = next(os.walk("orders"))
",
    "file_count = len(files)
",
    "file_count"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "path = 'orders'
",
    "files = glob.glob(path + "/*.xlsx")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "files1 = []"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "output_type": "error",
     "ename": "KeyboardInterrupt",
     "evalue": "",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mKeyboardInterrupt\u001b[0m                         Traceback (most recent call last)",
      "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m
\u001b[1;32m----> 1\u001b[1;33m \u001b[0mxlsm\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mpd\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mread_excel\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'orders/open-orders-data.xlsm'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0msheet_name\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mNone\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[0m",
      "\u001b[1;32m~\anaconda3\lib\site-packages\pandas\util\_decorators.py\u001b[0m in \u001b[0;36mwrapper\u001b[1;34m(*args, **kwargs)\u001b[0m
\u001b[0;32m    294\u001b[0m                 )
\u001b[0;32m    295\u001b[0m                 \u001b[0mwarnings\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mwarn\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mmsg\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mFutureWarning\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mstacklevel\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mstacklevel\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[1;32m--> 296\u001b[1;33m             \u001b[1;32mreturn\u001b[0m \u001b[0mfunc\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m*\u001b[0m\u001b[0margs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[0m\u001b[0;32m    297\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m
\u001b[0;32m    298\u001b[0m         \u001b[1;32mreturn\u001b[0m \u001b[0mwrapper\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
",
      "\u001b[1;32m~\anaconda3\lib\site-packages\pandas\io\excel\_base.py\u001b[0m in \u001b[0;36mread_excel\u001b[1;34m(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols)\u001b[0m
\u001b[0;32m    302\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m
\u001b[0;32m    303\u001b[0m     \u001b[1;32mif\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mio\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mExcelFile\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[1;32m--> 304\u001b[1;33m         \u001b[0mio\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mExcelFile\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mio\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mengine\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mengine\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[0m\u001b[0;32m    305\u001b[0m     \u001b[1;32melif\u001b[0m \u001b[0mengine\u001b[0m \u001b[1;32mand\u001b[0m \u001b[0mengine\u001b[0m \u001b[1;33m!=\u001b[0m \u001b[0mio\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mengine\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[0;32m    306\u001b[0m         raise ValueError(
",
      "\u001b[1;32m~\anaconda3\lib\site-packages\pandas\io\excel\_base.py\u001b[0m in \u001b[0;36m__init__\u001b[1;34m(self, path_or_buffer, engine)\u001b[0m
\u001b[0;32m    865\u001b[0m         \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_io\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mstringify_path\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mpath_or_buffer\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[0;32m    866\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m
\u001b[1;32m--> 867\u001b[1;33m         \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_reader\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_engines\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mengine\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_io\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[0m\u001b[0;32m    868\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m
\u001b[0;32m    869\u001b[0m     \u001b[1;32mdef\u001b[0m \u001b[0m__fspath__\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
",
      "\u001b[1;32m~\anaconda3\lib\site-packages\pandas\io\excel\_xlrd.py\u001b[0m in \u001b[0;36m__init__\u001b[1;34m(self, filepath_or_buffer)\u001b[0m
\u001b[0;32m     20\u001b[0m         \u001b[0merr_msg\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;34m"Install xlrd >= 1.0.0 for Excel support"\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[0;32m     21\u001b[0m         \u001b[0mimport_optional_dependency\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m"xlrd"\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mextra\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0merr_msg\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[1;32m---> 22\u001b[1;33m         \u001b[0msuper\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m__init__\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[0m\u001b[0;32m     23\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m
\u001b[0;32m     24\u001b[0m     \u001b[1;33m@\u001b[0m\u001b[0mproperty\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
",
      "\u001b[1;32m~\anaconda3\lib\site-packages\pandas\io\excel\_base.py\u001b[0m in \u001b[0;36m__init__\u001b[1;34m(self, filepath_or_buffer)\u001b[0m
\u001b[0;32m    351\u001b[0m             \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mbook\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mload_workbook\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[0;32m    352\u001b[0m         \u001b[1;32melif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mstr\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m
\u001b[1;32m--> 353\u001b[1;33m             \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mbook\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mload_workbook\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;

I need to complete the following using Python; create a script that is going to do the following: 1. The script should be able to read files stored in a shared drive location extract them and process...

Answer To: I need to complete the following using Python; create a script that is going to do the following: 1....

Answer To This Question Is Available To Download

Related Questions & Answers

Submit New Assignment