BUSA20001 Visualisation and Data Wrangling Assignment 3 BUSA20001 2020 S2 1 Assignment 3 Overview This assignment is worth 20% of the overall subject grade. . Only one person per group must upload one single zip file on behalf of the entire group. Your submission must consist of your summary report (*.pdf), and all supporting files used to prepare it: • Python Notebook (*.ipynb). • Tableau Packaged This assignment will allow you to develop your ability to analyse data and present information to support operational and strategic decision making using a variety of technologies such as databases, programming languages and visualisation software. More specifically: • Communicate, using a mix of visualizations, a coherent, well-sequenced summary of a data analysis to inform an audience of the key findings of that analysis. • Interpret and write a summary of the noteworthy patterns, trends and exceptions identified in a set of data. • Create simple line, bar and column charts to visually display data. • Select which chart types are relevant to each type of analytical inquiry. • Use data visualizations to compare and contrast aggregate measures between subsets of data, or over time, in order to identify patterns, trends and exceptions present in data. • Create a Jupyter Notebook project, and manage the files associated with that project. • Write Python code that loads data from storage, processing and transforming as necessary, to output summary tables and charts that meet a defined specification. • Write Python code to create estimated forecasts of future events. • Test written Python code to ensure it handles anomalies present in the data, such that the desired results are not distorted, skewed or otherwise contain confusing or misrepresented aspects. Preparation Download and install Tableau Desktop software. Obtain and enter a license key to register Tableau for educational use (see subject page in Canvas for more detailed instructions). Download and install the Anaconda integrated development environment. Install and configure Anaconda extensions in Visual Studio Code (see subject page in Canvas for more detailed instructions). BUSA20001 Visualisation and Data Wrangling Assignment 3 BUSA20001 2020 S2 2 Assignment 3 Required As the company heads towards the end of the year, the CEO of Muesli Ltd has asked you to assist with making month-end financing decisions – specifically cash flow forecasting to determine either the available cash that could be used to pay off debt, or the need to take on more debt to cover any cash shortage. As part of this analysis, they would also like to get a better understanding of customer payment behaviour. They would like you to deliver a 2-page (physical, including all charts and texts) summary report that addresses the following requirements: 1. Analysis 1.1. A review of weekly revenues earned for the 12 weeks preceding the report date (any Friday), for each of the three customer channels, including the average per week across this period. 1.2. A review of weekly cash related expenditures for the 12 weeks preceding the report date, for each expense category, including the average weekly expenditure. 1.3. An aging by week (0-7, 7-14, 14-21 days etc.) and channel, of accounts receivables outstanding as of the report date. 1.4. A review of the payment behaviour (days to pay) for each channel, including the average time to pay. Muesli Ltd currently offers Net 30 Days payment terms to all customers. Use all available data. 1.5. The balance of accounts payable outstanding at close of business on the report date. 1.6. The current balance of cash on hand at close of business on the report date. Assume that all journal postings for the financial year are in the data, and the opening balance for each account was 0. You may also assume that there will be no changes to the chart of accounts for the foreseeable future. 2. Cash Forecast 2.1. The forecast period starts at the report date, with the cash balance at close of business on that date as the opening cash balance. 2.2. The forecast period is for 9 weeks following the report date. For each forecast week, compute the estimated cash balance at end of week: 2.2.1. Use the average time to pay for each channel to estimate the payment date for each invoice outstanding. Then include these payments as cash inflows. 2.2.2. Use the average revenues and average time to pay for each channel to estimate the average weekly cash inflows from future sales. 2.2.3. Use the total average weekly cash expenditure to estimate weekly cash outflows. 2.2.4. Assume that outstanding accounts payable will be paid the first forecast week. Ignore any other future purchases. 2.3. Present the forecast graphically, using date labels (as opposed to numbers such as 1 to X for each of the forecast weeks). 3. Recommendation With reference to your cash forecast chart, recommend to the CEO whether to repay, extend or leave unchanged the balance of the bank loan on November 27th. Your recommendation must include a specific amount (rounded down to nearest multiple of $100K), such that the cash balance won’t fall below 0 by the end of the year (allowing for uncertainty such as revenue variations, the need to purchase raw materials etc). BUSA20001 Visualisation and Data Wrangling Assignment 3 BUSA20001 2020 S2 3 Assignment 3 4. Technical Considerations It is likely that you will be asked to perform this task again; quarterly or even monthly. You will want everything to be as automated as possible, so that you can rerun the analysis and forecasting using the latest data in just a couple of minutes, and then after a quick review of the newly produced output, spend another 10 minutes updating your report and recommendations. All up, each time you repeat the task should take no more than 10-15 minutes. To that aim, all constraints such as the report date, analysis and forecast periods should be parameterized, and all computations automated. Parameterized dates will also allow for ease of development and testing (use most recent Friday). There should be minimal need for manual computation, and absolutely NO USE of spreadsheeting software such as Excel to perform any calculations. Outside of that, you are free to build your solution in any way you like, using any combination of SQL, Python and Tableau to perform the necessary operations and produce data visualizations. It is recommended that you try to do as much as possible with Tableau and SQL, and as little as possible with Python. Leverage the work done in the previous assignments (especially the SQL from assignment 1 to get cash balance, expenditures, accounts receivable etc.) On the other hand, you MUST at minimum, use Python to create your forecast based on the requirements given. Do not use any of the Tableau forecasting features. You should use Tableau to visualise the forecast data generated by your Python code. CSV data files produced by running Python scripts, that are then imported/linked into Tableau is acceptable and expected, provided those intermediate files are used as is, and not edited before being used in Tableau. Please see Appendix 1 – Chart of Accounts for information about the company’s hierarchy of accounts. Please see Appendix 2 – Database Documentation for the Entity Relationship Diagram and related information about the data that is maintained in the company’s systems. BUSA20001 Visualisation and Data Wrangling Assignment 3 BUSA20001 2020 S2 4 Assignment 3 Marking Guide Weight (%) Simplicity of End-to-End Solution • There is a high degree of automation, with minimal manual processing or intervention required. • The implementation is simple and tidy. The number of files is minimal. The number of data sources is minimal, and they are well named. Data sources do not contain overly redundant elements. Dimensions, measures, and parameters are organised and well named. Worksheet tabs are well organised and well named. 15 Summary Report – Design & Analysis • The sequence and flow of the report is logical and easy to digest, progressing in an orderly way that invites interest rather than mechanical progression. • The final recommendation is easily understood and directly linked to included charts and/or text. The final recommendation is reasonable, allowing for uncertainty in the forecasting model. • The report text is a fair summation or conclusion of the accompanying charts and tables - there are no misinterpretations. • The text is professionally written, and mostly free from spelling mistakes and grammatical errors. Sentences are well-constructed, and of reasonable length. • Placement of text and charts on individual pages assists the reader to assimilate the findings. A single readthrough should be all that is required to understand most of the key messages. 20 Summary Report – Charts and Tables • Highlight findings or conclusions, present them in a readily perceivable way, and not overloaded trying to support multiple purposes or messages. • Accurate, free of computational, aggregational or data importation / filtering errors. • Proportions are accurate and well chosen; intervals are equidistant for interval and ratio data. • 6 to 12-word descriptive titles are well chosen. Subtitles provide additional, useful information where necessary. • Font choices are similar, and text size is hierarchical, readable, and horizontal. • Data are labelled directly and intentionally ordered. • Look and feel is consistent. The colour scheme is intentional and chosen to convey semantic meaning, highlight key patterns or emphasize key elements to focus attention on specific areas. • Free from unneeded decoration; redundant labels; over-precision; distracting grid lines, tick marks or axis lines 30 BUSA20001 Visualisation and Data Wrangling Assignment 3 BUSA20001 2020 S2 5 Assignment 3 Python Notebook – Correctness • The program does what it is required to. • All analysis is produced correctly and free from computational error. • Data is acquired and stored in the correct structure, and free from error. Missing values, where not available or not applicable, are handled properly. • Data anomalies are accounted for. The logic does not make unreasonable assumptions. • Control variables to set parameters each time the program is to be run are together, early in the code, well named and well documented. 15 Python Notebook – Efficiency • Code is free from duplication, in that wherever possible data is fetched, prepared, and transformed just once and reused as needed to generate final outputs. • There is no evidence of copy and paste of code, or multiple sections of code that repeat remarkably similar logic with only minor differences. • Code is separated into blocks with well documented purpose, with data prep and analysis clearly distinguished from formatting and final output. 10 Python Notebook – Planning, Structure and Ease of Understanding • There is evidence of preliminary planning. Introductory mark-up or comments describe the overall purpose, strategy, and structure of the program. • Code and comments form part of a seamless whole. • Comments are used to describe the why, not the how of code, and aid understanding. • Code is a pleasure to read, and easy to understand. • Variables and functions are named concisely, but descriptively. • Thoughtful problem decomposition breaks the problem into independent pieces that can be solved independently and assembled into the final solution. 10 BUSA20001 Visualisation and Data Wrangling Assignment 3 BUSA20001 2020 S2 6 Assignment 3 Appendix 1 – Chart of Accounts Account Group Account Category Account Numbers Balance Sheet – Assets Fixed Assets 1000, 2000, 11000 Accumulated Depreciation 2010, 11010 Current Assets 113300, 140000, 300000, 792000 Balance Sheet – Liabilities Debts & Payables 113101, 160000 Balance Sheet – Equity Common Stock 70000 Balance Sheet – Misc. Clearing Accounts 191100 Income Statement – Income Sales Revenues 800000 Income Statement – Expenses Depreciation 211120, 211130 Manufacturing 400000, 500000, 510000 Logistics 472000, 478100 Financing 476900 Sales and Administration 477001 – 477036, 520000 Consulting 478000 Income Statement – Misc. Perpetual Inventory 893010, 895000 BUSA20001 Visualisation and Data Wrangling Assignment 3 BUSA20001 2020 S2 7 Assignment 3 Appendix 2 – Database Documentation The three customer channels, based on size, are Hypermarkets (10), Grocery Stores (12) and Corner Stores (14). Arrangements will be made such that you will be able to connect to live data in the company systems.