Answer To: DBST 665 Assignment Due date: Sunday November 8th, 2020 · Post one file, .doc or .docx · Put your...
Dilpreet answered on Nov 08 2021
Running Head: Solving Business Problems with Big Data Solutions
SOLVING BUSINESS PROBLEMS WITH BIG DATA SOLUTIONS
Table of Contents
Introduction
The Business Problem
Solving the Business Problem
Handling the Huge Volume of Data
Handling the Variety of Data
Gaining Insights into the Data Through Analytics
Requirements for Solving the Problem
Volume of Data
Variety of Data
Legacy Source Systems to Retrieve the Required Data
Data Warehouse Infrastructure to Solve the Business Problem
The ETL Process and Data Integration to Solve the Business Problem
Metadata Management to Solve the Problem
Techniques to Perform the Analysis
Conclusion
References
Introduction
In the modern business environment, businesses have been transformed by new technologies and the growing use of the internet. This has enabled business organisations across the globe to interact with each other and with their customers, and as a result the amount of data flowing in and out of organisations has increased exponentially. This flow has created a need to manage big data and to develop tools and techniques for analysing it. Big data, in simple words, can be described as a large volume of data, which can be structured or unstructured in nature.
Business organisations use this voluminous data to gain insights through analysis, to make informed decisions and to come up with better strategic moves. Here, an effort has been made to discuss a business problem that can be solved with the help of big data solutions. Analysis will be done to come up with a method that can be used to solve the described business problem. The data required to solve the problem will be discussed, along with the legacy source systems from which it is retrieved. The data warehouse structure that can be used to solve the problem will also be discussed, followed by the ETL process to retrieve and load the data, the methods to integrate the data and the use of metadata.
Figure 1: Big Data
The Business Problem
As interactions between business organisations, and between businesses and their customers, increase day by day, the amount of data that needs to be handled is growing rapidly. Managing voluminous data is therefore turning out to be a major business problem in the modern business environment. The speed at which big data is being created is quickly surpassing the rate at which the computing and storage systems used by these organisations are being developed, which makes this a major problem for businesses across the globe. A huge explosion can be seen in the available data (Nafis & Awang, 2018).
The data that business enterprises can access has increased exponentially. It includes everything from consumer data (personal details, buying patterns, choices and preferences, reactions in different situations and much more) to details of competitors and suppliers. A number of analysts have argued that the amount of unstructured data has been growing by nearly 55 percent to 65 percent every year, and that unstructured data increased from 31% in 2015 to 45% in 2016 and to 90% in 2019. Relevance, volume, quality and usability are turning out to be major issues while handling unstructured data.
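To make the quoted growth rates concrete, the following is a minimal sketch of how a 55 to 65 percent annual increase compounds over time. The 100 TB starting volume and the function name `project_volume` are assumptions chosen for illustration and do not come from the cited analysts.

```python
def project_volume(initial_tb: float, annual_growth: float, years: int) -> float:
    """Compound an initial data volume (in TB) at a fixed annual growth rate."""
    return initial_tb * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    start_tb = 100.0  # hypothetical starting volume, not a figure from the source
    for rate in (0.55, 0.65):  # lower and upper growth rates quoted in the text
        for years in (1, 3, 5):
            volume = project_volume(start_tb, rate, years)
            print(f"growth {rate:.0%}, after {years} year(s): {volume:,.0f} TB")
```

Under these assumptions the stored volume grows roughly 9 to 12 times within five years, which illustrates why storage capacity alone struggles to keep pace.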
Unstructured data gathered from a number of sources is becoming essential for business organisations to evaluate their performance and to gain insights that enhance it. This data, in textual, visual, audio and many other forms, helps businesses to satisfy their customers and to gain an edge over the competition in their industry. However, it has been observed that most businesses across the globe are unable to manage the volume and variety of data flowing into the organisation. In simple words, the volume and variety of data are not in themselves the main problem; managing the unstructured data so as to generate value from it is the major concern.
In recent times, data has been exceeding the amount that business organisations can easily store, compute and retrieve. The availability of such voluminous data is not the matter of concern; rather, the management of it is what concerns business organisations operating in the modern-day business environment. A number of studies have claimed that by the year 2020 the volume of data would grow to a scale that has been likened to 6.6 times the distance between the Moon and Earth. Since this data flows in and out from a number of sources, the number of data formats has increased immensely. Videos, audio, and data flowing from social media platforms and smart devices are a few of the sources to name here (Bhattacharjya, Ellison, Pang & Gezdur, 2018).
Though most business organisations now need unstructured data, its rise is turning out to be a huge problem for organisations across the globe. The rise can be an outcome of the increase in machine learning initiatives, which generate machine data. Unstructured data is a matter of concern because it is difficult to manage in a structured database format. Although this data has its own internal structure, that structure is not predefined through data models. Gathering, organising and storing unstructured data in conventional tools such as SQL databases or Excel is not an easy task. Therefore, business organisations must focus on incorporating tools and technologies that help them make optimum use of the available unstructured data.
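The point that unstructured data has an internal structure that no data model predefines can be illustrated with a short sketch. It assumes a hypothetical free-text customer message and pulls out two fields with regular expressions; the field names, the patterns and the function `extract_record` are illustrative assumptions, not a method described in the sources.

```python
import re

# Illustrative patterns; real messages would need far more robust handling.
ORDER_RE = re.compile(r"order\s*#?\s*(\d+)", re.IGNORECASE)
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_record(raw_text: str) -> dict:
    """Derive a structured record from free text; missing fields become None."""
    order = ORDER_RE.search(raw_text)
    email = EMAIL_RE.search(raw_text)
    return {
        "order_id": int(order.group(1)) if order else None,
        "email": email.group(0) if email else None,
        "length_chars": len(raw_text),
    }


if __name__ == "__main__":
    message = "Hi, my order #4521 arrived late. Reach me at jo@example.com."
    print(extract_record(message))
```

Only the extracted record fits a predefined table; the original message still has to be kept elsewhere, which is part of why conventional databases handle such data poorly.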
Figure 2: Rising Unstructured Data
The entire problem can be summarised as:
· The volume of unstructured data continues its unchecked and uncontrolled growth.
· New varieties of unstructured data keep arising at different points in time.
· Methods to manage and mine unstructured data are absent, so storage capacity is consumed without value being added. It is difficult to retrieve unstructured data once it has been stored in the storage systems.
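The retrieval difficulty in the last point can be eased by recording descriptive tags at the moment an item is stored. The following is a minimal sketch of such a catalogue, assuming hypothetical item identifiers, storage paths and tags; the class `ContentCatalog` is an illustration and is not taken from any product or source cited here.

```python
from collections import defaultdict


class ContentCatalog:
    """Maps descriptive tags to stored items so they can be found again later."""

    def __init__(self) -> None:
        self._locations: dict[str, str] = {}
        self._index: defaultdict[str, set[str]] = defaultdict(set)

    def register(self, item_id: str, location: str, tags: list[str]) -> None:
        """Record where an item is stored and index it under each tag."""
        self._locations[item_id] = location
        for tag in tags:
            self._index[tag.lower()].add(item_id)

    def find(self, *tags: str) -> list[str]:
        """Return the sorted locations of items that carry all the given tags."""
        if not tags:
            return []
        tagged = [self._index.get(tag.lower(), set()) for tag in tags]
        matching = set.intersection(*tagged)
        return sorted(self._locations[item_id] for item_id in matching)


if __name__ == "__main__":
    catalog = ContentCatalog()
    # Hypothetical items and paths, used only for illustration.
    catalog.register("a1", "store/video/a1.mp4", ["video", "support"])
    catalog.register("b2", "store/audio/b2.wav", ["audio", "support"])
    print(catalog.find("video", "support"))
```

The design choice is to index at ingest time, because tags that are never captured cannot be reconstructed cheaply once the volume has grown.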
Solving the Business Problem
The management and proper use of voluminous unstructured data to gain benefits from it is one of the primary challenges faced by business organisations across the globe, and it is therefore essential to pay attention to it and to analyse it in order to come up with solutions. Some of the approaches that can be adopted to solve this business problem are:
Handling the Huge Volume of Data
In a data-driven business environment, an enormous amount of data is generated by business organisations in different forms and types. Though this unstructured data plays a significant role in gaining insights, managing such huge volumes with efficiency and effectiveness is turning out to be a major issue for business organisations. In this case, most of the unstructured data held by organisations is stored in databases. In addition, enterprise content management systems can be used to manage the complete life cycle of the content, handling volumes of web content, document content and other forms of media.
Content management systems consist of methods, tools and strategies for capturing, managing, storing and preserving content, which can then be used to optimise the operations and processes of the business. Enterprise content management systems provide business organisations with facilities such as document management, records management, imaging, workflow management, web content management and collaboration, thereby helping them to manage enormous volumes of unstructured data more efficiently.
Business organisations cannot control the generation of such voluminous unstructured data, because they are becoming increasingly dependent on business analytics, machine learning and artificial intelligence to support business decisions and to come up with strategic moves that lead the business on the path of success. In such a situation, it has become essential for business organisations to adopt methods and tools that let them continuously scale up their storage systems to handle their ever-increasing data needs and meet their business goals. A number of companies have come up with purpose-built appliances to deal with huge volumes of unstructured data so that the data can be used wisely for making effective decisions. These purpose-built appliances are built around protocols such as NFS, S3 and SMB.
Figure 3: Huge Volumes of Unstructured Data
Handling the Variety of Data
Since,...