Step 1 Please explain in a paragraph what is a No SQL database specifically Mongo DB and Neo4J u Step 2 Please explain the difference between Hadoop and Spark and provide examples of each one. How do...

BIAM 530 WEEK 8: FINAL COURSE PROJECT INSTRUCTIONS

SOFTWARE
Please access all software for the Lab at: https://labs.azure.com

SCENARIO
You have been tasked to assist the CEO in Big Data and NoSQL projects at work. You did such an awesome job for the CEO that you are now being asked to present a briefing to management on how NoSQL databases work in the company and help to improve the Big Data environment. You are expected to report back to the CEO with answers to the questions the CEO has posed below.

STEPS

Step 1: Explain what a NoSQL database is, focusing specifically on MongoDB and Neo4J. You can refer to Table 14.3 below from your textbook as reference.

Step 2: Please explain the difference between Hadoop and Spark and provide examples of each one. How do these tools assist in the Big Data concept? You may refer to Figure 14.6 below from the textbook.

[Figure 14.6 from the textbook; recoverable labels: Dashboards, MapReduce Simplification Applications]

Step 3: Research Hadoop and MongoDB/RoboMongo and answer the following questions:

  • Would you recommend using MongoDB to a CEO? Justify your response.

  • Would you recommend an alternative to MongoDB to a CEO? Justify your response.

Step 4: Research Big Data Storage and Management and answer the following questions:

  • List two Big Data Solutions and explain how they work and what the pros and cons are.

  • Evaluate the cost of the two Big Data Solutions you researched and make a recommendation for the CEO on which one you would use.

Step 5: Please skip this step. Click on the Neural Tools 7.6 software icon in Azure Labs and use Neo4J (Graphical Database) for completing the following tasks:

  • Upload the file Ch14_FCC.txt (Food Critics Database). The file is located in the Files Tab under Course Project Files or in Course Project Overview.

  • Please click on the links below for tutorials for NeuralTools:
    https://www.youtube.com/watch?v=JlligEJijng
    https://www.youtube.com/watch?v=aircAruvnKk

  • Refer to Figure 14.13 below from the textbook and to pages 686-687 in the textbook, and perform the following tasks:
    a. Creating Nodes
    b. Retrieving Node Data
    c. Retrieving Relationship Data

Step 6: Complete a Comparative Analysis for MongoDB and Neo4J and discuss the differences and similarities of both databases.

Step 7: Would you recommend using MongoDB or Neo4J to the CEO? Justify your recommendation. Write at least 2 paragraphs to summarize the information that you have learned. You may use either version of Microsoft Word (2016 or 2019) in Azure Labs.

Step 8: Present the Final Report, including Charts and Findings, to the CEO.

Week 8 Final Project Rubric

Category | Description | Points
Step 1 | Please explain in a paragraph what a NoSQL database is, specifically MongoDB and Neo4J, using Table 14.3 in the textbook. | 40
Step 2 | Please explain the difference between Hadoop and Spark and provide examples of each one. How do these tools assist in the Big Data concept? You may refer to Figure 14.6 below from the textbook or use outside references to substantiate your answer. | 40
Step 3 | Research Hadoop and MongoDB/RoboMongo and answer the following questions: Would you recommend using MongoDB to a CEO? Justify your response. Would you recommend an alternative to MongoDB to a CEO? Justify your response. | 10
Step 4 | Research Big Data Storage and Management and answer the following questions: List two Big Data Solutions and explain how they work and what the pros and cons are. Evaluate the cost of the two Big Data Solutions you researched and make a recommendation for the CEO on which one you would use. | 10
Step 5 | Click on the Neural Tools 7.6 software icon in Azure Labs and use Neo4J (Graphical Database) to complete tasks. | 10
Step 6 | Complete a Comparative Analysis for MongoDB and Neo4J and discuss the differences and similarities of both databases. | 40
Step 7 | Would you recommend using MongoDB or Neo4J to the CEO? Justify your recommendation. | 40
Step 8 | Report to CEO | 40
TOTAL POINTS | | 230

Answered 3 days after Jun 21, 2021


Neha answered on Jun 25 2021
Step 1
NoSQL stands for "not only SQL". NoSQL databases store data in non-tabular formats rather than in the rows and columns of relational tables. They are classified by data model, and the major types are wide-column, graph, document, and key-value. They offer flexible schemas, scale up easily to large amounts of data, and handle high user loads. When people use the term NoSQL database, they typically mean any non-relational database (Macak, M., Stovcik, M., Buhnova, B., & Merjavy, M). MongoDB and Neo4j are two examples of NoSQL databases.
MongoDB is an open-source document database and the leading NoSQL database. It is written in C++ and is a cross-platform, document-oriented database known for high performance, easy scalability, and high availability. It is built on the concepts of documents and collections: a collection holds a set of documents, and the size, number of fields, and content can differ from one document to another.
Each object has a clear structure within a single document, so no complex joins are needed. MongoDB supports deep, dynamic queries over documents through a document-based query language that is comparable in power to SQL. It is easy to scale and needs no mapping or conversion between application objects and database objects. It keeps the working set in internal memory, which enables faster data access.
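The document and collection idea can be illustrated with a minimal Python sketch. It is a toy stand-in that uses plain lists and dictionaries, not the MongoDB API; the `restaurants` data and the `find` helper are hypothetical names chosen for illustration only.

```python
# Toy stand-in for a document collection: a plain list of dicts.
# This is NOT the MongoDB API; `restaurants` and `find` are hypothetical names.
restaurants = [
    {"_id": 1, "name": "Blue Door", "cuisine": "Thai", "rating": 4.5},
    # Documents in one collection may carry different fields, including nested ones.
    {"_id": 2, "name": "Casa Luna", "cuisine": "Mexican",
     "address": {"city": "Austin", "zip": "78701"}},
    {"_id": 3, "name": "Noodle Bar", "cuisine": "Thai",
     "tags": ["takeout", "late-night"]},
]


def find(collection, criteria):
    """Return documents whose top-level fields equal every key/value in `criteria`."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# A query-by-example lookup: no schema and no joins are involved.
thai = find(restaurants, {"cuisine": "Thai"})
thai_names = [doc["name"] for doc in thai]
print(thai_names)
```

Each restaurant carries its own fields inside one document, so the lookup reads a single collection and never has to combine rows from several tables.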
Neo4j is another NoSQL database and a popular graph database. It uses the Cypher query language (CQL) and is written in Java. A graph is a pictorial representation of a set of objects in which some pairs of objects are connected by links. It is composed of two elements: nodes (vertices) and relationships (edges).
A graph database models data as a graph: the nodes represent entities, and the relationships represent the associations between those nodes. Neo4j provides a flexible yet powerful data model that can be adapted to the industry or application. It returns results based on real-time data and, with its transactional guarantees, is highly available for large enterprise real-time applications.
Neo4j can represent semi-structured and connected data, and it typically retrieves connected data faster than databases that must join tables to do so. It provides a declarative query language whose syntax depicts the graph visually. The commands are human-readable and easy to learn. It needs no complex joins, because an adjacent node or relationship can be retrieved directly, without indexes or joins.
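The point about reaching adjacent nodes without joins can be sketched in a few lines of Python. This is a toy in-memory graph, not Neo4j or Cypher; the member and restaurant data, the relationship names, and the `follow` helper are all hypothetical.

```python
# Toy property graph: each node stores its own outgoing relationships,
# so a traversal just follows references instead of joining tables.
# Not the Neo4j API; all names and data here are hypothetical.
nodes = {
    "m1": {"name": "Ana", "out": [("REVIEWED", "r1"), ("FRIEND_OF", "m2")]},
    "m2": {"name": "Raj", "out": [("REVIEWED", "r1"), ("REVIEWED", "r2")]},
    "r1": {"name": "Blue Door", "out": []},
    "r2": {"name": "Casa Luna", "out": []},
}


def follow(node_id, rel_type):
    """Return the ids of nodes reached from `node_id` over relationships of `rel_type`."""
    return [target for rel, target in nodes[node_id]["out"] if rel == rel_type]


# One hop: restaurants Raj reviewed.
reviewed_by_raj = [nodes[r]["name"] for r in follow("m2", "REVIEWED")]

# Two hops: restaurants reviewed by Ana's friends.
friend_picks = sorted({
    nodes[r]["name"]
    for friend in follow("m1", "FRIEND_OF")
    for r in follow(friend, "REVIEWED")
})
print(reviewed_by_raj, friend_picks)
```

The two-hop question ("what did my friends review?") is answered by walking from node to node, which is the access pattern a graph database is designed around.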
Step 2
Hadoop started in 2006 at Yahoo and became a top-level Apache open-source project. It is a general-purpose framework for distributed processing with several components: the Hadoop Distributed File System (HDFS), which stores files in a Hadoop-native format and parallelizes them across the cluster; YARN, the scheduler that coordinates application runtimes; and MapReduce, the algorithm that processes the data in parallel.
Hadoop is built in Java and is accessible from many programming languages, so MapReduce code can be written in languages other than Java. It is available as open source or through commercial vendors.
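The MapReduce idea can be sketched as a word count in plain Python. This is a single-process illustration of the map, shuffle, and reduce phases, not Hadoop code; the function names and the sample `splits` are hypothetical, and on a real cluster each split would be handled by a different machine.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


# Hypothetical input splits; a cluster would map these in parallel.
splits = ["big data big clusters", "data in clusters", "big ideas"]

mapped = [pair for line in splits for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because each call to `map_phase` depends only on its own split, and each call to `reduce_phase` only on its own key, both phases can be spread across many machines.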
Spark is a newer project: it began in 2009 at UC Berkeley's AMPLab and became a top-level Apache project in 2014. Like Hadoop, it focuses on processing data in parallel across a cluster, but the major difference is that it works in memory. Hadoop reads and writes files to HDFS, whereas Spark processes data in RAM using a concept known as the Resilient Distributed Dataset (RDD).
Spark can run in stand-alone mode, in conjunction with Mesos, or on a Hadoop cluster that serves as the data source. It is built around Spark Core, the engine that drives scheduling, optimization, and the RDD abstraction, and that connects Spark to the appropriate file system. Several libraries operate on top of Spark Core, and they allow us to run SQL-like commands on distributed data sets (Samadi, Y., Zbakh, M., & Tadonki, C).
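The in-memory, lineage-based style of processing can be sketched with a toy class. This is a single-process illustration under loud assumptions: `ToyRDD` is a hypothetical name, it is not the PySpark API, and it is neither distributed nor fault tolerant. It only shows that transformations are recorded first and computed in memory when a result is requested.

```python
class ToyRDD:
    """Single-process stand-in for an RDD: records its lineage, computes lazily.

    Hypothetical class for illustration; not the PySpark API.
    """

    def __init__(self, source, steps=()):
        self._source = source  # original in-memory data
        self._steps = steps    # lineage: the transformations to replay

    def map(self, fn):
        # Transformations return a new dataset; nothing is computed yet.
        return ToyRDD(self._source, self._steps + (("map", fn),))

    def filter(self, fn):
        return ToyRDD(self._source, self._steps + (("filter", fn),))

    def collect(self):
        # The action replays the lineage entirely in memory,
        # with no intermediate files written between steps.
        data = list(self._source)
        for kind, fn in self._steps:
            if kind == "map":
                data = [fn(x) for x in data]
            else:
                data = [x for x in data if fn(x)]
        return data


readings = ToyRDD([3, 8, 15, 22, 42])
big_squares = readings.map(lambda x: x * x).filter(lambda x: x > 100)
result = big_squares.collect()
print(result)
```

Since every dataset keeps the list of steps that produced it, a lost result can be rebuilt by replaying those steps on the source data, which is the intuition behind the word "resilient".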
Hadoop is a good fit when we need linear processing over a huge data set. Hadoop MapReduce processes huge amounts of data in parallel by breaking the large...