Please read the instructions carefully before starting the project. This is a commented Jupyter IPython Notebook file in which all the instructions and tasks to be performed are mentioned. · Blanks...

[email protected]


Please read the instructions carefully before starting the project. This is a commented Jupyter IPython Notebook file in which all the instructions and tasks to be performed are mentioned. · Blanks '___' are provided in the notebook that needs to be filled with an appropriate code to get the correct result. With every '___' blank, there is a comment that briefly describes what needs to be filled in the blank space. · Identify the task to be performed correctly, and only then proceed to write the required code. · Fill the code wherever asked by the commented lines like "# write your code here" or "# complete the code". Running incomplete code may throw error. · Please run the codes in a sequential manner from the beginning to avoid any unnecessary errors. · Add the results/observations (wherever mentioned) derived from the analysis in the presentation and submit the same. Any mathematical or computational details which are a graded part of the project can be included in the Appendix section of the presentation. Context An online news portal aims to expand its business by acquiring new subscribers. Every visitor to the website takes certain actions based on their interest. The company plans to analyze these interests and wants to determine whether a new feature will be effective or not. Companies often analyze users' responses to two variants of a product to decide which of the two variants is more effective. This experimental technique is known as a/b testing that is used to determine whether a new feature attracts users based on a chosen metric. Objective Suppose you are hired as a Data Scientist in E-news Express. The design team of the company has created a new landing page. You have been assigned the task to decide whether the new landing page is more effective to gather new subscribers. Suppose you randomly selected 100 users and divided them equally into two groups. The old landing page is served to the first group (control group) and the new landing page is served to the second group (treatment group). Various data about the customers in both groups are collected in 'abtest.csv'. Perform the statistical analysis to answer the following questions using the collected data. 1. Explore the dataset and extract insights using Exploratory Data Analysis. 2. Do the users spend more time on the new landing page than the existing landing page? 3. Is the conversion rate (the proportion of users who visit the landing page and get converted) for the new page greater than the conversion rate for the old page? 4. Does the converted status depend on the preferred language? [Hint: Create a contingency table using the pandas.crosstab() function] 5. Is the time spent on the new page same for the different language users? *Consider a significance level of 0.05 for all tests. The idea behind answering these questions is to decide whether the new page is effective enough to gather new subscribers for the news portal. We will perform the statistical analysis on the collected data to make the business decision. Data Dictionary 1. user_id - This represents the user ID of the person visiting the website. 2. group - This represents whether the user belongs to the first group (control) or the second group (treatment). 3. landing_page - This represents whether the landing page is new or old. 4. time_spent_on_the_page - This represents the time (in minutes) spent by the user on the landing page. 5. converted - This represents whether the user gets converted to a subscriber of the news portal or not. 6. language_preferred - This represents the language chosen by the user to view the landing page.
Mar 17, 2022
SOLUTION.PDF

Get Answer To This Question

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here