View more »
Answered 8 days AfterMay 11, 2023

Answer To: pdf

Sumit Kumar answered on May 19 2023
43 Votes
Answer at least 13 of the first 20 questions and all of the questions in the
Required Questions section below.
Question 1: - How many users are there?
File: datascience.stackexchange.db & stats.stackexchange.db: -
SQL Query: - select count (*) from users;
R Language: -
result <- dbGetQuery(con1,"select count(*) from users")
totaluser <- result[1,1]
print(totaluser)
Question 2: - How many users joined since 2020? (Hint: Convert the CreationDate to a year.)
File: datascience.stackexchange.db & stats.stackexcha
nge.db: -
SQL: - SELECT COUNT(*) AS TotalUsers
...> FROM users
...> WHERE strftime('%Y', CreationDate) >= '2020';
R: -
result <- dbGetQuery(con1,"SELECT COUNT(*) AS TotalUsers FROM users WHERE strftime('%Y', CreationDate) >= '2020';")
print(result[1,1])
Question 3: - How many users joined each year? Describe this with a plot, commenting on any anomalies.
SQL: - SELECT strftime('%Y', CreationDate) AS Year, COUNT(*) AS TotalUsers
...> FROM users
...> GROUP BY Year;
R: -
library(RSQLite)
library(DBI)
library(ggplot2)
query <- "SELECT strftime('%Y', CreationDate) AS Year, COUNT(*) AS TotalUsers FROM users GROUP BY Year"
result <- dbGetQuery(con1, query)
For Plotting
ggplot(result, aes(x = Year, y = TotalUsers)) +
geom_bar(stat = "identity", fill = "steelblue") +
labs(x = "Year", y = "Total Users", title = "Number of Users Joined Each Year") +
theme_minimal()
Question 4: - How many different types of posts are there in the Posts table? Get the description of the types from the
PostTypeIdMap table. In other words, create a table with the description of each post type and the number
of posts of that type, and arrange it from most to least occurrences.
SQL: - SELECT p.value AS TypeDescription, COUNT(*) AS NumPosts
...> FROM Posts
...> JOIN PostTypeIdMap AS p ON Posts.PostTypeId = p.id
...> GROUP BY p.value
...> ORDER BY NumPosts DESC;
R: -
> query <- "SELECT p.value AS TypeDescription, COUNT(*) AS NumPosts
+ FROM Posts
+ JOIN PostTypeIdMap AS p ON Posts.PostTypeId = p.id
+ GROUP BY p.value
+ ORDER BY NumPosts DESC"
> result <- dbGetQuery(con1, query)
> print(result)
Question 5: - How many posted questions are there?
SQL: -
SELECT COUNT(*) AS NumQuestions
FROM Posts
WHERE PostTypeId = (SELECT id FROM PostTypeIdMap WHERE value = 'Question');
R: -
query <- "SELECT COUNT(*) AS NumQuestions
+ FROM Posts
+ WHERE PostTypeId = (SELECT id FROM PostTypeIdMap WHERE value = 'Question')"
result <- dbGetQuery(con1, query)
print(result$NumQuestions)
Question 6: - What are the top 50 most common tags on questions? For each of the top 50 tags on questions, how many
questions are there for each tag.
SQL: -
SELECT Tags, COUNT(*) AS NumQuestions
FROM Posts
WHERE PostTypeId = (SELECT id FROM PostTypeIdMap WHERE value = 'Question')
GROUP BY Tags
ORDER BY NumQuestions DESC
LIMIT 50 ;
R: -
query <- "SELECT Tags, COUNT(*) AS NumQuestions
FROM Posts
WHERE PostTypeId = (SELECT id FROM PostTypeIdMap WHERE value = 'Question')
GROUP BY Tags
ORDER BY NumQuestions DESC
LIMIT 50"
result <- dbGetQuery(con1, query)
print(result)
Question 7: - How many tags are in most questions?
SQL: -
SELECT COUNT(*) AS NumTags
FROM (
SELECT Tags
FROM Posts
WHERE PostTypeId = (SELECT id FROM PostTypeIdMap WHERE value = 'Question')
GROUP BY Tags
ORDER BY COUNT(*) DESC
LIMIT 1
) AS MostCommonTags;
R: -
query <- "SELECT COUNT(*) AS NumTags
FROM (
SELECT Tags
FROM Posts
WHERE PostTypeId = (SELECT id FROM PostTypeIdMap WHERE value = 'Question')
GROUP BY Tags
ORDER BY COUNT(*) DESC
LIMIT 1
) AS MostCommonTags"
result <- dbGetQuery(con1, query)
print(result$NumTags)
Question 11. Following from the previous questions, for the 10 users who posted the most questions, how many gold,
silver and bronze badges does each of these 10 individuals have?
SQL: -
SELECT u.DisplayName, COUNT(CASE WHEN b.Name = 'gold' THEN 1 END) AS GoldBadges,
COUNT(CASE WHEN b.Name = 'silver' THEN 1 END) AS SilverBadges,
COUNT(CASE WHEN b.Name = 'bronze' THEN 1 END) AS BronzeBadges
FROM Users AS u
JOIN Badges AS b ON u.Id = b.UserId
JOIN ( SELECT OwnerUserId, COUNT(*) AS NumQuestions
FROM Posts
WHERE PostTypeId = (SELECT id FROM PostTypeIdMap WHERE value = 'Question')
GROUP BY OwnerUserId
ORDER BY NumQuestions DESC
LIMIT 10
) AS top10users ON u.Id = top10users.OwnerUserId
GROUP BY u.Id, u.DisplayName ;
R: -
query <- "SELECT u.DisplayName, COUNT(CASE WHEN b.Name = 'gold' THEN 1 END) AS GoldBadges,
+ COUNT(CASE WHEN b.Name = 'silver' THEN 1 END) AS SilverBadges,
+ COUNT(CASE WHEN b.Name = 'bronze' THEN 1 END) AS BronzeBadges
+ FROM Users AS u
+ JOIN Badges AS b ON u.Id = b.UserId
+ JOIN (
+ SELECT OwnerUserId, COUNT(*) AS NumQuestions
+ FROM Posts
+ WHERE PostTypeId = (SELECT id FROM PostTypeIdMap WHERE value = 'Question')
+ GROUP BY OwnerUserId
+ ORDER BY NumQuestions DESC
+ LIMIT 10
+ ) AS top10users ON u.Id = top10users.OwnerUserId
+ GROUP BY u.Id,...
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here
April
January
February
March
April
May
June
July
August
September
October
November
December
2025
2025
2026
2027
SunMonTueWedThuFriSat
30
31
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
1
2
3
00:00
00:30
01:00
01:30
02:00
02:30
03:00
03:30
04:00
04:30
05:00
05:30
06:00
06:30
07:00
07:30
08:00
08:30
09:00
09:30
10:00
10:30
11:00
11:30
12:00
12:30
13:00
13:30
14:00
14:30
15:00
15:30
16:00
16:30
17:00
17:30
18:00
18:30
19:00
19:30
20:00
20:30
21:00
21:30
22:00
22:30
23:00
23:30