-
What does the function
seq(1, 10, by = 2)produce in R?- A) 1, 2, 3, 4, …, 10
- B) 1, 3, 5, 7, 9
- C) 1, 4, 7, 10
- D) 2, 4, 6, 8, 10
-
Which of the following is not a valid way to create a vector in R?
- A)
c(2, 5, 7) - B)
1:5 - C)
vector(3) - D)
array(1, 3)
- A)
-
What is the result of
length(c(4, 9, 2, NA, 7))?- A) 4
- B) 5
- C) NA
- D) Error
-
Given
x <- c(10, 20, 30), what isx * 2?- A)
c(20, 40, 60) - B)
c(10, 20, 30, 10, 20, 30) - C)
c(12, 24, 36) - D) Error
- A)
-
What does
NArepresent in R?- A) The string
"NA" - B) Missing data / Not Available
- C) A logical FALSE
- D) Zero
- A) The string
-
What is the output type of
mean(c(1, 2, 3, 4))?- A) integer
- B) numeric (double)
- C) character
- D) logical
-
Which operator in R is used for elementwise logical AND?
- A)
& - B)
&& - C)
| - D)
||
- A)
-
What is the difference between
&and&&?- A)
&works only with scalars,&&is vectorized - B)
&does elementwise logical AND,&&only returns a single TRUE/FALSE (with short-circuit) - C) They are identical
- D)
&&works only on numeric vectors
- A)
-
What does the function
is.na()do?- A) Tests if a value is not numeric
- B) Tests if a value is
NA(missing) - C) Tests if a value is
NaN - D) Converts a value to NA
-
Suppose
x <- c(1, 2, NA, 4). What is the result ofsum(x)?- A) 7
- B) NA
- C) 8
- D) Error
-
How can you instruct
sum()to ignoreNAvalues?- A)
sum(x, remove = TRUE) - B)
sum(x, na.rm = TRUE) - C)
sum(x, na = TRUE) - D)
sum(x, ignore.na = TRUE)
- A)
-
Which of the following is not a mode (or atomic type) in R?
- A) numeric
- B) logical
- C) factor
- D) character
-
Suppose you have
y <- c("a", "b", "c"). What isy[2]?- A)
"a" - B)
"b" - C)
"c" - D)
NA
- A)
-
What happens if you access an index out of bounds, e.g.
y[10]for a vector of length 3?- A) Error
- B)
NULL - C)
NA - D) The vector recycles
-
Which function gives the unique values of a vector?
- A)
unique() - B)
distinct() - C)
uniq() - D)
levels()
- A)
-
What does
factor()do?- A) Converts a numeric vector to binary
- B) Converts a vector into a factor (categorical) type
- C) Returns factorials
- D) Converts character to numeric
-
What is the output of
as.numeric(factor(c("a", "b", "a")))?- A)
c(1, 2, 1) - B)
c("a", "b", "a") - C)
c(0, 1, 0) - D)
c("1", "2", "1")
- A)
-
What does
data.frame()create?- A) A matrix
- B) A list
- C) A tabular structure with equal-length columns
- D) A vector
-
If
df <- data.frame(a = c(1,2), b = c("x","y")), what isdf$a?- A) A data frame
- B) The column named “a” as a vector
- C) The entire data frame
- D) Error
-
What does the
str()function do when given an R object?- A) Prints only the names of the object
- B) Gives the structure (internal representation) of the object
- C) Summarizes statistical properties
- D) Converts object to string
- Given this R snippet, what is the output? Explain step by step.
x <- c(5, NA, 10, 15)
mean_x <- mean(x)
total <- sum(x, na.rm = TRUE)
c(mean_x, total)
- Identify the bug / error in the following R code, and suggest a fix.
v <- c(2, 4, 6, 8)
w <- c(1, 2)
result <- v + w
print(result)-
Write a short program in R that takes a numeric vector
vand returns a vector of the same length where each entry is TRUE if the corresponding entry inv`` is above the mean ofv` (ignoring NAs), and FALSE otherwise. -
Given the following code, describe what it does (in plain English):
df <- data.frame(id = 1:5, score = c(10, 15, NA, 20, 18))
df$above_avg <- df$score > mean(df$score, na.rm = TRUE)
subset(df, above_avg == TRUE)- What will be the result (or error) of this code? Explain.
x <- factor(c("low","medium","high","low"))
as.numeric(x) + 1
levels(x)- Suppose
A <- matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE)
B <- matrix(c(2, 0, 1, 2), nrow = 2, byrow = TRUE)
A %*% BWhat is the result of A %*% B?
- A) matrix(c(4, 4, 10, 8), nrow = 2, byrow = TRUE)
- B) matrix(c(2, 2, 6, 8), nrow = 2, byrow = TRUE)
- C) matrix(c(4, 4, 8, 10), nrow = 2, byrow = TRUE)
- D) Error (dimensions do not match)
27/ Which operator is used for matrix multiplication in R?
- A) *
- B) %*%
- C) .
- D) crossprod()
- Given:
M <- matrix(1:6, nrow = 2, ncol = 3)
N <- matrix(1:6, nrow = 3, ncol = 2)
M %*% NWhat is the dimension of the result?
- A) 2×2
- B) 3×3
- C) 2×3
- D) Error
- Suppose:
X <- matrix(c(1, 2, 3, 4), nrow = 2)
Y <- matrix(c(5, 6), nrow = 2)
X %*% YWhat happens?
- A) 2×1 matrix result
- B) 1×2 matrix result
- C) Error (non-conformable arguments)
- D) A scalar
- Write pseudo-code for multiplying two matrices A (m×n) and B (n×p) to produce matrix C (m×p). Clearly specify the loop structure you would use. You do not need to write actual R code, just outline the algorithm in clear steps.
-
In the linear regression model
$y = X\beta + \epsilon$ , which of the following is the closed-form solution for$\hat\beta$ ?-
A)
$X^\top y$ -
B)
$(X^\top X)^{-1} X^\top y$ -
C)
$(X X^\top)^{-1} y$ -
D)
$X (X^\top y)^{-1}$
-
-
Under the standard linear regression assumptions, which of the following is true about bias and variance of
$\hat\beta$ ?-
A)
$\hat\beta$ is unbiased, variance depends on$\sigma^2 (X^\top X)^{-1}$ -
B)
$\hat\beta$ is biased, variance is always zero -
C)
$\hat\beta$ is unbiased, variance does not depend on$X$ -
D)
$\hat\beta$ is biased, variance depends only on sample size
-
-
In R, which of the following correctly computes the OLS estimate?
-
A) beta_hat <- solve(t(X) %% X) %% t(X) %*% y
-
B) beta_hat <- X %% t(X) %% y
-
C) beta_hat <- lm(X, y)
-
D) beta_hat <- solve(X) %*% y
-
-
Explain in words what the bias-variance tradeoff means in the context of linear regression.
-
Suppose you fit a regression model in R using:
model <- lm(y ~ x1 + x2, data = df)
summary(model)Explain the following concepts:
The coefficient estimate of x1.
The p-value associated with the coefficient estimate of x1.
- The following is the output of a summary function:
Call:
lm(formula = y ~ x1 + x2, data = df)
Residuals:
Min 1Q Median 3Q Max
-2.345 -0.876 -0.123 0.754 2.567
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.2350 0.4321 2.86 0.005**
x1 0.5678 0.0987 5.75 1.2e-07***
x2 -0.2345 0.1123 -2.09 0.039*
Residual standard error: 1.05 on 96 degrees of freedom
Multiple R-squared: 0.642, Adjusted R-squared: 0.631
F-statistic: 58.4 on 2 and 96 DF, p-value: < 2.2e-16Questions:
- How would you interpret the coefficient of x1?
- Is x2 statistically significant at the 5% level? Why or why not?
- How do you interpret the F-test and its result?