Free SOA Exam SRM (Statistics for Risk Modeling) Unsupervised Learning Techniques Practice Questions
Explore unsupervised learning techniques for Exam SRM. Questions cover principal component analysis, k-means and hierarchical clustering, and dimensionality reduction methods.
Sample Questions
Question 1
Easy
The elbow method is used in clustering to determine:
Solution
The elbow method plots the within-cluster sum of squares (WCSS) against different values of K, the number of clusters. The appropriate K is identified at the 'elbow': the point beyond which increasing K yields diminishing returns in reducing WCSS.
(C) is correct: The elbow method helps select the number of clusters for K-means.
(A) is incorrect: While scree plots for PCA use a similar visual approach, the elbow method specifically refers to the K-means context of plotting WCSS vs. K.
(E) is incorrect: The elbow method does not address linkage method selection.
(D) is incorrect: The method examines within-cluster sum of squares, not variance proportions per cluster.
(B) is incorrect: The elbow method does not compare clustering algorithms.
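The WCSS-versus-K curve described in this solution can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `kmeans_wcss_1d`, the toy data, and the way the starting centroids are chosen are all assumptions made for the example.

```python
def kmeans_wcss_1d(points, k, max_iters=100):
    """Run Lloyd's algorithm on 1-D data and return the final WCSS."""
    pts = sorted(points)
    n = len(pts)
    # Assumed initialization: start centroids spread across the sorted data.
    centroids = [pts[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = [0] * n
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: (x - centroids[j]) ** 2) for x in pts]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [x for x, lab in zip(pts, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return sum((x - centroids[lab]) ** 2 for x, lab in zip(pts, labels))


# Hypothetical data with three visually obvious groups.
data = [0.8, 1.0, 1.2, 4.7, 5.0, 5.3, 8.9, 9.0, 9.1]

# WCSS for K = 1, 2, 3, 4; the elbow is where the drops become small.
wcss_by_k = {k: kmeans_wcss_1d(data, k) for k in range(1, 5)}
for k, value in wcss_by_k.items():
    print(k, round(value, 3))
```

With this toy data the drop in WCSS from K = 2 to K = 3 is large and the drop from K = 3 to K = 4 is small, so the elbow sits at K = 3. Lloyd's algorithm only finds a local optimum, so in practice it is usually run from several starting points.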
Question 2
Medium
When performing PCA, why is it important that the loading vectors (eigenvectors) have unit length?
Solution
Without the unit-length constraint, the problem of maximizing the projected variance would have no solution: the loadings could be scaled arbitrarily large, making the projected variance grow without bound. The unit-length constraint makes the optimization well-defined, and its solution is the leading eigenvector of the covariance matrix, unique up to sign when the largest eigenvalue is not repeated.
(D) is correct: The unit-length constraint makes the optimization problem well-defined, with a solution that is unique up to sign.
(E) is incorrect: The unit-length constraint enables the solution; it does not by itself cause maximum variance.
(C) is incorrect: Eigenvalues of a positive semi-definite matrix are non-negative regardless of the constraint.
(A) is incorrect: The unit-length constraint does not determine the signs of loadings.
(B) is incorrect: Eigenvalues are generally unequal; each PC does not explain equal variance.
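The scaling argument can be checked numerically. The sketch below assumes a hypothetical 2x2 covariance matrix `S` (illustrative values only) and uses power iteration, a technique chosen here for simplicity, to find the unit-length loading vector. It then shows that multiplying the loadings by a constant c multiplies the projected variance by c squared, which is why the unconstrained problem has no maximum.

```python
import math

# Hypothetical 2x2 sample covariance matrix (illustrative values only).
S = [[4.0, 2.0],
     [2.0, 3.0]]


def projected_variance(phi, cov):
    """Variance of the data projected onto phi, computed as phi^T cov phi."""
    return sum(phi[i] * cov[i][j] * phi[j] for i in range(2) for j in range(2))


def leading_loading(cov, iters=200):
    """Power iteration: returns a unit-length leading eigenvector of cov."""
    v = [1.0, 0.0]
    for _ in range(iters):
        w = [cov[0][0] * v[0] + cov[0][1] * v[1],
             cov[1][0] * v[0] + cov[1][1] * v[1]]
        norm = math.hypot(w[0], w[1])
        v = [w[0] / norm, w[1] / norm]  # re-impose the unit-length constraint
    return v


phi = leading_loading(S)
var_unit = projected_variance(phi, S)  # equals the largest eigenvalue of S

# Dropping the constraint: scaling the loadings by c scales the variance by c**2.
c = 10.0
var_scaled = projected_variance([c * x for x in phi], S)

print(round(var_unit, 4), round(var_scaled, 4))
```

Any other unit-length direction, such as `[1.0, 0.0]`, gives a projected variance no larger than `var_unit`, while `var_scaled` can be made as large as desired by increasing `c`.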
Question 3
Hard
K-means with K = 2 is applied to 6 one-dimensional observations. After convergence, cluster 1 contains and cluster 2 contains . What is the total within-cluster sum of squares (WCSS)?
Solution
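For each cluster, the centroid is the mean of the cluster's observations, and the cluster's WCSS is the sum of the squared deviations of those observations from that centroid.
Cluster 1 WCSS: the sum of squared deviations of the cluster 1 observations from the cluster 1 centroid.
Cluster 2 WCSS: the sum of squared deviations of the cluster 2 observations from the cluster 2 centroid.
Total WCSS = Cluster 1 WCSS + Cluster 2 WCSS.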
(B) is correct.
(A) is incorrect: It uses the overall mean (7.0) instead of the cluster centroids, which gives the total sum of squares (TSS), not the WCSS.
(C) is incorrect: It takes the square root of each cluster's sum of squares before adding, which has no statistical meaning.
(D) is incorrect: It divides by cluster size, computing the average squared deviation rather than the total.
(E) is incorrect: It uses absolute deviations within each cluster instead of squared deviations.
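The difference between WCSS and TSS can be made concrete with a short sketch. The two clusters below are hypothetical values chosen for illustration and are not the observations from this question; the helper names `mean` and `sum_sq` are likewise assumptions for the example.

```python
def mean(xs):
    return sum(xs) / len(xs)


def sum_sq(xs, center):
    """Sum of squared deviations of xs from center."""
    return sum((x - center) ** 2 for x in xs)


# Hypothetical clusters (NOT the data from the question).
cluster_1 = [1.0, 2.0, 3.0]
cluster_2 = [8.0, 9.0, 10.0]
clusters = [cluster_1, cluster_2]

# WCSS: deviations are measured from each cluster's own centroid.
wcss = sum(sum_sq(c, mean(c)) for c in clusters)

# TSS: deviations are measured from the overall mean of all observations.
all_points = cluster_1 + cluster_2
overall_mean = mean(all_points)
tss = sum_sq(all_points, overall_mean)

# Between-cluster sum of squares, so that TSS = WCSS + BSS.
bss = sum(len(c) * (mean(c) - overall_mean) ** 2 for c in clusters)

print(wcss, tss, bss)  # 4.0 77.5 73.5
```

Measuring deviations from the overall mean rather than the cluster centroids yields the TSS, which is never smaller than the WCSS because it also includes the between-cluster component.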