Free SOA Exam SRM (Statistics for Risk Modeling) Decision Trees Practice Questions
Master decision tree methods for Exam SRM, including classification and regression trees, random forests, boosting, and bagging. Questions test both algorithm mechanics and practical interpretation.
Sample Questions
Question 1
Easy
Random forests improve upon bagging by:
Solution
The key innovation of random forests over bagging is the random selection of a subset of predictors at each split point. This prevents dominant predictors from being used in every tree, thereby reducing the correlation between trees. Since the variance of an average of correlated quantities depends on the correlation, decorrelation leads to greater variance reduction.
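The variance claim can be made explicit. For B identically distributed tree predictions, each with variance sigma-squared and pairwise correlation rho, a standard result (the one used in the ISLR/ESL treatment of random forests) gives:

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} \hat{f}_b(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
```

As B grows, the second term vanishes, so the correlation rho sets the floor on achievable variance reduction; decorrelating the trees lowers that floor.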
Choice A is incorrect because both methods typically grow full-depth trees.
Choice B is incorrect because random forests, like bagging, fit trees independently (fitting them sequentially describes boosting).
Choice C is incorrect because both use standard bootstrap samples of size n, the number of training observations.
Choice E is incorrect because both can use the same splitting criteria.
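As an illustration (not part of the exam material), scikit-learn's RandomForestRegressor exposes exactly this knob: setting max_features=None makes every predictor a split candidate at every node, which reduces the procedure to bagging, while a fractional max_features gives a random forest.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Toy regression data
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Bagging: every predictor is considered at every split
bagging = RandomForestRegressor(n_estimators=50, max_features=None,
                                random_state=0).fit(X, y)

# Random forest: only a random subset (here 1/3 of the predictors) is
# considered at each split, which decorrelates the trees
forest = RandomForestRegressor(n_estimators=50, max_features=1/3,
                               random_state=0).fit(X, y)
```

Both ensembles average independently grown trees; only the per-split candidate set differs.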
Question 2
Medium
Which statement about decision trees and outliers is correct?
Solution
Decision trees split by comparing predictor values to thresholds. The split only depends on whether an observation falls above or below the cutpoint, not on how far it is from the cutpoint. This makes trees inherently robust to outliers in the predictor space. An extreme value of a predictor simply falls to one side of a split just like any other value on that side.
(A) is wrong because trees can handle data containing outliers without any preprocessing. (B) is wrong because outliers are routed through the tree based on their predictor values, not grouped together. (C) is wrong because trees do not remove any observations; they use all the data. (E) is wrong because the threshold-based nature of splits provides natural outlier robustness.
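A quick sketch of this robustness (illustrative, not from the question): once a predictor value exceeds every split threshold learned from the training data, making it arbitrarily more extreme cannot change which leaf it reaches, so the prediction is unchanged.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))                      # predictor in [0, 10]
y = (X[:, 0] > 5).astype(float) + rng.normal(0, 0.1, 200)  # step function + noise

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

# Every split threshold lies within the observed predictor range, so any
# value above the training maximum follows the same rightmost path:
moderate = tree.predict([[11.0]])   # just beyond the data
extreme = tree.predict([[1.0e6]])   # a wild outlier
assert moderate[0] == extreme[0]    # identical prediction
```

The same argument applies below the training minimum: only the ordering relative to the cutpoints matters, not the distance from them.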
Question 3
Hard
A gradient boosting model for regression uses absolute error loss L(y, ŷ) = |y − ŷ|. For a terminal node containing five observations with residuals r1, …, r5, the optimal leaf value (the value that minimizes the total absolute error within the node) is:
Solution
When the loss function is absolute error, L(y, ŷ) = |y − ŷ|, the constant c that minimizes the total node loss Σ |rᵢ − c| is the median of the residuals in the node.
Sorting the residuals in increasing order, with 5 observations the median is the 3rd (middle) value: 2.0.
(B) is correct.
(A) arrives at the same numerical answer (2.0) in this particular example, but the reasoning is wrong. The mean minimizes squared error loss, not absolute error loss. In general, when the residual distribution is asymmetric, the mean and median differ, and only the median minimizes absolute error. (C) also happens to get 2.0 here, but trimmed means have no optimality property for absolute error loss and would give incorrect results in other configurations.
(D) is incorrect because while pseudo-residuals for squared error loss can have mean near zero, the optimal leaf value for absolute error loss is the median, not necessarily zero. (E) computes the midrange, which minimizes the maximum deviation (minimax criterion), not the sum of absolute deviations.
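The median's optimality is easy to verify numerically. The sketch below uses hypothetical residuals chosen so the median is 2.0, matching the solution (the question's actual values are not reproduced here), and brute-forces the constant that minimizes the node's total absolute error:

```python
import numpy as np

# Hypothetical residuals with median 2.0 (the question's values are assumed)
r = np.array([-1.0, 1.5, 2.0, 3.0, 4.5])

# Brute-force the constant c minimizing the total absolute error in the node
grid = np.linspace(-5.0, 10.0, 3001)
total_abs_error = np.abs(r[None, :] - grid[:, None]).sum(axis=1)
best_c = grid[total_abs_error.argmin()]

# The minimizer coincides with the median, not (in general) the mean
```

With an asymmetric set of residuals (e.g. replacing 4.5 with 40.0), the mean shifts but the median, and hence the optimal leaf value, does not.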
More Exam SRM Topics
About FreeFellow
FreeFellow is a free exam prep platform for actuarial (SOA & CAS), CFA, CFP, CPA, CAIA, and securities licensing candidates. Every question includes a detailed solution. Full lessons, flashcards with spaced repetition, timed mock exams, performance analytics, and a personalized study plan are all included — no paywalls, no ads.