SOA Exam ASTAM Cheat Sheet

Exam Overview

The Advanced Short-Term Actuarial Mathematics (ASTAM) exam tests candidates on advanced statistical techniques used in actuarial work, with a focus on model selection, validation, and advanced regression techniques. The exam is 3 hours and 15 minutes long with a mix of multiple-choice and written-answer questions.

Linear Models and Regression Analysis

Multiple Linear Regression

The fundamental equation:
y = Xβ + ε

Where:

  • y is the n×1 vector of responses
  • X is the n×p design matrix
  • β is the p×1 vector of parameters
  • ε is the n×1 vector of errors

Parameter estimation (OLS):
β̂ = (X'X)^(-1)X'y

Variance of parameter estimates:
Var(β̂) = σ²(X'X)^(-1)

Error variance estimate:
σ̂² = RSS/(n-p)
Where RSS = Σ(yi - ŷi)²; the residual standard error is σ̂ = √(σ̂²)
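
The OLS formulas above can be sketched with NumPy; the design matrix, true coefficients, and noise level below are made-up illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # n x p design matrix
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)

# beta_hat = (X'X)^{-1} X'y (solving the normal equations beats an explicit inverse)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - p)              # RSS / (n - p)
var_beta = sigma2_hat * np.linalg.inv(X.T @ X)    # Var(beta_hat)
```

With n = 100 observations and noise σ = 0.3, the estimates land close to the true coefficients and σ̂² is near 0.09.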

Model Evaluation Metrics

R-squared:
R² = 1 - RSS/TSS
Where TSS = Σ(yi - ȳ)²

Adjusted R-squared:
R²_adj = 1 – (RSS/(n-p))/(TSS/(n-1))

Akaike Information Criterion (AIC):
AIC = -2ln(L) + 2p

Bayesian Information Criterion (BIC):
BIC = -2ln(L) + p×ln(n)
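
A quick sketch of these metrics for a Gaussian linear model; the log-likelihood is evaluated at the MLE σ̂² = RSS/n, and note that conventions for counting p (e.g. whether σ² is included) vary by package:

```python
import numpy as np

def fit_metrics(y, y_hat, p):
    """R^2, adjusted R^2, AIC and BIC for a Gaussian model with p parameters."""
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    tss = np.sum((y - np.mean(y)) ** 2)
    r2 = 1 - rss / tss
    r2_adj = 1 - (rss / (n - p)) / (tss / (n - 1))
    # Gaussian log-likelihood at the MLE sigma^2 = RSS/n
    loglik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    return {"R2": r2, "R2_adj": r2_adj,
            "AIC": -2 * loglik + 2 * p,
            "BIC": -2 * loglik + p * np.log(n)}

y = np.arange(10.0)
y_hat = y + 0.1              # hypothetical fitted values
m = fit_metrics(y, y_hat, p=2)
```

Adjusted R² is always below R² (for p > 1), and for n > e² ≈ 7.4 the BIC penalty pln(n) exceeds the AIC penalty 2p.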

Generalized Linear Models (GLMs)

Model Components

  1. Random Component: Y ~ Distribution from exponential family
  2. Systematic Component: η = Xβ
  3. Link Function: g(μ) = η

Common Link Functions

Logistic Regression:
g(μ) = ln(μ/(1-μ))

Poisson Regression:
g(μ) = ln(μ)

Gamma Regression:
g(μ) = 1/μ or ln(μ)

Deviance

Deviance = 2[l(y;y) - l(μ̂;y)]

Scaled Deviance:
D* = D/φ

Parameter Estimation

Maximum Likelihood Estimation through Iteratively Reweighted Least Squares (IRLS):
β̂_(t+1) = (X'W_tX)^(-1)X'W_t z_t

Where:

  • W_t is the weight matrix
  • z_t is the working response
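
For a Poisson GLM with log link, the IRLS update has W_t = diag(μ) and working response z_t = η + (y - μ)/μ; a minimal sketch on simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, 0.8])
y = rng.poisson(np.exp(X @ beta_true))   # Poisson responses with log-linear mean

beta = np.zeros(2)                       # starting value
for _ in range(25):
    eta = X @ beta
    mu = np.exp(eta)                     # inverse of the log link
    z = eta + (y - mu) / mu              # working response
    XtW = X.T * mu                       # X'W with W = diag(mu)
    beta = np.linalg.solve(XtW @ X, XtW @ z)
```

A few dozen iterations are far more than needed; IRLS for a well-behaved Poisson fit typically converges in under ten.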

Time Series Analysis

Stationarity Tests

Augmented Dickey-Fuller test regression (the test statistic is the t-ratio on γ):
ΔY_t = α + βt + γY_(t-1) + δ_1ΔY_(t-1) + … + δ_(p-1)ΔY_(t-p+1) + ε_t

ARIMA Models

ARIMA(p,d,q) model:
φ(B)(1-B)^d Y_t = θ(B)ε_t

Where:

  • φ(B) is the AR polynomial
  • θ(B) is the MA polynomial
  • B is the backshift operator

Forecasting

One-step-ahead forecast:
Ŷ_t(1) = E(Y_(t+1) | Y_t, Y_(t-1), …)

h-step-ahead forecast:
Ŷ_t(h) = E(Y_(t+h) | Y_t, Y_(t-1), …)
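
Iterating the one-step conditional expectation gives the h-step forecast. For an AR(1) model Y_t = c + φY_(t-1) + ε_t (hypothetical parameters below), this can be sketched as:

```python
def ar1_forecast(y_t, c, phi, h):
    """E(Y_{t+h} | Y_t) for the AR(1) process Y_t = c + phi * Y_{t-1} + eps_t."""
    y_hat = y_t
    for _ in range(h):
        y_hat = c + phi * y_hat        # apply the one-step recursion h times
    return y_hat

# c = 1, phi = 0.5, so the long-run (unconditional) mean is c/(1 - phi) = 2
forecasts = [ar1_forecast(10.0, 1.0, 0.5, h) for h in (1, 2, 50)]
```

As h grows, the forecast decays geometrically from the last observation toward the unconditional mean c/(1 - φ).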

Advanced Regression Techniques

Principal Component Analysis (PCA)

Eigenvalue decomposition:
Σ = PΛP'

Where:

  • Σ is the covariance matrix
  • Λ is diagonal matrix of eigenvalues
  • P is matrix of eigenvectors

Principal components:
Z = XP
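
A NumPy sketch of PCA through the eigendecomposition above, on simulated correlated data; note that np.linalg.eigh returns eigenvalues in ascending order, so they are re-sorted by variance:

```python
import numpy as np

rng = np.random.default_rng(2)
A = np.array([[2.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 0.2]])
X = rng.normal(size=(500, 3)) @ A            # correlated simulated data
Xc = X - X.mean(axis=0)                      # center each column
Sigma = np.cov(Xc, rowvar=False)             # sample covariance matrix
eigvals, P = np.linalg.eigh(Sigma)           # Sigma = P Lambda P'
order = np.argsort(eigvals)[::-1]            # sort by decreasing variance
eigvals, P = eigvals[order], P[:, order]
Z = Xc @ P                                   # principal component scores
```

The sample variance of each score column equals the corresponding eigenvalue, which is the defining property of the components.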

Ridge Regression

Parameter estimation:
β̂_ridge = (X'X + λI)^(-1)X'y

Where λ is the regularization parameter
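
The closed-form ridge estimate can be sketched directly on simulated data; λ = 0 recovers OLS, and a larger λ shrinks the coefficient vector:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 50, 4
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -1.0, 0.5, 0.0]) + rng.normal(scale=0.5, size=n)

def ridge(X, y, lam):
    """Closed-form ridge solution (X'X + lam*I)^{-1} X'y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

beta_ols = ridge(X, y, 0.0)     # lam = 0 recovers ordinary least squares
beta_rdg = ridge(X, y, 10.0)    # larger lam shrinks the coefficients
```

In the eigenbasis of X'X each coordinate is scaled by d_i/(d_i + λ) ≤ 1, so the ridge solution always has smaller norm than the OLS solution.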

Lasso Regression

Objective function:
min_β {RSS + λΣ|βj|}

Elastic Net

Objective function:
min_β {RSS + λ[(1-α)Σβj² + αΣ|βj|]}

Where:

  • α controls the mix of ridge and lasso penalties
  • λ controls overall regularization strength
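
One standard way to optimize these penalized objectives is coordinate descent with soft-thresholding. A minimal sketch for the elastic-net objective exactly as written above (α = 1 gives the lasso, α = 0 gives ridge), on simulated data:

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net(X, y, lam, alpha, sweeps=200):
    """Coordinate descent for RSS + lam*((1-alpha)*sum(b_j^2) + alpha*sum(|b_j|))."""
    beta = np.zeros(X.shape[1])
    for _ in range(sweeps):
        for j in range(X.shape[1]):
            r_j = y - X @ beta + X[:, j] * beta[j]      # partial residual
            z = X[:, j] @ r_j
            # One-dimensional minimizer: soft-threshold, then shrink
            beta[j] = soft_threshold(z, lam * alpha / 2) / (
                X[:, j] @ X[:, j] + lam * (1 - alpha))
    return beta

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 5))
y = X @ np.array([2.0, 0.0, -1.5, 0.0, 0.0]) + rng.normal(scale=0.3, size=100)
beta_lasso = elastic_net(X, y, lam=50.0, alpha=1.0)     # alpha = 1: pure lasso
```

With this penalty level the lasso should drive the three null coefficients to (or very near) zero while shrinking the active ones slightly toward zero.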

Model Validation Techniques

Cross-Validation

K-fold CV error:
CV_k = (1/k)Σ_i=1^k MSE_i

Leave-one-out CV error:
LOOCV = (1/n)Σ_i=1^n (yi - ŷ_i^(-i))²
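
A from-scratch sketch of k-fold CV for an OLS model, averaging per-fold MSE as in CV_k above; the fold assignment and simulated data are illustrative choices:

```python
import numpy as np

def kfold_cv_mse(X, y, k=5, seed=0):
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)                 # k roughly equal folds
    fold_mse = []
    for fold in folds:
        train = np.setdiff1d(idx, fold)            # all indices not in this fold
        beta = np.linalg.lstsq(X[train], y[train], rcond=None)[0]
        fold_mse.append(np.mean((y[fold] - X[fold] @ beta) ** 2))
    return np.mean(fold_mse)

rng = np.random.default_rng(5)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.5, size=200)
cv_err = kfold_cv_mse(X, y, k=5)    # should land near the noise variance 0.25
```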

Bootstrap Methods

Bootstrap estimate:
θ̂_boot = (1/B)Σ_b=1^B θ̂*_b

Bootstrap standard error:
SE_boot = √[(1/(B-1))Σ_b=1^B (θ̂*_b - θ̂_boot)²]
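
A sketch of the nonparametric bootstrap for the standard error of a sample mean, applying the two formulas above; B and the simulated data are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(6)
data = rng.normal(loc=10.0, scale=2.0, size=400)   # simulated sample

B = 2000                                           # number of resamples
boot_stats = np.array([
    rng.choice(data, size=len(data), replace=True).mean()  # resample w/ replacement
    for _ in range(B)
])
theta_boot = boot_stats.mean()                     # bootstrap estimate of the mean
se_boot = np.sqrt(np.sum((boot_stats - theta_boot) ** 2) / (B - 1))
```

Here the bootstrap SE should be close to the analytic value σ/√n = 2/√400 = 0.1, a useful sanity check for a statistic whose SE has a known formula.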

Advanced Statistical Concepts

Mixed Effects Models

Linear Mixed Model:
y = Xβ + Zu + ε

Where:

  • β are fixed effects
  • u are random effects
  • Z is the random effects design matrix

Survival Analysis

Hazard function:
h(t) = f(t)/S(t)

Survival function:
S(t) = exp(-∫_0^t h(u)du)

Cox Proportional Hazards:
h(t|X) = h₀(t)exp(Xβ)
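
The hazard/survival relationships above can be sanity-checked numerically for a constant hazard h(t) = λ, i.e. the exponential distribution:

```python
import numpy as np

lam, t = 0.3, 2.0
m = 10000
dt = t / m
# Riemann-sum approximation of the cumulative hazard: integral_0^t h(u) du
cum_hazard = np.sum(lam * np.ones(m)) * dt          # = lam * t = 0.6 here
S_t = np.exp(-cum_hazard)                           # S(t) = exp(-cumulative hazard)
f_t = lam * np.exp(-lam * t)                        # exponential density f(t)
h_t = f_t / S_t                                     # h(t) = f(t)/S(t) recovers lam
```

Constant hazard is the memoryless special case; the same two identities hold for any integrable hazard function.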

Model Selection Techniques

Stepwise Selection

Forward Selection:
Add variables based on F-statistic or p-value

Backward Elimination:
Remove variables based on F-statistic or p-value

Information Criteria Comparison

Choose model that minimizes:

  • AIC = -2ln(L) + 2p
  • BIC = -2ln(L) + pln(n)
  • HQIC = -2ln(L) + 2pln(ln(n))

Study Strategies

  1. Understanding Theoretical Foundations
  • Focus on assumptions behind each model
  • Know when each model is appropriate
  • Understand relationships between different techniques
  2. Practical Application
  • Practice interpreting model outputs
  • Learn to identify violations of assumptions
  • Develop intuition for model selection
  3. Common Pitfalls to Avoid
  • Overlooking multicollinearity
  • Ignoring model assumptions
  • Misinterpreting significance tests

Essential R Functions

While you won’t be coding in the exam, understanding these R functions helps grasp the concepts:

# Linear Models
model <- lm(y ~ x1 + x2)
summary(model)

# GLMs
glm(y ~ x, family = binomial)
glm(y ~ x, family = poisson)

# Time Series
model <- arima(ts_data, order = c(p, d, q))
forecast(model, h = 10)        # forecast package

# Advanced Regression
prcomp(X, scale. = TRUE)       # PCA (note the argument name is scale.)
glmnet(X, y, alpha = 0)        # Ridge (glmnet package)
glmnet(X, y, alpha = 1)        # Lasso (glmnet package)

Exam Tips

  1. Time Management
  • Read questions carefully
  • Prioritize questions you’re confident about
  • Leave time for checking work
  2. Calculation Strategy
  • Write out formulas before plugging in numbers
  • Show intermediate steps
  • Check units and scaling
  3. Conceptual Understanding
  • Explain why you chose specific methods
  • Consider practical implications
  • Reference assumptions when relevant