An Ensemble Approach to Predict Default Risk in Stress Testing
This presentation discussed the importance of stress testing the financial models used to predict the term structure of corporate default probabilities. The approach is an ensemble of different machine learning algorithms, combined to offset the typical shortcomings of each and to improve the accuracy of the output.
In general terms, a stress test verifies the stability and reliability of a system. It focuses on the system's robustness and error-handling capability under circumstances that are historically atypical and under extremely heavy load. In finance, the CCAR (Comprehensive Capital Analysis and Review) is an annual exercise by the Federal Reserve that aims to ensure that institutions have well-defined and forward-looking capital planning processes.
The exercise accounts for each institution's unique risks and checks that sufficient capital is present to continue operations through times of economic and financial stress. This regular scrutiny can put pressure on companies within the financial industry. Fortunately, with recent advances in machine learning, relatively simple models can be used in stress tests to help project outcomes that extend beyond historical occurrences.
“It is difficult to predict, especially the future. It is difficult to forecast, especially the future.”
Yun describes the Dodd-Frank Act stress test as a forward-looking, quantitative exercise. It operates under three scenarios: baseline (normal), adverse (bad), and severely adverse (worst). These describe different fluctuations in the market and how they will affect companies. Different stress tests cover a variety of financial pressures. What happens if the unemployment rate rises to r% in a specific quarter? What happens if GDP falls by x% in a given quarter? What happens if interest rates go up by y in the second year?
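One way to picture these what-if questions is as overrides applied to a model's macroeconomic inputs. The sketch below is illustrative only: the `Scenario` class, the feature names, and every number in it are assumptions made for this example, not regulatory figures and not values from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    unemployment_rate: float  # level, in percent
    gdp_growth: float         # quarterly change, in percent
    rate_shift: float         # change in interest rates, in percentage points


# Hypothetical values for illustration only.
SCENARIOS = [
    Scenario("baseline", 4.0, 0.5, 0.0),
    Scenario("adverse", 7.0, -1.0, 1.0),
    Scenario("severely adverse", 10.0, -3.0, 2.0),
]


def apply_scenario(features: dict, scenario: Scenario) -> dict:
    """Return a copy of the features with the macro drivers set by the scenario."""
    stressed = dict(features)
    stressed["unemployment_rate"] = scenario.unemployment_rate
    stressed["gdp_growth"] = scenario.gdp_growth
    stressed["interest_rate"] = features["interest_rate"] + scenario.rate_shift
    return stressed


# A hypothetical segment described by one firm-level and three macro features.
segment = {
    "leverage": 0.6,
    "interest_rate": 2.5,
    "unemployment_rate": 4.0,
    "gdp_growth": 0.5,
}

stressed_inputs = {s.name: apply_scenario(segment, s) for s in SCENARIOS}
```

Each entry of `stressed_inputs` could then be passed to a default-probability model, so that the same segment is scored once per scenario.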
Explaining how changes in input variables affect predictions is a standard task for machine learning algorithms. The problem becomes more complicated when the scenarios to be explained have never occurred before, because there are no historical trends on which to base future predictions. With few or no historical examples to work from, changing a single variable can drastically alter the prediction.
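The difficulty with unseen scenarios can be shown with a toy comparison. The sketch below fits a straight line and a one-split regression tree (a stump, the kind of piecewise-constant learner used inside tree-based boosting) to the same made-up history, then queries both well outside the observed range. The data and the fixed split point are assumptions chosen for illustration.

```python
# Hypothetical history: unemployment rate (%) against observed default rate (%).
history_x = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
history_y = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]


def fit_line(xs, ys):
    """Ordinary least squares with a single predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_stump(xs, ys, threshold):
    """One-split regression tree: the mean of y on each side of the threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
    return lambda x: left_mean if x <= threshold else right_mean


line = fit_line(history_x, history_y)
stump = fit_stump(history_x, history_y, threshold=5.5)

stressed_x = 12.0  # far above anything in the history
line_pred = line(stressed_x)    # keeps following the fitted trend
stump_pred = stump(stressed_x)  # same value as for any x above the threshold
```

The line extrapolates its global trend, while the stump returns the same value at 12% unemployment as it does at 8%. Neither behaviour is guaranteed to be right for a scenario that has never been observed, which is why out-of-range predictions need careful scrutiny.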
Variable dependency, scenario dependency, and methodology dependency are standard challenges within this modeling framework. Feature engineering and selection help determine data relevance and variable stability, which addresses variable dependency. The models' goal is to capture dramatic shifts in severely adverse scenarios, and scenario dependency is mitigated by basing predictions on both back testing and peak forecasting. Methodology dependency is addressed with an ensemble method that combines GLM and GBM modeling.
A GLM (generalized linear model) and a GBM (gradient boosting machine) are combined to capture both global and local trends. This helps gauge appropriate responses to stressors even when they fall outside the historical range. Together, the two predict the probability of default at a segmentation level, which provides the flexibility required for atypical results.
The ensemble is a two-step sequential modeling process. The first step models the main effect drivers, including segmentation variables and macro drivers, which helps ensure the significance of the main effects. The second step models the interactions between variables, using the estimate from the first step as an offset, to capture the sensitivity at a segmentation level. Gradient boosting was chosen for this second step because it trains models sequentially, which reduces modeling bias. This helped reveal the relationships between the predictor variables and the target while showing the marginal effect of each feature on the prediction.
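The two steps can be sketched in plain Python on synthetic data. This is a minimal illustration under stated assumptions, not the presenters' implementation: the data-generating process, the single macro driver, the binary segment flag, the hand-rolled two-level trees (always split on segment first, then on one macro threshold), and all hyperparameters are invented for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Synthetic data (assumed): one macro driver and a binary segment flag.
rng = random.Random(0)
rows = []  # (macro, segment, default)
for _ in range(400):
    macro = rng.uniform(-2.0, 2.0)
    segment = rng.randint(0, 1)
    # The assumed true log-odds contain a macro-by-segment interaction.
    true_logit = -1.0 + 0.5 * macro + 1.5 * macro * segment
    rows.append((macro, segment, 1 if rng.random() < sigmoid(true_logit) else 0))

# Step 1: logistic GLM on the main effects only, fitted by gradient ascent.
weights = [0.0, 0.0, 0.0]  # intercept, macro, segment
for _ in range(500):
    grad = [0.0, 0.0, 0.0]
    for macro, segment, default in rows:
        err = default - sigmoid(weights[0] + weights[1] * macro + weights[2] * segment)
        grad[0] += err
        grad[1] += err * macro
        grad[2] += err * segment
    weights = [w + 0.5 * g / len(rows) for w, g in zip(weights, grad)]


def glm_logit(macro, segment):
    return weights[0] + weights[1] * macro + weights[2] * segment


# Step 2: boosting starts from the GLM log-odds (the offset) and fits what is left.
THRESHOLDS = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
LEARNING_RATE = 0.3


def leaf_stats(members, scores):
    """Sum of residuals and sum of p * (1 - p) over the rows in one leaf."""
    g = h = 0.0
    for i in members:
        p = sigmoid(scores[i])
        g += rows[i][2] - p
        h += p * (1.0 - p)
    return g, h


scores = [glm_logit(m, s) for m, s, _ in rows]
trees = []  # one dict per round: segment -> (threshold, left_value, right_value)
for _ in range(30):
    tree = {}
    for seg in (0, 1):
        best = (0.0, 0.0, 0.0, 0.0)  # quality, threshold, left_value, right_value
        for t in THRESHOLDS:
            left = [i for i, r in enumerate(rows) if r[1] == seg and r[0] <= t]
            right = [i for i, r in enumerate(rows) if r[1] == seg and r[0] > t]
            (gl, hl), (gr, hr) = leaf_stats(left, scores), leaf_stats(right, scores)
            if hl < 1e-9 or hr < 1e-9:
                continue  # skip empty or degenerate leaves
            quality = gl * gl / hl + gr * gr / hr
            if quality > best[0]:
                # Leaf values are damped Newton steps on the log loss.
                best = (quality, t, LEARNING_RATE * gl / hl, LEARNING_RATE * gr / hr)
        tree[seg] = best[1:]
    trees.append(tree)
    for i, (macro, seg, _) in enumerate(rows):
        t, left_value, right_value = tree[seg]
        scores[i] += left_value if macro <= t else right_value


def predict_pd(macro, segment):
    """Probability of default: GLM offset plus the boosted corrections."""
    z = glm_logit(macro, segment)
    for tree in trees:
        t, left_value, right_value = tree[segment]
        z += left_value if macro <= t else right_value
    return sigmoid(z)


def log_loss(predict):
    total = 0.0
    for macro, segment, default in rows:
        p = min(max(predict(macro, segment), 1e-12), 1.0 - 1e-12)
        total += default * math.log(p) + (1 - default) * math.log(1.0 - p)
    return -total / len(rows)


glm_loss = log_loss(lambda m, s: sigmoid(glm_logit(m, s)))
ensemble_loss = log_loss(predict_pd)
```

Because the boosting stage starts from the GLM's log-odds, it only has to learn what the main-effects model misses, here the segment-specific response to the macro driver. Comparing `glm_loss` with `ensemble_loss` gives an in-sample check on this synthetic data only; it says nothing about out-of-sample or out-of-range performance.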
Yun and team stress that continually improving this process is the only reliable way to keep these models as accurate as possible. Extrapolating to unseen scenarios is difficult to do accurately. Stress testing with different machine learning algorithms is a useful tool for handling the standard issues that arise in the regulatory exercises that companies are required to participate in.