Copyright 2019-2020 Patrick Hall and the H2O.ai team
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
DISCLAIMER: This notebook is not legal compliance advice.
"All models are wrong but some are useful" -- George Box, 1978
This notebook focuses on using sensitivity analysis to discover error mechanisms and security vulnerabilities and to assess stability and fairness in a trained XGBoost model. It begins by loading the UCI credit card default data and then training an interpretable, monotonically constrained XGBoost gradient boosting machine (GBM) model. (Pearson correlation with the prediction target is used to determine the direction of the monotonicity constraints for each input variable.) After the model is trained, global and local Shapley feature importance is calculated. These Shapley values help inform the application of partial dependence and ICE, and together these results guide a search for adversarial examples. The notebook closes by exposing the trained model to a random attack and analyzing the attack results. These model debugging exercises uncover several accuracy, drift, and security problems such as over-emphasis of important features and non-robust interactions. Several remediation mechanisms are proposed including editing of final model artifacts, missing value injection or regularization during training, and use of assertions to inject missing values during scoring.
import numpy as np # array, vector, matrix calculations
import pandas as pd # DataFrame handling
import shap # for consistent, signed variable importance measurements
import xgboost as xgb # gradient boosting machines (GBMs)
from itertools import combinations # n choose k combinations of objects
import string # for operations on character strings
pd.options.display.max_columns = 999 # enable display of all DataFrame columns in notebook
import matplotlib.pyplot as plt # general plotting utilities
from mpl_toolkits.mplot3d import Axes3D # for 3-D projections
from matplotlib import cm # colormaps for plots
# enables display of plots in notebook
%matplotlib inline
np.random.seed(12345) # set random seed for global scope
UCI credit card default data: https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients
The UCI credit card default data contains demographic and payment information about credit card customers in Taiwan in the year 2005. The dataset contains 23 input variables:
LIMIT_BAL: Amount of given credit (NT dollar)
SEX: 1 = male; 2 = female
EDUCATION: 1 = graduate school; 2 = university; 3 = high school; 4 = others
MARRIAGE: 1 = married; 2 = single; 3 = others
AGE: Age in years
PAY_0, PAY_2-PAY_6: History of past payment. PAY_0 = the repayment status in September, 2005; PAY_2 = the repayment status in August, 2005; ...; PAY_6 = the repayment status in April, 2005. The measurement scale for the repayment status is: -1 = pay duly; 1 = payment delay for one month; 2 = payment delay for two months; ...; 8 = payment delay for eight months; 9 = payment delay for nine months and above.
BILL_AMT1-BILL_AMT6: Amount of bill statement (NT dollar). BILL_AMT1 = amount of bill statement in September, 2005; BILL_AMT2 = amount of bill statement in August, 2005; ...; BILL_AMT6 = amount of bill statement in April, 2005.
PAY_AMT1-PAY_AMT6: Amount of previous payment (NT dollar). PAY_AMT1 = amount paid in September, 2005; PAY_AMT2 = amount paid in August, 2005; ...; PAY_AMT6 = amount paid in April, 2005.
These 23 input variables are used to predict the target variable: whether or not a customer defaulted on their credit card bill in late 2005. Because XGBoost accepts only numeric inputs, all variables will be treated as numeric.
# import XLS file
path = 'default_of_credit_card_clients.xls'
data = pd.read_excel(path,
skiprows=1) # skip the first row of the spreadsheet
# remove spaces from target column name
data = data.rename(columns={'default payment next month': 'DEFAULT_NEXT_MONTH'})
The shorthand name y is assigned to the prediction target. X is assigned to all other input variables in the credit card default data except the row identifier, ID.
# assign target and inputs for GBM
y = 'DEFAULT_NEXT_MONTH'
X = [name for name in data.columns if name not in [y, 'ID']]
print('y =', y)
print('X =', X)
y = DEFAULT_NEXT_MONTH X = ['LIMIT_BAL', 'SEX', 'EDUCATION', 'MARRIAGE', 'AGE', 'PAY_0', 'PAY_2', 'PAY_3', 'PAY_4', 'PAY_5', 'PAY_6', 'BILL_AMT1', 'BILL_AMT2', 'BILL_AMT3', 'BILL_AMT4', 'BILL_AMT5', 'BILL_AMT6', 'PAY_AMT1', 'PAY_AMT2', 'PAY_AMT3', 'PAY_AMT4', 'PAY_AMT5', 'PAY_AMT6']
The input variables SEX, EDUCATION, MARRIAGE, and PAY_0-PAY_6, and the prediction target DEFAULT_NEXT_MONTH, are really categorical variables, but they have already been encoded into meaningful numeric, integer values, which is great for XGBoost. Also, there are no missing values in this dataset.
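The claims above (no missing values, meaningful integer encodings) can be sanity checked programmatically. A minimal sketch on a hypothetical toy frame; the real checks would run on the data DataFrame loaded above:

```python
import pandas as pd

# hypothetical toy stand-in for the credit card frame; the real checks
# would run on the data DataFrame loaded above
toy = pd.DataFrame({'SEX': [1, 2, 2],
                    'EDUCATION': [1, 2, 3],
                    'DEFAULT_NEXT_MONTH': [0, 1, 0]})

# confirm there are no missing values before training
assert toy.isnull().sum().sum() == 0

# confirm integer encodings fall in their documented ranges
assert toy['SEX'].isin([1, 2]).all()
assert toy['DEFAULT_NEXT_MONTH'].isin([0, 1]).all()
```

Running the same checks on data before training can catch upstream extraction or join problems early.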
(In a traditional credit scoring environment in the U.S., demographic variables like SEX, EDUCATION, MARRIAGE, and AGE would not be used directly in the model, but only for disparate impact testing after model training. They are included in the model here for example purposes.)
data[X + [y]].describe() # display descriptive statistics for all columns
LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 3.000000e+04 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 3.000000e+04 | 30000.00000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 |
mean | 167484.322667 | 1.603733 | 1.853133 | 1.551867 | 35.485500 | -0.016700 | -0.133767 | -0.166200 | -0.220667 | -0.266200 | -0.291100 | 51223.330900 | 49179.075167 | 4.701315e+04 | 43262.948967 | 40311.400967 | 38871.760400 | 5663.580500 | 5.921163e+03 | 5225.68150 | 4826.076867 | 4799.387633 | 5215.502567 | 0.221200 |
std | 129747.661567 | 0.489129 | 0.790349 | 0.521970 | 9.217904 | 1.123802 | 1.197186 | 1.196868 | 1.169139 | 1.133187 | 1.149988 | 73635.860576 | 71173.768783 | 6.934939e+04 | 64332.856134 | 60797.155770 | 59554.107537 | 16563.280354 | 2.304087e+04 | 17606.96147 | 15666.159744 | 15278.305679 | 17777.465775 | 0.415062 |
min | 10000.000000 | 1.000000 | 0.000000 | 0.000000 | 21.000000 | -2.000000 | -2.000000 | -2.000000 | -2.000000 | -2.000000 | -2.000000 | -165580.000000 | -69777.000000 | -1.572640e+05 | -170000.000000 | -81334.000000 | -339603.000000 | 0.000000 | 0.000000e+00 | 0.00000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
25% | 50000.000000 | 1.000000 | 1.000000 | 1.000000 | 28.000000 | -1.000000 | -1.000000 | -1.000000 | -1.000000 | -1.000000 | -1.000000 | 3558.750000 | 2984.750000 | 2.666250e+03 | 2326.750000 | 1763.000000 | 1256.000000 | 1000.000000 | 8.330000e+02 | 390.00000 | 296.000000 | 252.500000 | 117.750000 | 0.000000 |
50% | 140000.000000 | 2.000000 | 2.000000 | 2.000000 | 34.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 22381.500000 | 21200.000000 | 2.008850e+04 | 19052.000000 | 18104.500000 | 17071.000000 | 2100.000000 | 2.009000e+03 | 1800.00000 | 1500.000000 | 1500.000000 | 1500.000000 | 0.000000 |
75% | 240000.000000 | 2.000000 | 2.000000 | 2.000000 | 41.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 67091.000000 | 64006.250000 | 6.016475e+04 | 54506.000000 | 50190.500000 | 49198.250000 | 5006.000000 | 5.000000e+03 | 4505.00000 | 4013.250000 | 4031.500000 | 4000.000000 | 0.000000 |
max | 1000000.000000 | 2.000000 | 6.000000 | 3.000000 | 79.000000 | 8.000000 | 8.000000 | 8.000000 | 8.000000 | 8.000000 | 8.000000 | 964511.000000 | 983931.000000 | 1.664089e+06 | 891586.000000 | 927171.000000 | 961664.000000 | 873552.000000 | 1.684259e+06 | 896040.00000 | 621000.000000 | 426529.000000 | 528666.000000 | 1.000000 |
DEFAULT_NEXT_MONTH
Monotonic relationships are much easier to explain to colleagues, bosses, customers, and regulators than more complex, non-monotonic relationships, and monotonic relationships may also prevent overfitting and excess error due to variance for new data.
To train a transparent monotonic classifier, constraints must be supplied to XGBoost that determine whether the learned relationship between an input variable and the prediction target DEFAULT_NEXT_MONTH will be increasing or decreasing for increases in that input variable. Pearson correlation provides a linear measure of the direction of the relationship between each input variable and the target. If the pair-wise Pearson correlation between an input and DEFAULT_NEXT_MONTH is positive, the input will be constrained to have an increasing relationship with the predictions for DEFAULT_NEXT_MONTH. If the pair-wise Pearson correlation is negative, the input will be constrained to have a decreasing relationship with the predictions for DEFAULT_NEXT_MONTH.
Because DEFAULT_NEXT_MONTH is the last column in the data DataFrame, the last column of the Pearson correlation matrix indicates the direction of the linear relationship between each input variable and the prediction target, DEFAULT_NEXT_MONTH. According to the calculated values, as a customer's balance limit (LIMIT_BAL), bill amounts (BILL_AMT1-BILL_AMT6), and payment amounts (PAY_AMT1-PAY_AMT6) increase, their probability of default tends to decrease. However, as a customer's number of late payments increases (PAY_0, PAY_2-PAY_6), their probability of default usually increases. In general, the Pearson correlation values make sense, and they will be used to ensure that the modeled relationships make sense as well. (Pearson correlation values between the target variable, DEFAULT_NEXT_MONTH, and each input variable are displayed directly below.)
# displays last column of Pearson correlation matrix as Pandas DataFrame
pd.DataFrame(data[X + [y]].corr()[y]).iloc[:-1]
DEFAULT_NEXT_MONTH | |
---|---|
LIMIT_BAL | -0.153520 |
SEX | -0.039961 |
EDUCATION | 0.028006 |
MARRIAGE | -0.024339 |
AGE | 0.013890 |
PAY_0 | 0.324794 |
PAY_2 | 0.263551 |
PAY_3 | 0.235253 |
PAY_4 | 0.216614 |
PAY_5 | 0.204149 |
PAY_6 | 0.186866 |
BILL_AMT1 | -0.019644 |
BILL_AMT2 | -0.014193 |
BILL_AMT3 | -0.014076 |
BILL_AMT4 | -0.010156 |
BILL_AMT5 | -0.006760 |
BILL_AMT6 | -0.005372 |
PAY_AMT1 | -0.072929 |
PAY_AMT2 | -0.058579 |
PAY_AMT3 | -0.056250 |
PAY_AMT4 | -0.056827 |
PAY_AMT5 | -0.055124 |
PAY_AMT6 | -0.053183 |
The last column of the Pearson correlation matrix is transformed from a numeric column in a Pandas DataFrame into a Python tuple of 1s and -1s that will be used to specify monotonicity constraints for each input variable in XGBoost.
# creates a tuple in which positive correlation values are assigned a 1
# and negative correlation values are assigned a -1
mono_constraints = tuple([int(i) for i in np.sign(data[X + [y]].corr()[y].values[:-1])])
# (-1, -1, 1, -1, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1)
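The sign-to-constraint mapping used above can be checked on a toy example. The correlation values below are hypothetical, not taken from the credit card data:

```python
import numpy as np

# hypothetical Pearson correlations between three inputs and the target
corrs = np.array([-0.15, 0.32, -0.07])

# same mapping as above: positive correlation -> 1 (increasing constraint),
# negative correlation -> -1 (decreasing constraint)
constraints = tuple(int(i) for i in np.sign(corrs))

assert constraints == (-1, 1, -1)
```

The length of the resulting tuple must equal the number of input variables passed to XGBoost, in the same order as the columns of the training DMatrix.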
XGBoost is a very accurate, open source GBM library for regression and classification tasks. XGBoost can learn complex relationships between input variables and a target variable, but here the monotone_constraints tuning parameter is used to enforce monotonicity between the inputs and the prediction for DEFAULT_NEXT_MONTH. XGBoost's early stopping functionality is also used to limit overfitting to the training data.
XGBoost is available from: https://github.com/dmlc/xgboost and the implementation of XGBoost is described in detail here: http://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf.
After training, Shapley variable importance is calculated and displayed. Shapley values are an old idea, originally from the 1950s. A more recent innovation called Tree SHAP made them more efficient for use in everyday data mining tasks: https://arxiv.org/pdf/1802.03888.pdf. Shapley variable importance is a global and local measure of the contribution of an input variable to the GBM model predictions. The global variable importance values give an indication of the magnitude of a variable's contribution to model predictions for all observations. To enhance trust in the GBM model, variable importance values should typically conform to human domain knowledge and reasonable expectations.
Note: As of the h2o 3.24 "Yates" release, Shapley values and monotonicity constraints are supported in h2o. To see Shapley values and monotonicity constraints for an h2o GBM in action please see: https://github.com/jphall663/interpretable_machine_learning_with_python/blob/master/dia.ipynb.
The credit card default data is split into training and test sets to monitor and prevent overtraining. Reproducibility is another important factor in creating trustworthy models, and randomly splitting datasets can introduce randomness in model predictions and other results. A random seed was set above to ensure the data split is reproducible.
split_ratio = 0.70 # 70%/30% train/test split
# execute split
split = np.random.rand(len(data)) < split_ratio
train = data[split]
test = data[~split]
# summarize split
print('Train data rows = %d, columns = %d' % (train.shape[0], train.shape[1]))
print('Test data rows = %d, columns = %d' % (test.shape[0], test.shape[1]))
Train data rows = 20946, columns = 25 Test data rows = 9054, columns = 25
For brevity's sake, a previously discovered set of good tuning parameters is specified here. Notice that the monotonicity constraints are passed to XGBoost using the monotone_constraints parameter. Because gradient boosting methods typically resample training data, an additional random seed is also specified for XGBoost using the seed parameter to create reproducible predictions, error rates, and variable importance values. To avoid overfitting, the early_stopping_rounds parameter is used to stop the training process after the test area under the curve (AUC) statistic fails to increase for 50 iterations.
# XGBoost uses its own DMatrix data structure, not Numpy arrays or Pandas DataFrames
dtrain = xgb.DMatrix(train[X], train[y])
dtest = xgb.DMatrix(test[X], test[y])
# used to calibrate predictions to mean of y
base_y = train[y].mean()
# tuning parameters
params = {
'objective': 'binary:logistic', # produces 0-1 probabilities for binary classification
'booster': 'gbtree', # base learner will be decision tree
'eval_metric': 'auc', # stop training based on AUC in the eval set
'eta': 0.08, # learning rate
'subsample': 0.9, # use 90% of rows in each decision tree
'colsample_bytree': 0.9, # use 90% of columns in each decision tree
'max_depth': 15, # allow decision trees to grow to depth of 15
'monotone_constraints':mono_constraints, # 1 = increasing relationship, -1 = decreasing relationship
'base_score': base_y, # calibrate predictions to mean of y
'seed': 12345 # set random seed for reproducibility
}
# watchlist is used for early stopping
watchlist = [(dtrain, 'train'), (dtest, 'eval')]
# train model
xgb_model = xgb.train(params, # set tuning parameters from above
dtrain, # training data
5000, # train up to 5000 iterations
evals=watchlist, # use watchlist for early stopping
early_stopping_rounds=50, # stop after 50 rounds of no increase in AUC in test set
verbose_eval=True) # display iteration progress
[0] train-auc:0.738066 eval-auc:0.733449 Multiple eval metrics have been passed: 'eval-auc' will be used for early stopping. Will train until eval-auc hasn't improved in 50 rounds. [1] train-auc:0.772991 eval-auc:0.769311 [2] train-auc:0.775536 eval-auc:0.772624 [3] train-auc:0.776248 eval-auc:0.771985 [4] train-auc:0.777216 eval-auc:0.772796 [5] train-auc:0.777782 eval-auc:0.773066 [6] train-auc:0.777783 eval-auc:0.773471 [7] train-auc:0.777856 eval-auc:0.773591 [8] train-auc:0.777633 eval-auc:0.773209 [9] train-auc:0.777417 eval-auc:0.772892 [10] train-auc:0.777519 eval-auc:0.772666 [11] train-auc:0.778476 eval-auc:0.773346 [12] train-auc:0.778541 eval-auc:0.773312 [13] train-auc:0.778351 eval-auc:0.773266 [14] train-auc:0.778519 eval-auc:0.773695 [15] train-auc:0.779383 eval-auc:0.774314 [16] train-auc:0.779555 eval-auc:0.77473 [17] train-auc:0.780178 eval-auc:0.775043 [18] train-auc:0.780743 eval-auc:0.775246 [19] train-auc:0.78146 eval-auc:0.776021 [20] train-auc:0.781944 eval-auc:0.776477 [21] train-auc:0.782181 eval-auc:0.776696 [22] train-auc:0.782319 eval-auc:0.776915 [23] train-auc:0.78263 eval-auc:0.776998 [24] train-auc:0.782965 eval-auc:0.777144 [25] train-auc:0.783396 eval-auc:0.777498 [26] train-auc:0.783791 eval-auc:0.777791 [27] train-auc:0.784151 eval-auc:0.778091 [28] train-auc:0.784382 eval-auc:0.778375 [29] train-auc:0.784678 eval-auc:0.778567 [30] train-auc:0.784912 eval-auc:0.778755 [31] train-auc:0.785231 eval-auc:0.778938 [32] train-auc:0.785418 eval-auc:0.779125 [33] train-auc:0.785598 eval-auc:0.779226 [34] train-auc:0.785838 eval-auc:0.779421 [35] train-auc:0.7861 eval-auc:0.779432 [36] train-auc:0.786347 eval-auc:0.779549 [37] train-auc:0.786633 eval-auc:0.779575 [38] train-auc:0.786833 eval-auc:0.779668 [39] train-auc:0.787093 eval-auc:0.779743 [40] train-auc:0.787307 eval-auc:0.779926 [41] train-auc:0.787602 eval-auc:0.780015 [42] train-auc:0.787819 eval-auc:0.780089 [43] train-auc:0.787943 eval-auc:0.780196 [44] train-auc:0.788092 
eval-auc:0.780239 [45] train-auc:0.788194 eval-auc:0.780198 [46] train-auc:0.788298 eval-auc:0.780235 [47] train-auc:0.788422 eval-auc:0.780305 [48] train-auc:0.78862 eval-auc:0.780372 [49] train-auc:0.788803 eval-auc:0.780392 [50] train-auc:0.789049 eval-auc:0.780558 [51] train-auc:0.789221 eval-auc:0.780659 [52] train-auc:0.789405 eval-auc:0.780756 [53] train-auc:0.789553 eval-auc:0.780721 [54] train-auc:0.78968 eval-auc:0.780791 [55] train-auc:0.789779 eval-auc:0.780849 [56] train-auc:0.789875 eval-auc:0.780932 [57] train-auc:0.79002 eval-auc:0.780963 [58] train-auc:0.790156 eval-auc:0.781038 [59] train-auc:0.790292 eval-auc:0.781085 [60] train-auc:0.790403 eval-auc:0.781079 [61] train-auc:0.790509 eval-auc:0.781091 [62] train-auc:0.790554 eval-auc:0.781061 [63] train-auc:0.790635 eval-auc:0.781148 [64] train-auc:0.790679 eval-auc:0.781166 [65] train-auc:0.790822 eval-auc:0.781155 [66] train-auc:0.790896 eval-auc:0.781178 [67] train-auc:0.790977 eval-auc:0.781188 [68] train-auc:0.791155 eval-auc:0.781211 [69] train-auc:0.791239 eval-auc:0.781179 [70] train-auc:0.7914 eval-auc:0.781347 [71] train-auc:0.791525 eval-auc:0.78134 [72] train-auc:0.791578 eval-auc:0.781312 [73] train-auc:0.791691 eval-auc:0.781325 [74] train-auc:0.791747 eval-auc:0.781323 [75] train-auc:0.791801 eval-auc:0.781304 [76] train-auc:0.791844 eval-auc:0.781313 [77] train-auc:0.7919 eval-auc:0.781325 [78] train-auc:0.792056 eval-auc:0.781448 [79] train-auc:0.792088 eval-auc:0.781397 [80] train-auc:0.792151 eval-auc:0.78138 [81] train-auc:0.792173 eval-auc:0.781388 [82] train-auc:0.792236 eval-auc:0.781301 [83] train-auc:0.792327 eval-auc:0.781355 [84] train-auc:0.792372 eval-auc:0.781356 [85] train-auc:0.792402 eval-auc:0.781321 [86] train-auc:0.79247 eval-auc:0.781286 [87] train-auc:0.792521 eval-auc:0.781283 [88] train-auc:0.792543 eval-auc:0.781265 [89] train-auc:0.792595 eval-auc:0.781255 [90] train-auc:0.792618 eval-auc:0.781242 [91] train-auc:0.792673 eval-auc:0.781309 [92] 
train-auc:0.792766 eval-auc:0.781357 [93] train-auc:0.792826 eval-auc:0.781381 [94] train-auc:0.792914 eval-auc:0.781387 [95] train-auc:0.792967 eval-auc:0.781385 [96] train-auc:0.793016 eval-auc:0.781375 [97] train-auc:0.793053 eval-auc:0.781353 [98] train-auc:0.79312 eval-auc:0.78137 [99] train-auc:0.793145 eval-auc:0.781413 [100] train-auc:0.793191 eval-auc:0.781456 [101] train-auc:0.793256 eval-auc:0.781435 [102] train-auc:0.793282 eval-auc:0.781382 [103] train-auc:0.793322 eval-auc:0.781385 [104] train-auc:0.793346 eval-auc:0.781361 [105] train-auc:0.793399 eval-auc:0.781418 [106] train-auc:0.793436 eval-auc:0.781398 [107] train-auc:0.793511 eval-auc:0.781358 [108] train-auc:0.793578 eval-auc:0.78137 [109] train-auc:0.793655 eval-auc:0.781336 [110] train-auc:0.793683 eval-auc:0.781299 [111] train-auc:0.793701 eval-auc:0.781277 [112] train-auc:0.79371 eval-auc:0.78128 [113] train-auc:0.793745 eval-auc:0.781298 [114] train-auc:0.793817 eval-auc:0.781307 [115] train-auc:0.793838 eval-auc:0.781301 [116] train-auc:0.793867 eval-auc:0.781308 [117] train-auc:0.793877 eval-auc:0.781315 [118] train-auc:0.793917 eval-auc:0.781246 [119] train-auc:0.793947 eval-auc:0.781302 [120] train-auc:0.794 eval-auc:0.781358 [121] train-auc:0.794026 eval-auc:0.781357 [122] train-auc:0.794053 eval-auc:0.781277 [123] train-auc:0.794064 eval-auc:0.781257 [124] train-auc:0.794209 eval-auc:0.781289 [125] train-auc:0.794219 eval-auc:0.781288 [126] train-auc:0.794287 eval-auc:0.781347 [127] train-auc:0.79429 eval-auc:0.781347 [128] train-auc:0.794327 eval-auc:0.781331 [129] train-auc:0.794336 eval-auc:0.781349 [130] train-auc:0.794367 eval-auc:0.781347 [131] train-auc:0.794364 eval-auc:0.78134 [132] train-auc:0.794385 eval-auc:0.781363 [133] train-auc:0.794387 eval-auc:0.781326 [134] train-auc:0.794472 eval-auc:0.78132 [135] train-auc:0.794483 eval-auc:0.781326 [136] train-auc:0.794495 eval-auc:0.781306 [137] train-auc:0.794583 eval-auc:0.781293 [138] train-auc:0.794589 eval-auc:0.781293 
[139] train-auc:0.7946 eval-auc:0.781286 [140] train-auc:0.794642 eval-auc:0.7813 [141] train-auc:0.794656 eval-auc:0.781303 [142] train-auc:0.794654 eval-auc:0.781287 [143] train-auc:0.794695 eval-auc:0.781273 [144] train-auc:0.794705 eval-auc:0.781268 [145] train-auc:0.794708 eval-auc:0.781268 [146] train-auc:0.79471 eval-auc:0.781258 [147] train-auc:0.794775 eval-auc:0.781284 [148] train-auc:0.794782 eval-auc:0.781277 [149] train-auc:0.794794 eval-auc:0.781283 [150] train-auc:0.794815 eval-auc:0.78126 Stopping. Best iteration: [100] train-auc:0.793191 eval-auc:0.781456
By setting pred_contribs=True, XGBoost's predict() function will return Shapley values for each row of the test set.
Be sure to use the ntree_limit=xgb_model.best_ntree_limit option so that Tree SHAP values are calculated for the best model. Also, the more trees, the slower the Tree SHAP procedure!
# dtest is DMatrix
# shap_values is Numpy array
shap_values = xgb_model.predict(dtest, pred_contribs=True, ntree_limit=xgb_model.best_ntree_limit)
Instead of relying on traditional single-value variable importance measures, local Shapley values for each input will be plotted below to get a more holistic and consistent measurement for the global importance of each input variable.
# drop last column of Shapley value array, which is constant intercept
# safest to use xgb_model.feature_names, seems to guarantee consistent order
shap.summary_plot(shap_values[:, :-1], test[xgb_model.feature_names])
From the Shapley summary plot, it can be seen that PAY_0, or a customer's most recent repayment status, exerts a large influence on the model. High values of PAY_0 are associated with large positive local contributions to model predictions. Low PAY_0 values are associated with large negative local contributions to model predictions. PAY_3, or a customer's third most recent repayment status, also seems to have a similar, but less intense, effect on the model predictions. PAY_AMT1 and PAY_AMT2, a customer's most and second-most recent payment amounts, display large negative local feature contributions.
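To collapse the local Shapley values into a single global ranking, a common summary is the mean absolute contribution per feature. A minimal sketch on a synthetic Shapley array; the feature names and values below are placeholders for the real shap_values computed above:

```python
import numpy as np
import pandas as pd

# synthetic stand-in for the (n_rows, n_features + 1) Tree SHAP array;
# the last column is the constant intercept and is dropped before ranking
rng = np.random.RandomState(12345)
shap_vals = rng.normal(size=(100, 4))
features = ['PAY_0', 'BILL_AMT1', 'PAY_AMT1']

# global importance = mean absolute local contribution for each feature
global_imp = pd.Series(np.abs(shap_vals[:, :-1]).mean(axis=0),
                       index=features).sort_values(ascending=False)

assert len(global_imp) == 3
assert (global_imp >= 0).all()
```

Ranking features this way preserves the consistency property of Shapley values, unlike split-count or gain-based importance.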
These results will be used to guide and inform model debugging by partial dependence, individual conditional expectation (ICE), and adversarial examples in the sections below.
Standard approaches for partial dependence and ICE are discussed at length in another notebook in this series: https://github.com/jphall663/interpretable_machine_learning_with_python/blob/master/xgboost_pdp_ice.ipynb. In general, partial dependence is used to understand the average predictions of a model across the domain of one or more input variables. This alone is very helpful information from a model debugging perspective. A user can see generally when a model is making high and low predictions and whether this prediction behavior makes sense from an accuracy standpoint. A user can also begin to understand the sensitivity of a model to any given input variable. If partial dependence swings widely across the domain of a given variable, the model predictions are likely sensitive to the values of that variable, and that variable would be expected to have high importance in the model.
Even if the model predictions are suitably accurate, being highly sensitive to the values of an input variable can present disparate impact, security, and stability problems. If a model's average predictions, as illustrated by partial dependence, are different across different races, genders, disability statuses, or other sensitive demographic attributes, this is likely cause for a deeper analysis into the observational fairness of the model's prediction behavior, such as a disparate impact analysis: https://github.com/jphall663/interpretable_machine_learning_with_python/blob/master/dia.ipynb. Being highly sensitive to the values of any one input variable is also worrisome due to model drift and the ease with which an adversary could game the model predictions. Once a model is deployed, its input variable values will likely drift as the real-world environment changes. Partial dependence can show us approximately how a model may react when drift occurs. If the model's simulated drift behavior is problematic, these drift patterns should be monitored closely after the model is deployed. The steps below will calculate partial dependence and ICE on missing values and high out-of-range values to test the model's performance on unseen data. Partial dependence can also be determined from public prediction APIs. A malicious actor can potentially see the same partial dependence plots as the model developers and use this information to construct adversarial examples to ensure they receive the prediction they want from the model. Adversarial examples are discussed further in Section 5.
If a variable's partial dependence curve is mostly flat, the model should not be very sensitive to changes in the value of the variable and this variable would be expected to have low importance. If your model is placing high importance on a variable with a flat partial dependence curve, this could indicate a problem with your model training process, or the partial dependence itself could be misleading because the variable in question could be part of a group of strongly correlated variables or variables with strong interactions. ICE will be used to help increase the trustworthiness of partial dependence in the steps below. If the average global behavior of partial dependence is representative of the individual ICE curves, the partial dependence is likely trustworthy. If ICE curves diverge from partial dependence curves, this is an indication of strong correlation or interactions in the model training data which can cause accuracy problems in many cases. ICE also clues users into any local disparate impact, drift, or security problems: ICE can show if certain regions of the model's response function are overly sensitive to an individual's demographic segment or to any other changes in input variable values that may be interesting from a model drift or adversarial attack perspective.
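The ICE idea described above can be sketched without the trained model: hold one observation fixed, sweep a single input across its domain, and record the predictions. The score function below is a hypothetical monotone stand-in for calling model.predict on an xgb.DMatrix:

```python
import numpy as np
import pandas as pd

def score(row):
    # hypothetical monotone-increasing response in PAY_0,
    # standing in for the trained GBM's predict function
    return 1.0 / (1.0 + np.exp(-(0.8 * row['PAY_0'] - 1.0)))

# one fixed observation whose other inputs are held constant
row = pd.Series({'PAY_0': 0.0, 'LIMIT_BAL': 50000.0})

# ICE curve: vary PAY_0 only, keeping everything else at this row's values
bins = np.arange(-2.0, 9.0, 1.0)
ice = []
for j in bins:
    row_j = row.copy()
    row_j['PAY_0'] = j
    ice.append(score(row_j))

# a monotone increasing constraint implies a non-decreasing ICE curve
assert all(a <= b for a, b in zip(ice, ice[1:]))
```

Averaging many such ICE curves over the rows of a dataset recovers partial dependence, which is why comparing the two reveals correlation and interaction problems.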
In addition to calculating partial dependence, the residuals of partial dependence can also be indicative of model error mechanisms in some cases. The function here can calculate partial dependence, its residuals, and ICE.
def par_dep(xs, frame, model, y=None, resid=False, abs_=False, resolution=20,
            bins=None):

    """ Calculates partial dependence and residuals of partial dependence.

    Args:
        xs: Variable for which to calculate partial dependence or its residuals.
        frame: Pandas DataFrame for which to calculate partial dependence or
               its residuals.
        model: XGBoost model for which to calculate partial dependence or
               its residuals.
        y: Name of original target variable.
        resid: Return residuals of partial dependence instead of partial
               dependence, default False.
        abs_: Return unsigned, absolute residuals, default False.
              (Good for handling both classes in logloss simultaneously.)
        resolution: The number of points across the domain of xs for which
                    to calculate partial dependence or its residuals,
                    default 20.
        bins: List of values at which to set xs, default 20 equally-spaced
              points between column minimum and maximum.

    Returns:
        Pandas DataFrame containing partial dependence or its residual
        values at bins.

    """

    # turn off pesky Pandas copy warning
    pd.options.mode.chained_assignment = None

    # cache original column values
    col_cache = frame.loc[:, xs].copy(deep=True)

    # determine values at which to calculate partial dependence
    if bins is None:
        min_ = frame[xs].min()
        max_ = frame[xs].max()
        by = (max_ - min_) / resolution
        # modify max and by
        # to preserve resolution and actually search up to max
        bins = np.arange(min_, (max_ + by), (by + np.round((1. / resolution) * by, 3)))

    # residuals of partial dependence
    if resid:
        # initialize empty Pandas DataFrame with correct column names
        return_frame = pd.DataFrame(columns=[xs, 'residual'])
        for j in bins:
            # frame to cache intermediate results
            rframe_ = pd.DataFrame(columns=['actual', 'pred', 'res'])
            frame.loc[:, xs] = j
            dframe = xgb.DMatrix(frame.drop(y, axis=1))
            # reset index for expected merge behavior
            rframe_['actual'] = frame[y].reset_index(drop=True)
            rframe_['pred'] = pd.DataFrame(model.predict(dframe,
                                           ntree_limit=model.best_ntree_limit))
            # logloss residual
            rframe_['res'] = -rframe_['actual']*np.log(rframe_['pred']) -\
                             (1 - rframe_['actual'])*np.log(1 - rframe_['pred'])
            if abs_:
                # optionally return absolute value
                resid_j = np.abs(rframe_['res']).mean()
            else:
                resid_j = rframe_['res'].mean()
            del rframe_
            return_frame = return_frame.append({xs: j,
                                                'residual': resid_j},
                                               ignore_index=True)

    # partial dependence
    else:
        # initialize empty Pandas DataFrame with correct column names
        return_frame = pd.DataFrame(columns=[xs, 'partial_dependence'])
        for j in bins:
            frame.loc[:, xs] = j
            dframe = xgb.DMatrix(frame)
            par_dep_i = pd.DataFrame(model.predict(dframe,
                                     ntree_limit=model.best_ntree_limit))
            par_dep_j = par_dep_i.mean()[0]
            return_frame = return_frame.append({xs: j,
                                                'partial_dependence': par_dep_j},
                                               ignore_index=True)

    # return input frame to original cached state
    frame.loc[:, xs] = col_cache

    return return_frame
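The logloss residual used inside par_dep can be checked in isolation. A minimal example with arbitrary values:

```python
import numpy as np

# same logloss residual as in par_dep: r = -y*log(p) - (1 - y)*log(1 - p)
def logloss_residual(actual, pred):
    return -actual * np.log(pred) - (1 - actual) * np.log(1 - pred)

# a confident, correct prediction earns a smaller residual than a
# confident, wrong one
assert logloss_residual(1, 0.9) < logloss_residual(1, 0.1)

# at p = 0.5 the residual is -log(0.5) regardless of the actual class
assert np.isclose(logloss_residual(0, 0.5), -np.log(0.5))
assert np.isclose(logloss_residual(1, 0.5), -np.log(0.5))
```

Averaging this residual over a whole column of simulated values, as par_dep does, shows how model error changes as an input is pushed across (or beyond) its training domain.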
Users should consider how their models will behave in the future if their operating environment changes, such as in a recession, and then explicitly generate data to test model behavior in those scenarios. Partial dependence, its residuals, and ICE will be calculated for values in the training data, for missing values, and for values of PAY_0 worse than those found in the training data, up to 10 months late on the most recent repayment.
bins = [np.nan, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0]
The table below includes partial dependence and its residuals on missing and out-of-range data.
# calculate partial dependence
par_dep_PAY_0 = par_dep('PAY_0', test[X], xgb_model, bins=bins)
# calculate both classes of residuals on the same bins as above
bins = list(par_dep_PAY_0['PAY_0'])
# residuals for target class 0
par_dep_PAY_0['residual_0'] = par_dep('PAY_0', test.loc[test[y] == 0, X + [y]],
xgb_model, y=y, resid=True,
bins=bins)['residual']
# residuals for target class 1
par_dep_PAY_0['residual_1'] = par_dep('PAY_0', test.loc[test[y] == 1, X + [y]],
xgb_model, y=y, resid=True,
bins=bins)['residual']
par_dep_PAY_0
PAY_0 | partial_dependence | residual_0 | residual_1 | |
---|---|---|---|---|
0 | NaN | 0.165248 | 0.160465 | 1.581913 |
1 | -2.0 | 0.165248 | 0.160465 | 1.581913 |
2 | -1.5 | 0.175359 | 0.172128 | 1.522745 |
3 | -1.0 | 0.175359 | 0.172128 | 1.522745 |
4 | -0.5 | 0.175790 | 0.172634 | 1.519659 |
5 | 0.0 | 0.175793 | 0.172638 | 1.519644 |
6 | 0.5 | 0.261269 | 0.279674 | 1.136142 |
7 | 1.0 | 0.261269 | 0.279674 | 1.136142 |
8 | 1.5 | 0.556665 | 0.809080 | 0.484478 |
9 | 2.0 | 0.556665 | 0.809080 | 0.484478 |
10 | 2.5 | 0.556758 | 0.809351 | 0.484261 |
11 | 3.0 | 0.556758 | 0.809351 | 0.484261 |
12 | 3.5 | 0.556760 | 0.809355 | 0.484258 |
13 | 4.0 | 0.556760 | 0.809355 | 0.484258 |
14 | 4.5 | 0.556760 | 0.809355 | 0.484258 |
15 | 5.0 | 0.556760 | 0.809355 | 0.484258 |
16 | 5.5 | 0.556760 | 0.809355 | 0.484258 |
17 | 6.0 | 0.556760 | 0.809355 | 0.484258 |
18 | 6.5 | 0.556760 | 0.809355 | 0.484258 |
19 | 7.0 | 0.556760 | 0.809355 | 0.484258 |
20 | 7.5 | 0.556760 | 0.809355 | 0.484258 |
21 | 8.0 | 0.556760 | 0.809355 | 0.484258 |
22 | 8.5 | 0.556760 | 0.809355 | 0.484258 |
23 | 9.0 | 0.556760 | 0.809355 | 0.484258 |
24 | 9.5 | 0.556760 | 0.809355 | 0.484258 |
25 | 10.0 | 0.556760 | 0.809355 | 0.484258 |
The expected monotonic behavior for PAY_0
w.r.t. the model predictions holds even on missing data and out-of-range data. Although robust to missing values and high out-of-range values, this dependable behavior can make the model more susceptible to adversarial example attacks. Also, is the large swing in predicted probabilities w.r.t. PAY_0
justified or is the model over-emphasizing this input variable? These possibilities are analyzed in subsequent sections.
ICE can be calculated for any row in the training or test data, but without intimate knowledge of a data source it can be difficult to know where to apply ICE. Calculating and analyzing ICE curves for every row of training and test data can be overwhelming, even for the example credit card default dataset. One place to start with ICE is to calculate ICE curves at every decile of predicted probabilities in a dataset, giving an indication of local prediction behavior across the dataset. The function below finds and returns the row indices for the maximum, minimum, and deciles of one column in terms of another.
def get_percentile_dict(sort_var, id_, frame):
""" Returns the percentiles of a column, sort_var, as the indices based on
another column id_.
Args:
sort_var: Column in which to find percentiles.
id_: Row id, column that stores indices for percentiles of yhat.
frame: Pandas DataFrame containing sort_var and id_.
Returns:
Dictionary of percentile values and index column values.
"""
# create a copy of frame and sort it by sort_var
sort_df = frame.copy(deep=True)
sort_df.sort_values(sort_var, inplace=True)
sort_df.reset_index(inplace=True)
# find top and bottom percentiles
percentiles_dict = {}
percentiles_dict[0] = sort_df.loc[0, id_]
percentiles_dict[99] = sort_df.loc[sort_df.shape[0]-1, id_]
# find 10th-90th percentiles
inc = sort_df.shape[0]//10
for i in range(1, 10):
percentiles_dict[i * 10] = sort_df.loc[i * inc, id_]
return percentiles_dict
p_DEFAULT_NEXT_MONTH in the test data
The values for ID
that correspond to the maximum, minimum, and deciles of predicted probability of default, or p_DEFAULT_NEXT_MONTH
, are displayed below. ICE will be calculated for the rows of the test dataset associated with these ID
values. Logloss residuals, or r_DEFAULT_NEXT_MONTH
are also calculated here for use in Section 5.
# merge GBM predictions and residuals onto test data
yhat = 'p_DEFAULT_NEXT_MONTH'
resid = 'r_DEFAULT_NEXT_MONTH' # residuals for use in section 5
yhat_test = pd.concat([test.reset_index(drop=True), pd.DataFrame(xgb_model.predict(dtest, ntree_limit=xgb_model.best_ntree_limit))], axis=1)
yhat_test = yhat_test.rename(columns={0: yhat})
yhat_test[resid] = -yhat_test[y]*np.log(yhat_test[yhat]) -\
(1 - yhat_test[y])*np.log(1 - yhat_test[yhat]) # logloss
# find percentiles of predictions
yhat_percentile_dict = get_percentile_dict(yhat, 'ID', yhat_test)
# display percentiles dictionary
# ID values for rows
# from lowest prediction
# to highest prediction
yhat_percentile_dict
{0: 23477, 99: 17757, 10: 6226, 20: 25603, 30: 12890, 40: 715, 50: 14517, 60: 4908, 70: 7411, 80: 6219, 90: 18421}
Here ICE curves are calculated at the minimum, maximum, and deciles of p_DEFAULT_NEXT_MONTH
. In practice ICE should also be applied to understand local disparate impact, drift, and security concerns. For example, it's reasonable to demand that changing any individual's demographic attributes should not greatly impact their prediction outcome. Additionally, ICE can uncover local, high-sensitivity regions in the data or trained model response function. Perhaps a variable's average effect is negligible, but for certain high probability of default individuals, changing a variable value has a strong impact on model predictions. Or perhaps certain interactions only come into play for high probability of default individuals. ICE can be used to find a number of such locally interesting phenomena. ICE for PAY_0
is calculated here, along with partial dependence and ICE for AGE
-- which will be used in Section 5.
# calculate partial dependence for AGE
# (used later in section 5)
# (interesting from disparate impact perspective)
par_dep_AGE = par_dep('AGE', test[X], xgb_model)
# retrieve bins from partial dependence calculation
bins_PAY_0 = list(par_dep_PAY_0['PAY_0'])
bins_AGE = list(par_dep_AGE['AGE'])
# for each percentile in percentile_dict
# create a new column in the par_dep frame
# representing the ICE curve for that percentile
# and the variables of interest, PAY_0 and AGE
for i in sorted(yhat_percentile_dict.keys()):
col_name = 'Percentile_' + str(i)
# ICE curves for PAY_0 across percentiles at bins_PAY_0 intervals
par_dep_PAY_0[col_name] = par_dep('PAY_0',
test[test['ID'] == int(yhat_percentile_dict[i])][X],
xgb_model,
bins=bins_PAY_0)['partial_dependence']
# ICE curves for AGE across percentiles at bins_AGE intervals
par_dep_AGE[col_name] = par_dep('AGE',
test[test['ID'] == int(yhat_percentile_dict[i])][X],
xgb_model,
bins=bins_AGE)['partial_dependence']
par_dep_PAY_0.head(n=5)
PAY_0 | partial_dependence | residual_0 | residual_1 | Percentile_0 | Percentile_10 | Percentile_20 | Percentile_30 | Percentile_40 | Percentile_50 | Percentile_60 | Percentile_70 | Percentile_80 | Percentile_90 | Percentile_99 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | NaN | 0.165248 | 0.160465 | 1.581913 | 0.001872 | 0.057264 | 0.076212 | 0.097046 | 0.127956 | 0.142703 | 0.173241 | 0.222074 | 0.261528 | 0.151399 | 0.785043 |
1 | -2.0 | 0.165248 | 0.160465 | 1.581913 | 0.001872 | 0.057264 | 0.076212 | 0.097046 | 0.127956 | 0.142703 | 0.173241 | 0.222074 | 0.261528 | 0.151399 | 0.785043 |
2 | -1.5 | 0.175359 | 0.172128 | 1.522745 | 0.002067 | 0.061132 | 0.081262 | 0.102698 | 0.133810 | 0.153896 | 0.185654 | 0.236098 | 0.268033 | 0.162736 | 0.806465 |
3 | -1.0 | 0.175359 | 0.172128 | 1.522745 | 0.002067 | 0.061132 | 0.081262 | 0.102698 | 0.133810 | 0.153896 | 0.185654 | 0.236098 | 0.268033 | 0.162736 | 0.806465 |
4 | -0.5 | 0.175790 | 0.172634 | 1.519659 | 0.002302 | 0.061587 | 0.082731 | 0.102988 | 0.133810 | 0.153896 | 0.185654 | 0.236098 | 0.268033 | 0.163392 | 0.812470 |
Overlaying partial dependence onto ICE in a plot is a convenient way to validate and understand both global and local monotonic behavior, even on out-of-range and missing data. Plots of partial dependence curves overlaid onto ICE curves for several percentiles of predictions for DEFAULT_NEXT_MONTH
are used to validate monotonic behavior on missing and out-of-range data, to describe the GBM model mechanisms, and to compare local GBM behavior with average GBM behavior in the test data to determine the trustworthiness of partial dependence.
def plot_par_dep_ICE(xs, par_dep_frame, ax=None, ticks=None, labels=None):
""" Plots ICE overlayed onto partial dependence for a single variable.
Conditionally uses user-defined axes, ticks, and labels for grouped subplots.
Args:
xs: Name of variable for which to plot ICE and partial dependence.
par_dep_frame: Name of Pandas DataFrame containing ICE and partial
dependence values.
ax: Matplotlib axis object to use, default None.
ticks: List of numeric x-axis tick marks, default None.
labels: List of string values to use as the visual label for ticks,
default None.
"""
# apply user-defined ticks and labels for grouped subplots
if (ticks is None) & (labels is None) & (ax is None):
pass
else:
_ = ax.set(xticks=ticks, xticklabels=labels) # user-defined
# for standalone plotting
if ax is None:
# initialize figure and axis
fig, ax = plt.subplots()
# plot ICE
par_dep_frame.drop('partial_dependence', axis=1).plot(x=xs,
colormap='coolwarm',
ax=ax)
# overlay partial dependence, annotate plot
par_dep_frame.plot(title='Partial Dependence with ICE',
x=xs,
y='partial_dependence',
color='grey',
linewidth=3,
ax=ax)
# for grouped subplots
else:
# plot ICE
par_dep_frame.drop('partial_dependence', axis=1).plot(x=xs,
colormap='coolwarm',
ax=ax)
# overlay partial dependence, annotate plot
par_dep_frame.plot(title='Partial Dependence with ICE',
x=xs,
y='partial_dependence',
color='grey',
linewidth=3,
ax=ax)
# add legend
_ = plt.legend(bbox_to_anchor=(1.05, 0),
loc=3,
borderaxespad=0.)
PAY_0 grouped by target class
Another aspect of determining the trustworthiness of partial dependence and ICE is understanding the amount of data available for the model to learn from in different local regions. For any valid input data schema and values, a model will generate a prediction. This does not mean the model learned anything meaningful about that new kind of input data during training. To look for any possible epistemic uncertainty issues, a histogram is plotted alongside partial dependence, its residuals, and ICE.
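The same epistemic uncertainty concern can be checked numerically before plotting. A minimal sketch, using a toy stand-in for the test frame (the data and the threshold of 5 rows are illustrative assumptions), counts the rows available at each PAY_0 value and flags low-support regions:

```python
import pandas as pd

# toy stand-in for the test frame used in this notebook
toy = pd.DataFrame({'PAY_0': [0] * 50 + [1] * 20 + [2] * 10 + [5] * 1})

# rows available at each PAY_0 value
support = toy['PAY_0'].value_counts()

# values with too little data to support confident predictions
# (the threshold of 5 rows is purely illustrative)
low_support = support[support < 5].index.tolist()
```

Predictions for rows falling in `low_support` regions deserve extra scrutiny, no matter what probability the model emits.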
# setup ########################################################################
# tricks to have NaN appear on the x-axis
par_dep_PAY_0.iloc[0, 0] = -2.5
ticks = [-2.5] + list(par_dep_PAY_0['PAY_0'])[2:][::3]
labels = ['NaN'] + list(par_dep_PAY_0['PAY_0'])[2:][::3]
# setup 3-way figure
fig, (ax0, ax1, ax2) = plt.subplots(ncols=3, sharey=False)
plt.tight_layout()
plt.subplots_adjust(left=0.0, right=2.5, wspace=0.2)
# histogram grouped by target class for PAY_0 ##################################
from matplotlib.lines import Line2D # custom legend
custom_lines = [Line2D([0], [0], color='royalblue', lw=4),
Line2D([0], [0], color='deeppink', lw=4)]
# establish groups and plot w/ custom legend
gb = test.groupby(['PAY_0', y]).count()
gb['color'] = 'deeppink'
gb.loc[gb.index[::2], 'color'] = 'royalblue' # avoid chained-assignment pitfall
_ = test.groupby(['PAY_0', y]).count().plot(kind='bar', y='ID', legend=False,
color=gb['color'],
title='Distribution of DEFAULT_NEXT_MONTH by PAY_0',
ax=ax0)
_ = ax0.legend(custom_lines, ['DEFAULT_NEXT_MONTH: 0', 'DEFAULT_NEXT_MONTH: 1'])
# partial dependence and residuals #############################################
_ = ax1.set(xticks=ticks, xticklabels=labels)
_ = par_dep_PAY_0.plot(kind='line', x='PAY_0', y='partial_dependence',
color='darkgrey', linewidth=3,
title='Partial Dependence with LogLoss Residuals',
ax=ax1)
_ = par_dep_PAY_0.plot(kind='line', x='PAY_0', y='residual_0', color='royalblue',
linestyle='--', ax=ax1)
_ = par_dep_PAY_0.plot(kind='line', x='PAY_0', y='residual_1', color='deeppink',
linestyle='--', ax=ax1)
# partial dependence and ICE ###################################################
plot_par_dep_ICE('PAY_0', par_dep_PAY_0.drop(['residual_0', 'residual_1'],
axis=1),
ax2, ticks, labels)
These three plots together are highly informative.
The monotonicity of the learned GBM response function has been confirmed at many percentiles of model predictions locally and on average globally.
In general there is not much real data for PAY_0 > 2
, so predictions in that range are low-confidence, no matter what the modeled probability is. Luckily this model was trained with monotonicity constraints and it carries the predictions from PAY_0 = 2
forward even into out-of-range data. This seems reasonable. However, new customers with PAY_0 > 2
could also be handed over to human case workers or to a more transparent business rule system to be even more cautious.
This model assigns missing data a lower probability of default than a customer who paid their most recent bill in full. This seems worrisome from a predictive accuracy, model drift, and security perspective. Logically, why would NaN == -2? Also, PAY_0 is the most important variable in the model, so setting it to missing would often result in a customer being seen as likely to pay future bills. Is this prediction behavior prudent from an adversarial, drift, or security standpoint?
Partial dependence and ICE curves do not diverge, indicating that partial dependence is trustworthy and representative of local behavior and also that strong interactions with other variables are likely not affecting model predictions w.r.t. PAY_0
in the observed test set.
The customer at the 90th percentile of predicted probability of default has the largest observed swing of prediction values across the domain of PAY_0
. This information will guide a search for adversarial examples in Section 5.
The distribution of actual payments and defaults is fairly equal for high values of PAY_0
, say PAY_0
> 3, and yet when the scenario in which PAY_0
is high for everyone is simulated, this leads to both high predicted probabilities and large residuals for DEFAULT_NEXT_MONTH = 0
individuals. In this case, the inputs to the model changed, but not the labels, which means that when someone has paid their bill but has bad values for PAY_0, the model generates high probabilities of default, leading to large errors. A mechanism for error has been identified: over-emphasis of a feature. That is, PAY_0 > 2 leads to high error when someone does pay their bill the next month.
This over-emphasis on PAY_0
could be removed from the model by several means: editing the final model artifact, injecting missing values or applying regularization during training, or injecting missing values based on assertions during scoring.
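As a minimal sketch of the last of those options, a scoring-time assertion could replace out-of-range PAY_0 values with missing before prediction, so the model's missing-value behavior, rather than an over-emphasized split, drives the prediction. The function name and the trusted upper bound of 2 are assumptions for illustration, not part of this notebook's pipeline:

```python
import numpy as np
import pandas as pd

def assert_pay_0_in_range(frame, upper=2):
    """Hypothetical scoring-time assertion: set PAY_0 to NaN when it
    exceeds the range well-supported by training data, so the model
    falls back to its missing-value handling."""
    frame = frame.copy(deep=True)  # avoid mutating the caller's data
    frame.loc[frame['PAY_0'] > upper, 'PAY_0'] = np.nan
    return frame

# toy scoring batch; the last row is out of the trusted range
batch = pd.DataFrame({'PAY_0': [0.0, 2.0, 8.0]})
checked = assert_pay_0_in_range(batch)
```

Such an assertion would run in the scoring pipeline immediately before the call to `model.predict`.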
The GBM model is highly sensitive to PAY_0
values, but what other variables' values have a large impact on model predictions? In this section, brute-force searches will be conducted for rows of data, known as adversarial examples, that exert a strong influence on the trained model. Information learned from Shapley values and ICE will guide a search for rows that can swing model predictions high and low. These rows are particularly interesting from a drift and security viewpoint. If market conditions change so that high-impact values of important features become more common, e.g. in a recession, then this search will help users understand potential future model behavior. Combinations of variables with strong effects can also be recorded and then monitored for in real time as an active preventative measure against malicious adversarial attacks. Then a search for adversarial examples that minimize model residuals for high-residual customers will also be conducted to confirm, isolate, or remediate any previously known or unknown error mechanisms. Finally, a search for combinations of the demographic variables, AGE
, SEX
, MARRIAGE
, and EDUCATION
, that have a strong effect on model predictions will be undertaken to help elucidate any unforeseen disparate impact problems.
In the ICE plot in Section 4, the customer at the 90th percentile of predicted probability of default was shown to have a high degree of variability in their predictions across the values of PAY_0
. The row will serve as a prototype in a search for adversarial examples.
For reference, the values for each variable in the prototype row are displayed.
yhat_test[yhat_test['ID'] == yhat_percentile_dict[90]]
ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | DEFAULT_NEXT_MONTH | p_DEFAULT_NEXT_MONTH | r_DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
5594 | 18421 | 40000 | 2 | 2 | 2 | 24 | 2 | 0 | 0 | 0 | 0 | 0 | 19012 | 20018 | 21047 | 22360 | 22499 | 22969 | 1326 | 1355 | 1668 | 807 | 834 | 837 | 1 | 0.595022 | 0.519157 |
The search for adversarial examples takes a prototype row, perturbs the values of certain variables in the row, and records the new prediction and residual for each new perturbed row.
def find_adversaries(xs, frame, model, row_id, oor_proportion=0.1,
resolution=10, verbose=False):
""" A brute force function to find adversaries. Dynamically generates nested loops
to populate adversary_frame with perturbed values and their predictions
and logloss residuals.
ASSUMES global vars y, yhat, and resid have been defined previously.
Args:
xs: List of variables over which to find adversaries.
frame: Pandas DataFrame for which to calculate bounds for adversary search.
row_id is assumed to be in frame.
model: Model to use in adversary search.
row_id: Prototype row on which the search is based.
oor_proportion: The proportion by which the search can exceed minimum and
maximum values in frame. Must be between 0-1, default 0.1.
resolution: The number of points across the domain of xs for which
to search for adversaries, default 10 due to deep nesting.
verbose: Boolean, whether to print generated code statements.
Returns:
Frame containing all tested values and their associated predictions
and logloss residuals.
"""
# init dicts to hold bin values
bins_dict = {}
# find boundaries, bins and record
for j in xs:
min_ = frame[j].min()
max_ = frame[j].max()
min_ = min_ - np.abs(oor_proportion*min_)
max_ = max_ + np.abs(oor_proportion*max_)
by = (max_ - min_)/resolution
# modify max and by
# to preserve resolution and actually search up to max
bins_dict[j] = np.arange(min_, (max_ + by), (by + np.round((1. / resolution) * by, 3)))
bins_dict[j] = np.round_(bins_dict[j], 6) # reasonable precision
# initialize prototype row
# deep copy to prevent corruption of original data
row = frame[frame['ID'] == row_id].copy(deep=True)
# generate nested loops dynamically ############################################
# to search all vals in all search cols
# init true tab
# define code variable and init returned Pandas DataFrame, adversary_frame
tab = ' '
code = 'global adversary_frame\n'
code += 'adversary_frame = pd.DataFrame(columns=xs + [yhat, resid])\n'
# generate for loop statements for search
for i, j in enumerate(xs):
code += i*tab + 'for ' + string.ascii_lowercase[i] + ' in ' +\
str(list(bins_dict[j])) + ':\n'
# generate value assignment statements to perturb search vars
for k, j in enumerate(xs):
code += (i + 1)*tab + 'row["' + j + '"] = ' + string.ascii_lowercase[k] +\
'\n'
# generate progress reporting statements
# generate statements for appending test values, preds, and resids to adversary_frame
# uses only absolute residuals to avoid averaging problems between 0/1 target classes
code += (i + 1)*tab + 'if (adversary_frame.shape[0] + 1) % 1000 == 0:\n'
code += (i + 2)*tab +\
'print("Built %i/%i rows ..." % (adversary_frame.shape[0] + 1, (resolution)**(i+1)))\n'
code += (i + 1)*tab +\
'adversary_frame = adversary_frame.append(row, ignore_index=True, sort=False)\n'
code += 'print("Scoring ...")\n'
code += 'adversary_frame[yhat] = model.predict(xgb.DMatrix(adversary_frame[model.feature_names]), ntree_limit=model.best_ntree_limit)\n'
code += 'adversary_frame[resid] = np.abs(adversary_frame[y]*np.log(adversary_frame[yhat]) - (1 - adversary_frame[y])*np.log(1 - adversary_frame[yhat]))\n'
code += 'print("Done.")'
if verbose:
print('Executing:')
print(code)
print('--------------------------------------------------------------------------------')
# execute generated code
exec(code)
return adversary_frame
Due to fluctuations observed in the ICE plot above, the customer at the 90th percentile of probability of default will serve as the prototype row. In the Shapley summary plot in Section 3, PAY_0
, PAY_3
, PAY_AMT1
, and PAY_AMT2
, exhibit the most extreme local prediction contributions. The search for adversarial examples will be conducted over perturbations of these four variables. The search will consider 10,000 combinations of PAY_0
, PAY_3
, PAY_AMT1
, and PAY_AMT2
to find adversarial examples with the largest impact on model predictions and to understand the contributions and interactions of these variables in the model in general.
res_ = 10 # keep resolution small to avoid long run time
# four variables with widest Shapley contribs
search_cols = ['PAY_0', 'PAY_3', 'PAY_AMT1', 'PAY_AMT2']
# execute search
yhat_adversaries = find_adversaries(search_cols, test, xgb_model, yhat_percentile_dict[90], resolution=res_)
Built 1000/10000 rows ... Built 2000/10000 rows ... Built 3000/10000 rows ... Built 4000/10000 rows ... Built 5000/10000 rows ... Built 6000/10000 rows ... Built 7000/10000 rows ... Built 8000/10000 rows ... Built 9000/10000 rows ... Built 10000/10000 rows ... Scoring ... Done.
These, and similar, rows minimize the predicted probability of default for the prototype row. They could be used by a malicious attacker in an integrity attack using hacked or falsified application information to maximize their chance at receiving undeserved credit line increases, loans, or other financially beneficial outcomes. These rows should be recorded and monitored for after the model is deployed.
yhat_adversaries.sort_values(by=yhat).head(n=3)
PAY_0 | PAY_3 | PAY_AMT1 | PAY_AMT2 | p_DEFAULT_NEXT_MONTH | r_DEFAULT_NEXT_MONTH | ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_2 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
65 | -2.2 | -2.2 | 634198.752 | 735359.955 | 0.026825 | 3.618438 | 18421.0 | 40000.0 | 2.0 | 2.0 | 2.0 | 24.0 | 0.0 | 0.0 | 0.0 | 0.0 | 19012.0 | 20018.0 | 21047.0 | 22360.0 | 22499.0 | 22969.0 | 1668.0 | 807.0 | 834.0 | 837.0 | 1.0 |
52 | -2.2 | -2.2 | 528498.960 | 294143.982 | 0.026825 | 3.618438 | 18421.0 | 40000.0 | 2.0 | 2.0 | 2.0 | 24.0 | 0.0 | 0.0 | 0.0 | 0.0 | 19012.0 | 20018.0 | 21047.0 | 22360.0 | 22499.0 | 22969.0 | 1668.0 | 807.0 | 834.0 | 837.0 | 1.0 |
51 | -2.2 | -2.2 | 528498.960 | 147071.991 | 0.026825 | 3.618438 | 18421.0 | 40000.0 | 2.0 | 2.0 | 2.0 | 24.0 | 0.0 | 0.0 | 0.0 | 0.0 | 19012.0 | 20018.0 | 21047.0 | 22360.0 | 22499.0 | 22969.0 | 1668.0 | 807.0 | 834.0 | 837.0 | 1.0 |
These, and similar, rows maximize the predicted probability of default for the prototype row. They could be used by a malicious attacker in a hacked or falsified application in a denial-of-service (DOS) attack to minimize another customer's chances of receiving credit line increases, loans, or other financially beneficial outcomes. These rows should be recorded and monitored for after the model is deployed.
yhat_adversaries.sort_values(by=yhat).tail(n=3)
PAY_0 | PAY_3 | PAY_AMT1 | PAY_AMT2 | p_DEFAULT_NEXT_MONTH | r_DEFAULT_NEXT_MONTH | ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_2 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
7900 | 6.27 | 8.69 | 0.0 | 0.0 | 0.60917 | 0.495658 | 18421.0 | 40000.0 | 2.0 | 2.0 | 2.0 | 24.0 | 0.0 | 0.0 | 0.0 | 0.0 | 19012.0 | 20018.0 | 21047.0 | 22360.0 | 22499.0 | 22969.0 | 1668.0 | 807.0 | 834.0 | 837.0 | 1.0 |
7600 | 6.27 | 5.06 | 0.0 | 0.0 | 0.60917 | 0.495658 | 18421.0 | 40000.0 | 2.0 | 2.0 | 2.0 | 24.0 | 0.0 | 0.0 | 0.0 | 0.0 | 19012.0 | 20018.0 | 21047.0 | 22360.0 | 22499.0 | 22969.0 | 1668.0 | 807.0 | 834.0 | 837.0 | 1.0 |
6600 | 5.06 | 5.06 | 0.0 | 0.0 | 0.60917 | 0.495658 | 18421.0 | 40000.0 | 2.0 | 2.0 | 2.0 | 24.0 | 0.0 | 0.0 | 0.0 | 0.0 | 19012.0 | 20018.0 | 21047.0 | 22360.0 | 22499.0 | 22969.0 | 1668.0 | 807.0 | 834.0 | 837.0 | 1.0 |
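One lightweight way to operationalize the monitoring suggested above, sketched here with hypothetical names and values, is to round the searched variables in each incoming row and compare them against a stored table of recorded adversarial combinations:

```python
import pandas as pd

# hypothetical store of recorded adversarial (PAY_0, PAY_3) combinations;
# values are rounded so small perturbations still match
adversary_signatures = {(-2.0, -2.0), (5.1, 5.1)}

def flag_adversary(row, keys=('PAY_0', 'PAY_3'), ndigits=1):
    """Returns True when a scoring row matches a recorded adversarial
    combination of the searched variables."""
    signature = tuple(round(float(row[k]), ndigits) for k in keys)
    return signature in adversary_signatures

# an incoming application close to a recorded low-prediction adversary
incoming = pd.Series({'PAY_0': -2.04, 'PAY_3': -1.98, 'PAY_AMT1': 500.0})
suspicious = flag_adversary(incoming)
```

Flagged rows could be routed to manual review rather than scored automatically; the rounding precision trades off false positives against missed near-matches.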
This function groups the DataFrame by two search variables and uses a custom descriptive statistic to summarize the group in terms of predictions or residuals. The summarized value is then plotted on a surface in terms of the two groupby variables.
def plot_sm_3d(frame, resolution, search_cols, z_axis_var, stat):
"""Plots 6 small multiples in a grouped subplot to display adversary search
results from find_adversaries.
Args:
frame: Pandas DataFrame containing potential adversaries generated by
find_adversaries.
resolution: The number of points across the domain of the variables in
search_cols in find_adversaries.
search_cols: List of search_cols used in find_adversaries.
z_axis_var: The name of the variable by which to group the frame and
that will appear in the title and z-axis label of the
generated plot.
stat: Groupby stat for z_axis_var and frame.
"""
max_ = frame[z_axis_var].max() # establish z-axis high limit
fig = plt.figure(figsize=plt.figaspect(0.67)*2.5) # init 2 x 3 plot
_2d_shape = (resolution, resolution) # correct shape for 3-D display of 1-D arrays
# loop through 2-way combos of search_cols
for i, two_way_combo in enumerate(list(combinations(search_cols, 2))):
# many unique values exist for each 2-way combo of search_cols values
# summarize by stat to plot concisely
# execute groupby
groups = eval('frame[[two_way_combo[0], two_way_combo[1], z_axis_var]]\
.groupby([two_way_combo[0], two_way_combo[1]]).' + stat + '()')
groups = groups.reset_index() # convert groupby vals to cols (from index)
ax = fig.add_subplot(2, 3, (i + 1), projection='3d') # subplot placement
# values and labels for each axis
x = np.asarray(groups[two_way_combo[0]]).reshape(_2d_shape)
ax.set_xlabel('\n' + two_way_combo[0])
y = np.asarray(groups[two_way_combo[1]]).reshape(_2d_shape)
ax.set_ylabel('\n' + two_way_combo[1])
z = np.asarray(groups[z_axis_var]).reshape(_2d_shape)
ax.set_zlabel('\n' + stat + ' ' + str(z_axis_var))
ax.set_zlim3d(0, max_) # ensure constant scale
# plot
surf = ax.plot_surface(x, y, z,
cmap=cm.coolwarm,
linewidth=0.05,
rstride=1,
cstride=1,
antialiased=True)
# main title and display
fig.suptitle(stat.title() + ' ' + str(z_axis_var) + ' for ' + str(search_cols))
plt.tight_layout()
_ = plt.show()
The plots below show the maximum model prediction generated by an adversary in terms of values of the search variables.
plot_sm_3d(yhat_adversaries, res_, search_cols, yhat, 'max') # plot max prediction
The search has revealed several interesting patterns:
The monotonicity of the learned GBM response function was preserved across this set of variables.
Another potential error mechanism has been discovered: non-robust interaction features. Even though the PDP and ICE for PAY_0
showed no evidence of strong interactions with other variables, there appear to be some interactions between PAY_0
, PAY_AMT1
, and PAY_AMT2
, which may not be observed in the training or validation data, but that have a strong effect on model predictions. They should be remediated by some method. For instance, high PAY_0
values and high PAY_AMT1
values seem to have a strong effect on model predictions, but are likely unrealistic and unobserved, because at a certain point, PAY_AMT1
values exceed a customer's LIMIT_BAL
-- meaning they cannot technically default. Remediating the impact of this interaction, potentially by editing the final model artifact, by injecting missing values or applying regularization during training, or by injecting missing values based on assertions during scoring, could lead to more accurate model predictions on future data.
This model is vulnerable to adversarial attack, especially using PAY_0
and PAY_3
. Obvious combinations of values for PAY_0
and PAY_3
lead to extreme predicted probabilities of default, but so do other combinations of the repayment status and payment amount variables. Perhaps more surprisingly, combinations of low values for the payment amount variables alone can also lead to high predictions. These payment amount value combinations could be the more subtle route for an adversarial denial-of-service (DOS) attack. Active counter-measures should be considered.
Users should investigate these patterns to get a sense for how their model might perform if data drifts in any of the searched directions.
By attempting to decrease the residual of a high-error row, it may be possible to decrease the overall error of the model or to learn more about general error mechanisms in the model.
A customer with a high residual will serve as the prototype row for a search to reduce residual values.
resid_percentile_dict = get_percentile_dict(resid, 'ID', yhat_test) # dictionary of row ID and residuals
high_row = yhat_test[yhat_test['ID'] == resid_percentile_dict[80]] # take customer with high residual
high_row # display
ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | DEFAULT_NEXT_MONTH | p_DEFAULT_NEXT_MONTH | r_DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2083 | 6898 | 120000 | 1 | 2 | 2 | 32 | 1 | 2 | 2 | 2 | 2 | 2 | 60316 | 61763 | 63154 | 64524 | 66345 | 68104 | 3000 | 3000 | 3000 | 3000 | 3000 | 0 | 1 | 0.584431 | 0.537117 |
This search will be over the same combinations of PAY_0
, PAY_3
, PAY_AMT1
, and PAY_AMT2
, but results will be analyzed in terms of minimum residual instead of maximum prediction.
# four variables with widest Shapley contribs
search_cols = ['PAY_0', 'PAY_3', 'PAY_AMT1', 'PAY_AMT2']
# search for high-residual adversaries
high_resid_adversaries = find_adversaries(search_cols, test, xgb_model, resid_percentile_dict[80], resolution=res_)
# plot minimum residual
plot_sm_3d(high_resid_adversaries, res_, search_cols, resid, 'min')
Built 1000/10000 rows ... Built 2000/10000 rows ... Built 3000/10000 rows ... Built 4000/10000 rows ... Built 5000/10000 rows ... Built 6000/10000 rows ... Built 7000/10000 rows ... Built 8000/10000 rows ... Built 9000/10000 rows ... Built 10000/10000 rows ... Scoring ... Done.
These results largely confirm the findings from the search over predictions for these same variables above and from the ICE plots. Interactions with low PAY_3
values or high PAY_AMT1
and high PAY_AMT2
values do worsen the over-emphasis on PAY_0
in ways that seem robust, just too dramatic. Interestingly, low PAY_0
and high PAY_AMT2
values exhibit the highest residual for this relatively high-risk individual. Also, the same non-robust lack of sensitivity to PAY_AMT
* variables is observed when PAY_0
is high.
AGE, EDUCATION, SEX and MARRIAGE
This search will look for combinations of demographic segments that may experience disparate impact under the trained model.
ICE will be used again to select a prototype row with predictions that vary with AGE
.
plot_par_dep_ICE('AGE', par_dep_AGE)
The prototype row for the demographic adversary search is a married male with a university education.
yhat_test[yhat_test['ID'] == yhat_percentile_dict[70]]
ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | DEFAULT_NEXT_MONTH | p_DEFAULT_NEXT_MONTH | r_DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2272 | 7411 | 20000 | 1 | 2 | 1 | 52 | 0 | 0 | 0 | 0 | 0 | 0 | 15768 | 16792 | 18105 | 18157 | 18538 | 19200 | 1285 | 1598 | 650 | 674 | 971 | 580 | 0 | 0.236098 | 0.269315 |
Search for adversaries whose predictions change widely due to changes in demographic variables. This search will be analyzed in terms of maximum prediction for each tested combination of AGE, EDUCATION, SEX, and MARRIAGE.
search_cols = ['AGE', 'EDUCATION', 'SEX', 'MARRIAGE']
# execute search
dem_adversaries = find_adversaries(search_cols, test, xgb_model, yhat_percentile_dict[80], resolution=res_)
plot_sm_3d(dem_adversaries, res_, search_cols, yhat, 'max') # display max change in predictions
Built 1000/10000 rows ... Built 2000/10000 rows ... Built 3000/10000 rows ... Built 4000/10000 rows ... Built 5000/10000 rows ... Built 6000/10000 rows ... Built 7000/10000 rows ... Built 8000/10000 rows ... Built 9000/10000 rows ... Built 10000/10000 rows ... Scoring ... Done.
From this search, it can be seen that:
EDUCATION has little impact on model predictions.
Even high-degree interactions of demographic variables do not swing model predictions drastically compared to other important variables; however, several subgroups do appear to receive higher predicted probabilities of default.
A more detailed analysis of disparate impact should always be conducted for models that affect people; however, this adversary search has indicated that certain subgroups may be at higher risk of disparate impact under this model. For an example of a basic disparate impact analysis, please see: https://github.com/jphall663/interpretable_machine_learning_with_python/blob/master/dia.ipynb.
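As a concrete starting point for such a disparate impact analysis, an adverse impact ratio can be computed per demographic segment. This is a minimal sketch on a small synthetic frame; the segment encodings, the decision cutoff framing, and the common 0.8 rule-of-thumb threshold are conventions, not outputs of this notebook.

```python
import pandas as pd

def adverse_impact_ratio(frame, group_col, outcome_col, reference_level):
    """Rate of favorable-outcome-denied (adverse) decisions in each segment
    of group_col, divided by the rate in a reference segment. Values far
    from 1 -- commonly below ~0.8 or above ~1.25 -- are a rough flag for
    potential disparate impact."""
    rates = frame.groupby(group_col)[outcome_col].mean()
    return rates / rates[reference_level]

# synthetic decisions: 1 = adverse outcome, e.g. predicted default above a cutoff
scored = pd.DataFrame({'SEX':      [1, 1, 1, 1, 2, 2, 2, 2],
                       'decision': [1, 0, 0, 0, 1, 1, 0, 0]})
air = adverse_impact_ratio(scored, 'SEX', 'decision', reference_level=1)
```

In practice this would be run on the actual test-set decisions for each demographic variable flagged by the adversary search.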
If the interactions described above appear non-robust, they could be removed from the model, potentially by editing the final model artifact, by injecting missing values or applying regularization during training, or by using assertions to inject missing values during scoring.
Impersonation attacks seem potentially less damaging than standard adversarial example attacks.
These patterns should also be considered in the context of model drift: if the customer population gets older, becomes more male, or contains more married customers, this model indicates much higher levels of risk may be present.
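Drift of this kind can be caught with a simple population stability index (PSI) monitor on incoming data. Below is a minimal sketch on synthetic AGE-like samples; the distributions and the common 0.1/0.25 alert thresholds are illustrative conventions, not values from this notebook.

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population stability index between a baseline sample and a newer
    sample of the same variable; larger values indicate more drift.
    Common rules of thumb: < 0.1 stable, 0.1-0.25 watch, > 0.25 alert."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)  # avoid log(0) in empty bins
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(12345)
baseline_age = rng.normal(35.0, 10.0, 5000)  # stand-in for training-time AGE
older_age = rng.normal(45.0, 10.0, 5000)     # an older future population
drift_score = psi(baseline_age, older_age)   # well above the 0.25 alert level
```

Monitoring PSI for AGE, SEX, and MARRIAGE in production data would flag exactly the population shifts this search identifies as risky.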
Random attacks expose a model to random data in an attempt to discover patterns or problems structured or guided searches may not reveal. As the nonlinear or interaction complexity of machine learning models may simply be incomprehensible to human users in some cases, exposure to random data may help reveal unforeseen, unaccounted for, or previously unknown problems.
def random_attack(X, frame, model, oor_proportion=0.33, N=10000,
                  inject_missing=True, missing_proportion=0.15, seed=12345):

    """ Generates random data and scores model on random data.

    Args:
        X: Input variables for model in frame.
        frame: Pandas DataFrame for which to calculate bounds
               for random data, should contain X.
        model: Model to score on random data.
        oor_proportion: The proportion by which the random values can
                        exceed minimum and maximum values in frame.
                        Must be between 0-1, default 0.33.
        N: Number of random samples to generate.
        inject_missing: Whether to randomly inject missing values into
                        generated frame.
        missing_proportion: The proportion of random values to be
                            replaced by nan. Must be between 0-1,
                            default 0.15.
        seed: Seed to enforce reproducibility.

    Returns:
        Scored random frame.

    """

    # init random frame
    random_frame = pd.DataFrame(columns=X, index=np.arange(N))

    # find bounds for each j and generate random data
    for j in X:
        min_ = frame[j].min()
        max_ = frame[j].max()
        np.random.seed(seed)  # ensure column bounds are set similarly
        random_frame[j] =\
            np.random.uniform(low=(min_ - np.abs(oor_proportion*min_)),
                              high=(max_ + np.abs(oor_proportion*max_)),
                              size=(N,))
        # ensure treatment as numeric
        random_frame[j] = pd.to_numeric(random_frame[j])

    if inject_missing:
        np.random.seed(seed)  # ensure nan is injected similarly
        random_frame =\
            random_frame.mask(np.random.random(random_frame.shape) < missing_proportion)

    # score
    random_frame[yhat] = model.predict(xgb.DMatrix(random_frame[model.feature_names]),
                                       ntree_limit=model.best_ntree_limit)

    return random_frame
Partial dependence and ICE revealed a potentially worrisome correlation between low predicted probabilities and missing values. The effect of missingness on model predictions will be considered again here, along with problems not considered earlier.
# Run largish random attack
N_ = 1000000
randoms_w_miss = random_attack(X, test, xgb_model, N=N_)
randoms_wo_miss = random_attack(X, test, xgb_model, N=N_, inject_missing=False)
In the two sets below, the only difference between them is the presence of injected missing values.
randoms_w_miss.head(n=5)
LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | p_DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1.236861e+06 | 2.519936 | 7.418336 | 3.709168 | 93.719507 | 9.703894 | 9.703894 | 9.703894 | 9.703894 | 9.703894 | 9.703894 | 1.178004e+06 | NaN | NaN | 1.101620e+06 | 1.140599e+06 | 1.174860e+06 | 1.080050e+06 | 1.502795e+06 | 1.107854e+06 | 767797.819540 | 383447.627638 | 651754.019301 | 0.703498 |
1 | 4.253598e+05 | 1.299587 | 2.524677 | 1.262338 | 41.177058 | 1.547795 | 1.547795 | NaN | 1.547795 | 1.547795 | NaN | 2.649418e+05 | 352621.957687 | 234265.111327 | NaN | 3.343339e+05 | 2.673989e+05 | 3.675728e+05 | 5.114453e+05 | 3.770353e+05 | NaN | 130498.446385 | 221810.961429 | 0.496677 |
2 | 2.500798e+05 | 1.035998 | 1.467672 | 0.733836 | 29.828164 | -0.213880 | -0.213880 | -0.213880 | -0.213880 | -0.213880 | -0.213880 | 6.772515e+04 | 167389.440220 | 101937.093858 | 2.096570e+05 | 1.601847e+05 | 7.139200e+04 | 2.136813e+05 | NaN | 2.191822e+05 | 151904.064128 | 75862.748677 | 128945.513811 | 0.032246 |
3 | 2.773946e+05 | 1.077075 | 1.632391 | 0.816196 | 31.596725 | 0.060652 | 0.060652 | 0.060652 | 0.060652 | 0.060652 | 0.060652 | 9.845852e+04 | 196255.247953 | 122558.500738 | 2.343472e+05 | 1.873234e+05 | 1.019368e+05 | 2.376631e+05 | 3.306873e+05 | 2.437813e+05 | 168952.470865 | 84376.931645 | 143417.250160 | 0.013024 |
4 | 7.579705e+05 | 1.799773 | NaN | 2.265223 | 62.712680 | 4.890743 | 4.890743 | NaN | 4.890743 | 4.890743 | 4.890743 | NaN | NaN | 485370.316867 | 6.687453e+05 | 6.647986e+05 | 6.393410e+05 | 6.595967e+05 | 9.177709e+05 | 6.765768e+05 | 468901.133269 | 234174.964519 | NaN | 0.571079 |
randoms_wo_miss.head(n=5)
LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | p_DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1.236861e+06 | 2.519936 | 7.418336 | 3.709168 | 93.719507 | 9.703894 | 9.703894 | 9.703894 | 9.703894 | 9.703894 | 9.703894 | 1.178004e+06 | 1.210201e+06 | 846909.684010 | 1.101620e+06 | 1.140599e+06 | 1.174860e+06 | 1.080050e+06 | 1.502795e+06 | 1.107854e+06 | 767797.819540 | 383447.627638 | 651754.019301 | 0.500221 |
1 | 4.253598e+05 | 1.299587 | 2.524677 | 1.262338 | 41.177058 | 1.547795 | 1.547795 | 1.547795 | 1.547795 | 1.547795 | 1.547795 | 2.649418e+05 | 3.526220e+05 | 234265.111327 | 3.680946e+05 | 3.343339e+05 | 2.673989e+05 | 3.675728e+05 | 5.114453e+05 | 3.770353e+05 | 261304.061796 | 130498.446385 | 221810.961429 | 0.443920 |
2 | 2.500798e+05 | 1.035998 | 1.467672 | 0.733836 | 29.828164 | -0.213880 | -0.213880 | -0.213880 | -0.213880 | -0.213880 | -0.213880 | 6.772515e+04 | 1.673894e+05 | 101937.093858 | 2.096570e+05 | 1.601847e+05 | 7.139200e+04 | 2.136813e+05 | 2.973188e+05 | 2.191822e+05 | 151904.064128 | 75862.748677 | 128945.513811 | 0.011485 |
3 | 2.773946e+05 | 1.077075 | 1.632391 | 0.816196 | 31.596725 | 0.060652 | 0.060652 | 0.060652 | 0.060652 | 0.060652 | 0.060652 | 9.845852e+04 | 1.962552e+05 | 122558.500738 | 2.343472e+05 | 1.873234e+05 | 1.019368e+05 | 2.376631e+05 | 3.306873e+05 | 2.437813e+05 | 168952.470865 | 84376.931645 | 143417.250160 | 0.013024 |
4 | 7.579705e+05 | 1.799773 | 4.530446 | 2.265223 | 62.712680 | 4.890743 | 4.890743 | 4.890743 | 4.890743 | 4.890743 | 4.890743 | 6.391795e+05 | 7.041186e+05 | 485370.316867 | 6.687453e+05 | 6.647986e+05 | 6.393410e+05 | 6.595967e+05 | 9.177709e+05 | 6.765768e+05 | 468901.133269 | 234174.964519 | 398032.125757 | 0.496527 |
The predictions of the model on the random attack dataset are ranked from lowest to highest and plotted in two dimensions, with the x-axis simply being the ranked row index. This technique allows large amounts of information to be summarized in just two dimensions. The ranked predictions are compared to a linearly increasing reference line.
fig, ax = plt.subplots(figsize=(8, 6)) # init plot
# linear reference line
_ = ax.plot(np.linspace(0, N_, 50), np.linspace(0, 1, 50), c='grey', linestyle='-')
# sorted preds w/ missing
_ = randoms_w_miss.sort_values(by=yhat).reset_index().plot(y=yhat, ax=ax,
title='Ranked Predictions on Random Data',
color='royalblue',
label='15% Missing Data')
# sorted preds w/o missing
_ = randoms_wo_miss.sort_values(by=yhat).reset_index().plot(y=yhat, ax=ax,
label='0% Missing Data',
color='deeppink')
# margins and axes decoration
_ = ax.margins(x=0, y=0)
_ = ax.get_xaxis().set_ticks([])
_ = ax.set_xlabel('Ranked Row Index')
_ = ax.set_ylabel('Predicted Probability')
These results confirm that the model treats customers with missing data quite differently from those without. Missing values appear to increase output probabilities at the low and high ends of the prediction range, but seem to decrease model output for medium-range predictions relative to rows without missing data. If missing data were allowed in credit applications, or could be hacked into the process, missingness could be used in an adversarial attack to alter the model's predictions. Also, if missingness were to become more common in future data, it would have a notable impact on future model predictions.
Another interesting result of the random attack is that adversaries that can force the model to generate nearly any predicted probability are now known. So, aside from general observations regarding missing values, the adversarial examples of interest seem to be on the extreme low side for both missing and non-missing cases, in the middle range of model output for rows without missing data, and in the extreme high range for rows with missing data. A few examples of random adversaries are displayed below.
randoms_wo_miss.sort_values(by=yhat).head(n=3)
LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | p_DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
471461 | 213233.843349 | 0.980589 | 1.245477 | 0.622739 | 27.442493 | -0.584205 | -0.584205 | -0.584205 | -0.584205 | -0.584205 | -0.584205 | 26267.889148 | 128451.385453 | 74120.187509 | 176351.564266 | 123576.545970 | 30189.040414 | 181331.526533 | 252306.916916 | 185999.575336 | 128906.897330 | 64377.682131 | 109424.103993 | 0.003226 |
614130 | 207333.039739 | 0.971715 | 1.209893 | 0.604947 | 27.060432 | -0.643511 | -0.643511 | -0.643511 | -0.643511 | -0.643511 | -0.643511 | 19628.587815 | 122215.530486 | 69665.363181 | 171017.759792 | 117713.813803 | 23590.465060 | 176150.769185 | 245098.347405 | 180685.448858 | 125223.945070 | 62538.370700 | 106297.787562 | 0.003226 |
977178 | 212487.610337 | 0.979467 | 1.240977 | 0.620489 | 27.394176 | -0.591705 | -0.591705 | -0.591705 | -0.591705 | -0.591705 | -0.591705 | 25428.263538 | 127662.780859 | 73556.817297 | 175677.035635 | 122835.127585 | 29354.565131 | 180676.352693 | 251395.299975 | 185327.535243 | 128441.140335 | 64145.077388 | 109028.740805 | 0.003226 |
randoms_w_miss.sort_values(by=yhat).head(n=3)
LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | p_DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
35709 | 261416.477889 | 1.053047 | NaN | 0.768018 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 80480.679620 | 179369.860421 | 110495.770679 | 219904.403622 | 171448.310580 | 84069.285506 | 223634.669358 | 3.111680e+05 | 229391.735273 | 158979.808496 | 79396.462009 | 134951.840886 | 0.000959 |
585563 | 252037.649816 | 1.038943 | NaN | 0.739739 | NaN | NaN | NaN | NaN | -0.194202 | NaN | NaN | 69928.071901 | 169458.496565 | 103415.204333 | 211426.772701 | 162129.993843 | 73581.408285 | 215400.293897 | 2.997106e+05 | 220945.380862 | 153126.067492 | 76473.032112 | 129982.825436 | 0.001123 |
722478 | 840381.465449 | 1.923704 | 5.027415 | 2.513707 | NaN | NaN | NaN | NaN | 5.719025 | NaN | NaN | 731904.327605 | 791208.845739 | 547586.628396 | 743237.536748 | 746677.869242 | 731497.051289 | 731951.385402 | 1.018446e+06 | 750794.136326 | 520337.438796 | 259862.884993 | NaN | 0.001153 |
randoms_wo_miss[(randoms_wo_miss[yhat] > 0.3) & (randoms_wo_miss[yhat] < 0.5)].sort_values(by=yhat).head(n=3)
LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | p_DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
138040 | 12774.271391 | 0.679135 | 0.036630 | 0.018315 | 14.463292 | -2.598950 | -2.598950 | -2.598950 | -2.598950 | -2.598950 | -2.598950 | -199279.611036 | -83390.407642 | -77217.195800 | -4846.155996 | -75589.679250 | -193974.934249 | 5333.057702 | 7420.482098 | 5470.347527 | 3791.221167 | 1893.382249 | 3218.221739 | 0.304458 |
659968 | 12831.514608 | 0.679221 | 0.036975 | 0.018488 | 14.466999 | -2.598374 | -2.598374 | -2.598374 | -2.598374 | -2.598374 | -2.598374 | -199215.203712 | -83329.914119 | -77173.979910 | -4794.413192 | -75532.805361 | -193910.922005 | 5383.315808 | 7490.411846 | 5521.899437 | 3826.949188 | 1911.225260 | 3248.549881 | 0.304458 |
456685 | 12864.778682 | 0.679271 | 0.037176 | 0.018588 | 14.469152 | -2.598040 | -2.598040 | -2.598040 | -2.598040 | -2.598040 | -2.598040 | -199177.776571 | -83294.761290 | -77148.867127 | -4764.345412 | -75499.755904 | -193873.724445 | 5412.520830 | 7531.048072 | 5551.856288 | 3847.710766 | 1921.593846 | 3266.173585 | 0.304458 |
randoms_wo_miss[(randoms_wo_miss[yhat] > 0.3) & (randoms_wo_miss[yhat] < 0.5)].sort_values(by=yhat).tail(n=3)
LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | p_DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
802206 | 493116.682111 | 1.401481 | 2.933277 | 1.466638 | 45.564129 | 2.228795 | 2.228795 | 2.228795 | 2.228795 | 2.228795 | 2.228795 | 341178.641529 | 424226.148617 | 285418.336245 | 429340.891885 | 401653.606376 | 343168.106289 | 427061.628583 | 594218.804096 | 438055.549842 | 303594.143623 | 151618.630809 | 257709.384302 | 0.497614 |
32798 | 493270.105312 | 1.401712 | 2.934202 | 1.467101 | 45.574063 | 2.230337 | 2.230337 | 2.230337 | 2.230337 | 2.230337 | 2.230337 | 341351.265959 | 424388.283284 | 285534.163419 | 429479.572886 | 401806.039704 | 343339.671827 | 427196.330299 | 594406.229721 | 438193.719208 | 303689.901821 | 151666.453625 | 257790.669752 | 0.497614 |
473196 | 493097.658017 | 1.401453 | 2.933162 | 1.466581 | 45.562898 | 2.228603 | 2.228603 | 2.228603 | 2.228603 | 2.228603 | 2.228603 | 341157.236530 | 424206.044323 | 285403.973964 | 429323.695786 | 401634.705023 | 343146.832590 | 427044.925906 | 594195.563786 | 438038.417185 | 303582.269845 | 151612.700899 | 257699.305109 | 0.497614 |
randoms_w_miss.sort_values(by=yhat).tail(n=3)
LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | p_DEFAULT_NEXT_MONTH | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
994580 | NaN | 2.218071 | 6.207844 | 3.103922 | 80.722639 | 7.686406 | 7.686406 | 7.686406 | 7.686406 | 7.686406 | 7.686406 | NaN | NaN | 695366.301520 | NaN | 941162.00143 | NaN | NaN | NaN | NaN | 642511.836964 | 320878.274649 | 545403.570488 | 0.993020 |
799516 | 1.288654e+06 | 2.597823 | 7.730665 | 3.865333 | 97.072934 | 10.224442 | 10.224442 | 10.224442 | 10.224442 | 10.224442 | 10.224442 | NaN | NaN | 886010.624324 | NaN | NaN | NaN | NaN | 1.566067e+06 | NaN | 800123.874604 | NaN | 679194.363334 | 0.993734 |
707572 | NaN | NaN | 7.356426 | 3.678213 | 93.054784 | 9.600710 | 9.600710 | 9.600710 | 9.600710 | 9.600710 | 9.600710 | NaN | NaN | 839159.016062 | NaN | NaN | NaN | NaN | 1.490254e+06 | 1.098609e+06 | 761390.083132 | 380247.525656 | NaN | 0.994000 |
Any of the above values could be used in an adversarial integrity or denial-of-service (DoS) attack and should potentially be monitored for after the model is deployed. These values could also be used to find nonsensical, non-robust interactions that cause unwanted model behavior. If these interactions can be isolated, they can potentially be remediated, for instance by editing the final model artifact, by injecting missing values or applying regularization during training, or by using assertions to inject missing values during scoring.
In this model debugging exercise several accuracy, drift, and security patterns were discovered.
Monotonicity constraints held in all tested scenarios.
Two types of error mechanisms were identified:
In-depth analysis with partial dependence and ICE found an over-emphasis on PAY_0.
Searching for adversarial examples found interactions of PAY_0, PAY_3, PAY_AMT1, and PAY_AMT2 in which large payment amounts do not decrease the probability of default.
To make the model more accurate, the final model artifact could be edited to remove over-emphasis on certain features or to remove model dependence on non-robust features.
To make the model more accurate, missing values could be injected at training time, regularization could be used to lessen the impact of overly-important features, or assertions could be appended to model scoring logic to inject missing values into non-robust interactions at scoring time.
Assertions could be appended to model scoring logic to catch, flag, or remediate security problems in real-time, for instance by re-routing any known adversaries to a review queue or by alerting IT personnel if known adversaries are detected.
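A minimal sketch of such a scoring-time assertion wrapper follows. The score callable is generic, and the flag conditions are hand-picked stand-ins for illustration; in a real deployment the conditions would come from the adversary searches above.

```python
import pandas as pd

def guarded_score(row, score, review_queue):
    """Score one row, but flag known adversarial patterns for human review.
    The conditions below are illustrative stand-ins for adversaries
    discovered during model debugging."""
    known_adversary = (row.isna().sum() > 3              # heavy missingness
                       or row.get('PAY_0', 0) < -2       # out-of-range PAY_0
                       or row.get('PAY_AMT2', 0) > 1e6)  # implausible payment
    if known_adversary:
        review_queue.append(row.name)  # route to a manual review queue
        return None                    # withhold the automated prediction
    return score(row)

queue = []
ok_row = pd.Series({'PAY_0': 1, 'PAY_AMT2': 3000}, name='app_1')
adv_row = pd.Series({'PAY_0': -9, 'PAY_AMT2': 3000}, name='app_2')
p_ok = guarded_score(ok_row, lambda r: 0.2, queue)   # scored normally
p_adv = guarded_score(adv_row, lambda r: 0.2, queue)  # flagged for review
```

The same wrapper could instead alert IT personnel, or inject missing values into the flagged interactions before scoring, rather than withholding the prediction entirely.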
An autoencoder could be trained to detect random adversaries or other types of anomalies at scoring time.
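The autoencoder idea can be prototyped cheaply: score each incoming row by its reconstruction error under a compressed representation of the training data. The sketch below substitutes a linear, PCA-style encoder (via SVD) for a real autoencoder and runs on synthetic data; a production system would train a nonlinear autoencoder on the actual training rows.

```python
import numpy as np

rng = np.random.default_rng(12345)
train = rng.normal(0.0, 1.0, size=(1000, 5))  # stand-in for training rows
train[:, 1] = 0.9 * train[:, 0]               # correlated, like BILL_AMT* columns

# rank-2 linear "encoder" via SVD; a linear autoencoder with tied weights
# learns the same subspace as PCA
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:2]

def reconstruction_error(x):
    """Anomaly score: distance between a row and its reconstruction."""
    z = (x - mean) @ components.T  # encode to 2 dimensions
    x_hat = z @ components + mean  # decode back to 5 dimensions
    return float(np.linalg.norm(x - x_hat))

typical_row = train[0]
attack_row = np.full(5, 50.0)  # random-attack style, far out-of-range row
```

Rows whose reconstruction error exceeds, say, a high percentile of the training-set errors could be routed to the same review queue as known adversaries.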
All discovered patterns should also be considered in the context of model drift.