{"id":1300,"date":"2023-10-16T17:22:48","date_gmt":"2023-10-16T15:22:48","guid":{"rendered":"https:\/\/lorentzen.ch\/?p=1300"},"modified":"2023-11-12T07:23:15","modified_gmt":"2023-11-12T06:23:15","slug":"interactions-where-are-you","status":"publish","type":"post","link":"https:\/\/lorentzen.ch\/index.php\/2023\/10\/16\/interactions-where-are-you\/","title":{"rendered":"Interactions &#8211; where are you?"},"content":{"rendered":"\n<p>This question sends shivers down the poor modelers spine&#8230; <\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"747\" height=\"598\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-2.png\" alt=\"\" class=\"wp-image-1303\" style=\"width:468px;height:374px\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-2.png 747w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-2-300x240.png 300w\" sizes=\"auto, (max-width: 747px) 100vw, 747px\" \/><\/figure>\n\n\n\n<p>The {hstats} R package introduced in our <a href=\"https:\/\/lorentzen.ch\/index.php\/2023\/08\/01\/its-the-interactions\/\">last post<\/a> measures their strength using Friedman&#8217;s H-statistics, a collection of statistics based on partial dependence functions.<\/p>\n\n\n\n<p>On Github, the preview version of {hstats} 1.0.0 out &#8211; I will try to bring it to CRAN in about one week (October 2023). 
Until then, try it via <code>devtools::install_github(\"mayer79\/hstats\")<\/code><\/p>\n\n\n\n<p>The current version offers:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>H-statistics per feature, feature pair, and feature triple<\/li>\n\n\n\n<li>Multivariate predictions at no additional cost<\/li>\n\n\n\n<li>A convenient API<\/li>\n\n\n\n<li>Other important tools from explainable ML: \n<ul class=\"wp-block-list\">\n<li>performance calculations<\/li>\n\n\n\n<li>permutation importance (e.g., to select features for calculating H-statistics)<\/li>\n\n\n\n<li>partial dependence plots (including grouping, multivariate, multivariable)<\/li>\n\n\n\n<li>individual conditional expectations (ICE)<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Case weights for all methods, which is important, e.g., in insurance applications<\/li>\n\n\n\n<li>The option for fast quantile approximation of H-statistics<\/li>\n<\/ul>\n\n\n\n<p>This post has two parts:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Example with house prices and XGBoost<\/li>\n\n\n\n<li><strong>Naive<\/strong> benchmark against {iml}, {DALEX}, and my old {flashlight}.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">1. Example<\/h2>\n\n\n\n<p>Let&#8217;s model logarithmic sales prices of houses sold in Miami Dade County, a dataset prepared by Prof. Dr. Steven Bourassa and available in {shapviz}. 
We use XGBoost with <em>interaction constraints<\/em> to fit a model that is additive in all structural features, while allowing latitude and longitude to interact for a flexible representation of geographic effects.<\/p>\n\n\n\n<p>The following code prepares the data, splits it into training and validation sets, and then fits an XGBoost model.<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;r&quot;,&quot;mime&quot;:&quot;text\/x-rsrc&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;R&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;r&quot;}\">library(hstats)\nlibrary(shapviz)\nlibrary(xgboost)\nlibrary(ggplot2)\n\n# Data preparation\ncolnames(miami) &lt;- tolower(colnames(miami))\nmiami &lt;- transform(miami, log_price = log(sale_prc))\nx &lt;- c(&quot;tot_lvg_area&quot;, &quot;lnd_sqfoot&quot;, &quot;latitude&quot;, &quot;longitude&quot;,\n       &quot;structure_quality&quot;, &quot;age&quot;, &quot;month_sold&quot;)\ncoord &lt;- c(&quot;longitude&quot;, &quot;latitude&quot;)\n\n# Modeling\nset.seed(1)\nix &lt;- sample(nrow(miami), 0.8 * nrow(miami))\ntrain &lt;- data.frame(miami[ix, ])\nvalid &lt;- data.frame(miami[-ix, ])\ny_train &lt;- train$log_price\ny_valid &lt;- valid$log_price\nX_train &lt;- data.matrix(train[x])\nX_valid &lt;- data.matrix(valid[x])\n\ndtrain &lt;- xgb.DMatrix(X_train, label = y_train)\ndvalid &lt;- xgb.DMatrix(X_valid, label = y_valid)\n\n# Interaction constraints: longitude\/latitude may interact, all other features stay additive\nic &lt;- c(\n  list(which(x %in% coord) - 1),\n  as.list(which(!x %in% coord) - 1)\n)\n\n# Fit via early stopping\nfit &lt;- xgb.train(\n  params = list(\n    learning_rate = 
0.15, \n    objective = &quot;reg:squarederror&quot;, \n    max_depth = 5,\n    interaction_constraints = ic\n  ),\n  data = dtrain,\n  watchlist = list(valid = dvalid),\n  early_stopping_rounds = 20,\n  nrounds = 1000,\n  callbacks = list(cb.print.evaluation(period = 100))\n)<\/pre><\/div>\n\n\n\n<p>Now it is time for a compact analysis with {hstats} to interpret the model:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;r&quot;,&quot;mime&quot;:&quot;text\/x-rsrc&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;R&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;r&quot;}\">average_loss(fit, X = X_valid, y = y_valid)  # 0.0247 MSE -&gt; 0.157 RMSE\n\nperm_importance(fit, X = X_valid, y = y_valid) |&gt; \n  plot()\n\n# Or combining some features\nv_groups &lt;- list(\n  coord = c(&quot;longitude&quot;, &quot;latitude&quot;),\n  size = c(&quot;lnd_sqfoot&quot;, &quot;tot_lvg_area&quot;),\n  condition = c(&quot;age&quot;, &quot;structure_quality&quot;)\n)\nperm_importance(fit, v = v_groups, X = X_valid, y = y_valid) |&gt; \n  plot()\n\nH &lt;- hstats(fit, v = x, X = X_valid)\nH\nplot(H)\nplot(H, zero = FALSE)\nh2_pairwise(H, zero = FALSE, squared = FALSE, normalize = FALSE)\npartial_dep(fit, v = &quot;tot_lvg_area&quot;, X = X_valid) |&gt; \n  plot()\npartial_dep(fit, v = &quot;tot_lvg_area&quot;, X = X_valid, BY = &quot;structure_quality&quot;) |&gt; \n  plot(show_points = FALSE)\nplot(ii &lt;- ice(fit, v = &quot;tot_lvg_area&quot;, X = X_valid))\nplot(ii, center = TRUE)\n\n# Spatial plots\ng &lt;- unique(X_valid[, coord])\npp &lt;- 
partial_dep(fit, v = coord, X = X_valid, grid = g)\nplot(pp, d2_geom = &quot;point&quot;, alpha = 0.5, size = 1) + \n  coord_equal()\n\n# Takes some seconds because it generates the last plot per structure quality\npartial_dep(fit, v = coord, X = X_valid, grid = g, BY = &quot;structure_quality&quot;) |&gt;\n  plot(d2_geom = &quot;point&quot;, alpha = 0.5) +\n  coord_equal()<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Results summarized by plots<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">Permutation importance<\/h4>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"714\" height=\"592\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image.png\" alt=\"\" class=\"wp-image-1301\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image.png 714w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-300x249.png 300w\" sizes=\"auto, (max-width: 714px) 100vw, 714px\" \/><figcaption class=\"wp-element-caption\">Figure 1: Permutation importance (4 repetitions) on the validation data. 
Error bars show standard errors of the estimated increase in MSE from shuffling feature values.<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"727\" height=\"586\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-1.png\" alt=\"\" class=\"wp-image-1302\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-1.png 727w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-1-300x242.png 300w\" sizes=\"auto, (max-width: 727px) 100vw, 727px\" \/><figcaption class=\"wp-element-caption\">Figure 2: Feature groups can be shuffled together &#8211; accounting for issues of permutation importance with highly correlated features.<\/figcaption><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">H-Statistics<\/h4>\n\n\n\n<p>Let&#8217;s now move on to interaction statistics.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"747\" height=\"598\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-2.png\" alt=\"\" class=\"wp-image-1303\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-2.png 747w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-2-300x240.png 300w\" sizes=\"auto, (max-width: 747px) 100vw, 747px\" \/><figcaption class=\"wp-element-caption\">Figure 3: Overall and pairwise H-statistics. Overall H^2 gives the proportion of prediction variability explained by all interactions of the feature. By default, {hstats} picks the five features with the largest H^2 and calculates their pairwise H^2. This explains why not all 21 feature pairs appear in the figure on the right-hand side. 
Pairwise H^2 is scaled differently from overall H^2: it gives the proportion of joint effect variability of the two features explained by their interaction.<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"730\" height=\"589\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-3.png\" alt=\"\" class=\"wp-image-1304\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-3.png 730w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-3-300x242.png 300w\" sizes=\"auto, (max-width: 730px) 100vw, 730px\" \/><figcaption class=\"wp-element-caption\">Figure 4: Use &#8220;zero = FALSE&#8221; to drop variables (and pairs) with value 0.<\/figcaption><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">PDPs and ICEs<\/h4>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"721\" height=\"594\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-4.png\" alt=\"\" class=\"wp-image-1305\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-4.png 721w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-4-300x247.png 300w\" sizes=\"auto, (max-width: 721px) 100vw, 721px\" \/><figcaption class=\"wp-element-caption\">Figure 5: A partial dependence plot of living area.<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"733\" height=\"591\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-5.png\" alt=\"\" class=\"wp-image-1306\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-5.png 733w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-5-300x242.png 300w\" sizes=\"auto, (max-width: 733px) 100vw, 733px\" \/><figcaption class=\"wp-element-caption\">Figure 6: Stratification indeed shows no interaction between structure quality and living 
area.<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"742\" height=\"600\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-6.png\" alt=\"\" class=\"wp-image-1308\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-6.png 742w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-6-300x243.png 300w\" sizes=\"auto, (max-width: 742px) 100vw, 742px\" \/><figcaption class=\"wp-element-caption\">Figure 7: ICE plots also show no interactions with any other feature. The interaction constraints of XGBoost did a good job.<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"609\" height=\"601\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-7.png\" alt=\"\" class=\"wp-image-1309\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-7.png 609w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-7-300x296.png 300w\" sizes=\"auto, (max-width: 609px) 100vw, 609px\" \/><figcaption class=\"wp-element-caption\">Figure 8: This two-dimensional PDP evaluated over all unique coordinates shows a realistic profile of house prices in Miami Dade County (mind the log scale).<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"442\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-8-1024x442.png\" alt=\"\" class=\"wp-image-1310\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-8-1024x442.png 1024w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-8-300x129.png 300w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-8-768x331.png 768w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/10\/image-8.png 1108w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Figure 9: 
Same, but grouped by structure quality (5 is best). Since there is no interaction between location and structure quality, the plots are just shifted versions of each other. (You can&#8217;t really see it on the plots.)<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Naive Benchmark<\/h2>\n\n\n\n<p>All methods in {hstats} are optimized for speed. But how fast are they compared to other implementations? Note that this is just a simple benchmark, run on a Windows notebook with an Intel i7-8650U CPU. <\/p>\n\n\n\n<p>{iml} offers a parallel backend, but we could not make it run with XGBoost on Windows. Let me know how fast it is using parallelism and Linux! <\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Setup + benchmark on permutation importance<\/h4>\n\n\n\n<p>We always use the full validation dataset and 10 repetitions.<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;r&quot;,&quot;mime&quot;:&quot;text\/x-rsrc&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;R&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;r&quot;}\">library(iml)  # Might benefit from multiprocessing, but on Windows with XGB models, this is not easy\nlibrary(DALEX)\nlibrary(ingredients)\nlibrary(flashlight)\nlibrary(bench)\n\nset.seed(1)\n\n# iml\npredf &lt;- function(object, newdata) predict(object, data.matrix(newdata[x]))\nmod &lt;- Predictor$new(fit, data = as.data.frame(X_valid), y = y_valid, \n                     predict.function = predf)\n\n# DALEX\nex &lt;- DALEX::explain(fit, data = X_valid, y = y_valid)\n\n# flashlight (my slightly old-fashioned 
package)\nfl &lt;- flashlight(\n  model = fit, data = valid, y = &quot;log_price&quot;, predict_function = predf, label = &quot;lm&quot;\n)\n  \n# Permutation importance: 10 repeats over full validation data (~2700 rows)\nbench::mark(\n  iml = FeatureImp$new(mod, n.repetitions = 10, loss = &quot;mse&quot;, compare = &quot;difference&quot;),\n  dalex = feature_importance(ex, B = 10, type = &quot;difference&quot;, n_sample = Inf),\n  flashlight = light_importance(fl, v = x, n_max = Inf, m_repetitions = 10),\n  hstats = perm_importance(fit, X = X_valid, y = y_valid, m_rep = 10, verbose = FALSE),\n  check = FALSE,\n  min_iterations = 3\n)\n\n# expression      min   median `itr\/sec` mem_alloc `gc\/sec` n_itr  n_gc total_time\n# iml           1.58s    1.58s     0.631   209.4MB    2.73      3    13      4.76s\n# dalex      566.21ms 586.91ms     1.72     34.6MB    0.572     3     1      1.75s\n# flashlight 587.03ms 613.15ms     1.63     27.1MB    1.63      3     3      1.84s\n# hstats     353.78ms 360.57ms     2.79     27.2MB    0         3     0      1.08s<\/pre><\/div>\n\n\n\n<p>{hstats} is about 30% faster than the runner-up, {DALEX}. 
<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Partial dependence<\/h4>\n\n\n\n<p>Here, we study the time for crunching partial dependence of a continuous feature and a discrete feature.<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;r&quot;,&quot;mime&quot;:&quot;text\/x-rsrc&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;R&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;r&quot;}\"># Partial dependence (cont)\nv &lt;- &quot;tot_lvg_area&quot;\nbench::mark(\n  iml = FeatureEffect$new(mod, feature = v, grid.size = 50, method = &quot;pdp&quot;),\n  dalex = partial_dependence(ex, variables = v, N = Inf, grid_points = 50),\n  flashlight = light_profile(fl, v = v, pd_n_max = Inf, n_bins = 50),\n  hstats = partial_dep(fit, v = v, X = X_valid, grid_size = 50, n_max = Inf),\n  check = FALSE,\n  min_iterations = 3\n)\n# expression      min   median `itr\/sec` mem_alloc `gc\/sec` n_itr  n_gc total_time\n# iml           1.11s    1.13s     0.887   376.3MB     3.84     3    13      3.38s\n# dalex      782.13ms 783.08ms     1.24    192.8MB     2.90     3     7      2.41s\n# flashlight 367.73ms  372.5ms     2.68     67.9MB     2.68     3     3      1.12s\n# hstats     220.88ms  222.5ms     4.50     14.2MB     0        3     0   666.33ms\n \n# Partial dependence (discrete)\nv &lt;- &quot;structure_quality&quot;\nbench::mark(\n  iml = FeatureEffect$new(mod, feature = v, method = &quot;pdp&quot;, grid.points = 1:5),\n  dalex = partial_dependence(ex, variables = v, N = Inf, variable_type = &quot;categorical&quot;, grid_points = 5),\n  flashlight = 
light_profile(fl, v = v, pd_n_max = Inf),\n  hstats = partial_dep(fit, v = v, X = X_valid, n_max = Inf),\n  check = FALSE,\n  min_iterations = 3\n)\n# expression      min   median `itr\/sec` mem_alloc `gc\/sec` n_itr  n_gc total_time\n# iml            90ms     96ms     10.6    13.29MB     7.06     3     2      283ms\n# dalex       170.6ms  174.4ms      5.73   20.55MB     2.87     2     1      349ms\n# flashlight   40.8ms   43.8ms     23.1     6.36MB     2.10    11     1      476ms\n# hstats       23.5ms   24.4ms     40.6     1.53MB     2.14    19     1      468ms<\/pre><\/div>\n\n\n\n<p>{hstats} is <strong>1.5 to 2 times faster<\/strong> than {flashlight}, and about four times as fast as the other packages. Its memory footprint is much lower.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">H-statistics<\/h4>\n\n\n\n<p>How fast can overall H-statistics be computed? And how fast are pairwise calculations? <\/p>\n\n\n\n<p>{DALEX} does not offer these statistics yet. {iml} was the first model-agnostic implementation of H-statistics I am aware of. It uses quantile approximation by default, but we purposely force it to calculate exact statistics in order to compare the numbers. 
Thus, we made it slower than it actually is.<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;r&quot;,&quot;mime&quot;:&quot;text\/x-rsrc&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;R&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;r&quot;}\"># H-Stats -&gt; we use a subset of 500 rows\nX_v500 &lt;- X_valid[1:500, ]\nmod500 &lt;- Predictor$new(fit, data = as.data.frame(X_v500), predict.function = predf)\nfl500 &lt;- flashlight(fl, data = as.data.frame(valid[1:500, ]))\n\n# iml  # 225s total, using slow exact calculations\nsystem.time(  # 90s\n  iml_overall &lt;- Interaction$new(mod500, grid.size = 500)\n)\nsystem.time(  # 135s for all combinations of latitude\n  iml_pairwise &lt;- Interaction$new(mod500, grid.size = 500, feature = &quot;latitude&quot;)\n)\n\n# flashlight: 14s total, doing only one pairwise calculation, otherwise would take 63s\nsystem.time(  # 12s\n  fl_overall &lt;- light_interaction(fl500, v = x, grid_size = Inf, n_max = Inf)\n)\nsystem.time(  # 2s\n  fl_pairwise &lt;- light_interaction(\n    fl500, v = coord, grid_size = Inf, n_max = Inf, pairwise = TRUE\n  )\n)\n\n# hstats: 3s total\nsystem.time({\n  H &lt;- hstats(fit, v = x, X = X_v500, n_max = Inf)\n  hstats_overall &lt;- h2_overall(H, squared = FALSE, zero = FALSE)\n  hstats_pairwise &lt;- h2_pairwise(H, squared = FALSE, zero = FALSE)\n}\n)\n\n# Overall statistics correspond exactly\niml_overall$results |&gt; filter(.interaction &gt; 1e-6)\n#     .feature .interaction\n# 1:  latitude    0.2458269\n# 2: longitude    0.2458269\n\nfl_overall$data |&gt; subset(value 
&gt; 0, select = c(variable, value))\n#   variable  value\n# 1 latitude  0.246\n# 2 longitude 0.246\n\nhstats_overall\n# longitude  latitude \n# 0.2458269 0.2458269 \n\n# Pairwise results match as well\niml_pairwise$results |&gt; filter(.interaction &gt; 1e-6)\n#              .feature .interaction\n# 1: longitude:latitude    0.3942526\n\nfl_pairwise$data |&gt; subset(value &gt; 0, select = c(variable, value))\n# latitude:longitude 0.394\n\nhstats_pairwise\n# latitude:longitude \n# 0.3942526 <\/pre><\/div>\n\n\n\n<ul class=\"wp-block-list\">\n<li>{hstats} is about <strong>four times as fast<\/strong> as {flashlight}.<\/li>\n\n\n\n<li>Since one often wants to study both relative and absolute H-statistics, the speed-up in practice would be about a factor of eight.<\/li>\n\n\n\n<li>In multi-classification\/multi-output settings with m categories, the speed-up would be even m times larger.<\/li>\n\n\n\n<li>The fast approximation via quantile binning is again a factor of four faster. The difference would diminish if we calculated many pairwise or three-way H-statistics.<\/li>\n\n\n\n<li>Forcing all three packages to calculate exact statistics, <strong>all results match.<\/strong><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Wrap-Up<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>{hstats} is much faster than other XAI packages, at least in our use case. This includes H-statistics, permutation importance, and partial dependence. <strong>Note that making good benchmarks is not my strength, so forgive any bias in the results.<\/strong><\/li>\n\n\n\n<li>The memory footprint is lower as well.<\/li>\n\n\n\n<li>With multivariate output, the potential is even larger.<\/li>\n\n\n\n<li>H-statistics match other implementations.<\/li>\n<\/ul>\n\n\n\n<p>Try it out! 
<\/p>\n\n\n\n<p>The full R code in one piece is <a href=\"https:\/\/github.com\/lorentzenchr\/notebooks\/blob\/master\/blogposts\/2023-10-16%20hstats.R\">here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This question sends shivers down the poor modeler&#8217;s spine&#8230; The {hstats} R package introduced in our last post measures their strength using Friedman&#8217;s H-statistics, a collection of statistics based on partial dependence functions. On GitHub, the preview version of {hstats} 1.0.0 is out &#8211; I will try to bring it to CRAN in about one week [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16,17,9],"tags":[5],"class_list":["post-1300","post","type-post","status-publish","format-standard","hentry","category-machine-learning","category-programming","category-statistics","tag-r"],"featured_image_src":null,"author_info":{"display_name":"Michael 
Mayer","author_link":"https:\/\/lorentzen.ch\/index.php\/author\/michael\/"},"_links":{"self":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/1300","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/comments?post=1300"}],"version-history":[{"count":9,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/1300\/revisions"}],"predecessor-version":[{"id":1341,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/1300\/revisions\/1341"}],"wp:attachment":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/media?parent=1300"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/categories?post=1300"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/tags?post=1300"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}