{"id":1279,"date":"2023-08-01T11:42:16","date_gmt":"2023-08-01T09:42:16","guid":{"rendered":"https:\/\/lorentzen.ch\/?p=1279"},"modified":"2023-08-01T11:42:16","modified_gmt":"2023-08-01T09:42:16","slug":"its-the-interactions","status":"publish","type":"post","link":"https:\/\/lorentzen.ch\/index.php\/2023\/08\/01\/its-the-interactions\/","title":{"rendered":"It&#8217;s the interactions"},"content":{"rendered":"\n<p>What makes a ML model a black-box? It is the interactions. Without any interactions, the ML model is additive and can be exactly described. <\/p>\n\n\n\n<p>Studying interaction effects of ML models is challenging. The main XAI approaches are:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Looking at ICE plots, stratified PDP, and\/or 2D PDP.<\/li>\n\n\n\n<li>Study vertical scatter in SHAP dependence plots, or even consider SHAP interaction values.<\/li>\n\n\n\n<li>Check partial-dependence based H-statistics introduced in Friedman and Popescu (2008), or related statistics.<\/li>\n<\/ol>\n\n\n\n<p>This post is mainly about the third approach. Its beauty is that we get information about all interactions. The downside: it is as good\/bad as partial dependence functions. And: the statistics are computationally very expensive to compute (of order n^2). <\/p>\n\n\n\n<p>Different R packages offer some of these H-statistics, including {iml}, {gbm}, {flashlight}, and {vivid}. They all have their limitations. This is why I wrote the new R package {<a href=\"https:\/\/CRAN.R-project.org\/package=hstats\">hstats<\/a>}:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/logo.png\" alt=\"\" class=\"wp-image-1281\" width=\"145\" height=\"168\"\/><\/figure>\n\n\n\n<ul class=\"wp-block-list\">\n<li>It is very efficient. <\/li>\n\n\n\n<li>Has a clean API. 
DALEX explainers and meta-learners (mlr3, Tidymodels, caret) work out-of-the-box.<\/li>\n\n\n\n<li>It supports multivariate predictions, including classification models.<\/li>\n\n\n\n<li>It allows calculating unnormalized H-statistics, which help to compare pairwise and three-way statistics.<\/li>\n\n\n\n<li>It contains fast multivariate ICE\/PDPs with an optional grouping variable.<\/li>\n<\/ul>\n\n\n\n<p>In Python, there is the very interesting project <a href=\"https:\/\/github.com\/pyartemis\/artemis\">artemis<\/a>. I will write a post on it later. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Statistics supported by {hstats}<\/h2>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-1024x288.png\" alt=\"\" class=\"wp-image-1280\" width=\"806\" height=\"226\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-1024x288.png 1024w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-300x85.png 300w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-768x216.png 768w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image.png 1239w\" sizes=\"auto, (max-width: 806px) 100vw, 806px\" \/><\/figure>\n\n\n\n<p>Furthermore, a global measure of non-additivity (the proportion of prediction variability unexplained by main effects) and a measure of feature importance are available. 
For technical details and references, see this <a href=\"https:\/\/github.com\/mayer79\/hstats\/blob\/main\/docu\/document.pdf\">PDF<\/a> or the <a href=\"https:\/\/github.com\/mayer79\/hstats#background\">GitHub page<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Classification example<\/h2>\n\n\n\n<p>Let&#8217;s fit a probability random forest on iris species.<\/p>\n\n\n<div class=\"wp-block-ub-tabbed-content wp-block-ub-tabbed-content-holder wp-block-ub-tabbed-content-horizontal-holder-mobile wp-block-ub-tabbed-content-horizontal-holder-tablet\" id=\"ub-tabbed-content-61eea646-d523-4e66-9f82-7861e79faf4c\" style=\"\">\n\t\t\t<div class=\"wp-block-ub-tabbed-content-tab-holder horizontal-tab-width-mobile horizontal-tab-width-tablet\">\n\t\t\t\t<div role=\"tablist\" class=\"wp-block-ub-tabbed-content-tabs-title wp-block-ub-tabbed-content-tabs-title-mobile-horizontal-tab wp-block-ub-tabbed-content-tabs-title-tablet-horizontal-tab\" style=\"justify-content: flex-start; \"><div role=\"tab\" id=\"ub-tabbed-content-61eea646-d523-4e66-9f82-7861e79faf4c-tab-0\" aria-controls=\"ub-tabbed-content-61eea646-d523-4e66-9f82-7861e79faf4c-panel-0\" aria-selected=\"true\" class=\"wp-block-ub-tabbed-content-tab-title-wrap active\" style=\"--ub-tabbed-title-background-color: #6d6d6d; --ub-tabbed-active-title-color: inherit; --ub-tabbed-active-title-background-color: #6d6d6d; text-align: center; \" tabindex=\"-1\">\n\t\t\t\t<div class=\"wp-block-ub-tabbed-content-tab-title\">R<\/div>\n\t\t\t<\/div><\/div>\n\t\t\t<\/div>\n\t\t\t<div class=\"wp-block-ub-tabbed-content-tabs-content\" style=\"\"><div role=\"tabpanel\" class=\"wp-block-ub-tabbed-content-tab-content-wrap active\" id=\"ub-tabbed-content-61eea646-d523-4e66-9f82-7861e79faf4c-panel-0\" aria-labelledby=\"ub-tabbed-content-61eea646-d523-4e66-9f82-7861e79faf4c-tab-0\" tabindex=\"0\">\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" 
data-setting='{\"showPanel\":true,\"languageLabel\":\"language\",\"fullScreenButton\":true,\"copyButton\":true,\"mode\":\"r\",\"mime\":\"text\/x-rsrc\",\"theme\":\"material\",\"lineNumbers\":false,\"styleActiveLine\":false,\"lineWrapping\":false,\"readOnly\":true,\"fileName\":\"\",\"language\":\"R\",\"maxHeight\":\"400px\",\"modeName\":\"r\"}'>library(ranger)\nlibrary(ggplot2)\nlibrary(hstats)\n\nv &lt;- setdiff(colnames(iris), \"Species\")\nfit &lt;- ranger(Species ~ ., data = iris, probability = TRUE, seed = 1)\ns &lt;- hstats(fit, v = v, X = iris)  # 8 seconds run-time\ns\n# Proportion of prediction variability unexplained by main effects of v:\n#      setosa  versicolor   virginica \n# 0.002705945 0.065629375 0.046742035\n\nplot(s, normalize = FALSE, squared = FALSE) +\n  ggtitle(\"Unnormalized statistics\") +\n  scale_fill_viridis_d(begin = 0.1, end = 0.9)\n\nice(fit, v = \"Petal.Length\", X = iris, BY = \"Petal.Width\", n_max = 150) |&gt; \n  plot(center = TRUE) +\n  ggtitle(\"Centered ICE plots\")\n<\/pre><\/div>\n\n<\/div><\/div>\n\t\t<\/div>\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-1-1024x379.png\" alt=\"\" class=\"wp-image-1282\" width=\"831\" height=\"307\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-1-1024x379.png 1024w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-1-300x111.png 300w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-1-768x284.png 768w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-1.png 1192w\" sizes=\"auto, (max-width: 831px) 100vw, 831px\" \/><figcaption class=\"wp-element-caption\">Unnormalized H-statistics, i.e., values are roughly on the scale of the predictions (here: probabilities).<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-2-1024x385.png\" alt=\"\" class=\"wp-image-1283\" width=\"831\" height=\"312\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-2-1024x385.png 1024w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-2-300x113.png 300w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-2-768x289.png 768w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-2.png 1246w\" sizes=\"auto, (max-width: 831px) 100vw, 831px\" \/><figcaption class=\"wp-element-caption\">Centered ICE plots per class.<\/figcaption><\/figure>\n\n\n\n<p><strong>Interpretation:<\/strong> <\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The features with strongest interactions are Petal Length and Petal Width. These interactions mainly affect species &#8220;virginica&#8221; and &#8220;versicolor&#8221;. The effect for &#8220;setosa&#8221; is almost additive.<\/li>\n\n\n\n<li>Unnormalized pairwise statistics show that the strongest absolute interaction happens indeed between Petal Length and Petal Width.<\/li>\n\n\n\n<li>The centered ICE plots shows <em>how<\/em> the interaction manifests: The effect of Petal Length heavily depends on Petal Width, except for species &#8220;setosa&#8221;. 
Would a SHAP analysis show the same?<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"> DALEX example<\/h2>\n\n\n\n<p>Here, we consider a random forest regression on &#8220;Sepal.Length&#8221;.<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;r&quot;,&quot;mime&quot;:&quot;text\/x-rsrc&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;R&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;r&quot;}\">library(DALEX)\nlibrary(ranger)\nlibrary(hstats)\n\nset.seed(1)\n\nfit &lt;- ranger(Sepal.Length ~ ., data = iris)\nex &lt;- explain(fit, data = iris[-1], y = iris[, 1])\n\ns &lt;- hstats(ex)  # 2 seconds\ns  # Non-additivity index 0.054\nplot(s)\nplot(ice(ex, v = &quot;Sepal.Width&quot;, BY = &quot;Petal.Width&quot;), center = TRUE)\n<\/pre><\/div>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"949\" height=\"664\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-3.png\" alt=\"\" class=\"wp-image-1284\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-3.png 949w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-3-300x210.png 300w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-3-768x537.png 768w\" sizes=\"auto, (max-width: 949px) 100vw, 949px\" \/><figcaption class=\"wp-element-caption\">H-statistics<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"919\" height=\"685\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-4.png\" alt=\"\" 
class=\"wp-image-1285\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-4.png 919w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-4-300x224.png 300w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/08\/image-4-768x572.png 768w\" sizes=\"auto, (max-width: 919px) 100vw, 919px\" \/><figcaption class=\"wp-element-caption\">Centered ICE plot of strongest relative interactions.<\/figcaption><\/figure>\n\n\n\n<p><strong>Interpretation<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Petal Length and Width show the strongest overall associations. Since we are considering normalized statistics, we can say: &#8220;About 3.5% of prediction variability comes from interactions with Petal Length&#8221;.<\/li>\n\n\n\n<li>The strongest relative pairwise interaction happens between Sepal Width and Petal Width: Again, because we study normalized H-statistics, we can say: &#8220;About 4% of total prediction variability of the two features Sepal Width and Petal Width can be attributed to their interactions.&#8221;<\/li>\n\n\n\n<li>Overall, all interactions explain only about 5% of prediction variability (see text output).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Try it out!<\/h3>\n\n\n\n<p>The complete R script can be found <a href=\"https:\/\/github.com\/lorentzenchr\/notebooks\/blob\/master\/blogposts\/2023-08-01%20hstats.R\">here<\/a>. More examples and background can be found on the <a href=\"https:\/\/github.com\/mayer79\/hstats\">Github<\/a> page of the project.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>What makes a ML model a black-box? It is the interactions. Without any interactions, the ML model is additive and can be exactly described. Studying interaction effects of ML models is challenging. The main XAI approaches are: This post is mainly about the third approach. Its beauty is that we get information about all interactions. 
[&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16,17,9],"tags":[5],"class_list":["post-1279","post","type-post","status-publish","format-standard","hentry","category-machine-learning","category-programming","category-statistics","tag-r"],"featured_image_src":null,"author_info":{"display_name":"Michael Mayer","author_link":"https:\/\/lorentzen.ch\/index.php\/author\/michael\/"},"_links":{"self":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/1279","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/comments?post=1279"}],"version-history":[{"count":1,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/1279\/revisions"}],"predecessor-version":[{"id":1286,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/1279\/revisions\/1286"}],"wp:attachment":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/media?parent=1279"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/categories?post=1279"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/tags?post=1279"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}