{"id":1264,"date":"2023-07-16T16:28:47","date_gmt":"2023-07-16T14:28:47","guid":{"rendered":"https:\/\/lorentzen.ch\/?p=1264"},"modified":"2023-07-17T08:01:14","modified_gmt":"2023-07-17T06:01:14","slug":"model-diagnostics-for-python","status":"publish","type":"post","link":"https:\/\/lorentzen.ch\/index.php\/2023\/07\/16\/model-diagnostics-for-python\/","title":{"rendered":"Model Diagnostics in Python"},"content":{"rendered":"\n<p>\ud83d\ude80Version 1.0.0 of the new Python package for <a href=\"https:\/\/lorentzenchr.github.io\/model-diagnostics\/\">model-diagnostics<\/a> was just released on PyPI. If you use (machine learning, statistical, or other) models to predict a mean, median, quantile or expectile, this library offers tools to assess the calibration of your models and to compare and decompose predictive model performance scores.\ud83d\ude80<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">pip install model-diagnostics<\/code><\/pre>\n\n\n\n<p>After finishing our paper (or better: user guide) <a href=\"https:\/\/arxiv.org\/abs\/2202.12780\">&#8220;Model Comparison and Calibration Assessment: User Guide for Consistent Scoring Functions in Machine Learning and Actuarial Practice&#8221;<\/a> last year, I realised that there is no Python package that supports the proposed diagnostic tools (which are not completely new). Most of the required building blocks are there, but putting them together to get a result quickly adds up to a lot of code. Therefore, I decided to publish a new package.<\/p>\n\n\n\n<p>By the way, I never really wanted to write a plotting library. But it turned out that arranging results until they are ready to be visualised makes up quite a large part of the source code. I hope this was worth the effort. 
Your feedback is very welcome, either here in the comments or as a feature request or bug report at <a href=\"https:\/\/github.com\/lorentzenchr\/model-diagnostics\/issues\">https:\/\/github.com\/lorentzenchr\/model-diagnostics\/issues<\/a>.<\/p>\n\n\n\n<p>For a jump start, I recommend going directly to the two examples:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/lorentzenchr.github.io\/model-diagnostics\/examples\/regression_on_workers_compensation\/\">Regression on Workers&#8217; Compensation Dataset<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/lorentzenchr.github.io\/model-diagnostics\/examples\/quantile_regression\/\">Quantile Regression on Synthetic Data<\/a><\/li>\n<\/ul>\n\n\n\n<p>To give a glimpse of the functionality, here are some short code snippets.<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}\">from model_diagnostics.calibration import compute_bias\nfrom model_diagnostics.calibration import plot_reliability_diagram\n\n\ny_obs = list(range(10))\ny_pred = [2, 1, 3, 3, 6, 8, 5, 5, 8, 9.]\nplot_reliability_diagram(\n    y_obs=y_obs,\n    y_pred=y_pred,\n    n_bootstrap=1000,\n    confidence_level=0.9,\n)<\/pre><\/div>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"566\" height=\"455\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/07\/image.png\" alt=\"\" class=\"wp-image-1265\" 
srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/07\/image.png 566w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2023\/07\/image-300x241.png 300w\" sizes=\"auto, (max-width: 566px) 100vw, 566px\" \/><\/figure>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}\">compute_bias(y_obs=y_obs, y_pred=y_pred)<\/pre><\/div>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>bias_mean<\/strong><\/td><td>bias_count<\/td><td>bias_weights<\/td><td>bias_stderr<\/td><td>p_value<\/td><\/tr><tr><td>f64<\/td><td>u32<\/td><td>f64<\/td><td>f64<\/td><td>f64<\/td><\/tr><tr><td>0.5<\/td><td>10<\/td><td>10.0<\/td><td>0.477261<\/td><td>0.322121<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}\">from model_diagnostics.scoring import SquaredError, 
decompose\n\n\ndecompose(\n    y_obs=y_obs,\n    y_pred=y_pred,\n    scoring_function=SquaredError(),\n)<\/pre><\/div>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>miscalibration<\/strong><\/td><td><strong>discrimination<\/strong><\/td><td><strong>uncertainty<\/strong><\/td><td><strong>score<\/strong><\/td><\/tr><tr><td>f64<\/td><td>f64<\/td><td>f64<\/td><td>f64<\/td><\/tr><tr><td>1.283333<\/td><td>7.233333<\/td><td>8.25<\/td><td>2.3<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>This score decomposition is additive (and unique):<\/p>\n\n\n\n<div class=\"wp-block-katex-display-block katex-eq\" data-katex-display=\"true\"><pre>\\begin{equation*}\n\\mathrm{score} = \\mathrm{miscalibration} - \\mathrm{discrimination} + \\mathrm{uncertainty}\n\\end{equation*}<\/pre><\/div>\n\n\n\n<p>As usual, the code snippets are collected in a notebook: <a href=\"https:\/\/github.com\/lorentzenchr\/notebooks\/blob\/master\/blogposts\/2023-07-16%20model-diagnostics.ipynb\">https:\/\/github.com\/lorentzenchr\/notebooks\/blob\/master\/blogposts\/2023-07-16%20model-diagnostics.ipynb<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Version 1.0.0 of the new Python package for model-diagnostics was just released on PyPI.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16,17,9],"tags":[6],"class_list":["post-1264","post","type-post","status-publish","format-standard","hentry","category-machine-learning","category-programming","category-statistics","tag-python"],"featured_image_src":null,"author_info":{"display_name":"Christian 
Lorentzen","author_link":"https:\/\/lorentzen.ch\/index.php\/author\/christian\/"},"_links":{"self":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/1264","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/comments?post=1264"}],"version-history":[{"count":10,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/1264\/revisions"}],"predecessor-version":[{"id":1276,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/1264\/revisions\/1276"}],"wp:attachment":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/media?parent=1264"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/categories?post=1264"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/tags?post=1264"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}