-
Dplyr-style without dplyr
How to get “dplyr” feeling without “dplyr”
-
Interpret Complex Linear Models with SHAP within Seconds
Peeking into richly parametrized linear models with SHAP? Yes!
-
Histograms, Gradient Boosted Trees, Group-By Queries and One-Hot Encoding
This post shows how filling histograms can be done in very different ways, thereby connecting very different areas: from gradient boosted trees to SQL queries to one-hot encoding. Let’s jump into it! Modern gradient boosted trees (GBT) like LightGBM, XGBoost and the HistGradientBoostingRegressor of scikit-learn all use two techniques on top of standard gradient boosting:…
-
The Unfairness of AI Fairness
Fairness in Artificial Intelligence (AI) and Machine Learning (ML) is a recent and hot topic. As ML models are used in insurance pricing, the fairness topic also applies there. Just last month, Lindholm, Richman, Tsanakas and Wüthrich published a discussion paper on this subject that sheds new light on established AI fairness criteria. This post…
-
Kernel SHAP in R and Python
“R Python” continued… Kernel SHAP
-
Kernel SHAP
Standard Kernel SHAP has arrived in R. We show how well it plays together with deep learning in Keras.
-
shapviz goes H2O
The “shapviz” package now plays well together with H2O.
-
Visualize SHAP Values without Tears
Visualize SHAP values without tears.
-
From Least Squares Benchmarks to the Marchenko–Pastur Distribution
In this blog post, I tell the story of how I learned about a theorem for random matrices by the two Ukrainian 🇺🇦 mathematicians Vladimir Marchenko and Leonid Pastur. It all started with benchmarking least squares solvers in scipy. Setting the Stage for Least Squares Solvers: Least squares starts with a matrix and a vector and one…
-
Let the flashlight shine with plotly
How to combine the model interpretation package “flashlight” with “plotly”?
-
DuckDB: Quacking SQL
“R Python” continued… DuckDB: Quacking SQL
-
Avoid loops in R! Really?
It must have been around the year 2000 when I wrote my first snippet of SPLUS/R code. One thing I learned back then: loops are slow. Replace them with vectorized calculations or, if vectorization is not possible, use sapply() et al. Since then, the R core team and the community have invested tons of time…