{"id":258,"date":"2021-04-16T13:01:19","date_gmt":"2021-04-16T11:01:19","guid":{"rendered":"https:\/\/lorentzen.ch\/?p=258"},"modified":"2021-04-16T13:01:20","modified_gmt":"2021-04-16T11:01:20","slug":"a-curious-fact-on-the-diamonds-dataset","status":"publish","type":"post","link":"https:\/\/lorentzen.ch\/index.php\/2021\/04\/16\/a-curious-fact-on-the-diamonds-dataset\/","title":{"rendered":"A Curious Fact on the Diamonds Dataset"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\">Lost in Translation between R and Python 5<\/h1>\n\n\n\n<p>Hello regression world<\/p>\n\n\n\n<p>This is the next article in our series <strong>&#8220;Lost in Translation between R and Python&#8221;<\/strong>. The aim of this series is to provide high-quality R <strong>and<\/strong> Python 3 code to achieve some non-trivial tasks. If you are to learn R, check out the R tab below. Similarly, if you are to learn Python, the Python tab will be your friend.<\/p>\n\n\n\n<p>The last two included a deep dive into historic <a href=\"https:\/\/lorentzen.ch\/index.php\/2021\/02\/19\/swiss-mortality\/\">mortality rates<\/a> as well as studying a <a href=\"https:\/\/lorentzen.ch\/index.php\/2021\/03\/14\/a-beautiful-regression-formula\/\">beautiful regression formula<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Diamonds data<\/h2>\n\n\n\n<p>One of the most used datasets to teach regression is the <em>diamonds<\/em> dataset. It describes 54&#8217;000 diamonds by<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>their <code>price<\/code>, <\/li><li>the four &#8220;C&#8221; variables (<code>carat<\/code>, <code>color<\/code>, <code>cut<\/code>, <code>clarity<\/code>),<\/li><li>as well as by perspective measurements <code>table<\/code>, <code>depth<\/code>, <code>x<\/code>, <code>y<\/code>, and <code>z<\/code>.<\/li><\/ul>\n\n\n\n<p>The dataset is readily available, e.g. 
in<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>R package <code>ggplot2<\/code>,<\/li><li>Python package <code>plotnine<\/code>,<\/li><li>and the fantastic <a href=\"https:\/\/openml.org\/d\/42225\">OpenML<\/a> database.<\/li><\/ul>\n\n\n\n<p><em><strong>Question: <\/strong><\/em>How many times did you use diamonds data to compare regression techniques like random forests and gradient boosting?<em><strong> <\/strong><\/em><\/p>\n\n\n\n<p><strong><em>Answer:<\/em><\/strong> Probably a lot!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The curious fact<\/h2>\n\n\n\n<p>We recently stumbled upon a curious fact regarding that dataset. <strong>26% of the diamonds are duplicates<\/strong> regarding <code>price<\/code> and the four &#8220;C&#8221; variables. Within duplicates, the perspective variables <code>table<\/code>, <code>depth<\/code>, <code>x<\/code>, <code>y<\/code>, and <code>z<\/code> differ as if a diamond had been measured from different angles.<\/p>\n\n\n\n<p>In order to illustrate the issue, let us add the two auxiliary variables<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><code>id<\/code>: group id of diamonds with identical price and four &#8220;C&#8221;, and<\/li><li><code>id_size<\/code>: number of rows in that id<\/li><\/ul>\n\n\n\n<p>to the dataset and consider a couple of examples. 
You can view both R and Python code &#8211; but the specific output will differ because of language-specific naming of group ids.<\/p>\n\n\n<div class=\"wp-block-ub-tabbed-content wp-block-ub-tabbed-content-holder wp-block-ub-tabbed-content-horizontal-holder-mobile wp-block-ub-tabbed-content-horizontal-holder-tablet\" id=\"ub-tabbed-content-3ce0597e-a860-492d-a768-c25e1a6881d9\" style=\"\">\n\t\t\t<div class=\"wp-block-ub-tabbed-content-tab-holder horizontal-tab-width-mobile horizontal-tab-width-tablet\">\n\t\t\t\t<div role=\"tablist\" class=\"wp-block-ub-tabbed-content-tabs-title wp-block-ub-tabbed-content-tabs-title-mobile-horizontal-tab wp-block-ub-tabbed-content-tabs-title-tablet-horizontal-tab\" style=\"justify-content: flex-start; \"><div role=\"tab\" id=\"ub-tabbed-content-3ce0597e-a860-492d-a768-c25e1a6881d9-tab-0\" aria-controls=\"ub-tabbed-content-3ce0597e-a860-492d-a768-c25e1a6881d9-panel-0\" aria-selected=\"true\" class=\"wp-block-ub-tabbed-content-tab-title-wrap active\" style=\"--ub-tabbed-title-background-color: #6d6d6d; --ub-tabbed-active-title-color: inherit; --ub-tabbed-active-title-background-color: #6d6d6d; text-align: center; \" tabindex=\"-1\">\n\t\t\t\t<div class=\"wp-block-ub-tabbed-content-tab-title\">R<\/div>\n\t\t\t<\/div><div role=\"tab\" id=\"ub-tabbed-content-3ce0597e-a860-492d-a768-c25e1a6881d9-tab-1\" aria-controls=\"ub-tabbed-content-3ce0597e-a860-492d-a768-c25e1a6881d9-panel-1\" aria-selected=\"false\" class=\"wp-block-ub-tabbed-content-tab-title-wrap\" style=\"--ub-tabbed-active-title-color: inherit; --ub-tabbed-active-title-background-color: #6d6d6d; text-align: center; \" tabindex=\"-1\">\n\t\t\t\t<div class=\"wp-block-ub-tabbed-content-tab-title\">Python<\/div>\n\t\t\t<\/div><\/div>\n\t\t\t<\/div>\n\t\t\t<div class=\"wp-block-ub-tabbed-content-tabs-content\" style=\"\"><div role=\"tabpanel\" class=\"wp-block-ub-tabbed-content-tab-content-wrap active\" id=\"ub-tabbed-content-3ce0597e-a860-492d-a768-c25e1a6881d9-panel-0\" 
aria-labelledby=\"ub-tabbed-content-3ce0597e-a860-492d-a768-c25e1a6881d9-tab-0\" tabindex=\"0\">\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting='{\"showPanel\":true,\"languageLabel\":\"language\",\"fullScreenButton\":true,\"copyButton\":true,\"mode\":\"r\",\"mime\":\"text\/x-rsrc\",\"theme\":\"material\",\"lineNumbers\":false,\"styleActiveLine\":false,\"lineWrapping\":false,\"readOnly\":true,\"fileName\":\"\",\"language\":\"R\",\"maxHeight\":\"400px\",\"modeName\":\"r\"}'>library(tidyverse)\n\n# We add group id and its size\ndia &lt;- diamonds %&gt;% \n  group_by(carat, cut, clarity, color, price) %&gt;% \n  mutate(id = cur_group_id(),\n         id_size = n()) %&gt;% \n  ungroup() %&gt;% \n  arrange(id)\n\n# Proportion of duplicates\n1 - max(dia$id) \/ nrow(dia)  # 0.26\n\n# Some examples\ndia %&gt;% \n  filter(id_size &gt; 1) %&gt;%\n  head(10)\n\n# Most frequent\ndia %&gt;% \n  arrange(-id_size) %&gt;% \n  head(.$id_size[1])\n\n# A random large diamond appearing multiple times\ndia %&gt;% \n  filter(id_size &gt; 3) %&gt;% \n  arrange(-carat) %&gt;% \n  head(.$id_size[1])<\/pre><\/div>\n\n<\/div><div role=\"tabpanel\" class=\"wp-block-ub-tabbed-content-tab-content-wrap ub-hide\" id=\"ub-tabbed-content-3ce0597e-a860-492d-a768-c25e1a6881d9-panel-1\" aria-labelledby=\"ub-tabbed-content-3ce0597e-a860-492d-a768-c25e1a6881d9-tab-1\" tabindex=\"0\">\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting='{\"showPanel\":true,\"languageLabel\":\"language\",\"fullScreenButton\":true,\"copyButton\":true,\"mode\":\"python\",\"mime\":\"text\/x-python\",\"theme\":\"material\",\"lineNumbers\":false,\"styleActiveLine\":false,\"lineWrapping\":false,\"readOnly\":true,\"fileName\":\"\",\"language\":\"Python\",\"maxHeight\":\"400px\",\"modeName\":\"python\"}'>import numpy as np\nimport pandas as pd\nfrom plotnine.data import diamonds\n\n# Variable groups\ncat_vars = [\"cut\", 
\"color\", \"clarity\"]\nxvars = cat_vars + [\"carat\"]\nall_vars = xvars + [\"price\"]\n\nprint(\"Shape: \", diamonds.shape)\n\n# Add id and id_size\ndf = diamonds.copy()\ndf[\"id\"] = df.groupby(all_vars).ngroup()\ndf[\"id_size\"] = df.groupby(all_vars)[\"price\"].transform(len)\ndf.sort_values(\"id\", inplace=True)\n\nprint(f'Proportion of dupes: {1 - df[\"id\"].max() \/ df.shape[0]:.0%}')\n\nprint(\"Random examples\")\nprint(df[df.id_size &gt; 1].head(10))\n\nprint(\"Most frequent\")\nprint(df.sort_values([\"id_size\", \"id\"]).tail(13))\n\nprint(\"A random large diamond appearing multiple times\")\ndf[df.id_size &gt; 3].sort_values(\"carat\").tail(6)<\/pre><\/div>\n\n<\/div><\/div>\n\t\t<\/div>\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"916\" height=\"602\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-8.png\" alt=\"\" class=\"wp-image-321\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-8.png 916w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-8-300x197.png 300w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-8-768x505.png 768w\" sizes=\"auto, (max-width: 916px) 100vw, 916px\" \/><figcaption>Table 1: Some duplicates in the four &#8220;C&#8221; variables and <code>price<\/code> (Python output).<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"852\" height=\"657\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-14.png\" alt=\"\" class=\"wp-image-347\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-14.png 852w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-14-300x231.png 300w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-14-768x592.png 768w\" sizes=\"auto, (max-width: 852px) 100vw, 852px\" \/><figcaption>Table 2: One of the two(!) 
diamonds appearing a whopping 43 times (Python output).<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1018\" height=\"369\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-11.png\" alt=\"\" class=\"wp-image-324\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-11.png 1018w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-11-300x109.png 300w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-11-768x278.png 768w\" sizes=\"auto, (max-width: 1018px) 100vw, 1018px\" \/><figcaption>Table 3: A large, 2.01 carat diamond appears six times (Python output).<\/figcaption><\/figure>\n\n\n\n<p>Of course, having the same id does not necessarily mean that the rows really describe the same diamond. <code>price<\/code> and the four &#8220;C&#8221;s could coincide purely by chance. Nevertheless: there are exactly six diamonds of 2.01 carat and a price of 16,778 USD in the dataset. And they all have the same color, cut and clarity. This cannot be a coincidence!<\/p>\n\n\n\n<p><strong>Why would this be problematic? <\/strong><\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p>In the presence of grouped data, standard validation techniques tend to reward overfitting.<\/p><\/blockquote>\n\n\n\n<p>This becomes immediately clear when we consider the 2.01 carat diamond from Table 3. Standard cross-validation (CV) uses random or stratified sampling and would scatter the six rows of that diamond across multiple CV folds. Highly flexible algorithms like random forests or nearest-neighbour regression could exploit this by memorizing the price of this diamond in-fold and do very well out-of-fold. 
As a consequence, the stated CV performance would be too good and the choice of the modeling technique and its hyperparameters suboptimal.<\/p>\n\n\n\n<p>With grouped data, a good approach is often to randomly sample the <em>whole group<\/em> instead of single rows. Such <strong>grouped splitting<\/strong> ensures that all rows of the same group end up in the same fold, so that memorizing a group in-fold no longer pays off out-of-fold.<\/p>\n\n\n\n<p><strong>Note 1.<\/strong> In our case of duplicates, a simple alternative to grouped splitting would be to remove the duplicates altogether. However, the occurrence of duplicates is just one of many situations where grouped or clustered samples appear in reality.<\/p>\n\n\n\n<p><strong>Note 2. <\/strong>The same considerations apply not only to cross-validation but also to simple train\/validation\/test splits.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Evaluation<\/h2>\n\n\n\n<p>What does this mean regarding our diamonds dataset? Using five-fold CV, we will estimate the true root-mean-squared error (RMSE) of a random forest predicting log price by the four &#8220;C&#8221;. We run this experiment twice: once with folds created by random splitting and once with folds created by grouped splitting. 
How heavily will the results from random splitting be biased?<\/p>\n\n\n<div class=\"wp-block-ub-tabbed-content wp-block-ub-tabbed-content-holder wp-block-ub-tabbed-content-horizontal-holder-mobile wp-block-ub-tabbed-content-horizontal-holder-tablet\" id=\"ub-tabbed-content-0ce04266-fcfb-4e4b-b809-61077297785d\" style=\"\">\n\t\t\t<div class=\"wp-block-ub-tabbed-content-tab-holder horizontal-tab-width-mobile horizontal-tab-width-tablet\">\n\t\t\t\t<div role=\"tablist\" class=\"wp-block-ub-tabbed-content-tabs-title wp-block-ub-tabbed-content-tabs-title-mobile-horizontal-tab wp-block-ub-tabbed-content-tabs-title-tablet-horizontal-tab\" style=\"justify-content: flex-start; \"><div role=\"tab\" id=\"ub-tabbed-content-0ce04266-fcfb-4e4b-b809-61077297785d-tab-0\" aria-controls=\"ub-tabbed-content-0ce04266-fcfb-4e4b-b809-61077297785d-panel-0\" aria-selected=\"true\" class=\"wp-block-ub-tabbed-content-tab-title-wrap active\" style=\"--ub-tabbed-title-background-color: #6d6d6d; --ub-tabbed-active-title-color: inherit; --ub-tabbed-active-title-background-color: #6d6d6d; text-align: center; \" tabindex=\"-1\">\n\t\t\t\t<div class=\"wp-block-ub-tabbed-content-tab-title\">R<\/div>\n\t\t\t<\/div><div role=\"tab\" id=\"ub-tabbed-content-0ce04266-fcfb-4e4b-b809-61077297785d-tab-1\" aria-controls=\"ub-tabbed-content-0ce04266-fcfb-4e4b-b809-61077297785d-panel-1\" aria-selected=\"false\" class=\"wp-block-ub-tabbed-content-tab-title-wrap\" style=\"--ub-tabbed-active-title-color: inherit; --ub-tabbed-active-title-background-color: #6d6d6d; text-align: center; \" tabindex=\"-1\">\n\t\t\t\t<div class=\"wp-block-ub-tabbed-content-tab-title\">Python<\/div>\n\t\t\t<\/div><\/div>\n\t\t\t<\/div>\n\t\t\t<div class=\"wp-block-ub-tabbed-content-tabs-content\" style=\"\"><div role=\"tabpanel\" class=\"wp-block-ub-tabbed-content-tab-content-wrap active\" id=\"ub-tabbed-content-0ce04266-fcfb-4e4b-b809-61077297785d-panel-0\" 
aria-labelledby=\"ub-tabbed-content-0ce04266-fcfb-4e4b-b809-61077297785d-tab-0\" tabindex=\"0\">\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting='{\"showPanel\":true,\"languageLabel\":\"language\",\"fullScreenButton\":true,\"copyButton\":true,\"mode\":\"r\",\"mime\":\"text\/x-rsrc\",\"theme\":\"material\",\"lineNumbers\":false,\"styleActiveLine\":false,\"lineWrapping\":false,\"readOnly\":true,\"fileName\":\"\",\"language\":\"R\",\"maxHeight\":\"400px\",\"modeName\":\"r\"}'>library(ranger)\nlibrary(splitTools) # one of our packages on CRAN\n\nset.seed(8325)\n\n# We model log(price)\ndia &lt;- dia %&gt;% \n  mutate(y = log(price))\n\n# Helper function: calculate rmse\nrmse &lt;- function(obs, pred) {\n  sqrt(mean((obs - pred)^2))\n}\n\n# Helper function: fit on the in-sample rows of one fold, evaluate on the rest\nfit_on_fold &lt;- function(fold, data) {\n  fit &lt;- ranger(y ~ carat + cut + color + clarity, data = data[fold, ])\n  rmse(data$y[-fold], predict(fit, data[-fold, ])$predictions)\n}\n  \n# 5-fold CV for different split types\ncross_validate &lt;- function(type, data) {\n  folds &lt;- create_folds(data$id, k = 5, type = type)\n  mean(sapply(folds, fit_on_fold, data = data))\n}\n\n# Apply and plot\n(results &lt;- sapply(c(\"basic\", \"grouped\"), cross_validate, data = dia))\nbarplot(results, col = \"orange\", ylab = \"RMSE by 5-fold CV\")<\/pre><\/div>\n\n<\/div><div role=\"tabpanel\" class=\"wp-block-ub-tabbed-content-tab-content-wrap ub-hide\" id=\"ub-tabbed-content-0ce04266-fcfb-4e4b-b809-61077297785d-panel-1\" aria-labelledby=\"ub-tabbed-content-0ce04266-fcfb-4e4b-b809-61077297785d-tab-1\" tabindex=\"0\">\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" 
data-setting='{\"showPanel\":true,\"languageLabel\":\"language\",\"fullScreenButton\":true,\"copyButton\":true,\"mode\":\"python\",\"mime\":\"text\/x-python\",\"theme\":\"material\",\"lineNumbers\":false,\"styleActiveLine\":false,\"lineWrapping\":false,\"readOnly\":true,\"fileName\":\"\",\"language\":\"Python\",\"maxHeight\":\"400px\",\"modeName\":\"python\"}'>from sklearn.ensemble import RandomForestRegressor\nfrom sklearn.model_selection import cross_val_score, GroupKFold, KFold\nfrom sklearn.metrics import make_scorer, mean_squared_error\nimport seaborn as sns\n\n# RMSE scorer; the square root is taken explicitly because newer\n# scikit-learn versions no longer accept squared=False\nrmse = make_scorer(lambda obs, pred: np.sqrt(mean_squared_error(obs, pred)))\n\n# Prepare y, X\ndf = df.sample(frac=1, random_state=6345)\ny = np.log(df.price)\nX = df[xvars].copy()\n\n# Correctly ordered integer encoding\nX[cat_vars] = X[cat_vars].apply(lambda x: x.cat.codes)\n\n# Cross-validation\nresults = {}\nrf = RandomForestRegressor(n_estimators=500, max_features=\"sqrt\", \n                           min_samples_leaf=5, n_jobs=-1)\nfor nm, strategy in zip((\"basic\", \"grouped\"), (KFold, GroupKFold)):\n    results[nm] = cross_val_score(\n        rf, X, y, cv=strategy(), scoring=rmse, groups=df.id\n    ).mean()\nprint(results)\n\nres = pd.DataFrame(results.items())\nsns.barplot(x=0, y=1, data=res);<\/pre><\/div>\n\n<\/div><\/div>\n\t\t<\/div>\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"507\" src=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-12-1024x507.png\" alt=\"\" class=\"wp-image-325\" srcset=\"https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-12-1024x507.png 1024w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-12-300x148.png 300w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-12-768x380.png 768w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-12-1200x594.png 1200w, https:\/\/lorentzen.ch\/wp-content\/uploads\/2021\/04\/image-12.png 1253w\" sizes=\"auto, (max-width: 1024px) 
100vw, 1024px\" \/><figcaption>Figure 1: Test root-mean-squared error using different splitting methods (R output).<\/figcaption><\/figure>\n\n\n\n<p>The RMSE (11%) of grouped CV is <strong>8%-10% higher<\/strong> than that of random CV (10%). The standard technique therefore seems to be <strong>considerably biased<\/strong>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Final remarks<\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>The diamonds dataset is not only a brilliant example to demonstrate regression techniques but also a great way to show the importance of a <strong>clean<\/strong> validation strategy (in this case: grouped splitting).<\/li><li>Blind or automatic ML would most probably fail to detect non-trivial data structures as in this case and therefore use inappropriate validation strategies. The resulting model would be somewhere between suboptimal and dangerous. And nobody would know it!<\/li><li>The first step towards a good model validation strategy is <strong>data understanding<\/strong>. 
This is a mix of knowing the data source, how the data was generated, the meaning of columns and rows, descriptive statistics etc.<\/li><\/ul>\n\n\n\n<p>The Python notebook and R code can be found at:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><a href=\"https:\/\/github.com\/lorentzenchr\/notebooks\/blob\/master\/blogposts\/2021-04-16%20diamonds_curious_fact.R\">R-Code on github<\/a><\/li><li><a href=\"https:\/\/github.com\/lorentzenchr\/notebooks\/blob\/master\/blogposts\/2021-04-16%20diamonds_curious_fact.ipynb\">Python-Code on github<\/a><\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;R <-> Python&#8221; continued&#8230; A Curious Fact on the Diamonds Dataset<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[10,6,5],"class_list":["post-258","post","type-post","status-publish","format-standard","hentry","category-statistics","tag-lost-in-translation","tag-python","tag-r"],"featured_image_src":null,"author_info":{"display_name":"Michael 
Mayer","author_link":"https:\/\/lorentzen.ch\/index.php\/author\/michael\/"},"_links":{"self":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/258","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/comments?post=258"}],"version-history":[{"count":74,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/258\/revisions"}],"predecessor-version":[{"id":362,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/posts\/258\/revisions\/362"}],"wp:attachment":[{"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/media?parent=258"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/categories?post=258"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lorentzen.ch\/index.php\/wp-json\/wp\/v2\/tags?post=258"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}