---
title: "Random Forest, using Ranger"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Random Forest, using Ranger}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(dplyr)
library(tidypredict)
library(parsnip)
library(ranger)
set.seed(100)
```

| Function                                                      |Works|
|---------------------------------------------------------------|-----|
|`tidypredict_fit()`, `tidypredict_sql()`, `parse_model()`      |  ✔  |
|`tidypredict_to_column()`                                      |  ✗  |
|`tidypredict_test()`                                           |  ✔  |
|`tidypredict_interval()`, `tidypredict_sql_interval()`         |  ✗  |
|`parsnip`                                                      |  ✔  |

## How it works

Here is a simple `ranger()` model using the `mtcars` dataset:

```{r}
library(dplyr)
library(tidypredict)
library(ranger)

model <- ranger(mpg ~ ., data = mtcars, num.trees = 5, max.depth = 2)
```

## Under the hood

The parser is based on the output of the `ranger::treeInfo()` function. It returns one decision path for each row that has a non-NA value in the `prediction` field, that is, one path per terminal node.

```{r}
treeInfo(model) %>%
  head()
```

The output from `parse_model()` is transformed into a `dplyr` (Tidy Eval) formula. Each decision tree becomes one `dplyr::case_when()` statement, and the statements for all trees are then combined.

```{r}
tidypredict_fit(model)
```

From there, the Tidy Eval formula can be used anywhere Tidy Eval expressions are accepted. `tidypredict` provides three paths:

- Use it directly inside `dplyr`: `mutate(mtcars, !! tidypredict_fit(model))`
- Use `tidypredict_to_column(model)` as a step in a piped command set (the table above marks this function as not supported for `ranger` models)
- Use `tidypredict_sql(model, con)` to retrieve the SQL statement for a database connection `con`

## parsnip

`tidypredict` also supports `ranger` model objects fitted via the `parsnip` package.

```{r}
library(parsnip)

parsnip_model <- rand_forest(mode = "regression", trees = 5) %>%
  set_engine("ranger", max.depth = 2) %>%
  fit(mpg ~ ., data = mtcars)

tidypredict_fit(parsnip_model)
```
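
To make the mechanics described under "Under the hood" more concrete, the following is a minimal sketch that does not call `tidypredict` or `ranger` at all. It assumes a hypothetical single tree with one split on `wt`; the `toy_tree` table is a hand-written, simplified stand-in for `treeInfo()` output, and the split value and predictions are made up for illustration. It shows how each non-NA `prediction` row corresponds to one branch of a `case_when()` expression, and how such an expression can be injected into `mutate()` with `!!`.

```{r}
library(dplyr)
library(rlang)

# Hypothetical, simplified stand-in for treeInfo() output:
# one root split on `wt` and two terminal nodes (values are illustrative).
toy_tree <- data.frame(
  nodeID       = c(0, 1, 2),
  splitvarName = c("wt", NA, NA),
  splitval     = c(3.2, NA, NA),
  terminal     = c(FALSE, TRUE, TRUE),
  prediction   = c(NA, 24.5, 15.8)
)

# One decision path per row with a non-NA prediction
n_paths <- sum(!is.na(toy_tree$prediction))
n_paths

# Hand-built expression with one case_when() branch per decision path.
# This only mimics the general shape of a parsed tree; it is not
# produced by tidypredict_fit().
toy_formula <- expr(case_when(wt <= 3.2 ~ 24.5, TRUE ~ 15.8))

# Inject the unevaluated expression into mutate() with !!
scored <- mtcars %>%
  mutate(fit = !!toy_formula)

scored %>%
  select(wt, fit) %>%
  head()
```

Because `toy_formula` is an unevaluated expression rather than a function, the same object can be passed to any tool that understands Tidy Eval, which is the design that lets a parsed model be reused outside of R's own prediction code.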