ctree function - RDocumentation (2024)

Description

Recursive partitioning for continuous, censored, ordered, nominal and multivariate response variables in a conditional inference framework.

Usage

ctree(formula, data, subset, weights, na.action = na.pass, offset, cluster,
      control = ctree_control(...), ytrafo = NULL, converged = NULL,
      scores = NULL, doFit = TRUE, ...)

Value

An object of class party.

Arguments

formula

a symbolic description of the model to be fit.

data

a data frame containing the variables in the model.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

weights

an optional vector of weights to be used in the fitting process. Only non-negative integer valued weights are allowed.

offset

an optional vector of offset values.

cluster

an optional factor indicating independent clusters. Highly experimental, use at your own risk.

na.action

a function which indicates what should happen when the data contain missing values.

control

a list with control parameters, see ctree_control.

ytrafo

an optional named list of functions to be applied to the response variable(s) before testing their association with the explanatory variables. Note that this transformation is only performed once for the root node and does not take weights into account. Alternatively, ytrafo can be a function of data and weights. In this case, the transformation is computed for every node with the corresponding weights. This feature is experimental and the user interface is likely to change.

converged

an optional function for checking user-defined criteria before splits are implemented. Not intended for end users and very likely to change.

scores

an optional named list of scores to be attached to ordered factors.

doFit

a logical; if FALSE, the tree is not fitted.

...

arguments passed to ctree_control.

Details

Function partykit::ctree is a reimplementation of (most of) party::ctree employing the new party infrastructure of the partykit package. The vignette vignette("ctree", package = "partykit") explains the internals of the different implementations.

Conditional inference trees estimate a regression relationship by binary recursive partitioning in a conditional inference framework. Roughly, the algorithm works as follows: 1) Test the global null hypothesis of independence between any of the input variables and the response (which may be multivariate as well). Stop if this hypothesis cannot be rejected. Otherwise select the input variable with the strongest association to the response. This association is measured by a p-value corresponding to a test for the partial null hypothesis of a single input variable and the response. 2) Implement a binary split in the selected input variable. 3) Recursively repeat steps 1) and 2).
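A minimal sketch of these steps in code (assuming the partykit package is installed); each inner node of the printed tree records the selected split variable and the p-value of the associated independence test:

```r
library("partykit")

## Fit a conditional inference tree to the iris data; printing the
## result shows, for each inner node, the split variable chosen in
## step 1) and the binary split implemented in step 2).
irisct <- ctree(Species ~ ., data = iris)
print(irisct)
```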

The implementation utilizes a unified framework for conditional inference, or permutation tests, developed by Strasser and Weber (1999). The stop criterion in step 1) is either based on multiplicity adjusted p-values (testtype = "Bonferroni" in ctree_control) or on the univariate p-values (testtype = "Univariate"). In both cases, the criterion is maximized, i.e., 1 - p-value is used. A split is implemented when the criterion exceeds the value given by mincriterion as specified in ctree_control. For example, when mincriterion = 0.95, the p-value must be smaller than 0.05 in order to split this node. This statistical approach ensures that the right-sized tree is grown without additional (post-)pruning or cross-validation. The level of mincriterion can either be specified to be appropriate for the size of the data set (0.95 is typically appropriate for small to moderately-sized data sets) or could potentially be treated like a hyperparameter (see Section 3.4 in Hothorn, Hornik and Zeileis, 2006). The selection of the input variable to split in is based on the univariate p-values, avoiding a variable selection bias towards input variables with many possible cutpoints. The test statistics in each of the nodes can be extracted with the sctest method. (Note that the generic is in the strucchange package, so this either needs to be loaded or sctest.constparty has to be called directly.) In cases where splitting stops due to the sample size (e.g., minsplit or minbucket), the test results may be empty.
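As a sketch (assuming partykit is available), the test type and split threshold described above can be set either through ctree_control or passed directly via ...:

```r
library("partykit")

## Univariate (unadjusted) p-values with a stricter threshold:
## a node is split only when the p-value is below 0.01.
ct1 <- ctree(dist ~ speed, data = cars,
             control = ctree_control(testtype = "Univariate",
                                     mincriterion = 0.99))

## Equivalent specification: arguments in ... are passed on
## to ctree_control().
ct2 <- ctree(dist ~ speed, data = cars,
             testtype = "Univariate", mincriterion = 0.99)
```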

Predictions can be computed using predict, which returns predicted means, predicted classes or median predicted survival times and more information about the conditional distribution of the response, i.e., class probabilities or predicted Kaplan-Meier curves. For observations with zero weights, predictions are computed from the fitted tree when newdata = NULL.
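For illustration (a sketch assuming partykit is installed), the main prediction types for a classification tree look like this:

```r
library("partykit")

irisct <- ctree(Species ~ ., data = iris)
new <- iris[c(1, 51, 101), ]

predict(irisct, newdata = new)                  # predicted classes
predict(irisct, newdata = new, type = "prob")   # class probabilities
predict(irisct, newdata = new, type = "node")   # terminal node ids
```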

By default, the scores for each ordered factor x are 1:length(x); this can be changed for variables in the formula using scores = list(x = c(1, 5, 6)), for example.
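A short sketch of the scores argument (assuming partykit; the variable construction and score values below are illustrative only):

```r
library("partykit")

## Turn Temp into an ordered factor with three levels, then attach
## custom scores to it instead of the default 1:3.
aq <- subset(airquality, !is.na(Ozone))
aq$Temp <- cut(aq$Temp, 3, ordered_result = TRUE)
ct <- ctree(Ozone ~ Temp + Wind, data = aq,
            scores = list(Temp = c(1, 5, 6)))
```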

For a general description of the methodology see Hothorn, Hornik and Zeileis (2006) and Hothorn, Hornik, van de Wiel and Zeileis (2006).

References

Hothorn T, Hornik K, Van de Wiel MA, Zeileis A (2006). A Lego System for Conditional Inference. The American Statistician, 60(3), 257--263.

Hothorn T, Hornik K, Zeileis A (2006). Unbiased Recursive Partitioning: A Conditional Inference Framework. Journal of Computational and Graphical Statistics, 15(3), 651--674.

Hothorn T, Zeileis A (2015). partykit: A Modular Toolkit for Recursive Partytioning in R. Journal of Machine Learning Research, 16, 3905--3909.

Strasser H, Weber C (1999). On the Asymptotic Theory of Permutation Statistics. Mathematical Methods of Statistics, 8, 220--250.

Examples


### regression
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq)
airct
plot(airct)
mean((airq$Ozone - predict(airct))^2)

### classification
irisct <- ctree(Species ~ ., data = iris)
irisct
plot(irisct)
table(predict(irisct), iris$Species)

### estimated class probabilities, a list
tr <- predict(irisct, newdata = iris[1:10,], type = "prob")

### survival analysis
if (require("TH.data") && require("survival") &&
    require("coin") && require("Formula")) {

  data("GBSG2", package = "TH.data")
  (GBSG2ct <- ctree(Surv(time, cens) ~ ., data = GBSG2))
  predict(GBSG2ct, newdata = GBSG2[1:2,], type = "response")
  plot(GBSG2ct)

  ### with weight-dependent log-rank scores
  ### log-rank trafo for observations in this node only (= weights > 0)
  h <- function(y, x, start = NULL, weights, offset, estfun = TRUE,
                object = FALSE, ...) {
    if (is.null(weights)) weights <- rep(1, NROW(y))
    s <- logrank_trafo(y[weights > 0, , drop = FALSE])
    r <- rep(0, length(weights))
    r[weights > 0] <- s
    list(estfun = matrix(as.double(r), ncol = 1), converged = TRUE)
  }

  ### very much the same tree
  (ctree(Surv(time, cens) ~ ., data = GBSG2, ytrafo = h))
}

### multivariate responses
airct2 <- ctree(Ozone + Temp ~ ., data = airq)
airct2
plot(airct2)



Article information

Author: Aron Pacocha

Last Updated:
