Using the Non-Parametric Classifier CART to Model Wood Density

To identify the stand attributes that best explain the variability in wood density, Pinus radiata plantations located in the Chilean coastal sector were studied and modeled. The study area corresponded to stands located on sedimentary soils between Constitución and Cobquecura. Within each sampling sector, individual tree variables were recorded and the most relevant stand parameters were estimated. Fifty trees were sampled in each sector, and six wood discs were obtained from each tree at different stem heights. Each disc was weighed green and then dried to anhydrous weight, and its basic density was calculated. Profiles that classify basic density according to stand characteristics were identified through regression trees, a technique that uses predictor variables to recursively partition the data into regions with similar responses. The objective of the regression tree method is to obtain highly homogeneous groups (branches), which are identified using pruning techniques that successively eliminate the branches that contribute least to the classification of the variable of interest. The results showed that the stand attributes that contributed significantly to basic density classification were basal area, number of trees per hectare, and mean height.


Introduction
Forestry modeling has commonly been based on the quantification and prediction of attributes at the stand and individual tree level. As stand management intensifies, with the goal of increasing both wood production and quality under different management conditions, more and better information on forest growth and performance becomes necessary. Growth simulators are a valuable decision-making tool in the forestry sector. They are based on models that estimate future tree growth over a given time period and consequently make it possible to forecast the wood resources to be produced, evaluate silvicultural prescriptions, and analyze the economics of different management alternatives (Vanclay, 1994). However, these simulators are not capable of predicting wood quality under different silvicultural options (Tassisa and Burkhart, 1998). Of the wood properties that affect quality, basic density is one of the most important because it determines the wood's use in sawmills, manufacturing factories, cellulose plants, and as planks. Because of this relation with final product quality, sawmills and the paper industry are interested in both density and its variability (Tian et al., 1995). Haygreen and Bowyer (1996) found several factors that affect the variability in basic density, such as site, climate, geographic location, species, age, and silviculture. Valencia and López (1999) indicate that the value of basic density and its variation depend to a great degree on the height and tree section from which the sample is extracted. In general, most studies on wood properties describe and evaluate the variability and its causal factors without providing tools to estimate these properties.
Wood properties are commonly estimated within specific zones using traditional statistical tools such as regression models. A non-traditional alternative for model construction is classification and regression trees (CART) (Breiman et al., 1984), whose objective is to obtain highly homogeneous groups (branches). These are obtained using pruning techniques that successively eliminate the branches that contribute least to the classification of the variable of interest (Larsen and Speckman, 2002; Skinner et al., 2002; Moisen and Frescino, 2003; Tamminen et al., 2003). Decision trees can be used to define wood use as a function of stand characteristics, identifying wood production areas with defined characteristics that meet market specifications.
Given the need for more disaggregated information on the wood density of Pinus radiata (radiata pine) in Chile, the general objective of this study is to identify the stand attributes that best explain wood density variability in this species. The specific objective is to identify profiles that classify wood density according to its most relevant stand characteristics.

Study Area
The study area corresponded to radiata pine plantations, aged between 20 and 28 years, established in the Chilean coastal sector between Constitución and Cobquecura (34°50′ to 36°25′ S) on soils of predominantly sedimentary origin.

Description of Sampling Sector
The sampling sector was characterized at three levels: site, stand, and individual tree. At site level, the soil type was recorded; at stand level, the site index (height of dominant and codominant trees), age, density (trees/hectare), height, mean diameter, and basal area (area occupied by trees) were recorded. For each sampled tree, the diameter at breast height, total height, crown initiation height (height of the first live branch), crown class, and the heights at which the wood discs were obtained were recorded. Table 1 shows the main statistics of stand characteristics for the sampled trees.

Sampling and Laboratory Procedures for Basic Wood Density Determination
In the tree selection process, the diameter classes principally represented in the distribution that characterized the stand structure (regular, with classes of 22 to 32 cm diameter) were considered, choosing stems that were cylindrical, straight, and free of bifurcations and defects. Eighty-two trees were sampled in five stands (standard deviation of 5 kg/m³ and a confidence level of 95%). From each sampled tree, six discs were extracted: at the stump, at 5% (DBH), 25%, 50%, and 75% of the tree's commercial height, and at the utilization limit diameter (ULD) of 8 cm (sample total: 82 trees × 6 discs/tree = 492). Each sample (wood disc) was weighed and measured green, following Chilean Standards NCh 176/1 and 176/2 (INN, 1985, 1986), and subsequently dried in an oven at 103 ± 2 °C until anhydrous weight was reached. The basic density was calculated as the ratio of the sample's anhydrous weight to its green volume.
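The density calculation described above can be sketched as follows. This is a minimal illustration, assuming the anhydrous mass is recorded in grams and the green volume in cubic centimeters; the function name and the example disc values are hypothetical and do not come from the study data.

```python
def basic_density(anhydrous_mass_g: float, green_volume_cm3: float) -> float:
    """Return basic density in kg/m^3 from anhydrous mass (g) and green volume (cm^3)."""
    if green_volume_cm3 <= 0:
        raise ValueError("green volume must be positive")
    # 1 g/cm^3 equals 1000 kg/m^3, hence the conversion factor.
    return anhydrous_mass_g / green_volume_cm3 * 1000.0


# Hypothetical disc (not a measured value): 450 g anhydrous, 1000 cm^3 green volume.
example_density = basic_density(450.0, 1000.0)
```

For the hypothetical disc above, the result is approximately 450 kg/m³, which is of the same order as the stand averages reported later in the text.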

Regression Trees Construction
With the descriptive stand variables for the study area (Table 1), regression trees were constructed using the software S-PLUS 2000 (Mathsoft, 2000). The process is as follows. Consider the multiple regression problem $y_i = f(x_{i1}, \ldots, x_{ip}) + \varepsilon_i$, $i = 1, \ldots, n$, where $f$ is unknown and not easily parameterized, $x_{ij}$ are known independent variables, and $\varepsilon_i$ are random error terms with zero mean. A node $N$ is a subset of the indexes $\{1, \ldots, n\}$. The deviance of node $N$ is defined as:
$$D(N) = \sum_{i \in N} \left(y_i - \bar{y}(N)\right)^2$$
where $\bar{y}(N)$ is the mean of the observations in node $N$. The root node consists of all the observations, and at each step the parent node is divided into two child nodes, a left node ($N_L$) and a right node ($N_R$), so as to minimize $D(N_L) + D(N_R)$. For continuous variables, node partition considers all divisions of the form $x_{ij} \le t$ versus $x_{ij} > t$. For each independent variable, all possible partitions are considered, and $D(N_L) + D(N_R)$ is calculated for the node to be divided. The variable and partition that produce the best division (the lowest deviance) are selected to partition node $N$. The algorithm proceeds recursively until no further partition can be performed according to predetermined criteria. Normally, a number of nodes is specified a priori, or partitioning stops when the node deviance falls below a certain level (Larsen and Speckman, 2002).
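The split search for a single continuous predictor can be sketched as below. This is an illustrative sketch, not the S-PLUS implementation used in the study: it assumes the node deviance is the sum of squared deviations from the node mean, and it places candidate thresholds at midpoints between consecutive distinct predictor values. The function names and the example values are hypothetical.

```python
from typing import List, Optional, Tuple


def deviance(y: List[float]) -> float:
    """Sum of squared deviations from the node mean (0.0 for an empty node)."""
    if not y:
        return 0.0
    mean = sum(y) / len(y)
    return sum((value - mean) ** 2 for value in y)


def best_split(x: List[float], y: List[float]) -> Optional[Tuple[float, float]]:
    """Return (threshold t, D(N_L) + D(N_R)) for the split x <= t vs x > t
    with the lowest summed child deviance, or None if x has one distinct value."""
    best = None
    distinct = sorted(set(x))
    for low, high in zip(distinct, distinct[1:]):
        t = (low + high) / 2  # midpoint between consecutive distinct values
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        total = deviance(left) + deviance(right)
        if best is None or total < best[1]:
            best = (t, total)
    return best


# Hypothetical stands: basal area (m^2/ha) and basic density (kg/m^3).
basal_area = [18.0, 20.0, 24.0, 26.0]
density = [410.0, 420.0, 470.0, 480.0]
split = best_split(basal_area, density)
```

In a full tree, this search would be repeated over every predictor and then applied recursively to each child node until a stopping criterion is met.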
The selection of one tree over another generally depends on the estimate of its error rate $R(T)$. This rate can be estimated in several ways, the most notable being cross-validation. This method estimates $R(T)$ by applying the validation-sample estimator repeatedly in an analogous manner. Each time, a fraction $k^{-1}$ of the total sample is removed from the tree construction sample. In this way, $k$ estimates $R^{(1)}(T), \ldots, R^{(k)}(T)$ are obtained and averaged using the following formula:
$$R^{cv}(T) = \frac{1}{k} \sum_{i=1}^{k} R^{(i)}(T)$$
where $R^{cv}(T)$ denotes the cross-validation estimate of $R$. If the tree constructed for each of the sub-samples differs from the others, the previous expression is not valid. A basic technique in tree construction is to grow large trees up to the maximum possible tree $A_{\max}$ without considering error rates; pruning is performed afterwards by choosing the tree that provides the lowest error rate. Once the entire tree $A_{\max}$ has been constructed and fitted to the data, a pruning algorithm is applied to obtain a sequence of sub-trees through the successive suppression of the branches that provide the least information for discriminating between the classes of the response variable $Y$. Tree pruning is analogous to "backward" selection in regression: it removes some of the terminal nodes. Finally, the sub-tree $A^*$ that provides the lowest error rate is selected (Puerta, 2002).
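The averaging of the k fold estimates can be sketched as follows. This is a simplified illustration under stated assumptions: the folds are contiguous blocks without shuffling, the error measure is the mean squared error on the held-out fold, and the fitted model is a stand-in root-only tree that predicts the training mean rather than a full regression tree. All function names and data values are hypothetical.

```python
from typing import Callable, List


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_error(
    x: List[float],
    y: List[float],
    k: int,
    fit: Callable[[List[float], List[float]], object],
    predict: Callable[[object, float], float],
) -> float:
    """Average the k held-out mean squared errors R^(1), ..., R^(k)."""
    n = len(y)
    fold_errors = []
    for held_out in k_fold_indices(n, k):
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        model = fit([x[i] for i in train], [y[i] for i in train])
        squared = [(y[i] - predict(model, x[i])) ** 2 for i in held_out]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / k


# Stand-in model (assumption): a root-only tree that predicts the training mean.
def fit_root_only(train_x: List[float], train_y: List[float]) -> float:
    return sum(train_y) / len(train_y)


def predict_root_only(model: float, xi: float) -> float:
    return model


r_cv = cross_validated_error(
    [1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0], 2, fit_root_only, predict_root_only
)
```

Replacing the stand-in `fit` and `predict` with a tree-growing routine of a fixed size would give the cross-validated error for that tree size, which is the quantity compared across candidate trees.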
According to Larsen and Speckman (2002), a goodness-of-fit criterion for a tree $T$ with terminal nodes $\{N_k\}$ is defined as:
$$D(T) = \sum\nolimits^{*} D(N_k)$$
where $\sum^{*}$ means summing over all the terminal nodes $N_k$. If a tree $T'$ is a subtree of $T$, clearly $D(T) \le D(T')$. The pruning algorithm successively removes the pair of terminal nodes corresponding to the partition with the smallest reduction in deviance.
In other words, if $T$ has terminal nodes $\{N_k\}$, then each pair $\{N_{2j}, N_{2j+1}\}$ is the result of the division of a larger node $N_j$ with $D(N_j) \ge D(N_{2j}) + D(N_{2j+1})$. All the terminal node pairs are examined, and the pair with the lowest $D(N_j) - D(N_{2j}) - D(N_{2j+1})$ is removed to create a new sub-tree $T'$. This method is analogous to removing the least significant variable in the "backward" selection process. The process is repeated to create a nested set of trees $T_m \subset \cdots \subset T_0$, where $T_0$ is the entire tree and $T_m$ corresponds to the tree with only a root node.
To select a tree from this sequence, Breiman et al. (1984) propose a cost-complexity measure for the tree $T$:
$$D_\alpha(T) = D(T) + \alpha \cdot \mathrm{size}(T)$$
where $\mathrm{size}(T)$ is the number of terminal nodes of $T$ and $\alpha$ is a parameter chosen to adjust for cost-complexity. For a given $\alpha$, there is at least one tree that minimizes $D_\alpha(T)$. When $\alpha = 2\sigma^2$, the measure corresponds to the Akaike information criterion (AIC). Along the nested sequence, the size of the minimizing tree is a decreasing function of $\alpha$. Breiman et al. (1984) suggest that an optimal tree should be one with the smallest possible number of terminal nodes, a minimum standard error, and the lowest cost in terms of the information that it should contribute.
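The pruning sequence can be sketched with the same node numbering, in which node j has children 2j and 2j+1 and node 1 is the root. This is a minimal sketch of the procedure as described in the text, not the S-PLUS routine; the function name and the deviance values in the example are hypothetical.

```python
from typing import Dict, List, Set


def prune_sequence(node_deviance: Dict[int, float]) -> List[Set[int]]:
    """Return the terminal-node sets of T_0, T_1, ..., T_m.

    node_deviance maps every node index of the full tree T_0 to D(N).
    At each step, the sibling pair of terminal nodes with the lowest
    D(N_j) - D(N_2j) - D(N_2j+1) is removed.
    """
    nodes = set(node_deviance)

    def terminals(current: Set[int]) -> Set[int]:
        # A node is terminal when its left child is absent from the tree.
        return {j for j in current if 2 * j not in current}

    sequence = [terminals(nodes)]
    while len(nodes) > 1:
        leaves = terminals(nodes)
        # Parents whose two children are both terminal nodes.
        candidates = [j for j in nodes if 2 * j in leaves and 2 * j + 1 in leaves]
        weakest = min(
            candidates,
            key=lambda j: node_deviance[j]
            - node_deviance[2 * j]
            - node_deviance[2 * j + 1],
        )
        nodes -= {2 * weakest, 2 * weakest + 1}
        sequence.append(terminals(nodes))
    return sequence


# Hypothetical deviances: node 1 splits into 2 and 3; node 3 splits into 6 and 7.
full_tree = {1: 1000.0, 2: 300.0, 3: 400.0, 6: 150.0, 7: 200.0}
pruned = prune_sequence(full_tree)
tree_deviances = [sum(full_tree[j] for j in leaves) for leaves in pruned]
```

In this hypothetical example the sequence runs from three terminal nodes down to the root alone, and the tree deviance, summed over terminal nodes, grows at each pruning step.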
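Choosing a tree from the nested sequence for a given α can be sketched as below. The sketch assumes the usual cost-complexity form, tree deviance plus α times the number of terminal nodes; the function names and the candidate values are hypothetical and not taken from the study.

```python
from typing import List, Tuple


def cost_complexity(tree_deviance: float, n_terminal: int, alpha: float) -> float:
    """Cost-complexity measure: deviance plus alpha times the number of terminal nodes."""
    return tree_deviance + alpha * n_terminal


def select_tree(candidates: List[Tuple[int, float]], alpha: float) -> Tuple[int, float]:
    """Return the (n_terminal, deviance) pair with the lowest cost-complexity."""
    return min(candidates, key=lambda c: cost_complexity(c[1], c[0], alpha))


# Hypothetical nested sequence: (number of terminal nodes, tree deviance).
sequence = [(3, 650.0), (2, 700.0), (1, 1000.0)]
chosen_small_alpha = select_tree(sequence, 0.0)
chosen_mid_alpha = select_tree(sequence, 100.0)
chosen_large_alpha = select_tree(sequence, 400.0)
```

With these hypothetical values, α = 0 keeps the full three-node tree, α = 100 selects the two-node tree, and α = 400 collapses to the root, which illustrates how larger penalties favor smaller trees.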

Results and Discussion
The constructed regression tree showed that the stand variables that principally contribute to wood density classification are, in order of importance in the discriminatory process, basal area per hectare, mean height, stand density, and site index (Figure 1). The tree's left branch groups the stands with low average basic density (415.7 kg/m³), corresponding to stands with a basal area below 22.1 m²/ha. The right branch groups the stands with higher basic density, defined by the variables mean height, stand density, and site index. The stands with the highest basic density (482.3 kg/m³) have a mean height lower than 32.2 m and a stand density below 455 trees/ha (Figure 1). Because the CART method usually generates the complete regression tree, the model is overfitted. Consequently, pruning methods are needed to eliminate the terminal nodes that contribute least to the classification of the variable of interest. For this reason, the tree was pruned using the cross-validation method (Breiman et al., 1984). Figure 2 shows the behavior of the deviance for several tree sizes (number of terminal nodes), indicating that three nodes are sufficient to reduce model deviance. The original tree was thus reduced from five to only three terminal nodes, which avoids overfitting the model.
After pruning, the regression tree consisted of only two stand variables (basal area per hectare and mean height), with which the model could adequately discriminate the stands according to wood density. The highest average basic density (477.5 kg/m³) corresponds to stands with a mean height lower than 32.2 m and a basal area larger than 22.1 m²/ha (Figure 3).
If the stand's mean height is considered a predictor of site quality, given its close relation with the site index, then it is reasonable that the stands with higher basic density were those with lower mean height. On this point, numerous authors have demonstrated that wood density increases as site quality diminishes. Schutz et al. (1991), in a study of Pinus patula in different sites, demonstrated that the site accounted for 61% of the total variation in basic density. Morales (1968) found for radiata pine in Chile that the wood's specific weight increased as site quality diminished, rising 21% between a good and a bad site. This result can be explained by considering that the specific weight is a function of the ratio between the volume occupied by the cell walls and the volume of empty spaces. Logically, if the length of the tracheids is lower in poor sites, then the volume occupied by the cell walls will grow, translating into an increase in wood density because less space is occupied by cell cavities (Haygreen and Bowyer, 1996). Several authors have also demonstrated that ring width is relatively important for the specific weight. Consequently, if the growth rate decreases in a lower quality site, ring width will diminish and wood density will increase (DeBell et al., 1994).
With respect to basal area, the first discriminatory variable of wood density, and given that it is a measure of stand density, it is logical to conclude that basic density will be higher at higher basal area values. According to González and Molina (1989), a forest's growth rate is affected by the number of trees per unit area, since the growth potential is distributed among them. With fewer individuals per unit area, and consequently a lower basal area, growth will be faster, generating wider rings and consequently lower density. According to Larocque and Marshall (1995), stand density closely affects wood density in Pinus resinosa, with wood density generally tending to decrease as tree spacing increases. On this point, Cown and McConchie (1982) indicate that the principal factor affecting the wood's intrinsic properties is tree age, which closely controls wood density and latewood development. The growth rate per se has been demonstrated to have a minimal effect on wood density, although several studies have demonstrated that stand density levels and wood density are negatively correlated (Cown and McConchie, 1982).

Conclusions
Considering the complete regression tree, the stand attributes that significantly contribute to the classification of basic density are: basal area, number of trees per hectare, mean height, and site index.
The regression tree reduced by cross validation diminishes the dimensionality of the final model, incorporating the basal area and mean height as the only stand variables in the classification of basic wood density.
References
Schutz, C., Christie, S. and Herman, B. (1991). Site relationship for some wood properties of pine species in plantation forests of southern Africa. South Africa Forestry Journal 156, 1-6.