Stability and Structure of CART and SPAN Search Generated Data Partitions for the Analysis of Low Birth Weight
Volume 10, Issue 1 (2012), pp. 61–73
Pub. online: 4 August 2022 Type: Research Article Open Access
4 August 2022
4 August 2022
Abstract: Searching for data structure and decision rules using classification and regression tree (CART) methodology is now well established. An alternative procedure, search partition analysis (SPAN), is less well known. Both provide classifiers based on Boolean structures; in CART these are generated by a hierarchical series of local sub-searches and in SPAN by a global search. One issue with CART is its perceived instability, another the awkward nature of the Boolean structures generated by a hierarchical tree. Instability arises because the final tree structure is sensitive to early splits. SPAN, as a global search, seems more likely to render stable partitions. To examine these issues in the context of identifying mothers at risk of giving birth to low birth weight babies, we have taken a very large sample, divided it at random into ten non-overlapping sub-samples and performed SPAN and CART analyses on each sub-sample. The stability of the SPAN and CART models is described and, in addition, the structure of the Boolean representation of classifiers is examined. It is found that SPAN partitions have more intrinsic stability and less prone to Boolean structural irregularities.