Abstract: For two independent random variables, X and Y, let p = P(X > Y) + 0.5P(X = Y), which is sometimes described as a probabilistic measure of effect size. It has been argued that for various reasons, p represents an important and useful way of characterizing how groups differ. In clinical trials, for example, an issue is the likelihood that one method of treatment will be more effective than another. The paper deals with making inferences about p when three or more groups are to be compared. When tied values can occur, the results suggest using a multiple comparison procedure based on an extension of Cliff's method used in conjunction with Hochberg's sequentially rejective technique. If tied values occur with probability zero, an alternative method can be argued to have a practical advantage. As for a global test, extant rank-based methods are unsatisfactory given the goal of comparing groups based on p. The one method that performed well in simulations is based in part on the distribution of the difference between each pair of random variables. A bootstrap method is used where a p-value is based on the projection depth of the null vector relative to the bootstrap cloud. The proposed methods are illustrated using data from an intervention study.
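To make the definition of p concrete, the sketch below computes its plug-in (sample) estimate for two groups: the proportion of all (x, y) pairs with x > y, plus half the proportion of tied pairs. The function name `estimate_p` and the sample values are hypothetical and chosen only for illustration. This is a point estimate only, not the inferential procedures discussed in the paper (Cliff's method, Hochberg's technique, or the bootstrap global test).

```python
def estimate_p(x, y):
    """Plug-in estimate of p = P(X > Y) + 0.5 * P(X = Y) from two samples.

    Hypothetical helper for illustration; compares every pair (x_i, y_j).
    """
    if len(x) == 0 or len(y) == 0:
        raise ValueError("both samples must be non-empty")
    # Count pairs where the observation from the first group is larger.
    greater = sum(1 for xi in x for yj in y if xi > yj)
    # Count tied pairs; each contributes half a "win".
    ties = sum(1 for xi in x for yj in y if xi == yj)
    return (greater + 0.5 * ties) / (len(x) * len(y))


if __name__ == "__main__":
    # Made-up data: two small groups with some tied values.
    group_a = [1, 2, 3]
    group_b = [2, 2, 4]
    print(estimate_p(group_a, group_b))
```

A value of 0.5 corresponds to no tendency for observations in one group to exceed those in the other, which is why two identical samples yield exactly 0.5 under this estimate.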