Functions for ordering the set of pdqr-functions supplied in a list. This might be useful for doing comparative statistical inference for several groups of data.

summ_order(f_list, method = "compare", decreasing = FALSE)

summ_sort(f_list, method = "compare", decreasing = FALSE)

summ_rank(f_list, method = "compare")

## Arguments

f_list List of pdqr-functions.

method Method to be used for ordering. Should be one of "compare", "mean", "median", "mode".

decreasing If TRUE, ordering is done decreasingly.

## Value

summ_order() works essentially like order(). It returns an integer vector representing a permutation which rearranges f_list in desired order.

summ_sort() returns a sorted (in desired order) variant of f_list.

summ_rank() returns a numeric vector representing ranks of f_list elements: 1 for the "smallest", length(f_list) for the "biggest".
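Since summ_order() works like order(), the relation between the three return values can be previewed on plain numbers. The sketch below uses base R only (no pdqr calls); the vector `centers` is a hypothetical stand-in that holds the means of uniform distributions on [0, 1], [1, 2], and [-1, 0].

```r
# Hypothetical stand-in for a list of pdqr-functions: one number per element
centers <- c(a = 0.5, b = 1.5, c = -0.5)

# Analogue of summ_order(): permutation that puts elements in increasing order
ord <- order(centers)
ord

# Analogue of summ_sort(): apply that permutation to the input
sorted <- centers[ord]
names(sorted)

# Analogue of summ_rank(): position of each element in the sorted result
rnk <- rank(centers)
rnk

# Without ties, ranks are the inverse permutation of the order
all(order(ord) == rnk)
```

In this analogy, sorting is the same as indexing the input by the order permutation, which is the sense in which the permutation "rearranges f_list in desired order".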

## Details

Ties for all methods are handled so as to preserve the original order.

Method "compare" uses the following ordering relation: pdqr-function f is greater than g if and only if P(f >= g) > 0.5, or in code summ_prob_true(f >= g) > 0.5 (see pdqr methods for the "Ops" group generic family for more details on comparing pdqr-functions). This method orders input based on this relation and the order() function. Notes:

• This relation doesn't define a strict ordering because it is not transitive: there can be pdqr-functions f, g, and h for which f is greater than g, g is greater than h, and h is greater than f (whereas transitivity would require f to be greater than h). If not addressed, this might make the output depend on the order of the input. It is solved by first preordering f_list based on method "mean" and then calling order().

• Because comparing two pdqr-functions can be time consuming, this method becomes rather slow as the number of f_list elements grows.
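The non-transitivity noted above is easy to reproduce outside pdqr. The following base R sketch uses three hypothetical dice (not pdqr-functions) with equally likely faces and a helper `prob_geq()`, also hypothetical, that computes P(x >= y) exactly for independent rolls. It only illustrates the relation; it does not reproduce how summ_order() resolves the cycle.

```r
# Three hypothetical six-sided dice (equally likely faces), not pdqr-functions
dice <- list(
  A = c(2, 2, 4, 4, 9, 9),
  B = c(1, 1, 6, 6, 8, 8),
  C = c(3, 3, 5, 5, 7, 7)
)

# Exact P(x >= y) for independent rolls: share of face pairs with x >= y
prob_geq <- function(x, y) mean(outer(x, y, ">="))

# Pairwise matrix: entry [i, j] is P(dice[[i]] >= dice[[j]])
p <- sapply(dice, function(y) sapply(dice, function(x) prob_geq(x, y)))
round(p, 3)

# "Greater than" in the sense of method "compare": P(x >= y) > 0.5
p["A", "B"] > 0.5  # A is "greater" than B
p["B", "C"] > 0.5  # B is "greater" than C
p["C", "A"] > 0.5  # C is "greater" than A, closing the cycle
```

Each of the three comparisons has probability 5/9, so no element can consistently be called the "smallest" or the "biggest" from pairwise comparisons alone.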

Methods "mean", "median", and "mode" are based on summ_center(): ordering of f_list is defined as ordering of corresponding measures of distribution's center.
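To see why the three center-based methods can disagree, it helps to look at a family where all three centers have closed forms. The base R sketch below uses the standard lognormal formulas with the same parameters as the lognormal example in the Examples section. The names `ln_mean`, `ln_median`, and `ln_mode` are introduced here for illustration; pdqr itself works with numerical approximations, so its center values need not equal these formulas exactly.

```r
# Lognormal parameters (same values as in the Examples section)
m <- c(0, 0.2, 0.1)
s <- c(1.1, 1.2, 1.3)

# Closed-form centers of a lognormal distribution
ln_mean <- exp(m + s^2 / 2)
ln_median <- exp(m)
ln_mode <- exp(m - s^2)

# Ordering by each center gives a different permutation
order(ln_mean)
#> [1] 1 2 3
order(ln_median)
#> [1] 1 3 2
order(ln_mode)
#> [1] 3 2 1
```

A larger sdlog pushes the mean up and the mode down while leaving the median unchanged, which is why the three orderings differ for these parameters.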

## See also

Other summary functions: summ_center(), summ_classmetric(), summ_distance(), summ_entropy(), summ_hdr(), summ_interval(), summ_moment(), summ_prob_true(), summ_pval(), summ_quantile(), summ_roc(), summ_separation(), summ_spread()

## Examples

d_fun <- as_d(dunif)
f_list <- list(a = d_fun, b = d_fun + 1, c = d_fun - 1)
summ_order(f_list)
#> [1] 3 1 2
summ_sort(f_list)
#> $c
#> Density function of continuous type
#> Support: [-1, 0] (10000 intervals)
#>
#> $a
#> Density function of continuous type
#> Support: [0, 1] (10000 intervals)
#>
#> $b
#> Density function of continuous type
#> Support: [1, 2] (10000 intervals)
#>
summ_rank(f_list)
#> a b c
#> 2 3 1
# All methods might give different results on some elaborate pdqr-functions
# Methods "compare" and "mean" are not equivalent
non_mean_list <- list(
new_d(data.frame(x = c(0.56, 0.815), y = c(1, 1)), "continuous"),
new_d(data.frame(x = 0:1, y = c(0, 1)), "continuous")
)
summ_order(non_mean_list, method = "compare")
#> [1] 1 2
summ_order(non_mean_list, method = "mean")
#> [1] 2 1
# Methods powered by summ_center() are not equivalent
m <- c(0, 0.2, 0.1)
s <- c(1.1, 1.2, 1.3)
dlnorm_list <- lapply(seq_along(m), function(i) {
as_d(dlnorm, meanlog = m[i], sdlog = s[i])
})
summ_order(dlnorm_list, method = "mean")
#> [1] 1 2 3
summ_order(dlnorm_list, method = "median")
#> [1] 1 3 2
summ_order(dlnorm_list, method = "mode")
#> [1] 3 2 1
# Method "compare" handles inherent non-transitivity. Here the third element
# is "greater" than the second (P(f >= g) > 0.5), the second is "greater"
# than the first, and the first is "greater" than the third.
non_trans_list <- list(
new_d(data.frame(x = c(0.39, 0.44, 0.46), y = c(17, 14, 0)), "continuous"),
new_d(data.frame(x = c(0.05, 0.3, 0.70), y = c(4, 0, 4)), "continuous"),
new_d(data.frame(x = c(0.03, 0.40, 0.80), y = c(1, 1, 1)), "continuous")
)
summ_sort(non_trans_list)
#> [[1]]
#> Density function of continuous type
#> Support: [0.05, 0.7] (2 intervals)
#>
#> [[2]]
#> Density function of continuous type
#> Support: [0.03, 0.8] (2 intervals)
#>
#> [[3]]
#> Density function of continuous type
#> Support: [0.39, 0.46] (2 intervals)
#>
# Output doesn't depend on initial order
summ_sort(non_trans_list[c(2, 3, 1)])
#> [[1]]
#> Density function of continuous type
#> Support: [0.05, 0.7] (2 intervals)
#>
#> [[2]]
#> Density function of continuous type
#> Support: [0.03, 0.8] (2 intervals)
#>
#> [[3]]
#> Density function of continuous type
#> Support: [0.39, 0.46] (2 intervals)
#>