These functions provide ways of working with a region: a data frame with numeric "left" and "right" columns, each row of which represents a unique finite interval (open, either type of half-open, or closed). Values of "left" and "right" columns should create an "ordered" set of intervals: left[1] <= right[1] <= left[2] <= right[2] <= ... (intervals with zero width are accepted). Originally, region_*() functions were designed to work with output of summ_hdr() and summ_interval(), but can be used for any data frame which satisfies the definition of a region.
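The ordering requirement above can be checked with a few lines of base R. The following is a minimal sketch, not the package's own validation: `is_region_sketch()` is a hypothetical helper introduced here for illustration, and it does not check that rows are unique.

```r
# Hypothetical helper (not part of the package): checks that a data frame
# satisfies left[1] <= right[1] <= left[2] <= right[2] <= ...
is_region_sketch <- function(region) {
  if (!is.data.frame(region)) return(FALSE)
  if (!all(c("left", "right") %in% names(region))) return(FALSE)
  left <- region[["left"]]
  right <- region[["right"]]
  if (!is.numeric(left) || !is.numeric(right)) return(FALSE)
  # Intervals must be finite (this also rejects NA values)
  if (!all(is.finite(left)) || !all(is.finite(right))) return(FALSE)
  # Interleave ends row by row: left[1], right[1], left[2], right[2], ...
  ends <- as.vector(rbind(left, right))
  # The interleaved sequence must be non-decreasing
  !is.unsorted(ends)
}

is_region_sketch(data.frame(left = c(0, 2), right = c(1, 2)))
is_region_sketch(data.frame(left = c(0, 0.5), right = c(1, 2)))
```

The first call describes a valid region (an interval and a single point); the second fails the check because its intervals overlap.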

region_is_in(region, x, left_closed = TRUE, right_closed = TRUE)

region_prob(region, f, left_closed = TRUE, right_closed = TRUE)

region_height(region, f, left_closed = TRUE, right_closed = TRUE)

region_width(region)

region_distance(region, region2, method = "Jaccard")

region_draw(region, col = "blue", alpha = 0.2)

## Arguments

• region: A data frame representing region.

• x: Numeric vector to be tested for being inside region.

• left_closed: A single logical value representing whether to treat left ends of intervals as their parts.

• right_closed: A single logical value representing whether to treat right ends of intervals as their parts.

• f: A pdqr-function.

• region2: A data frame representing region.

• method: Method for computing distance between regions in region_distance(). Should be one of "Jaccard" or methods of summ_distance().

• col: Single color of rectangles to be used. Should be appropriate for col argument of col2rgb().

• alpha: Single number representing factor modifying the opacity alpha; typically in [0; 1].

## Value

region_is_in() returns a logical vector (with length equal to length of x) indicating whether the corresponding element of x is inside a region.

region_prob() returns a single number between 0 and 1 representing total probability of region.

region_height() returns a single number representing a height of a region with respect to f, i.e. minimum value that corresponding d-function can return based on relevant points inside a region.

region_width() returns a single number representing total width of a region.

region_draw() draws colored rectangles filling region intervals.

## Details

region_is_in() tests each value of x for being inside at least one interval of region. In other words, if there is a row for which an element of x lies between the "left" and "right" values (respecting left_closed and right_closed options), output for that element will be TRUE. Note that for zero-width intervals one of left_closed or right_closed being TRUE is enough to accept that point as "in region".
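The membership rule can be written out in base R. This is a minimal sketch of the described behavior, not the package's implementation; `is_in_sketch()` is a hypothetical name used only here.

```r
# Hypothetical re-implementation of the rule described above
is_in_sketch <- function(region, x, left_closed = TRUE, right_closed = TRUE) {
  left <- region[["left"]]
  right <- region[["right"]]
  vapply(x, function(x_i) {
    # Strictly inside some interval
    inside <- left < x_i & x_i < right
    # On an end that is treated as part of its interval
    on_left <- left_closed & x_i == left
    on_right <- right_closed & x_i == right
    any(inside | on_left | on_right)
  }, logical(1))
}

region <- data.frame(left = 1, right = 3)
is_in_sketch(region, 1:3, left_closed = FALSE)

# Zero-width interval: one closed end is enough
point <- data.frame(left = 1, right = 1)
is_in_sketch(point, 1, left_closed = FALSE)
```

For a zero-width interval the strict test never succeeds, so the point is accepted only through one of the closed ends.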

region_prob() computes total probability of region according to pdqr-function f. If f has "discrete" type, output is computed as sum of probabilities for all "x" values from "x_tbl" metadata which lie inside a region (respecting left_closed and right_closed options while using region_is_in()). If f has "continuous" type, output is computed as integral of density over a region (*_closed options have no effect, because single points carry zero probability).
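Both cases can be illustrated with base R distributions instead of pdqr-functions. This is a sketch under stated assumptions: the data frame `x_tbl` only mimics the "x_tbl" metadata (its column names are an assumption here), and the regions are made up for illustration.

```r
# Discrete case: a table of values and their probabilities, mimicking the
# "x_tbl" metadata (column names are an assumption of this sketch)
x_tbl <- data.frame(x = 0:10, prob = dbinom(0:10, size = 10, prob = 0.7))
region_dis <- data.frame(left = 6, right = 8)

# Closed intervals: keep "x" values lying in at least one interval
in_region <- vapply(
  x_tbl[["x"]],
  function(x_i) any(region_dis[["left"]] <= x_i & x_i <= region_dis[["right"]]),
  logical(1)
)
prob_dis <- sum(x_tbl[["prob"]][in_region])

# Continuous case: integrate the density over every interval and add up
region_con <- data.frame(left = c(-2, 1), right = c(-1, 2))
prob_con <- sum(mapply(
  function(l, r) stats::integrate(dnorm, lower = l, upper = r)[["value"]],
  region_con[["left"]], region_con[["right"]]
))
```

Here `prob_dis` is the binomial probability of the values 6, 7, and 8, and `prob_con` is the standard normal probability of the two intervals combined.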

region_height() computes "height" of a region (with respect to f): the minimum value that the d-function corresponding to f can return based on relevant points inside a region. If f has "discrete" type, those relevant points are computed as "x" values from "x_tbl" metadata which lie inside a region (if there are no such points, output is 0). If f has "continuous" type, the whole intervals are used as relevant points. The notion of "height" comes from summ_hdr() function: if region is summ_hdr(f, level) for some level, then region_height(region, f) is what summ_hdr()'s docs call the "target height" of HDR. That is, the maximum value of the d-function such that the set of points at which the d-function is not less than that value has total probability not less than level.
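The idea of "height" can be sketched with base R densities. This is an illustration only, not how the package computes it: the continuous case below approximates the minimum on a grid, which is an assumption of this sketch.

```r
# Discrete case: minimum probability among values inside the region,
# or 0 if no value lies inside
x <- 0:10
prob <- dbinom(x, size = 10, prob = 0.7)
in_region <- x >= 6 & x <= 8
height_dis <- if (any(in_region)) min(prob[in_region]) else 0

# Continuous case: minimum of the density over the whole interval,
# approximated here on a dense grid that includes both ends
grid <- seq(-1.96, 1.96, length.out = 1001)
height_con <- min(dnorm(grid))
```

For the unimodal normal density the minimum over an interval is attained at an end, so `height_con` agrees with `dnorm(1.96)` up to floating-point error.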

region_width() computes total width of a region, i.e. sum of differences between "right" and "left" columns.
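As a quick illustration in base R (a sketch of the formula, not a call to the package):

```r
# A region made of the interval [0, 1] and the single point 2
region <- data.frame(left = c(0, 2), right = c(1, 2))

# Total width: sum of differences between "right" and "left" columns
total_width <- sum(region[["right"]] - region[["left"]])
```

The zero-width interval contributes nothing, so the total width equals the width of [0, 1].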

region_distance() computes distance between a pair of regions. As in summ_distance(), it is a single non-negative number representing how much two regions differ from one another (bigger values indicate bigger difference). Argument method represents method of computing distance. Method "Jaccard" computes Jaccard distance: one minus ratio of intersection width and union width. Other methods come from summ_distance() and represent distance between regions as probability distributions:

• If total width of region is zero (i.e. it consists only of points), distribution is a uniform discrete one based on points from region.

• If total width is positive, then distribution is a uniform continuous one based on intervals with positive width.
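The "Jaccard" method described above can be sketched in base R. `jaccard_sketch()` is a hypothetical helper, not the package's code. It relies on the intervals within each region not overlapping, and it does not handle the case where the union has zero width.

```r
# Hypothetical helper: one minus ratio of intersection width and union width
jaccard_sketch <- function(region, region2) {
  width <- function(r) sum(r[["right"]] - r[["left"]])
  # Overlap of every interval of `region` with every interval of `region2`
  lower <- outer(region[["left"]], region2[["left"]], pmax)
  upper <- outer(region[["right"]], region2[["right"]], pmin)
  # Negative differences mean the pair of intervals does not overlap
  inter_width <- sum(pmax(upper - lower, 0))
  union_width <- width(region) + width(region2) - inter_width
  1 - inter_width / union_width
}

region1 <- data.frame(left = c(0, 2), right = c(1, 2))
region2 <- data.frame(left = 0.5, right = 1.5)
jaccard_sketch(region1, region2)
```

For these two regions the intersection is [0.5, 1] with width 0.5 and the union has width 1.5, so the distance is 1 - 0.5 / 1.5 = 2/3.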

region_draw() draws (on current plot) intervals stored in region as colored rectangles vertically starting from zero and ending in the top of the plot (technically, at "y" value of 2e8).
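A comparable effect can be obtained with base graphics. This is a sketch under assumptions: `draw_sketch()` is a hypothetical helper, and the use of `adjustcolor()` and the absence of rectangle borders are choices made here, not confirmed details of region_draw().

```r
# Hypothetical helper: draws region intervals on the current plot
draw_sketch <- function(region, col = "blue", alpha = 0.2) {
  # Make the color semi-transparent
  fill <- grDevices::adjustcolor(col, alpha.f = alpha)
  # Rectangles start at zero and extend far above any realistic plot top
  graphics::rect(
    xleft = region[["left"]], ybottom = 0,
    xright = region[["right"]], ytop = 2e8,
    col = fill, border = NA
  )
  invisible(fill)
}

# Usage: a plot must already exist before rectangles are added
grDevices::pdf(NULL)
plot(dnorm, from = -3, to = 3)
fill_used <- draw_sketch(data.frame(left = -1.96, right = 1.96))
invisible(grDevices::dev.off())
```

The null PDF device is used only so that the example runs without opening a window; in an interactive session the plot() and draw_sketch() calls alone are enough.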

## See also

summ_hdr() for computing of Highest Density Region.

summ_interval() for computing of single interval summary of distribution.

## Examples

# Type "discrete"
d_binom <- as_d(dbinom, size = 10, prob = 0.7)
hdr_dis <- summ_hdr(d_binom, level = 0.6)
region_is_in(hdr_dis, 0:10)
#>  [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE
## This should be not less than 0.6
region_prob(hdr_dis, d_binom)
#> [1] 0.7004233
region_height(hdr_dis, d_binom)
#> [1] 0.2001209
region_width(hdr_dis)
#> [1] 2
# Type "continuous"
d_norm <- as_d(dnorm)
hdr_con <- summ_hdr(d_norm, level = 0.95)
region_is_in(hdr_con, c(-Inf, -2, 0, 2, Inf))
#> [1] FALSE FALSE  TRUE FALSE FALSE
## This should be approximately equal to 0.95
region_prob(hdr_con, d_norm)
#> [1] 0.9500426
## This should be equal to d_norm(hdr_con[["left"]][1])
region_height(hdr_con, d_norm)
#> [1] 0.05840531
region_width(hdr_con)
#> [1] 3.920624
# Usage of *_closed options
region <- data.frame(left = 1, right = 3)
## Closed intervals
region_is_in(region, 1:3)
#> [1] TRUE TRUE TRUE
## Open from left, closed from right
region_is_in(region, 1:3, left_closed = FALSE)
#> [1] FALSE  TRUE  TRUE
## Closed from left, open from right
region_is_in(region, 1:3, right_closed = FALSE)
#> [1]  TRUE  TRUE FALSE
## Open intervals
region_is_in(region, 1:3, left_closed = FALSE, right_closed = FALSE)
#> [1] FALSE  TRUE FALSE
# Handling of intervals with zero width
region <- data.frame(left = 1, right = 1)
## If at least one of *_closed options is TRUE, 1 will be considered as
## "in a region"
region_is_in(region, 1)
#> [1] TRUE
region_is_in(region, 1, left_closed = FALSE)
#> [1] TRUE
region_is_in(region, 1, right_closed = FALSE)
#> [1] TRUE
## Only this will return FALSE
region_is_in(region, 1, left_closed = FALSE, right_closed = FALSE)
#> [1] FALSE
# Distance between regions
region1 <- data.frame(left = c(0, 2), right = c(1, 2))
region2 <- data.frame(left = 0.5, right = 1.5)
region_distance(region1, region2, method = "Jaccard")
#> [1] 0.6666667
region_distance(region1, region2, method = "KS")
#> [1] 0.5
# Drawing
d_mix <- form_mix(list(as_d(dnorm), as_d(dnorm, mean = 5)))
plot(d_mix)
region_draw(summ_hdr(d_mix, 0.95))