Modify the grid of a pdqr-function (the rows of its "x_tbl" metadata) to increase (upgrid) or decrease (downgrid) its granularity using the method of choice. Upgridding can be useful for obtaining more information during certain types of transformations. Downgridding can be useful for reducing the amount of memory needed to store a pdqr-function without losing much information.

form_regrid(f, n_grid, method = "x")

Arguments

f

A pdqr-function.

n_grid

A desired number of grid elements in output.

method

Regrid method. Should be one of "x" or "q".

Value

A pdqr-function with modified grid.

Details

The goal here is to create a pdqr-function that is reasonably similar to f and has n_grid rows in "x_tbl" metadata.

The general regridding algorithm is as follows:

  • Compute the reference grid. For method "x" it is a sequence of equidistant points between the edges of f's support. For method "q" it is a sequence of quantiles for equidistant probabilities from 0 to 1. The reference grid has length n_grid for both methods.

  • Adjust f's grid to the reference one. How this is done depends on f's type and on which kind of regridding is performed (upgridding is the case when n_grid is strictly greater than the number of rows in "x_tbl" metadata, downgridding is the case when it is strictly less):

    • Type "discrete":

      • UPgridding "discrete" functions is not possible, because input "discrete" functions are assumed not to have any "x" values other than those already present. In this case the input is returned; this is the only case in which the output doesn't have the desired n_grid rows in "x_tbl" metadata.

      • DOWNgridding "discrete" functions is done by computing the nearest match of the reference grid to f's grid and collapsing (by summing probabilities) all "x" values from the input to the nearest matched ones. Here "computing the nearest match" means that every element of the reference grid is matched one-to-one with an element of a subset of the unique "x" values of f. Matching is done in a greedy iterative fashion in order to minimize the total distance between the reference grid and the matched subset. Note that this can produce a suboptimal match (one whose total distance is not minimal) and can take a while to compute in some cases.

    • Type "continuous":

      • UPgridding "continuous" functions is done by adding rows to the "x_tbl" metadata with "x" values equal to those elements of the reference grid that are furthest from the input "x" grid viewed as a set. The distance from a point to a set is the minimum of the distances between the point and all points of the set. Values of the "y" and "cumprob" columns are taken from the d- and p-functions corresponding to f.

      • DOWNgridding "continuous" functions is done by computing nearest match of reference grid to f's one (as for "discrete" type) and removing all unmatched rows from "x_tbl" metadata.
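The steps above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper collapse_to_grid() is hypothetical, it uses a plain nearest-neighbour assignment instead of the greedy one-to-one matching, and it assumes that every reference point already coincides with an existing "x" value (true for the "discrete" data from the Examples section). The "continuous" part uses n_grid = 5 rather than the value from the Examples to avoid ties between candidate points, and assumes a piecewise-linear density.

```r
# --- "discrete" downgridding (data from the Examples section) ---
x <- as.numeric(1:10)
prob <- 1:10 / 55
n_grid <- 4

# Reference grid, method "x": equidistant points across the support
ref_x <- seq(min(x), max(x), length.out = n_grid)

# Reference grid, method "q": quantiles at equidistant probabilities.
# A discrete quantile is the smallest "x" whose cumulative probability
# reaches p; the small tolerance guards against floating-point error.
cumprob <- cumsum(prob)
p_grid <- seq(0, 1, length.out = n_grid)
ref_q <- vapply(
  p_grid,
  function(p) x[which(cumprob >= p - 1e-12)[1]],
  numeric(1)
)

# Hypothetical helper: sum probabilities onto the nearest reference point
# (assumes each reference point is itself one of the input "x" values)
collapse_to_grid <- function(x, prob, ref) {
  nearest <- vapply(x, function(xi) which.min(abs(ref - xi)), integer(1))
  new_prob <- as.numeric(tapply(prob, factor(nearest, seq_along(ref)), sum))
  data.frame(x = ref, prob = new_prob, cumprob = cumsum(new_prob))
}

down_x <- collapse_to_grid(x, prob, ref_x)  # "x" column: 1, 4, 7, 10
down_q <- collapse_to_grid(x, prob, ref_q)  # "x" column: 1, 6, 9, 10

# --- "continuous" upgridding (assumed piecewise-linear density) ---
cx <- c(1, 2, 3)
cy <- c(0.5, 0.5, 0.5)
n_up <- 5

ref_c <- seq(min(cx), max(cx), length.out = n_up)

# Distance from each reference point to the input grid as a set
dist_to_set <- vapply(ref_c, function(r) min(abs(cx - r)), numeric(1))

# Add the reference points that are furthest from the input grid
n_add <- n_up - length(cx)
add_x <- ref_c[order(dist_to_set, decreasing = TRUE)[seq_len(n_add)]]
new_x <- sort(c(cx, add_x))

# "y" from linear interpolation of the density, "cumprob" from the
# trapezoid rule (exact here because the density is piecewise-linear)
new_y <- approx(cx, cy, xout = new_x)$y
areas <- diff(new_x) * (head(new_y, -1) + tail(new_y, -1)) / 2
up_tbl <- data.frame(x = new_x, y = new_y, cumprob = c(0, cumsum(areas)))
```

For the "discrete" data, the probabilities in down_x and down_q agree with the meta_x_tbl() output shown in the Examples section; on inputs where reference points fall between existing "x" values, the package's greedy matching may give different results from this sketch.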

Special cases of n_grid:

  • If n_grid is the same as the number of rows in "x_tbl" metadata, then the input f is returned.

  • If n_grid is 1, the appropriate new_*() function is used with a single numeric input equal to the distribution's median.
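As a rough illustration of the n_grid = 1 case, the medians of the two distributions from the Examples section can be computed in base R. This is only a sketch: the variable names are hypothetical, and the "continuous" median is found by linear interpolation of "cumprob", which is exact only because the density in this example is constant.

```r
# "discrete" example: x = 1..10 with probabilities proportional to x
x_dis <- as.numeric(1:10)
prob_dis <- 1:10 / 55

# Median as the smallest "x" whose cumulative probability reaches 0.5
median_dis <- x_dis[which(cumsum(prob_dis) >= 0.5 - 1e-12)[1]]

# "continuous" example: constant density 0.5 on [1, 3]
x_con <- c(1, 2, 3)
cumprob_con <- c(0, 0.5, 1)

# Invert the (here linear) cumulative probability at 0.5
median_con <- approx(cumprob_con, x_con, xout = 0.5)$y

# With pdqr attached, the single-point function would then be built with
# a call such as the following (not run here, requires the package):
# new_d(median_dis, type = "discrete")
```

The values obtained, 7 and 2, correspond to the supports [7, 7] and ~[2, 2] reported in the Examples section.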

See also

form_resupport() for changing support of pdqr-function.

form_retype() for changing type of pdqr-function.

Other form functions: form_estimate(), form_mix(), form_resupport(), form_retype(), form_smooth(), form_tails(), form_trans()

Examples

# Type "discrete"
d_dis <- new_d(data.frame(x = 1:10, prob = 1:10 / 55), type = "discrete")

# Downgridding
meta_x_tbl(form_regrid(d_dis, n_grid = 4))
#>    x       prob    cumprob
#> 1  1 0.05454545 0.05454545
#> 2  4 0.21818182 0.27272727
#> 3  7 0.38181818 0.65454545
#> 4 10 0.34545455 1.00000000
meta_x_tbl(form_regrid(d_dis, n_grid = 4, method = "q"))
#>    x      prob   cumprob
#> 1  1 0.1090909 0.1090909
#> 2  6 0.4000000 0.5090909
#> 3  9 0.3090909 0.8181818
#> 4 10 0.1818182 1.0000000

# Upgridding for "discrete" type isn't possible. Input is returned
identical(d_dis, form_regrid(d_dis, n_grid = 100))
#> [1] TRUE

# Type "continuous"
# Downgridding
d_norm <- as_d(dnorm)
plot(d_norm)
lines(form_regrid(d_norm, n_grid = 10), col = "blue")
lines(form_regrid(d_norm, n_grid = 10, method = "q"), col = "green")

# Upgridding
d_con <- new_d(data.frame(x = 1:3, y = rep(0.5, 3)), type = "continuous")
meta_x_tbl(form_regrid(d_con, n_grid = 6))
#>     x   y cumprob
#> 1 1.0 0.5     0.0
#> 2 1.4 0.5     0.2
#> 3 2.0 0.5     0.5
#> 4 2.2 0.5     0.6
#> 5 2.6 0.5     0.8
#> 6 3.0 0.5     1.0

# Pdqr-function with center at median is returned in case `n_grid` is 1
form_regrid(d_dis, n_grid = 1)
#> Probability mass function of discrete type
#> Support: [7, 7] (1 element)

# Dirac-like function is returned
form_regrid(d_con, n_grid = 1)
#> Density function of continuous type
#> Support: ~[2, 2] (2 intervals)