Tools for getting metadata of a pdqr-function: a function which represents a distribution with finite support and finite values of probability/density. The key metadata which defines the underlying distribution is "x_tbl". If two pdqr-functions have the same "x_tbl" metadata, they represent the same distribution and can be converted to one another with the as_*() family of functions.

meta_all(f)

meta_class(f)

meta_type(f)

meta_support(f)

meta_x_tbl(f)

Arguments

f

A pdqr-function.

Value

meta_all() returns a list of all metadata. meta_class(), meta_type(), meta_support(), and meta_x_tbl() return the corresponding metadata.

Details

Internally, storage of metadata is implemented as follows:

  • Pdqr class is a first "appropriate" ("p", "d", "q", or "r") S3 class of pdqr-function. All "proper" pdqr-functions have full S3 class of the form: c(cl, "pdqr", "function"), where cl is pdqr class.

  • Pdqr type, support, and "x_tbl" are stored in the function's environment.
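The class layout described above can be inspected directly. This is a minimal sketch that assumes the pdqr package is installed; the variable names full_class and f_env are only illustrative.

```r
library(pdqr)  # assumption: the pdqr package is installed

d_unif <- as_d(dunif)

# Full S3 class has the form c(cl, "pdqr", "function")
full_class <- class(d_unif)
full_class

# The first element is the pdqr class reported by meta_class()
identical(full_class[1], meta_class(d_unif))

# Type, support, and "x_tbl" are kept in the function's environment
f_env <- environment(d_unif)
is.environment(f_env)
```

Reading metadata through the meta_*() functions is preferable to reaching into the environment, since the getters are the documented interface.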

Pdqr class

Pdqr class is returned by meta_class(). This can be one of "p", "d", "q", "r". Represents how pdqr-function describes underlying distribution:

  • P-function (i.e. of class "p") returns value of cumulative distribution function (probability of random variable being not more than certain value) at points q (its numeric vector input). Internally it is implemented as direct integration of corresponding (with the same "x_tbl" metadata) d-function.

  • D-function returns value of probability mass or density function (depending on pdqr type) at points x (its numeric vector input). Internally it is implemented by directly using "x_tbl" metadata (see section '"x_tbl" metadata' for more details).

  • Q-function returns value of quantile function at points p (its numeric vector input). Internally it is implemented as inverse of corresponding p-function (returns the smallest "x" value which has cumulative probability not less than input).

  • R-function generates random sample of size n (its single number input) from distribution. Internally it is implemented using inverse transform sampling: certain amount of points from standard uniform distribution is generated, and the output is values of corresponding q-function at generated points.
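The four classes can be compared on one small distribution. The sketch below assumes the pdqr package is installed and uses a made-up three-point discrete distribution; the last lines imitate the inverse transform sampling idea by hand and are not a copy of the package internals.

```r
library(pdqr)  # assumption: the pdqr package is installed

# Illustrative discrete distribution on 1, 2, 3
d_f <- new_d(data.frame(x = c(1, 2, 3), prob = c(0.25, 0.5, 0.25)), type = "discrete")
p_f <- as_p(d_f)
q_f <- as_q(d_f)
r_f <- as_r(d_f)

d_f(2)    # probability mass at 2
p_f(2)    # probability of being not more than 2
q_f(0.5)  # smallest "x" with cumulative probability not less than 0.5

# Random sample of size 5
smpl <- r_f(5)

# Inverse transform sampling by hand: q-function applied to uniform draws
manual_smpl <- q_f(runif(5))
```

All four functions share the same "x_tbl" metadata, so converting between them does not change the distribution.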

These names are chosen so as to follow the base R convention of naming distribution functions. All pdqr-functions take only one argument, with the same meaning as the first argument of their base R counterparts. They have no other arguments specific to parameters of a distribution family. To emulate the other common arguments of base R functions, use the following transformations (here d_f means a function of class "d", etc.):

  • For d_f(x, log = TRUE) use log(d_f(x)).

  • For p_f(q, lower.tail = FALSE) use 1 - p_f(q).

  • For p_f(q, log.p = TRUE) use log(p_f(q)).

  • For q_f(p, lower.tail = FALSE) use q_f(1 - p).

  • For q_f(p, log.p = TRUE) use q_f(exp(p)).
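The transformations above can be checked against base R on a distribution both sides describe exactly. This sketch assumes the pdqr package is installed and builds a standard uniform distribution from a two-point "x_tbl"; the names d_f, p_f, and q_f follow the convention of the list.

```r
library(pdqr)  # assumption: the pdqr package is installed

# Standard uniform distribution as a piecewise-linear density
d_f <- new_d(data.frame(x = c(0, 1), y = c(1, 1)), type = "continuous")
p_f <- as_p(d_f)
q_f <- as_q(d_f)

# d_f(x, log = TRUE)
all.equal(log(d_f(0.3)), dunif(0.3, log = TRUE))

# p_f(q, lower.tail = FALSE) and p_f(q, log.p = TRUE)
all.equal(1 - p_f(0.3), punif(0.3, lower.tail = FALSE))
all.equal(log(p_f(0.3)), punif(0.3, log.p = TRUE))

# q_f(p, lower.tail = FALSE) and q_f(p, log.p = TRUE)
all.equal(q_f(1 - 0.3), qunif(0.3, lower.tail = FALSE))
all.equal(q_f(exp(log(0.3))), qunif(log(0.3), log.p = TRUE))
```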

Pdqr type

Pdqr type is returned by meta_type(). This can be one of "discrete" (meaning a finite number of outcomes) or "continuous". Represents type of underlying distribution:

  • Type "discrete" is used for distributions with a finite number of outcomes. Functions with "discrete" type have a fixed set of "x" values ("x" column in "x_tbl" metadata) on which d-function returns possibly non-zero output (values from "prob" column in "x_tbl" metadata).

  • Type "continuous" is used to represent continuous distributions with piecewise-linear density with finite values and on finite support. Density goes through points defined by "x" and "y" columns in "x_tbl" metadata.

Pdqr support

Pdqr support is returned by meta_support(). This is a numeric vector with two finite values. Represents support of underlying distribution: a closed interval, outside of which d-function is equal to zero. Note that inside of support d-function can also be zero, which is especially true for "discrete" functions.

Technically, pdqr support is range of values from "x" column of "x_tbl" metadata.
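The relation between support and the "x" column can be verified on a small example. This is a sketch assuming the pdqr package is installed; the distribution on 1, 2, 5 is invented for illustration.

```r
library(pdqr)  # assumption: the pdqr package is installed

d_dis <- new_d(data.frame(x = c(1, 2, 5), prob = c(0.25, 0.5, 0.25)), type = "discrete")

# Support is the range of the "x" column of "x_tbl"
supp <- meta_support(d_dis)
x_range <- range(meta_x_tbl(d_dis)[["x"]])
all.equal(supp, x_range)

# Inside support, d-function is zero at points not present in "x"
inside_val <- d_dis(3)

# Outside support, d-function is always zero
outside_val <- d_dis(c(0, 6))
```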

"x_tbl" metadata

Metadata "x_tbl" is returned by meta_x_tbl(). This is a key metadata which completely defines distribution. It is a data frame with three numeric columns, content of which partially depends on pdqr type.

Type "discrete" functions have "x_tbl" with columns "x", "prob", "cumprob". D-functions return a value from "prob" column for input which is very near (should be equal up to ten digits, defined by round(*, digits = 10)) to corresponding value of "x" column. Rounding is done to account for issues with representation of numerical values (see Note section of =='s help page). For any other input, d-functions return zero.

Type "continuous" functions have "x_tbl" with columns "x", "y", "cumprob". D-functions return a value of piecewise-linear function passing through points that have "x" and "y" coordinates. For any value outside support (i.e. strictly less than minimum "x" or strictly more than maximum "x") output is zero.

Column "cumprob" always represents the probability of underlying random variable being not more than corresponding value in "x" column.
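Both "x_tbl" formats can be seen side by side. The sketch below assumes the pdqr package is installed; the two distributions (a two-point discrete one and a triangular density on [0, 2]) are invented for illustration.

```r
library(pdqr)  # assumption: the pdqr package is installed

# Discrete: columns "x", "prob", "cumprob"
d_dis <- new_d(data.frame(x = c(1, 2), prob = c(0.25, 0.75)), type = "discrete")
dis_tbl <- meta_x_tbl(d_dis)
names(dis_tbl)

# Input equal to 1 up to ten digits is treated as 1; other inputs give zero
near_val <- d_dis(1 + 1e-12)
far_val <- d_dis(1.001)

# Continuous: columns "x", "y", "cumprob"
d_con <- new_d(data.frame(x = c(0, 1, 2), y = c(0, 1, 0)), type = "continuous")
con_tbl <- meta_x_tbl(d_con)
names(con_tbl)

# Density is linear between (0, 0) and (1, 1), and zero outside [0, 2]
mid_val <- d_con(0.5)
out_val <- d_con(3)

# "cumprob" is the probability of being not more than each "x"
con_tbl[["cumprob"]]
```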

Change of metadata

No metadata of a pdqr-function is meant to be changed directly. Also, a change of pdqr type, support, or "x_tbl" metadata leads to a complete change of the underlying distribution.

To change pdqr class, for example to convert p-function to d-function, use as_*() family of functions: as_p(), as_d(), as_q(), as_r().

To change pdqr type, use form_retype(). It changes the underlying distribution in the way most suitable for the user.

To change pdqr support, use form_resupport() or form_tails().

Change of "x_tbl" metadata is not possible, because basically it means creating completely new pdqr-function. To do that, supply data frame with "x_tbl" format suitable for desired "type" to appropriate new_*() function: new_p(), new_d(), new_q(), new_r(). Also, there is a form_regrid() function which will increase or decrease granularity of pdqr-function.
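The supported ways of changing metadata can be sketched as follows. This assumes the pdqr package is installed; the argument values (the [0.25, 0.75] support, the "trim" method, and the grid size of 101) are illustrative choices, not recommendations.

```r
library(pdqr)  # assumption: the pdqr package is installed

d_unif <- as_d(dunif)

# Change class: same distribution, different representation
p_unif <- as_p(d_unif)
meta_class(p_unif)

# Change type: "continuous" becomes "discrete"
d_dis <- form_retype(d_unif, type = "discrete")
meta_type(d_dis)

# Change support: keep only the part of the distribution on [0.25, 0.75]
d_trim <- form_resupport(d_unif, support = c(0.25, 0.75), method = "trim")
meta_support(d_trim)

# Change granularity: rebuild "x_tbl" on a coarser grid
d_coarse <- form_regrid(d_unif, n_grid = 101)
nrow(meta_x_tbl(d_coarse))
```

Each form_*() call returns a new pdqr-function and leaves the original d_unif untouched.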

Examples

d_unif <- as_d(dunif)
str(meta_all(d_unif))
#> List of 4
#>  $ class  : chr "d"
#>  $ type   : chr "continuous"
#>  $ support: num [1:2] 0 1
#>  $ x_tbl  :'data.frame': 10001 obs. of 3 variables:
#>   ..$ x      : num [1:10001] 0e+00 1e-04 2e-04 3e-04 4e-04 5e-04 6e-04 7e-04 8e-04 9e-04 ...
#>   ..$ y      : num [1:10001] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ cumprob: num [1:10001] 0e+00 1e-04 2e-04 3e-04 4e-04 5e-04 6e-04 7e-04 8e-04 9e-04 ...
meta_class(d_unif)
#> [1] "d"
meta_type(d_unif)
#> [1] "continuous"
meta_support(d_unif)
#> [1] 0 1
head(meta_x_tbl(d_unif))
#>       x y cumprob
#> 1 0e+00 1   0e+00
#> 2 1e-04 1   1e-04
#> 3 2e-04 1   2e-04
#> 4 3e-04 1   3e-04
#> 5 4e-04 1   4e-04
#> 6 5e-04 1   5e-04