Tools for getting metadata of a pdqr-function: a function which represents a
distribution with finite support and finite values of probability/density.
The key metadata which defines the underlying distribution is "x_tbl". If two
pdqr-functions have the same "x_tbl" metadata, they represent the same
distribution and can be converted to one another with the
as_*() family of functions.
meta_all(f)
meta_class(f)
meta_type(f)
meta_support(f)
meta_x_tbl(f)
meta_all() returns a list of all metadata.
meta_class(), meta_type(), meta_support(), and meta_x_tbl() return the corresponding metadata.
Internally, storage of metadata is implemented as follows:
Pdqr class is the first "appropriate" ("p", "d", "q", or "r") S3 class of a
pdqr-function. All "proper" pdqr-functions have a full S3 class of the form
c(cl, "pdqr", "function"), where
cl is the pdqr class.
Pdqr type, support, and "x_tbl" are stored in the function's environment.
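The storage scheme described above can be imitated in base R. The following is a minimal sketch, not pdqr's actual implementation: the helper make_toy_d() and the variable names kept in the environment are assumptions made purely for illustration.

```r
# Toy imitation of the storage scheme (hypothetical helper, not part of pdqr)
make_toy_d <- function(x_tbl, type) {
  force(type)
  support <- range(x_tbl[["x"]])
  # Closure: `type`, `support`, and `x_tbl` live in the function's environment
  f <- function(x) {
    # Return "prob" for exact matches of "x" and zero elsewhere
    res <- x_tbl[["prob"]][match(x, x_tbl[["x"]])]
    res[is.na(res)] <- 0
    res
  }
  # Full S3 class: pdqr class first, then "pdqr" and "function"
  class(f) <- c("d", "pdqr", "function")
  f
}

x_tbl <- data.frame(x = c(1, 2, 3), prob = c(0.2, 0.5, 0.3))
x_tbl[["cumprob"]] <- cumsum(x_tbl[["prob"]])
toy_d <- make_toy_d(x_tbl, type = "discrete")

class(toy_d)[1]                  # the first S3 class plays the role of pdqr class
environment(toy_d)[["type"]]     # metadata is read from the environment
environment(toy_d)[["support"]]
```

Keeping metadata in the environment rather than in attributes means it travels with the closure and is not exposed for casual modification.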
Pdqr class is returned by
meta_class(). This can be one of "p", "d", "q",
"r". It represents how a pdqr-function describes the underlying distribution:
P-function (i.e. of class "p") returns value of cumulative distribution
function (probability of random variable being not more than certain value)
at points q (its numeric vector input). Internally it is implemented as
direct integration of corresponding (with the same "x_tbl" metadata)
d-function.
D-function returns value of probability mass or density function (depending
on pdqr type) at points
x (its numeric vector input). Internally it is
implemented by directly using "x_tbl" metadata (see section '"x_tbl"
metadata' for more details).
Q-function returns value of quantile function at points
p (its numeric
vector input). Internally it is implemented as inverse of corresponding
p-function (returns the smallest "x" value which has cumulative probability
not less than input).
R-function generates random sample of size
n (its single number input)
from distribution. Internally it is implemented using inverse transform
sampling: certain amount of points from standard uniform distribution is generated, and the output is values of
corresponding q-function at generated points.
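The relationship between the four classes can be sketched in base R for a small "discrete" distribution. This is an illustration under stated assumptions, not pdqr's code: the names toy_p(), toy_q(), and toy_r() are hypothetical, and the small tolerance in toy_q() is an addition of this sketch.

```r
# "x_tbl" of a toy discrete distribution
x_tbl <- data.frame(x = c(1, 2, 3), prob = c(0.2, 0.5, 0.3))
x_tbl[["cumprob"]] <- cumsum(x_tbl[["prob"]])

# P-function: probability of the random variable being not more than `q`
toy_p <- function(q) {
  # Number of "x" values that are <= q (0 means "to the left of support")
  ind <- findInterval(q, x_tbl[["x"]])
  c(0, x_tbl[["cumprob"]])[ind + 1]
}

# Q-function: smallest "x" with cumulative probability not less than `p`
toy_q <- function(p) {
  vapply(p, function(cur_p) {
    # Small tolerance guards against floating point error in "cumprob"
    x_tbl[["x"]][which(x_tbl[["cumprob"]] >= cur_p - 1e-12)[1]]
  }, numeric(1))
}

# R-function: inverse transform sampling
toy_r <- function(n) {
  # Standard uniform points are mapped through the q-function
  toy_q(runif(n))
}

toy_p(c(0.5, 1, 2.5, 10))
toy_q(c(0.2, 0.21, 1))
set.seed(101)
table(toy_r(1000))
```

With a large sample, the frequencies in the final table should be close to the "prob" column.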
These names are chosen to follow the base R convention of naming distribution functions. All
pdqr-functions take only one argument, with the same meaning as the first argument of their
base R counterparts. They have no other arguments specific to parameters of a
distribution family. To emulate other common arguments of base R functions, use the
following transformations (here
d_f means a function of class "d", etc.):
For d_f(x, log = TRUE) use log(d_f(x)).
For p_f(q, lower.tail = FALSE) use 1 - p_f(q).
For p_f(q, log.p = TRUE) use log(p_f(q)).
For q_f(p, lower.tail = FALSE) use q_f(1 - p).
For q_f(p, log.p = TRUE) use q_f(exp(p)).
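The emulation of the log, lower.tail, and log.p arguments can be checked numerically. In the sketch below, base R normal distribution functions with default parameters stand in for pdqr-functions (an assumption made so the example runs without extra packages); d_f, p_f, and q_f are single-argument wrappers in the spirit of the text.

```r
# Single-argument stand-ins for pdqr-functions (illustrative assumption)
d_f <- function(x) dnorm(x)
p_f <- function(q) pnorm(q)
q_f <- function(p) qnorm(p)

x <- c(-1, 0, 2)
p <- c(0.1, 0.5, 0.9)

# Each comparison puts the emulation next to the base R argument
all.equal(log(d_f(x)), dnorm(x, log = TRUE))
all.equal(1 - p_f(x), pnorm(x, lower.tail = FALSE))
all.equal(log(p_f(x)), pnorm(x, log.p = TRUE))
all.equal(q_f(1 - p), qnorm(p, lower.tail = FALSE))
all.equal(q_f(exp(log(p))), qnorm(log(p), log.p = TRUE))
```

The emulations agree with base R up to numerical tolerance, although base R's dedicated arguments can be more accurate far in the tails.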
Pdqr type is returned by
meta_type(). This can be one of "discrete" or "continuous". It represents the type of the underlying
distribution:
Type "discrete" is used for distributions with a finite number of outcomes. Functions with "discrete" type have a fixed set of "x" values ("x" column in "x_tbl" metadata) on which the d-function returns possibly non-zero output (values from "prob" column in "x_tbl" metadata).
Type "continuous" is used to represent continuous distributions with piecewise-linear density with finite values and on finite support. Density goes through points defined by "x" and "y" columns in "x_tbl" metadata.
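A piecewise-linear density going through given points can be sketched with base R's approx(). This is only an illustration of the idea: the function name toy_d() and the triangular example are assumptions, and pdqr's own implementation may differ.

```r
# Points through which the density passes: triangular density on [0, 2]
x_tbl <- data.frame(x = c(0, 1, 2), y = c(0, 1, 0))

# D-function: linear interpolation between points, zero outside of them
toy_d <- function(x) {
  approx(x_tbl[["x"]], x_tbl[["y"]], xout = x, yleft = 0, yright = 0)$y
}

toy_d(c(-1, 0.5, 1, 1.5, 3))

# Total probability via the trapezoid rule (exact for piecewise-linear density)
area <- sum(diff(x_tbl[["x"]]) * (head(x_tbl[["y"]], -1) + tail(x_tbl[["y"]], -1)) / 2)
area
```

A valid density needs the total area to equal one, which holds for this triangular example.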
Pdqr support is returned by
meta_support(). This is a numeric vector with
two finite values. Represents support of underlying distribution: closed
interval, outside of which the d-function is equal to zero. Note that inside
the support the d-function can also be zero, which is especially true for "discrete"
functions.
Technically, pdqr support is the range of values from the "x" column of "x_tbl" metadata.
Metadata "x_tbl" is returned by
meta_x_tbl(). This is a key metadata which
completely defines distribution. It is a data frame with three numeric
columns, content of which partially depends on pdqr type.
Type "discrete" functions have "x_tbl" with columns "x", "prob", "cumprob".
D-functions return a value from the "prob" column for input which is very near
(should be equal up to ten digits, defined by round(*, digits = 10)) to the corresponding value of the "x" column. Rounding is done to
account for issues with representation of numerical values (see the Note section
of =='s help page). For any other input, d-functions return zero.
Type "continuous" functions have "x_tbl" with columns "x", "y", "cumprob". D-functions return a value of the piecewise-linear function passing through points that have "x" and "y" coordinates. For any value outside the support (i.e. strictly less than minimum "x" or strictly more than maximum "x") output is zero.
Column "cumprob" always represents the probability of underlying random variable being not more than corresponding value in "x" column.
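The effect of rounding to ten digits in a "discrete" d-function can be seen in a small base R sketch. The function toy_d() below is a hypothetical imitation written for this example and is not taken from pdqr's source.

```r
# Toy "x_tbl" of a discrete type with all three columns
x_tbl <- data.frame(x = c(0.1, 0.3, 0.5), prob = c(0.25, 0.5, 0.25))
x_tbl[["cumprob"]] <- cumsum(x_tbl[["prob"]])

# D-function: match input to "x" after rounding both to ten digits
toy_d <- function(x) {
  ind <- match(round(x, digits = 10), round(x_tbl[["x"]], digits = 10))
  res <- x_tbl[["prob"]][ind]
  # Inputs that match no "x" value get zero
  res[is.na(res)] <- 0
  res
}

0.1 + 0.2 == 0.3    # FALSE because of floating point representation
toy_d(0.1 + 0.2)    # still treated as the "x" value 0.3
toy_d(c(0.3 + 1e-12, 0.3 + 1e-6, 0.4))
```

Inputs that differ from an "x" value only beyond the tenth digit are matched, while a difference of 1e-6 is not.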
Metadata of pdqr-functions is not meant to be changed directly. Also, changing pdqr type, support, or "x_tbl" metadata leads to a complete change of the underlying distribution.
To change pdqr type, use
form_retype(). It changes the underlying
distribution in the way most suitable for the user.
Changing "x_tbl" metadata is not possible, because it effectively means
creating a completely new pdqr-function. To do that, supply a data frame in the
"x_tbl" format suitable for the desired "type" to an appropriate constructor function such as
new_r(). Also, there is a
function which will increase or decrease granularity of a pdqr-function.
#> List of 4
#>  $ class  : chr "d"
#>  $ type   : chr "continuous"
#>  $ support: num [1:2] 0 1
#>  $ x_tbl  :'data.frame': 10001 obs. of 3 variables:
#>   ..$ x      : num [1:10001] 0e+00 1e-04 2e-04 3e-04 4e-04 5e-04 6e-04 7e-04 8e-04 9e-04 ...
#>   ..$ y      : num [1:10001] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ cumprob: num [1:10001] 0e+00 1e-04 2e-04 3e-04 4e-04 5e-04 6e-04 7e-04 8e-04 9e-04 ...
meta_class(d_unif)
#> [1] "d"
meta_type(d_unif)
#> [1] "continuous"
meta_support(d_unif)
#> [1] 0 1
head(meta_x_tbl(d_unif))
#>       x y cumprob
#> 1 0e+00 1   0e+00
#> 2 1e-04 1   1e-04
#> 3 2e-04 1   2e-04
#> 4 3e-04 1   3e-04
#> 5 4e-04 1   4e-04
#> 6 5e-04 1   5e-04