Tools for getting metadata of a pdqr-function: a function which represents a
distribution with finite support and finite values of probability/density.
The key metadata which defines the underlying distribution is "x_tbl". If two
pdqr-functions have the same "x_tbl" metadata, they represent the same
distribution and can be converted to one another with the as_*() family of
functions.
meta_all(f)
meta_class(f)
meta_type(f)
meta_support(f)
meta_x_tbl(f)
f: A pdqr-function.
meta_all() returns a list of all metadata. meta_class(), meta_type(),
meta_support(), and meta_x_tbl() return the corresponding metadata.
Internally, storage of metadata is implemented as follows:
Pdqr class is the first "appropriate" ("p", "d", "q", or "r") S3 class of a
pdqr-function. All "proper" pdqr-functions have a full S3 class of the form
c(cl, "pdqr", "function"), where cl is the pdqr class.
Pdqr type, support, and "x_tbl" are stored in the function's environment.
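As a rough illustration of this storage scheme, here is a minimal base R sketch. It is not the package's implementation: new_toy_d() and its contents are hypothetical, and the sketch only mimics the idea of a class vector plus metadata kept in the function's enclosing environment.

```r
# Hypothetical toy constructor (not part of pdqr): returns a closure whose
# enclosing environment holds the metadata and whose class vector has the
# form c(cl, "pdqr", "function") with cl = "d".
new_toy_d <- function(x_tbl) {
  type <- "discrete"
  support <- range(x_tbl$x)

  f <- function(x) {
    idx <- match(x, x_tbl$x)
    ifelse(is.na(idx), 0, x_tbl$prob[idx])
  }
  class(f) <- c("d", "pdqr", "function")
  f
}

toy <- new_toy_d(data.frame(x = c(1, 2), prob = c(0.5, 0.5)))

toy_class <- class(toy)[1]             # first class plays the role of pdqr class
toy_type <- environment(toy)$type      # metadata read from the environment
toy_support <- environment(toy)$support
```

Reading metadata from the environment, as in the last three lines, is roughly what accessor functions such as meta_type() are described as doing above.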
Pdqr class is returned by meta_class(). This can be one of "p", "d", "q",
"r". It represents how a pdqr-function describes the underlying distribution:
P-function (i.e. of class "p") returns the value of the cumulative distribution
function (probability of the random variable being not more than a certain
value) at points q (its numeric vector input). Internally it is implemented as
direct integration of the corresponding (with the same "x_tbl" metadata)
d-function.
D-function returns the value of the probability mass or density function
(depending on pdqr type) at points x (its numeric vector input). Internally it
is implemented by directly using "x_tbl" metadata (see section '"x_tbl"
metadata' for more details).
Q-function returns the value of the quantile function at points p (its numeric
vector input). Internally it is implemented as the inverse of the corresponding
p-function (returns the smallest "x" value which has cumulative probability
not less than input).
R-function generates a random sample of size n (its single number input)
from the distribution. Internally it is implemented using inverse transform
sampling: a certain number of points is generated from the standard uniform
distribution, and the output is the values of the corresponding q-function at
the generated points.
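The relationship between the four classes can be sketched in base R for a small discrete distribution. This is only an illustrative toy under stated assumptions: the table values and the names toy_tbl, toy_d, toy_p, toy_q, and toy_r are hypothetical, and the real package code is more general (in particular, it also handles the "continuous" type).

```r
# Hypothetical "x_tbl"-like table for a discrete distribution
toy_tbl <- data.frame(x = c(1, 2, 4), prob = c(0.25, 0.5, 0.25))
toy_tbl$cumprob <- cumsum(toy_tbl$prob)

# d: probability mass at x, zero for values not present in the "x" column
toy_d <- function(x) {
  idx <- match(round(x, digits = 10), round(toy_tbl$x, digits = 10))
  ifelse(is.na(idx), 0, toy_tbl$prob[idx])
}

# p: probability of being not more than q (sum of masses up to q)
toy_p <- function(q) {
  vapply(q, function(q_i) sum(toy_tbl$prob[toy_tbl$x <= q_i]), numeric(1))
}

# q: smallest "x" whose cumulative probability is not less than p
toy_q <- function(p) {
  vapply(
    p,
    function(p_i) toy_tbl$x[which(toy_tbl$cumprob >= p_i)[1]],
    numeric(1)
  )
}

# r: inverse transform sampling, i.e. q-function applied to uniform draws
toy_r <- function(n) toy_q(runif(n))

toy_sample <- toy_r(20)
```

For a "continuous" type the sum in toy_p() would be replaced by integration of the piecewise-linear density.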
The class names "p", "d", "q", and "r" are chosen to follow the base R
convention of naming distribution functions. All pdqr-functions take only one
argument, with the same meaning as the first argument of their base R
counterparts. They have no other arguments specific to parameters of a
distribution family. To emulate the other common arguments of base R
distribution functions, use the following transformations (here d_f means a
function of class "d", etc.):
For d_f(x, log = TRUE) use log(d_f(x)).
For p_f(q, lower.tail = FALSE) use 1 - p_f(q).
For p_f(q, log.p = TRUE) use log(p_f(q)).
For q_f(p, lower.tail = FALSE) use q_f(1 - p).
For q_f(p, log.p = TRUE) use q_f(exp(p)).
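These transformations can be checked against base R distribution functions, which do have the extra arguments. The sketch below uses the normal distribution from the stats package purely as a stand-in; the vectors q and p and the name checks are illustrative choices.

```r
q <- c(-1, 0, 1.5)
p <- c(0.1, 0.5, 0.9)

# Each entry compares a base R call with an extra argument against the
# transformation suggested for pdqr-functions
checks <- c(
  d_log   = isTRUE(all.equal(dnorm(q, log = TRUE), log(dnorm(q)))),
  p_upper = isTRUE(all.equal(pnorm(q, lower.tail = FALSE), 1 - pnorm(q))),
  p_log   = isTRUE(all.equal(pnorm(q, log.p = TRUE), log(pnorm(q)))),
  q_upper = isTRUE(all.equal(qnorm(p, lower.tail = FALSE), qnorm(1 - p))),
  q_log   = isTRUE(all.equal(qnorm(log(p), log.p = TRUE), qnorm(exp(log(p)))))
)
checks
```

Note that the direct transformations can lose precision far in the tails, which is the reason base R offers dedicated arguments in the first place.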
Pdqr type is returned by meta_type(). This can be one of "discrete" or
"continuous". It represents the type of the underlying distribution:
Type "discrete" is used for distributions with a finite number of outcomes. Functions with "discrete" type have a fixed set of "x" values ("x" column in "x_tbl" metadata) on which the d-function returns possibly non-zero output (values from "prob" column in "x_tbl" metadata).
Type "continuous" is used to represent continuous distributions with piecewise-linear density with finite values and on finite support. Density goes through points defined by "x" and "y" columns in "x_tbl" metadata.
Pdqr support is returned by meta_support(). This is a numeric vector with
two finite values. It represents the support of the underlying distribution: a
closed interval, outside of which the d-function is equal to zero. Note that
inside of the support the d-function can also be zero, which is especially true
for "discrete" functions.
Technically, pdqr support is the range of values from the "x" column of "x_tbl" metadata.
Metadata "x_tbl" is returned by meta_x_tbl(). This is the key metadata, which
completely defines the distribution. It is a data frame with three numeric
columns, the content of which partially depends on pdqr type.
Type "discrete" functions have "x_tbl" with columns "x", "prob", "cumprob".
D-functions return a value from the "prob" column for input which is very near
(equal up to ten digits, as defined by round(*, digits = 10)) to the
corresponding value of the "x" column. Rounding is done to account for issues
with representation of numerical values (see the Note section of the help page
for ==). For any other input, d-functions return zero.
Type "continuous" functions have "x_tbl" with columns "x", "y", "cumprob". D-functions return the value of a piecewise-linear function passing through points that have "x" and "y" coordinates. For any value outside the support (i.e. strictly less than minimum "x" or strictly more than maximum "x") output is zero.
Column "cumprob" always represents the probability of underlying random variable being not more than corresponding value in "x" column.
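For the "continuous" case, the description above can be sketched in base R with linear interpolation. This is an illustration under assumptions, not the package's code: tri_tbl, tri_d, and tri_support are hypothetical names, and the table describes a triangular density chosen so that it integrates to one.

```r
# Hypothetical "x_tbl"-like table: triangular density on [0, 2]
tri_tbl <- data.frame(x = c(0, 1, 2), y = c(0, 1, 0))

# "cumprob": area under the piecewise-linear density up to each "x",
# accumulated with the trapezoid rule (exact for piecewise-linear functions)
trapezoids <- diff(tri_tbl$x) * (head(tri_tbl$y, -1) + tail(tri_tbl$y, -1)) / 2
tri_tbl$cumprob <- c(0, cumsum(trapezoids))

# d: linear interpolation through ("x", "y") points, zero outside the support
tri_d <- function(x) {
  approx(tri_tbl$x, tri_tbl$y, xout = x, yleft = 0, yright = 0)$y
}

# Support is the range of the "x" column
tri_support <- range(tri_tbl$x)

tri_d(c(-1, 0.5, 1, 1.5, 3))
```

The last "cumprob" value equals one because the whole probability lies within the support.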
Metadata of pdqr-functions is not meant to be changed directly. Moreover, changing pdqr type, support, or "x_tbl" metadata completely changes the underlying distribution.
To change pdqr class, for example to convert a p-function to a d-function,
use the as_*() family of functions: as_p(), as_d(), as_q(), as_r().
To change pdqr type, use form_retype(). It changes the underlying
distribution in the way most suitable for the user.
To change pdqr support, use form_resupport() or form_tails().
Changing "x_tbl" metadata is not possible, because it essentially means
creating a completely new pdqr-function. To do that, supply a data frame in
"x_tbl" format suitable for the desired "type" to the appropriate new_*()
function: new_p(), new_d(), new_q(), new_r(). Also, there is a form_regrid()
function which will increase or decrease the granularity of a pdqr-function.
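A short sketch of these routes is given below. It assumes the pdqr package is installed (the code is skipped otherwise), and the particular support, grid size, and table values are arbitrary illustrative choices rather than recommended settings.

```r
if (requireNamespace("pdqr", quietly = TRUE)) {
  library(pdqr)

  d_unif <- as_d(dunif)

  # Change pdqr class: the distribution ("x_tbl") stays the same
  p_unif <- as_p(d_unif)
  new_class <- meta_class(p_unif)

  # Change pdqr type: produces a different, "discrete" distribution
  d_dis <- form_retype(d_unif, type = "discrete")
  new_type <- meta_type(d_dis)

  # Change pdqr support
  d_narrow <- form_resupport(d_unif, c(0.2, 0.8))
  new_support <- meta_support(d_narrow)

  # Change granularity of "x_tbl"
  d_coarse <- form_regrid(d_unif, n_grid = 11)
  new_n_rows <- nrow(meta_x_tbl(d_coarse))

  # Create a completely new pdqr-function from a data frame with "x" and "y"
  d_tri <- new_d(data.frame(x = c(0, 1, 2), y = c(0, 1, 0)), type = "continuous")
}
```

Each call returns a new pdqr-function, so the original d_unif is left unchanged.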
d_unif <- as_d(dunif)
str(meta_all(d_unif))
#> List of 4
#> $ class : chr "d"
#> $ type : chr "continuous"
#> $ support: num [1:2] 0 1
#> $ x_tbl :'data.frame': 10001 obs. of 3 variables:
#> ..$ x : num [1:10001] 0e+00 1e-04 2e-04 3e-04 4e-04 5e-04 6e-04 7e-04 8e-04 9e-04 ...
#> ..$ y : num [1:10001] 1 1 1 1 1 1 1 1 1 1 ...
#> ..$ cumprob: num [1:10001] 0e+00 1e-04 2e-04 3e-04 4e-04 5e-04 6e-04 7e-04 8e-04 9e-04 ...
meta_class(d_unif)
#> [1] "d"
meta_type(d_unif)
#> [1] "continuous"
meta_support(d_unif)
#> [1] 0 1
head(meta_x_tbl(d_unif))
#> x y cumprob
#> 1 0e+00 1 0e+00
#> 2 1e-04 1 1e-04
#> 3 2e-04 1 2e-04
#> 4 3e-04 1 3e-04
#> 5 4e-04 1 4e-04
#> 6 5e-04 1 5e-04