Convert a function to a proper pdqr-function of a specific class, i.e. a function describing a distribution with finite support and finite values of probability/density.

```
as_p(f, ...)
# S3 method for default
as_p(f, support = NULL, ..., n_grid = 10001)
# S3 method for pdqr
as_p(f, ...)
as_d(f, ...)
# S3 method for default
as_d(f, support = NULL, ..., n_grid = 10001)
# S3 method for pdqr
as_d(f, ...)
as_q(f, ...)
# S3 method for default
as_q(f, support = NULL, ..., n_grid = 10001)
# S3 method for pdqr
as_q(f, ...)
as_r(f, ...)
# S3 method for default
as_r(f, support = NULL, ..., n_grid = 10001,
n_sample = 10000, args_new = list())
# S3 method for pdqr
as_r(f, ...)
```

| Argument | Description |
|---|---|
| f | Appropriate function to be converted (see Details). |
| ... | Extra arguments to |
| support | Numeric vector with two increasing elements describing desired support of output. If |
| n_grid | Number of grid points at which |
| n_sample | Number of points to sample from |
| args_new | List of extra arguments for |

A pdqr-function of corresponding class.

The general purpose of `as_*()` functions is to create a proper pdqr-function of the desired class from input that doesn't satisfy these conditions. The steps taken to achieve that goal are described below.

- **If `f` is already a pdqr-function**, `as_*()` functions properly update it to have the specific class. They take the input's "x_tbl" metadata and type and pass them to the corresponding `new_*()` function. For example, `as_p(f)` for a pdqr-function `f` is essentially the same as `new_p(x = meta_x_tbl(f), type = meta_type(f))`.
- **If `f` is a function describing an "honored" distribution**, it is detected and the output is created in a predefined way, taking into account extra arguments in `...`. For more details see the "Honored distributions" section.
- **If `f` is some other unknown function**, `as_*()` functions use heuristics to approximate the input distribution with a "proper" pdqr-function. In this case outputs of `as_*()` can only be pdqr-functions of type "continuous" (because of issues with support detection). It is assumed that `f` returns values appropriate for the desired output class of the `as_*()` function and output type "continuous". For example, input for `as_p()` should return values of some continuous cumulative distribution function (monotonically non-decreasing values from 0 to 1). To manually create a function of type "discrete", supply data frame input describing it to the appropriate `new_*()` function.

The general algorithm of how `as_*()` functions work for an unknown function is as follows:

- **Detect support**. See the "Support detection" section for more details.
- **Create data frame input for `new_*()`**. The exact process differs:
    - In `as_p()`, an equidistant grid of `n_grid` points is created inside the detected support. After that, the input's values at the grid are taken as reference points of the cumulative distribution function and used to *approximate* density at that same grid. This method proved to work more reliably when density goes to infinity. That grid and the density values are used as the "x" and "y" columns of data frame input for `new_p()`.
    - In `as_d()`, the "x" column of the data frame is the same equidistant grid as in `as_p()`. The "y" column is taken as the input's values at this grid after possibly imputing infinite values. This imputation is done by taking the maximum of the left and right linear extrapolations on the mentioned grid.
    - In `as_q()`, first the inverse of input `f` is computed on the [0; 1] interval. This is done by approximating `f` with a piecewise-linear function on an equidistant grid of `n_grid` points on [0; 1], imputing infinite values (which ensures finite support), and computing the inverse of that approximation. This inverse of `f` is used to create data frame input with `as_p()`.
    - In `as_r()`, first a d-function is created with `new_d()` based on the same sample used for support detection and the extra arguments supplied as a list in `args_new`. In other words, density estimation is done based on a sample generated from input `f`. After that, its values are used to create the data frame with `as_d()`.
- **Use the appropriate `new_*()` function** with the data frame from the previous step and `type = "continuous"`. This step implies that all tails outside the detected support are trimmed and the data frame is normalized to represent a proper piecewise-linear density.
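The data frame step described for `as_p()` can be illustrated in base R. This is only a minimal sketch and not pdqr's internal code: the helper name `cdf_to_density_grid()` is hypothetical, and the finite-difference scheme (central differences inside the grid, one-sided at the edges) is an assumption chosen for simplicity.

```r
# Hypothetical helper (not part of pdqr): approximate density on an
# equidistant grid using only values of a cumulative distribution function.
cdf_to_density_grid <- function(cdf, support, n_grid = 10001) {
  x <- seq(support[1], support[2], length.out = n_grid)
  p <- cdf(x)
  n <- length(x)

  y <- numeric(n)
  # One-sided differences at the edges (assumed scheme)
  y[1] <- (p[2] - p[1]) / (x[2] - x[1])
  y[n] <- (p[n] - p[n - 1]) / (x[n] - x[n - 1])
  # Central differences at inner grid points
  y[2:(n - 1)] <- (p[3:n] - p[1:(n - 2)]) / (x[3:n] - x[1:(n - 2)])
  # Guard against tiny negative values caused by rounding
  y <- pmax(y, 0)

  # Normalize so that the piecewise-linear density integrates to 1
  # (trapezoid rule), which also accounts for the trimmed tails
  total <- sum(diff(x) * (y[-1] + y[-n]) / 2)
  data.frame(x = x, y = y / total)
}

x_tbl <- cdf_to_density_grid(pnorm, support = c(-4.75, 4.75), n_grid = 1001)
head(x_tbl)
```

A table with "x" and "y" columns like `x_tbl` has the shape of data frame input that the text above says is passed on to `new_p()` with `type = "continuous"`.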

For efficient workflow, some commonly used distributions are recognized as special ("honored"). Those receive different treatment in `as_*()` functions.

There is a manually selected list of "honored" distributions, with enough information about each of them to detect it. Currently that list has all common univariate distributions from the 'stats' package, i.e. all except multinomial and "less common distributions of test statistics".

An "honored" distribution is **recognized only if `f` is one of the `p*()`, `d*()`, `q*()`, or `r*()` functions describing an honored distribution and is supplied as a variable with its original name**. For example, `as_d(dunif)` will be treated as an "honored" distribution but `as_d(function(x) {dunif(x)})` will not.

After it is recognized that input `f` represents an "honored" distribution, **its support is computed based on predefined rules**. Those take into account special features of the distribution (like infinite support or infinite density values) and extra arguments supplied in `...`. Usually the output support "loses" only around `1e-6` probability on each infinite tail.

After that, for "discrete" type output `new_d()` is used with the appropriate data frame input, and for "continuous" type output `as_d()` is used with the appropriate `d*()` function and support. The resulting d-function is then converted to the desired class with `as_*()`.
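The "original name" rule above can be illustrated with a small base R sketch. It is an assumption about how such name-based recognition could work, not pdqr's actual detection code: the helper `looks_honored()` and the short list of suffixes are hypothetical.

```r
# Hypothetical illustration (not part of pdqr): recognize a function only
# when it is supplied as a bare variable with a known original name.
honored_suffixes <- c("unif", "norm", "exp", "beta", "pois", "binom")
honored_names <- as.vector(outer(c("p", "d", "q", "r"), honored_suffixes, paste0))

looks_honored <- function(f) {
  # Capture the expression the caller typed, not the function's value
  f_expr <- substitute(f)
  if (!is.name(f_expr)) {
    # Anonymous functions and other calls carry no usable name
    return(FALSE)
  }
  as.character(f_expr) %in% honored_names
}

looks_honored(dunif)                  # TRUE: supplied with its original name
looks_honored(function(x) dunif(x))   # FALSE: anonymous wrapper
my_dunif <- dunif
looks_honored(my_dunif)               # FALSE: same function, different name
```

The last call shows why the name matters in a scheme like this: the function's value is identical to `dunif`, but only the variable name is inspected.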

When input is a function without any extra information, `as_*()` functions must know which finite support the output should have. The user can supply the desired support directly with the `support` argument, which can also be `NULL` (meaning automatic detection of both edges) or contain `NA` to detect only those edges.
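The way a partially specified `support` combines with detected edges can be sketched as follows. The helper `complete_support()` is hypothetical and the `detected` values are illustrative numbers, so treat this as a sketch of the rule rather than pdqr's code.

```r
# Hypothetical helper (not part of pdqr): fill `NA` edges of a user-supplied
# support with automatically detected ones; `NULL` means "detect both".
complete_support <- function(support, detected) {
  if (is.null(support)) {
    return(detected)
  }
  # c(NA, NA) is logical in R, so coerce before combining
  support <- as.numeric(support)
  stopifnot(length(support) == 2, length(detected) == 2)

  edges <- ifelse(is.na(support), detected, support)
  stopifnot(edges[1] < edges[2])
  edges
}

detected <- c(-37.4, 39.4)  # illustrative detected edges

complete_support(NULL, detected)       # both edges detected
complete_support(c(-4, NA), detected)  # left edge fixed, right edge detected
complete_support(c(NA, 5), detected)   # left edge detected, right edge fixed
complete_support(c(-4, 5), detected)   # nothing detected
```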

Support is detected in order to preserve as much information as practically reasonable. Exact methods differ:

- In `as_p()`, support is detected as the values at which the input function is equal to `1e-6` (left edge detection) and `1 - 1e-6` (right edge), which means "losing" `1e-6` probability on each tail. **Note** that those values are searched inside the [-10^100; 10^100] interval.
- In `as_d()`, first an attempt is made to find one point of non-zero density by probing 10000 points spread across a wide range of the real line (approximately from `-1e7` to `1e7`). If the input's value at all of them is zero, an error is thrown. After finding such a point, a cumulative distribution function is made by integrating the input with `integrate()`, using the found point as reference (without this, `integrate()` will have poor accuracy). The created CDF is used to find the `1e-6` and `1 - 1e-6` quantiles as in `as_p()`, which serve as the detected support.
- In `as_q()`, quantiles for 0 and 1 are probed for being infinite. If they are, the `1e-6` and `1 - 1e-6` quantiles, respectively, are used instead of the infinite values to form the detected support.
- In `as_r()`, a sample of size `n_sample` is generated and the detected support is its range stretched by the mean difference of sorted points (to account for possible tails at which points were not generated). **Note** that this means the original input `f` "demonstrates its randomness" only once inside `as_r()`, with the output then used for approximation of the "original randomness".
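Two of the detection rules above are simple enough to sketch in base R. The helpers `detect_support_q()` and `detect_support_r()` are hypothetical names; they follow the description in this section and are not pdqr's implementation.

```r
# Hypothetical helper: as_q()-like detection. Probe quantiles at 0 and 1 and
# replace infinite ones with the 1e-6 and 1 - 1e-6 quantiles.
detect_support_q <- function(q_fun, tail_prob = 1e-6) {
  edges <- q_fun(c(0, 1))
  if (is.infinite(edges[1])) {
    edges[1] <- q_fun(tail_prob)
  }
  if (is.infinite(edges[2])) {
    edges[2] <- q_fun(1 - tail_prob)
  }
  edges
}

# Hypothetical helper: as_r()-like detection. Take the range of one sample
# and stretch it by the mean difference of sorted points.
detect_support_r <- function(r_fun, n_sample = 10000) {
  smpl <- r_fun(n_sample)
  delta <- mean(diff(sort(smpl)))
  range(smpl) + c(-delta, delta)
}

detect_support_q(qunif)  # both edges are finite, so they are kept: 0 and 1
detect_support_q(qexp)   # left edge stays 0, right edge is the 1 - 1e-6 quantile
detect_support_q(qnorm)  # approximately -4.753 and 4.753

set.seed(101)
detect_support_r(runif)  # slightly wider than the sample range
```

In this sketch the sample is drawn once inside `detect_support_r()`, which mirrors the note that the input "demonstrates its randomness" only once.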

See also `pdqr_approx_error()` for computing approximation errors compared to some reference function (usually input to the `as_*()` family).

```
library(pdqr)

# Convert existing "proper" pdqr-function
set.seed(101)
x <- rnorm(10)
my_d <- new_d(x, "continuous")
my_p <- as_p(my_d)
# Convert "honored" function to be a proper pdqr-function. To use this
# option, supply originally named function.
p_unif <- as_p(punif)
r_beta <- as_r(rbeta, shape1 = 2, shape2 = 2)
d_pois <- as_d(dpois, lambda = 5)
## `pdqr_approx_error()` computes pdqr approximation error
summary(pdqr_approx_error(as_d(dnorm), dnorm))
#>       grid            error               abserror
#>  Min.   :-4.753   Min.   :-7.979e-07   Min.   :9.900e-12
#>  1st Qu.:-2.377   1st Qu.:-4.000e-07   1st Qu.:1.975e-09
#>  Median : 0.000   Median :-5.552e-08   Median :5.552e-08
#>  Mean   : 0.000   Mean   :-2.104e-07   Mean   :2.104e-07
#>  3rd Qu.: 2.377   3rd Qu.:-1.975e-09   3rd Qu.:4.000e-07
#>  Max.   : 4.753   Max.   :-9.900e-12   Max.   :7.979e-07
## This will work as if input is an unknown function because of the
## unsupported variable name
my_runif <- function(n) {
  runif(n)
}
r_unif_2 <- as_r(my_runif)
plot(as_d(r_unif_2))
# Convert some other function to be a "proper" pdqr-function
my_d_quadr <- as_d(function(x) {
  0.75 * (1 - x^2)
}, support = c(-1, 1))
# Support detection
unknown <- function(x) {
  dnorm(x, mean = 1)
}
## Completely automatic support detection
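as_d(unknown)
#> Density function of continuous type
#> Support: ~[-37.36926, 39.36951] (10000 intervals)
## Semi-automatic support detection: `NA` marks the edge to detect
as_d(unknown, support = c(-4, NA))
#> Density function of continuous type
#> Support: ~[-4, 39.36951] (10000 intervals)
as_d(unknown, support = c(NA, 5))
#> Density function of continuous type
#> Support: ~[-37.36926, 5] (10000 intervals)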
## If support is very small and very distant from zero, it probably won't
## get detected in `as_d()` (throwing a relevant error)
if (FALSE) {
  as_d(function(x) {
    dnorm(x, mean = 10000, sd = 0.1)
  })
}
# Using different level of granularity
as_d(unknown, n_grid = 1001)
#> Density function of continuous type
#> Support: ~[-37.36926, 39.36951] (1000 intervals)
```