Selects the bandwidth of a local polynomial kernel (regression, density or variogram) estimator using (standart or modified) CV, GCV or MASE criteria.
h.cv(bin, ...) # S3 method for bin.data h.cv( bin, objective = c("CV", "GCV", "MASE"), h.start = NULL, h.lower = NULL, h.upper = NULL, degree = 1, ncv = ifelse(objective == "CV", 2, 0), cov.bin = NULL, DEalgorithm = FALSE, warn = TRUE, tol.mask = npsp.tolerance(2), ... ) # S3 method for bin.den h.cv( bin, h.start = NULL, h.lower = NULL, h.upper = NULL, degree = 1, ncv = 2, DEalgorithm = FALSE, ... ) # S3 method for svar.bin h.cv( bin, loss = c("MRSE", "MRAE", "MSE", "MAE"), h.start = NULL, h.lower = NULL, h.upper = NULL, degree = 1, ncv = 1, DEalgorithm = FALSE, warn = FALSE, ... ) hcv.data( bin, objective = c("CV", "GCV", "MASE"), h.start = NULL, h.lower = NULL, h.upper = NULL, degree = 1, ncv = ifelse(objective == "CV", 1, 0), cov.dat = NULL, DEalgorithm = FALSE, warn = TRUE, ... )
bin | object used to select a method (binned data, binned density or binned semivariogram). |
---|---|
... | further arguments passed to or from other methods (e.g. parameters of the optimization routine). |
objective | character; optimal criterion to be used ("CV", "GCV" or "MASE"). |
h.start | vector; initial values for the parameters (diagonal elements) to be optimized over.
If |
h.lower | vector; lower bounds on each parameter (diagonal elements) to be optimized.
Defaults to |
h.upper | vector; upper bounds on each parameter (diagonal elements) to be optimized.
Defaults to |
degree | degree of the local polynomial used. Defaults to 1 (local linear estimation). |
ncv | integer; determines the number of cells leaved out in each dimension. (0 to GCV considering all the data, \(>0\) to traditional or modified cross-validation). See "Details" bellow. |
cov.bin | (optional) covariance matrix of the binned data or semivariogram model
( |
DEalgorithm | logical; if |
warn | logical; sets the handling of warning messages
(normally due to the lack of data in some neighborhoods).
If |
tol.mask | tolerance used in the aproximations. Defaults to |
loss | character; CV error. See "Details" bellow. |
cov.dat | covariance matrix of the data or semivariogram model
(of class extending |
Returns a list containing the following 3 components:
the best (diagonal) bandwidth matrix found.
the value of the objective function corresponding to h
.
the criterion used.
Currently, only diagonal bandwidths are supported.
h.cv
methods use binning approximations to the objective function values
(in almost all cases, an averaged squared error).
If ncv > 0
, estimates are computed by leaving out binning cells with indexes within
the intervals \([x_i - ncv + 1, x_i + ncv - 1]\), at each dimension i, where \(x\)
denotes the index of the estimation location. \(ncv = 1\) corresponds with
traditional cross-validation and \(ncv > 1\) with modified CV
(it may be appropriate for dependent data; see e.g. Chu and Marron, 1991, for the one dimensional case).
Setting ncv >= 2
would be recommended for sparse data (as linear binning is used).
For standard GCV, set ncv = 0
(the whole data would be used).
For theoretical MASE, set bin = binning(x, y = trend.teor)
, cov = cov.teor
and ncv = 0
.
If DEalgorithm == FALSE
, the "L-BFGS-B"
method in optim
is used.
The different options for the argument loss
in h.cv.svar.bin()
define the CV error
considered in semivariogram estimation:
"MSE"
Mean squared error
"MRSE"
Mean relative squared error
"MAE"
Mean absolute error
"MRAE"
Mean relative absolute error
hcv.data
evaluates the objective function at the original data
(combining a binning approximation to the nonparametric estimates with a linear interpolation),
this can be very slow (and memory demanding; consider using h.cv
instead).
If ncv > 1
(modified CV), a similar algorithm to that in h.cv
is used,
estimates are computed by leaving out binning cells with indexes within
the intervals \([x_i - ncv + 1, x_i + ncv - 1]\).
Chu, C.K. and Marron, J.S. (1991) Comparison of Two Bandwidth Selectors with Dependent Errors. The Annals of Statistics, 19, 1906-1918.
Francisco-Fernandez M. and Opsomer J.D. (2005) Smoothing parameter selection methods for nonparametric regression with spatially correlated errors. Canadian Journal of Statistics, 33, 539-558.
bin <- binning(earthquakes[, c("lon", "lat")], earthquakes$mag) hcv <- h.cv(bin, ncv = 2) lp <- locpol(bin, h = hcv$h) ## Alternatively: ## lp <- locpolhcv(earthquakes[, c("lon", "lat")], earthquakes$mag, ncv = 2) simage(lp, main = 'Smoothed magnitude')## Density estimation hden <- h.cv(as.bin.den(bin)) den <- np.den(bin, h = hden$h) plot(den, main = 'Estimated log(density)')