Computational Statistics in Data Science. Group of authors. Mathematics. Read online. Litmir. LITMIR.BIZ


Information about the work:

Title: Computational Statistics in Data Science

ISBN: 9781119561088

Author: Group of authors

Genre: Mathematics

Publisher: John Wiley & Sons Limited


$\lambda$,

$$X \mid \theta \sim N_p(\theta, I_p) \quad \text{and} \quad \theta \sim N_p(0, \lambda I_p)$$

The posterior distribution of $\theta$ (given $x$ and $\lambda$) is

$$\theta \mid x, \lambda \sim N\left(\frac{\lambda x}{\lambda + 1}, \frac{\lambda I_p}{\lambda + 1}\right)$$

If the true value of $\lambda$ is unknown, it is often estimated from the marginal distribution of $X$ via maximum-likelihood estimation as

$$\hat{\lambda} = \left(\frac{\|x\|^2}{p} - 1\right)^{+}$$
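The form of this estimator can be checked directly. The sketch below assumes the marginal distribution $X \sim N_p(0, (\lambda + 1) I_p)$, which follows from the model above by integrating out $\theta$. The shorthand $s = \lambda + 1$ and the log-likelihood symbol $\ell$ are introduced here for illustration and are not part of the original text.

```latex
\begin{align*}
\ell(s) &= -\frac{p}{2}\log(2\pi s) - \frac{\|x\|^2}{2s}, \qquad s = \lambda + 1 \ge 1, \\
\ell'(s) &= -\frac{p}{2s} + \frac{\|x\|^2}{2s^2} = \frac{\|x\|^2 - ps}{2s^2}, \\
\ell'(s) = 0 &\iff s = \frac{\|x\|^2}{p}.
\end{align*}
```

When $\|x\|^2 / p < 1$ the derivative is negative for every $s \ge 1$, so the constrained maximum sits at $s = 1$, that is $\hat{\lambda} = 0$. This is where the positive part comes from.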

Robert and Casella [4] consider estimating $h(\theta) = \|\theta\|^2$ using the posterior mean $\mathrm{E}\left[\|\theta\|^2 \mid x, \hat{\lambda}\right]$. Under a quadratic loss, the Bayes estimator is

$$\hat{h}_{eb} = \left(\|x\|^2 - p\right)^{+}$$
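As a check, this closed form can be recovered by plugging the estimated $\lambda$ into the posterior mean of $\|\theta\|^2$. The derivation below is a sketch built only from the posterior distribution stated earlier. The shorthand $w$ for the shrinkage factor is an assumption made for readability.

```latex
\begin{align*}
\mathrm{E}\left[\|\theta\|^2 \mid x, \lambda\right]
  &= \left\|\mathrm{E}[\theta \mid x, \lambda]\right\|^2
     + \operatorname{tr}\operatorname{Cov}(\theta \mid x, \lambda)
   = w^2 \|x\|^2 + p\,w, \qquad w = \frac{\lambda}{\lambda + 1}, \\
\text{if } \|x\|^2 > p:\quad
  \hat{w} &= 1 - \frac{p}{\|x\|^2}, \qquad
  \hat{w}\left(\hat{w}\|x\|^2 + p\right)
   = \hat{w}\left(\|x\|^2 - p + p\right)
   = \|x\|^2 - p, \\
\text{if } \|x\|^2 \le p:\quad
  \hat{w} &= 0, \qquad \hat{w}^2\|x\|^2 + p\,\hat{w} = 0.
\end{align*}
```

Both cases together give $(\|x\|^2 - p)^{+}$.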

The risk for $\hat{h}_{eb}$,

$$\eta_{eb}(\|\theta\|) = \mathrm{E}\left[\left(\|\theta\|^2 - \left(\|x\|^2 - p\right)^{+}\right)^2 \,\middle|\, \theta\right]$$

is difficult to obtain analytically (although not impossible, see Robert and Casella [4]). Instead, we can estimate the risk over a grid of $\|\theta\|$ values using Monte Carlo. To do this, we fix choices $\theta_1, \dots, \theta_m$ over a grid and, for each $k = 1, \dots, m$, generate $n$ Monte Carlo samples from $X \mid \theta_k \sim N_p(\theta_k, I_p)$, yielding estimates

$$\hat{\eta}_{eb}(\|\theta_k\|) = \frac{1}{n} \sum_{t=1}^{n} \left(\|\theta_k\|^2 - \left(\|X_t\|^2 - p\right)^{+}\right)^2$$

The resulting estimate of the risk is an $m$-dimensional vector of means, for which we can utilize the sampling distribution in Theorem 1 to construct large-sample confidence regions. An appropriate choice of a sequential stopping rule here is the relative-magnitude sequential stopping rule, which stops simulation when the Monte Carlo variance is small relative to the average risk over all values of $\theta$ considered. It is important to note that the risk at a particular $\theta$ could be zero, but it is unlikely.
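The Monte Carlo risk estimate described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the authors' code: the function names, the seed, the grid range $[0, 4]$, and the choice to draw independent samples at each grid point are all assumptions. Because $\|X\|^2$ depends on $\theta$ only through $\|\theta\|$, each $\theta_k$ is placed along the first coordinate axis without loss of generality.

```python
import numpy as np

def eb_estimator(x, p):
    """Empirical Bayes estimate of ||theta||^2, i.e. (||x||^2 - p)^+."""
    return np.maximum(np.sum(x**2, axis=-1) - p, 0.0)

def mc_losses(theta, n, rng):
    """Return n quadratic losses at a fixed theta.

    Their mean is the Monte Carlo estimate of the risk at theta.
    """
    p = theta.shape[0]
    # Draw X_1, ..., X_n independently from N_p(theta, I_p).
    x = theta + rng.standard_normal((n, p))
    return (np.sum(theta**2) - eb_estimator(x, p)) ** 2

rng = np.random.default_rng(0)      # arbitrary seed (assumption)
p, m, n = 5, 50, 1000               # p and m as in the text; n = 10^3
norms = np.linspace(0.0, 4.0, m)    # hypothetical grid of ||theta|| values

losses = np.empty((n, m))
for k, c in enumerate(norms):
    theta_k = np.zeros(p)
    theta_k[0] = c                  # ||theta_k|| = c
    losses[:, k] = mc_losses(theta_k, n, rng)

risk_hat = losses.mean(axis=0)      # m-dimensional vector of estimated risks
```

Keeping the full `losses` array, rather than only the column means, leaves the per-sample variability available for the confidence regions and the stopping rule.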

For illustration, we set $p = 5$ and simulate a data point from the true model with $\lambda = 1$. To evaluate risk we choose a grid of $\theta$ values with $m = 50$. In order to assess the appropriate Monte Carlo sample size $n$, we set $n^{*} = 10^3$ so that at least $n^{*}$ Monte Carlo samples are obtained. With $\epsilon = 0.05$, and $\Lambda$ estimated using the sample covariance matrix, the sequential stopping rule terminates simulation at 21 100 steps. Figure 2 demonstrates the estimated risk at $n^{*} = 10^3$ iterations and the estimated risk at termination. Pointwise Bonferroni-corrected confidence intervals are presented as an indication of variability for each component¹.
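Pointwise Bonferroni-corrected intervals of the kind mentioned above can be computed as in the following sketch. It assumes independent Monte Carlo draws within each grid point and a normal approximation for each mean. The function name, the level `alpha = 0.05`, and the synthetic input are hypothetical; in particular, `alpha` is a confidence level chosen for illustration and is not the stopping tolerance $\epsilon$ from the text.

```python
import numpy as np
from statistics import NormalDist

def bonferroni_intervals(losses, alpha=0.05):
    """Pointwise large-sample intervals with a Bonferroni correction.

    losses: (n, m) array with one column of Monte Carlo losses per grid point.
    Each of the m intervals is built at level 1 - alpha/m, so that all m
    hold simultaneously with approximate probability at least 1 - alpha.
    """
    n, m = losses.shape
    mean = losses.mean(axis=0)
    se = losses.std(axis=0, ddof=1) / np.sqrt(n)       # standard error per column
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * m))  # two-sided, corrected
    return mean - z * se, mean + z * se

# Synthetic stand-in for an (n, m) array of losses (not data from the text).
rng = np.random.default_rng(1)
demo = rng.chisquare(df=3, size=(1000, 50))
lower, upper = bonferroni_intervals(demo)
```

The correction only changes the normal quantile, so the intervals are wider than uncorrected 95% intervals by a constant factor across all components.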

Figure 2 Estimated risk at $n^{*} = 10^3$ (a) and at $n = 21\,100$ (b) with pointwise Bonferroni-corrected confidence intervals.

7.3 Bayesian Nonlinear Regression

Consider the biochemical oxygen demand (BOD) data collected by Marske [39], where BOD levels were measured periodically from cultured bottles of stream water. Bates and Watts [40] and Newton and Raftery [41] study a Bayesian nonlinear model with a fixed
