layout: true <div class="my-footer"><span> douglasrm.azevedo@gmail.com </span></div> --- class: title-page <div> <h3 style="text-align: justify; padding-top:50px">Spatial Confounding Beyond Generalized Linear Mixed Models</h3> <h4 style="text-align: right;">Extension to Shared Components and Spatial Frailty Models</h4> <div> <div style="padding-top:130px; color:#858585"> <div style="float:left"> <h5> Douglas R. Mesquita Azevedo <br> Advisor <- Marcos Oliveira Prates <br> Co-advisor <- Dipankar Bandyopadhyay </h5> </div> <div style="float:right"> <img src="img/logo_ufmg.png" alt="ufmg" height="100"/> <img src="img/logo_vcu.png" alt="vcu" height="100"/> </div> </div> <div style="align:center; padding-top:135px; color:#858585"> <h5 style="text-align:center"> February 2020 <br> Belo Horizonte, MG </h5> </div> --- class: toc-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> ##
Summary <span class="brsmall"></span> <span class="brsmall"></span> + Motivation + ICAR + Spatial confounding + Shared component model + Method + Simulation + Application + Spatial frailty model + Method + Simulation + Application + Conclusion --- class: slide-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> ##
Motivation Spatial confounding occurs when .enfase[fixed] and .enfase[latent effects] carry .enfase[similar information] into the model <center><img src="img/motivation.svg" alt="graph" height="320"/></center> Example: Time until death from .enfase[lung and bronchus cancer] in California --- class: section-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> <div class="my-section"> <h2>
Spatial models </h2> </div> --- class: slide-page ##
Spatial models Spatial data are mainly collected by: + .enfase[Area]: Areal models + .enfase[Points]: Georeferenced data/point process Our work focuses on .enfase[areal spatial models]. Several models are available in the literature: + .enfase[SAR]: Whittle (1954); Ord (1975) + .enfase[ICAR/CAR]: Besag (1974); Banerjee, Carlin and Gelfand (2014) + .enfase[Leroux]: Leroux, Lei and Breslow (1999) + .enfase[Mixture neighborhood structure]: Rodrigues and Assunção (2012) + .enfase[DAGAR]: Datta, Banerjee, Hodges and Gao (2019) Although all of these models are of interest, we focus on the most traditional one: .enfase[ICAR models]. --- class: slide-page ##
ICAR model + Undirected graph <center><img src="img/adjacency_matrix.png" alt="graph" height="150"/></center> + Let `\(\psi\)` be a vector of random variables `$$\displaystyle \small{\psi_i|\psi_{-i} \sim \text{Normal}\Bigg(\sum_{j \sim i}\frac{\psi_j}{w_{i+}}, \frac{\sigma^2}{w_{i+}}\Bigg) ~ \forall ~ i \in \{1, \ldots, n\}}$$` + `\(j\sim i\)`: regions `\(j\)` and `\(i\)` are neighbors + `\(w_{i+}\)`: number of neighbors of region `\(i\)` --- class: section-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> <div class="my-section"> <h2>
Spatial Confounding </h2> </div> --- class: slide-page ##
Reich, Hodges and Zadnik (2006) + Consider a .enfase[Gaussian spatial regression] .mathbox[ `$$\displaystyle y|\beta, \psi, \tau_{\epsilon} \sim N(X\beta + \psi, \tau_{\epsilon}I_n),$$` `$$\psi|\tau_{\psi} \sim \text{ICAR}(W, \tau_{\psi}Q),$$` ] where + `\(Q = (D_w-W)\)`, with `\(D_w\)` the diagonal matrix of neighbor counts `\(w_{i+}\)` + `\(\tau_{\epsilon}\)` and `\(\tau_{\psi}\)` are precision parameters <br> The Normal distribution is parameterized here by its .enfase[precision] rather than its .error[variance]. --- class: slide-page ##
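Sketch: building the ICAR structure matrix

A minimal sketch of `\(Q = D_w - W\)` in base R, assuming a hypothetical four-region map in which the regions lie on a line (1 - 2 - 3 - 4). The map and the object names are illustrative only and do not come from the applications in this talk.

```r
# Hypothetical adjacency: region i is a neighbor of region i + 1
W <- matrix(0, nrow = 4, ncol = 4)
W[cbind(1:3, 2:4)] <- 1
W <- W + t(W)                 # undirected graph, so W is symmetric

D_w <- diag(rowSums(W))       # w_{i+}: number of neighbors of each region
Q   <- D_w - W                # ICAR structure matrix

rowSums(Q)                    # every row sums to zero
qr(Q)$rank                    # n - 1 for a connected graph, so Q is singular
```

The rank deficiency is why the ICAR prior is improper and is usually fitted with a sum-to-zero constraint on `\(\psi\)`.

--- class: slide-page ##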
Reich, Hodges and Zadnik (2006) + Reich et al. (2006) showed that .mathbox[ `$$E(\beta|\tau_{\epsilon}, \tau_{\psi}, y) = \displaystyle (X'X)^{-1}X'(y - \hat{\psi}) = \beta_{ols} - (X'X)^{-1}X'\hat{\psi}$$` `$$Var^{-1}(\beta|\tau_{\epsilon}, \tau_{\psi}, y) = \displaystyle \tau_{\epsilon}(X'X) - X' Var(\psi|\beta, \tau_{\epsilon}, \tau_{\psi}, y)X,$$` ] where + `\(\hat{\psi} = E(\psi|\tau_{\epsilon}, \tau_{\psi}, y)\)` Note that + `\(E(\beta|\tau_{\epsilon}, \tau_{\psi}, y)\)` is the OLS estimator minus a `\(\psi\)`-related term + `\(Var(\beta|\tau_{\epsilon}, \tau_{\psi}, y)\)` always increases --- class: slide-page ##
Reich, Hodges and Zadnik (2006) + Original model .mathbox[ `$$\displaystyle y|\beta, \psi, \tau_{\epsilon} \sim N(X\beta + P\psi + P^c\psi, \tau_{\epsilon}I_n),$$` ] + Restricted model .mathbox[ `$$\displaystyle y|\beta, \psi, \tau_{\epsilon} \sim N(X\beta + P^c\psi, \tau_{\epsilon}I_n),$$` ] + `\(P = X(X'X)^{-1}X'\)` is the .enfase[projection matrix] onto the column space of `\(X\)` + `\(P^c = (I-X(X'X)^{-1}X')\)` is the .enfase[projection matrix] onto the orthogonal complement of the column space of `\(X\)` + `\(P\psi + P^c\psi = P\psi + (I - P)\psi = \psi\)` --- class: slide-page ##
Comments <br><br> + It is .enfase[not possible] (or at least very difficult) to calculate the expectation and variance under .enfase[non-Gaussian] models + However, this approach also seems to work for the GLMM family of models + It is a .enfase[constraint] `\(\implies\)` .enfase[computational limitations] + The computational improvement makes little difference in practice + The ICAR model has <span class="enfase">n + q + 1</span> parameters + The proposed restricted model has <span class="enfase">n + 1</span> parameters --- class: slide-page ##
Hughes and Haran (2013) + Although the proposal of Reich, Hodges and Zadnik (2006) alleviates spatial confounding, their method is .enfase[computationally inefficient] + Hughes and Haran (2013) proposed the use of the .enfase[Moran operator] `\(M = P^cWP^c\)` <center><img src="img/hughes.png" alt="graph" height="300"/></center> --- class: slide-page ##
Hughes and Haran (2013) + We can use only the first `\(m \ll n\)` Moran eigenvectors + The positive and negative eigenvalues correspond to patterns of .enfase[positive] and .enfase[negative] spatial .enfase[dependence] + One could select the .enfase[eigenvectors] related to the eigenvalues .enfase[greater than 0] (positive dependence) .mathbox[ `$$\displaystyle y|\beta, \theta, \tau_{\epsilon} \sim N(X\beta + M'\theta, \tau_{\epsilon}I_n),$$` ] + where `\(\theta = M\psi\)` + This model has `\(m + q + 1\)` parameters instead of the `\(n + 1\)` of the Reich, Hodges and Zadnik (2006) proposal. + The method is implemented in the .enfase[_ngspatial_] R package --- class: slide-page <h2 style="font-size:40px">
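Sketch: the two projections

A small numerical check of the decomposition `\(\psi = P\psi + P^c\psi\)`, assuming a simulated design matrix and a simulated stand-in for the spatial effect. Everything here (seed, sizes, object names) is hypothetical and only illustrates the algebra.

```r
set.seed(1)                                  # arbitrary seed, for reproducibility
n   <- 6
X   <- cbind(1, rnorm(n))                    # hypothetical design: intercept + one covariate
psi <- rnorm(n)                              # stand-in for a spatial effect

P   <- X %*% solve(crossprod(X)) %*% t(X)    # projection onto the column space of X
P_c <- diag(n) - P                           # projection onto its orthogonal complement

# The two pieces add back to psi
all.equal(as.vector(P %*% psi + P_c %*% psi), psi)

# The restricted effect P_c psi carries no information in the direction of X
max(abs(crossprod(X, P_c %*% psi)))          # numerically zero
```

Dropping `\(P\psi\)` from the linear predictor is what removes the part of the spatial effect that competes with `\(X\beta\)`.

--- class: slide-page ##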
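Sketch: Moran eigenvectors</h2>

A minimal sketch of the eigenvector selection, assuming the Moran operator is `\(P^cWP^c\)` and a hypothetical ten-region map laid out on a line. This is base R only; it is not the _ngspatial_ implementation, and all object names are illustrative.

```r
set.seed(2)                                   # arbitrary seed
n <- 10
W <- matrix(0, n, n)
W[cbind(1:(n - 1), 2:n)] <- 1
W <- W + t(W)                                 # hypothetical path-graph adjacency

X   <- cbind(1, rnorm(n))                     # hypothetical design matrix
P_c <- diag(n) - X %*% solve(crossprod(X)) %*% t(X)

M_op <- P_c %*% W %*% P_c                     # Moran operator
M_op <- (M_op + t(M_op)) / 2                  # remove floating-point asymmetry
eig  <- eigen(M_op, symmetric = TRUE)

keep  <- eig$values > 1e-8                    # eigenvalues greater than 0
M_vec <- eig$vectors[, keep, drop = FALSE]    # basis for positive spatial dependence

ncol(M_vec)                                   # m, the reduced dimension
max(abs(crossprod(X, M_vec)))                 # numerically zero: orthogonal to X
```

Because every retained eigenvector lies in the range of `\(P^c\)`, the reduced spatial effect is orthogonal to the covariates by construction.

--- class: slide-page <h2 style="font-size:40px">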
Hanks, Schliep, Hooten and Hoeting (2015)</h2> + Hanks et al. (2015) focused on .enfase[geostatistical data] instead of areal data + However, they showed how to obtain a .enfase[sample] from the .enfase[restricted] and .enfase[unrestricted] models .enfase[concurrently] .mathbox[ $$ `\begin{aligned} E(Y_i|\beta) &= X\beta_{rsr} + \psi_{rsr} \\ &= X\beta_{rsr} + (I-P)\psi_{sr} \\ &= X\beta_{rsr} + \psi_{sr} - P\psi_{sr} \\ &= X\beta_{rsr} + \psi_{sr} - X(X'X)^{-1}X'\psi_{sr} \\ &= X(\beta_{rsr} - (X'X)^{-1}X'\psi_{sr}) + \psi_{sr} \\ &= X\beta_{sr} + \psi_{sr}, \end{aligned}` $$ ] where `rsr` means restricted spatial regression. + We .error[do not need] to fit the .error[restricted model] (which is sometimes problematic) --- class: slide-page <h2>
Prates, Assunção, Rodrigues (2019)</h2> + .enfase[SP]atial .enfase[O]rthogonal .enfase[C]entroid .enfase["K"]orrection - SPOCK (Prates et al., 2019) <center> <img src="img/SPOCK.jpg" alt="graph" height="125px"/> </center> + Alleviating spatial confounding for areal data problems by displacing the geographical centroids <center> <img src="img/spock_ilustration.png" alt="graph" height="150px"/> </center> --- class: slide-page <h2>
Prates, Assunção, Rodrigues (2019)</h2> + Let `\(c_i = [c_{1i}, c_{2i}]\)` be the centroid of region `\(i\)`, `\(i \in \{1, \ldots, n\}\)` + We want a .enfase[new set of centroids] `\(c^{\ast} = P^c\times[c_1, c_2]\)` + We want to find `\(W^{\ast}\)`: an adjacency matrix .enfase["free" of spatial confounding] <center> <img src="img/centroid_projection.png" alt="graph" height="150px"/> </center> To create `\(W^{\ast}\)` we can use, for example, the .enfase[k-nearest neighbors algorithm] <center> <!-- <img src="img/knn.gif" alt="graph" height="100px"/> --> <img src="img/knn.png" alt="graph" height="100px"/> </center> --- class: slide-page ##
Comments <br><br> + There are other approaches to spatial confounding + <span class="enfase">Lasso regression</span>: Hefley et al., 2017 + <span class="enfase">Structural equation models</span>: Thaden and Kneib, 2018 + <span class="enfase">Causal inference</span>: Osama, Zachariah and Schön, 2019 + Most of the work focuses on .enfase[GLMMs] + .error[No software] is .error[available] to unify these approaches --- class: section-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> <div class="my-section"> <h2>
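Sketch: displaced centroids and a KNN neighborhood

A minimal sketch of the two SPOCK steps, assuming simulated centroids, a covariate that follows latitude, and `\(k = 3\)` neighbors. The helper `knn_adjacency()` is hypothetical (it is not a RASCO function), and symmetrizing the KNN graph is an assumption made here so that the result is a valid undirected adjacency matrix.

```r
set.seed(3)                                        # arbitrary seed
n <- 8
centroids <- cbind(lon = runif(n), lat = runif(n)) # hypothetical centroids
X <- cbind(1, centroids[, "lat"] + rnorm(n, sd = 0.1))  # covariate close to latitude

P_c    <- diag(n) - X %*% solve(crossprod(X)) %*% t(X)
c_star <- P_c %*% centroids                        # displaced centroids c* = P^c [c_1, c_2]

# Hypothetical helper: symmetric k-nearest-neighbor adjacency matrix
knn_adjacency <- function(coords, k) {
  d <- as.matrix(dist(coords))                     # pairwise Euclidean distances
  diag(d) <- Inf                                   # a region is not its own neighbor
  W <- matrix(0, nrow(d), nrow(d))
  for (i in seq_len(nrow(d))) {
    W[i, order(d[i, ])[seq_len(k)]] <- 1           # the k closest regions to region i
  }
  1 * ((W + t(W)) > 0)                             # i ~ j if either one selects the other
}

W_star <- knn_adjacency(c_star, k = 3)
rowSums(W_star)                                    # at least k neighbors per region
```

The new matrix `\(W^{\ast}\)` can then replace `\(W\)` in the ICAR prior, so the correction happens before the model is fitted.

--- class: slide-page ##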
Contributions </h2> </div> --- class: slide-page ##
Our contributions .mathbox[ <center> <h4 style="color:#4a4545; margin-top:0px; margin-bottom:0px">What if we have .enfase[more than one spatial effect] or .enfase[more sample units than areas]?</h4> </center> ] <span class="brnegative"></span> + <b>Restricted shared component model</b> + <span class="success">Alleviating the spatial confounding</span> in the presence of <span class="success">multiple</span> spatial effects + The <span class="success">correction</span> is made <span class="success">before the analysis is performed</span> + <b>Restricted spatial frailty model</b> + <span class="success">Alleviating the spatial confounding</span> when the <span class="success">support</span> of <span class="success">fixed</span> and <span class="success">latent</span> effects <span class="success">differs</span> + <span class="success">Samples</span> from <span class="success">restricted</span> and <span class="success">unrestricted</span> models <span class="success">concurrently</span> + <span class="success">Efficient method</span> to sample from both models by using the <span class="success">reduction operator</span> + <b>_RASCO_: An R Package to Alleviate Spatial Confounding</b> + GLMM, <span class="success">shared component</span> and <span class="success">spatial frailty</span> models + RHZ, HH and SPOCK in addition to <span class="success">our contributions</span> --- class: section-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> <div class="my-section"> <h2>
Restricted shared component model </h2> </div> --- class: slide-page ##
Poisson model <br> + Most common model used for univariate outcomes .mathbox[ `$$Y_i \sim \text{Poisson}(E_i\theta_i) ~ \forall ~ i \in \{1, \ldots, n\}$$` `$$\log(\theta_i) = \beta_0 + \beta_1X_{1i} + \ldots + \beta_pX_{pi}$$` ] + `\(E_i = T_i\times r\)` is the .enfase[expected] number of new cases of the disease `\(Y\)` in region `\(i\)`. + `\(T_i\)` is the .enfase[population] in region `\(i\)`. + `\(r = \frac{\sum_iY_i}{\sum_iT_i}\)` is the .enfase[overall incidence rate]. + `\(\theta_i\)` is the .enfase[relative risk] in region `\(i\)`. --- class: slide-page ##
Disease mapping + We can add a latent effect to the Poisson model and assign an ICAR structure to it .mathbox[ `$$Y_i \sim \text{Poisson}(E_i\theta_i) ~ \forall ~ i \in \{1, \ldots, n\}$$` `$$\log(\theta_i) = \beta_0 + \beta_1X_{1i} + \ldots + \beta_pX_{pi} + \psi_i$$` `$$\psi \sim \text{ICAR}(W, \tau_{\psi}Q)$$` ] + Our main focus is still on the .enfase[interpretation of coefficients], but now we also want to check whether spatial patterns are present + It is possible to guess which <span class="enfase">variable is missing</span> + It is possible to design policies for the <span class="enfase">hotspots</span> --- class: slide-page ##
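Sketch: expected counts

A short worked example of `\(E_i = T_i \times r\)`, assuming made-up counts, populations and a made-up covariate for four regions. None of these numbers come from the California data.

```r
Y     <- c(12, 30, 7, 51)              # hypothetical observed new cases
T_pop <- c(1000, 2500, 900, 4000)      # hypothetical populations T_i
x1    <- c(0.2, 1.1, -0.4, 0.8)        # hypothetical covariate

r <- sum(Y) / sum(T_pop)               # overall incidence rate
E <- T_pop * r                         # expected cases if every region had rate r

all.equal(sum(E), sum(Y))              # expected and observed totals agree
Y / E                                  # crude estimate of theta_i (observed / expected)

# Non-spatial Poisson regression with log(E_i) as an offset
fit <- glm(Y ~ x1, family = poisson(), offset = log(E))
coef(fit)
```

With the offset in place, the coefficients act on the log of `\(\theta_i\)`, which matches the linear predictor on the previous slide.

--- class: slide-page ##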
Shared component model + Knorr‐Held, L., & Best, N. G. (2001) + Sometimes we have interest in .enfase[more than one outcome] + Univariate models may .enfase[not] be .enfase[realistic] .mathbox[ `$$Y_{id} \sim \text{Poisson}(E_{id}\theta_{id}) ~ \forall ~ i \in \{1, \ldots, n\}, ~ d \in \{1, 2\}$$` `$$\log(\theta_{i1}) = X_{i1}\beta_1 + \delta\psi_i + \phi_{i1}$$` `$$\log(\theta_{i2}) = X_{i2}\beta_2 + \frac{1}{\delta}\psi_i + \phi_{i2}$$` `$$\psi_i \sim \text{ICAR}(W, \tau_{\psi}Q); ~ \phi_1 \sim \text{ICAR}(W, \tau_{\phi_1}Q); ~ \phi_2 \sim \text{ICAR}(W, \tau_{\phi_2}Q)$$` ] where `\(\delta\)` is a scale parameter to allow <span class="enfase">different levels of dependence</span> on the shared component --- class: slide-page ##
Restricted shared component model <span class="brnegative"> + For the shared component effect: .mathbox[ + `\(X_{\psi_k}\)`: union of `\(X_d\)` columns for diseases `\(d = 1, \ldots, D\)` related to shared component `\(k, k = 1, \ldots, K\)` + `\(P_{\psi_k}\)`: `\(X_{\psi_k}(X'_{\psi_k}X_{\psi_k})^{-1}X'_{\psi_k}\)` + `\(c_{\psi_k}\)`: `\(P^c_{\psi_k}c\)`, where `\(c\)` is the original set of centroids + `\(W_{\psi_k}\)`: neighborhood structure based on `\(c_{\psi_k}\)` (KNN) ] + For the specific spatial effects: .mathbox[ + `\(X_{\phi_i}\)`: `\(X_d\)` covariates for diseases `\(d = 1, \ldots, D\)` + `\(P_{\phi_i}\)`: `\(X_{\phi_i}(X'_{\phi_i}X_{\phi_i})^{-1}X'_{\phi_i}\)` + `\(c_{\phi_i}\)`: `\(P^c_{\phi_i}c\)`, where `\(c\)` is the original set of centroids + `\(W_{\phi_i}\)`: neighborhood structure based on `\(c_{\phi_i}\)` (KNN) ] </span> --- class: slide-page ##
RSCM - simulation <span class="brnegative"></span> + Case `\(D = 2\)`, `\(X_1\)` is a two column matrix as well as `\(X_2\)`. `\(X_{12}\)` and `\(X_{22}\)` are: + <span class="enfase">S1</span>: Random; + <span class="enfase">S2</span>: `\(X_{12}\)` is set of latitudes; <span class="enfase">S3</span>: `\(X_{22}\)` is set of latitudes + <span class="enfase">S4</span>: `\(X_{12}\)` and `\(X_{22}\)` are the set of latitudes + Parameters + `\(\beta_1 = [0.5, -0.5, -0.2]\)`, `\(\beta_2 = [0.1, -0.8, -0.4]\)` + `\(\tau_{\psi} = 1\)`, `\(\tau_{\phi_1} = 10\)`, `\(\tau_{\phi_2} = 10\)` + `\(\delta = \{1.00, 1.50, 1.75\}\)` + Priors + `\(\beta_{dj} \sim \text{Normal}(0, 0.001), ~ d = 1, 2; ~ j = 0, 1, 2,\)` + `\(\log(\gamma) \sim \text{Normal}(0, 0.1) ~~ (\delta = \sqrt{\gamma})\)` + `\(\tau_{\psi} \sim \Gamma(0.5, 0.05), \tau_{\phi_1} \sim \Gamma(0.5, 0.05), \tau_{\phi_2} \sim \Gamma(0.5, 0.05)\)` --- class: slide-page ##
RSCM - simulation <center><img src="img/sim_sc.png" alt="graph" height="480"/></center> --- class: slide-page ##
RSCM - application + .enfase[New cases] of .enfase[lung and bronchus cancer] for .enfase[men] and .enfase[women] in California (58 counties) + Same priors as in the simulation study + Covariates: <span class="brsmall"></span> <center><img src="img/covariates.png" alt="graph" height="225"/></center> --- class: slide-page ##
RSCM - application `\(\mathcal{M}_1\)` - .enfase[Univariate non-spatial model]: .mathbox[ `\(Y_{id} \sim P(E_{id}\theta_{id}),\)` `\(\log(\theta_{id}) = \beta_{d0} + X_{id}\beta,\)` `\(Y_{1} \perp Y_{2}.\)` ] `\(\mathcal{M}_2\)` - .enfase[Univariate spatial model]: .mathbox[ `\(Y_{id} \sim P(E_{id}\theta_{id}),\)` `\(\log(\theta_{id}) = \beta_{d0} + X_{id}\beta + \psi_{id},\)` `\(\psi_1 \sim \text{ICAR}(W, \tau_{\psi_1}Q); ~~ \psi_2 \sim \text{ICAR}(W, \tau_{\psi_2}Q),\)` `\(Y_{1} \perp Y_{2}\)` ] --- class: slide-page ##
RSCM - application `\(\mathcal{M}_3\)` - .enfase[Shared Component model without specific spatial term]: .mathbox[ `\(Y_{id} \sim P(E_{id}\theta_{id}),\)` `\(\log(\theta_{i1}) = \beta_{10} + X_{i1}\beta + \delta\psi_{i}; ~~ \log(\theta_{i2}) = \beta_{20} + X_{i2}\beta + \frac{1}{\delta}\psi_{i},\)` `\(\psi \sim \text{ICAR}(W, \tau_{\psi}Q).\)` ] `\(\mathcal{M}_4\)` - .enfase[Shared Component model with specific spatial term]: .mathbox[ `\(Y_{id} \sim P(E_{id}\theta_{id}),\)` `\(\log(\theta_{i1}) = \beta_{10} + X_{i1}\beta + \delta\psi_{i} + \phi_{1i}; ~~ \log(\theta_{i2}) = \beta_{20} + X_{i2}\beta + \frac{1}{\delta}\psi_{i} + \phi_{2i},\)` `\(\psi \sim \text{ICAR}(W, \tau_{\psi}Q); ~~ \phi_1 \sim \text{ICAR}(W, \tau_{\phi_1}Q); ~~ \phi_2 \sim \text{ICAR}(W, \tau_{\phi_2}Q).\)` ] --- class: slide-page ##
RSCM - application + Model results <center><img src="img/tab_sc_app.svg" alt="graph" height="350"/></center> --- class: slide-page ##
RSCM - application + Aggregate spatial effects <center><img src="img/sc_eff_sum.svg" alt="graph" height="350"/></center> --- class: section-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> <div class="my-section"> <h2>
Restricted spatial frailty model </h2> </div> --- class: slide-page ##
Proportional hazards models + Interest: study the time until an event + Exponential, log-normal, <span class="enfase">Weibull</span>, ... + `\(f_{\theta}(t) = h_{\theta}(t)S_{\theta}(t)\)` + `\(f_{\theta}(t)\)`: probability density function + `\(h_{\theta}(t)\)`: hazard function + `\(S_{\theta}(t)\)`: survival function + Censoring schemes + <span class="enfase">Right censoring</span>, left censoring, interval censoring + We want to identify .enfase[risk factors] (covariates) + Proportional hazards models: `$$h_{\theta}(t_i) = h^{\ast}_{\theta}(t_i)\exp\{X_i\beta\}$$` --- class: slide-page ##
Spatial frailty model <br> + `\(h_{\theta}(t_{ij}) = h^{\ast}_{\theta}(t_{ij})\gamma_i\exp\{X_{ij}\beta\}\)` + `\(\gamma_i\)` must be positive (e.g., Gamma distributed) + Banerjee, Wall and Carlin (2003) proposed the following structure: .mathbox[ $$ `\begin{aligned} h_{\theta}(t_{ij}) & = h_{\theta}^{\ast}(t_{ij}) \times \gamma_i \times e^{X_{ij} \beta} \\ & = h_{\theta}^{\ast}(t_{ij}) \times e^{X_{ij}\beta + \log(\gamma_i)} \\ & = h_{\theta}^{\ast}(t_{ij}) \times e^{X_{ij}\beta + \psi_i}. \end{aligned}` $$ ] We can then model `\(\psi\)` with an ICAR model, for example. --- class: slide-page ##
Restricted spatial frailty model + We have .enfase[more subjects] than .enfase[areas] + `\(\displaystyle P_{N\times N}\)`, `\(N = \sum_{i = 1}^{n}n_i\)` + `\(n_i\)` is the number of individuals in area `\(i\)`, `\(i = 1, \ldots, n\)` + `\(\psi\)` is an `\(n\times 1\)` column vector + `\(c\)` is an `\(n\times 2\)` matrix (centroids) + Therefore, it is .error[not possible] to compute the projections, because the dimensions do not conform: RHZ: `$$P^c_{N\times N} ~~ \text{and} ~~ \psi_{n\times 1}$$` SPOCK: `$$P^c_{N\times N} ~~ \text{and} ~~ c_{n\times 2}$$` --- class: slide-page ##
Restricted spatial frailty model + Solution: create a vector `\(\Psi = [\psi_1\times 1_{n_1}, \ldots, \psi_n\times 1_{n_n}]'\)` + `\(1_{m}\)` is a row vector of ones of length `\(m\)` .mathbox[ $$ `\begin{equation} \psi_{n\times 1} = \begin{bmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_n \end{bmatrix}; ~~~~ \Psi_{N\times 1} = \begin{array}{c@{\!\!\!}l} \left[ \begin{array}[c]{ccccc} \psi_1 \\ \vdots \\ \psi_1 \\ \vdots \\ \psi_n \\ \vdots \\ \psi_n \end{array} \right] & \begin{array}[c]{@{}l@{\,}l} \left. \begin{array}{c} \vphantom{0} \\ \vphantom{\vdots} \\ \vphantom{0} \end{array} \right\} & \text{$n_1$ times} \\ \vphantom{\vdots} \\ \left. \begin{array}{c} \vphantom{0} \\ \vphantom{\vdots} \\ \vphantom{0} \end{array} \right\} & \text{$n_n$ times} \end{array} \end{array}. \end{equation}` $$ ] Now `\(P^c_{N\times N}\Psi_{N\times 1}\)` .success[is possible] --- class: slide-page ##
Restricted spatial frailty model To sample from the .enfase[restricted] and .enfase[unrestricted] models .enfase[concurrently], we established the .enfase[equivalence between models]: .mathbox[ $$ `\begin{align} h_{\theta}(t) & = h_{\theta}^0(t)\exp\{X\beta_{rsf} + \Psi_{rsf} + \epsilon_{rsf}\} \\ & = h_{\theta}^0(t)\exp\{X\beta_{rsf} + \Psi_{rsf} + \tilde{\psi} + \epsilon_{sf}\} \\ & = h_{\theta}^0(t)\exp\{X\beta_{rsf} + P^c\Psi_{sf} + \epsilon_{sf}\} \\ & = h_{\theta}^0(t)\exp\{X(\beta_{rsf} - (X'X)^{-1}X'\Psi_{sf}) + \Psi_{sf} + \epsilon_{sf}\} \\ & = h_{\theta}^0(t)\exp\{X\beta_{sf} + \Psi_{sf} + \epsilon_{sf}\}, \end{align}` $$ ] where "rsf" means _restricted spatial frailty_ + `\(\Psi_{rsf} = [\psi_{rsf_1}\times 1_{n_1}, \ldots, \psi_{rsf_n}\times 1_{n_n}]'\)` + `\(\psi_{rsf_i}\)` is the mean of `\(P^c\Psi_{sf}\)` over the individuals in area `\(i\)` + `\(\tilde{\psi} = P^c\Psi_{sf} - \Psi_{rsf}\)` --- class: slide-page ##
Reduction operator + `\(X_{N\times p}\)`: matrix with entries `\(X_{ijk}\)` for an index `\(i\)`, an element `\(j\)` and column `\(k\)`, + `\(G_{N\times 1}\)`: vector that assigns to each row of `\(X_{N\times p}\)` an index from 1 to `\(n\)`, `\(n \ll N\)`. Then the reduction operator `\(\circledR\)` is defined by: `$$X_{N\times p}~\circledR~G = x_{n\times p},$$` in which `\(\displaystyle x_{ik} = \sum_{j = 1}^{n_i}X_{ijk}\)`, and `\(n_i\)` is the number of elements associated with index `\(i\)`. Then: .mathbox[ `$$\beta_{rsf} = \beta_{sf} + (X'X)^{-1}X'\Psi_{sf} = \beta_{sf} + (X'X)^{-1}(X~\circledR~G)'\psi_{sf}$$` `$$\psi_{rsf} = (N^{-1}P^c\Psi_{sf})~\circledR~G$$` ] --- class: slide-page ##
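Sketch: the reduction operator in base R

A minimal sketch, assuming the reduction operator can be computed with base R's `rowsum()` (column sums by group) and reading `\(N^{-1}\)` as division by the area sizes `\(n_i\)`, so that `\(\psi_{rsf}\)` holds area means. The data are simulated and the object names are hypothetical; this is not the RASCO implementation.

```r
set.seed(4)                               # arbitrary seed
n_i <- c(3, 2, 4)                         # hypothetical number of individuals per area
G   <- rep(seq_along(n_i), times = n_i)   # area index for every row of X
N   <- length(G)

X   <- cbind(rnorm(N), rnorm(N))          # hypothetical N x p design matrix
psi <- rnorm(length(n_i))                 # one spatial effect per area (n x 1)
Psi <- psi[G]                             # expanded to one value per individual (N x 1)

x_red <- rowsum(X, G)                     # X (R) G: n x p matrix of column sums by area

# X' Psi computed on N rows equals (X (R) G)' psi computed on n rows
all.equal(as.vector(crossprod(X, Psi)), as.vector(crossprod(x_red, psi)))

# P^c Psi without ever forming the N x N projection matrix
Pc_Psi  <- Psi - X %*% solve(crossprod(X), crossprod(x_red, psi))
psi_rsf <- as.vector(rowsum(Pc_Psi, G)) / n_i   # area means of P^c Psi
```

The saving comes from replacing products over `\(N\)` individuals with products over `\(n\)` areas, which matters when `\(n \ll N\)`.

--- class: slide-page ##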
RSFM - simulation Reduction operator improvement: <br> <center><img src="img/sim_sf_reduction.png" alt="graph" height="300"/></center> --- class: slide-page ##
RSFM - simulation + `\(X\)` is a two-column matrix. `\(X_{1}\)` and `\(X_{2}\)` are: + <span class="enfase">Case 1</span>: random variables + <span class="enfase">Case 2</span>: `\(X_{1}\)` is random and `\(X_{2}\)` corresponds to the set of latitudes (spatially correlated) + Weibull proportional hazards model + Censoring rates of `\(\{0\%, 25\%, 50\%, 75\%\}\)` + Parameters + `\(\alpha = 1.2\)`, `\(\beta_1 = -0.3\)`, `\(\beta_2 = 0.3\)`, `\(\tau_{\psi} = 0.75\)` + Priors + `\(\alpha \sim \text{Gamma}(0.001, 0.001)\)` + `\(\beta_j \sim \text{Normal}(0, 0.001), ~ j = 0, 1, 2\)` + `\(\tau_{\psi} \sim \Gamma(0.5, 0.0005)\)` --- class: slide-page ##
RSFM - simulation <center><img src="img/sim_sf.svg" alt="graph" height="450"/></center> --- class: slide-page ##
RSFM - application + .enfase[Months until death] from .enfase[lung and bronchus cancer] in California + Same priors as in the simulation study <center><img src="img/sf_tab.png" alt="graph" height="350"/></center> --- class: slide-page ##
RSFM - application + Model results <center><img src="img/tab_sf.svg" alt="graph" height="350"/></center> --- class: slide-page ##
RSFM - application + Spatial effects <br> <center><img src="img/sf_eff.svg" alt="graph" height="225"/></center> --- class: slide-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> ##
RASCO + Available at: https://github.com/DouglasMesquita/RASCO

```r
library(RASCO)

# `data` and `neigh_RJ` are assumed to be already loaded in the session

# Restricted spatial GLMM (Poisson) with the RHZ projection
mod <- rsglmm(data = data, area = "reg",
              formula = Y ~ X1 + X2, family = "poisson",
              neigh = neigh_RJ, model = "restricted_besag",
              proj = "rhz", nsamp = 1000)

# Restricted shared component model with the SPOCK correction
mod <- rscm(data = data, area = "reg",
            formula1 = Y1 ~ X11 + X12, formula2 = Y2 ~ X21 + X12,
            family = c("poisson", "poisson"),
            neigh = neigh_RJ, proj = "spock", nsamp = 1000)

# Restricted spatial frailty model (Weibull) with the RHZ projection
mod <- rsfm(data = data, area = "reg",
            formula = surv(time = L, event = status) ~ X1 + X2,
            family = "weibull",
            neigh = neigh_RJ, model = "restricted_besag",
            proj = "rhz", nsamp = 1000)
```

--- class: slide-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> ##
Conclusion <span class="brnegative"> + Spatial confounding + It is an <span class="success">important limitation</span> and should be checked for in models beyond the GLMM family + Shared component model + We proposed a model <span class="success">changing the multiple spatial structures</span> present in the model + It is interesting since the user can <span class="success">correct the structure before running the analysis</span> + Spatial frailty model + We proposed a restricted model <span class="success">increasing the spatial effects dimension</span> + The reduction operator brought <span class="success">computational efficiency</span> to our model + RSCM and RSFM + The <span class="success">simulation showed the effectiveness</span> of both approaches + In the application we showed that one <span class="success">important covariate</span> was <span class="success">confounded</span> and <span class="success">our methods alleviated such confounding</span> in both cases </span> --- class: slide-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> ##
Conclusion <span class="brnegative"></span> + RASCO + Our package is available at: https://github.com/DouglasMesquita/RASCO + We have solutions for + GLMM + <span class="success">SCM</span> + <span class="success">SFM</span> + The available approaches are: + RHZ + HH + SPOCK + <span class="success">RSCM</span> + <span class="success">RSFM</span> --- class: slide-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> ##
Future work <br> + To sample from restricted and unrestricted models using the HH and SPOCK approaches + Conditioning by kriging + To employ the reduction operator for HH and SPOCK in survival models + HH: `\(((N^{-1}P^c~\circledR~G')~\circledR~G)W((N^{-1}P^c~\circledR~G')~\circledR~G)\)` + SPOCK: `\(c^{\ast} = ((N^{-1}P^c~\circledR~G')~\circledR~G)c\)` + Spatial and temporal confounding effects in spatio-temporal models + To add more methods to the RASCO package + HH for SCM and SFM + MCMC versions --- class: slide-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> ##
References .enfase[Reich, B. J., Hodges, J. S., and Zadnik, V. (2006)]. **Effects of residual smoothing on the posterior of the fixed effects in disease‐mapping models**. _Biometrics, 62(4), 1197-1206_. <span class="brtiny"></span> .enfase[Prates, M. O., Assunção, R. M., and Rodrigues, E. C. (2019)]. **Alleviating Spatial Confounding for Areal Data Problems by Displacing the Geographical Centroids**. _Bayesian Analysis 14.2 (2019): 623-647_. <span class="brtiny"></span> .enfase[Hughes, J., and Haran, M. (2013)]. **Dimension reduction and alleviation of confounding for spatial generalized linear mixed models**. _Journal of the Royal Statistical Society: Series B (Statistical Methodology) 75.1 (2013): 139-159_. <span class="brtiny"></span> .enfase[Hanks, E. M., Schliep, E. M., Hooten, M. B., and Hoeting, J. A. (2015)]. **Restricted spatial regression in practice: geostatistical models, confounding, and robustness under model misspecification**. _Environmetrics 26.4 (2015): 243-254_. <span class="brtiny"></span> .enfase[Knorr‐Held, L., and Best, N. G. (2001)]. **A shared component model for detecting joint and selective clustering of two diseases**. _Journal of the Royal Statistical Society: Series A (Statistics in Society), 164(1), 73-85_. <span class="brtiny"></span> .enfase[Banerjee, S., Wall, M. M., and Carlin, B. P. (2003)]. **Frailty modeling for spatially correlated survival data, with application to infant mortality in Minnesota**. _Biostatistics 4.1 (2003): 123-142_. --- class: slide-page <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> ##
References .enfase[Hefley, T. J., Hooten, M. B., Hanks, E. M., Russell, R. E., and Walsh, D. P. (2017)]. **The Bayesian group lasso for confounded spatial data**. _Journal of Agricultural, Biological and Environmental Statistics 22.1 (2017): 42-59_. <br> .enfase[Thaden, H., and Kneib, T. (2018)]. **Structural equation models for dealing with spatial confounding**. _The American Statistician 72.3 (2018): 239-252_. <br> .enfase[Whittle, P. (1954)]. **On stationary processes in the plane**. _Biometrika (1954): 434-449_. <br> .enfase[Ord, K. (1975)]. **Estimation methods for models of spatial interaction**. _Journal of the American Statistical Association 70.349 (1975): 120-126_. <br> .enfase[Besag, J., York, J., and Mollié, A. (1991)]. **Bayesian image restoration, with two applications in spatial statistics**. _Annals of the Institute of Statistical Mathematics 43.1 (1991): 1-20_. <br> .enfase[Leroux, B. G., Lei, X., and Breslow, N. (2000)]. **Estimation of disease rates in small areas: a new mixed model for spatial dependence**. _Statistical models in epidemiology, the environment, and clinical trials. Springer, New York, NY, 2000. 179-191_. <br> .enfase[Datta, A., Banerjee, S., Hodges, J. S., and Gao, L. (2019)]. **Spatial disease mapping using directed acyclic graph auto-regressive (DAGAR) models**. _Bayesian Analysis 14.4 (2019): 1221-1244_. <br> .enfase[Rodrigues, E. C., and Assunção, R. (2012)]. **Bayesian spatial models with a mixture neighborhood structure**. _Journal of Multivariate Analysis 109 (2012): 88-102_. --- class: center, inverse <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.5.0/css/font-awesome.min.css"> <h1 style="color:#378fbb; font-size:90px">
THANK YOU! </h1> .mathbox[ <img src="img/leste/leste.jpg" alt="graph" height="210"/> ] <br><br><br> <h5> <span style="color:#ff6868;">
</span> <span style="color:#000000;">require-r.com</span> <span style="color:#ff6868;">
</span> <span style="color:#000000;">douglas-mesquita</span> <span style="color:#ff6868;">
</span> <span style="color:#000000;">DouglasMesquita</span> <span style="color:#ff6868;">
</span> <span style="color:#000000;">douglas-mesquita</span> <span style="color:#ff6868;">
</span> <span style="color:#000000;">DouglasMesqita</span> <br> <span style="color:#ff6868;">
</span> <span style="color:#000000;">douglasmesqita</span> <span style="color:#ff6868;">
</span> <span style="color:#000000;">douglas.mesquita.94</span> </h5>