Methodology
- [1] arXiv:2409.11497 [pdf, html, other]
Title: Decomposing Gaussians with Unknown Covariance
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)
Common workflows in machine learning and statistics rely on the ability to partition the information in a data set into independent portions. Recent work has shown that this may be possible even when conventional sample splitting is not (e.g., when the number of samples $n=1$, or when observations are not independent and identically distributed). However, the approaches that are currently available to decompose multivariate Gaussian data require knowledge of the covariance matrix. In many important problems (such as in spatial or longitudinal data analysis, and graphical modeling), the covariance matrix may be unknown and even of primary interest. Thus, in this work we develop new approaches to decompose Gaussians with unknown covariance. First, we present a general algorithm that encompasses all previous decomposition approaches for Gaussian data as special cases, and can further handle the case of an unknown covariance. It yields a new and more flexible alternative to sample splitting when $n>1$. When $n=1$, we prove that it is impossible to partition the information in a multivariate Gaussian into independent portions without knowing the covariance matrix. Thus, we use the general algorithm to decompose a single multivariate Gaussian with unknown covariance into dependent parts with tractable conditional distributions, and demonstrate their use for inference and validation. The proposed decomposition strategy extends naturally to Gaussian processes. In simulation and on electroencephalography data, we apply these decompositions to the tasks of model selection and post-selection inference in settings where alternative strategies are unavailable.
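For orientation, the known-covariance decomposition that this abstract takes as its starting point can be sketched as follows. This is not the paper's new algorithm for unknown covariance; it is the standard Gaussian thinning construction, and the function name `thin_gaussian` and the split fraction `eps` are illustrative assumptions.

```python
import numpy as np

def thin_gaussian(x, sigma, eps=0.5, rng=None):
    """Split one draw x ~ N(mu, sigma) into two independent Gaussian parts.

    Assumes the covariance matrix sigma is KNOWN (the setting the abstract
    says existing methods require). Returns (x1, x2) with x1 + x2 == x,
    x1 ~ N(eps * mu, eps * sigma) and x2 ~ N((1 - eps) * mu, (1 - eps) * sigma).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    # Auxiliary noise with covariance eps * (1 - eps) * sigma cancels the
    # cross-covariance between the two parts.
    noise = rng.multivariate_normal(np.zeros(x.shape[0]), eps * (1.0 - eps) * sigma)
    x1 = eps * x + noise
    x2 = (1.0 - eps) * x - noise
    return x1, x2
```

The two parts are uncorrelated because Cov(x1, x2) = eps(1 - eps)sigma - eps(1 - eps)sigma = 0, and joint Gaussianity then gives independence. The construction needs sigma to generate the noise, which is exactly the requirement the abstract sets out to relax.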
- [2] arXiv:2409.11525 [pdf, other]
Title: Interpretability Indices and Soft Constraints for Factor Models
Subjects: Methodology (stat.ME)
Factor analysis is a way to characterize the relationships between many (observable) variables in terms of a smaller number of unobservable random variables which are called factors. However, the application of factor models and its success can be subjective or difficult to gauge, since infinitely many factor models that produce the same correlation matrix can be fit given sample data. Thus, there is a need to operationalize a criterion that measures how meaningful or "interpretable" a factor model is in order to select the best among many factor models.
While there are already techniques that aim to measure and enhance interpretability, new indices, as well as rotation methods via mathematical optimization based on them, are proposed to measure interpretability. The proposed methods directly incorporate semantics with the help of natural language processing and are generalized to incorporate any "prior information". Moreover, the indices allow for complete or partial specification of relationships at a pairwise level. Aside from these, two other main benefits of the proposed methods are that they do not require the estimation of factor scores, which avoids the factor score indeterminacy problem, and that no additional explanatory variables are necessary.
The implementation of the proposed methods is written in Python 3 and is made available together with several helper functions through the package interpretablefa on the Python Package Index. The methods' application is demonstrated here using data on the Experiences in Close Relationships Scale, obtained from the Open-Source Psychometrics Project.
- [3] arXiv:2409.11701 [pdf, html, other]
Title: Bias Reduction in Matched Observational Studies with Continuous Treatments: Calipered Non-Bipartite Matching and Bias-Corrected Estimation and Inference
Subjects: Methodology (stat.ME); Applications (stat.AP)
Matching is a commonly used causal inference framework in observational studies. By pairing individuals with different treatment values but with the same values of covariates (i.e., exact matching), the sample average treatment effect (SATE) can be consistently estimated and inferred using the classic Neyman-type (difference-in-means) estimator and confidence interval. However, inexact matching typically exists in practice and may cause substantial bias for the downstream treatment effect estimation and inference. Many methods have been proposed to reduce bias due to inexact matching in the binary treatment case. However, to our knowledge, no existing work has systematically investigated bias due to inexact matching in the continuous treatment case. To fill this gap, we propose a general framework for reducing bias in inexactly matched observational studies with continuous treatments. In the matching stage, we propose a carefully formulated caliper that incorporates the information of both the paired covariates and treatment doses to better tailor matching for the downstream SATE estimation and inference. In the estimation and inference stage, we propose a bias-corrected Neyman estimator paired with the corresponding bias-corrected variance estimator to leverage the information on propensity density discrepancies after inexact matching to further reduce the bias due to inexact matching. We apply our proposed framework to COVID-19 social mobility data to showcase differences between classic and bias-corrected SATE estimation and inference.
- [4] arXiv:2409.11967 [pdf, other]
Title: Incremental effects for continuous exposures
Subjects: Methodology (stat.ME); Statistics Theory (math.ST)
Causal inference problems often involve continuous treatments, such as dose, duration, or frequency. However, continuous exposures bring many challenges, both with identification and estimation. For example, identifying standard dose-response estimands requires that everyone has some chance of receiving any particular level of the exposure (i.e., positivity). In this work, we explore an alternative approach: rather than estimating dose-response curves, we consider stochastic interventions based on exponentially tilting the treatment distribution by some parameter $\delta$, which we term an incremental effect. This increases or decreases the likelihood that a unit receives a given treatment level, and crucially, does not require positivity for identification. We begin by deriving the efficient influence function and semiparametric efficiency bound for these incremental effects under continuous exposures. We then show that estimation of the incremental effect is dependent on the size of the exponential tilt, as measured by $\delta$. In particular, we derive new minimax lower bounds illustrating how the best possible root mean squared error scales with an effective sample size of $n/\delta$, instead of the usual sample size $n$. Further, we establish new convergence rates and bounds on the bias of double machine learning-style estimators. Our novel analysis gives a better dependence on $\delta$ compared to standard analyses, by using mixed supremum and $L_2$ norms, instead of just $L_2$ norms from Cauchy-Schwarz bounds. Finally, we show that taking $\delta \to \infty$ gives a new estimator of the dose-response curve at the edge of the support, and we give a detailed study of convergence rates in this regime.
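As a reading aid, one common way to write an exponentially tilted treatment distribution is sketched below. The notation is an assumption and is not taken from the paper: $\pi(a \mid x)$ denotes the observed conditional treatment density, $\mu(x, a)$ the outcome regression, and $\psi(\delta)$ the resulting incremental-effect estimand.

```latex
\[
q_\delta(a \mid x)
  = \frac{e^{\delta a}\,\pi(a \mid x)}{\int e^{\delta t}\,\pi(t \mid x)\,dt},
\qquad
\psi(\delta)
  = \mathbb{E}\!\left[\int \mu(X, a)\, q_\delta(a \mid X)\, da\right].
\]
```

Under this form, $q_\delta$ puts mass only where $\pi$ already does, which is why no positivity condition is needed, and $\delta = 0$ recovers the observed treatment distribution. For bounded support, letting $\delta \to \infty$ concentrates $q_\delta$ near the upper edge of the support, in line with the limiting regime mentioned in the abstract.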
- [5] arXiv:2409.12081 [pdf, other]
Title: Optimising the Trade-Off Between Type I and Type II Errors: A Review and Extensions
Subjects: Methodology (stat.ME)
In clinical studies upon which decisions are based there are two types of errors that can be made: a type I error arises when the decision is taken to declare a positive outcome when the truth is in fact negative, and a type II error arises when the decision is taken to declare a negative outcome when the truth is in fact positive. Commonly the primary analysis of such a study entails a two-sided hypothesis test with a type I error rate of 5% and the study is designed to have a sufficiently low type II error rate, for example 10% or 20%. These values are arbitrary and often do not reflect the clinical, or regulatory, context of the study and ignore both the relative costs of making either type of error and the sponsor's prior belief that the drug is superior to either placebo, or a standard of care if relevant. This simplistic approach has recently been challenged by numerous authors both from a frequentist and Bayesian perspective since when resources are constrained there will be a need to consider a trade-off between type I and type II errors. In this paper we review proposals to utilise the trade-off by formally acknowledging the costs to optimise the choice of error rates for simple, point null and alternative hypotheses and extend the results to composite, or interval hypotheses, showing links to the Probability of Success of a clinical study.
- [6] arXiv:2409.12173 [pdf, html, other]
Title: Poisson approximate likelihood compared to the particle filter
Subjects: Methodology (stat.ME)
Filtering algorithms are fundamental for inference on partially observed stochastic dynamic systems, since they provide access to the likelihood function and hence enable likelihood-based or Bayesian inference. A novel Poisson approximate likelihood (PAL) filter was introduced by Whitehouse et al. (2023). PAL employs a Poisson approximation to conditional densities, offering a fast approximation to the likelihood function for a certain subset of partially observed Markov process models. A central piece of evidence for PAL is the comparison in Table 1 of Whitehouse et al. (2023), which claims a large improvement for PAL over a standard particle filter algorithm. This evidence, based on a model and data from a previous scientific study by Stocks et al. (2020), might suggest that researchers confronted with similar models should use PAL rather than particle filter methods. Taken at face value, this evidence also reduces the credibility of Stocks et al. (2020) by indicating a shortcoming with the numerical methods that they used. However, we show that the comparison of log-likelihood values made by Whitehouse et al. (2023) is flawed because their PAL calculations were carried out using a dataset scaled differently from the previous study. If PAL and the particle filter are applied to the same data, the advantage claimed for PAL disappears. On simulations where the model is correctly specified, the particle filter outperforms PAL.
New submissions for Thursday, 19 September 2024 (showing 6 of 6 entries)
- [7] arXiv:2409.11658 (cross-list from stat.AP) [pdf, html, other]
Title: Forecasting age distribution of life-table death counts via $\alpha$-transformation
Comments: 25 pages, 6 tables, 5 figures
Subjects: Applications (stat.AP); Methodology (stat.ME)
We introduce a compositional power transformation, known as an $\alpha$-transformation, to model and forecast a time series of life-table death counts, possibly with zero counts observed at older ages. As a generalisation of the isometric log-ratio transformation (i.e., $\alpha = 0$), the $\alpha$-transformation relies on the tuning parameter $\alpha$, which can be determined in a data-driven manner. Using the Australian age-specific period life-table death counts from 1921 to 2020, the $\alpha$-transformation can produce more accurate short-term point and interval forecasts than the log-ratio transformation. The improved forecast accuracy of life-table death counts is of great importance to demographers and government planners for estimating survival probabilities and life expectancy and actuaries for determining annuity prices and reserves for various initial ages and maturity terms.
Cross submissions for Thursday, 19 September 2024 (showing 1 of 1 entries)
- [8] arXiv:2204.02872 (replaced) [pdf, html, other]
Title: Cluster randomized trials designed to support generalizable inferences
Subjects: Methodology (stat.ME)
Background: When planning a cluster randomized trial, evaluators often have access to an enumerated cohort representing the target population of clusters. Practicalities of conducting the trial, such as the need to oversample clusters with certain characteristics to improve trial economy or to support inference about subgroups of clusters, may preclude simple random sampling from the cohort into the trial, and thus interfere with the goal of producing generalizable inferences about the target population.
Methods: We describe a nested trial design where the randomized clusters are embedded within a cohort of trial-eligible clusters from the target population and where clusters are selected for inclusion in the trial with known sampling probabilities that may depend on cluster characteristics (e.g., allowing clusters to be chosen to facilitate trial conduct or to examine hypotheses related to their characteristics). We develop and evaluate methods for analyzing data from this design to generalize causal inferences to the target population underlying the cohort.
Results: We present identification and estimation results for the expectation of the average potential outcome and for the average treatment effect, in the entire target population of clusters and in its non-randomized subset. In simulation studies we show that all the estimators have low bias but markedly different precision.
Conclusions: Cluster randomized trials where clusters are selected for inclusion with known sampling probabilities that depend on cluster characteristics, combined with efficient estimation methods, can precisely quantify treatment effects in the target population, while addressing objectives of trial conduct that require oversampling clusters on the basis of their characteristics.
- [9] arXiv:2307.15205 (replaced) [pdf, html, other]
Title: A new robust graph for graph-based methods
Subjects: Methodology (stat.ME); Statistics Theory (math.ST)
Graph-based two-sample tests and change-point detection are powerful tools for analyzing high-dimensional and non-Euclidean data, as they do not impose distributional assumptions and perform effectively across a wide range of scenarios. These methods utilize a similarity graph constructed from the observations, with $K$-nearest neighbor graphs or $K$-minimum spanning trees being the current state-of-the-art choices. However, in high-dimensional settings, these graphs tend to form hubs -- nodes with disproportionately large degrees -- and graph-based methods are sensitive to hubs. To address this issue, we propose a robust graph that is significantly less prone to forming hubs in high-dimensional settings. Incorporating this robust graph can substantially improve the power of graph-based methods across various scenarios. Furthermore, we establish a theoretical foundation for graph-based methods using the proposed robust graph, demonstrating its consistency under fixed alternatives in both low-dimensional and high-dimensional contexts.
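The hub phenomenon described in this abstract can be inspected with a small sketch that counts in-degrees in a directed $K$-nearest neighbor graph. It does not implement the robust graph the paper proposes; the function name `knn_in_degrees` and the brute-force Euclidean distance computation are illustrative assumptions.

```python
import numpy as np

def knn_in_degrees(points, k):
    """Return the in-degree of every node in the directed K-NN graph.

    Node i has an edge to each of its k nearest neighbors (Euclidean
    distance, brute force). A node with an in-degree far above k is a hub.
    """
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    diff = points[:, None, :] - points[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)  # squared distances
    np.fill_diagonal(dist, np.inf)               # a node is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]  # k nearest neighbors per node
    return np.bincount(neighbors.ravel(), minlength=n)
```

Every node has out-degree `k`, so the in-degrees always average to `k`. Comparing the maximum in-degree for low-dimensional and high-dimensional samples of the same size is one simple way to see how unevenly the edges can be distributed.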
- [10] arXiv:2308.12181 (replaced) [pdf, html, other]
Title: Consistency of common spatial estimators under spatial confounding
Comments: revision including new simulation
Subjects: Methodology (stat.ME)
This paper addresses the asymptotic performance of popular spatial regression estimators of the linear effect of an exposure on an outcome under "spatial confounding" -- the presence of an unmeasured spatially-structured variable influencing both the exposure and the outcome. We first show that the estimators from ordinary least squares (OLS) and restricted spatial regression are asymptotically biased under spatial confounding. We then prove a novel main result on the consistency of the generalized least squares (GLS) estimator using a Gaussian process (GP) working covariance matrix in the presence of spatial confounding under infill (fixed domain) asymptotics. The result holds under very general conditions -- for any exposure with some non-spatial variation (noise), for any spatially continuous fixed confounder function, using any Matérn or squared exponential kernel used to construct the GLS estimator, and without requiring Gaussianity of errors. Finally, we prove that spatial estimators from GLS, GP regression, and spline models that are consistent under confounding by a fixed function will also be consistent under endogeneity or confounding by a random function, i.e., a stochastic process. We conclude that, contrary to claims in some literature on spatial confounding, traditional spatial estimators are capable of estimating linear exposure effects under spatial confounding as long as there is some noise in the exposure. We support our theoretical arguments with simulation studies.
- [11] arXiv:2401.04036 (replaced) [pdf, html, other]
Title: A regularized MANOVA test for semicontinuous high-dimensional data
Subjects: Methodology (stat.ME)
We propose a MANOVA test for semicontinuous data that is applicable also when the dimensionality exceeds the sample size. The test statistic is obtained as a likelihood ratio, where numerator and denominator are computed at the maxima of penalized likelihood functions under each hypothesis. Closed form solutions for the regularized estimators allow us to avoid computational overheads. We derive the null distribution using a permutation scheme. The power and level of the resulting test are evaluated in a simulation study. We illustrate the new methodology with two original data analyses, one regarding microRNA expression in human blastocyst cultures, and another regarding alien plant species invasion in the island of Socotra (Yemen).
- [12] arXiv:2402.12548 (replaced) [pdf, html, other]
Title: Composite likelihood inference for space-time point processes
Comments: This paper is still under revision
Subjects: Methodology (stat.ME)
The dynamics of a rain forest are extremely complex, involving births, deaths and growth of trees with complex interactions between trees, animals, climate, and environment. We consider the patterns of recruits (new trees) and dead trees between rain forest censuses. For a current census we specify regression models for the conditional intensity of recruits and the conditional probabilities of death given the current trees and spatial covariates. We estimate regression parameters using conditional composite likelihood functions that only involve the conditional first order properties of the data. When constructing assumption-lean estimators of covariance matrices of parameter estimates, we only need mild assumptions of decaying conditional correlations in space, while assumptions regarding correlations over time are avoided by exploiting conditional centering of composite likelihood score functions. Time series of point patterns from rain forest censuses are quite short, while each point pattern covers a fairly big spatial region. To obtain asymptotic results we therefore use a central limit theorem for the fixed-timespan, increasing-spatial-domain asymptotic setting. This also allows us to handle the challenge of using stochastic covariates constructed from past point patterns. Conveniently, it suffices to impose weak dependence assumptions on the innovations of the space-time process. We investigate the proposed methodology by simulation studies and applications to rain forest data.
- [13] arXiv:2405.14208 (replaced) [pdf, other]
Title: An Empirical Comparison of Methods to Produce Business Statistics Using Non-Probability Data
Comments: Submitted to the Journal of Official Statistics, and is currently under review
Subjects: Methodology (stat.ME)
There is a growing trend among statistical agencies to explore non-probability data sources for producing more timely and detailed statistics, while reducing costs and respondent burden. Coverage and measurement error are two issues that may be present in such data. The imperfections may be corrected using available information relating to the population of interest, such as a census or a reference probability sample.
In this paper, we compare a wide range of existing methods for producing population estimates using a non-probability dataset through a simulation study based on a realistic business population. The study was conducted to examine the performance of the methods under different missingness and data quality assumptions. The results confirm the ability of the methods examined to address selection bias. When no measurement error is present in the non-probability dataset, a screening dual-frame approach for the probability sample tends to yield lower sample size and mean squared error results. The presence of measurement error and/or nonignorable missingness increases mean squared errors for estimators that depend heavily on the non-probability data. In this case, the best approach tends to be to fall back to a model-assisted estimator based on the probability sample.
- [14] arXiv:2409.01521 (replaced) [pdf, html, other]
Title: Modelling Volatility of Spatio-temporal Integer-valued Data with Network Structure and Asymmetry
Subjects: Methodology (stat.ME)
This paper proposes a spatial threshold GARCH-type model for dynamic spatio-temporal integer-valued data with network structure. The proposed model can simplify the parameterization by using network structure in data, and can capture the asymmetric property in dynamic volatility by adopting a threshold structure. The proposed model assumes that the conditional distribution is Poisson. Asymptotic theory of maximum likelihood estimation (MLE) for the spatial model is derived when both sample size and network dimension are large. We obtain asymptotic statistical inferences via investigation of the weak dependence of components of the model and application of limit theorems for weakly dependent random fields. Simulation studies and a real data example are presented to support our methodology.
- [15] arXiv:2409.07795 (replaced) [pdf, html, other]
Title: Robust and efficient estimation in the presence of a randomly censored covariate
Subjects: Methodology (stat.ME)
In Huntington's disease research, a current goal is to understand how symptoms change prior to a clinical diagnosis. Statistically, this entails modeling symptom severity as a function of the covariate 'time until diagnosis', which is often heavily right-censored in observational studies. Existing estimators that handle right-censored covariates have varying statistical efficiency and robustness to misspecified models for nuisance distributions (those of the censored covariate and censoring variable). On one extreme, complete case estimation, which utilizes uncensored data only, is free of nuisance distribution models but discards informative censored observations. On the other extreme, maximum likelihood estimation is maximally efficient but inconsistent when the covariate's distribution is misspecified. We propose a semiparametric estimator that is robust and efficient. When the nuisance distributions are modeled parametrically, the estimator is doubly robust, i.e., consistent if at least one distribution is correctly specified, and semiparametric efficient if both models are correctly specified. When the nuisance distributions are estimated via nonparametric or machine learning methods, the estimator is consistent and semiparametric efficient. We show empirically that the proposed estimator, implemented in the R package sparcc, has its claimed properties, and we apply it to study Huntington's disease symptom trajectories using data from the Enroll-HD study.
- [16] arXiv:2409.11265 (replaced) [pdf, html, other]
Title: Performance of Cross-Validated Targeted Maximum Likelihood Estimation
Comments: 20 pages, 3 figures, 1 table
Subjects: Methodology (stat.ME); Applications (stat.AP); Machine Learning (stat.ML)
Background: Advanced methods for causal inference, such as targeted maximum likelihood estimation (TMLE), require certain conditions for statistical inference. However, in situations where there is not differentiability due to data sparsity or near-positivity violations, the Donsker class condition is violated. In such situations, TMLE variance can suffer from inflation of the type I error and poor coverage, leading to conservative confidence intervals. Cross-validation of the TMLE algorithm (CVTMLE) has been suggested to improve on performance compared to TMLE in settings of positivity or Donsker class violations. We aim to investigate the performance of CVTMLE compared to TMLE in various settings.
Methods: We utilised the data-generating mechanism as described in Leger et al. (2022) to run a Monte Carlo experiment under different Donsker class violations. Then, we evaluated the respective statistical performances of TMLE and CVTMLE with different super learner libraries, with and without regression tree methods.
Results: We found that CVTMLE vastly improves confidence interval coverage without adversely affecting bias, particularly in settings with small sample sizes and near-positivity violations. Furthermore, incorporating regression trees using standard TMLE with ensemble super learner-based initial estimates increases bias and variance leading to invalid statistical inference.
Conclusions: It has been shown that when using CVTMLE the Donsker class condition is no longer necessary to obtain valid statistical inference when using regression trees and under either data sparsity or near-positivity violations. We show through simulations that CVTMLE is much less sensitive to the choice of the super learner library and thereby provides better estimation and inference in cases where the super learner library uses more flexible candidates and is prone to overfitting.
- [17] arXiv:2409.11385 (replaced) [pdf, html, other]
Title: Probability-scale residuals for event-time data
Subjects: Methodology (stat.ME)
The probability-scale residual (PSR) is defined as $E\{sign(y, Y^*)\}$, where $y$ is the observed outcome and $Y^*$ is a random variable from the fitted distribution. The PSR is particularly useful for ordinal and censored outcomes for which fitted values are not available without additional assumptions. Previous work has defined the PSR for continuous, binary, ordinal, right-censored, and current status outcomes; however, development of the PSR has not yet been considered for data subject to general interval censoring. We develop extensions of the PSR, first to mixed-case interval-censored data, and then to data subject to several types of common censoring schemes. We derive the statistical properties of the PSR and show that our more general PSR encompasses several previously defined PSR for continuous and censored outcomes as special cases. The performance of the residual is illustrated in real data from the Caribbean, Central, and South American Network for HIV Epidemiology.
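The basic definition above can be made concrete for fully observed outcomes. Since $sign(y, Y^*)$ is $-1$, $0$, or $1$ according to whether $y$ is below, equal to, or above $Y^*$, the expectation equals $P(Y^* < y) - P(Y^* > y)$. The sketch below covers only this uncensored case, not the interval-censored extensions the paper develops, and the function names are illustrative assumptions.

```python
import math
import numpy as np

def psr_discrete(y, support, probs):
    """PSR for a fitted discrete distribution: P(Y* < y) - P(Y* > y)."""
    support = np.asarray(support)
    probs = np.asarray(probs, dtype=float)
    return float(probs[support < y].sum() - probs[support > y].sum())

def psr_continuous(y, cdf):
    """PSR for a fitted continuous distribution with CDF `cdf`.

    With no point mass at y, P(Y* < y) - P(Y* > y) = 2 * F(y) - 1.
    """
    return 2.0 * cdf(y) - 1.0

def standard_normal_cdf(t):
    """CDF of N(0, 1), used here only as an example fitted distribution."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
```

The residual always lies in $[-1, 1]$. An observation at the median of a continuous fitted distribution has a PSR of 0, and values near the extremes indicate observations far in the tails of the fit.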
- [18] arXiv:2205.08609 (replaced) [pdf, html, other]
Title: Bagged Polynomial Regression and Neural Networks
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
Series and polynomial regression are able to approximate the same function classes as neural networks. However, these methods are rarely used in practice, although they offer more interpretability than neural networks. In this paper, we show that a potential reason for this is the slow convergence rate of polynomial regression estimators and propose the use of bagged polynomial regression (BPR) as an attractive alternative to neural networks. Theoretically, we derive new finite sample and asymptotic $L^2$ convergence rates for series estimators. We show that the rates can be improved in smooth settings by splitting the feature space and generating polynomial features separately for each partition. Empirically, we show that our proposed estimator, the BPR, can perform as well as more complex models with more parameters. Our estimator also performs close to state-of-the-art prediction methods in the benchmark MNIST handwritten digit dataset. We demonstrate that BPR performs as well as neural networks in crop classification using satellite data, a setting where prediction accuracy is critical and interpretability is often required for addressing research questions.
- [19] arXiv:2311.04318 (replaced) [pdf, html, other]
Title: Estimation for multistate models subject to reporting delays and incomplete event adjudication
Subjects: Statistics Theory (math.ST); Methodology (stat.ME)
Complete observation of event histories is often impossible due to sampling effects such as right-censoring and left-truncation, but also due to reporting delays and incomplete event adjudication. This is for example the case for health insurance claims and during interim stages of clinical trials. In this paper, we develop a parametric method that takes the aforementioned effects into account, treating the latter two as partially exogenous. The method, which takes the form of a two-step M-estimation procedure, is applicable to multistate models in general, including competing risks and recurrent event models. The effect of reporting delays is derived via thinning, offering an alternative to existing results for Poisson models. To address incomplete event adjudication, we propose an imputed likelihood approach which, compared to existing methods, has the advantage of allowing for dependencies between the event history and adjudication processes as well as allowing for unreported events and multiple event types. We establish consistency and asymptotic normality under standard identifiability, integrability, and smoothness conditions, and we demonstrate the validity of the percentile bootstrap. Finally, a simulation study shows favorable finite sample performance of our method compared to other alternatives, while an application to disability insurance data illustrates its practical potential.
- [20] arXiv:2312.11582 (replaced) [pdf, html, other]
Title: Shapley-PC: Constraint-based Causal Structure Learning with Shapley Values
Comments: 21 pages (with appendix)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Methodology (stat.ME)
Causal Structure Learning (CSL), also referred to as causal discovery, amounts to extracting causal relations among variables in data. CSL enables the estimation of causal effects from observational data alone, avoiding the need to perform real life experiments. Constraint-based CSL leverages conditional independence tests to perform causal discovery. We propose Shapley-PC, a novel method to improve constraint-based CSL algorithms by using Shapley values over the possible conditioning sets, to decide which variables are responsible for the observed conditional (in)dependences. We prove soundness, completeness and asymptotic consistency of Shapley-PC and run a simulation study showing that our proposed algorithm is superior to existing versions of PC.
- [21] arXiv:2402.07717 (replaced) [pdf, other]
Title: Computationally efficient reductions between some statistical models
Comments: v2 contains numerical illustrations and more exposition in narrative
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Probability (math.PR); Methodology (stat.ME); Machine Learning (stat.ML)
We study the problem of approximately transforming a sample from a source statistical model to a sample from a target statistical model without knowing the parameters of the source model, and construct several computationally efficient such reductions between canonical statistical experiments. In particular, we provide computationally efficient procedures that approximately reduce uniform, Erlang, and Laplace location models to general target families. We illustrate our methodology by establishing nonasymptotic reductions between some canonical high-dimensional problems, spanning mixtures of experts, phase retrieval, and signal denoising. Notably, the reductions are structure-preserving and can accommodate missing data. We also point to a possible application in transforming one differentially private mechanism to another.
- [22] arXiv:2409.10374 (replaced) [pdf, html, other]
Title: Nonlinear Causality in Brain Networks: With Application to Motor Imagery vs Execution
Subjects: Applications (stat.AP); Computation (stat.CO); Methodology (stat.ME)
One fundamental challenge of data-driven analysis in neuroscience is modeling causal interactions and exploring the connectivity between nodes in a brain network. Various statistical methods, using different perspectives and data modalities, have been developed to understand the causal structures in brain dynamics. This study introduces a novel statistical approach, TAR4C, to dissect causal interactions in multichannel EEG recordings. TAR4C uses the threshold autoregressive (TAR) model to describe causal interactions between nodes in a brain network from two perspectives. The first tests whether one node controls the dynamics of another. The controlling node, named the threshold variable, implies its causative role since it operates as a switching mechanism governing the instantaneous transitions between autoregressive structures. This concept is known as threshold non-linearity. Once verified between a node pair, the next step in TAR modeling is assessing the causal node's predictive ability on the other's activity, representing causal interactions in autoregressive terms, a concept underlying Granger (G) causality. TAR4C can discover non-linear, time-dependent causal interactions while maintaining the G-causality framework. The approach's efficacy is demonstrated through EEG data from a motor execution/imagery experiment. By comparing causal interactions during motor execution and imagery, TAR4C reveals key similarities and differences in brain connectivity across subjects.