Biometrika — Vol 110 Issue 3 · 2026-06-20

16 papers in total · Biometrika
Table-of-contents check ✅ All 16 papers captured (OpenAlex lists 17)

Issue overview

Auto-generated: summarizes the main themes and threads of this issue, without scoring or ranking.

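Biometrika Volume 110, Issue 3 centers on three themes: causal inference, high-dimensional inference, and semiparametric/nonparametric methods. The causal inference group has 5 papers, covering graphical variable selection, matching bias, time-varying effect moderation, the implied weights of regression, and spectral-domain treatment of spatial confounding. The high-dimensional inference group has 4 papers, covering analysis of variance in multivariate linear regression, tests of covariance structure for matrix-variate data, latent-variable adjustment in graphical models, and post-selection inference. The semiparametric/nonparametric group has 5 papers, covering necessary and sufficient conditions for uniformly valid honest inference, a generalized Bayes framework for loss-based clustering, isotonic confidence bands for calibration curves of binary predictions, marginal proportional hazards models for multivariate interval-censored data, and identification with parametric copulas under dependent censoring. In addition, there is 1 paper on the asymptotically optimal design of locally balanced MCMC algorithms and 1 theoretical paper on the existence of matching priors on compact spaces.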

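On the causal inference thread, "Variable elimination, graph reduction and the efficient g-formula" gives graphical criteria for eliminating uninformative variables without sacrificing efficiency; "On the statistical role of inexact matching in observational studies" argues that the main role of inexact matching is to make the subsequent model-based adjustment robust, rather than to remove bias; "Assessing time-varying causal effect moderation" defines, for micro-randomized trials, causal effects that account for cluster-level heterogeneity and interference; "On the implied weights of linear regression for causal inference" uses implied weights to connect regression adjustment with the covariate balance and self-weighting features of randomized experiments, and proposes design-stage diagnostics; "Spectral adjustment for spatial confounding" characterizes the identifiability of spatial confounding in the spectral domain and introduces a semiparametric spline adjustment. On the high-dimensional inference thread, "High-dimensional analysis of variance in multivariate linear regression" proposes a U-type statistic and uses a high-dimensional Gaussian approximation to avoid the singular covariance matrix problem; "Testing Kronecker product covariance matrices for high-dimensional matrix-variate data" tests for Kronecker product structure using linear spectral statistics and establishes bootstrap consistency; "Thresholded graphical lasso adjusts for latent variables" replaces the traditional sparse + low-rank approach with a hard-thresholding step and achieves consistent graph recovery under a mild edge-strength condition; "Splitting strategies for post-selection inference" replaces conventional data splitting with randomization of the response, improving power while keeping inference valid.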

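In the semiparametric/nonparametric group, "Median regularity and honest inference" introduces the notion of median regularity and proves that uniformly valid honest inference is possible if and only if a median regular estimator exists, filling the gap on the necessity side and providing a foundational tool for inference on functionals. Other highlights: a generalized Bayes clustering framework based on the Gibbs posterior unifies the loss-based and model-based paradigms; confidence bands for calibration curves based on isotonic regression have finite-sample coverage guarantees and support inverted tests; for multivariate interval-censored data, the nonparametric maximum pseudolikelihood estimator built under a working independence assumption has a consistent sandwich variance estimator; and sufficient conditions for identifiability with parametric copulas under dependent censoring are verified systematically. Closest to causal inference are the 5 causal papers above, together with median regularity, which provides a theoretical benchmark for uniformly honest estimation in causal inference. Within high-dimensional inference, high-dimensional MANOVA and the Kronecker covariance test apply to testing problems with high-dimensional causal data, and the randomization strategy for post-selection inference is directly relevant to inference after variable selection in causal studies.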

Causal inference (causal_inference, 5 papers)

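1. 10.1093/biomet/asac062 · arXiv — Variable elimination, graph reduction and the efficient g-formula

Authors: F Richard Guo, Emilija Perković, Andrea Rotnitzky
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 739-761
Relevance 9/10 · novelty: new_theory
Abstract: Under a causal graphical model given by a DAG with no hidden variables, this paper studies efficient estimation of the interventional mean for a point exposure, asking which variables can be discarded without affecting identification or the semiparametric efficiency bound. The authors propose graphical criteria and prove that they are sound and complete for identifying and eliminating all uninformative variables, which saves measurement cost without sacrificing estimation efficiency. They then construct a reduced DAG on the informative variables only and prove that the interventional mean is identified by the g-formula associated with the reduced graph, and that the semiparametric variance bounds under the original and reduced models coincide. This g-formula is irreducible and efficient: under regularity conditions, its nonparametric estimator is asymptotically efficient under the original causal graphical model, and no formula depending on fewer variables has this property. Possibly useful to you: the paper combines graph operations with the semiparametric efficiency bound precisely, offering a new graphical perspective on identification and efficiency theory in causal inference.
Key techniques: semiparametric variance bound, g-formula, variable elimination graphical criteria, DAG reduction, irreducible efficient identifying formula
Why this is useful to you: The paper directly connects two subareas of causal inference, identification theory and efficiency theory, and gives a complete characterization of when variable elimination in a DAG leaves the semiparametric efficiency bound unchanged. A researcher can use very_familiar minimax bounds / estimation theory to examine how tight the efficiency claim is, or moderately_familiar identification theory to check the completeness proof for the graphical criteria. Doable now: the existing toolkit suffices to read closely and verify the theoretical details, and even to try extending the reduced g-formula framework to longitudinal or hidden-variable settings.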
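For orientation only, the sketch below shows the ordinary plug-in g-formula for a point exposure with a single discrete adjustment variable, \(\sum_x E[Y \mid A=a, X=x]\,P(X=x)\). It does not implement the paper's graphical elimination criteria or its reduced DAG; the function name `g_formula_mean` and the toy data are hypothetical.

```python
from collections import defaultdict

def g_formula_mean(rows, a):
    """Plug-in g-formula for a discrete adjustment variable X.

    rows: iterable of (x, treatment, outcome) tuples.
    Returns sum_x E[Y | A=a, X=x] * P(X=x).
    """
    rows = list(rows)
    n = len(rows)
    n_x = defaultdict(int)      # stratum sizes
    n_ax = defaultdict(int)     # units with A=a in each stratum
    sum_y = defaultdict(float)  # outcome totals among A=a in each stratum
    for x, trt, y in rows:
        n_x[x] += 1
        if trt == a:
            n_ax[x] += 1
            sum_y[x] += y
    total = 0.0
    for x, size in n_x.items():
        if n_ax[x] == 0:
            raise ValueError(f"positivity fails in stratum X={x} for A={a}")
        total += (sum_y[x] / n_ax[x]) * (size / n)
    return total

# Hypothetical toy data: treatment is more common in the high-outcome stratum.
toy = [
    (0, 1, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
    (1, 1, 6.0), (1, 1, 6.0), (1, 1, 6.0), (1, 0, 4.0),
]
adjusted = g_formula_mean(toy, 1) - g_formula_mean(toy, 0)
treated = [y for _, t, y in toy if t == 1]
control = [y for _, t, y in toy if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
print(adjusted, naive)  # adjusted contrast is 2.0, naive contrast is 3.5
```

In this toy example the stratum-specific effect is 2 in both strata, which the g-formula recovers, while the unadjusted contrast is inflated by confounding through X.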

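2. 10.1093/biomet/asac066 · arXiv — On the statistical role of inexact matching in observational studies

Authors: Kevin Guo, Dominik Rothenhäusler
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 631-644
Relevance 8/10 · novelty: new_theory
Abstract: In observational causal inference, this paper studies the statistical role of inexact covariate matching; the target estimand is the treatment effect and the key setting is a local misspecification framework. The authors show that inexact matching often leaves statistically meaningful bias, and that this bias makes standard randomization tests asymptotically invalid; they therefore recommend model-based covariate adjustment after matching. The core mechanism is that, under local misspecification, matching makes the subsequent parametric analysis less sensitive to model selection or misspecification, which yields robustness. The theory pins down the asymptotic impact of inexact-matching bias and recasts the primary statistical role of matching as providing model robustness rather than removing bias. Useful to you: the paper bears directly on identification and sensitivity analysis in causal inference, and its local misspecification framework offers a new way to understand the robustness of the matching + regression strategy.
Key techniques: inexact covariate matching, local misspecification framework, randomization test asymptotic validity, model-based covariate adjustment after matching, sensitivity to model misspecification
Why this is useful to you: The paper bears directly on the sensitivity analysis and identification subareas of causal inference, and uses the local misspecification framework to quantify the robustness gain from matching + parametric adjustment. You can enter its proofs directly with the local misspecification / perturbation tools of M-estimation theory (moderately_familiar) and check whether its robustness bounds can be sharpened. Doable now: examine its bias-robustness trade-off from your very_familiar minimax-bound perspective, or use your moderately_familiar M-estimation theory to extend the local misspecification framework to semiparametric models.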

3. 10.1093/biomet/asac065 · arXiv — Assessing time-varying causal effect moderation in the presence of cluster-level treatment effect heterogeneity and interference

Authors: J Shi, Z Wu, W Dempsey
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 645-662
Relevance 8/10 · novelty: new_method
Abstract: In the longitudinal causal inference setting of micro-randomized trials (MRTs), this paper redefines the causal excursion effect to handle cluster-level treatment effect heterogeneity and interference, two major departures from SUTVA. The core estimand lets the treatment effect vary over time and be moderated by cluster-level moderators; identification relies on the sequential randomization of the MRT and on structural assumptions about within-cluster interference. For estimation, a semiparametric estimator is built from a weighted, centred least-squares criterion; it does not require a fully parametric nuisance model for the outcome, and the theory guarantees \(\sqrt{n}\)-consistency and asymptotic normality (CAN) together with semiparametric local efficiency. The method is illustrated on a multi-institution cohort of medical residents in the United States. Useful to you: the paper extends the excursion effect of longitudinal causal inference from independent individuals to cluster interference, which ties directly to longitudinal CI and identification theory among your primary interests.
Key techniques: micro-randomized trial, causal excursion effect, cluster-level interference, weighted centred least-squares, semiparametric inference, time-varying effect moderation
Why this is useful to you: The paper pushes the boundary of longitudinal causal inference and identification theory, both among your primary interests, from the standard no-interference setting to identification and semiparametric estimation of the causal excursion effect under cluster interference. Your very_familiar estimation theory in causal inference and your moderately_familiar identification theory / semiparametric theory are fully adequate for working through the derivation of its identification assumptions and the influence-function analysis of the estimator. Follow-up assessment: doable now. Use your familiar semiparametric efficiency bound tools to check whether the weighted centred least-squares estimator attains the efficiency bound under cluster interference, and explore higher-order (HOIF) corrections to reduce bias from nuisance estimation.

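4. 10.1093/biomet/asac058 · arXiv — On the implied weights of linear regression for causal inference

Authors: Ambarish Chattopadhyay, José R Zubizarreta
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 615-629
Relevance 8/10 · novelty: new_method
Abstract: In the setting of causal inference for observational studies, this paper studies how linear regression adjustment, through its implied weights, approximates key features of a randomized experiment (covariate balance, self-weighting, and representativeness). The authors derive closed-form expressions for the unit-level weights implied by several linear regression methods (OLS, WLS, and others) and analyse their finite-sample and large-sample properties: in finite samples they characterize the target population implied by the regression, and in large samples they establish multiple robustness of regression estimators from the weighting perspective. They further show that the implied weights of general regression methods can be obtained equivalently by solving a convex optimization problem, which bridges the regression modelling and causal inference literatures, and on this basis they propose new regression diagnostics that belong to the design stage of an observational study. Possibly useful to you: the paper recasts regression estimators as an implied-weights optimization problem, giving a new semiparametric perspective on the target-population shift and robustness of OLS in causal inference.
Key techniques: implied regression weights, convex optimization equivalence, multiple robustness, covariate balance diagnostics, target population characterization, lmw R package
Why this is useful to you: The paper connects directly to the identification and estimation subareas of causal inference, in particular the implied target population and robustness properties of OLS adjustment in observational studies. In terms of your technical_arsenal, you can use M-estimation theory and semiparametric theory to scrutinize the rigor of the closed-form weight derivations and of the multiple robustness claim, and you could even use minimax bounds to assess efficiency within the implied-weights optimization framework. Rough follow-up assessment: doable now. Your very_familiar estimation theory in causal inference and moderately_familiar M-estimation theory suffice to read closely and check the boundary conditions of the robustness claim.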
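To make the idea of implied weights concrete, here is a minimal sketch for the simplest case: OLS of y on an intercept, a binary treatment z, and one covariate x. By the Frisch-Waugh-Lovell theorem the coefficient on z equals \(\sum_i w_i y_i\) with \(w_i = e_i / \sum_j e_j^2\), where \(e_i\) is the residual from regressing z on (1, x). These are signed weights derived from that standard identity; the paper's own closed-form expressions and the lmw package are not reproduced here, and the function name and data are hypothetical.

```python
def implied_ols_weights(z, x):
    """Signed weights w such that the OLS coefficient on z in the regression
    of y on (1, z, x) equals sum_i w[i] * y[i] (Frisch-Waugh-Lovell)."""
    n = len(z)
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    # Residuals of z after projecting out the intercept and x.
    resid = [(zi - z_bar) - slope * (xi - x_bar) for xi, zi in zip(x, z)]
    denom = sum(r * r for r in resid)
    return [r / denom for r in resid]

# Hypothetical data; y is noiseless so the OLS treatment coefficient is exactly 2.
z = [1, 1, 1, 0, 0, 0, 0, 1]
x = [2.0, 3.5, 1.0, 0.5, 1.5, 2.5, 0.0, 4.0]
y = [1.0 + 2.0 * zi + 3.0 * xi for zi, xi in zip(z, x)]

w = implied_ols_weights(z, x)
tau_hat = sum(wi * yi for wi, yi in zip(w, y))

# Within-group weights: they sum to one in each group and balance x exactly.
treated_sum = sum(wi for wi, zi in zip(w, z) if zi == 1)
control_sum = sum(-wi for wi, zi in zip(w, z) if zi == 0)
x_treated = sum(wi * xi for wi, zi, xi in zip(w, z, x) if zi == 1)
x_control = sum(-wi * xi for wi, zi, xi in zip(w, z, x) if zi == 0)
print(tau_hat, treated_sum, control_sum, x_treated - x_control)
```

The weights depend only on (z, x), not on the outcome, which is why balance diagnostics of this kind can be run at the design stage. Individual weights may be negative even though they sum to one within each group.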

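5. 10.1093/biomet/asac069 · arXiv — Spectral adjustment for spatial confounding

Authors: Yawen Guan, Garritt L Page, Brian J Reich, Massimo Ventrucci, Shu Yang
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 699-719
Relevance 7/10 · novelty: new_method
Abstract: In spatial causal inference, the goal is to estimate an exposure effect in the presence of unmeasured spatial confounding; the key assumptions are that confounding present at global scales dissipates at local scales, and that the coherence between exposure and confounder satisfies an identifiability condition in the spectral domain. The authors characterize the model and the confounding structure in the spectral domain and show that the spectral identifiability condition is equivalent to adjusting for global-scale confounding in the spatial domain by adding a spatially smoothed version of the exposure. They propose a sequence of adjustment methods, from parametric adjustment based on the Matérn coherence function to a semiparametric approach using smoothing splines, covering both areal and geostatistical data. The theory shows that the effect is identified under the coherence decay condition, and simulations and real data support the methods. Useful to you: the work ties identification under spatial confounding explicitly to coherence conditions in the spectral domain, and its semiparametric spline adjustment relates directly to your interest in semiparametric theory.
Key techniques: spatial confounding identification, spectral domain coherence, Matérn coherence function, smoothing splines semiparametric adjustment, spatial smooth exposure adjustment
Why this is useful to you: The paper directly connects identification theory in causal inference (identifiability conditions under spatial confounding) with semiparametric theory (spline adjustment), at the intersection of your primary interests. With your very_familiar minimax bounds for estimation problems, you can analyse whether the convergence rate of the semiparametric adjustment attains the efficiency bound under different coherence decay rates, which is a follow-up direction that is doable now.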

High-dimensional statistics / random matrices (high_dim_rmt, 2 papers)

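1. 10.1093/biomet/asac063 · arXiv — Testing Kronecker product covariance matrices for high-dimensional matrix-variate data

Authors: Long Yu, Jiahui Xie, Wang Zhou
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 799-814
Relevance 7/10 · novelty: new_theory
Abstract: The paper studies testing whether the covariance matrix of high-dimensional matrix-variate data has Kronecker product structure, where the dimension p and the sample size n grow at the same order. The test statistic is built from linear spectral statistics of a renormalized sample covariance matrix; the paper derives its asymptotic normality for the first time, with explicit formulas for the mean and covariance functions, filling a gap in the literature for this setting. It also proposes a bootstrap resampling algorithm to approximate the limiting distribution and proves bootstrap consistency under mild conditions, while allowing additional random noise. Simulations show that the empirical size is close to the nominal level and that power tends to 1 quickly as the dimension and sample size grow. The work applies linear spectral statistics from random matrix theory directly to a high-dimensional testing problem, which matches your interests in high-dimensional asymptotics and hypothesis testing.
Key techniques: linear spectral statistics, renormalized sample covariance matrix, Kronecker product covariance, bootstrap resampling, central limit theorem for LSS, high-dimensional matrix-variate data
Why this is useful to you: The paper relates directly to your interests in high-dimensional statistics and random matrix theory: it uses linear spectral statistics to test Kronecker product covariance structure, a core problem in high-dimensional hypothesis testing. Your very_familiar high-dimensional asymptotics can be used directly to assess the rigor of the CLT proof, and minimax bounds can be used to judge the asymptotic optimality of the test; your software development experience would also let you reproduce and extend the bootstrap procedure quickly. Doable now: given your strong familiarity with high-dimensional asymptotics and software development, you can verify its numerical performance right away or extend it to more general covariance structures.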

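2. 10.1093/biomet/asac060 · arXiv — Thresholded graphical lasso adjusts for latent variables

Authors: Minjie Wang, Genevera I Allen
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 681-697
Relevance 5/10 · novelty: sharper_rate
Abstract: In Gaussian graphical models with latent variables, the goal is to recover the sparse graph structure among the observed variables. The traditional approach (Chandrasekaran et al. 2012) achieves identification through a sparse + low-rank convex program, but it faces both computational and statistical challenges. This paper proposes a minimal alternative: apply hard thresholding to the precision matrix estimated by the graphical lasso (the thresholded graphical lasso). The theory shows that the method achieves graph selection consistency under a minimum edge strength condition alone (without strict irrepresentability or low-rank constraints), with a statistical convergence rate better than that of existing sparse + low-rank methods. The results extend to thresholded neighbourhood selection and the CLIME estimator. Simulations and a neuroscience functional connectivity application both show the method outperforming existing latent-variable graphical model methods. Useful to you: the paper uses a minimal thresholding step to bypass the computational bottleneck of low-rank decomposition in high-dimensional graphical models and directly improves the graph recovery rate, which relates directly to your interests in high-dimensional statistics and statistical computing.
Key techniques: hard thresholding operator, graph selection consistency, minimum edge strength condition, sparse plus low-rank decomposition, graphical lasso, CLIME
Why this is useful to you: The paper directly connects high-dimensional graphical model estimation with latent-variable adjustment, within your high-dimensional statistics and statistical computing subareas. Your high-dimensional asymptotics and minimax bounds can be used directly to examine whether the claimed 'improved statistical rate' is tight, and to study the minimax properties of thresholding in more general high-dimensional precision matrix estimation. Doable now: use your very_familiar high-dimensional asymptotic theory to check the boundary conditions of the rate advantage.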
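The thresholding step itself is simple enough to sketch. The block below applies hard thresholding to the off-diagonal entries of a precision matrix estimate and reads off the edge set. It assumes the estimate has already been produced by some graphical lasso fit, which is not run here; the matrix, the threshold value, and the function name `hard_threshold_graph` are hypothetical, and the paper's choice of tuning parameters is not reproduced.

```python
def hard_threshold_graph(precision, tau):
    """Return the edge set {(i, j), i < j} whose off-diagonal precision
    entries exceed tau in absolute value."""
    p = len(precision)
    edges = set()
    for i in range(p):
        for j in range(i + 1, p):
            if abs(precision[i][j]) > tau:
                edges.add((i, j))
    return edges

# Hypothetical precision estimate: two strong edges plus small dense entries
# of the kind that marginalizing over a latent variable can induce.
theta_hat = [
    [1.00, 0.45, -0.05, -0.05],
    [0.45, 1.00, -0.05, -0.05],
    [-0.05, -0.05, 1.00, 0.45],
    [-0.05, -0.05, 0.45, 1.00],
]

edges = hard_threshold_graph(theta_hat, tau=0.1)
print(sorted(edges))  # [(0, 1), (2, 3)]
```

The intuition illustrated here is that weak dense entries fall below the threshold while edges satisfying a minimum strength condition survive it.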

Nonparametric / semiparametric (nonparam_semipara, 5 papers)

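1. 10.1093/biomet/asad004 · arXiv — A generalized Bayes framework for probabilistic clustering

Authors: Tommaso Rigon, Amy H Herring, David B Dunson
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 559-578
Relevance 5/10 · novelty: new_method
Abstract: For clustering settings that lack an explicit likelihood, the paper aims to provide uncertainty quantification for loss-based clustering methods such as k-means through a Gibbs posterior. The core mechanism replaces the log-likelihood in the Bayesian update with a clustering loss (for example a Bregman divergence or a pairwise similarity), yielding a Gibbs posterior that supports coherent updating of beliefs and posterior inference without specifying a data-generating model. The authors design efficient deterministic algorithms for point estimation and sampling algorithms for uncertainty quantification, and show that several existing algorithms (such as k-means) can be viewed as generalized Bayes estimators within the framework. The main theoretical contribution is to unify the loss-based and model-based clustering paradigms and to give the former a route to posterior probabilities (for example, the probability that a point is correctly clustered). Possibly useful to you: the Gibbs posterior structure connects conceptually to pseudo-likelihood / loss-based inference in semiparametric efficiency theory and can serve as further reading on nonparametric Bayesian inference.
Key techniques: Gibbs posterior, Bregman divergence loss, pairwise similarity loss, generalized Bayes updating, deterministic point estimation algorithm, posterior sampling for uncertainty quantification
Why this is useful to you: The paper connects to loss-based inference within the semiparametric / nonparametric theory subarea (the Gibbs posterior is a Bayesian variant of pseudo-likelihood inference). From your very_familiar nonparametric statistics and minimax bounds perspective, you can ask whether the concentration rate of the Gibbs posterior in this framework is optimal (the paper focuses on algorithms and the framework, so the analysis of convergence rates may leave an opening). Rough follow-up assessment: doable in the medium term. You would first need to strengthen your moderately_familiar M-estimation theory in order to link Gibbs posterior concentration to the classical limit theory of M-estimators, and then explore its semiparametric efficiency bound.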
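As a toy illustration of the Gibbs posterior idea, the sketch below enumerates every two-cluster partition of five one-dimensional points and weights each by \(\exp(-\lambda \cdot \text{loss})\), with the k-means within-cluster sum of squares as the loss and a uniform prior over partitions. The data, the value of \(\lambda\), the uniform prior, and all function names are assumptions made for illustration; the paper's algorithms for realistic sample sizes are not reproduced.

```python
import math
from itertools import product

def kmeans_loss(points, labels):
    """Within-cluster sum of squared distances to the cluster mean."""
    loss = 0.0
    for k in set(labels):
        cluster = [p for p, lab in zip(points, labels) if lab == k]
        centre = sum(cluster) / len(cluster)
        loss += sum((p - centre) ** 2 for p in cluster)
    return loss

def gibbs_posterior(points, lam):
    """Gibbs posterior over two-cluster partitions under a uniform prior.

    The first label is fixed at 0 so each partition is counted once.
    Returns a dict mapping label tuples to posterior probabilities.
    """
    n = len(points)
    weights = {}
    for rest in product((0, 1), repeat=n - 1):
        labels = (0,) + rest
        if 1 not in labels:
            continue  # require two non-empty clusters
        weights[labels] = math.exp(-lam * kmeans_loss(points, labels))
    total = sum(weights.values())
    return {labels: w / total for labels, w in weights.items()}

def coclustering_prob(posterior, i, j):
    """Posterior probability that points i and j share a cluster."""
    return sum(p for labels, p in posterior.items() if labels[i] == labels[j])

points = [0.0, 0.2, 0.1, 5.0, 5.1]
posterior = gibbs_posterior(points, lam=1.0)
map_labels = max(posterior, key=posterior.get)
print(map_labels)                           # (0, 0, 0, 1, 1)
print(coclustering_prob(posterior, 0, 1))   # close to 1
print(coclustering_prob(posterior, 0, 3))   # close to 0
```

The posterior mode coincides with the k-means optimum for these points, while the co-clustering probabilities are the kind of uncertainty summary that a pure loss minimizer does not provide.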

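2. 10.1093/biomet/asac068 · arXiv — Honest calibration assessment for binary outcome predictions

Authors: Timo Dimitriadis, Lutz Dümbgen, Alexander Henzi, Marius Puke, Johanna Ziegel
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 663-680
Relevance 4/10 · novelty: new_method
Abstract: For assessing the calibration of binary prediction models, the target estimand is the calibration curve p(x), and the core regularity assumption is monotonicity (isotonicity). The paper proposes new confidence bands based on isotonic regression for honest assessment of the calibration curve. The method has finite-sample coverage guarantees, gives narrower bands than existing methods, and adapts to the local smoothness of the calibration curve and to the local variance of the binary observations. Beyond the classical goodness-of-fit test of perfect calibration, the bands support inverted tests, in which rejecting the null hypothesis leads to the conclusion that the model is sufficiently well calibrated. Possibly useful to you: the construction of the isotonic confidence bands and the idea of inverted tests offer a new example of shape-constrained inference with finite-sample guarantees in semiparametric/nonparametric theory.
Key techniques: isotonic regression, confidence band, finite-sample coverage guarantee, inverted goodness-of-fit test, local smoothness adaptation, calibration curve estimation
Why this is useful to you: The paper connects to shape-constrained inference (nonparametric estimation and inference under an isotonic constraint) within the semiparametric/nonparametric theory subarea. Your nonparametric statistics and minimax bounds can be used directly to examine whether the claimed narrower bands and adaptive rates are tight, or to extend the results to other monotone-constrained models. Assessment: doable now. Your very_familiar nonparametric statistics and minimax bounds tools suffice to verify and extend its theoretical properties.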
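The point estimate underlying such bands is an isotonic regression of the binary outcomes on the predicted probabilities. A minimal sketch of the pool-adjacent-violators algorithm (PAVA) follows; it computes only the isotonic fit and does not construct the paper's confidence bands. The outcomes are hypothetical and assumed to be already sorted by increasing predicted probability.

```python
def pava(y, w=None):
    """Pool-adjacent-violators: weighted least-squares nondecreasing fit to y."""
    if w is None:
        w = [1.0] * len(y)
    blocks = []  # each block is [mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit

# Hypothetical binary outcomes, ordered by increasing predicted probability.
outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
calibration_fit = pava(outcomes)
print(calibration_fit)  # [0, 0, 0.5, 0.5, 2/3, 2/3, 2/3, 1]
```

Each pooled block estimates the event frequency over a range of predictions, so comparing the fitted values with the predicted probabilities in that range gives a first look at calibration.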

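3. 10.1093/biomet/asac059 — Marginal proportional hazards models for multivariate interval-censored data

Authors: Yangjianchen Xu, Donglin Zeng, D Y Lin
Journal/Source: Biometrika
Affiliation: University of North Carolina at Chapel Hill
Category: vol 110 · issue 3 · pp 815-830
Relevance 4/10 · novelty: new_method
Abstract: For multivariate interval-censored data, the goal is to estimate the regression parameters of marginal proportional hazards models for multiple event times, with possibly time-varying covariates, while leaving the dependence structure among events unspecified. The authors construct a nonparametric pseudolikelihood under a working independence assumption and provide a stable EM-type algorithm. The resulting nonparametric maximum pseudolikelihood estimator (NPMPLE) is shown to be consistent and asymptotically normal under arbitrary dependence structures, with a limiting covariance that is consistently estimated by a sandwich estimator. The method is evaluated in simulations and applied to the ARIC epidemiological cohort. Useful to you: the sandwich variance and pseudolikelihood strategy for handling cluster dependence in a semiparametric marginal model carries over directly to robust inference with cluster-dependent influence functions in longitudinal causal inference.
Key techniques: nonparametric maximum pseudolikelihood, marginal proportional hazards model, EM-type algorithm, sandwich variance estimator, interval-censored data, working independence assumption
Why this is useful to you: The paper connects to longitudinal/cluster-dependent settings in causal inference and to semiparametric theory: the combination of working-independence pseudolikelihood and sandwich variance is a classical template for cluster-dependent semiparametric estimators, and it links directly to the semiparametric efficiency and influence function theory you know well. Your very_familiar high-dimensional asymptotics and moderately_familiar M-estimation theory suffice to verify the asymptotic properties of the sandwich estimator, which makes this further reading that is doable now; to investigate whether the NPMPLE attains the semiparametric efficiency bound of the marginal model, you would need to strengthen your moderately_familiar semiparametric theory.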

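4. 10.1093/biomet/asac067 — Dependent censoring based on parametric copulas

Authors: C Czado, I Van Keilegom
Journal/Source: Biometrika
Affiliation: Technical University of Munich · KU Leuven
Category: vol 110 · issue 3 · pp 721-738
Relevance 3/10 · novelty: new_theory
Abstract: Under random right censoring, when the survival time T and the censoring time C are stochastically dependent, the goal is to estimate the marginal distribution of T. The paper proposes a joint modelling framework in which a parametric copula describes the dependence between T and C and the marginals of T and C are also parametric; unlike most existing work, it does not assume the copula parameter is known and instead estimates it from the data together with the other parameters. The core theoretical contribution is a set of sufficient conditions under which the bivariate distribution of (T, C) is identifiable, verified case by case for combinations of common copulas (such as Clayton, Frank, and Gaussian) and marginal distributions. Estimation is by maximum likelihood, and simulations and a pancreatic cancer dataset illustrate performance. Possibly useful to you: the way the paper analyses identification conditions under dependent censoring can be carried over to identification problems involving unmeasured confounding and competing risks in causal inference.
Key techniques: parametric copula, dependent censoring, identification conditions, maximum likelihood estimation, survival analysis
Why this is useful to you: (1) It connects to the identification subarea for dependent censoring / competing risks in causal inference: dependent censoring is essentially a form of unmeasured confounding, and the copula identification conditions are structurally similar to negative-control identification in proximal CI. (2) With identification theory in causal inference from your technical_arsenal, you can examine whether the sufficient conditions for identifiability can be relaxed to semiparametric marginals, which gives a concrete entry point. (3) Doable in the medium term: you would first need to strengthen your moderately_familiar semiparametric theory before extending the fully parametric framework to semiparametric marginals (keeping the parametric copula) and deriving the efficient influence function.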

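5. 10.1093/biomet/asac061 · arXiv — Existence of matching priors on compact spaces

Authors: Haosui Duanmu, Daniel M Roy, Aaron Smith
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 763-776
Relevance 3/10 · novelty: new_theory
Abstract: For compact parameter spaces, the paper studies the existence of level 1-α matching priors, that is, priors under which the 1-α credible region is also a 1-α confidence set. The core object is the prior distribution, and the key regularity condition is joint continuity of the rejection probability function on the space of priors (equipped with the Wasserstein metric) and the parameter space. The authors prove that a matching prior must exist under this topological condition; they also point out that common credible regions (HPD regions, credible balls, quantiles) typically fail the continuity condition, and they construct approximate HPD regions and credible balls that satisfy it and therefore admit matching priors. The main proof tool is nonstandard analysis, and the paper establishes new properties of the nonstandard extension of the Wasserstein metric. It closes with a numerical scheme based on discretization and iteration. Possibly useful to you: the existence conditions and construction of matching priors bear directly on Bayesian-frequentist agreement in semiparametric/nonparametric theory, and the numerical scheme touches on your statistical computing interests.
Key techniques: matching prior, Wasserstein metric on priors, nonstandard analysis, highest-posterior density region, credible ball, discretization-based numerical scheme
Why this is useful to you: The paper bears directly on a foundational question about Bayesian-frequentist agreement (existence of matching priors) within the semiparametric/nonparametric theory subarea, and it touches on statistical computing (numerical computation of matching priors). You can use your very_familiar minimax bounds / high-dimensional asymptotics to examine whether the topological condition extends to non-compact or high-dimensional parameter spaces, and your software development experience to assess the implementability and convergence speed of the discretize-and-iterate algorithm. Rough follow-up assessment: doable in the medium term. You would first need to strengthen your moderately_familiar semiparametric theory (in particular the Wasserstein topology of Bayesian nonparametric priors on infinite-dimensional spaces) before pushing the compact-space results towards semiparametric models.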

Mathematical statistics / hypothesis testing (hypothesis_testing, 3 papers)

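1. 10.1093/biomet/asad001 · arXiv — High-dimensional analysis of variance in multivariate linear regression

Authors: Zhipeng Lou, Xianyang Zhang, Wei Biao Wu
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 777-797
Relevance 9/10 · novelty: new_theory
Abstract: In high-dimensional multivariate linear regression, where both the dimension and the number of coefficients may grow with the sample size, the paper studies tests of linear hypotheses. The authors propose a new U-type statistic and establish a high-dimensional Gaussian approximation under weak moment assumptions, which avoids the singular covariance matrix that breaks the traditional Hotelling test in high dimensions. The framework handles high-dimensional versions of classical one-way MANOVA and nonparametric one-way MANOVA in a unified way. To implement the test, they introduce a sample-splitting estimator of the second moment of the error covariance and analyse its properties; simulations show that the new test outperforms existing methods in a range of settings. Possibly useful to you: the high-dimensional Gaussian approximation for the U-type statistic and the second-moment estimation bear directly on your interests in higher-order U-statistic theory and high-dimensional hypothesis testing.
Key techniques: U-type statistic, high-dimensional Gaussian approximation, one-way MANOVA, sample-splitting, linear hypothesis testing, error covariance second moment estimation
Why this is useful to you: The paper connects directly to two of your primary interest subareas, high-dimensional hypothesis testing and higher-order U-statistic theory. The construction of the U-type statistic and its high-dimensional Gaussian approximation can be analysed for computational complexity from your very_familiar higher-order U-statistic computation (treewidth / tensor contraction) perspective, and its projection and convergence properties can be examined with your moderately_familiar higher-order U-statistic theory. Follow-up assessment: doable now. Your existing U-statistic toolkit is enough to start on the computational cost analysis and the verification of theoretical properties.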

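2. 10.1093/biomet/asad002 · arXiv — Median regularity and honest inference

Authors: Arun Kumar Kuchibhotla, Sivaraman Balakrishnan, Larry Wasserman
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 831-838
Relevance 8/10 · novelty: new_theory
Abstract: In a general semiparametric/nonparametric setting, the paper studies when uniformly valid honest inference for a functional is possible; the estimand is an arbitrary functional, and the key assumption concerns the behaviour of the estimator's median. The authors introduce the new notion of median regularity: the median bias of the estimator converges to zero uniformly over the parameter space. They prove that uniformly valid honest inference is possible if and only if a median regular estimator exists. The necessity direction fills a gap in the literature, because traditional regularity (based on means and variances) is not a necessary condition for uniform inference. The technical tools involve uniform convergence, asymptotic analysis of median bias, and constructions by contradiction. The main result makes the feasibility of inference fully equivalent to median regularity, rather than to classical mean regularity or the influence function route. Possibly useful to you: the result directly challenges the classical efficiency framework built on influence functions and mean regularity, and offers a new perspective on the gap between semiparametric efficiency bounds and uniform inference.
Key techniques: median regularity, uniformly valid honest inference, functional estimation, asymptotic median deviation, necessity and sufficiency equivalence
Why this is useful to you: The paper bears directly on two of your primary interest subareas, efficiency theory and hypothesis testing: it proves that the necessary and sufficient condition for uniform inference is median regularity rather than the classical mean regularity / influence function route, which poses a fundamental challenge to the traditional framework of semiparametric efficiency bounds. With very_familiar minimax bounds and moderately_familiar semiparametric theory / M-estimation from your technical_arsenal, you can examine how tight the necessity proof is and explore whether, in the HOIF (Higher-Order Influence Functions) setting, median regularity is easier to satisfy than mean regularity. Follow-up assessment: doable now. Use minimax and M-estimation tools to check how operational the equivalence is in concrete semiparametric models (such as the ATE), and check whether HOIF estimators are naturally median regular.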

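3. 10.1093/biomet/asac070 · arXiv — Splitting strategies for post-selection inference

Authors: D García Rasines, G A Young
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 597-614
Relevance 7/10 · novelty: new_method
Abstract: In post-selection inference for sparse regression, the goal is to provide valid inference for parameters picked out by a selection step, overcoming the failure of conventional regression tools under selection bias. The paper proposes an alternative to data splitting based on randomization of the response, which allows arbitrary selection rules, and derives a central limit theorem for estimators under this randomization scheme. Theoretical and empirical comparisons show that, relative to classical data splitting, the randomization strategy keeps inference valid while substantially improving power for both selection and inference. Possibly useful to you: the work offers a new randomization perspective on valid post-selection inference and relates directly to your interests in hypothesis testing and high-dimensional inference.
Key techniques: post-selection inference, response randomization, data splitting, central limit theorem, sparse regression, selection bias correction
Why this is useful to you: The paper bears directly on the hypothesis testing subarea, focusing on valid post-selection inference in high-dimensional sparse regression, a classical hard problem of selection-adjusted inference in mathematical statistics. In terms of your technical_arsenal, your very_familiar high-dimensional asymptotic theory can be used directly to examine the rate and conditions of the CLT, and your moderately_familiar M-estimation theory gives an entry point to the asymptotic properties and bias correction of the randomized estimator. Follow-up assessment: doable now. Your familiar high-dimensional asymptotics and M-estimation tools are enough to start checking the robustness of the CLT under more general selection rules, or to extend it.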
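One way to see why randomizing the response can stand in for data splitting is the Gaussian case with known noise variance. The sketch below assumes \(Y \sim N(\mu, \sigma^2 I)\) and independent noise \(W \sim N(0, \sigma^2 I)\), and forms \(U = Y + \gamma W\) for selection and \(V = Y - W/\gamma\) for inference, so that \(\mathrm{Cov}(U, V) = \sigma^2 - \sigma^2 = 0\). This particular parametrization, the known-variance assumption, and the function names are assumptions made for illustration and should be checked against the paper.

```python
import random

def randomize_response(y, sigma, gamma, rng):
    """Split y into a selection copy u and an inference copy v using
    added Gaussian noise (assumes the noise level sigma of y is known)."""
    noise = [rng.gauss(0.0, sigma) for _ in y]
    u = [yi + gamma * wi for yi, wi in zip(y, noise)]
    v = [yi - wi / gamma for yi, wi in zip(y, noise)]
    return u, v

def sample_cov(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)

rng = random.Random(0)
sigma, gamma = 1.0, 0.5
y = [rng.gauss(2.0, sigma) for _ in range(20000)]
u, v = randomize_response(y, sigma, gamma, rng)

print(sample_cov(u, v))  # near 0: u and v are uncorrelated
print(sample_cov(u, u))  # near sigma^2 * (1 + gamma^2) = 1.25
print(sample_cov(v, v))  # near sigma^2 * (1 + 1 / gamma^2) = 5.0
```

In this parametrization a smaller \(\gamma\) leaves more information in u for selection at the cost of a noisier v for inference, which plays the role of the split fraction in ordinary data splitting.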

Statistical computing / algorithms (stat_computing, 1 paper)

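1. 10.1093/biomet/asac056 · arXiv — Optimal design of the Barker proposal and other locally balanced Metropolis–Hastings algorithms

Authors: Jure Vogrinc, Samuel Livingstone, Giacomo Zanella
Journal/Source: Biometrika
Category: vol 110 · issue 3 · pp 579-595
Relevance 4/10 · novelty: new_theory
Abstract: For product-form target distributions, the paper studies optimal design within the class of first-order locally balanced Metropolis–Hastings algorithms introduced by Livingstone & Zanella (2022); the design choices are the balancing function, which must satisfy \(g(t)=tg(1/t)\), and the noise distribution of the proposal increment. For any member of the class satisfying mild smoothness assumptions, the authors prove that as the dimension \(n\to\infty\) the optimal acceptance rate is universally 57% and the step size scales as \(n^{-1/3}\), and they give an explicit expression for the asymptotic efficiency of any such algorithm, measured by expected squared jumping distance. Building on this, they derive the optimal noise distribution for the Barker proposal, the optimal balancing function under Gaussian noise, and the globally optimal choice within the whole class (the last of which depends on the specific target distribution). Numerical simulations show that the Barker proposal with the optimal bimodal noise distribution is consistently more efficient in practice than the original Gaussian version. Useful to you: the \(n^{-1/3}\) scaling and the 57% acceptance rate give a precise asymptotic benchmark for the design of high-dimensional MCMC algorithms and directly complement your interests in statistical computing and high-dimensional asymptotic analysis.
Key techniques: locally balanced Metropolis-Hastings, Barker proposal, balancing function, expected squared jumping distance, high-dimensional scaling limit, optimal acceptance rate
Why this is useful to you: The paper connects directly to the statistical computing subarea: it gives an explicit asymptotic efficiency expression for high-dimensional MCMC proposal design and a 57% optimal acceptance rate, a typical example of examining computational algorithms through high-dimensional asymptotics / minimax thinking. Your very_familiar high-dimensional asymptotic tools are enough to check whether the \(n^{-1/3}\) scaling and the efficiency bound are tight, and you could even try relaxing the product-form assumption to target distributions with more general correlation structure. Doable now: use your very_familiar high-dimensional asymptotics and inverse-problem-with-random-noise experience to analyse directly whether the scaling order remains \(n^{-1/3}\) for non-product targets.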
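For readers who have not seen the Barker proposal, the following is a minimal sketch with Gaussian noise on a product of standard normal targets. Each coordinate draws a symmetric increment z and keeps its sign with probability \(1/(1+e^{-z\,\partial_i \log\pi(x)})\), which corresponds to the balancing function \(g(t)=t/(1+t)\). The target, step size, chain length, and function names are assumptions for illustration; the optimal bimodal noise derived in the paper is not implemented, and the 57% figure is an asymptotic result that this small example is not meant to reproduce.

```python
import math
import random

def softplus(a):
    """Numerically stable log(1 + exp(a))."""
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))

def barker_g(t):
    """Barker balancing function; satisfies g(t) = t * g(1 / t)."""
    return t / (1.0 + t)

def log_pi(x):
    """Log density (up to a constant) of a product of standard normals."""
    return -0.5 * sum(xi * xi for xi in x)

def grad_log_pi(x):
    return [-xi for xi in x]

def barker_step(x, sigma, rng):
    """One Metropolis-Hastings step with the Barker proposal."""
    gx = grad_log_pi(x)
    y = []
    for xi, gi in zip(x, gx):
        z = rng.gauss(0.0, sigma)
        p_keep = math.exp(-softplus(-z * gi))  # equals 1 / (1 + exp(-z * gi))
        sign = 1.0 if rng.random() < p_keep else -1.0
        y.append(xi + sign * z)
    gy = grad_log_pi(y)
    # Log acceptance ratio: target ratio times reverse/forward proposal ratio.
    log_alpha = log_pi(y) - log_pi(x)
    for xi, yi, gxi, gyi in zip(x, y, gx, gy):
        log_alpha += softplus(-(yi - xi) * gxi) - softplus(-(xi - yi) * gyi)
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return y, True
    return x, False

def run_chain(dim, sigma, n_iter, seed):
    rng = random.Random(seed)
    x = [0.0] * dim
    draws, accepted = [], 0
    for _ in range(n_iter):
        x, ok = barker_step(x, sigma, rng)
        accepted += ok
        draws.append(x[0])
    return draws, accepted / n_iter

draws, acc_rate = run_chain(dim=2, sigma=1.5, n_iter=20000, seed=1)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, var, acc_rate)  # mean near 0 and variance near 1 for this target
```

The acceptance ratio uses the per-coordinate proposal density \(2\mu_\sigma(y_i-x_i)/(1+e^{-(y_i-x_i)\partial_i\log\pi(x)})\); the symmetric noise density \(\mu_\sigma\) cancels, leaving only the softplus terms.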

Maintained by 陈星宇 · Homepage · Source on GitHub