2026-06-17 Daily arXiv Digest

High-relevance papers: 11 · Medium-relevance: 14 · Other: 37 · Conference/Seminar events: 0

⭐ High-relevance papers (grouped by topic)

Causal inference (causal_inference, 4 papers)

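1. 2606.14840 — Causal Sufficient Dimension Reduction for Multiple Continuous Exposures with an Application to Environmental Mixtures

Authors: Thomas W. Hsiao, Howard H. Chang, Razieh Nabi
Categories: stat.ME
Relevance 8/10 · novelty: new_method
Summary: Estimating causal effects of multivariate continuous exposures is hard, and this paper proposes a causal sufficient dimension reduction (CSDR) framework for it. The goal is to summarize a high-dimensional exposure-response surface through a low-dimensional causal central mean subspace, which simplifies both estimation and interpretation. CSDR uses a semiparametric two-stage estimator: the first stage estimates nuisance functions (e.g. the propensity score and conditional mean), and the second stage uses those estimates to build a modified regression problem that identifies the reduction subspace. On the theory side, the paper establishes a convergence rate for subspace recovery that depends on the accuracy of the nuisance estimates, gives a consistent method for selecting the structural dimension, and introduces an importance score for each exposure. Simulations show CSDR is more accurate than non-causal dimension reduction or working directly with the raw exposures. The method is applied to the effect of maternal PFAS mixtures on infant birth weight. The paper connects directly to your interest in identification and estimation with multiple exposures in causal inference, and to applied environmental epidemiology.
Key techniques: causal sufficient dimension reduction, central mean subspace, semiparametric two-stage estimator, nuisance function estimation, convergence rate with nuisance error, subspace importance score
Why it is useful to you: The paper falls under identification and estimation for multiple continuous exposures in causal inference and comes with an environmental epidemiology application, so it hits both causal inference among your primary interests and epidemiology among your secondary interests. You are very familiar with nonparametric statistics, minimax bounds, and estimation theory for causal inference, which you can use to examine whether the method's convergence rate is optimal or to extend it to other causal parameters (e.g. weighted effects, quantile effects). Feasible in the medium term: you would first need to consolidate orthogonal scores and efficiency bounds from semiparametric theory (moderately familiar), but you can reproduce the simulation code right away and use your software experience to optimize the implementation.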

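2. 2606.15754 — Bounding Causal Effects for Ordinal Outcomes Under Positive Dependence

Authors: Micha Mandel, Daniel Rodan
Categories: stat.ME
Relevance 8/10 · novelty: new_method
Summary: The paper addresses bounding causal effects for ordinal categorical outcomes. The standard mean ATE is unnatural on such scales, while alternative estimands such as "the treated outcome exceeds the control outcome" are generally not identifiable. Existing work gives sharp bounds using only the marginal distributions, but the authors observe that the bounds tighten substantially under a working "independence" assumption, and they ask when that assumption is reasonable. They first show that common conditions such as positive quadrant dependence and positive regression dependence are not enough to guarantee validity of the independence bounds. They then propose a new dependence condition, Diagonal Tail Dominance (DTD), under which the independence bounds hold; DTD is itself strong and often implausible in practice. To make the idea practical, they introduce "local DTD", which imposes the independence assumption only on selected parts of the probability table and yields bounds that are tighter and more plausibly valid. Theoretical proofs, numerical simulations, and an analysis of acute ischemic stroke clinical trial data illustrate how the bounds behave. The paper relates directly to your interest in identification, bounding methods, and sensitivity analysis in causal inference, and in particular offers workable dependence conditions for ordinal outcomes, a common but methodologically awkward case.
Key techniques: diagonal tail dominance (DTD), partial identification bounds, positive dependence conditions, local DTD, copula-based dependence
Why it is useful to you: The paper directly advances identification theory for ordinal outcomes in causal inference. The new DTD condition can be seen as a relaxation of, or alternative to, standard identifying assumptions, which matches causal identification and sensitivity analysis among your primary interests. In terms of your technical arsenal, the nonparametric statistics and causal estimation theory you are very familiar with are enough to assess whether the DTD condition is testable and to extend it to other estimands (e.g. instrumental variables or mediation analysis). The construction of local DTD can also be formalized further using identification theory (moderately_familiar) and may lead to copula-based tests of positive dependence, so with your current toolkit you can start on a DTD-based identification sensitivity analysis right away.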

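3. 2606.14977 — Identification and Inference for Algorithmic Frontiers with Selective Labels

Authors: Yiqi Liu, Francesca Molinari, Amilcar Velez
Categories: econ.EM · cs.LG
Relevance 8/10 · novelty: new_method
Summary: The paper studies how to characterize and conduct inference on the fairness-accuracy (FA) frontier when the outcome is observed for only part of the sample (selective labels). First, for loss functions of a specific form and with an unrestricted selection mechanism, it characterizes the sharp identified region of the FA frontier. Second, assuming unconfoundedness conditional on covariates (and allowing general loss functions), it obtains point identification, proposes a debiased machine learning estimator, derives its asymptotic distribution, and constructs hypothesis tests and confidence sets for the FA frontier. Methodologically, it combines partial identification with semiparametric inference and uses DML for efficient estimation. The authors are extending the partial identification results to broader classes of loss functions. The work connects selection bias in causal inference, algorithmic fairness measures, and empirical methods in economics for optimal decision frontiers.
Key techniques: debiased machine learning, sharp identification region, partial identification, unconfoundedness assumption, asymptotic inference, fairness-accuracy frontier
Why it is useful to you: Directly related to your interests in identification and estimation in causal inference (selective labels, unconfoundedness, DML), and it also connects to applied causal work in economics. From your "very familiar" arsenal, "estimation theory in causal inference" and "nonparametric statistics" apply immediately to understanding and extending the theory of the DML estimator, and your "software development" skills can reproduce the inference procedure. This is a direction you can start on right away.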

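4. 2606.16230 — Semiparametric Dynamic Logit Model with Endogenous Networks

Authors: Brice Romuald Gueyap Kounga
Categories: econ.EM
Relevance 8/10 · novelty: new_method
Summary: The paper studies identification and estimation of a dynamic partially linear logit model when the social network is endogenous and time-varying. In the setup, the outcome equation includes a lagged dependent variable and an unknown function of a time-varying unobserved social characteristic, and that characteristic also drives network link formation; if this common factor is ignored, standard panel logit or simply adding network controls produces bias. By combining a conditional likelihood argument with matching of network types across individuals, the author eliminates both individual heterogeneity and the unknown social-influence function, achieving point identification of the slope parameters and the state-dependence coefficient without parametric assumptions on network formation. The estimator is a kernel-weighted conditional maximum likelihood (KWCML) estimator based on codegree similarity, with local smoothing over covariates from adjacent time periods; consistency and asymptotic normality are established under weak regularity conditions. Monte Carlo simulations show the method substantially reduces bias across network formation mechanisms and sample sizes. The empirical part uses longitudinal friendship network data to study adolescent smoking and finds that standard methods overstate state dependence because they confound it with endogenous network sorting. This may be useful to you: the identification strategy is closely related to identifying cross-sectional and longitudinal network treatment effects in causal inference, and the theoretical properties of the semiparametric estimator can be checked directly with the M-estimation and nonparametric tools you know.
Key techniques: conditional likelihood, network-type matching, kernel-weighted conditional maximum likelihood, codegree similarity, local smoothing, endogenous networks
Why it is useful to you: The paper focuses on identification for longitudinal network causal inference, which corresponds directly to causal inference (dynamic treatments, network interference) and semiparametric theory among your primary interests. You can use the nonparametric statistics and M-estimation theory you are very_familiar with to assess the estimator's asymptotic properties and whether the regularity conditions are tight; the causal identification theory you are moderately_familiar with (e.g. conditional exogeneity, negative controls) can be used to compare the network-matching strategy with other identifying assumptions. Rough follow-up assessment: feasible in the medium term. You would first need to deepen your understanding of network endogeneity and instrumental-variable-type methods within the causal identification theory you are moderately_familiar with, but once you have that, the codegree-matching idea could be extended to continuous treatments or more general outcome types.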

Nonparametric / semiparametric (nonparam_semipara, 3 papers)

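1. 2606.15433 — Limit theorems of Azadkia-Chatterjee's conditional graph correlation

Authors: Muhong Gao, Fang Han, Qizhai Li
Categories: math.ST · econ.EM · stat.ME · stat.TH
Relevance 9/10 · novelty: new_theory
Summary: The paper studies the asymptotic theory of the conditional graph correlation coefficient \(T_n\) proposed by Azadkia and Chatterjee, with the goal of an inference framework under general dependence. \(T_n\) is a nonparametric measure of conditional dependence based on ranks and k-NN graphs; its population limit equals 0 if and only if conditional independence holds and 1 if and only if there is perfect conditional dependence, and \(T_n\) can be computed in \(O(n \log n)\) time. The authors prove that \(T_n\) is asymptotically normal under general dependence and give an explicit closed-form expression for the limiting variance; they also construct a consistent variance estimator that still costs \(O(n \log n)\), which together with existing bias-correction methods gives a complete inference theory. The core tools are the H-decomposition and projection theory for U-statistics, empirical processes, and local asymptotic analysis of k-NN graphs. Useful to you: the closed-form derivation of the asymptotic variance of \(T_n\) and the U-statistic projection techniques relate directly to your work on higher-order U-statistics and nonparametric theory.
Key techniques: conditional graph correlation, H-decomposition of U-statistics, k-nearest neighbor graph, empirical process theory, asymptotic normality, bias-correction
Why it is useful to you: The paper directly advances the asymptotic theory of nonparametric conditional independence testing, at the intersection of semiparametric/nonparametric theory and higher-order U-statistics among your primary interests. The closed-form derivation of the asymptotic variance of \(T_n\) relies on the H-decomposition and projection of U-statistics; from your technical_arsenal, higher-order U-statistics computation (very_familiar) and theory of higher-order U-statistics (moderately_familiar) are well suited to examining whether the choice of projection order and the convergence-rate analysis of the remainder terms can be sharpened further. Can start right away: with the U-statistic projection and treewidth/einsum viewpoint you know, you can directly check the decay rate of higher-order terms in the variance estimator, or try to recast the computational complexity of \(T_n\) in a tensor contraction model.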
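For orientation, the statistic discussed in this entry can be sketched in a few lines. The block below is a minimal, assumption-laden illustration of the Azadkia-Chatterjee coefficient for the dependence of Y on Z given X, using 1-nearest-neighbor graphs and brute-force O(n²) neighbor search for readability (not the O(n log n) implementation the summary mentions). The function names are hypothetical, ties are assumed absent, and none of the paper's variance or bias-correction results are implemented.

```python
import math
from bisect import bisect_right


def ranks(y):
    """R_i = number of j with y_j <= y_i."""
    ordered = sorted(y)
    return [bisect_right(ordered, v) for v in y]


def nearest_neighbor(points):
    """Index of the Euclidean nearest neighbor of each point (brute force)."""
    n = len(points)
    result = []
    for i in range(n):
        best, best_dist = -1, math.inf
        for j in range(n):
            if j == i:
                continue
            dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            if dist < best_dist:
                best, best_dist = j, dist
        result.append(best)
    return result


def conditional_dependence(y, z, x):
    """Sketch of T_n(Y, Z | X); x and z are lists of coordinate tuples."""
    r = ranks(y)
    n_idx = nearest_neighbor(x)  # neighbors using X only
    m_idx = nearest_neighbor([xi + zi for xi, zi in zip(x, z)])  # using (X, Z)
    n = len(y)
    numerator = sum(min(r[i], r[m_idx[i]]) - min(r[i], r[n_idx[i]]) for i in range(n))
    denominator = sum(r[i] - min(r[i], r[n_idx[i]]) for i in range(n))
    return numerator / denominator
```

Values near 0 suggest that Z adds little information about Y beyond X, and values near 1 suggest Y is close to a function of (X, Z); turning the value into a test or confidence interval is where the limit theory summarized above would be needed.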

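2. 2606.16179 — On the Geometry of Separation in Finite Gaussian Mixtures

Authors: Huy Nguyen, Dung Le, Alessandro Rinaldo, Nhat Ho
Categories: math.ST · stat.ML · stat.TH
Relevance 8/10 · novelty: new_theory
Summary: The paper studies how the minimum separation between components affects convergence rates of parameter estimation in Gaussian mixture models, an open problem. The authors build a unified geometric framework based on new Hellinger lower bounds that directly relate the discrepancy between mixture densities to the Wasserstein distance between mixing measures, with explicit dependence on the minimum separation and the minimum weight. Methodologically, it combines interpolation polynomials with confluent divided differences to construct special moment-extraction test functions. When the number of components is known, a localization phenomenon appears: the separation complexity is determined entirely by the spatial configuration of the components (a single cluster, multiple clusters with macroscopic gaps, or an unstructured arrangement). When the number of components is unknown and over-specified, the separation complexity decreases and the minimum weight drops out of the convergence rate, owing to a transition from first-order to second-order Wasserstein geometry. The result is a family of separation-dependent convergence rates that interpolate continuously between pointwise and uniform estimation bounds, establishing fundamental limits for parameter recovery in finite Gaussian mixtures. The work bears directly on your minimax-bound research in high-dimensional statistics and nonparametric theory, especially the quantitative relationship between geometric structure and convergence rates in mixture models.
Key techniques: Hellinger lower bound, Wasserstein distance, interpolation polynomials, confluent divided differences, localization phenomenon
Why it is useful to you: Connects to nonparametric statistics and minimax bounds (very_familiar) among your primary interests. The paper gives convergence rates for mixture parameter estimation with precise dependence on the separation geometry, which can deepen your understanding of minimax bounds for high-dimensional mixture models. "minimax bounds for estimation problems" from your arsenal can be used directly to verify or extend the tightness of the rates. It is feasible in the medium term because you would first need to get comfortable with combining Wasserstein geometry and Hellinger lower bounds (not currently listed under moderately_familiar, so this would extend your toolkit), but the overall theoretical framework is clear and worth a close read.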

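3. 2606.16179 — On the Geometry of Separation in Finite Gaussian Mixtures

Authors: Huy Nguyen, Dung Le, Alessandro Rinaldo, Nhat Ho
Categories: math.ST · stat.ML · stat.TH
Relevance 8/10 · novelty: new_theory
Summary: The paper studies how the minimum separation between components affects convergence rates of parameter estimation in finite Gaussian mixture models, aiming to characterize how the minimum separation and the minimum weight jointly determine the recoverability of the mixture parameters. Methodologically, the authors develop a unified geometric framework based on Hellinger distance lower bounds: carefully constructed interpolation polynomials and confluent divided differences are used to design moment-extraction test functions that map the discrepancy between mixture densities directly to the Wasserstein distance between mixing measures, with explicit dependence on the minimum separation and minimum weight. When the number of components is known, the complexity of the convergence rate is determined strictly by the spatial configuration of the components (single cluster, multiple clusters with macroscopic gaps, no structural constraint), a localization phenomenon; when the number of components is unknown and over-specified, the complexity decreases and the influence of the minimum weight disappears entirely, because the Wasserstein geometry moves from first order to second order. The resulting convergence rates depend on the minimum separation and interpolate continuously between the two extremes of pointwise and uniform estimation, establishing the fundamental limits of Gaussian mixture parameter recovery. The result directly advances mixture model theory in nonparametric statistics, and minimax lower bound techniques can be used to verify the tightness of the stated rates.
Key techniques: Hellinger lower bounds, Wasserstein distance, confluent divided differences, interpolation polynomials, moment-extraction test functions, separation-dependent convergence rates
Why it is useful to you: The paper directly advances the theory of fundamental limits for mixture model estimation in nonparametric statistics, a core question within your "nonparametric statistics" interest. The "minimax bounds for estimation problems" tools you are very familiar with can be used immediately to check whether the stated rates are tight, or to generalize to other mixture families (e.g. non-Gaussian). The interpolation polynomial and divided difference techniques in the paper may also offer a new perspective on the geometric analysis of higher-order U-statistics. Can start right away: benchmark the paper's lower bounds against your existing minimax framework.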

Efficiency theory / Debiased ML (efficiency_dml, 1 paper)

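1. 2606.15602 — Bias-Aware External-Model-Assisted Inference in High-Dimensional Regression

Authors: Hongzhe Zhang, Hanxuan Ye, Hongzhe Li
Categories: stat.ME
Relevance 9/10 · novelty: new_method
Summary: In a high-dimensional semi-supervised linear regression setting, the goal is valid inference on regression coefficients using unlabeled data and an external predictor (such as a black-box model); when the external predictor is close to the oracle, existing PPI/PPI++ methods reduce to OLS and their variance inflates. The paper proposes DEAL (Debiased External-model-Assisted Lasso), which injects the external estimator and unlabeled covariance information into the variance term of the debiased estimator and uses a bias-aware, cross-fitted shrinkage step to adapt to three regimes: target-only, near-oracle, and biased-but-informative. The theory establishes coordinate-wise asymptotic normality and adaptive variance reduction and extends the results to the projection parameter under model misspecification and to nonlinear labelers; with the same unlabeled budget, DEAL confidence intervals are strictly shorter than those of the debiased Lasso, PPI, and PPI++, and a shift-aware variant maintains coverage under covariate shift. In simulations and six real datasets (including astronomy, proteomics, and an LLM oracle), the median interval-length ratio is only 0.23–0.53. This may be useful to you: the work achieves variance reduction for a debiased estimator in a semi-supervised setting and connects directly to efficiency theory and debiased ML.
Key techniques: debiased Lasso, cross-fitting, bias-aware shrinkage, prediction-powered inference, semi-supervised inference, projection parameter under misspecification
Why it is useful to you: Connects directly to efficiency theory / debiased ML: in a semi-supervised high-dimensional setting, an external model and unlabeled data yield adaptive variance reduction for the debiased estimator, overcoming the variance inflation of PPI in the near-oracle regime. You can use the high-dimensional asymptotic tools you are very_familiar with to check whether the coordinate-wise normality and variance reduction claims are tight; for the bias-aware tuning of the cross-fitted shrinkage, the M-estimation theory you are moderately_familiar with can be used to analyze its properties under the projection parameter. Can start right away: use your very_familiar high-dimensional asymptotics and minimax bound tools to reproduce the theoretical derivations and check whether the sharper rate is tight.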
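As background for the PPI baseline mentioned in this entry, the simplest prediction-powered estimator targets a mean instead of regression coefficients. The sketch below shows that textbook case only; it is not DEAL and not the paper's high-dimensional procedure. The function names are hypothetical, and the labeled and unlabeled samples are assumed independent.

```python
import statistics


def ppi_mean(y_labeled, pred_labeled, pred_unlabeled):
    """Prediction-powered estimate of E[Y].

    Mean prediction on the unlabeled sample plus a 'rectifier' that
    estimates the predictor's average error from the labeled sample.
    """
    residuals = [y - f for y, f in zip(y_labeled, pred_labeled)]
    return statistics.fmean(pred_unlabeled) + statistics.fmean(residuals)


def ppi_variance(y_labeled, pred_labeled, pred_unlabeled):
    """Plug-in variance: Var(f)/N from unlabeled data + Var(Y - f)/n from labeled data."""
    residuals = [y - f for y, f in zip(y_labeled, pred_labeled)]
    n, big_n = len(residuals), len(pred_unlabeled)
    return (statistics.variance(pred_unlabeled) / big_n
            + statistics.variance(residuals) / n)
```

The variance formula makes the trade-off visible: an accurate predictor shrinks the residual term, while the unlabeled-sample term shrinks with the unlabeled budget N.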

Mathematical statistics / hypothesis testing (hypothesis_testing, 2 papers)

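1. 2606.15636 — Paired Sample Tests for High-dimensional Uncorrelatedness via Random Integration

Authors: Shiyao Huang, Xiaojun Song
Categories: stat.ME
Relevance 8/10 · novelty: new_method
Summary: The paper proposes a nonparametric test of uncorrelatedness between two high-dimensional random vectors. In a high-dimensional setting where n and p diverge together, the goal is to test the null hypothesis that the covariance matrix between the two vectors is zero, with no assumption on the relative size of n and p. The core of the method extends the random integration of Jiang et al. (2023, 2024) to a test statistic that estimates a weighted squared L2 norm of the covariance matrix. Under Gaussian or sub-Gaussian assumptions, the test statistic is shown to be asymptotically standard normal under the null. Monte Carlo simulations show the method outperforms existing tests at detecting "weak but widespread" correlation structures while controlling empirical size well. An empirical analysis of the association between DNA methylation and gene expression illustrates the application. The paper relates directly to your interest in high-dimensional hypothesis testing; the random integration technique can be viewed as a kind of kernel method, it connects with the nonparametric statistics and high-dimensional asymptotic tools you know, and the test's minimax optimality is open for further study.
Key techniques: Random integration, Weighted squared L2 norm of covariance matrix, High-dimensional asymptotics (n, p → ∞), Nonparametric test for uncorrelatedness, Kernel-based test statistic
Why it is useful to you: The paper belongs to high-dimensional hypothesis testing, one of your main interests; its test statistic is built by random integration and relates directly to nonparametric statistics (very_familiar) and high-dimensional asymptotic theory (very_familiar). You can immediately use minimax lower bounds to analyze the test's optimality for detecting weak dependence (can start right away), or reinterpret the structure of the statistic from the viewpoint of higher-order U-statistics (HOIF, under moderately_familiar) to look for tighter convergence rates.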

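2. 2606.15433 — Limit theorems of Azadkia-Chatterjee's conditional graph correlation

Authors: Muhong Gao, Fang Han, Qizhai Li
Categories: math.ST · econ.EM · stat.ME · stat.TH
Relevance 9/10 · novelty: new_theory
Summary: The paper studies the asymptotic theory of the conditional graph correlation measure T_n proposed by Azadkia and Chatterjee. The measure is based on ranks and nearest neighbors; in the population it is zero under conditional independence and 1 under perfect conditional dependence, and it is described as the first conditional dependence measure with this identification property at both extremes. Under general dependence assumptions the authors prove asymptotic normality of T_n with a closed-form limiting variance, which provides the basis for hypothesis tests and confidence intervals. They further propose a consistent variance estimator with O(n log n) computational cost which, combined with existing bias-correction methods, completes the inference theory for T_n. The work fills in the limiting distribution theory that had been missing since the measure was introduced, giving conditional-independence-based hypothesis tests a rigorous asymptotic framework. For you, conditional independence testing is a core tool for causal structure learning, and the asymptotic results can be applied directly to build p-values or confidence regions for causal discovery algorithms based on this measure.
Key techniques: conditional dependence measure, rank and nearest neighbor, asymptotic normality, closed-form limiting variance, consistent variance estimator, bias correction
Why it is useful to you: Directly related to the conditional independence testing subarea of causal inference; the measure and its asymptotic theory provide a practical nonparametric hypothesis testing tool for causal structure learning. Nonparametric statistics and high-dimensional asymptotics from your very-familiar areas can be used directly to assess the test's efficiency and scalability, for example by analyzing its optimality in a minimax framework. Can start right away: using the paper's closed-form variance and bias correction, embed the test in an existing causal discovery algorithm and check its empirical performance.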

Astrostatistics (astrostats, 1 paper)

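1. 2606.16525 — Scalable Bayesian data curation for next-generation radio experiments

Authors: S. A. K. Leeney, E. de Lera Acedo, W. J. Handley, H. T. J. Bevins, G. Allen, D. Anstey, K. Artuc, G. Bernardi, M. Bucher, S. Carey, J. Cavillot, R. Chiello, A. S. Chu, W. Croukamp, J. Cumner, S. Dasgupta, A. K. Dash, D. I. L. de Villiers, J. Dhandha, A. Dragovic, J. A. Ely, A. Fialkov, T. Gessey-Jones, C. Kirkham, G. Kulkarni, A. Magro, P. D. Meerburg, S. Mittal, D. Molnar, R. S. Patel, J. H. N. Pattison, S. Pegwal, C. M. Pieterse, J. R. Pritchard, G. M. Z. Rajpoot, N. Razavi-Ghods, D. Robins, I. L. V. Roque, A. Saxena, K. H. Scheutwinkel, E. Shen, P. H. Sims, M. Spinelli, J. L. Tutt, J. Zhu
Categories: astro-ph.IM · astro-ph.CO
Relevance 8/10 · novelty: application
Summary: Next-generation radio telescopes produce too much data for manual quality review, and the paper proposes a fully automated Bayesian anomaly detection method to address this. The core idea is to fold data curation into the inference: a latent anomaly indicator variable is marginalized out in the likelihood, instead of flagging externally and then filtering. Without predefined thresholds or manual inspection, the method automatically assigns each observation a probabilistic data-curation score. The implementation uses the JAX framework with GPU acceleration, which makes it scalable to large datasets. The authors apply the method to 4655 observations (one year of data) from the REACH radio experiment, identifying weather-driven systematics, drift in instrument components, and narrowband radio-frequency interference, and revealing complex dependencies between data quality and environmental/instrument state. The paper lays out a path for turning data curation in radio astronomy from a manual bottleneck into autonomous inference infrastructure. For statisticians it is an excellent introductory read for understanding the structure and challenges of real astronomical data and for seeing how Bayesian methods and GPU computing are integrated into large-scale pipelines.
Key techniques: Bayesian anomaly detection, marginalized latent indicator, GPU-accelerated inference (JAX), probabilistic data curation, radio telescope pipeline, automatic flagging
Why it is useful to you: The paper is astrostatistics gateway reading: it introduces the data curation problem in radio astronomy in clear, non-technical language and presents a fully automated Bayesian solution. Your "software development" skills will help directly in understanding the implementation logic of its JAX GPU pipeline, and "nonparametric statistics" helps in examining its modeling assumptions. Worth reading in full as an entry point into astronomical data analysis. To contribute in this direction later, you would need to learn the noise and systematics models specific to radio astronomy (e.g. RFI characteristics); assessed as feasible in the medium term.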
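The idea of marginalizing a latent anomaly indicator can be illustrated with a generic two-component mixture. The sketch below is an assumption for illustration, not the paper's likelihood or its JAX pipeline: each datum is either "good" (Gaussian around the model prediction) or "anomalous" (uniform over a range of width `delta`), with prior anomaly probability `eps`. All names and the uniform anomaly density are hypothetical choices.

```python
import math


def gaussian_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def log_marginal_likelihood(data, mu, sigma, eps, delta):
    """Log-likelihood with the per-datum anomaly indicator summed out."""
    total = 0.0
    for y in data:
        good = (1.0 - eps) * gaussian_pdf(y, mu, sigma)
        bad = eps / delta  # uniform density over an assumed data range
        total += math.log(good + bad)
    return total


def anomaly_probabilities(data, mu, sigma, eps, delta):
    """Posterior probability that each datum is anomalous, given the parameters."""
    probs = []
    for y in data:
        good = (1.0 - eps) * gaussian_pdf(y, mu, sigma)
        bad = eps / delta
        probs.append(bad / (good + bad))
    return probs
```

Because the indicator is summed out inside the likelihood, no hard threshold is needed: each observation receives a probability, in the spirit of the per-observation curation score described in the summary.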

📌 Medium-relevance papers (grouped by topic)

Causal inference (causal_inference, 1 paper)

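1. 2606.15031 — Partial Identification from LLM Prompts

Authors: Xiaohong Chen, Elie Tamer
Categories: econ.EM
Relevance 7/10 · novelty: new_theory
Summary: In a setting where the latent true label is unobserved, the paper studies partial identification of the true prevalence θ=P(X=1) from panel data of repeated reports by large language models (LLMs). LLM reports are modeled as a two-component finite mixture. The core finding is that without separation restrictions on the latent components, θ is completely unidentified, and weak stochastic ordering restrictions (first-order dominance, monotone likelihood ratio, mean ordering) still leave the identified set at [0,1]. Identifying power comes from external calibration scores and events (drawing on the misclassification and corrupted-data literature); the authors characterize the resulting sharp bounds and show that using the full distribution of the scores is more informative than using only the mean. When several named models are queried repeatedly, what matters for identifying θ is not the count of positive answers but the pattern of agreement across prompts and models; vote counting discards this information. An extension derives the implied bounds on regression coefficients when X is an unobserved regressor. This may be useful to you: the paper applies partial identification theory systematically to the new setting of LLM misclassification, and the regressor-bound extension connects directly to sensitivity analysis for unobserved confounders/treatments in causal inference.
Key techniques: partial identification, finite mixture model, misclassification bounds, sharp bounds characterization, stochastic ordering restrictions, corrupted data literature
Why it is useful to you: The paper connects directly to identification theory and sensitivity analysis within causal inference: the LLM misclassification setting is in essence a partial identification problem with unobserved treatment/labels, and the regression-coefficient bounds follow the same logic as sensitivity analysis in IV/proxy variable settings. "identification theory in causal inference" from your technical_arsenal is enough to tackle the paper's mixture identification and bound characterization problems directly. Can start right away: use your very_familiar partial identification and minimax thinking to examine the tightness of the bounds, or introduce similar external calibration-score restrictions in longitudinal/proximal CI settings to derive new bounds.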

Nonparametric / semiparametric (nonparam_semipara, 7 papers)

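1. 2606.15450 — Kernel Density Estimation by Spectral Decomposition: Data-Driven Tapering and Superposition

Authors: Mitchell A. Thornton
Categories: stat.ME · eess.SP
Relevance 7/10 · novelty: new_method
Summary: The paper treats bandwidth selection for kernel density estimation in the characteristic-function domain. Spectral decomposition of the cyclic-group-averaged covariance matrix of the binned data gives the squared empirical characteristic function; its spectrum sits above a theoretical noise floor of 1/n, and the bandwidth is determined by where the spectrum is truncated. An automatic spectral cutoff selector is proposed that minimizes a frequency-domain error criterion; it matches the rule of thumb on smooth densities and comes close to the optimal fixed bandwidth on multimodal densities. The paper further proposes an adaptive Wiener filter that assigns an optimal attenuation weight to each frequency and can match or beat the optimal fixed bandwidth, including on spiky and comb-like densities; it is extended to deconvolution with known measurement error. Because the Wiener estimate resolves structure but is less compact than a mixture model, two kinds of superposition are proposed: a piecewise partition, and a superposition of a smooth Gaussian-mixture base with a band-limited residual. A data-driven noise-floor estimate replaces the theoretical 1/n and is more robust to rounded and heaped data. On the Marron-Wand benchmark at n=5000, the Wiener filter and the superposition method rank in the top two, ahead of cross-validation, and the advantage grows with sample size. Six real datasets are used for validation. Useful to you: this frequency-domain approach offers a new perspective on nonparametric density estimation and relates directly to the nonparametric statistics and high-dimensional asymptotic tools (spectral analysis, characteristic functions) you know.
Key techniques: spectral decomposition, Wiener filter, characteristic function domain, kernel density estimation, Gaussian mixture superposition, automatic bandwidth selection
Why it is useful to you: Connects directly to your interest in nonparametric statistics: the paper estimates densities by spectral truncation in the characteristic-function domain, unlike traditional kernel methods. Nonparametric statistics and minimax bounds from your arsenal can be used to analyze the method's convergence rate and adaptivity. Suggested as feasible in the medium term: you would need more familiarity with spectral analysis tools in high-dimensional asymptotics (e.g. the Marchenko-Pastur law for random-matrix eigenvalues), but you can already reproduce the code and test its performance on real data.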

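2. 2606.15836 — Minimax Synthesis of Network Mechanisms

Authors: Marios Papamichalis, Regina Ruane
Categories: math.ST · stat.ME · stat.TH
Relevance 7/10 · novelty: new_theory
Summary: The paper considers a single observed network generated by a combination of several candidate mechanisms (e.g. communities, hubs, clustering). The goal is to estimate, from one graph, the contribution strength of each mechanism and the rule by which they combine (additive or interactive). The authors address two core challenges. First, when the mechanisms themselves must also be estimated from the graph, direct fitting biases the contribution estimates toward zero; they propose a bias correction and construct valid confidence intervals. Second, on whether the combination rule can be recovered, they find a sharp threshold in graph density: additive and interactive combination can be distinguished only when the graph is dense enough, and otherwise no test works. On the theory side, they establish minimax optimal rates that match the known-design benchmark and prove the proposed estimator attains them. Simulations and real-data applications validate the method; for example, in one network the confidence interval for one candidate mechanism's contribution rules out a positive contribution. The paper brings a minimax framework to the network mechanism synthesis problem and is a useful reference for nonparametric estimation and network data analysis.
Key techniques: network mechanism synthesis, bias correction for estimated mechanisms, minimax optimal rate, sharp threshold for combination rule, confidence interval construction
Why it is useful to you: The paper's minimax rate analysis connects directly to the nonparametric minimax lower bound methods you know (minimax bounds for estimation problems, under very_familiar). The bias correction problem in estimating network mechanisms is similar in spirit to how you handle estimation error in high-dimensional statistics. Feasible in the medium term: you may not be familiar with network models (such as stochastic block models), but the minimax rate tools transfer immediately; you would first need to fill in background on statistical network models. If you want to go deeper, start from understanding the proof of the sharp threshold and combine it with your high-dimensional random matrix tools (if spectral methods are involved).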

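3. 2606.16913 — Optimal Multiscale Learning of Linear Operators

Authors: Jiaheng Chen, Daniel Sanz-Alonso
Categories: math.ST · cs.NA · math.NA · stat.ML · stat.TH
Relevance 7/10 · novelty: new_theory
Summary: In the setting of learning bounded linear operators between Sobolev spaces, the goal is to estimate the operator from noisy input-output data and obtain the minimax rate under a Sobolev operator-norm loss. The authors rewrite the problem as an infinite-dimensional matrix regression in wavelet coordinates, revealing non-uniform local estimation difficulty with a heterogeneous two-sided multiscale structure. They construct a finite-resolution blockwise least-squares estimator and prove that it attains the minimax convergence rate in the Sobolev operator norm. By allocating scale-adaptive sample sizes, the estimator achieves optimal computational cost among dense least-squares implementations, making the statistical-computational tradeoff explicit. This may be useful to you: the minimax bounds and multiscale blockwise estimation strategy give a clear rate-cost analysis framework for nonparametric inverse problems and operator learning.
Key techniques: Sobolev operator-norm loss, wavelet coordinate representation, blockwise least-squares estimation, minimax rate, scale-adaptive sample allocation, statistical-computational tradeoff
Why it is useful to you: The paper connects directly to the statistical-computational tradeoff, a subarea of your primary interests, and makes explicit the gap and achievability between the minimax statistical rate and the optimal computational cost of dense algorithms in operator learning. With the minimax bounds for estimation problems and inverse problems with random noise tools you are very_familiar with, you can directly examine the tightness of the minimax bounds and the logic behind the blockwise estimator; for the computational cost analysis, you could try the higher-order U-statistics / tensor contraction viewpoint (moderately_familiar) to see whether the dense-matrix implementation leaves room for better contraction-order optimization. Follow-up assessment: can start right away. Use minimax and inverse-problem tools to check whether the rate is tight, and explore computational cost improvements from the tensor contraction viewpoint.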

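4. 2606.15836 — Minimax Synthesis of Network Mechanisms

Authors: Marios Papamichalis, Regina Ruane
Categories: math.ST · stat.ME · stat.TH
Relevance 7/10 · novelty: new_theory
Summary: In a single-observed-network setting, the goal is to estimate the contribution strengths of several candidate mechanisms (communities, hubs, clustering, etc.) to graph generation, along with the combination rule; the key assumption is that the graph is generated by a linear or interactive superposition of the candidate mechanisms. Fitting mechanisms and strengths on the same data shifts the strength estimates toward zero; the paper proposes a debiasing correction and builds valid confidence intervals. On identifiability of the combination rule, it proves a sharp threshold in graph density: above the threshold one can determine exactly whether the superposition is additive or interactive, and below it no test can tell them apart. Estimation works by calibrating the candidate mechanisms against observed edges, and matching minimax rates are established (covering both known-design and estimated-design settings). Simulations and real networks validate the method, with confidence intervals that can rule out any positive contribution from a given candidate mechanism. This may be useful to you: the minimax bounds and debiasing ideas may transfer to identification and estimation of superposed effects of multiple interventions in causal inference.
Key techniques: minimax rate, sharp threshold, debiasing correction, confidence interval, network mechanism synthesis, estimated-design benchmark
Why it is useful to you: Directly connects nonparametric minimax theory with semiparametric debiasing: the paper establishes matching minimax rates and debiased confidence intervals in the estimated-design setting (nuisance estimates and the target parameter learned from the same data), which is closely analogous to the nuisance-orthogonal estimation setting of semiparametric efficiency / debiased ML. With your very_familiar minimax bounds tools you can directly examine whether the claimed minimax rate is tight; whether the specific form of the debiasing correction is equivalent to a one-step / orthogonal score can be checked with the semiparametric theory you are moderately_familiar with. The proof technique for the sharp threshold may require specific probabilistic tools (e.g. large deviations / branching processes) that are not currently in your arsenal. Follow-up assessment: feasible in the medium term. First confirm the equivalence between its debiasing and orthogonal scores using semiparametric theory (moderately_familiar), then consider transferring to the causal superposed-effects setting.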

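5. 2606.16373 — Higher-order spectral perturbation expansions II: Kernel matrices and manifold learning

Authors: Bernhard Stankewitz, Martin Wahl
Categories: math.ST · math.PR · math.SP · stat.TH
Relevance 7/10 · novelty: sharper_rate
Summary: Under weak assumptions (the Mercer condition and a local Weyl law), the paper studies spectral concentration bounds for kernel matrices as error control for approximating the corresponding kernel integral operator; the target estimand is the spectral deviation between the kernel matrix and the integral operator. The core method is a higher-order spectral perturbation expansion, which precisely characterizes spectral convergence under large multiplicities, large effective dimension, and heavy-tailed distributions, overcoming the limitations of traditional first-order perturbation or Davis-Kahan-type bounds in such settings. The convergence results give minimax-type sharp bounds on the spectral deviation and apply to infinite-dimensional PCA, manifold learning, and wavelet priors in Bayesian nonparametrics. The main theoretical results verify the tightness of the bounds in two examples: the heat kernel on the sphere and a Bayesian nonparametric wavelet prior. This may be useful to you: the higher-order spectral perturbation expansion provides a new spectral approximation tool for semiparametric efficiency in RKHS/kernel methods and for the projection analysis of higher-order U-statistics.
Key techniques: higher-order spectral perturbation expansion, Mercer kernel integral operator, local Weyl law, spectral concentration bound, kernel matrix approximation, infinite-dimensional PCA
Why it is useful to you: Connects directly to spectral approximation of RKHS operators in nonparametric/semiparametric theory and to spectral analysis of kernel matrices in high-dimensional statistics. "high-dimensional asymptotics" and "minimax bounds for estimation problems" from your technical_arsenal can be applied directly to verifying the tightness of the paper's spectral concentration bounds; "computation of higher-order U-statistics (tensor contraction)" suggests a follow-up direction, namely comparing the computational complexity of higher-order spectral perturbation expansions with that of higher-order U-statistic projections. Can start right away: use your very_familiar minimax bound and high-dimensional asymptotic tools to verify or generalize the spectral bounds; to go deeper into the geometric interplay between manifold learning and the local Weyl law, you would need to add RKHS spectral theory details to the semiparametric theory you are moderately_familiar with (feasible in the medium term).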

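6. 2606.16913 — Optimal Multiscale Learning of Linear Operators

Authors: Jiaheng Chen, Daniel Sanz-Alonso
Categories: math.ST · cs.NA · math.NA · stat.ML · stat.TH
Relevance 7/10 · novelty: new_theory
Summary: The paper studies the statistical and computational limits of learning bounded linear operators between Sobolev spaces from noisy input-output data. In wavelet coordinates, the problem is recast as an infinite-dimensional matrix regression with heterogeneous two-sided multiscale structure. The authors establish the minimax optimal rate for the Sobolev operator-norm loss and construct a finite-resolution block least-squares estimator that attains it. The analysis reveals non-uniform local estimation difficulty across scales; by allocating scale-adaptive sample sizes, the estimator achieves optimal computational cost among dense least-squares implementations. The work treats operator learning in function spaces as a multiscale inverse problem and gives a clear analysis of convergence rates and computational complexity. Given the minimax bound and nonparametric tools you are very familiar with, this is a direct application of them in the emerging area of operator learning and works as further reading.
Key techniques: Wavelet transform, Infinite-dimensional matrix regression, Block least squares, Minimax lower bound, Sobolev operator norm, Multiscale adaptation
Why it is useful to you: The paper relates directly to your core interest (minimax theory in nonparametric statistics) and to "minimax bounds for estimation problems" and "inverse problems with random noise" in your arsenal (both very_familiar); you can use these tools right away to check the tightness of the method's rate or to generalize it to other function spaces. The paper's multiscale analysis of computational cost also gives a concrete case for the statistical-computational tradeoff you are following, and is an extension you can start on right away.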

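7. 2606.16043 — Bias-Reduced GEE via Adjusted Estimating Equations, with Odds-Ratio Extensions

Authors: Anestis Touloumis
Categories: stat.ME
Relevance 6/10 · novelty: new_method
Summary: Regression estimators from generalized estimating equations (GEE) for clustered data can be badly biased in small samples. The paper treats the GEE estimator as an M-estimator for clustered data and derives adjusted estimating equations that remove the first-order bias term, accounting for the dependence of the working covariance on the mean parameters. The framework includes three bias-reduced estimators and three one-step bias-corrected estimators, and covers the earlier bias-corrected estimators of Lunardon & Scharfstein (2017) and Paul & Zhang (2014) as special cases. The method applies to general response types (with the association structure parameterized by correlation coefficients) and is extended for the first time to correlated binary data under a pairwise odds-ratio parameterization, which performs better in small samples because its compatibility constraints with the marginal means are less restrictive. Under standard regularity conditions, the six estimators share the asymptotic distribution of ordinary GEE; simulations show they reduce bias effectively while retaining efficiency, with coverage probabilities close to nominal. A clinical trial example illustrates the method. The R package geer provides a ready implementation. For you, the paper is a representative example of a systematic framework for bias correction in M-estimation; it relates directly to the M-estimation theory you are moderately_familiar with, and the software implementation can inform your own statistical computing toolbox.
Key techniques: bias-reduced estimating equations, M-estimation, one-step bias correction, correlation coefficient parameterization, pairwise odds-ratio parameterization
Why it is useful to you: The paper belongs to bias correction methods for M-estimation within semiparametric theory and connects directly to the M-estimation theory subarea you are moderately_familiar with. You can use the minimax bounds techniques you are very_familiar with to analyze the finite-sample performance of the bias-corrected estimators, and the derivation of the adjusted estimating equations may transfer to causal inference (e.g. small-sample bias correction for AIPW). Can start right away: you already know M-estimation theory and software development, so you can read the source of the R package geer directly and try applying similar bias corrections to estimating equations in causal inference.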

Mathematical statistics / hypothesis testing (hypothesis_testing, 4 papers)

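1. 2606.16089 — Wild bootstrap for mean response inference in functional linear regression models

Authors: Hyemin Yeon, Xiongtao Dai, Daniel Nordman
Categories: stat.ME · math.ST · stat.TH
Relevance 7/10 · novelty: new_method
Summary: The paper proposes a wild bootstrap method for inference on the mean response in functional linear regression models. The traditional residual bootstrap is fast but cannot handle heteroscedastic errors, while the paired bootstrap applies broadly but is computationally expensive. The proposed wild bootstrap keeps the computational efficiency while, like the paired bootstrap, remaining valid in complex settings such as heteroscedastic errors. The theory proves consistency of the bootstrap estimator and gives practical guidance on choosing the truncation parameter. Numerical simulations show good performance in both confidence interval coverage and computation speed. A weather data example illustrates the application, and an R package implementation is provided. The method offers a fast and reliable alternative for uncertainty quantification in functional regression.
Key techniques: wild bootstrap, functional linear regression, mean response inference, bootstrap consistency, heteroscedastic errors, truncation level selection
Why it is useful to you: The paper relates directly to your interest in hypothesis testing: it provides a theoretically justified uncertainty quantification method for functional linear models and may inform your understanding of when the bootstrap applies in complex inference problems. Your very familiar nonparametric statistics and software development skills will let you assess the technical details quickly, and the wild bootstrap idea might transfer to sensitivity analysis in causal inference or to inference for U-statistics. Can start right away: you already have enough background in the bootstrap and asymptotic theory to reproduce the method or extend it to other settings.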
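The wild bootstrap idea in this entry can be sketched in the much simpler setting of scalar linear regression. The block below is an illustration under that assumption, not the paper's functional-regression procedure (there is no truncation step, and the percentile interval is a choice made here for brevity). Residuals are multiplied by independent Rademacher signs, so each resampled error keeps the magnitude observed at its own design point, which is what accommodates heteroscedasticity. Function names are hypothetical.

```python
import random
import statistics


def ols_fit(x, y):
    """Intercept and slope of simple least-squares regression of y on x."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope


def wild_bootstrap_mean_response(x, y, x0, n_boot=500, seed=0, alpha=0.05):
    """Estimate E[Y | X = x0] with a wild-bootstrap percentile interval."""
    rng = random.Random(seed)
    intercept, slope = ols_fit(x, y)
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    estimate = intercept + slope * x0
    draws = []
    for _ in range(n_boot):
        # Flip each residual's sign at random; the design points stay fixed.
        y_star = [fi + ri * rng.choice((-1.0, 1.0)) for fi, ri in zip(fitted, residuals)]
        a_star, b_star = ols_fit(x, y_star)
        draws.append(a_star + b_star * x0)
    draws.sort()
    k = int(alpha / 2 * n_boot)
    return estimate, (draws[k], draws[n_boot - 1 - k])
```

Compared with a paired bootstrap, the design is never resampled, so quantities that depend only on the covariates could be computed once and reused across replicates; that is the source of the speed advantage the summary refers to.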

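2. 2606.16683 — Two fully specified Bayes factors for hypothesis testing and sensitivity analysis in process tracing

Authors: Matias López, Jake Bowers, Daniel Gajardo Cooper
Categories: stat.ME · stat.OT
Relevance 7/10 · novelty: new_method
Summary: In the process tracing setting (small-sample case studies for qualitative causal inference), the paper aims to address a problem with the Bayes factor approach of Fairfield & Charman (2022): evidence probabilities must be specified by hand, which introduces subjective bias. By constructing two fully specified generative models of observation, the authors derive the evidence probabilities directly from the models and obtain fully specified Bayes factors for hypothesis testing. The framework lets researchers report how much observation bias a positive conclusion can absorb before it flips, and brings dependence on the weight of the smoking gun into the sensitivity analysis. The empirical part applies the method to six process tracing studies from top political science journals and shows that the final conclusions are driven more by the sensitivity checks than by the Bayes factor itself. For you, the paper shows how Bayesian hypothesis testing and sensitivity analysis can be formalized in small-sample qualitative causal inference, and can serve as a cross-disciplinary reference for sensitivity analysis in causal inference.
Key techniques: fully specified Bayes factor, generative model of observation, sensitivity analysis for observation bias, process tracing hypothesis testing, smoking gun weight
Why it is useful to you: The paper connects to the sensitivity analysis subarea of causal inference, but its core is the application of Bayesian hypothesis testing to qualitative small-sample studies, not a semiparametric or high-dimensional efficiency theory framework. The minimax bounds (very_familiar) or identification theory (moderately_familiar) in your arsenal are hard to apply directly to this Bayesian subjective-probability setting, since the framework lacks frequentist efficiency theory or high-dimensional structure. Not feasible for now: the core machinery (probability specification in Bayesian generative models and bias-sensitivity measures) lies outside your frequentist efficiency and high-dimensional arsenal, and there is limited room for methodological transfer.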

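3. 2606.14837 — Bartlett adjustment for Gaussian random effects meta-analysis

Authors: Haben Michael
Categories: stat.ME
Relevance 6/10 · novelty: minor
Summary: In the Gaussian model for random-effects meta-analysis, the χ² approximation to the likelihood ratio test is inaccurate in small samples (few studies), and the paper derives a Bartlett correction factor to address this. The work is based on a higher-order asymptotic expansion of the log-likelihood, computes the terms of the correction factor that depend on the model parameters, and corrects an erroneous formula in the literature. The Bartlett correction rescales the likelihood ratio statistic by a factor of the form 1+B/n (dividing by it, in the standard formulation) to reduce the discrepancy between its finite-sample distribution and the asymptotic χ² distribution. The derivation uses cumulant expansions and a second-order analysis of the expected deviation. The main results are the correct closed-form expression for the correction factor and theoretical verification that it improves the finite-sample level of the test. The work shows a direct application of higher-order asymptotic theory to small-sample inference and closely matches your interest in hypothesis testing and mathematical statistics.
Key techniques: Bartlett correction, likelihood ratio test, small-sample asymptotics, higher-order expansion, random effects meta-analysis
Why it is useful to you: The paper relates directly to hypothesis testing (finite-sample properties in particular), one of your main interests. The nonparametric asymptotic theory you know (very_familiar) helps in understanding its higher-order expansion framework, but the concrete computation of the Bartlett correction depends on the analytic structure of the likelihood, which falls under M-estimation (moderately_familiar). The work is therefore feasible in the medium term: you would first need to build up M-estimation theory or higher-order likelihood expansion techniques before you could reproduce or generalize the correction effectively. If you follow the higher-order ideas in higher-order U-statistics and HOIF, the expansion technique here can also serve as a point of comparison.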

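4. 2606.15097 — Separate versus pooled winsorization for group mean contrasts: a finite-sample theory

Authors: Chao Cheng (Washington University in St. Louis), Chenshan Hu (University of Colorado Boulder), Yukai Huang (Suffolk University)
Institutions: Washington University in St. Louis · University of Colorado Boulder · University of Colorado System · Suffolk University
Categories: stat.ME · stat.AP
Relevance 6/10 · novelty: new_theory
Summary: In two-group mean comparisons (e.g. randomized trials, difference-in-differences), winsorization is commonly used for robust estimation when the data are heavy-tailed, but the two existing strategies, truncating after pooling (pooled winsorization) and truncating within each group (separate winsorization), lack finite-sample theory. The paper establishes finite-sample deviation bounds for both strategies and proves an impossibility result for pooled winsorization: no deterministic rule for choosing the truncation level can achieve the sub-Gaussian rate. Separate winsorization, by contrast, can achieve that rate, and the guarantee extends to arbitrary linear contrasts (e.g. differences in means against a control group). Simulations show that pooled winsorization has substantial bias, whereas separate winsorization is nearly unbiased and concentrates well. The paper therefore recommends winsorizing within each group separately, not after pooling. The result gives a precise finite-sample theory for mean contrasts under heavy-tailed distributions, and nonparametric statistics and minimax bounds tools can be used to analyze similar bounds for other robust estimators.
Key techniques: winsorization, finite-sample deviation bounds, sub-Gaussian rate, impossibility result, group mean contrasts
Why it is useful to you: Directly related to robust estimation of means within hypothesis testing, part of the mathematical statistics direction among your primary interests. The techniques you are very familiar with (nonparametric statistics, minimax bounds) can be used to analyze the finite-sample performance of other robust methods (e.g. Huber, trimming). This is an exploration you can start right away, using a similar impossibility-proof framework or finite-sample bound derivations.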
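The difference between the two strategies is easy to see on a toy example. The sketch below is a plain illustration, assuming a simple symmetric rule that clips the k = ⌊q·n⌋ smallest and largest values; the paper's own choice of truncation levels is not reproduced, and the function names are hypothetical.

```python
import statistics


def winsor_thresholds(values, q):
    """Lower and upper clipping points that clip the k = floor(q*n) extreme values per tail."""
    ordered = sorted(values)
    k = int(q * len(ordered))
    return ordered[k], ordered[len(ordered) - 1 - k]


def clip(values, lower, upper):
    """Pull every value into the interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def separate_contrast(group_a, group_b, q):
    """Mean difference after winsorizing each group at its own thresholds."""
    a = clip(group_a, *winsor_thresholds(group_a, q))
    b = clip(group_b, *winsor_thresholds(group_b, q))
    return statistics.fmean(a) - statistics.fmean(b)


def pooled_contrast(group_a, group_b, q):
    """Mean difference after winsorizing both groups at thresholds from the pooled sample."""
    lower, upper = winsor_thresholds(list(group_a) + list(group_b), q)
    a = clip(group_a, lower, upper)
    b = clip(group_b, lower, upper)
    return statistics.fmean(a) - statistics.fmean(b)
```

With `group_a = [0, 1, 2, 3, 100]` and `group_b = [10, 11, 12, 13, 14]` at `q = 0.2`, the separate rule clips the outlier 100 down to 3, whereas the pooled thresholds (2 and 13) are driven by the location gap between the groups and leave the outlier at 13. That mixing of group location with tail clipping is the kind of distortion the pooled strategy introduces.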

Statistical computing / algorithms (stat_computing, 1 paper)

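1. 2606.16681 — Spectral Sparsification of Laplacian-Constrained Gaussian and Hüsler-Reiss Graphical Models

Authors: Ignacio Echave-Sustaeta Rodríguez, Aida Abiad, Frank Röttger
Categories: stat.ME · stat.ML
Relevance 7/10 · novelty: new_method
Summary: The paper studies structure learning for Laplacian-constrained Gaussian graphical models (LCGGM) and Hüsler-Reiss extreme-value graphical models. The likelihoods of both models are parameterized by a graph Laplacian matrix, and when edge weights are constrained to be positive the graph structure can be learned without tuning parameters, but the estimates are usually too dense, which limits interpretability and scalability. To improve estimation accuracy, the authors propose spectral graph sparsification as a post-estimation step: the original estimate is replaced with a sparser Laplacian that is spectrally close, and the model is refitted on the sparse graph, giving two new methods, Spectral-LCGGM and Spectral-HR. The theory analyzes properties of the proposed estimators, including spectral approximation error and convergence of the model fit. Simulations on Erdős–Rényi and stochastic block model graphs validate the methods, and real-data applications show their potential for network topology learning and extremal dependence modeling. For you, the paper is a representative case of algorithmic optimization in statistical computing: spectral sparsification is a computationally friendly post-processing technique that can be combined with the graph structure learning or high-dimensional covariance estimation problems you know, but it requires first picking up the basics of spectral graph theory (feasible in the medium term).
Key techniques: spectral graph sparsification, Laplacian-constrained Gaussian graphical model, Hüsler-Reiss graphical model, graph structure learning, Erdős–Rényi graphs, stochastic block model
Why it is useful to you: The paper connects to your main interest "statistical computing (numerical methods and algorithms)", specifically the subarea of sparsification algorithms applied after graphical model estimation. Your very familiar "software development" skills can be used directly to implement and test the method, but the theoretical analysis needs spectral graph theory (not in your current arsenal), so it is not feasible for now: you would first need to learn tools such as median/spectral approximation. If you later look at concrete algorithm design in the statistical-computational tradeoff, the paper can serve as an introductory read.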

Other (other, 1 paper)

1. 2606.16373 — Higher-order spectral perturbation expansions II: Kernel matrices and manifold learning

Authors: Bernhard Stankewitz, Martin Wahl
Categories: math.ST · math.PR · math.SP · stat.TH
Relevance 7/10

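🗂 Other papers (LLM score only, no summary generated)

Papers for which no summary was generated, sorted by LLM score from high to low; only the score and a brief comment are kept, so the full list can be traced back later. These are generally papers whose relevance is below the display threshold; some archived pages also include high-scoring papers that were not expanded because of the daily summary cap at the time (their scores are still clearly shown).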

1. 2606.15002 — Decision Theory for the Archetype Discovery Problem

Authors: José Luis Montiel Olea, Amilcar Velez, Zhuoheng Xu, Haomin Yu, Shunqi Zhang
Categories: econ.EM · stat.AP · stat.ME
Relevance 7/10 · novelty: new_method
Scoring rationale: Decision-theoretic approach for summarizing heterogeneous policy effects, connecting to causal inference and treatment effect heterogeneity.

2. 2606.16080 — Bayesian joint modelling using semiparametric accelerated failure time approaches

Authors: Ding Ma, Patrick Maher, Andrew Martin
Categories: stat.ME
Relevance 6/10
Scoring rationale: Semiparametric joint modeling of longitudinal and time-to-event data touches semiparametric theory and causal inference (informative censoring), though focus is more applied.

3. 2606.14899 — Direct High-Resolution Imaging of Earth-like Exoplanets with the Solar Gravitational Lens

Authors: Slava G. Turyshev
Categories: astro-ph.IM · astro-ph.EP
Relevance 6/10
Scoring rationale: Astrostats gateway: inverse problem (regularized Fourier/Wiener) with explicit noise/operator models is a genuine data-analysis question, but heavy astroph jargon limits outsider accessibility.

4. 2606.16922 — Graph Neural Networks Trained on Null Signal for Angle Reconstruction in X-ray Polarimetry

Authors: Vittorio Latorre (Universitas Mercatorum, Piazza Mattei 10, 00186 Rome, Italy), Victor Rodriguez-Fernandez (Universidad Politecnica de Madrid, Madrid, Spain), Alessandro Di Marco (INAF-IAPS, Via Fosso Del Cavaliere 100, 00133 Rome, Italy), Fabio La Monaca (INAF-IAPS, Via Fosso Del Cavaliere 100, 00133 Rome, Italy), Fabio Muleri (INAF-IAPS, Via Fosso Del Cavaliere 100, 00133 Rome, Italy), Paolo Soffitta (INAF-IAPS, Via Fosso Del Cavaliere 100, 00133 Rome, Italy)
Institutions: Mercatorum University · Universidad Politécnica de Madrid · National Institute for Astrophysics
Categories: astro-ph.IM
Relevance 6/10
Scoring rationale: Astrostats gateway: GNNs on sparse non-Euclidean data to avoid geometric bias is an interesting data-analysis problem, though the ML focus somewhat obscures deeper statistical modeling.

5. 2606.14800 — Bridging data-driven priors via the score function for posterior sampling -- Comparative review and experimental study

Authors: Elhadji Cisse Faye, Mame Diarra Fall, Sylvain Delchini, Nicolas Dobigeon
Categories: stat.ME · cs.LG · eess.IV · stat.ML
Relevance 5/10
Scoring rationale: Bayesian inverse problem review with score-based priors; method comparison, not core primary interest.

6. 2606.14902 — Heterogeneous behavioral mechanisms in epidemiological models

Authors: Jessica Pavani, Rob Deardon, Alexandra M. Schmidt
Categories: stat.ME · stat.AP
Relevance 5/10
Scoring rationale: Epidemiological model with behavioral heterogeneity; moderate relevance as epidemiology application.

7. 2606.15145 — On the estimation of the median odds ratio for measuring contextual effects in multilevel binary data from complex survey designs

Authors: Shafayet Khan Shafee, M. Shafiqur Rahman
Categories: stat.ME
Relevance 5/10
Scoring rationale: Median odds ratio for multilevel binary data; limited overlap with primary interests.

8. 2606.15237 — Optimized Sequential Testing for Binary Ensemble Classifiers

Authors: Joseph Kalman, Amit Moscovich
Categories: stat.ME
Relevance 5/10
Scoring rationale: Sequential testing for ensemble classifiers touches statistical computing and hypothesis testing, but focuses on ML cost-reduction rather than deep mathematical or causal theory.

9. 2606.15526 — Latent Variable Models for Distributional Features

Authors: Luna Fazio, Paul-Christian Bürkner
Categories: stat.ME
Relevance 5/10
Scoring rationale: Latent variable models for distributional features involve estimation theory, but the psychological application and two-step estimation focus are methodologically adjacent rather than central.

10. 2606.15933 — A Comparison of R Packages for Estimating Generalized Linear Mixed Models

Authors: Xiang Li, Mirko Signorelli
Categories: stat.ME · stat.CO
Relevance 5/10
Scoring rationale: Systematic comparison of R packages for GLMMs; touches statistical computing/software but lacks theoretical depth.

11. 2606.16289 — Moment-Free Kunchenko Stochastic Polynomials via Empirical Characteristic Function

Authors: Serhii Zabolotnii
Categories: stat.ME · math.ST · stat.TH
Relevance 5/10
Scoring rationale: Stochastic polynomials via empirical characteristic function touches estimation theory and heavy-tailed laws, but the specific polynomial construction is somewhat tangential to primary interests.

12. 2606.16872 — Towards Fair Predictions: Group Conditional Concordance Index to Quantify Fairness in Time-to-Event Prognostication

Authors: Haoyuan Wang, Riddhiman Bhattacharya, Richardo Henao, Daniel Wojdyla, Chuan Hong, Matthew Engelhard
Categories: stat.ME
Relevance 5/10
Scoring rationale: Fairness measure for survival analysis; low relevance to the main statistical interests, an applied methodological extension.

13. 2606.17022 — Learning the Geometry of Data: A Mathematical Review of Shape Space Analysis

Authors: Gary P. T. Choi, Khanh Dao Duc, Shira Faigenbaum-Golovin, Karen Habermann, Emmanuel Hartman, Christoph von Tycowicz, Chi Zhang, Wenjun Zhao, Felix Zhou
Categories: math.ST · cs.LG · stat.ML · stat.TH
Relevance 5/10
Scoring rationale: Review of shape spaces is a statistical geometry survey, not directly advancing your primary interests.

14. 2606.14769 — Agentomics: Economic Foundations for the Valuation, Attribution, and Pricing of AI Agents in Human-AI Workflows

Authors: Quanyan Zhu
Categories: econ.EM · cs.AI · cs.GT
Relevance 5/10
Scoring rationale: Economic framework for AI agent valuation, tangential to your statistical primary interests.

15. 2606.14836 — To Combine or Not? Consolidating Horizontal Acquisitions in Multi-sided Market

Authors: Pallavi Pal
Categories: econ.EM
Relevance 5/10
Scoring rationale: Applied econ paper using APC decomposition for merger effects; weak secondary match to econ/causal but lacks deep causal methodology.

16. 2606.14887 — Estimating Sloppy Directions via KDE: The Case of Kirman's Ants

Authors: Karl Naumann-Woleske
Categories: econ.EM · econ.GN · q-fin.EC
Relevance 5/10
Scoring rationale: KDE-based sloppy model analysis, loosely connected to statistical computing but not a primary interest.

17. 2606.15593 — Discrete Choice and Competitive Reactions: End-to-End Simulation with the R Package cash

Authors: Jan H. R. Dressler, Peter Kurz, Winfried J. Steiner
Categories: econ.EM
Relevance 5/10
Scoring rationale: R package for discrete choice competitive simulation, relevant as an economic application and software development, but not central to primary interests.

18. 2606.15748 — Estimating Demand for a New Product

Authors: Sizhong Sun
Categories: econ.EM
Relevance 5/10
Scoring rationale: Demand estimation method from willingness-to-pay data, relevant to economic theory and applied econometrics but not directly aligned with primary statistical interests.

19. 2606.16773 — Generative Predictive Distributions for Time Series¶

Authors: Jordi Llorens-Terrazas, Mika Meitz
Categories: econ.EM · stat.ME · stat.ML
Relevance 5/10
Scoring rationale: Generative predictive distributions for nonlinear time series, touches on statistical computing and nonparametric modeling but not a primary focus.

20. 2606.15361 — Spiking Neural Dedispersion: A Neuromorphic Fast Radio Burst Detection Pipeline¶

Authors: Alessio Magro
Categories: astro-ph.IM
Relevance 5/10
Scoring rationale: Astrostats gateway: neuromorphic computing for dedispersion presents a clear stat-comp tradeoff (resource vs sensitivity), but heavy domain-specific architecture limits accessibility.

21. 2606.15705 — Transfer learning for transient search with small-field optical survey telescopes¶

Authors: Kumar Pranshu, Kuntal Misra, Rithesh A, Jean Surdej, Sarvesh Kumar Yadav
Categories: astro-ph.IM
Relevance 5/10
Scoring rationale: Astrostats gateway: transfer learning for transient search touches real data-analysis challenges (small labeled datasets), but the ML application is standard and lacks deep statistical modeling exposition.

22. 2606.14921 — Flexible Method Comparison with the Probability of Agreement¶

Authors: Nathaniel T. Stevens
Categories: stat.ME
Relevance 4/10
Scoring rationale: Method comparison via probability of agreement; peripheral to primary interests.

23. 2606.15478 — A Bayesian Functional Accelerated Failure-Time Model with Varying Effects Correcting for Measurement Error¶

Authors: Joseph Yang, Roger Zoh, Carmen Tekwe, Lan Xue
Categories: stat.ME
Relevance 4/10
Scoring rationale: Bayesian functional AFT model with measurement error touches semiparametric theory, but the focus is applied biostatistics rather than the researcher's primary interests in efficiency or high-dimensional theory.

24. 2606.16058 — Jeffreys-Type Penalized GEE for Correlated Binary Data with an Odds-Ratio Parameterization¶

Authors: Anestis Touloumis
Categories: stat.ME
Relevance 4/10
Scoring rationale: Penalized GEE for correlated binary data; touches M-estimation and small-sample inference, but narrowly focused.

25. 2606.16488 — An Energy-Driven Framework for Privacy-Aware Synthetic Data Generation¶

Authors: Pierpaolo Massoli, Fabio Spagnuolo
Categories: stat.ME
Relevance 4/10
Scoring rationale: Energy-driven synthetic data generation uses Bayesian networks and MH sampling, touching statistical computing, but privacy/synthetic data is not a core interest.

26. 2606.16689 — Statistical methods for assessing non-replicable, outlying, and influential studies¶

Authors: Yefeng Yang, Shinichi Nakagawa
Categories: stat.ME
Relevance 4/10
Scoring rationale: Meta-analysis methods for assessing non-replicable and outlying studies are a hypothesis-testing application, but the framework is far from the researcher's mathematical statistics focus.

27. 2606.16708 — Risks and Uncertainty in Monetary Policy¶

Authors: Tobias Adrian, Domenico Giannone, Matteo Luciani, Mike West
Categories: econ.EM
Relevance 4/10
Scoring rationale: Weak secondary econ match: unifies scenario analysis and distributional forecasting via conditional predictive densities, but lacks causal inference or semiparametric methodology.

28. 2606.15012 — A Kuramoto-von Mises Time Series Model for Probabilistic Modeling of Coupled Oscillators¶

Authors: Yun Hwang, Todd P. Coleman
Categories: stat.ME · q-bio.QM
Relevance 3/10
Scoring rationale: Oscillator time series model for neuroscience/GI; unrelated to researcher's focus.

29. 2606.15360 — Generating-Element Maximum Entropy for Non-Gaussian Uncertainty Evaluation¶

Authors: Serhii Zabolotnii
Categories: stat.ME · physics.data-an · stat.CO
Relevance 3/10
Scoring rationale: Maximum entropy density reconstruction is an inverse problem, but the focus on measurement uncertainty evaluation and fractional-power elements is tangential to the researcher's core interests.

30. 2606.15953 — Drift-Aware Spectral Conformal Prediction for Non-Exchangeable Streaming Data¶

Authors: Jeffery Opoku, David Banahene
Categories: stat.ME
Relevance 3/10
Scoring rationale: Conformal prediction for streaming data; methodological overlap with nonparametrics but not a core interest.

31. 2606.16455 — hyreg2: An R package to Estimate Latent Classes on a Mixture of Continuous and Dichotomous Data¶

Authors: Svenja Elkenkamp, John Grosser, Kim Rand
Categories: stat.ME · stat.AP
Relevance 3/10
Scoring rationale: R package for latent class models on mixed data is applied software, but the latent class methodology is unrelated to the researcher's primary theoretical interests.

32. 2606.15354 — GridFire: A new, open source, nuclear network for stellar structure and evolution¶

Authors: Emily M. Boudreaux, Aaron Dotter
Categories: astro-ph.IM · astro-ph.SR
Relevance 3/10
Scoring rationale: Nuclear network numerics for stellar evolution; pure astrophysics software with no broader statistical data-analysis question accessible to an outsider.

33. 2606.15445 — Interim Monitoring as an Information-Time Alignment Problem: The WCR Framework for Time-to-Event Trials¶

Authors: Haitao Pan, Zhongheng Cai
Categories: stat.ME
Relevance 2/10
Scoring rationale: Clinical trial interim monitoring framework is unrelated to the researcher's primary interests in causal inference, high-dim theory, or semiparametric efficiency.

34. 2606.15962 — p-PSO: A Penalized Particle Swarm Optimization Technique for Finding D-Optimal Designs with Mixed Factors in Generalized Linear Models¶

Authors: Shrabanti Chowdhury, Abhyuday Mandal
Categories: stat.ME · cs.LG
Relevance 2/10
Scoring rationale: Metaheuristic optimization for experimental design; tangential to primary interests.

35. 2606.15839 — Shigatse Astronomical Site Testing. I. Cloud-cover Climatology and Selected Local Meteorological Conditions¶

Authors: Baiyu Zhang, Hejun Yang, Xiaojun Dong, Lingling Wang, Juean Luobu, Minfeng Gu, Xiyan Peng, Hao Luo, Yindun Mao, ZhaoXiang Qi, Basangzeren, Qihang He, Guojie Feng, Chunhai Bai, Ali Esamdin, Wenbo Gu, Siqi Wang, Zihuang Cao
Categories: astro-ph.IM
Relevance 2/10
Scoring rationale: Pure meteorological/climatological site testing for an observatory; no substantive statistical methodology or data-analysis question for a general statistician.

36. 2606.15876 — Archetypal analysis of European 10-year government bond yields with multidimensional scaling of two-mode three-way asymmetric dissimilarities¶

Authors: Aleix Alcacer, Rafael Benitez, Vicente J. Bolos, Irene Epifanio
Categories: stat.ME · stat.AP
Relevance 1/10
Scoring rationale: Archetypal analysis of bond yields via asymmetric dissimilarities; unrelated to primary/secondary interests.

37. 2606.15881 — Biarchetype analysis for univariate functional data. An application to macroeconomic financial time series¶

Authors: Aleix Alcacer, Rafael Benitez, Vicente J. Bolos, Irene Epifanio
Categories: stat.ME · cs.LG · stat.AP
Relevance 1/10
Scoring rationale: Biarchetype analysis for functional time series; unrelated to primary/secondary interests.

Maintained by 陈星宇
