2026-05-26 Daily arXiv Digest

High-relevance papers: 12 · Medium-relevance: 13 · Other: 18 · Conference/seminar events: 0

⭐ High-relevance papers (grouped by topic)

Causal inference (causal_inference, 3 papers)

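1. 2605.25811 — Geometry Adaptive Counterfactual Distribution Learning with Diffusion-Guided Smoothing

Authors: Kwangho Kim
Categories: stat.ME · cs.LG · stat.ML
Relevance 9/10 · novelty: new_theory
Summary: This paper studies high-dimensional counterfactual distribution learning in a setting where the counterfactual distribution may concentrate on low-dimensional manifold structure, so that conventional isotropic smoothing suffers degraded convergence rates and unstable local inference at the ambient dimension. The author proposes two diffusion-guided estimators built on semiparametric debiasing, diffusion-guided density smoothing and diffusion-guided score smoothing, which combine causal confounding adjustment with geometry-adaptive localization driven by diffusion scores, removing first-order confounding bias and aligning the smoothing directions with the local geometry of the outcome distribution. On the theory side, the paper establishes asymptotic expansions, risk bounds, and inference procedures for the smoothed density and score targets; under structured geometric conditions the leading stochastic error is governed by the effective dimension induced by the diffusion-guided kernel rather than the ambient dimension, and under additional approximation conditions inference on the ambient density is obtained. Semi-synthetic experiments (CelebA) confirm the steeper error decay of the geometry-adaptive methods. Useful to you: the idea of combining semiparametric debiasing with manifold geometry transfers directly to discussions of efficiency bounds in high-dimensional causal inference.
Key techniques: semiparametric debiasing, diffusion score matching, geometry-adaptive smoothing, effective dimension reduction, counterfactual density estimation, nuisance adjustment
Why this is useful to you: This paper connects to high-dimensional counterfactual distribution estimation and semiparametric efficiency theory in causal inference; the core idea is to replace the ambient dimension with an effective dimension to obtain better convergence rates. You can use your very_familiar high-dimensional asymptotic theory and minimax bound tools to check whether the claimed effective-dimension rate is tight, or use your moderately_familiar semiparametric theory to examine whether the influence function in its first-order debiasing construction is complete. Follow-up assessment: feasible in the medium term. You would first need to build strength in moderately_familiar semiparametric theory (in particular efficiency bounds under non-standard smoothness parameters) in order to rigorously derive a semiparametric efficiency lower bound for the effective dimension induced by the diffusion kernel.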

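2. 2605.25687 — Confidence intervals for causal effects in sequential decision making

Authors: Vladimir Vovk, Ruodu Wang
Categories: math.ST · stat.TH
Relevance 9/10 · novelty: new_theory
Summary: Within the DAG causal framework, this paper studies the construction of confidence intervals and confidence sequences for causal effects (such as the ATE) when the back-door or front-door criterion applies, extending the setting from standard IID training data to sequential decision-making scenarios in which interventions depend on historical data. The core method is based on the e-value / testing-by-betting framework rather than the traditional Neyman-Pearson inverse-probability approach, recasting causal effect estimation as a sequential probability testing problem; in the IID setting it yields tight intervals that match the classical semiparametric efficient interval, whereas when interventions depend on past data the interval width is inflated by a law of the iterated logarithm (LIL) term. When the number of observations is unknown (the purely sequential setting), the confidence sequence contains further iterated-logarithm terms and the width grows again, which makes explicit the price that adaptivity and sequentiality exact on inferential precision. The main theoretical results give the exact asymptotic order of the interval width in all three settings. Useful to you: it provides a rigorous LIL characterization of the efficiency loss caused by adaptivity in longitudinal / sequential causal inference, and connects directly to your dual interests in causal inference and hypothesis testing.
Key techniques: e-value, confidence sequences, law of the iterated logarithm, back-door and front-door criteria, testing-by-betting, sequential decision making
Why this is useful to you: This paper directly connects the longitudinal/sequential setting of causal inference with hypothesis testing in mathematical statistics, using the LIL to rigorously quantify the cost of adaptivity in the width of confidence intervals for causal effects; this is the intersection of two of your primary interests. The very_familiar minimax bounds and estimation theory in causal inference in your arsenal can be applied immediately to check whether its IID-setting interval attains the semiparametric efficiency bound, and under which intervention-dependence mechanisms the LIL term cannot be reduced. Follow-up assessment: can be done immediately. Use the semiparametric efficiency bound to check the tightness of its IID interval, and try combining the e-value framework with higher-order influence functions to improve the width in nonparametric settings.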

Nonparametric / semiparametric (nonparam_semipara, 1 paper)

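1. 2605.25519 — Identification and Estimation of Semiparametric Multilayered Sample Selection Models

Authors: Dongwoo Kim
Categories: econ.EM
Relevance 9/10 · novelty: new_theory
Summary: In the framework of multilayered sample selection models (a participation decision first, followed by ordered/multinomial sorting), the goal is to identify and estimate average causal effects / regression parameters under selection bias, with nonparametric control functions and continuous covariate variation as the key assumptions. Unlike binary selection, which needs only a single-index control function, ordered and multinomial sorting layers generate multi-index control functions whose dimension determines how much continuous covariate variation identification requires; the paper establishes matching non-identification and point-identification results and shows that nonlinearity in the selection structure can substitute for excluded variables. Imposing further structural restrictions reduces the dimension of the control function and makes estimation feasible; a two-step sieve plug-in estimator is proposed and shown to be root-n consistent and asymptotically normal. An empirical application to the gender wage gap among Korean university graduates finds that, after correcting for sorting, the female coefficient for large-firm employment turns positive. Useful to you: the analysis of the dimension of multi-index control functions and of the identification conditions connects semiparametric theory directly with identification theory in causal inference.
Key techniques: multilayered sample selection model, multi-index control function, non-identification / point-identification geometry, sieve plug-in estimation, root-n-consistent two-step estimator, excluded variable substitution via nonlinearity
Why this is useful to you: Directly connects identification theory in causal inference with semiparametric theory: the dimensional analysis of multi-index control functions in multilayered selection models offers a new geometric perspective on nonparametric identification. The moderately_familiar semiparametric theory and identification theory in causal inference in your technical_arsenal can be used to attack the asymptotic properties and efficiency-bound questions of its sieve plug-in estimator; your current arsenal is sufficient to get started. Follow-up assessment: can be done immediately. Use your very_familiar nonparametric statistics and moderately_familiar M-estimation theory to verify the convergence rate and influence-function derivation of its sieve estimator, and explore whether a one-step / DR estimator can be constructed to attain the semiparametric efficiency bound.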

Mathematical statistics / hypothesis testing (hypothesis_testing, 6 papers)

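1. 2605.25380 — Rank-Based Tests for Mutual Independence of High-Dimensional Random Vectors via \(L_q\) Norm

Authors: Ping Zhao, Hongfei Wang, Long Feng
Categories: stat.ME
Relevance 9/10 · novelty: new_method
Summary: This paper studies testing mutual independence among the components of high-dimensional random vectors. Building on the rank-based max-sum framework, it proposes L_q power-sum test statistics with finite q for three classes of dependence measures: simple linear rank statistics, non-degenerate rank U-statistics, and degenerate rank U-statistics. The proposed statistics interpolate between the sensitivity of L_2 statistics to dense alternatives and that of L_∞ statistics to sparse alternatives. The theory proves asymptotic independence between any fixed finite-q L_q statistic and the corresponding L_∞ statistic, and the Cauchy combination rule is used to merge the p-values from L_2, L_4, L_6, and L_∞. Simulations show that the combined L_{2,4,6,∞} procedure is highly robust to the sparsity level of the alternative. Useful to you: the paper brings higher-order U-statistics (degenerate and non-degenerate) into high-dimensional rank tests; its power-sum structure and asymptotic-independence proofs can directly inform your research on higher-order U-statistic projections and high-dimensional extreme-value asymptotics.
Key techniques: rank-based U-statistics, L_q power-sum statistic, max-sum testing framework, Cauchy combination rule, asymptotic independence, high-dimensional mutual independence test
Why this is useful to you: This paper directly connects two primary interests, hypothesis testing and higher-order U-statistics, in particular the power-sum and extreme-value asymptotic analysis of degenerate and non-degenerate rank U-statistics in high-dimensional testing. You can examine the computational complexity of its power-sum statistics from your very_familiar higher-order U-statistics computation (treewidth / einsum) perspective, and use your moderately_familiar theory of higher-order U-statistics to improve its projection and asymptotic-independence proofs. Follow-up assessment: can be done immediately; your existing U-statistic theory is enough to pursue tighter bounds or a computational optimization analysis of the power-sum structure.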
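
The p-value fusion step mentioned in this entry can be illustrated with a minimal sketch of the standard Cauchy combination rule. The equal weights, the function name `cauchy_combine`, and the four example p-values are assumptions made for illustration; they are not taken from the paper.

```python
import math

def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule (equal weights by default)."""
    k = len(p_values)
    if weights is None:
        weights = [1.0 / k] * k
    # Map each p-value to a standard Cauchy quantile; a small p gives a large positive term.
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values))
    # Upper-tail probability of the standard Cauchy distribution at t.
    return 0.5 - math.atan(t) / math.pi

# Hypothetical p-values for the L_2, L_4, L_6 and L_inf statistics.
p_l2, p_l4, p_l6, p_linf = 0.20, 0.08, 0.03, 0.001
print(cauchy_combine([p_l2, p_l4, p_l6, p_linf]))
```

Because the tangent transform is heavy-tailed, the combined p-value is driven mostly by the smallest individual p-value, which is what lets a combination of this kind track whichever of the dense-sensitive or sparse-sensitive statistics is more significant.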

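2. 2605.24838 — Adaptable High-Dimensional Change Point Detection via Ridge Regularization

Authors: Haoran Li, Haotian Xu
Categories: stat.ME
Relevance 8/10 · novelty: new_method
Summary: In the setting of multiple change-point detection for the mean vectors of a sequence of independent high-dimensional observations, the goal is to identify change points with good power against dense alternatives when the dimension is comparable to the sample size. The paper proposes a family of ridge-regularized CUSUM statistics that inherits the adaptable ridge Hotelling's T2 test of Li et al. (2020); ridge regularization stabilizes the normalization by the inverse sample covariance and achieves adaptivity to the population covariance structure. Under mild conditions, the limiting distributions of the statistics are derived under the null and under a class of local alternatives; a principled framework for choosing the regularization parameter is then built by maximizing asymptotic power. Simulations and an application to a panel of S&P 500 daily returns demonstrate the method's advantages. Useful to you: the paper combines ridge regularization with RMT-style stabilization of high-dimensional covariance inverses, directly touching the intersection of high-dimensional hypothesis testing and RMT.
Key techniques: ridge-regularized CUSUM statistic, high-dimensional Hotelling's T2 test, sample covariance normalization stabilization, limiting distribution under local alternatives, asymptotic power optimization for regularization parameter, multiple change point detection
Why this is useful to you: This paper directly connects two primary-interest subareas, high-dimensional hypothesis testing and RMT; its core is using ridge regularization to address the instability of the high-dimensional covariance inverse, exactly the pain point that RMT Marchenko-Pastur tools usually handle. With your very_familiar high-dimensional asymptotic tools you can check the tightness of its limiting-distribution derivation and power-maximization framework, and you could even try using minimax bounds to characterize the optimal testing rate of this ridge-CUSUM under dense/sparse alternatives. Can be done immediately: your very_familiar high-dimensional asymptotics and minimax tools are enough to start analyzing whether its asymptotic power bound is minimax optimal.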

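3. 2605.24741 — On the Sample Complexity of Robust Binary Hypothesis Testing

Authors: Shankar Vallinayagam, Ankit Pensia, Varun Jog
Categories: math.ST · cs.IT · cs.LG · math.IT · stat.ML · stat.TH
Relevance 8/10 · novelty: new_theory
Summary: This paper studies the sample complexity of robust binary hypothesis testing under three standard contamination models (ε-additive/Huber, ε-subtractive, and ε-total variation/TV), denoted n_Hub(ε), n_Sub(ε), and n_TV(ε) respectively. For the subtractive contamination model, the authors prove the existence of least favourable distributions and give explicit formulas, bringing it in line with the classical theory for the Huber and TV models. The central finding is that in all three models the sample complexity is highly unstable in the contamination parameter ε: even an o(ε) perturbation of ε can cause polynomial growth in the sample complexity, and likewise there is a polynomial gap between knowing ε exactly and knowing it only up to an o(ε) error. Despite this instability, the authors show that the sample complexities of the three models are comparable up to constant-factor rescaling of ε, and they give tight bounds on the scaling constants (e.g. n_Hub(ε) ≲ n_TV(ε) ≲ n_Hub(2ε)). Finally, the results are extended to adaptive-contamination versions. Possibly useful to you: the paper reveals a polynomial amplification of sample complexity caused by small uncertainty in the contamination parameter in robust hypothesis testing, which relates directly to your research on minimax bounds and testing theory in mathematical statistics and hypothesis testing.
Key techniques: robust binary hypothesis testing, least favourable distributions, Huber contamination model, total variation contamination, subtractive contamination model, sample complexity instability
Why this is useful to you: This paper directly advances the theory in the mathematical statistics (hypothesis testing) subarea of your primary interests, systematically characterizing minimax bounds on sample complexity under three classical contamination models and their polynomial instability in the contamination parameter. Your experience with minimax bounds for estimation problems transfers directly to examining whether the paper's sample-complexity scaling bounds are tight, and whether the instability phenomenon also holds in more general testing settings. Can be done immediately: your very_familiar minimax theory tools are enough to start verifying the scaling-constant bounds or extending them to other distance measures or multi-hypothesis testing scenarios.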

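4. 2605.25859 — Minimax Limits of k-Fold Cross-Validation via Majority

Authors: Ido Nachum, Rüdiger Urbanke, Thomas Weinberger
Categories: math.ST · cs.LG · stat.TH
Relevance 8/10 · novelty: new_theory
Summary: In the setting of the majority algorithm for binary classification (the smallest non-trivial ERM), the paper studies how the mean-squared error (MSE) of k-fold CV as a risk estimator varies with the number of folds k; the key ingredient is the complex dependence structure among the fold-wise estimates. The central finding is that the bounds given by existing theory are already loose or even vacuous for this simple algorithm, and a refined analysis reveals subtle phenomena in the behaviour of CV. The paper then sets up a minimax framework and proves that when the number of folds k grows with the sample size n, no ERM algorithm can attain a minimax MSE of O(1/n); the lower bound is Ω(√k/n). The result exposes a fundamental limitation of data-reuse strategies and corrects inaccuracies in earlier theory. Useful to you: the paper's minimax lower-bound derivation and refined dependence-structure analysis connect directly to your interests in minimax bounds and hypothesis testing/estimation theory.
Key techniques: minimax lower bound, k-fold cross-validation, empirical risk minimization, majority algorithm, fold-wise dependence structure, mean-squared error of risk estimator
Why this is useful to you: This paper connects directly to the specific subarea of minimax bounds for estimation problems, giving an Ω(√k/n) minimax lower bound for CV risk estimation and filling a theoretical gap on data-reuse strategies. Your very_familiar minimax bounds tools are enough to check directly the tightness of its lower-bound construction and its proof techniques; this is a follow-up that can be done immediately: try extending the minimax framework to more general loss functions or to semiparametric estimation settings.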
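
To make the object of study in this entry concrete, the following minimal sketch computes the k-fold CV estimate of the 0-1 risk for the majority algorithm. The contiguous fold layout, the tie-breaking rule, and the function names are illustrative assumptions; the summary does not specify the paper's exact conventions.

```python
def majority_predict(train_labels):
    """Majority algorithm: predict the most frequent binary label (ties go to 1)."""
    ones = sum(train_labels)
    return 1 if 2 * ones >= len(train_labels) else 0

def kfold_cv_risk(labels, k):
    """k-fold CV estimate of the 0-1 risk of the majority algorithm."""
    n = len(labels)
    if n % k != 0:
        raise ValueError("this sketch assumes k divides n")
    fold_size = n // k
    fold_errors = []
    for j in range(k):
        start, stop = j * fold_size, (j + 1) * fold_size
        held_out = labels[start:stop]
        train = labels[:start] + labels[stop:]
        prediction = majority_predict(train)
        # Fraction of held-out labels that the constant prediction gets wrong.
        fold_errors.append(sum(y != prediction for y in held_out) / fold_size)
    # The CV estimate averages fold errors whose training sets overlap heavily.
    return sum(fold_errors) / k

labels = [1, 1, 0, 1, 1, 0]
print(kfold_cv_risk(labels, k=3))
```

Each pair of training sets shares most of its points, so the fold errors are dependent; that dependence is the structure whose effect on the MSE the paper analyses.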

Other (other, 2 papers)

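1. 2605.24167 — Modified treatment policies that depend on the natural history of treatment

Authors: Iván Díaz, Nicholas T. Williams, Paweł Morzywołek, Kara E. Rudolph
Categories: stat.ME
Relevance 9/10 · novelty: new_method
Summary: Within the longitudinal modified treatment policy (LMTP) framework, the target estimand is the effect of interventions that depend on the natural history of treatment (rather than only its current natural value), such as the effect of delaying treatment initiation; the key assumptions are longitudinal sequential ignorability and consistency. The paper proposes a longitudinal g-computation formula in the form of sequential regressions on augmented data and constructs a targeted minimum loss-based estimator (TMLE). The estimator is based on the efficient influence function and, when the outcome and treatment regressions satisfy the standard doubly robust convergence-rate conditions, is √n-consistent and asymptotically normal and attains the semiparametric efficiency bound. An empirical application evaluates the effect of delaying risky pain treatment by one month on the 12-month incidence of opioid use disorder. Useful to you: it directly extends the identification and estimation theory of LMTPs in longitudinal causal inference, bringing interventions that depend on the natural history into the debiased ML / TMLE framework.
Key techniques: longitudinal modified treatment policies, efficient influence function, targeted minimum loss-based estimation, sequential regression g-computation, doubly robust estimation, augmented-data longitudinal estimator
Why this is useful to you: Directly connects to the identification and estimation subarea for LMTPs in longitudinal causal inference, bringing interventions that depend on the natural history of treatment (e.g. grace periods / delayed initiation) into the TMLE / efficient influence function framework. The estimation theory in causal inference and semiparametric theory (moderately_familiar) in your arsenal can go straight into the details of its efficient-influence-function derivation and double-robustness proof. Can be done immediately: use your very_familiar causal estimation theory to check whether its augmented sequential regression construction remains √n-CAN under weaker conditions, or explore possible improvements from higher-order influence functions in this LMTP setting.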

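2. 2605.24346 — Using the target trial framework for combining information: external comparator analyses and other applications

Authors: Lawson Ung, Miguel A. Hernán, Issa J. Dahabreh
Categories: stat.ME
Relevance 8/10
Summary: This paper discusses how to use the target trial framework to plan and report causal inference problems that combine multiple data sources; the core estimands are causal contrasts (such as the ATE) across different populations or different interventions. The settings covered include external comparator analyses and the extrapolation and transportability analysis of randomized trial results; the key assumptions concern how well each data source's measurements align with the components of the target trial (eligibility criteria, treatment assignment, and so on) and the identification of irreconcilable mismatches. Methodologically, the paper adds a "target population and its sampling model" component to the target trial framework to guide the definition of eligibility, the specification of the causal model, and the choice of identification strategy, and it emphasizes data-element mapping as a way to expose mismatches among data sources. The paper is a conceptual-framework and identification-level synthesis that does not address specific estimators or efficiency-bound theory; it is directly relevant as a reference for your work on identification theory in causal inference and on multi-source data applications in epidemiology and economics.
Key techniques: target trial framework, external comparator analysis, transportability, generalizability, identification via data mapping, sampling model specification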

📌 Medium-relevance papers (grouped by topic)

Causal inference (causal_inference, 1 paper)

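1. 2605.25483 — Partial Identification of Causal Effects that Vary by Setting

Authors: Nick Huntington-Klein
Categories: econ.EM
Relevance 7/10 · novelty: new_method
Summary: In a multi-setting quasi-experimental causal inference framework, the target estimand is the same causal effect across different data sources; the paper studies its partial identification when some settings lack exogenous variation and can rely only on a conditional independence assumption (selection-on-observables). The core mechanism exploits the unobserved association of omitted variable bias across settings, using cross-setting constraints to tighten the jointly identified set. The method does not require specifying a parametric distribution for the bias; instead it models the bias as a correlation structure across settings, which yields tighter bounds than a single setting when the conditional independence assumption fails locally. The theoretical results give an explicit characterization of the cross-setting jointly identified set and its convergence properties. Possibly useful to you: the framework connects directly to the partial identification and sensitivity analysis subareas of causal inference and offers a new perspective on robustness analysis when combining multiple data sources.
Key techniques: partial identification, omitted variable bias, selection-on-observables, multi-setting causal inference, jointly identified set
Why this is useful to you: This paper connects directly to the partial identification and sensitivity analysis subareas of causal inference, addressing how to tighten bounds across sources when the conditional independence assumption fails locally in a multi-setting design. Your very_familiar identification theory in causal inference can attack the core opening of this paper directly: analyze the identification power of its assumption on cross-setting bias association, and assess how much its bounds sharpen relative to single-setting sensitivity analysis (e.g. Rosenbaum bounds). Can be done immediately: your familiar identification theory and sensitivity analysis tools are enough to reproduce and extend its characterization of the identified set.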

Nonparametric / semiparametric (nonparam_semipara, 7 papers)

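1. 2605.24858 — Optimal Estimation of Discrete Multiview Distributions under Heteroskedastic Multinomial Sampling

Authors: Runshi Tang, Julien Chhor, Olga Klopp, Alexandre B. Tsybakov, Anru R. Zhang
Categories: stat.ME
Relevance 7/10 · novelty: sharper_rate
Summary: This paper studies estimation of a discrete joint distribution (a nonnegative low-rank tensor) under a multiview latent variable model with multinomial sampling; the core challenge is that the heteroskedasticity and negative dependence of multinomial noise make the difficulty of estimation depend heavily on how probability mass is distributed over the sample space. The authors propose a spectral estimator based on a scaling framework, obtain upper bounds in Frobenius norm that explicitly handle heteroskedasticity and negative dependence, and prove a minimax lower bound that depends on the fiber mass, showing this dependence cannot be removed. Under ℓ1 loss, they propose oracle and data-driven feasible estimators based on the same scaling principle, and prove near-optimality of the oracle rule at fixed rank and near-optimality of slice normalization when the slice-to-fiber imbalance is bounded. Possibly useful to you: the paper combines low-rank tensor estimation with minimax theory under heteroskedastic discrete noise; its scaling framework and tensor spectral methods may offer a new structured tensor model as a reference for your treewidth/einsum analysis of the computational complexity of higher-order U-statistics.
Key techniques: low-rank tensor estimation, heteroskedastic multinomial sampling, spectral estimator, minimax lower bounds, slice normalization, fiber-mass-dependent scaling
Why this is useful to you: This paper directly advances minimax estimation of low-rank tensors within nonparametric/semiparametric theory; its scaling framework and spectral methods under the heteroskedastic multinomial setting are of methodological value for high-dimensional statistics. The very_familiar minimax bounds and high-dimensional asymptotic theory in your arsenal can be used directly to examine the tightness of its lower bounds, and your very_familiar tensor contraction / einsum complexity analysis can get at the computational bottleneck of its spectral estimator. Follow-up assessment: can be done immediately. Use your familiar minimax theory to check the tightness of its ℓ1 lower bound under broader rank conditions, and use the einsum complexity framework to analyze the computational cost of its scaled spectral estimator.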

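2. 2605.24854 — Deep Regression for Repeated Measurements under Covariate Shift

Authors: Yingxuan Wang, Xiangyu Xing, Wangli Xu
Categories: stat.ME
Relevance 6/10 · novelty: sharper_rate
Summary: In a transfer learning framework under covariate shift, the paper studies nonparametric regression estimation when target-domain responses are unobserved, using the observed source-domain responses and correcting the distribution shift through the density ratio. It considers both known and unknown density ratios, modelled separately under a bounded density ratio or under finite-moment conditions only; in the unknown case both the density ratio and the target regression function are estimated with ReLU FNNs, while in the known case only the regression function is estimated. Non-asymptotic error bounds are established, showing that the proposed estimators attain the minimax optimal convergence rate in the repeated-measurements setting. The core innovation is a new neural network approximation theory in which the constants in the network parameters depend polynomially rather than exponentially on the dimension, which mitigates the curse of dimensionality and yields tighter stochastic error bounds. Useful to you: the paper's sharper non-asymptotic bounds and polynomial-dimension-dependence approximation theory relate directly to the nonparametric theory and efficiency theory subareas.
Key techniques: covariate shift correction, density ratio estimation, ReLU feedforward neural network approximation, minimax optimal convergence rate, non-asymptotic error bound, polynomial dimension dependence
Why this is useful to you: This paper relates directly to the nonparametric theory subarea; its core novelty, the polynomial rather than exponential dependence of the network approximation constants on the dimension, provides a tighter analytical tool for minimax bounds in high-dimensional nonparametric estimation. The very_familiar minimax bounds for estimation problems in your arsenal can be used directly to check whether the claimed minimax optimal rate is tight, and under what regularity the polynomial dimension dependence holds. Assessment: can be done immediately. Use minimax bound tools to examine the tightness of its rate and compare with the nonparametric convergence results you know.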

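3. 2605.25897 — Nonparametric Estimation via Expected Order Statistics

Authors: Tommaso Lando, Lorenzo Tedesco
Categories: stat.ME
Relevance 6/10 · novelty: new_method
Summary: This paper proposes a nonparametric distribution estimator based on expected order statistics, which assigns equal probability 1/m to m estimated expected order statistics, instead of 1/n to each of the n raw observations, in order to reduce the variability of the empirical distribution function. The core mechanism uses the mean structure of order statistics to smooth the sample information; its L^1 error is controlled by the L^1 error of the empirical distribution, and any L-functional corresponds to a reweighted version of the L-functional of the empirical distribution. On the theory side, for fixed m the paper proves almost sure convergence in L^p norm and Wasserstein distance, and weak convergence of the empirical quantile process in L^p(0,1) (p∈[1,∞)); for n and m both tending to infinity it gives weak convergence results for p=1,2 and establishes bootstrap validity. Simulations show the estimator is more stable than the empirical distribution across a range of distributional settings and competitive with kernel methods. Useful to you: the estimator's L-functional reweighting perspective connects directly to the theory of L-statistics and higher-order U-statistics that you know well.
Key techniques: expected order statistics, L-functional weight update, empirical quantile process weak convergence, Wasserstein distance asymptotics, L^p norm almost sure convergence, bootstrap validity
Why this is useful to you: The paper belongs to nonparametric theory, and its L-functional reweighting mechanism connects directly to the higher-order U-statistic and L-statistic theory in your primary interests: smoothing by expected order statistics is in essence a variant of the classical L-statistic, and you can use your very_familiar U-statistic projections and minimax theory to analyze whether the estimator's convergence rate attains the minimax lower bound under more general losses. Follow-up assessment: can be done immediately. Your existing higher-order U-statistic computation and M-estimation theory are enough to work out the influence function and higher-order asymptotics of functionals of this estimator.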
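
The mechanism summarized in this entry, placing mass 1/m on m estimated expected order statistics, can be illustrated with a minimal sketch. It assumes one standard estimator of E[X_(j:m)]: the U-statistic that averages the j-th smallest value over all size-m subsamples, which puts weight C(i-1, j-1) C(n-i, m-j) / C(n, m) on the i-th sorted observation. Whether the paper uses this estimator or another one is not stated in the summary, and the function names are hypothetical.

```python
from math import comb

def expected_order_stat_support(sample, m):
    """Estimate E[X_(1:m)], ..., E[X_(m:m)] from a sample of size n >= m.

    Assumption: subsample-averaging (U-statistic) estimator; the paper's
    own estimator may differ.
    """
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    total = comb(n, m)
    support = []
    for j in range(1, m + 1):
        # Weight on x_(i): chance that x_(i) is the j-th smallest value
        # in a uniformly random size-m subsample.
        value = sum(
            comb(i - 1, j - 1) * comb(n - i, m - j) * xs[i - 1]
            for i in range(j, n - m + j + 1)
        ) / total
        support.append(value)
    return support  # the smoothed estimator puts mass 1/m on each entry

def smoothed_cdf(support, t):
    """CDF at t of the distribution with mass 1/m on each support point."""
    return sum(v <= t for v in support) / len(support)

data = [2.0, 7.0, 1.0, 4.0, 6.0]
points = expected_order_stat_support(data, m=3)
print(points, smoothed_cdf(points, 5.0))
```

Two sanity checks follow from the weights: with m = n the support is the sorted sample, so the empirical distribution is recovered, and with m = 1 the single support point is the sample mean.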

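4. 2605.24601 — Bayesian Conformal-Projective Prediction

Authors: Arkaprava Roy, Malay Ghosh
Categories: stat.ME · math.ST · stat.TH
Relevance 5/10 · novelty: new_method
Summary: Within a Bayesian prediction framework, the paper proposes conformal-projective prediction (CPP), aiming to construct robust prediction intervals with bounded influence under the ε-contamination model. The core mechanism is distributional conformality: a candidate value counts as conforming if the leave-one-out predictive distribution barely changes once the candidate is added to the data, in place of a traditional residual score. Under the condition that the swapped predictive mean is differentiable in the candidate value, a bounded-influence proposition and a local convexity lemma are established, showing that under ε-contamination CPP has a better asymptotic variance than any plug-in predictor with unbounded influence. When the posterior mean is linear in the observations (e.g. the Gaussian linear model, GP regression), the swapped predictive mean is affine in the candidate value, giving closed-form or one-dimensional optimization solutions and rank-two computational updates. Simulations and empirical studies confirm the finite-sample advantages. Useful to you: CPP's bounded-influence property and its efficiency advantage under ε-contamination connect directly to semiparametric efficiency and robust estimation theory.
Key techniques: conformal prediction, leave-one-out predictive distribution, bounded-influence function, epsilon-contamination model, Bayesian predictive distribution, rank-two matrix update
Why this is useful to you: This paper connects directly to robust estimation and efficiency theory within semiparametric / nonparametric theory (comparison of asymptotic variances under the ε-contamination model). With your very_familiar minimax bounds and influence function tools you can check whether its claimed variance advantage is minimax optimal, or extend it to high-dimensional/semiparametric settings; this is a follow-up that can be done immediately.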

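5. 2605.24118 — PCA score regression: the art of losing power

Authors: Yu Lu, Nidhi Pai, Erjia Cui, Ciprian Crainiceanu
Categories: stat.ME
Relevance 4/10 · novelty: new_method
Summary: In the setting of association analysis between functional data and scalar covariates, the paper systematically compares the power and inferential validity of traditional regression on principal component scores (RPCS) with function-on-scalar regression (FoSR). The central findings are: RPCS loses power because of misalignment between the PCs and the direction of the true effect, with the loss determined by the alignment correlation; without a multiplicity correction the α level of RPCS is inflated; and existing RPCS methods cannot provide valid inference on the true effect. FoSR, through a specific combination of modelling tools (such as the penalized spline / mixed model representation), avoids both the power loss and the multiplicity problem and delivers valid inference. Simulations and an empirical analysis of NHANES accelerometer data support the theory. Useful to you: the paper's analysis of the validity of inference in functional regression directly touches your interests in semiparametric theory and hypothesis testing.
Key techniques: principal component score regression, function-on-scalar regression, penalized spline mixed model, power loss quantification, multiplicity correction, functional data inference
Why this is useful to you: This paper connects directly to your interests in semiparametric theory and hypothesis testing, focusing on the inference failure and power loss of the two-step approach (PCA first, then regression) in functional regression. You can use your very_familiar minimax bounds tools to quantify the worst-case power loss of RPCS, or use your moderately_familiar M-estimation theory to analyze whether the asymptotic efficiency of FoSR with mixed-model penalized splines attains the semiparametric efficiency bound. Follow-up assessment: feasible in the medium term. You would first need to build strength in moderately_familiar M-estimation / semiparametric theory in order to lift the paper's ad-hoc power comparison to a minimax / efficiency-level theoretical characterization.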

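6. 2605.25478 — Transcripts and Algebraic Distances in Time Series: Stochastic Properties and Nonparametric Dependence Tests

Authors: Christian H. Weiß, José M. Amigó
Categories: stat.ME
Relevance 4/10 · novelty: new_method
Summary: Within the ordinal pattern (OP) framework for continuously distributed time series, the paper introduces transcripts, defined from the difference between adjacent OPs, together with the corresponding Cayley and Kendall edit distances, with the aim of constructing nonparametric tests of the serial independence hypothesis. After transforming the original series into a sequence of transcripts or algebraic distances, it derives their stochastic properties under different types of processes and builds new test statistics on that basis. The core theoretical contribution is the derivation of the asymptotic distributions of these statistics under the null hypothesis (serial independence), which makes rigorous nonparametric testing possible. Simulations show that the power of the new tests is often better than that of traditional OP tests, and the empirical part illustrates their interpretability. Useful to you: deriving these asymptotic distributions involves simplifying higher-order terms, and higher-order U-statistic projection techniques may be needed to characterize their fine convergence properties.
Key techniques: ordinal patterns, Cayley edit distance, Kendall edit distance, nonparametric serial independence test, asymptotic distribution under null, U-statistic projection
Why this is useful to you: This paper relates directly to nonparametric hypothesis testing and the time series independence setting. Although the authors give the asymptotic distributions, the test statistics are in essence symmetric functionals on discrete permutation structures, so you can use your very_familiar higher-order U-statistic projections and treewidth/einsum perspective to compute their higher-order Hoeffding decomposition and variance bounds exactly, obtaining finite-sample approximations finer than asymptotic normality, or computational speed-ups. Follow-up assessment: can be done immediately. Your very_familiar U-stat computation and projection tools are enough to start analyzing the higher-order structure and computational complexity of its statistics.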

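7. 2605.25852 — A Post-Processing Conformal Prediction Approach for Conditional Coverage via Pivotal Scores

Authors: Félix Laplante
Categories: stat.ME
Relevance 4/10 · novelty: new_method
Summary: Within the distribution-free conformal prediction (CP) framework, the goal is an approximate guarantee of conditional coverage; exact finite-sample conditional coverage is known to be impossible without structural assumptions, and the paper proves that it is equivalent to constructing a pivotal nonconformity score whose distribution is independent of the features X. On this basis it proposes the PIT-CP method: perform one-dimensional conditional density estimation for an arbitrary base score (rather than full conditional density estimation of the original high-dimensional Y|X), and map the score through the probability integral transform (PIT) to an approximately invariant pivotal score, while preserving the geometry and interpretability of the original score. The theory gives deterministic and high-probability upper bounds on the conditional coverage gap and establishes bounds on the volume and symmetric difference of the prediction regions; in essence the method reduces the problem to one-dimensional nonparametric/semiparametric conditional density estimation, and it invokes minimax-optimal distance bounds for conditional density estimation. Useful to you: the analysis of the PIT transform and the minimax bounds connects directly to nonparametric/semiparametric efficiency theory and minimax bounds for estimation, and the convergence rate of the one-dimensional conditional density estimate determines how tight the coverage gap is.
Key techniques: conformal prediction, conditional coverage, probability integral transform (PIT), conditional density estimation, minimax-optimal estimation bounds, pivotal nonconformity score
Why this is useful to you: The paper turns the conditional coverage problem into a question about the minimax convergence rate of nonparametric one-dimensional conditional density estimation, connecting directly to the nonparametric/semiparametric theory and minimax bounds in your primary interests. With your very_familiar minimax bounds for estimation problems you can directly examine whether the minimax bounds for conditional density estimation cited in the paper are tight, and whether the upper bound on the coverage gap is dominated by estimation error; with your moderately_familiar semiparametric theory you can further ask about the semiparametric efficiency bound for estimating the pivotal score under the PIT transform. Follow-up assessment: can be done immediately. Use minimax estimation-rate tools to check the tightness of its conditional coverage gap bound, and consider optimal one-dimensional conditional density estimation from the semiparametric efficiency perspective.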

Efficiency theory / Debiased ML (efficiency_dml, 1 paper)

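1. 2605.24587 — Synthetic Heterogeneous-Effects LASSO: A Fixed-effects Estimation Approach for High-dimensional Mixed-effects Models

Authors: Shangyuan Ye, Cong Zhang, Ying Chen, Ye Liang, Guanbo Wang
Categories: stat.ME
Relevance 6/10 · novelty: new_method
Summary: In a high-dimensional clustered-data setting, the paper studies variable selection and post-selection inference for the marginal-model LASSO, where the target estimand is the structural fixed effects. When covariates are distributed heterogeneously across clusters, the authors point out that the marginal-model LASSO uses the heterogeneous covariates as sparse proxies for latent cluster effects, so the estimation target drifts away from the true fixed effects and false-positive selections arise. To address this they propose the Synthetic Heterogeneous-Effects LASSO (SHEL), a fixed-effects penalization framework that absorbs latent heterogeneity through cluster-level synthetic approximations. The theory establishes selection consistency and related properties of SHEL in the high-dimensional setting and constructs a valid post-selection inference procedure. The empirical part validates the method through simulations and longitudinal COVID-19 bulk RNA-seq data. Useful to you: SHEL's mechanism of absorbing heterogeneity through synthetic approximation shares its roots with nuisance absorption in debiased ML / orthogonal scores, and its post-selection inference framework can be borrowed directly for sensitivity / IV settings in high-dimensional causal inference.
Key techniques: fixed-effects penalized regression, post-selection inference, cluster-level synthetic approximation, high-dimensional variable selection, LASSO false selection mechanism
Why this is useful to you: Directly connects to the high-dimensional causal inference and semiparametric efficiency subareas: SHEL's mechanism of absorbing cluster-level heterogeneity with synthetic approximations is in essence the same idea as absorbing confounding with orthogonal scores / nuisance projections in debiased ML, and you can use your very_familiar high-dimensional asymptotic theory to examine whether its selection consistency conditions are too strong. Follow-up assessment: can be done immediately. Use your very_familiar minimax bound and M-estimation theory to check the sharpness of its post-selection inference, and try transplanting the synthetic approximation idea to nuisance absorption in high-dimensional IV / proximal CI.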

Mathematical statistics / hypothesis testing (hypothesis_testing, 1 paper)

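1. 2605.25855 — High-Dimensional Change-Point Detection via Angular Kernel Statistics

Authors: Jyotishka Ray Choudhury, Yao Xie
Categories: stat.ME · math.ST · stat.ML · stat.TH
Relevance 7/10 · novelty: new_theory
Summary: This paper studies change-point detection in the high-dimension, low-sample-size (HDLSS) setting, aiming to detect shifts in the marginal distributions in the limit where the sequence length is fixed and the dimension diverges. It proposes a statistic based on a dimension-averaged angular kernel scan that aggregates one-dimensional angular discrepancies across coordinates, giving a fully nonparametric, hyperparameter-free statistic that needs no finite-moment assumptions and suits heavy-tailed or contaminated distributions. For the offline single change-point problem, it shows that the population mean decomposes exactly into a universal shape function times a scalar signal factor, characterizes the covariance structure under the null up to a long-run variance factor, and establishes an HDLSS multivariate central limit theorem under cross-coordinate mixing conditions; this yields a plug-in Gaussian calibration, asymptotic type I error control, and power and localization guarantees, with local detection scale d^{-1/2}. The offline method is then extended to fixed-window sequential monitoring, with ARL calibration and worst-case EDD bounds. Useful to you: the paper's HDLSS limit and CLT derivation connect directly to your interest in high-dimensional asymptotic theory, and its angular-kernel aggregation and covariance characterization under mixing conditions provide new tools for high-dimensional hypothesis testing.
Key techniques: angular kernel scan statistic, HDLSS multivariate CLT, dimension-averaged aggregation, long-run variance factorization, sequential monitoring ARL/EDD, moment-agnostic nonparametric test
Why this is useful to you: This paper directly connects two subareas of your primary interests, high-dimensional asymptotic theory and hypothesis testing; deriving a multivariate CLT and characterizing the covariance structure in the HDLSS limit falls squarely within what your very_familiar high-dimensional asymptotic tools can handle. You can use your very_familiar minimax bounds tools to examine whether the claimed d^{-1/2} local detection scale is tight, or use your moderately_familiar M-estimation theory to analyze the asymptotic properties of its plug-in calibration. Follow-up assessment: can be done immediately. Use the high-dimensional asymptotics and minimax framework to check the tightness of its detection rate and explore a possible sharper rate.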

Astrostatistics (astrostats, 2 papers)

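1. 2605.25840 — skysurvey: a pure python package to simulate the transient sky

Authors: M. Rigault, M. Ginolin, L. Dellazzeri, B. Popovic, J. O. Hjortlund, A. Gilles-Lordet, S. Conseil, M. Coughlin, F. Ruppin, M. Smith, A. Townsend, A. Trigui, C. Barjou-Delayre, R. Kebadian, J. Nordin
Categories: astro-ph.IM · astro-ph.CO
Relevance 6/10 · novelty: application
Summary: This paper presents skysurvey, a pure-Python astronomical simulation package for quickly generating simulated observations of transients (such as supernovae). The core design rests on three objects: Target (the intrinsic physical model and parameter distributions of the objects), Survey (the observing strategy and instrument response), and DataSet (the combination of the two, producing observed data with noise and selection bias). The companion modeldag package uses a directed acyclic graph (DAG) structure to simplify the modelling of complex parameter dependencies, for example the colour-dependent luminosity coefficient β in Type Ia supernovae. The paper demonstrates the software by reproducing the redshift and rate distributions of the ZTF SNe Ia DR2 dataset, and notes that the tool can pave the way for simulation-based inference. For you, this is a good introductory read on the data-generating mechanism of astronomical transients and the modelling of selection bias.
Key techniques: simulation-based inference, selection effects modeling, directed acyclic graph (DAG), population modeling, survey simulator
Why this is useful to you: (1) The paper is an excellent introduction to astronomical data analysis, clearly showing how astronomers handle observing strategy, instrument response, and selection bias, which are the core barriers for statisticians entering astrostats; (2) your arsenal (software development, inverse problems with random noise) is fully sufficient for you to use the software and dig into its underlying logic and data-generating process; (3) it is worth reading the full paper and documentation, especially the simulation-based inference direction it mentions, which has potential points of contact with the likelihood-free / debiased ML and related semiparametric efficiency theory you know.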

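2. 2605.24095 — Fast and Flexible Characterisation of Astronomical Light Curves Using Multi-Time Attention

Authors: Yash Gondhalekar, Anais Möller, Paula Sánchez-Sáez
Categories: astro-ph.IM · astro-ph.GA
Relevance 5/10 · novelty: application
Summary: This paper proposes an unsupervised framework based on the Multi-Time Attention Network for fast characterization of irregularly sampled astronomical photometric time series (light curves). The model learns time-aware latent representations directly from ZTF alert data (obtained through the Fink broker) without heavy preprocessing; for sparse series the interpolation bias is about 0.01 mag and the scatter about 0.1 mag. The latent space correlates strongly with physical properties (duration, time of peak, variability, colour) and is robust to irrelevant properties such as the number of observations; it still separates SNe from AGN in AGN-dominated data and generalizes reasonably to unseen classes (LPV, TDE), but the 2-day time resolution prevents it from capturing the short-period pulsations of RR Lyrae stars. The model is very lightweight (a few hundred KB), inference takes about 0.01 s per series (CPU), and inference time does not depend on the number of observations (an advantage over GP regression). For you, this is an introductory application paper that brings deep attention mechanisms to irregular astronomical time series and illustrates the data scale and computational constraints of the Rubin LSST era.
Key techniques: Multi-Time Attention Network, irregular time series interpolation, unsupervised latent representation, ZTF alert stream / Fink broker, light curve characterisation
Why this is useful to you: (1) As astrostats gateway reading, the paper clearly shows the structure and modelling challenges of irregular astronomical time series (missing, sparse, unevenly sampled), as well as the hard constraints on inference speed in the Rubin LSST era, and is suitable for statisticians with no astronomy background. (2) The software development and minimax / inverse-problem theory in your arsenal can support an entry into this direction, for example examining the statistical properties of attention-based interpolation from an inverse-problem perspective, or analyzing the geometry of the latent space with high-dimensional asymptotic theory. (3) Worth reading in full: the exposition on the data side (the ZTF alert stream) and the model side (the attention architecture and the time-resolution limitation) is outsider-friendly, and the computational-statistical tradeoff (the GP vs. attention inference-speed comparison) speaks directly to the computationally constrained statistics in your primary interests.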

Epidemiology (epidemiology, 1 paper)

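1. 2605.26023 — Considering causality in the construction of molecular signatures of lifestyle exposures

Authors: Diana Wu, Vivian Viallon
Categories: stat.ME
Relevance 5/10 · novelty: minor
Summary: In the setting of building molecular signatures of lifestyle exposures from high-dimensional omics data in epidemiology, the goal is to identify molecular features that are causally affected by the exposure. When the exposure causally affects molecular features, running a multivariable regression directly opens non-causal paths through collider structures, so the signature ends up containing non-causal features. Using DAGs and d-separation, the paper argues that an initial univariate screening step can block this collider bias and thereby reduce the inclusion of non-causal features. Simulation studies show that the screening step lowers the inclusion rate of non-causal features, at the cost of reduced sensitivity and a weaker correlation between exposure and signature. Possibly useful to you: the paper offers a DAG perspective on the causal implications of high-dimensional screening in epidemiological applications; it has no methodological novelty but does pinpoint the specific structure of the collider bias.
Key techniques: directed acyclic graphs, d-separation, collider bias, univariate screening, high-dimensional omics regression
Why this is useful to you: (1) Connects to causal inference settings in epidemiological applications, specifically the collider bias problem in building high-dimensional omics signatures. (2) The DAG tools from identification theory in causal inference in your technical_arsenal are enough to work through its d-separation logic completely, and you could go further and use high-dimensional inference theory to characterize the sensitivity-specificity tradeoff of the screening step. (3) Can be done immediately: use your very_familiar high-dimensional asymptotic theory or minimax bounds to quantify precisely at what signal strength the univariate screening step removes the collider bias while keeping the sensitivity loss under control.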
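
The collider argument in this entry can be checked on a three-variable toy model. The sketch below is an illustration under stated assumptions, not the paper's simulation design: a hypothetical exposure E and a non-causal feature M2 are independent with unit variance, and a causal feature M1 = E + M2 + noise is a collider on the path E → M1 ← M2. Working with population covariances keeps the example deterministic.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return x0, x1

# Assumed toy model (hypothetical): E, M2 and noise are independent with
# variance 1, and M1 = E + M2 + noise, so M1 is a collider between E and M2.
var_m1 = 3.0      # Var(E) + Var(M2) + Var(noise)
var_m2 = 1.0
cov_m1_m2 = 1.0   # M2 enters M1 with coefficient 1
cov_e_m1 = 1.0    # E enters M1 with coefficient 1
cov_e_m2 = 0.0    # E and M2 are marginally independent

# Univariate screening looks at each feature's marginal association with E.
screen_m1, screen_m2 = cov_e_m1, cov_e_m2

# Multivariable regression of E on (M1, M2): solve Sigma @ beta = c.
sigma = [[var_m1, cov_m1_m2], [cov_m1_m2, var_m2]]
beta_m1, beta_m2 = solve_2x2(sigma, [cov_e_m1, cov_e_m2])

print("marginal covariances with E:", screen_m1, screen_m2)
print("multivariable coefficients:", beta_m1, beta_m2)
```

In this toy model M2 has zero marginal covariance with E, so a univariate screen would drop it, yet it receives a nonzero coefficient (-0.5) once the collider M1 is included in the regression.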

🗂 Other papers (LLM-scored only, no summary generated)

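Papers for which no Chinese summary was generated, listed from highest to lowest LLM score; only the score and a brief comment are kept, for traceability and completeness. These generally fall below the relevance threshold for display; some archived pages also include high-scoring papers that were not expanded at the time because of the daily summary cap (their scores are still clearly marked).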

1. 2605.24734 — Consistent Identification of Top-\(K\) Nodes in Noisy Networks

Authors: Hui Shen, Eric D. Kolaczyk
Categories: math.ST · stat.TH
Relevance 4/10
Scoring rationale: Top-K node identification under noise involves minimax-style recovery guarantees, but network centrality is not a primary focus.

2. 2605.25482 — Constraining the Inclination of Binary System Orbits with the Astrometric Excess Noise from Gaia DR3

Authors: Shilong Liao, Ye Ding, Shangyu Wen, Zhaoxiang Qi, Qiqi Wu
Categories: astro-ph.IM · astro-ph.SR
Relevance 4/10
Scoring rationale: Adjacent: astrostats paper with a clear statistical inverse problem (constraining inclination from noise), but narrow and lacks broad gateway accessibility.

3. 2605.24169 — Post-Processing Posterior Predictive P-values

Authors: Nils Lid Hjort, Fredrik A. Dahl, Gunnhildur Högnadóttir Steinbakk
Categories: stat.ME
Relevance 3/10
Scoring rationale: Bayesian model criticism via posterior predictive p-values is tangential to the researcher's primary mathematical and causal inference interests.

4. 2605.24818 — Spiking the training data to correct for test set contamination

Authors: Johnny Tian-Zheng Wei, Jerry Li, Ameya Godbole, Robin Jia
Categories: stat.ME · cs.CL · cs.LG
Relevance 3/10
Scoring rationale: Test set contamination correction for LLMs is an applied ML methodology paper, tangential to the researcher's core statistical and causal theory interests.

5. 2605.24848 — Distributional Conformal Prediction for Markov Processes

Authors: Dehao Dai, Kejin Wu, Dimitris N. Politis
Categories: stat.ME · math.ST · stat.TH
Relevance 3/10
Scoring rationale: Conformal prediction for Markov processes; tangential to primary interests in causal inference and mathematical statistics.

6. 2605.25496 — Estimation of Directed Acyclic Graphs by Frequentist Model Averaging

Authors: Huihang Liu, Wenhui Li, Xinyu Zhang
Categories: stat.ME
Relevance 3/10
Scoring rationale: DAG estimation via model averaging is adjacent to causal inference but focuses on network structure rather than identification/estimation theory.

7. 2605.25934 — Weighted NPMLE for the Marginal Mean of Recurrent Events with a Competing Terminal Event

Authors: Anna Bellach, Michael R. Kosorok
Categories: stat.ME · math.ST · stat.AP · stat.TH
Relevance 3/10
Scoring rationale: Semiparametric regression for recurrent events uses semiparametric tools but is applied survival analysis, weakly matching primary interests.

8. 2605.24156 — Long Memory in Intrinsically Dynamic Factor Models

Authors: Qin Wen (Stern School, New York University), Clifford M. Hurvich (Stern School, New York University)
Categories: math.ST · stat.TH
Relevance 3/10
Scoring rationale: Dynamic factor models with long memory are tangential to the researcher's primary interests in causal inference and high-dim theory.

9. 2605.25633 — Exponential mixing properties of nonlinear functional autoregressive models

Authors: Shuntarou Suzuki, Yoshikazu Terada
Categories: math.ST · stat.TH
Relevance 3/10
Scoring rationale: Nonlinear functional autoregressive mixing properties are tangential; no clear connection to causal inference or high-dim/minimax theory.

10. 2605.25766 — Measuring multivariate maximal tail dependence

Authors: Takaaki Koike, Marius Hofert, Haruki Tsunekawa
Categories: math.ST · q-fin.RM · stat.TH
Relevance 3/10
Scoring rationale: Tangential: multivariate tail dependence in copulas is outside primary interests (causal, high-dim, efficiency, U-stats) and lacks arsenal overlap.

11. 2605.24707 — Shared hidden-factor information framework for multiple behavioral tasks

Authors: Yuan Bian, Yuanjia Wang, Xingche Guo
Categories: stat.ME
Relevance 2/10
Scoring rationale: Shared latent factor modeling for behavioral tasks in MDD is an applied psychometrics paper, tangential to the researcher's theoretical interests.

12. 2605.25452 — Different Statistical Perspectives for Understanding Generalisation in Graph Neural Networks

Authors: Nil Ayday, Mahalakshmi Sabanayagam, Debarghya Ghoshdastidar
Categories: stat.ME · cs.LG · stat.ML
Relevance 2/10
Scoring rationale: Generalization bounds for GNNs; tangential ML theory, lacks connection to researcher's specific high-dim or causal arsenal.

13. 2605.25873 — Bayesian perspectives on exponential random graph models¶

Authors: Alberto Caimo, Isabella Gollini
Categories: stat.ME · stat.CO
Relevance 2/10
Scoring rationale: Bayesian ERGM review focuses on network models and doubly intractable posteriors, largely unrelated to primary interests.

14. 2605.25146 — Kernel Embedding for Operator-Valued Measures and Its Application to Quantum Tomography¶

Authors: Philipp Nikolas Mayer, Ho Yun
Categories: math.ST · quant-ph · stat.TH
Relevance 2/10
Scoring rationale: Quantum tomography with RKHS embeddings is far from the researcher's primary interests and outside their technical arsenal.

15. 2605.25359 — A Quasi Maximum Likelihood Estimation Method for Bergomi-Type Volatility Models¶

Authors: Masaaki Fukasawa, Haruki Tomita
Categories: math.ST · stat.TH
Relevance 2/10
Scoring rationale: Quasi MLE for Bergomi volatility models is financial math, tangential to the researcher's causal and high-dim statistics focus.

16. 2605.24191 — Hubble Astrometry for the Local Group and Beyond in the 2030s¶

Authors: S. Tony Sohn, Paul Bennet, Kevin Andrew McKinnon, Roeland P. van der Marel, Mattia Libralato, Eduardo Vitral, Ekta Patel, Laura L. Watkins, Andres del Pino, Andrea Bellini, Massimo Griggio, Mark A. Fardal, Nitya Kallivayalil, Jack T. Warfield, Karoline M. Gilbert, Puragra Guhathakurta, Daniel Weisz, Andrew Wetzel, Andrew B. Pace, Marcel S. Pawlowski, Joshua D. Simon, Gurtina Besla, Erik Tollerud, Xiaowei Ou, Niusha Ahvazi, Anna Bonaca
Categories: astro-ph.IM · astro-ph.GA
Relevance 2/10
Scoring rationale: Pure astrophysics results and proposal, lacking a clear data-model exposition or statistical methodology accessible to an outsider; not good gateway reading.

17. 2605.25959 — Building on SVOM : the CATCH satellite constellation for transient astronomy¶

Authors: Stéphane Schanne
Categories: astro-ph.IM
Relevance 2/10
Scoring rationale: Pure astrophysics mission proposal lacking the data/model exposition and statistical problem articulation required for gateway reading.

18. 2605.24995 — Information-Theoretic Reliability is Robust to Analytic Choice: A 24-Specification Multiverse on Public Cognitive Test-Retest Data¶

Authors: Maria Westrin
Categories: stat.ME
Relevance 1/10
Scoring rationale: Information-theoretic reliability for cognitive tests; unrelated to the primary interests in mathematical statistics and causal inference.

Maintained by 陈星宇
