Submissions from arxiv.org

		Towards Autonomous Mathematics Research (arxiv.org)
		20 points by gmays 51 minutes ago \| past \| 2 comments
		Retrieval-Aware Distillation for Transformer-SSM Hybrids (arxiv.org)
		1 point by readitalready 5 hours ago \| past \| discuss
		Biases in the Blind Spot: Detecting What LLMs Fail to Mention (arxiv.org)
		2 points by mpweiher 1 day ago \| past \| discuss
		A Framework for Time-Updating Probabilistic Forecasts (arxiv.org)
		6 points by Luc 1 day ago \| past \| discuss
		Towards Autonomous Mathematics Research (Google DeepMind) (arxiv.org)
		1 point by u1hcw9nx 1 day ago \| past \| discuss
		Remote Labor Index: Measuring AI Automation of Remote Work (arxiv.org)
		2 points by Leynos 1 day ago \| past \| discuss
		Generalized on-policy distillation with reward extrapolation (arxiv.org)
		3 points by fzliu 1 day ago \| past \| discuss
		OpenAI model proposes and proves Physics result (arxiv.org)
		1 point by KothuRoti 2 days ago \| past \| discuss
		An API for Biological Neural Networks (arxiv.org)
		1 point by bwjx 2 days ago \| past \| discuss
		Adversarial Patch: images that make classifiers ignore other items in a scene (arxiv.org)
		1 point by felineflock 2 days ago \| past \| discuss
		Maximum Agreement Linear Predictor (MALP) (arxiv.org)
		1 point by tesserato 2 days ago \| past \| 1 comment
		Standardized and In-Depth Benchmarking of Post-Moore Dataflow AI Accelerators (arxiv.org)
		1 point by PaulHoule 2 days ago \| past \| discuss
		Fine-Tuning GPT-5 for GPU Kernel Generation (arxiv.org)
		4 points by matt_d 2 days ago \| past \| discuss
		SWE-ContextBench: context learning benchmark in coding (arxiv.org)
		1 point by mustaphah 2 days ago \| past \| discuss
		LLMs exceed physicians on complex text-based differential diagnosis (arxiv.org)
		3 points by rippeltippel 2 days ago \| past \| 2 comments
		Horus: A Protocol For Trustless Verification Under Uncertainty (arxiv.org)
		1 point by optimalsolver 2 days ago \| past \| discuss
		Learning to Reason in 13 Parameters (arxiv.org)
		2 points by stared 2 days ago \| past \| discuss
		LLM Reasoning Failures (arxiv.org)
		1 point by gradus_ad 2 days ago \| past \| discuss
		Defining causal mechanism in dual process theory and 2 types of feedback control (arxiv.org)
		1 point by s6i 2 days ago \| past \| discuss
		Routing LLM queries using internal success predictions (70% cost reduction) (arxiv.org)
		1 point by stansApprentice 2 days ago \| past \| 2 comments
		SWE-AGI: benchmarking spec-driven software construction (arxiv.org)
		1 point by mustaphah 2 days ago \| past \| 1 comment
		Authenticated Workflows: A Systems Approach to Deterministic Agentic Controls (arxiv.org)
		3 points by mrajagopalan 3 days ago \| past \| 1 comment
		Formalization and Inevitability of the Pareto Principle (arxiv.org)
		3 points by bikenaga 3 days ago \| past \| 1 comment
		RL on GPT-5 to write better kernels (arxiv.org)
		4 points by atallahw 3 days ago \| past \| 1 comment
		Quantum observers can communicate across multiverse branches (arxiv.org)
		2 points by lisper 3 days ago \| past \| discuss
		Pushing Tensor Accelerators Beyond MatMul in a User-Schedulable Language (arxiv.org)
		1 point by matt_d 3 days ago \| past \| discuss
		HySparse: A Hybrid Sparse Attention Architecture (arxiv.org)
		5 points by readitalready 3 days ago \| past \| discuss
		Biases in the Blind Spot: Detecting What LLMs Fail to Mention (arxiv.org)
		1 point by jari_mustonen 3 days ago \| past \| discuss
		Evaluation of RAG Architectures for Policy Document Question Answering (arxiv.org)
		1 point by PaulHoule 3 days ago \| past \| discuss
		SoftMatcha 2: A Fast and Soft Pattern Matcher for Trillion-Scale Corpora (arxiv.org)
		3 points by salkahfi 3 days ago \| past \| discuss