What does it take for LLM systems to participate meaningfully in the scientific process as authors, reviewers, mentors, and collaborative tools for researchers? In this talk, Prof. Dhruv Kumar and Dhruv Trehan will draw on recent experiments from Lossfunk and collaborators and share lessons from building and evaluating agentic LLM systems for idea generation, autonomous experimentation, manuscript creation, research review, and mentoring young researchers. The discussion will cover where today’s LLM-based systems still fall short, including failure modes such as a lack of research taste and scientific skepticism; where these systems are already useful; and what these experiments suggest about the future of autonomous and AI-assisted science.
Check out the paper here! https://www.alphaxiv.org/abs/2601.03315