
A Survey of Agent Evaluation Frameworks: Benchmarking the Benchmarks
In recent months, we've witnessed an explosion in the development of AI agents: autonomous systems powered by large language models (LLMs) that perform complex tasks through reasoning, planning, and tool use. However, as the field rapidly advances, a critical question emerges: how do we effectively measure and compare