SELF-REFINE — A New Milestone in the AI Era?

Isaac Kargar
6 min read · Apr 4

Note: ChatGPT is used in this post as an assistant.

When I found this work, I got super excited! A bunch of questions came to my mind, and I knew I had to write a blog post on it. It might be a game-changer like the Transformer paper was. This could take AI to new levels. So, let’s jump in and see what this paper’s all about.



Large language models (LLMs) can produce coherent outputs, but they often struggle with more complex tasks that involve multiple objectives or loosely defined goals. Current techniques for refining LLM-generated text rely on external supervision and reward models, which require large amounts of training data or costly human annotations. This highlights the need for a more flexible method that can handle a range of tasks without extensive supervision.

To address these limitations, a new method called SELF-REFINE has been proposed. It aims to mimic the human creative process more closely, without the need for an expensive human feedback loop. SELF-REFINE consists of an iterative loop between two components, FEEDBACK and REFINE, that work together to produce high-quality outputs. The process starts with an initial draft generated by a model. That draft is passed back to the same model for feedback, and the feedback is then used to refine the draft. The loop continues until the model determines that no further refinement is needed or a specified number of iterations has been reached. The same underlying language model performs both feedback and refinement in a few-shot setup.
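To make the loop concrete, here is a minimal Python sketch of the draft, feedback, refine cycle described above. It is not the authors' implementation: the `llm` callable, the prompt wording, and the `STOP_MARKER` string are all assumptions made for illustration, and the few-shot examples that would normally be prepended to each prompt are left out for brevity.

```python
from typing import Callable, Tuple

# Hypothetical stop signal: we assume the model is asked to reply with this
# exact phrase when it thinks the draft needs no more work.
STOP_MARKER = "NO FURTHER REFINEMENT NEEDED"


def self_refine(
    task: str,
    llm: Callable[[str], str],  # assumed interface: prompt in, completion out
    max_iters: int = 4,
) -> Tuple[str, int]:
    """Return the final draft and the number of refinement rounds performed."""
    # Step 0: the model writes an initial draft.
    draft = llm(f"Task: {task}\nWrite an initial draft.")

    for round_idx in range(max_iters):
        # FEEDBACK: the same model critiques its own draft.
        feedback = llm(
            f"Task: {task}\nDraft: {draft}\n"
            f"Give feedback, or reply '{STOP_MARKER}'."
        )
        if STOP_MARKER in feedback:
            return draft, round_idx

        # REFINE: the same model rewrites the draft using that feedback.
        draft = llm(
            f"Task: {task}\nDraft: {draft}\n"
            f"Feedback: {feedback}\nRewrite the draft."
        )

    # Iteration budget exhausted: return the latest draft.
    return draft, max_iters
```

Any function that maps a prompt string to a completion string can be passed as `llm`, so the same loop works with a real API client or with a stub for testing. The two stopping conditions match the ones in the text: either the model signals it is done, or the iteration budget runs out.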

SELF-REFINE has been applied to various tasks across diverse domains that require different feedback and revision strategies, such as review rewriting, acronym generation, constrained generation, story generation, code rewriting, response generation, and toxicity removal. Because it uses few-shot prompting, the model only needs a small number of examples to guide it on each task. According to the authors, SELF-REFINE is the first method to offer an iterative approach for improving generation using natural language feedback. They hope this iterative framework will encourage further research in the area.

The contributions of this work can be summarized as follows:

  • SELF-REFINE is a novel approach allowing LLMs to…
Isaac Kargar

Co-Founder and CIO @ SUPPLYZ | Ph.D. candidate at the Intelligent Robotics Group at Aalto University