Researchers from Sea AI Lab, UCAS, NUS, and SJTU Introduce FlowReasoner: A Query-Level Meta-Agent for Personalized System Generation

LLM-based multi-agent systems, characterized by planning, reasoning, tool use, and memory capabilities, form the foundation for applications such as chatbots, code generation, mathematics, and robotics. However, these systems face significant challenges because they are manually designed, which leads to high human resource costs and limited scalability. Graph-based methods have tried to automate workflow design by formulating workflows as networks, but their structural complexity restricts scalability. State-of-the-art approaches represent multi-agent systems as programming code and use advanced LLMs as meta-agents to optimize workflows, but they focus on task-level solutions that generate a single system per task. This one-size-fits-all approach lacks the capability to adapt automatically to individual user queries.
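To make the distinction between task-level and query-level design concrete, here is a minimal Python sketch that treats a workflow as ordinary code. Every name in it (`direct`, `draft_then_refine`, `echo_worker`, and both meta-agent functions) is hypothetical, and the word-count heuristic is only a stand-in for the reasoning an LLM meta-agent would perform. It is not the implementation of any of the systems mentioned in this article.

```python
from typing import Callable

# A worker maps a prompt to a completion (a stand-in for an LLM call).
Worker = Callable[[str], str]
# A workflow is plain code that orchestrates one or more worker calls.
Workflow = Callable[[str, Worker], str]


def direct(query: str, worker: Worker) -> str:
    """Single call: ask the worker once."""
    return worker(query)


def draft_then_refine(query: str, worker: Worker) -> str:
    """Two calls: draft an answer, then ask the worker to improve it."""
    draft = worker(query)
    return worker(f"Improve this answer to '{query}': {draft}")


def task_level_meta_agent(task: str) -> Workflow:
    """Task-level: the same fixed workflow for every query in the task."""
    return draft_then_refine


def query_level_meta_agent(query: str) -> Workflow:
    """Query-level: a workflow chosen per query.

    Hypothetical heuristic: longer queries get the two-step workflow.
    A real meta-agent would reason about the query instead.
    """
    return draft_then_refine if len(query.split()) > 8 else direct


def echo_worker(prompt: str) -> str:
    """Stub worker so the sketch runs without any model."""
    return f"<answer to: {prompt}>"
```

With the stub worker, `query_level_meta_agent("Reverse a string")` selects the single-call workflow, while a longer, more involved request is routed to the two-step one. The task-level function returns the same workflow regardless of the query, which is the limitation described above.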

LLM-based multi-agent systems are the foundation for various real-world applications, including code intelligence, computer use, and deep research. These systems consist of LLM-based agents equipped with planning capabilities, database access, and tool function calls that collaborate to achieve strong performance. Early approaches focused on optimizing prompts or hyperparameters through evolutionary algorithms to automate agent profiling. ADAS introduced a code representation for agents and workflows, with a meta-agent that generates the workflows. In parallel, OpenAI advanced reasoning in LLMs by developing the o1 model. Models such as QwQ, QvQ, DeepSeek, and Kimi have followed with o1-like reasoning architectures. OpenAI's o3 model achieves promising results on the ARC-AGI benchmark.

Researchers from Sea AI Lab, Singapore, the University of Chinese Academy of Sciences, the National University of Singapore, and Shanghai Jiao Tong University have proposed FlowReasoner, a query-level meta-agent designed to automate the creation of query-level multi-agent systems, generating one customized system per user query. The researchers distilled DeepSeek R1 to give FlowReasoner the basic reasoning capabilities needed to create multi-agent systems, and then enhanced it through reinforcement learning with external execution feedback. A multi-purpose reward mechanism is designed to optimize training across three critical dimensions: performance, complexity, and efficiency. This allows FlowReasoner to generate personalized multi-agent systems through deliberative reasoning for each unique user query.
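As a rough illustration of how a reward could combine the three dimensions named above, the sketch below scores one executed workflow. The article does not give the actual reward formula, so everything here is an assumption: the `ExecutionFeedback` fields, the weighted-sum form, the weights, and the normalization caps are hypothetical choices made only to show the trade-off.

```python
from dataclasses import dataclass


@dataclass
class ExecutionFeedback:
    """Hypothetical summary of one workflow execution."""
    pass_rate: float   # performance: fraction of tests passed, in [0, 1]
    num_steps: int     # complexity proxy: steps in the generated workflow
    worker_calls: int  # efficiency proxy: worker LLM calls made


def multi_purpose_reward(
    fb: ExecutionFeedback,
    w_perf: float = 1.0,        # assumed weight, not from the paper
    w_complexity: float = 0.1,  # assumed weight, not from the paper
    w_efficiency: float = 0.1,  # assumed weight, not from the paper
    max_steps: int = 10,        # assumed normalization cap
    max_calls: int = 20,        # assumed normalization cap
) -> float:
    """Reward performance, penalize complexity and cost (assumed form)."""
    complexity_penalty = min(fb.num_steps / max_steps, 1.0)
    efficiency_penalty = min(fb.worker_calls / max_calls, 1.0)
    return (
        w_perf * fb.pass_rate
        - w_complexity * complexity_penalty
        - w_efficiency * efficiency_penalty
    )


simple = ExecutionFeedback(pass_rate=1.0, num_steps=2, worker_calls=3)
heavy = ExecutionFeedback(pass_rate=1.0, num_steps=8, worker_calls=16)
```

Under these assumed weights, two workflows that pass the same tests are ranked by cost: `simple` scores higher than `heavy`, so a policy trained on such a signal would be nudged toward the smallest workflow that still solves the query.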

The researchers select three datasets for a detailed evaluation across different code generation scenarios: BigCodeBench for engineering-oriented tasks, and HumanEval and MBPP for algorithmic challenges. FlowReasoner is evaluated against three categories of baselines:

  • Single-model direct invocation using standalone LLMs
  • Manually designed workflows, including Self-Refine, LLM-Debate, and LLM-Blender, with human-crafted reasoning strategies
  • Automated workflow optimization methods such as AFlow, ADAS, and MaAS, which construct workflows through search or optimization

Both o1-mini and GPT-4o-mini are used as worker models for the manually designed workflows. FlowReasoner is implemented with two variants of DeepSeek-R1-Distill-Qwen (7B and 14B parameters), using o1-mini as the worker model.

FlowReasoner-14B outperforms all competing approaches, achieving an overall improvement of 5 percentage points over the strongest baseline, MaAS. It also exceeds the performance of its underlying worker model, o1-mini, by a substantial margin of 10%. These results show the effectiveness of the workflow-based reasoning framework in improving code generation accuracy. To evaluate generalization, the researchers replace the o1-mini worker with models such as Qwen2.5-Coder, Claude, and GPT-4o-mini, while keeping the meta-agent fixed as either FlowReasoner-7B or FlowReasoner-14B. FlowReasoner exhibits notable transferability, maintaining consistent performance across different worker models on the same tasks.

In this paper, the researchers present FlowReasoner, a query-level meta-agent designed to automate the creation of personalized multi-agent systems for individual user queries. FlowReasoner uses external execution feedback and reinforcement learning with multi-purpose rewards focused on performance, complexity, and efficiency to generate optimized workflows without relying on complex search algorithms or carefully designed search sets. This approach reduces human resource costs while improving scalability, because it enables more adaptive and efficient multi-agent systems that dynamically optimize their structure for specific user queries rather than relying on fixed workflows for entire task categories.




Sajjad Ansari is a final-year undergraduate at IIT Kharagpur. As a tech enthusiast, he explores the practical applications of AI, with a focus on understanding the impact of AI technologies and their real-world implications. He aims to explain complex AI concepts in a clear and accessible way.
