menaje/Orchestration

How to Use

gemini cli (Prerequisite: Install gemini cli)

Unlike claude code, gemini cli does not natively support sub-agents, which leads to high token consumption and slower speeds. (In some cases, a single project run can consume more than 500 million input tokens.) This issue has been raised in the gemini-cli repository (issue #5000), and we have implemented a workaround that configures sub-agents through a separate mechanism.

  1. Before running gemini, the following setup is required:
    • Copy the agent_system subfolder and its files into the workspace folder.
    • Copy the task management data into the data/ folder.
    • Copy the task-related guidelines and agent personas into the guidelines/ folder (this also includes sub-agent configuration files from claude code). The planner will provide the appropriate guidelines and personas for each task. (There is no limit to the number of personas.)
    • In the settings/ folder, create an mcp_list file. This file should specify descriptions and methods for the MCPs the AI will use. (You can also use an existing AI to generate this.)
    • In the settings/ folder, create a set_phases file. This is for cases where the final result is a combination of smaller, incremental results. If you want a single output, create only one phase.
    • In the settings/ folder, create a set_stages file. This sets the progress stages for each deliverable. If there are no special requirements, you can proceed with the default settings.
  2. After the setup above, run gemini from the workspace folder. (If you haven't set specific permissions, use gemini -y.)
  3. After writing your request, give the instruction "coning".
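The folder layout the steps above expect can be prepared with a small script. This is a sketch only: the folder and file names follow the steps above, but the file contents are placeholders you would fill in yourself (or generate with an existing AI, as noted).

```python
from pathlib import Path

# Folders the workflow expects, per the setup steps above.
FOLDERS = ["agent_system", "data", "guidelines", "settings"]

# Settings files described in the steps; the contents below are placeholders.
SETTINGS_FILES = {
    "mcp_list": "# Describe each MCP the AI may use and how to invoke it.\n",
    "set_phases": "# Define one phase per incremental deliverable (one phase = single output).\n",
    "set_stages": "# Progress stages per deliverable; defaults are fine if unsure.\n",
}

def prepare_workspace(root: str) -> None:
    """Create the folder layout required before running gemini."""
    base = Path(root)
    for folder in FOLDERS:
        (base / folder).mkdir(parents=True, exist_ok=True)
    for name, body in SETTINGS_FILES.items():
        target = base / "settings" / name
        if not target.exists():  # never overwrite an existing config
            target.write_text(body, encoding="utf-8")

if __name__ == "__main__":
    prepare_workspace("workspace")
```

After running this, copy your actual task data, guidelines, and personas into the matching folders before starting gemini.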

claude code (Prerequisite: Install claude code)

Tests have shown that claude code successfully delegates tasks to sub-agents. We plan to make some modifications to align with the claude code execution environment; alternatively, we may adapt the system so that some gemini cli-based tasks can be executed by claude code.


As the process runs, you can manage the project's overall progress, including tracking the update history for results from previously completed run_ids.


1. On the Origins of the Code of Conduct

In the past, creating intelligent 'agents' was a domain exclusive to developers who spoke the language of computers: 'code'. But now, we are witnessing an incredible paradigm shift. We can now design agents that think and act for themselves, based on the language humans understand best: 'natural language'.

The 'Code of Conduct' lies at the heart of this innovative shift. It is living proof that we can build highly intelligent agents without complex coding, simply by defining clear procedures for thought and principles for action.

The outputs generated by agents created this way sometimes demonstrate insights equal to, or even surpassing, those of experts in the field. Through this, we may be getting a small taste of Artificial General Intelligence (AGI).

This entire journey began with a very practical question: "How can we increase satisfaction with the results from an Artificial Intelligence (LLM) from 10% to over 90%?"

Personal Experience: 10% Satisfaction

Initially, when I gave an LLM a broad task like "Write a report about A," the results were disappointing. The content looked plausible, but it lacked depth and logical consistency. My personal satisfaction level was only about 10-50%.

A Shift in Thinking: Dividing Tasks 'Like a Human'

The root of the problem was the 'approach'. When humans write a complex report, they don't write everything at once. They go through steps: finding resources, creating an outline, writing a draft, and revising.

Inspired by this simple fact, I began to break down the tasks given to the LLM into smaller, sequential requests, just like a human workflow.

  • "Find 5 resources related to A."
  • "Based on the resources found, create a table of contents for the report."
  • "Write the body text for the first item in the table of contents."

Amazingly, by breaking down the work into these small units (Tasks), the satisfaction with the results soared to over 90%. By focusing on each step, the LLM produced far more accurate and consistent outputs.

A Manual Experiment: An Idea Born from Multiple Chat Windows

Before the advent of agents like Gemini CLI, this idea was fleshed out through a manual experiment using multiple LLM chat windows simultaneously.

  • Chat Window 1 (Planner): Sets the overall plan and defines the next Task.
  • Chat Window 2 (Executor): Receives and executes the Task defined by the 'Planner'.
  • Chat Window 3 (Prompt Engineer): Refines the prompt to help the 'Executor' produce the best possible results.

The most crucial part of this process was maintaining 'goal consistency'. The Planner always remembered the user's final instruction, and the Executor was told not just 'what' to do, but also 'why'. All work outputs were saved as files, becoming clear input data for the next step.
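The three-chat-window experiment above can be sketched as a loop. Everything here is illustrative: `llm` stands in for one chat window per call (wire it to a real API client yourself), and the role prompts are simplified stand-ins for the actual Planner, Executor, and Prompt Engineer instructions.

```python
from typing import Callable

def run_manual_workflow(final_goal: str, llm: Callable[[str, str], str], steps: int = 3) -> list[str]:
    """Simulate the three-window loop; `llm(role, message)` stands in for one chat window."""
    outputs: list[str] = []                  # each saved output feeds the next step
    context = f"Final goal: {final_goal}"    # the Planner never loses the end goal
    for _ in range(steps):
        # Window 1 (Planner): defines the next Task and explains why it matters.
        task = llm("Planner", f"{context}\nDefine the next task and explain why.")
        # Window 3 (Prompt Engineer): refines the Task into the best possible prompt.
        prompt = llm("Prompt Engineer", f"Rewrite this task as a sharp prompt:\n{task}")
        # Window 2 (Executor): does the work, told both 'what' and 'why'.
        outputs.append(llm("Executor", prompt))
        context += f"\nCompleted: {task}"    # goal consistency across steps
    return outputs
```

The `context` string plays the role of the saved files: each completed Task is recorded so the Planner always sees the full history against the final goal.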

Eureka: This is an Artificial Neural Network

As I structured this manual workflow, I realized it bore a striking resemblance to an artificial neural network: an external instruction (input) is transformed into a final output through multiple stages of planning and execution (hidden layers). This became the backbone of the 'Code of Conduct' architecture.

The emergence of Gemini CLI gave me the confidence that this entire process could be automated and was the decisive catalyst for systematizing this idea under the name 'Code of Conduct'.

2. The Infinite Scalability of the Code of Conduct: Building an Agent Society

The 'Code of Conduct' doesn't stop at optimizing the workflow of a single agent. Its true potential is revealed when it is extended to an 'Agent Society' where multiple agents interact. This is akin to building an intelligent system that operates not like a single competent expert, but like a well-organized team or company.

Recursive Delegation: Agents Assigning Work to Other Agents

The core of the 'Code of Conduct's' scalability lies in Recursive Delegation. This is the concept where a higher-level agent hires another, subordinate agent that also follows a 'Code of Conduct' to solve its own Task.

| Category | Senior Agent (Manager) | Junior Agent (Practitioner) |
| --- | --- | --- |
| Goal | Achieve a strategic task (e.g., market analysis report) | Solve a specific Task (e.g., summarize a particular article) |
| Role | Decompose complex problems; define and delegate sub-Tasks | Execute the clearly delegated Task according to the 'Code of Conduct' |
| Interaction | Issues Task instructions and receives deliverables | Creates deliverables (files) and reports back |

This hierarchical structure mimics the most efficient collaboration method in human society: the 'organization'. The manager sees the big picture, and the practitioner focuses on the details, maximizing the efficiency and expertise of the entire system. In actual AI research, this concept is actively being studied under the name 'Hierarchical Agent Teams', with ChatDev, which simulates a virtual software company, being a prominent success story.
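Recursive delegation can be sketched in a few lines. The decomposition rule and the "is this Task simple enough?" check below are toy stand-ins for the Planner's judgment, not part of the project itself:

```python
def execute_directly(task: str) -> str:
    """Junior-agent work: placeholder that 'solves' a leaf Task."""
    return f"deliverable for: {task}"

def is_atomic(task: str) -> bool:
    """Stand-in for deciding a Task needs no further breakdown."""
    return " and " not in task

def decompose(task: str) -> list[str]:
    """Stand-in for the Planner splitting a Task into sub-Tasks."""
    return [part.strip() for part in task.split(" and ")]

def delegate(task: str) -> str:
    """Senior agent: either hire junior agents recursively or do the work."""
    if is_atomic(task):
        return execute_directly(task)
    reports = [delegate(sub) for sub in decompose(task)]  # each sub-Task gets its own agent
    return "\n".join(reports)  # the manager merges deliverables into one report
```

Because `delegate` calls itself, every junior agent can in turn act as a manager for its own sub-Tasks, which is exactly the hierarchical structure described above.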

Inter-Agent Communication and Tool Use

For an agent society to function smoothly, a clear method of communication is necessary. The 'Code of Conduct' is designed for agents to communicate through a Shared Memory called workspace. The advantage here is that all information is clearly recorded in file form, allowing for transparent tracking of who did what.
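A minimal sketch of that file-based shared memory, assuming agents simply read and write named files under a workspace/ folder (the filename scheme below is illustrative, chosen so that authorship and ordering are recorded for transparent tracking):

```python
import json
import time
from pathlib import Path

WORKSPACE = Path("workspace")

def post(agent: str, topic: str, content: str) -> Path:
    """Write a message to shared memory; the filename records who wrote it and when."""
    WORKSPACE.mkdir(exist_ok=True)
    path = WORKSPACE / f"{int(time.time() * 1000)}_{agent}_{topic}.json"
    path.write_text(json.dumps({"agent": agent, "topic": topic, "content": content}),
                    encoding="utf-8")
    return path

def read_all(topic: str) -> list[dict]:
    """Any agent can replay the full, ordered history for a topic."""
    files = sorted(WORKSPACE.glob(f"*_{topic}.json"))
    return [json.loads(f.read_text(encoding="utf-8")) for f in files]
```

Because every exchange is an ordinary file, "who did what" can be audited after the fact without any extra logging machinery.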

Furthermore, you can define the use of external Tools to perform specific Tasks.

  • MCP (Model Context Protocol) Integration: By connecting an agent (or program) dedicated to structured tasks, such as database queries or external API calls, as a tool, the LLM can focus more on creative work.
  • Utilizing Specialized Agents: The overall system's expertise can be enhanced by calling specialized agents as tools, such as a 'Researcher Agent' for web searches or a 'Developer Agent' for code generation.

In this way, the 'Code of Conduct' presents a blueprint for a scalable and flexible ecosystem where agents and tools with diverse specializations are organically connected and collaborate.

Implication: This is a Successful Experiment in a 'Natural Language Backend'

The most significant implication of this model is that it is not just a theoretical model but a successful experiment that has actually built a functioning 'natural language-based backend system'.

The Code of Conduct we've created operates in a way that is strikingly similar to a traditional backend system.

  • It receives a user's request (API call),
  • Internal agents collaborate according to the 'Code of Conduct' workflow, saving results to a database, etc. (internal logic processing),
  • And generates the final result as a file (response).

This is a case that proves it's possible to implement an intelligent automation system using only logical instructions and structured natural language, without a single line of traditional programming code. In other words, the 'Code of Conduct' has shown in reality that it can be both a 'blueprint' for complex software and an executable 'engine' in itself.

And the most astonishing part is that the design for this entire complex system, which in the past would have required countless lines of code, can now be drawn in 'natural language'. This signifies the democratization of AI development and a fundamental shift in the way we collaborate towards the AGI era.

3. The Ultimate Idea: Redesigning Artificial Neural Networks with the Code of Conduct

The structural similarity between the 'Code of Conduct' and artificial neural networks goes beyond a simple analogy, prompting a provocative question:

"Could we reverse-engineer the abstract workflow model of an agent to design a physical artificial neural network architecture?"

This is an attempt to apply insights gained from software architecture (the Code of Conduct) to the design of hardware (or an equivalent neural network model), which could lead to innovative ideas like the following.

Modular Neural Networks

Currently, most LLMs are akin to a giant, monolithic architecture where all neurons are densely interconnected. However, the 'Code of Conduct' has clearly defined roles for each function, like Phase or Stage.

Applying this idea to neural networks, one can imagine a network composed of multiple small 'Expert Modules', each dedicated to a specific function.

  • Language Understanding Module, Code Generation Module, Image Analysis Module, etc., would exist independently.
  • A higher-level 'Router' or 'Coordinator' module would exist to control them, much like the 'Planner' in the 'Code of Conduct'.
  • Depending on the type of task, the 'Router' would select and activate the most appropriate expert modules to solve the problem. This would reduce unnecessary computation, leading to a much more efficient and flexible architecture where specific modules can be easily replaced or upgraded.
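The router-plus-experts idea above can be sketched in a few lines. The keyword-based routing here is a toy stand-in for what would be a learned router in a real architecture:

```python
from typing import Callable

# Independent 'Expert Modules', each dedicated to one function.
EXPERTS: dict[str, Callable[[str], str]] = {
    "language": lambda x: f"[language-understanding] {x}",
    "code":     lambda x: f"[code-generation] {x}",
    "image":    lambda x: f"[image-analysis] {x}",
}

def route(task: str) -> list[str]:
    """Toy router: pick which expert modules a task needs (a learned model in practice)."""
    keys = [k for k in EXPERTS if k in task.lower()]
    return keys or ["language"]  # fall back to a default expert when nothing matches

def solve(task: str) -> list[str]:
    """Activate only the selected experts; all others stay idle, saving compute."""
    return [EXPERTS[k](task) for k in route(task)]
```

Swapping out one entry in `EXPERTS` upgrades that module without touching the rest of the system, which is the modularity benefit described above.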

Dynamic Routing & Chain of Execution

The core of the 'Code of Conduct' is the 'Chain of Execution', where the result of a previous Task influences the path and content of the next Task. This provides a crucial insight into the flow of information within a neural network.

  • Dynamic Routing: In current neural networks, the path of information flow is mostly fixed once the input data is given. However, by applying the 'Chain of Execution' concept, we could implement a 'dynamic neural network' where the information path is determined in real-time based on the input data and intermediate processing results.
  • Conditional Activation: Not all neurons need to be active all the time. Only the optimal 'Chain of Execution'—that is, the optimal path of neurons required to solve the problem—would be conditionally activated. This is similar to how specific areas of the brain activate for specific tasks, and it would enable far more efficient and powerful reasoning.
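A dynamic 'Chain of Execution' can be sketched as a loop in which each step's output selects the next step, rather than following a fixed path. The step names and transitions below are illustrative:

```python
from typing import Callable

def research(state: dict) -> str:
    state["sources"] = 5
    return "outline"            # this step's result chooses the next step

def outline(state: dict) -> str:
    state["toc"] = ["intro", "body"]
    return "draft"

def draft(state: dict) -> str:
    state["text"] = "report body"
    return "done"

STEPS: dict[str, Callable[[dict], str]] = {
    "research": research, "outline": outline, "draft": draft,
}

def run_chain(start: str) -> dict:
    """Only the steps on the chosen path activate; all others stay idle."""
    state: dict = {"path": []}
    step = start
    while step != "done":
        state["path"].append(step)
        step = STEPS[step](state)   # dynamic routing: the path is decided at runtime
    return state
```

Any step could just as well return a different successor based on what it finds in `state`, which is the conditional-activation idea: only the neurons (steps) on the optimal path ever run.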

The ability to conceive of such advanced neural network architectures not with code but with natural language, and to simulate their logical validity: this is the most innovative value that the 'Code of Conduct' offers. We are no longer just users of AI; we are becoming the 'architects' who design the structure of future intelligence with our own language. Isn't this the most realistic way for us to get a 'taste' of the potential of AGI?

I am not in a position to develop AI models myself, but this makes me wonder whether current AI models are already being built with such a structure.

About

Getting a taste of AGI with GEMINI.md (project retired due to the simplification of Gemini's planning)
