`gemini cli` (Pre-requisite: Install `gemini cli`)
Unlike `claude code`, sub-agents are not natively supported, which leads to problems caused by high token generation (and slower speeds). (In some cases, more than 500 million tokens can be consumed as input for a single project run.) This issue has been raised in the gemini-cli issues (#5000), and we have currently implemented a workaround that configures sub-agents through a separate method.
- Before running `gemini`, the following setup is required:
  - Copy the `agent_system` subfolder and its files into the workspace folder.
  - Copy the task management data into the `data/` folder.
  - Copy the task-related guidelines and agent personas into the `guidelines/` folder (this also includes sub-agent configuration files from `claude code`). The planner will provide the appropriate guidelines and personas for each task. (There is no limit to the number of personas.)
  - In the `settings/` folder, create an `mcp_list` file. This file should specify descriptions and methods for the MCPs the AI will use. (You can also use an existing AI to generate this.)
  - In the `settings/` folder, create a `set_phases` file. This is for cases where the final result is a combination of smaller, incremental results. If you want a single output, create only one phase.
  - In the `settings/` folder, create a `set_stages` file. This sets the progress stages for each deliverable. If there are no special requirements, you can proceed with the default settings.
- After the setup above, run `gemini` from the workspace folder. (If you haven't set specific permissions, use `gemini -y`.)
- After writing your request, give the instruction "coning".
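The setup steps above can be sketched as a small script. This is a minimal sketch only: the folder and file names come from the list above, but the file contents written here are placeholders, not the real formats.

```python
from pathlib import Path

# Sketch of the pre-run setup described above; file contents are placeholders.
workspace = Path("workspace")

# Copy the agent_system subfolder into the workspace (represented by mkdir here).
(workspace / "agent_system").mkdir(parents=True, exist_ok=True)

# Task management data goes into data/; guidelines and personas into guidelines/.
(workspace / "data").mkdir(exist_ok=True)
(workspace / "guidelines").mkdir(exist_ok=True)

# settings/ holds the three configuration files named in the list above.
settings = workspace / "settings"
settings.mkdir(exist_ok=True)
(settings / "mcp_list").write_text("# MCP descriptions and usage methods go here\n")
(settings / "set_phases").write_text("phase_1\n")  # a single phase = a single output
(settings / "set_stages").write_text("default\n")  # default progress stages

print(sorted(p.name for p in settings.iterdir()))
```

After this layout exists, `gemini` (or `gemini -y`) is run from the workspace folder as described above.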
`claude code` (Pre-requisite: Install `claude code`)
Tests have shown that `claude code` successfully delegates tasks to sub-agents. We plan to make some modifications to align with the `claude code` execution environment. Alternatively, we may adapt the system so that some tasks based on `gemini cli` can be executed by `claude code`.
As the process runs, you can manage the project's overall progress, including tracking the update history for results from previously completed run_ids.
In the past, creating intelligent 'agents' was a domain exclusive to developers who spoke the language of computers: 'code'. But now, we are witnessing an incredible paradigm shift. We can now design agents that think and act for themselves, based on the language humans understand best: 'natural language'.
The 'Code of Conduct' lies at the heart of this innovative shift. It is living proof that we can build highly intelligent agents without complex coding, simply by defining clear procedures for thought and principles for action.
The outputs generated by agents created in this way sometimes demonstrate insights equal to, or even surpassing, those of experts in the field. Through this, we may be getting a small taste of a tiny piece of Artificial General Intelligence (AGI).
This entire journey began with a very practical question: "How can we increase satisfaction with the results from an Artificial Intelligence (LLM) from 10% to over 90%?"
Initially, when I gave an LLM a broad task like "Write a report about A," the results were disappointing. The content looked plausible, but it lacked depth and logical consistency. My personal satisfaction level was only about 10-50%.
The root of the problem was the 'approach'. When humans write a complex report, they don't write everything at once. They go through steps: finding resources, creating an outline, writing a draft, and revising.
Inspired by this simple fact, I began to break down the tasks given to the LLM into smaller, sequential requests, just like a human workflow.
- "Find 5 resources related to A."
- "Based on the resources found, create a table of contents for the report."
- "Write the body text for the first item in the table of contents."
Amazingly, by breaking down the work into these small units (Tasks), the satisfaction with the results soared to over 90%. By focusing on each step, the LLM produced far more accurate and consistent outputs.
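The decomposition idea above can be sketched as a short loop: one broad request becomes a chain of small Tasks, each fed the previous step's output. `run_llm` is a hypothetical stand-in for a real LLM call, not an actual API.

```python
# `run_llm` is a placeholder; swap in a real LLM call in practice.
def run_llm(prompt: str) -> str:
    return f"[output for: {prompt[:40]}...]"

tasks = [
    "Find 5 resources related to A.",
    "Based on the resources found, create a table of contents for the report.",
    "Write the body text for the first item in the table of contents.",
]

context = ""
for task in tasks:
    # Each small Task sees the prior deliverable, keeping the chain consistent.
    prompt = f"{context}\n\nTask: {task}".strip()
    context = run_llm(prompt)

print(context)
```

The key point is structural: each step's output becomes the next step's input, so the model only ever focuses on one small unit at a time.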
Before the advent of agents like Gemini CLI, this idea was fleshed out through a manual experiment using multiple LLM chat windows simultaneously.
- Chat Window 1 (Planner): Sets the overall plan and defines the next Task.
- Chat Window 2 (Executor): Receives and executes the Task defined by the 'Planner'.
- Chat Window 3 (Prompt Engineer): Refines the prompt to help the 'Executor' produce the best possible results.
The most crucial part of this process was maintaining 'goal consistency'. The Planner always remembered the user's final instruction, and the Executor was told not just 'what' to do, but also 'why'. All work outputs were saved as files, becoming clear input data for the next step.
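The three-chat-window experiment can be sketched as three functions passing work along. The role behaviors here are illustrative stand-ins; in the manual experiment each role was a separate chat window, and deliverables were saved as files.

```python
# Sketch of the Planner / Prompt Engineer / Executor roles. All outputs are
# illustrative strings; in practice each role is an LLM session.
def planner(goal: str, history: list[str]) -> str:
    # The Planner always restates the final goal, so every Task carries the 'why'.
    step = len(history) + 1
    return f"Step {step} toward goal '{goal}'"

def prompt_engineer(task: str) -> str:
    return f"Refined prompt: {task}. Explain your reasoning and cite sources."

def executor(prompt: str) -> str:
    return f"Deliverable for ({prompt})"

goal = "Write a report about A"
deliverables: list[str] = []          # stands in for files saved to the workspace
for _ in range(3):
    task = planner(goal, deliverables)
    deliverables.append(executor(prompt_engineer(task)))

print(len(deliverables), deliverables[0])
```

Because the Planner re-reads the history and restates the goal at every step, goal consistency is maintained even though each Executor call only sees one Task.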
As I structured this manual workflow, I realized it bore a striking resemblance to an artificial neural network: an external instruction (input) is transformed into a final output through multiple stages of planning and execution (hidden layers). This became the backbone of the 'Code of Conduct' architecture.
The emergence of Gemini CLI gave me the confidence that this entire process could be automated and was the decisive catalyst for systematizing this idea under the name 'Code of Conduct'.
The 'Code of Conduct' doesn't stop at optimizing the workflow of a single agent. Its true potential is revealed when it is extended to an 'Agent Society' where multiple agents interact. This is akin to building an intelligent system that operates not like a single competent expert, but like a well-organized team or company.
The core of the 'Code of Conduct's' scalability lies in Recursive Delegation. This is the concept where a higher-level agent hires another, subordinate agent that also follows a 'Code of Conduct' to solve its own Task.
| Category | Senior Agent (Manager) | Junior Agent (Practitioner) |
|---|---|---|
| Goal | Achieve a strategic task (e.g., market analysis report) | Solve a specific Task (e.g., summarize a particular article) |
| Role | Decompose complex problems, define, and delegate sub-Tasks | Execute the clearly delegated Task according to the 'Code of Conduct' |
| Interaction | Issues Task instructions and receives deliverables | Creates and reports back with deliverables (files) |
This hierarchical structure mimics the most efficient collaboration method in human society: the 'organization'. The manager sees the big picture, and the practitioner focuses on the details, maximizing the efficiency and expertise of the entire system. In actual AI research, this concept is actively being studied under the name 'Hierarchical Agent Teams', with ChatDev, which simulates a virtual software company, being a prominent success story.
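Recursive Delegation can be sketched as a single recursive function: an agent with sub-Tasks acts as a senior (decompose and delegate), and an agent with none acts as a junior (execute directly). The task names are illustrative.

```python
# Toy sketch of Recursive Delegation: seniors decompose, juniors execute.
# Every level follows the same procedure, which is the recursive part.
def solve(task: str, subtasks_of: dict[str, list[str]]) -> str:
    children = subtasks_of.get(task, [])
    if not children:                       # junior agent: execute directly
        return f"done({task})"
    # senior agent: delegate each sub-Task, then combine the deliverables
    reports = [solve(child, subtasks_of) for child in children]
    return f"{task}: " + " + ".join(reports)

plan = {
    "market analysis report": ["summarize article X", "summarize article Y"],
}
print(solve("market analysis report", plan))
# -> market analysis report: done(summarize article X) + done(summarize article Y)
```

Adding another level to `plan` (a sub-Task that itself has sub-Tasks) deepens the hierarchy without changing the code, which is the point of every agent following the same 'Code of Conduct'.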
For an agent society to function smoothly, a clear method of communication is necessary. The 'Code of Conduct' is designed for agents to communicate through a Shared Memory called workspace. The advantage here is that all information is clearly recorded in file form, allowing for transparent tracking of who did what.
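File-based Shared Memory can be sketched as agents writing self-describing deliverable files into the workspace. The path and record fields here are illustrative, not the system's actual file format.

```python
# Sketch of the shared-workspace idea: every deliverable is a file that
# records who produced it, so the history is transparently traceable.
import json
from pathlib import Path

workspace = Path("workspace_demo")
workspace.mkdir(exist_ok=True)

def report(agent: str, task: str, result: str) -> Path:
    out = workspace / f"{agent}_{task}.json"
    out.write_text(json.dumps({"agent": agent, "task": task, "result": result}))
    return out

path = report("researcher", "task_001", "summary of sources")
record = json.loads(path.read_text())
print(record["agent"])   # the producer is recoverable from the file itself
```

Any later agent can read the same file as its input, which is exactly the "who did what" transparency described above.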
Furthermore, you can define the use of external Tools to perform specific Tasks.
- MCP (Model Context Protocol) Integration: By connecting an agent (or program) dedicated to structured tasks like database queries or external API calls as a tool, the LLM can focus more on creative work.
- Utilizing Specialized Agents: The overall system's expertise can be enhanced by calling specialized agents as tools, such as a 'Researcher Agent' for web searches or a 'Developer Agent' for code generation.
In this way, the 'Code of Conduct' presents a blueprint for a scalable and flexible ecosystem where agents and tools with diverse specializations are organically connected and collaborate.
The most significant implication of this model is that it is not just a theoretical model but a successful experiment that has actually built a functioning 'natural language-based backend system'.
The Code of Conduct we've created operates in a way that is strikingly similar to a traditional backend system.
- It receives a user's request (API call),
- Internal agents collaborate according to the 'Code of Conduct' workflow, saving results to a database, etc. (internal logic processing),
- And generates the final result as a file (response).
This is a case that proves it's possible to implement an intelligent automation system using only logical instructions and structured natural language, without a single line of traditional programming code. In other words, the 'Code of Conduct' has shown in reality that it can be both a 'blueprint' for complex software and an executable 'engine' in itself.
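The backend analogy above can be sketched as a single request handler: a request comes in, internal logic runs, and the response is a file. The function and file names are illustrative only.

```python
# Sketch of the 'natural-language backend': request in, internal agent
# pipeline (represented by one line here), file out as the response.
from pathlib import Path

def handle_request(request: str) -> Path:
    # internal logic: agents collaborate per the Code of Conduct workflow
    result = f"Final report for request: {request}"
    response = Path("response.txt")       # the 'response' is a file
    response.write_text(result)
    return response

out = handle_request("Write a report about A")
print(out.read_text())
```

The shape mirrors a traditional API endpoint, except the "internal logic" is structured natural language rather than code.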
And the most astonishing part is that the design for this entire complex system, which in the past would have required countless lines of code, can now be drawn in 'natural language'. This signifies the democratization of AI development and a fundamental shift in the way we collaborate towards the AGI era.
The structural similarity between the 'Code of Conduct' and artificial neural networks goes beyond a simple analogy, prompting a provocative question:
"Could we reverse-engineer the abstract workflow model of an agent to design a physical artificial neural network architecture?"
This is an attempt to apply insights gained from software architecture (the Code of Conduct) to the design of hardware (or an equivalent neural network model), which could lead to innovative ideas like the following.
Currently, most LLMs are akin to a giant, monolithic architecture where all neurons are densely interconnected. However, the 'Code of Conduct' has clearly defined roles for each function, like Phase or Stage.
Applying this idea to neural networks, one can imagine a network composed of multiple small 'Expert Modules', each dedicated to a specific function.
- Language Understanding Module, Code Generation Module, Image Analysis Module, etc., would exist independently.
- A higher-level 'Router' or 'Coordinator' module would exist to control them, much like the 'Planner' in the 'Code of Conduct'.
- Depending on the type of task, the 'Router' would select and activate the most appropriate expert modules to solve the problem. This would reduce unnecessary computation, leading to a much more efficient and flexible architecture where specific modules can be easily replaced or upgraded.
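The Router-plus-experts idea above can be sketched as a dispatch table: only the expert selected for the task type runs, and modules can be swapped by editing one mapping. The module names are illustrative.

```python
# Toy sketch of a 'Router' selecting expert modules by task type, echoing
# the Planner role in the Code of Conduct. Experts are stub functions.
def language_module(x: str) -> str: return f"understood({x})"
def code_module(x: str) -> str:     return f"generated_code({x})"
def image_module(x: str) -> str:    return f"analyzed_image({x})"

EXPERTS = {"text": language_module, "code": code_module, "image": image_module}

def router(task_type: str, payload: str) -> str:
    # only the selected expert runs; the others stay inactive
    return EXPERTS[task_type](payload)

print(router("code", "sort a list"))   # -> generated_code(sort a list)
```

Replacing or upgrading a module means changing one entry in `EXPERTS`, which is the flexibility the paragraph above describes.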
The core of the 'Code of Conduct' is the 'Chain of Execution', where the result of a previous Task influences the path and content of the next Task. This provides a crucial insight into the flow of information within a neural network.
- Dynamic Routing: In current neural networks, the path of information flow is mostly fixed once the input data is given. However, by applying the 'Chain of Execution' concept, we could implement a 'dynamic neural network' where the information path is determined in real-time based on the input data and intermediate processing results.
- Conditional Activation: Not all neurons need to be active all the time. Only the optimal 'Chain of Execution'—that is, the optimal path of neurons required to solve the problem—would be conditionally activated. This is similar to how specific areas of the brain activate for specific tasks, and it would enable far more efficient and powerful reasoning.
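Dynamic routing and conditional activation can be illustrated with a toy forward pass where the next step is chosen from an intermediate result, so only one chain of functions activates per input. The branching rule here is arbitrary and purely illustrative.

```python
# Toy sketch of Dynamic Routing: the path through the 'network' depends on
# the data, and only the chosen branch (chain of execution) activates.
def classify(x: float) -> str:
    return "large" if x > 1.0 else "small"

def scale_down(x: float) -> float: return x / 10
def scale_up(x: float) -> float:   return x * 10

def dynamic_forward(x: float) -> float:
    # routing decision depends on the intermediate result, not fixed wiring
    branch = scale_down if classify(x) == "large" else scale_up
    return branch(x)

print(dynamic_forward(5.0), dynamic_forward(0.2))
```

Each input activates exactly one branch while the other stays idle, which is the "specific brain areas for specific tasks" intuition in miniature.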
The ability to conceive of such advanced neural network architectures 'not with code, but with natural language,' and to simulate their logical validity—this is the most innovative value that the 'Code of Conduct' offers. We are no longer just users of AI; we are becoming the 'architects' who design the structure of future intelligence with our own language. Isn't this the most realistic way for us to get a 'taste' of the potential of AGI?
I am not in a position to develop AI models myself, but this makes me wonder if current AI models are not already being developed with such a structure.