Comparing changes

v0.3.0 → v0.4.0

22 commits, 16 files changed

Commits

f84ec1d chore: update Gemfile.lock mo khan 2025-11-10 18:39:04

754d5f3 docs: update README mo khan 2025-11-10 18:38:07

cbf3ed7 chore: bump version mo khan 2025-11-10 18:37:56

9c54231 docs: fix homepage mo khan 2025-11-08 00:09:44

94a0329 feat: give eval tool access to the toolbox mo khan 2025-11-07 18:24:59

42cd146 feat: add an eval tool to allow an llm to use meta-programming to give itself more tools mo khan 2025-11-07 17:57:42

def5f80 refactor: move tool constants to Toolbox mo khan 2025-11-07 17:46:45

01a88f5 refactor: extract constants for each tool mo khan 2025-11-07 17:43:54

4736632 refactor: extract Shell class mo khan 2025-11-07 17:23:46

ae9aea3 refactor: extract Tool#name mo khan 2025-11-07 16:59:59

074912e refactor: extract a Tool.build method mo khan 2025-11-07 16:56:22

389115b style: remove empty line mo khan 2025-11-07 16:00:50

f63a691 Register tools through #add_tool method mo khan 2025-11-07 15:51:21

9b90cee docs: update README mo khan 2025-11-07 15:01:21

7dffe87 refactor: delegate to Tool instances mo khan 2025-11-06 22:09:22

1e53fbf refactor: delegate to the grep_tool instance mo khan 2025-11-06 20:48:58

0790bc5 refactor: move Tool to a separate file mo khan 2025-11-06 20:19:29

6a6ba04 refactor: extract a Tool class mo khan 2025-11-06 19:30:31

bc67200 refactor: add a Toolbox class mo khan 2025-11-06 19:23:19

b0eb9b4 test: generate specs mo khan 2025-11-05 23:59:34

cdee182 fix: fix the /context to take the current mode into account mo khan 2025-11-05 23:54:18

d75b620 refactor: swap the system prompt in the conversation#history_for method mo khan 2025-11-05 23:47:49

Changed files (16)

bin/test
lib/elelem/agent.rb
lib/elelem/application.rb
lib/elelem/conversation.rb
lib/elelem/tool.rb
lib/elelem/toolbox.rb
lib/elelem/version.rb
lib/elelem.rb
spec/elelem/agent_spec.rb
spec/elelem/conversation_spec.rb
spec/elelem/toolbox_spec.rb
spec/spec_helper.rb
CHANGELOG.md
elelem.gemspec
Gemfile.lock
README.md

bin/test

@@ -5,4 +5,4 @@ set -e
 
 cd "$(dirname "$0")/.."
 
-bundle exec rake spec
+bundle exec rspec "$@"

lib/elelem/agent.rb

@@ -2,16 +2,12 @@
 
 module Elelem
   class Agent
-    attr_reader :conversation, :client, :tools
+    attr_reader :conversation, :client, :toolbox
 
-    def initialize(client)
+    def initialize(client, toolbox)
       @conversation = Conversation.new
       @client = client
-      @tools = {
-        read: [grep_tool, list_tool, read_tool],
-        write: [patch_tool, write_tool],
-        execute: [exec_tool]
-      }
+      @toolbox = toolbox
     end
 
     def repl
@@ -36,19 +32,18 @@ module Elelem
             puts "  → Mode: verify (read + execute)"
           when "/mode"
             puts "  Mode: #{mode.to_a.inspect}"
-            puts "  Tools: #{tools_for(mode).map { |t| t.dig(:function, :name) }}"
+            puts "  Tools: #{toolbox.tools_for(mode).map { |t| t.dig(:function, :name) }}"
           when "/exit" then exit
           when "/clear"
             conversation.clear
             puts "  → Conversation cleared"
-          when "/context" then puts conversation.dump
+          when "/context" then puts conversation.dump(mode)
           else
             puts help_banner
           end
         else
-          conversation.set_system_prompt(system_prompt_for(mode))
           conversation.add(role: :user, content: input)
-          result = execute_turn(conversation.history, tools: tools_for(mode))
+          result = execute_turn(conversation.history_for(mode), tools: toolbox.tools_for(mode))
           conversation.add(role: result[:role], content: result[:content])
         end
       end
@@ -70,33 +65,6 @@ module Elelem
       HELP
     end
 
-    def tools_for(modes)
-      modes.map { |mode| tools[mode] }.flatten
-    end
-
-    def system_prompt_for(mode)
-      base = "You are a reasoning coding and system agent."
-
-      case mode.to_a.sort
-      when [:read]
-        "#{base}\n\nRead and analyze. Understand before suggesting action."
-      when [:write]
-        "#{base}\n\nWrite clean, thoughtful code."
-      when [:execute]
-        "#{base}\n\nUse shell commands creatively to understand and manipulate the system."
-      when [:read, :write]
-        "#{base}\n\nFirst understand, then build solutions that integrate well."
-      when [:read, :execute]
-        "#{base}\n\nUse commands to deeply understand the system."
-      when [:write, :execute]
-        "#{base}\n\nCreate and execute freely. Have fun. Be kind."
-      when [:read, :write, :execute]
-        "#{base}\n\nYou have all tools. Use them wisely."
-      else
-        base
-      end
-    end
-
     def format_tool_call(name, args)
       case name
       when "execute"
@@ -143,7 +111,7 @@ module Elelem
             args = call.dig("function", "arguments")
 
             puts "Tool> #{format_tool_call(name, args)}"
-            result = run_tool(name, args)
+            result = toolbox.run_tool(name, args)
             turn_context << { role: "tool", content: JSON.dump(result) }
           end
 
@@ -154,120 +122,5 @@ module Elelem
         return { role: "assistant", content: content }
       end
     end
-
-    def run_exec(command, args: [], env: {}, cwd: Dir.pwd, stdin: nil)
-      cmd = command.is_a?(Array) ? command.first : command
-      cmd_args = command.is_a?(Array) ? command[1..] + args : args
-      stdout, stderr, status = Open3.capture3(env, cmd, *cmd_args, chdir: cwd, stdin_data: stdin)
-      {
-        "exit_status" => status.exitstatus,
-        "stdout" => stdout.to_s,
-        "stderr" => stderr.to_s
-      }
-    end
-
-    def expand_path(path)
-      Pathname.new(path).expand_path
-    end
-
-    def read_file(path)
-      full_path = expand_path(path)
-      full_path.exist? ? { content: full_path.read } : { error: "File not found: #{path}" }
-    end
-
-    def write_file(path, content)
-      full_path = expand_path(path)
-      FileUtils.mkdir_p(full_path.dirname)
-      { bytes_written: full_path.write(content) }
-    end
-
-    def run_tool(name, args)
-      case name
-      when "execute" then run_exec(args["cmd"], args: args["args"] || [], env: args["env"] || {}, cwd: args["cwd"].to_s.empty? ? Dir.pwd : args["cwd"], stdin: args["stdin"])
-      when "grep" then run_exec("git", args: ["grep", "-nI", args["query"]])
-      when "list" then run_exec("git", args: args["path"] ? ["ls-files", "--", args["path"]] : ["ls-files"])
-      when "patch" then run_exec("git", args: ["apply", "--index", "--whitespace=nowarn", "-p1"], stdin: args["diff"])
-      when "read" then read_file(args["path"])
-      when "write" then write_file(args["path"], args["content"])
-      else
-        { error: "Unknown tool", name: name, args: args }
-      end
-    rescue => error
-      { error: error.message, name: name, args: args }
-    end
-
-    def exec_tool
-      build_tool(
-        "execute",
-        "Execute shell commands directly. Commands run in a shell context. Examples: 'date', 'git status'.",
-        {
-          cmd: { type: "string" },
-          args: { type: "array", items: { type: "string" } },
-          env: { type: "object", additionalProperties: { type: "string" } },
-          cwd: { type: "string", description: "Working directory (defaults to current)" },
-          stdin: { type: "string" }
-        },
-        ["cmd"]
-      )
-    end
-
-    def grep_tool
-      build_tool(
-        "grep",
-        "Search all git-tracked files using git grep. Returns file paths with matching line numbers.",
-        { query: { type: "string" } },
-        ["query"]
-      )
-    end
-
-    def list_tool
-      build_tool(
-        "list",
-        "List all git-tracked files in the repository, optionally filtered by path.",
-        { path: { type: "string" } }
-      )
-    end
-
-    def patch_tool
-      build_tool(
-        "patch",
-        "Apply a unified diff patch via 'git apply'. Use for surgical edits to existing files.",
-        { diff: { type: "string" } },
-        ["diff"]
-      )
-    end
-
-    def read_tool
-      build_tool(
-        "read",
-        "Read complete contents of a file. Requires exact file path.",
-        { path: { type: "string" } },
-        ["path"]
-      )
-    end
-
-    def write_tool
-      build_tool(
-        "write",
-        "Write complete file contents (overwrites existing files). Creates parent directories automatically.",
-        { path: { type: "string" }, content: { type: "string" } },
-        ["path", "content"]
-      )
-    end
-
-    def build_tool(name, description, properties, required = [])
-      {
-        type: "function",
-        function: {
-          name: name,
-          description: description,
-          parameters: {
-            type: "object",
-            properties: properties,
-            required: required
-          }
-        }
-      }
-    end
   end
 end

lib/elelem/application.rb

@@ -20,8 +20,7 @@ module Elelem
         model: options[:model],
       )
       say "Agent (#{options[:model]})", :green
-      agent = Agent.new(client)
-
+      agent = Agent.new(client, Toolbox.new)
       agent.repl
     end

lib/elelem/conversation.rb

@@ -8,8 +8,10 @@ module Elelem
       @items = items
     end
 
-    def history
-      @items
+    def history_for(mode)
+      history = @items.dup
+      history[0] = { role: "system", content: system_prompt_for(mode) }
+      history
     end
 
     def add(role: :user, content: "")
@@ -28,18 +30,37 @@ module Elelem
       @items = default_context
     end
 
-    def set_system_prompt(prompt)
-      @items[0] = { role: :system, content: prompt }
+    def dump(mode)
+      JSON.pretty_generate(history_for(mode))
     end
 
-    def dump
-      JSON.pretty_generate(@items)
+    private
+
+    def default_context(prompt = system_prompt_for([]))
+      [{ role: "system", content: prompt }]
     end
 
-    private
+    def system_prompt_for(mode)
+      base = system_prompt
 
-    def default_context
-      [{ role: "system", content: system_prompt }]
+      case mode.sort
+      when [:read]
+        "#{base}\n\nRead and analyze. Understand before suggesting action."
+      when [:write]
+        "#{base}\n\nWrite clean, thoughtful code."
+      when [:execute]
+        "#{base}\n\nUse shell commands creatively to understand and manipulate the system."
+      when [:read, :write]
+        "#{base}\n\nFirst understand, then build solutions that integrate well."
+      when [:execute, :read]
+        "#{base}\n\nUse commands to deeply understand the system."
+      when [:execute, :write]
+        "#{base}\n\nCreate and execute freely. Have fun. Be kind."
+      when [:execute, :read, :write]
+        "#{base}\n\nYou have all tools. Use them wisely."
+      else
+        base
+      end
     end
 
     def system_prompt
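
Note on the `conversation.rb` change: `history_for` now swaps the system prompt on a copy instead of mutating `@items`, and the `case mode.sort` branches are listed in sorted symbol order (`:execute` sorts before `:read`). A minimal standalone sketch of both ideas follows. It assumes a hypothetical `SketchConversation` class and shortened placeholder prompts, neither of which is the gem's actual code or text.

```ruby
require "set"

# Hypothetical stand-in for Elelem::Conversation; prompts are placeholders.
class SketchConversation
  def initialize
    @items = [{ role: "system", content: "base" }]
  end

  def add(role:, content:)
    @items << { role: role, content: content }
  end

  # Returns a shallow copy whose first entry carries the mode-specific prompt.
  # @items itself is left untouched.
  def history_for(mode)
    history = @items.dup
    history[0] = { role: "system", content: prompt_for(mode) }
    history
  end

  private

  def prompt_for(mode)
    # Sorting makes the lookup independent of the order the modes were given in.
    case mode.sort
    when [:read] then "base + read"
    when [:execute, :read] then "base + verify"
    else "base"
    end
  end
end

conversation = SketchConversation.new
conversation.add(role: :user, content: "hello")

verify = conversation.history_for(Set[:read, :execute])
plain = conversation.history_for([])
```

Because `dup` is shallow, the message hashes after index 0 are shared with `@items`; that is harmless as long as callers treat the returned history as read-only.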

lib/elelem/tool.rb

@@ -0,0 +1,47 @@
+# frozen_string_literal: true
+
+module Elelem
+  class Tool
+    attr_reader :name
+
+    def initialize(schema, &block)
+      @name = schema.dig(:function, :name)
+      @schema = schema
+      @block = block
+    end
+
+    def call(args)
+      return ArgumentError.new(args) unless valid?(args)
+
+      @block.call(args)
+    end
+
+    def valid?(args)
+      # TODO:: Use JSON Schema Validator
+      true
+    end
+
+    def to_h
+      @schema&.to_h
+    end
+
+    class << self
+      def build(name, description, properties, required = [])
+        new({
+          type: "function",
+          function: {
+            name: name,
+            description: description,
+            parameters: {
+              type: "object",
+              properties: properties,
+              required: required
+            }
+          }
+        }) do |args|
+          yield args
+        end
+      end
+    end
+  end
+end
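
Note on `tool.rb`: `Tool#call` returns an `ArgumentError` instance rather than raising it, and `valid?` is still a stub, so the guard never fires. The sketch below shows one way the guard could behave. It assumes a simple required-key check in place of the JSON Schema validation that the TODO mentions; the `SketchTool` name and the `missing_keys` helper are illustrative and not part of the gem.

```ruby
# Hypothetical sketch, not the gem's Tool class.
class SketchTool
  attr_reader :name

  def initialize(schema, &block)
    @name = schema.dig(:function, :name)
    @schema = schema
    @block = block
  end

  # Raises when a required argument is absent, otherwise runs the tool body.
  def call(args)
    missing = missing_keys(args)
    raise ArgumentError, "missing required arguments: #{missing.join(", ")}" unless missing.empty?

    @block.call(args)
  end

  private

  # Assumption: only presence of the schema's `required` keys is checked.
  def missing_keys(args)
    required = @schema.dig(:function, :parameters, :required) || []
    required.reject { |key| args.key?(key) }
  end
end

read = SketchTool.new({ function: { name: "read", parameters: { required: ["path"] } } }) do |args|
  { content: "contents of #{args["path"]}" }
end

ok = read.call({ "path" => "README.md" })
failure = begin
  read.call({})
rescue ArgumentError => error
  error.message
end
```

Raising keeps the failure on the same path as any other tool error, so a caller with a `rescue` (as `Toolbox#run_tool` has) can turn it into an error hash.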

lib/elelem/toolbox.rb

@@ -0,0 +1,84 @@
+# frozen_string_literal: true
+
+module Elelem
+  class Toolbox
+    READ_TOOL = Tool.build("read", "Read complete contents of a file. Requires exact file path.", { path: { type: "string" } }, ["path"]) do |args|
+      path = args["path"]
+      full_path = Pathname.new(path).expand_path
+      full_path.exist? ? { content: full_path.read } : { error: "File not found: #{path}" }
+    end
+
+    EXEC_TOOL = Tool.build("execute", "Execute shell commands directly. Commands run in a shell context. Examples: 'date', 'git status'.", { cmd: { type: "string" }, args: { type: "array", items: { type: "string" } }, env: { type: "object", additionalProperties: { type: "string" } }, cwd: { type: "string", description: "Working directory (defaults to current)" }, stdin: { type: "string" } }, ["cmd"]) do |args|
+      Elelem.shell.execute(
+        args["cmd"],
+        args: args["args"] || [],
+        env: args["env"] || {},
+        cwd: args["cwd"].to_s.empty? ? Dir.pwd : args["cwd"],
+        stdin: args["stdin"]
+      )
+    end
+
+    GREP_TOOL = Tool.build("grep", "Search all git-tracked files using git grep. Returns file paths with matching line numbers.", { query: { type: "string" } }, ["query"]) do |args|
+      Elelem.shell.execute("git", args: ["grep", "-nI", args["query"]])
+    end
+
+    LIST_TOOL = Tool.build("list", "List all git-tracked files in the repository, optionally filtered by path.", { path: { type: "string" } }) do |args|
+      Elelem.shell.execute("git", args: args["path"] ? ["ls-files", "--", args["path"]] : ["ls-files"])
+    end
+
+    PATCH_TOOL = Tool.build( "patch", "Apply a unified diff patch via 'git apply'. Use for surgical edits to existing files.", { diff: { type: "string" } }, ["diff"]) do |args|
+      Elelem.shell.execute("git", args: ["apply", "--index", "--whitespace=nowarn", "-p1"], stdin: args["diff"])
+    end
+
+    WRITE_TOOL = Tool.build("write", "Write complete file contents (overwrites existing files). Creates parent directories automatically.", { path: { type: "string" }, content: { type: "string" } }, ["path", "content"]) do |args|
+      full_path = Pathname.new(args["path"]).expand_path
+      FileUtils.mkdir_p(full_path.dirname)
+      { bytes_written: full_path.write(args["content"]) }
+    end
+
+    attr_reader :tools
+
+    def initialize
+      @tools_by_name = {}
+      @tools = { read: [], write: [], execute: [] }
+      add_tool(eval_tool(binding), :execute)
+      add_tool(EXEC_TOOL, :execute)
+      add_tool(GREP_TOOL, :read)
+      add_tool(LIST_TOOL, :read)
+      add_tool(PATCH_TOOL, :write)
+      add_tool(READ_TOOL, :read)
+      add_tool(WRITE_TOOL, :write)
+    end
+
+    def add_tool(tool, mode)
+      @tools[mode] << tool
+      @tools_by_name[tool.name] = tool
+    end
+
+    def register_tool(name, description, properties = {}, required = [], mode: :execute, &block)
+      add_tool(Tool.build(name, description, properties, required, &block), mode)
+    end
+
+    def tools_for(modes)
+      Array(modes).map { |mode| tools[mode].map(&:to_h) }.flatten
+    end
+
+    def run_tool(name, args)
+      @tools_by_name[name]&.call(args) || { error: "Unknown tool", name: name, args: args }
+    rescue => error
+      { error: error.message, name: name, args: args, backtrace: error.backtrace.first(5) }
+    end
+
+    def tool_schema(name)
+      @tools_by_name[name]&.to_h
+    end
+
+    private
+
+    def eval_tool(target_binding)
+      Tool.build("eval", "Evaluates Ruby code with full access to register new tools via the `register_tool(name, desc, properties, required, mode: :execute) { |args| ... }` method.", { ruby: { type: "string" } }, ["ruby"]) do |args|
+        { result: target_binding.eval(args["ruby"]) }
+      end
+    end
+  end
+end
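
Note on `toolbox.rb`: the eval tool closes over the `binding` captured in `initialize`, so evaluated code runs with the toolbox as `self` and can call `register_tool` directly. Two details are worth keeping in mind when reading `run_tool`: the `|| { error: "Unknown tool", ... }` fallback also triggers when a registered tool returns `nil` or `false`, and a bare `rescue` only catches `StandardError`. The sketch below illustrates the registry with an explicit lookup instead. `SketchToolbox` and `names_for` are hypothetical names, and tools are stored as plain blocks rather than `Tool` instances to keep the example short.

```ruby
# Hypothetical sketch of a mode-keyed tool registry with an eval tool.
class SketchToolbox
  def initialize
    @tools_by_name = {}
    @tools = { read: [], write: [], execute: [] }
    scope = binding # captures self, so evaluated code can call register_tool
    register_tool("eval", mode: :execute) { |args| { result: scope.eval(args["ruby"]) } }
  end

  def register_tool(name, mode: :execute, &block)
    @tools.fetch(mode) << name
    @tools_by_name[name] = block
  end

  # Accepts a single mode or any collection of modes.
  def names_for(modes)
    Array(modes).flat_map { |mode| @tools.fetch(mode) }
  end

  def run_tool(name, args)
    tool = @tools_by_name[name]
    # Explicit lookup: a tool that returns nil is not mistaken for a missing one.
    return { error: "Unknown tool", name: name } unless tool

    tool.call(args)
  rescue StandardError => error
    { error: error.message, name: name }
  end
end

box = SketchToolbox.new
box.run_tool("eval", {
  "ruby" => 'register_tool("hello") { |args| { greeting: "Hello, " + args["name"] + "!" } }'
})
box.register_tool("noop") { |_args| nil }

greeting = box.run_tool("hello", { "name" => "Ada" })
noop = box.run_tool("noop", {})
missing = box.run_tool("nope", {})
```

As the README in this comparison states, nothing here is sandboxed: whatever the model sends to the eval tool runs with the full privileges of the process.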

lib/elelem/version.rb

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module Elelem
-  VERSION = "0.3.0"
+  VERSION = "0.4.0"
 end

lib/elelem.rb

@@ -16,6 +16,8 @@ require "timeout"
 require_relative "elelem/agent"
 require_relative "elelem/application"
 require_relative "elelem/conversation"
+require_relative "elelem/tool"
+require_relative "elelem/toolbox"
 require_relative "elelem/version"
 
 Reline.input = $stdin
@@ -23,4 +25,29 @@ Reline.output = $stdout
 
 module Elelem
   class Error < StandardError; end
+
+  class Shell
+    def execute(command, args: [], env: {}, cwd: Dir.pwd, stdin: nil)
+      cmd = command.is_a?(Array) ? command.first : command
+      cmd_args = command.is_a?(Array) ? command[1..] + args : args
+      stdout, stderr, status = Open3.capture3(
+        env,
+        cmd,
+        *cmd_args,
+        chdir: cwd,
+        stdin_data: stdin
+      )
+      {
+        "exit_status" => status.exitstatus,
+        "stdout" => stdout.to_s,
+        "stderr" => stderr.to_s
+      }
+    end
+  end
+
+  class << self
+    def shell
+      @shell ||= Shell.new
+    end
+  end
 end
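
Note on `Shell#execute`: it wraps `Open3.capture3`. When `args` is non-empty, the command is spawned directly with those arguments and no shell is involved; a single command string with no `args` follows Ruby's usual process-spawning rules, which run it directly or through the shell depending on its contents. A standalone sketch of the same call shape follows, using a hypothetical `run_command` helper and the current Ruby interpreter as the child process so the example does not depend on Git being installed.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper mirroring the argument handling shown in the diff.
def run_command(command, args: [], env: {}, cwd: Dir.pwd, stdin: nil)
  # Accept either "cmd" or ["cmd", "arg1", ...] and append any extra args.
  cmd, *rest = Array(command)
  stdout, stderr, status = Open3.capture3(env, cmd, *rest, *args, chdir: cwd, stdin_data: stdin)
  { "exit_status" => status.exitstatus, "stdout" => stdout, "stderr" => stderr }
end

# Pipe "hi" into a child Ruby process that upcases its standard input.
result = run_command(RbConfig.ruby, args: ["-e", "print $stdin.read.upcase"], stdin: "hi")
```

Returning string keys keeps the hash shape stable once it has been through `JSON.dump`, which is how the agent forwards tool results to the model.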

spec/elelem/agent_spec.rb

@@ -0,0 +1,36 @@
+# frozen_string_literal: true
+
+RSpec.describe Elelem::Agent do
+  let(:mock_client) { double("client") }
+  let(:agent) { described_class.new(mock_client, Elelem::Toolbox.new) }
+
+  describe "#initialize" do
+    it "creates a new conversation" do
+      expect(agent.conversation).to be_a(Elelem::Conversation)
+    end
+
+    it "stores the client" do
+      expect(agent.client).to eq(mock_client)
+    end
+
+    it "initializes tools for all modes" do
+      expect(agent.toolbox.tools[:read]).to be_an(Array)
+      expect(agent.toolbox.tools[:write]).to be_an(Array)
+      expect(agent.toolbox.tools[:execute]).to be_an(Array)
+    end
+  end
+
+  describe "integration with conversation" do
+    it "conversation uses mode-aware prompts" do
+      conversation = agent.conversation
+      conversation.add(role: :user, content: "test message")
+
+      read_history = conversation.history_for([:read])
+      write_history = conversation.history_for([:write])
+
+      expect(read_history[0][:content]).to include("Read and analyze")
+      expect(write_history[0][:content]).to include("Write clean, thoughtful code")
+      expect(read_history[0][:content]).not_to eq(write_history[0][:content])
+    end
+  end
+end

spec/elelem/conversation_spec.rb

@@ -0,0 +1,188 @@
+# frozen_string_literal: true
+
+RSpec.describe Elelem::Conversation do
+  let(:conversation) { described_class.new }
+
+  describe "#history_for" do
+    context "with empty conversation" do
+      it "returns history with mode-specific system prompt for read mode" do
+        history = conversation.history_for([:read])
+
+        expect(history.length).to eq(1)
+        expect(history[0][:role]).to eq("system")
+        expect(history[0][:content]).to include("Read and analyze")
+      end
+
+      it "returns history with mode-specific system prompt for write mode" do
+        history = conversation.history_for([:write])
+
+        expect(history[0][:content]).to include("Write clean, thoughtful code")
+      end
+
+      it "returns history with mode-specific system prompt for execute mode" do
+        history = conversation.history_for([:execute])
+
+        expect(history[0][:content]).to include("Use shell commands creatively")
+      end
+
+      it "returns history with mode-specific system prompt for read+write mode" do
+        history = conversation.history_for([:read, :write])
+
+        expect(history[0][:content]).to include("First understand, then build solutions")
+      end
+
+      it "returns history with mode-specific system prompt for read+execute mode" do
+        history = conversation.history_for([:read, :execute])
+
+        expect(history[0][:content]).to include("Use commands to deeply understand")
+      end
+
+      it "returns history with mode-specific system prompt for write+execute mode" do
+        history = conversation.history_for([:write, :execute])
+
+        expect(history[0][:content]).to include("Create and execute freely")
+      end
+
+      it "returns history with mode-specific system prompt for all tools mode" do
+        history = conversation.history_for([:read, :write, :execute])
+
+        expect(history[0][:content]).to include("You have all tools")
+      end
+
+      it "returns base system prompt for unknown mode" do
+        history = conversation.history_for([:unknown])
+
+        expect(history[0][:content]).not_to include("Read and analyze")
+        expect(history[0][:content]).not_to include("Write clean")
+      end
+
+      it "returns base system prompt for empty mode" do
+        history = conversation.history_for([])
+
+        expect(history[0][:role]).to eq("system")
+        expect(history[0][:content]).to be_a(String)
+      end
+    end
+
+    context "with mode order independence" do
+      it "returns same prompt for [:read, :write] and [:write, :read]" do
+        history1 = conversation.history_for([:read, :write])
+        history2 = conversation.history_for([:write, :read])
+
+        expect(history1[0][:content]).to eq(history2[0][:content])
+      end
+
+      it "returns same prompt for [:read, :execute] and [:execute, :read]" do
+        history1 = conversation.history_for([:read, :execute])
+        history2 = conversation.history_for([:execute, :read])
+
+        expect(history1[0][:content]).to eq(history2[0][:content])
+      end
+
+      it "returns same prompt for all permutations of [:read, :write, :execute]" do
+        history1 = conversation.history_for([:read, :write, :execute])
+        history2 = conversation.history_for([:execute, :read, :write])
+        history3 = conversation.history_for([:write, :execute, :read])
+
+        expect(history1[0][:content]).to eq(history2[0][:content])
+        expect(history2[0][:content]).to eq(history3[0][:content])
+      end
+    end
+
+    context "with populated conversation" do
+      before do
+        conversation.add(role: :user, content: "Hello")
+        conversation.add(role: :assistant, content: "Hi there")
+      end
+
+      it "preserves all conversation items" do
+        history = conversation.history_for([:read])
+
+        expect(history.length).to eq(3)
+        expect(history[1][:role]).to eq(:user)
+        expect(history[1][:content]).to eq("Hello")
+        expect(history[2][:role]).to eq(:assistant)
+        expect(history[2][:content]).to eq("Hi there")
+      end
+
+      it "updates system prompt without mutating original" do
+        original_items = conversation.instance_variable_get(:@items)
+        original_system_content = original_items[0][:content]
+
+        history = conversation.history_for([:read])
+
+        expect(history[0][:content]).not_to eq(original_system_content)
+        expect(original_items[0][:content]).to eq(original_system_content)
+      end
+
+      it "returns a copy, not the original array" do
+        history = conversation.history_for([:read])
+        original_items = conversation.instance_variable_get(:@items)
+
+        expect(history).not_to be(original_items)
+      end
+    end
+  end
+
+  describe "#add" do
+    it "adds user message to conversation" do
+      conversation.add(role: :user, content: "test message")
+      history = conversation.history_for([])
+
+      expect(history.length).to eq(2)
+      expect(history[1][:content]).to eq("test message")
+    end
+
+    it "merges consecutive messages with same role" do
+      conversation.add(role: :user, content: "part 1")
+      conversation.add(role: :user, content: "part 2")
+      history = conversation.history_for([])
+
+      expect(history.length).to eq(2)
+      expect(history[1][:content]).to eq("part 1part 2")
+    end
+
+    it "ignores nil content" do
+      conversation.add(role: :user, content: nil)
+      history = conversation.history_for([])
+
+      expect(history.length).to eq(1)
+    end
+
+    it "ignores empty content" do
+      conversation.add(role: :user, content: "")
+      history = conversation.history_for([])
+
+      expect(history.length).to eq(1)
+    end
+
+    it "raises error for unknown role" do
+      expect {
+        conversation.add(role: :unknown, content: "test")
+      }.to raise_error(/unknown role/)
+    end
+  end
+
+  describe "#clear" do
+    it "resets conversation to default context" do
+      conversation.add(role: :user, content: "test")
+      conversation.clear
+      history = conversation.history_for([])
+
+      expect(history.length).to eq(1)
+      expect(history[0][:role]).to eq("system")
+    end
+  end
+
+  describe "#dump" do
+    it "returns JSON representation with mode-specific prompt" do
+      conversation.add(role: :user, content: "test")
+      json = conversation.dump([:read])
+
+      parsed = JSON.parse(json)
+      expect(parsed).to be_an(Array)
+      expect(parsed.length).to eq(2)
+      expect(parsed[0]["content"]).to include("Read and analyze")
+    end
+  end
+end

spec/elelem/toolbox_spec.rb

@@ -0,0 +1,106 @@
+# frozen_string_literal: true
+#
+RSpec.describe Elelem::Toolbox do
+  subject { described_class.new }
+
+  describe "#tools_for" do
+    it "returns read tools for read mode" do
+      mode = Set[:read]
+      tools = subject.tools_for(mode)
+
+      tool_names = tools.map { |t| t.dig(:function, :name) }
+      expect(tool_names).to include("grep", "list", "read")
+      expect(tool_names).not_to include("write", "patch", "execute")
+    end
+
+    it "returns write tools for write mode" do
+      mode = Set[:write]
+      tools = subject.tools_for(mode)
+
+      tool_names = tools.map { |t| t.dig(:function, :name) }
+      expect(tool_names).to include("patch", "write")
+      expect(tool_names).not_to include("grep", "execute")
+    end
+
+    it "returns execute tools for execute mode" do
+      mode = Set[:execute]
+      tools = subject.tools_for(mode)
+
+      tool_names = tools.map { |t| t.dig(:function, :name) }
+      expect(tool_names).to include("execute")
+      expect(tool_names).not_to include("grep", "write")
+    end
+
+    it "returns all tools for auto mode" do
+      mode = Set[:read, :write, :execute]
+      tools = subject.tools_for(mode)
+
+      tool_names = tools.map { |t| t.dig(:function, :name) }
+      expect(tool_names).to include("grep", "list", "read", "patch", "write", "execute")
+    end
+
+    it "returns combined tools for build mode" do
+      mode = Set[:read, :write]
+      tools = subject.tools_for(mode)
+
+      tool_names = tools.map { |t| t.dig(:function, :name) }
+      expect(tool_names).to include("grep", "read", "write", "patch")
+      expect(tool_names).not_to include("execute")
+    end
+  end
+
+  describe "meta-programming with eval tool" do
+    it "allows LLM to register new tools dynamically" do
+      subject.run_tool("eval", {
+        "ruby" => <<~RUBY
+          register_tool("hello", "Says hello to a name", { name: { type: "string" } }, ["name"]) do |args|
+            { greeting: "Hello, " + args['name']+ "!" }
+          end
+        RUBY
+      })
+
+      expect(subject.tools_for(:execute)).to include(hash_including({
+        type: "function",
+        function: {
+          name: "hello",
+          description: "Says hello to a name",
+          parameters: {
+            type: "object",
+            properties: { name: { type: "string" } },
+            required: ["name"]
+          }
+        }
+      }))
+    end
+
+    it "allows LLM to call dynamically created tools" do
+      subject.run_tool("eval", {
+        "ruby" => <<~RUBY
+          register_tool("add", "Adds two numbers", { a: { type: "number" }, b: { type: "number" } }, ["a", "b"]) do |args|
+            { sum: args["a"] + args["b"] }
+          end
+        RUBY
+      })
+
+      result = subject.run_tool("add", { "a" => 5, "b" => 3 })
+      expect(result[:sum]).to eq(8)
+    end
+
+    it "allows LLM to inspect tool schemas" do
+      result = subject.run_tool("eval", { "ruby" => "tool_schema('read')" })
+      expect(result[:result]).to be_a(Hash)
+      expect(result[:result].dig(:function, :name)).to eq("read")
+    end
+
+    it "executes arbitrary Ruby code" do
+      result = subject.run_tool("eval", { "ruby" => "2 + 2" })
+      expect(result[:result]).to eq(4)
+    end
+
+    it "handles errors gracefully" do
+      result = subject.run_tool("eval", { "ruby" => "undefined_variable" })
+      expect(result[:error]).to include("undefined")
+      expect(result[:backtrace]).to be_an(Array)
+    end
+  end
+end

spec/spec_helper.rb

@@ -1,6 +1,6 @@
 # frozen_string_literal: true
 
-require "elelem"
+require_relative "../lib/elelem"
 
 RSpec.configure do |config|
   # Enable flags like --only-failures and --next-failure

CHANGELOG.md

@@ -1,5 +1,27 @@
 ## [Unreleased]
 
+## [0.4.0] - 2025-11-10
+
+### Added
+- **Eval Tool**: Meta-programming tool that allows the LLM to dynamically create and register new tools at runtime
+  - Eval tool has access to the toolbox for enhanced capabilities
+- Comprehensive test coverage with RSpec
+  - Agent specs
+  - Conversation specs
+  - Toolbox specs
+
+### Changed
+- **Architecture Improvements**: Significant refactoring for better separation of concerns
+  - Extracted Tool class to separate file (`lib/elelem/tool.rb`)
+  - Extracted Toolbox class to separate file (`lib/elelem/toolbox.rb`)
+  - Extracted Shell class for command execution
+  - Improved tool registration through `#add_tool` method
+  - Tool constants moved to Toolbox for better organization
+  - Agent class simplified by delegating to Tool instances
+
+### Fixed
+- `/context` command now correctly accounts for the current mode
+
 ## [0.3.0] - 2025-11-05
 
 ### Added

elelem.gemspec

@@ -10,7 +10,7 @@ Gem::Specification.new do |spec|
 
   spec.summary = "A REPL for Ollama."
   spec.description = "A REPL for Ollama."
-  spec.homepage = "https://www.mokhan.ca"
+  spec.homepage = "https://gitlab.com/mokhax/elelem"
   spec.license = "MIT"
   spec.required_ruby_version = ">= 3.4.0"
   spec.required_rubygems_version = ">= 3.3.11"
@@ -38,6 +38,8 @@ Gem::Specification.new do |spec|
     "lib/elelem/application.rb",
     "lib/elelem/conversation.rb",
     "lib/elelem/system_prompt.erb",
+    "lib/elelem/tool.rb",
+    "lib/elelem/toolbox.rb",
     "lib/elelem/version.rb",
   ]
   spec.bindir = "exe"
@@ -45,12 +47,15 @@ Gem::Specification.new do |spec|
   spec.require_paths = ["lib"]
 
   spec.add_dependency "erb"
+  spec.add_dependency "fileutils"
   spec.add_dependency "json"
   spec.add_dependency "json-schema"
   spec.add_dependency "logger"
   spec.add_dependency "net-llm"
   spec.add_dependency "open3"
+  spec.add_dependency "pathname"
   spec.add_dependency "reline"
+  spec.add_dependency "set"
   spec.add_dependency "thor"
   spec.add_dependency "timeout"
 end

Gemfile.lock

@@ -1,14 +1,17 @@
 PATH
   remote: .
   specs:
-    elelem (0.3.0)
+    elelem (0.4.0)
       erb
+      fileutils
       json
       json-schema
       logger
       net-llm
       open3
+      pathname
       reline
+      set
       thor
       timeout
 
@@ -22,6 +25,7 @@ GEM
     date (3.4.1)
     diff-lcs (1.6.2)
     erb (5.0.2)
+    fileutils (1.8.0)
     io-console (0.8.1)
     irb (1.15.2)
       pp (>= 0.6.0)
@@ -46,6 +50,7 @@ GEM
       uri (~> 1.0)
     open3 (0.2.1)
     openssl (3.3.1)
+    pathname (0.4.0)
     pp (0.6.2)
       prettyprint
     prettyprint (0.2.0)
@@ -72,6 +77,7 @@ GEM
       diff-lcs (>= 1.2.0, < 2.0)
       rspec-support (~> 3.13.0)
     rspec-support (3.13.4)
+    set (1.1.2)
     stringio (3.1.7)
     thor (1.3.2)
     timeout (0.4.3)

README.md

@@ -1,74 +1,61 @@
 # Elelem
 
-Fast, correct, autonomous - Pick two
+Fast, correct, autonomous – pick two.
 
-PURPOSE:
+## Purpose
 
-Elelem is a minimal coding agent written in Ruby. It is intended to
-assist me (a software engineer and computer science student) with writing,
-editing, and managing code and text files from the command line. It acts
-as a direct interface to an LLM, providing it with a simple text-based
-UI and access to the local filesystem.
+Elelem is a minimal coding agent written in Ruby. It is designed to help
+you write, edit, and manage code and plain-text files from the command line
+by delegating work to an LLM. The agent exposes a simple text-based UI and a
+set of built-in tools that give the LLM access to the local file system
+and Git.
 
-DESIGN PRINCIPLES:
+## Design Principles
 
-- Follows the Unix philosophy: simple, composable, minimal.
-- Convention over configuration.
-- Avoids unnecessary defensive checks, or complexity.
-- Assumes a mature and responsible LLM that behaves like a capable engineer.
-- Designed for my workflow and preferences.
-- Efficient and minimal like aider - https://aider.chat/
-- UX like Claude Code - https://docs.claude.com/en/docs/claude-code/overview
+* Unix philosophy – simple, composable, minimal.
+* Convention over configuration.
+* No defensive checks or complexity beyond what is necessary.
+* Assumes a mature, responsible LLM that behaves like a capable engineer.
+* Optimised for my personal workflow and preferences.
+* Efficient and minimal like *aider* – https://aider.chat/.
+* UX similar to Claude Code – https://docs.claude.com/en/docs/claude-code/overview.
 
-SYSTEM ASSUMPTIONS:
+## System Assumptions
 
-- This script is used on a Linux system with the following tools: Alacritty, tmux, Bash, and Vim.
-- It is always run inside a Git repository.
-- All project work is assumed to be version-controlled with Git.
-- Git is expected to be available and working; no checks are necessary.
+* Linux host with Alacritty, tmux, Bash, Vim.
+* Runs inside a Git repository.
+* Git is available and functional.
 
-SCOPE:
+## Scope
 
-- This program operates only on code and plain-text files.
-- It does not need to support binary files.
-- The LLM has full access to execute system commands.
-- There are no sandboxing, permission, or validation layers.
-- Execution is not restricted or monitored - responsibility is delegated to the LLM.
+Only plain-text and source-code files are supported. No binary handling,
+sandboxing, or permission checks are performed - the LLM has full access.
 
-CONFIGURATION:
+## Configuration
 
-- Avoid adding configuration options unless absolutely necessary.
-- Prefer hard-coded values that can be changed later if needed.
-- Only introduce environment variables after repeated usage proves them worthwhile.
+Prefer convention over configuration. Add environment variables only after
+repeated use proves their usefulness.
 
-UI EXPECTATIONS:
+## UI Expectations
 
-- The TUI must remain simple, fast, and predictable.
-- No mouse support or complex UI components are required.
-- Interaction is strictly keyboard-driven.
+Keyboard-driven, minimal TUI. No mouse support or complex widgets.
 
-CODING STANDARDS FOR LLM:
+## Coding Standards for the LLM
 
-- Do not add error handling or logging unless it is essential for functionality.
-- Keep methods short and single-purpose.
-- Use descriptive, conventional names.
-- Stick to Ruby's standard library whenever possible.
+* No extra error handling unless essential.
+* Keep methods short, single-purpose.
+* Descriptive, conventional names.
+* Use Ruby standard library where possible.
 
-HELPFUL LINKS:
+## Helpful Links
 
-- https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
-- https://www.anthropic.com/engineering/writing-tools-for-agents
-- https://simonwillison.net/2025/Sep/30/designing-agentic-loops/
+* https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
+* https://www.anthropic.com/engineering/writing-tools-for-agents
+* https://simonwillison.net/2025/Sep/30/designing-agentic-loops/
 
 ## Installation
 
-Install the gem and add to the application's Gemfile by executing:
-
-```bash
-bundle add elelem
-```
-
-If bundler is not being used to manage dependencies, install the gem by executing:
+Install the gem directly:
 
 ```bash
 gem install elelem
@@ -84,47 +71,86 @@ elelem chat
 
 ### Options
 
-- `--host`: Specify Ollama host (default: localhost:11434)
-- `--model`: Specify Ollama model (default: gpt-oss, currently only tested with gpt-oss)  
-- `--token`: Provide authentication token
+* `--host` – Ollama host (default: `localhost:11434`).
+* `--model` – Ollama model (default: `gpt-oss`).
+* `--token` – Authentication token.
 
 ### Examples
 
 ```bash
-# Chat with default model
+# Default model
 elelem chat
 
-# Chat with specific model and host
+# Specific model and host
 elelem chat --model llama2 --host remote-host:11434
 ```
 
-### Features
+## Mode System
 
-- **Interactive REPL**: Clean command-line interface for chatting
-- **Mode System**: Control agent capabilities with workflow modes (plan, build, verify, auto)
-- **Tool Execution**: Execute shell commands, read/write files, search code
-- **Streaming Responses**: Real-time streaming of AI responses
-- **Conversation History**: Maintains context across the session
+The agent exposes seven built‑in tools. You can switch which ones are
+available by changing the *mode*:
 
-### Mode System
+| Mode    | Enabled Tools                            |
+|---------|------------------------------------------|
+| plan    | `grep`, `list`, `read`                   |
+| build   | `grep`, `list`, `read`, `patch`, `write` |
+| verify  | `grep`, `list`, `read`, `execute`        |
+| auto    | All tools                                |
 
-Control what tools the agent can access:
+Use the following commands inside the REPL:
 
-```bash
-/mode plan    # Read-only (grep, list, read)
-/mode build   # Read + Write (grep, list, read, patch, write)
-/mode verify  # Read + Execute (grep, list, read, execute)
-/mode auto    # All tools enabled
+```text
+/mode plan    # Read‑only
+/mode build   # Read + Write
+/mode verify  # Read + Execute
+/mode auto    # All tools
+/mode         # Show current mode
 ```
 
-Each mode adapts the system prompt to guide appropriate behavior.
+The system prompt is adjusted per mode so the LLM knows which actions
+are permissible.
+
+## Features
+
+* **Interactive REPL** – clean, streaming chat.
+* **Toolbox** – file I/O, Git, shell execution.
+* **Streaming Responses** – output appears in real time.
+* **Conversation History** – persists across turns; can be cleared.
+* **Context Dump** – `/context` shows the current conversation state.
+
+## Toolbox Overview
+
+The `Toolbox` class is defined in `lib/elelem/toolbox.rb`. It supplies
+seven tools that the LLM can call, each described by a JSON schema.
+
+| Tool      | Purpose                              | Parameters                           |
+| ----      | -------                              | ----------                           |
+| `eval`    | Dynamically create new tools         | `code`                               |
+| `grep`    | Search Git‑tracked files             | `query`                              |
+| `list`    | List tracked files                   | `path` (optional)                    |
+| `read`    | Read file contents                   | `path`                               |
+| `write`   | Overwrite a file                     | `path`, `content`                    |
+| `patch`   | Apply a unified diff via `git apply` | `diff`                               |
+| `execute` | Run shell commands                   | `cmd`, `args`, `env`, `cwd`, `stdin` |
+
+## Tool Definition
+
+The core `Tool` wrapper is defined in `lib/elelem/tool.rb`. Each tool is
+created with a name, description, JSON schema for arguments, and a block
+that performs the operation. The LLM calls a tool by name and passes the
+arguments as a hash.
+
+## Known Limitations
 
-## Development
+* Assumes the current directory is a Git repository.
+* No sandboxing – the LLM can run arbitrary commands.
+* Error handling is minimal; exceptions are returned as an `error` field.
 
-After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
+## Contributing
 
-To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+Feel free to open issues or pull requests. The repository follows
+GitHub Flow.
 
 ## License
 
-The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+MIT – see the bundled `LICENSE.txt`.
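
The Tool Definition and Mode System sections of the updated README describe a tool as a name, a description, a JSON schema, and a block, with each mode enabling a subset of tools. The following is a minimal sketch of that shape, not the code in `lib/elelem/tool.rb` or `lib/elelem/toolbox.rb`. The class names `SketchTool` and `SketchToolbox`, the methods `tools_for` and `run`, and the `MODES` constant are hypothetical names chosen for illustration. Only `add_tool` and the mode-to-tool mapping come from the commit list and the README table.

```ruby
# Hypothetical stand-in for the Tool wrapper: name, description, schema, block.
class SketchTool
  attr_reader :name, :description, :schema

  def initialize(name, description, schema, &block)
    @name = name
    @description = description
    @schema = schema
    @block = block
  end

  # Run the block with the argument hash. Exceptions are returned as an
  # `error` field, mirroring the behaviour listed under Known Limitations.
  def call(args)
    @block.call(args)
  rescue StandardError => e
    { error: e.message }
  end
end

# Hypothetical registry that filters tools by mode (mapping taken from the README table).
class SketchToolbox
  MODES = {
    plan: %w[grep list read],
    build: %w[grep list read patch write],
    verify: %w[grep list read execute]
  }.freeze

  def initialize
    @tools = {}
  end

  def add_tool(tool)
    @tools[tool.name] = tool
  end

  # :auto exposes every registered tool; other modes expose their subset.
  def tools_for(mode)
    return @tools.values if mode == :auto

    @tools.values_at(*MODES.fetch(mode)).compact
  end

  # Look a tool up by name and pass it the argument hash.
  def run(name, args)
    tool = @tools[name]
    return { error: "unknown tool: #{name}" } unless tool

    tool.call(args)
  end
end

schema = { type: "object", properties: { path: { type: "string" } }, required: ["path"] }
read = SketchTool.new("read", "Read file contents", schema) do |args|
  { content: File.read(args.fetch("path")) }
end

toolbox = SketchToolbox.new
toolbox.add_tool(read)
toolbox.tools_for(:plan).map(&:name) # => ["read"]
```

Returning an error hash rather than raising keeps the agent loop running, so the LLM can read the failure and retry. Whether the real `Toolbox` does this, or filters modes the same way, is an assumption here.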