<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Martin's Dev Diary]]></title><description><![CDATA[Dive into AI's frontiers with practical applications, from cutting-edge models to LLMs in web apps. Join me in exploring AI development, sharing code, and discu]]></description><link>https://martinbowling.com</link><generator>RSS for Node</generator><lastBuildDate>Wed, 15 Apr 2026 20:22:59 GMT</lastBuildDate><atom:link href="https://martinbowling.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[ThinkChain: When Claude's Thinking Meets Tool Feedback Loops]]></title><description><![CDATA[The Inspiration!
I saw Pietro Schirano's tweet about his "chain of tools" and immediately thought - I must build this! You see, I've been obsessed with Claude's tool use capabilities lately, especially after Anthropic released their interleaved think...]]></description><link>https://martinbowling.com/thinkchain-when-claudes-thinking-meets-tool-feedback-loops</link><guid isPermaLink="true">https://martinbowling.com/thinkchain-when-claudes-thinking-meets-tool-feedback-loops</guid><category><![CDATA[Python]]></category><category><![CDATA[AI]]></category><category><![CDATA[claude]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Sat, 14 Jun 2025 20:21:05 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1749932448899/a719ff62-a2b8-4022-a52e-d57d0c99baf2.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-the-inspiration">The Inspiration!</h1>
<p>I saw <a target="_blank" href="https://x.com/skirano/status/1933208161244086430">Pietro Schirano's tweet</a> about his "chain of tools" and immediately thought - I must build this! You see, I've been obsessed with Claude's tool use capabilities lately, especially after Anthropic released their interleaved thinking features. Most Claude integrations I'd seen treated tools as black boxes - call a tool, get a result, move on. But what if tool results could feed back into Claude's thinking process in real-time?</p>
<p>That simple question led me down a rabbit hole that resulted in <strong>ThinkChain</strong> - a system where thinking, tool execution, and reasoning form a continuous feedback loop. Instead of the traditional linear flow of "call tool → get result → respond," ThinkChain creates something much more powerful: "think → call tool → think about results → respond intelligently."</p>
<p>What I discovered surprised me. When you inject tool results back into Claude's thinking stream, it doesn't just use tools - it becomes dramatically smarter about <em>how</em> it uses them. Here's what I built, what I learned, and why this changes everything about AI tool integration.</p>
<h2 id="heading-the-core-innovation-tool-result-injection">The Core Innovation: Tool Result Injection</h2>
<p>Let me show you the difference with a real example. Ask a traditional Claude integration "What's the weather in San Francisco and where should I eat dinner there?" and you get this flow:</p>
<p><strong>Traditional approach:</strong></p>
<pre><code class="lang-plaintext">User Question → Claude thinks → Calls weather tool → Gets result
               → Calls restaurant tool → Gets result → Combines results
</code></pre>
<p><strong>ThinkChain approach:</strong></p>
<pre><code class="lang-plaintext">User Question → Claude thinks → Calls weather tool → Thinks about weather
               → Calls restaurant tool with weather context → Thinks about both
               → Synthesizes intelligent response
</code></pre>
<p>The magic happens in those thinking steps after tool execution. Here's the actual technical implementation that makes this possible:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">stream_once</span>(<span class="hljs-params">messages, tools</span>):</span>
    <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> client.messages.stream(
        model=<span class="hljs-string">"claude-sonnet-4-20250514"</span>,
        messages=messages,
        tools=tools,
        extra_headers={<span class="hljs-string">"anthropic-beta"</span>: <span class="hljs-string">"interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14"</span>},
        thinking={<span class="hljs-string">"type"</span>: <span class="hljs-string">"enabled"</span>, <span class="hljs-string">"budget_tokens"</span>: <span class="hljs-number">1024</span>}
    ) <span class="hljs-keyword">as</span> stream:

        <span class="hljs-keyword">async</span> <span class="hljs-keyword">for</span> event <span class="hljs-keyword">in</span> stream:
            <span class="hljs-keyword">if</span> event.type == <span class="hljs-string">"tool_use"</span>:
                <span class="hljs-comment"># Execute the tool</span>
                result = <span class="hljs-keyword">await</span> execute_tool(event.name, event.input)

                <span class="hljs-comment"># This is the key: inject result back into thinking stream</span>
                transcript.append({
                    <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
                    <span class="hljs-string">"content"</span>: [{<span class="hljs-string">"type"</span>: <span class="hljs-string">"tool_result"</span>, <span class="hljs-string">"content"</span>: result}]
                })

                <span class="hljs-comment"># Continue streaming - Claude thinks about the results</span>
                <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> stream_once(transcript)
</code></pre>
<p>This creates a feedback loop where Claude's initial thinking leads to tool use, tool results inform continued thinking, and the final response incorporates both reasoning and tool outcomes. It's not just smarter - it's <em>thinking</em> smarter.</p>
<h2 id="heading-architecture-deep-dive-how-it-all-works">Architecture Deep Dive: How It All Works</h2>
<p>Building ThinkChain taught me that the real power isn't in having lots of tools - it's in how tools discover each other, execute cleanly, and feed results back intelligently. Here's how I architected it:</p>
<h3 id="heading-the-tool-discovery-system">The Tool Discovery System</h3>
<p>I wanted developers to just drop a Python file in a folder and have it work. No registration, no complex setup. Here's the discovery pipeline:</p>
<pre><code class="lang-plaintext">Local Tools (/tools/*.py) → Validation → Registry
                                         ↓
MCP Servers (config.json) → Connection → Registry → Unified Tool List → Claude API
</code></pre>
<p>Every tool implements this simple interface:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> tools.base <span class="hljs-keyword">import</span> BaseTool

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">WeatherTool</span>(<span class="hljs-params">BaseTool</span>):</span>
    name = <span class="hljs-string">"weathertool"</span>
    description = <span class="hljs-string">"""
    Gets current weather information for any location worldwide.

    Use this tool when users ask about:
    - Current weather in any city/location
    - Temperature anywhere
    - "What's the weather like in [location]?"
    """</span>

    input_schema = {
        <span class="hljs-string">"type"</span>: <span class="hljs-string">"object"</span>,
        <span class="hljs-string">"properties"</span>: {
            <span class="hljs-string">"location"</span>: {<span class="hljs-string">"type"</span>: <span class="hljs-string">"string"</span>, <span class="hljs-string">"description"</span>: <span class="hljs-string">"City and state/country"</span>},
            <span class="hljs-string">"units"</span>: {<span class="hljs-string">"type"</span>: <span class="hljs-string">"string"</span>, <span class="hljs-string">"enum"</span>: [<span class="hljs-string">"fahrenheit"</span>, <span class="hljs-string">"celsius"</span>], <span class="hljs-string">"default"</span>: <span class="hljs-string">"fahrenheit"</span>}
        },
        <span class="hljs-string">"required"</span>: [<span class="hljs-string">"location"</span>]
    }

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">execute</span>(<span class="hljs-params">self, **kwargs</span>) -&gt; str:</span>
        location = kwargs.get(<span class="hljs-string">"location"</span>)
        <span class="hljs-comment"># Hit wttr.in API for real weather data</span>
        response = requests.get(<span class="hljs-string">f"https://wttr.in/<span class="hljs-subst">{location}</span>?format=j1"</span>)
        data = response.json()

        <span class="hljs-comment"># Format for Claude</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"🌤️ Weather for <span class="hljs-subst">{location}</span>:\nTemperature: <span class="hljs-subst">{data[<span class="hljs-string">'current_condition'</span>][<span class="hljs-number">0</span>][<span class="hljs-string">'temp_F'</span>]}</span>°F\n..."</span>
</code></pre>
<p>The beauty is that tools are just classes with four things: a name, description, input schema, and execute method. Drop the file in <code>/tools/</code>, and ThinkChain discovers it automatically.</p>
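<p>Under the hood, discovery doesn't need to be fancy: import every module in the folder and collect the <code>BaseTool</code> subclasses. Here's a minimal sketch of that idea (the actual pipeline layers schema validation and friendlier error reporting on top, per the diagram above, but the core loop looks like this):</p>
<pre><code class="lang-python">import importlib.util
import inspect
from pathlib import Path

from tools.base import BaseTool

def discover_local_tools(folder: str = "tools") -&gt; dict:
    """Import each .py file in the folder and register BaseTool subclasses."""
    registry = {}
    for path in Path(folder).glob("*.py"):
        if path.stem in ("base", "__init__"):
            continue  # skip the base class module and package init
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for _, cls in inspect.getmembers(module, inspect.isclass):
            if issubclass(cls, BaseTool) and cls is not BaseTool:
                registry[cls.name] = cls()  # one instance, reused across calls
    return registry
</code></pre>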
<h3 id="heading-real-example-flow">Real Example Flow</h3>
<p>Let me show you what happens when someone asks "What's the weather in San Francisco and find good restaurants there?":</p>
<pre><code class="lang-plaintext">[thinking] I need to check the weather first, then find restaurants that might be good for those conditions.

[tool_use:weathertool] ▶ {"location": "San Francisco, CA"}
[tool_result] 🌤️ Weather for San Francisco, CA:
Temperature: 62°F (feels like 62°F)
Conditions: Partly cloudy
Humidity: 38%
Wind: 5 mph WSW

[thinking] It's a pleasant 62°F and partly cloudy - perfect weather for outdoor dining or walking to restaurants. I should look for places with outdoor seating or patios.

[tool_use:duckduckgotool] ▶ {"query": "best restaurants San Francisco outdoor seating patio"}
[tool_result] [Restaurant results with outdoor dining options...]

[thinking] Given the nice weather, I can recommend these outdoor-friendly restaurants...
</code></pre>
<p>See how the weather result influences the restaurant search? That's the power of tool result injection - Claude doesn't just call tools sequentially, it thinks about results and makes smarter decisions.</p>
<h2 id="heading-building-real-tools-from-concept-to-code">Building Real Tools: From Concept to Code</h2>
<p>When I started building tools for ThinkChain, I learned that the description is just as important as the implementation. Claude needs to understand not just <em>what</em> your tool does, but <em>when</em> to use it.</p>
<p>Here's the complete weathertool implementation with everything I learned:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> tools.base <span class="hljs-keyword">import</span> BaseTool
<span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">import</span> json

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">WeatherTool</span>(<span class="hljs-params">BaseTool</span>):</span>
    name = <span class="hljs-string">"weathertool"</span>

    <span class="hljs-comment"># This description is crucial - it helps Claude decide when to use the tool</span>
    description = <span class="hljs-string">"""
    Gets current weather information for any location worldwide. Returns temperature, 
    weather conditions, humidity, wind speed and direction.

    Use this tool when users ask about:
    - Current weather in any city/location
    - Temperature anywhere  
    - Weather conditions (sunny, cloudy, rainy, etc.)
    - "What's the weather like in [location]?"
    """</span>

    <span class="hljs-comment"># JSON Schema for input validation</span>
    input_schema = {
        <span class="hljs-string">"type"</span>: <span class="hljs-string">"object"</span>,
        <span class="hljs-string">"properties"</span>: {
            <span class="hljs-string">"location"</span>: {
                <span class="hljs-string">"type"</span>: <span class="hljs-string">"string"</span>,
                <span class="hljs-string">"description"</span>: <span class="hljs-string">"The city and state/country (e.g., 'San Francisco, CA' or 'London, UK')"</span>
            },
            <span class="hljs-string">"units"</span>: {
                <span class="hljs-string">"type"</span>: <span class="hljs-string">"string"</span>, 
                <span class="hljs-string">"description"</span>: <span class="hljs-string">"Temperature units"</span>,
                <span class="hljs-string">"enum"</span>: [<span class="hljs-string">"fahrenheit"</span>, <span class="hljs-string">"celsius"</span>, <span class="hljs-string">"kelvin"</span>],
                <span class="hljs-string">"default"</span>: <span class="hljs-string">"fahrenheit"</span>
            }
        },
        <span class="hljs-string">"required"</span>: [<span class="hljs-string">"location"</span>]
    }

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">execute</span>(<span class="hljs-params">self, **kwargs</span>) -&gt; str:</span>
        location = kwargs.get(<span class="hljs-string">"location"</span>)
        units = kwargs.get(<span class="hljs-string">"units"</span>, <span class="hljs-string">"fahrenheit"</span>)

        <span class="hljs-keyword">try</span>:
            <span class="hljs-comment"># Use wttr.in - free weather API, no key needed</span>
            response = requests.get(<span class="hljs-string">f"https://wttr.in/<span class="hljs-subst">{location}</span>?format=j1"</span>, timeout=<span class="hljs-number">10</span>)
            response.raise_for_status()
            data = response.json()

            current = data[<span class="hljs-string">'current_condition'</span>][<span class="hljs-number">0</span>]
            temp_c = int(current[<span class="hljs-string">'temp_C'</span>])
            temp_f = int(current[<span class="hljs-string">'temp_F'</span>])

            <span class="hljs-comment"># Format based on requested units</span>
            <span class="hljs-keyword">if</span> units.lower() == <span class="hljs-string">"celsius"</span>:
                temp = <span class="hljs-string">f"<span class="hljs-subst">{temp_c}</span>°C"</span>
            <span class="hljs-keyword">else</span>:  <span class="hljs-comment"># Default to fahrenheit</span>
                temp = <span class="hljs-string">f"<span class="hljs-subst">{temp_f}</span>°F"</span>

            <span class="hljs-comment"># Return formatted result that Claude can easily understand</span>
            <span class="hljs-keyword">return</span> <span class="hljs-string">f"""🌤️ Weather for <span class="hljs-subst">{location}</span>:
Temperature: <span class="hljs-subst">{temp}</span>
Conditions: <span class="hljs-subst">{current[<span class="hljs-string">'weatherDesc'</span>][<span class="hljs-number">0</span>][<span class="hljs-string">'value'</span>]}</span>
Humidity: <span class="hljs-subst">{current[<span class="hljs-string">'humidity'</span>]}</span>%
Wind: <span class="hljs-subst">{current[<span class="hljs-string">'windspeedMiles'</span>]}</span> mph <span class="hljs-subst">{current[<span class="hljs-string">'winddir16Point'</span>]}</span>"""</span>

        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            <span class="hljs-comment"># Always return string errors - Claude can handle them gracefully</span>
            <span class="hljs-keyword">return</span> <span class="hljs-string">f"❌ Error fetching weather data: <span class="hljs-subst">{str(e)}</span>"</span>
</code></pre>
<h3 id="heading-key-patterns-i-discovered">Key Patterns I Discovered</h3>
<p><strong>Rich Descriptions Win</strong>: The more context you give Claude about when to use your tool, the better it performs. Include example queries, keywords that should trigger it, and specific use cases.</p>
<p><strong>Error Handling Matters</strong>: Always catch exceptions and return string error messages. Claude is surprisingly good at handling errors gracefully when you give it clear information about what went wrong.</p>
<p><strong>Format for Claude</strong>: Structure your output to be easily parseable. Use emojis, clear labels, and consistent formatting. Claude works better with well-structured data.</p>
<p><strong>Input Validation</strong>: Use comprehensive JSON schemas. They prevent errors and help Claude understand exactly what parameters your tool expects.</p>
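<p>That last point is worth a concrete example. Checking arguments against the schema <em>before</em> executing keeps bad calls from ever reaching your tool, and turns schema violations into messages Claude can act on. A minimal sketch using the <code>jsonschema</code> package (illustrative - <code>run_tool</code> here is a hypothetical helper, not ThinkChain's actual dispatcher):</p>
<pre><code class="lang-python">from jsonschema import ValidationError, validate

def run_tool(tool, **kwargs) -&gt; str:
    """Validate kwargs against the tool's input_schema, then execute."""
    try:
        validate(instance=kwargs, schema=tool.input_schema)
    except ValidationError as e:
        # Surface the problem as text so Claude can correct its next call
        return f"❌ Invalid arguments for {tool.name}: {e.message}"
    return tool.execute(**kwargs)
</code></pre>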
<h2 id="heading-mcp-integration-extending-beyond-local-tools">MCP Integration: Extending Beyond Local Tools</h2>
<p>One of the most exciting discoveries was integrating with MCP (Model Context Protocol) servers. MCP lets you connect to external servers that provide tools, dramatically expanding what's possible.</p>
<p>Here's how I added SQLite database operations:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"mcpServers"</span>: {
    <span class="hljs-attr">"sqlite"</span>: {
      <span class="hljs-attr">"command"</span>: <span class="hljs-string">"uvx"</span>,
      <span class="hljs-attr">"args"</span>: [<span class="hljs-string">"mcp-server-sqlite"</span>, <span class="hljs-string">"--db-path"</span>, <span class="hljs-string">"./database.db"</span>],
      <span class="hljs-attr">"description"</span>: <span class="hljs-string">"SQLite database operations"</span>,
      <span class="hljs-attr">"enabled"</span>: <span class="hljs-literal">true</span>
    }
  }
}
</code></pre>
<p>Just by adding that configuration, ThinkChain gained 6 new tools - <code>uvx</code> fetches and runs <code>mcp-server-sqlite</code> on demand, so there's nothing else to install:</p>
<ul>
<li><p><code>mcp_sqlite_read_query</code> - Execute SELECT queries</p>
</li>
<li><p><code>mcp_sqlite_write_query</code> - Execute INSERT/UPDATE/DELETE</p>
</li>
<li><p><code>mcp_sqlite_create_table</code> - Create database tables</p>
</li>
<li><p><code>mcp_sqlite_list_tables</code> - List all tables</p>
</li>
<li><p><code>mcp_sqlite_describe_table</code> - Get table schema</p>
</li>
<li><p><code>mcp_sqlite_append_insight</code> - Add business insights</p>
</li>
</ul>
<p>The power comes from combining ecosystems. Now I can ask: "Check the weather in our office locations from the database, then find restaurants near each one" and Claude seamlessly uses both local tools (weather) and MCP tools (database) together.</p>
<p>What blew my mind was how naturally Claude chains these together. It doesn't see a difference between local Python tools and remote MCP servers - they're all just tools in its toolkit.</p>
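<p>Wiring a configured server into the registry follows the same pattern as local discovery: connect, list the tools, and prefix their names so nothing collides. Here's a sketch using the official <code>mcp</code> Python SDK (the entry points below match the SDK's documented quickstart, but treat the details as illustrative):</p>
<pre><code class="lang-python">from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def list_mcp_tools(name: str, command: str, args: list[str]) -&gt; dict:
    """Spawn an MCP server over stdio and collect its advertised tools."""
    params = StdioServerParameters(command=command, args=args)
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            # Prefix names so remote tools can't shadow local ones
            return {f"mcp_{name}_{tool.name}": tool for tool in result.tools}
</code></pre>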
<h2 id="heading-the-enhanced-ui-making-it-beautiful">The Enhanced UI: Making It Beautiful</h2>
<p>Here's something I learned early: if you're building developer tools, the experience matters just as much as the functionality. I could have stopped at a basic CLI, but I wanted ThinkChain to <em>feel</em> as intelligent as it actually is.</p>
<p>So I built two interfaces, plus a launcher:</p>
<ul>
<li><p><strong>thinkchain.py</strong> - The full experience with Rich formatting, progress bars, and interactive features</p>
</li>
<li><p><strong>thinkchain_cli.py</strong> - Minimal CLI for when you just need it to work</p>
</li>
<li><p><strong>run.py</strong> - Smart launcher that detects available libraries and picks the best option</p>
</li>
</ul>
<p>Here's what the enhanced UI looks like when it starts up:</p>
<pre><code class="lang-plaintext">╔═══════════════════════════════════════════════════════════════════╗
║  ████████╗██╗  ██╗██╗███╗   ██╗██╗  ██╗ ██████╗██╗  ██╗ █████╗ ██╗███╗   ██╗  ║
║  ╚══██╔══╝██║  ██║██║████╗  ██║██║ ██╔╝██╔════╝██║  ██║██╔══██╗██║████╗  ██║  ║
║     ██║   ███████║██║██╔██╗ ██║█████╔╝ ██║     ███████║███████║██║██╔██╗ ██║  ║
║     ██║   ██╔══██║██║██║╚██╗██║██╔═██╗ ██║     ██╔══██║██╔══██║██║██║╚██╗██║  ║
║     ██║   ██║  ██║██║██║ ╚████║██║  ██╗╚██████╗██║  ██║██║  ██║██║██║ ╚████║  ║
║     ╚═╝   ╚═╝  ╚═╝╚═╝╚═╝  ╚═══╝╚═╝  ╚═╝ ╚═════╝╚═╝  ╚═╝╚═╝  ╚═╝╚═╝╚═╝  ╚═══╝  ║
║          🧠 Claude Chat with Advanced Tool Integration &amp; Thinking 💭          ║
╚═══════════════════════════════════════════════════════════════════════════════╝

Claude Tool Discovery Chat
🔧 Local: 11 tools │ 🌐 MCP: 6 servers │ 💭 Thinking: ON │ 🔋 Ready
</code></pre>
<p>But the real magic happens during conversations. Watch what happens when Claude uses a tool:</p>
<pre><code class="lang-plaintext">👤 You: What's the weather in Cross Lanes, WV?

💭 Thinking: I'll check the current weather in Cross Lanes, WV for you.

🔧 Tool Use: weathertool

🔧 Executing: weathertool
╭───────────────────────── Arguments for weathertool ──────────────────────────╮
│ {                                                                            │
│   "location": "Cross Lanes, WV"                                              │
│ }                                                                            │
╰──────────────────────────────────────────────────────────────────────────────╯
🔧 weathertool: Executing...
🔧 weathertool: Completed (0.8s)
╭────────────────────────── Result from weathertool ───────────────────────────╮
│ 🌤️ Weather for Cross Lanes, WV:                                               │
│ Temperature: 73°F (feels like 77°F)                                          │
│ Conditions: Heavy rain with thunderstorm                                     │
│ Humidity: 79%                                                                │
│ Wind: 2 mph WSW                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

🔄 Continuing with tool results...

💭 Thinking: The current weather in Cross Lanes, WV shows stormy conditions...

🤖 Claude: The current weather in Cross Lanes, WV is stormy with heavy rain and thunderstorms...
</code></pre>
<p>Every step is visualized: thinking appears in italic blue, tool execution shows progress with timing, and results are formatted in beautiful boxes. You can actually <em>watch</em> Claude think through problems.</p>
<h3 id="heading-technical-implementation">Technical Implementation</h3>
<p>The enhanced UI is built with the Rich library, but here's the clever part - it gracefully degrades:</p>
<pre><code class="lang-python"><span class="hljs-keyword">try</span>:
    <span class="hljs-keyword">from</span> rich.console <span class="hljs-keyword">import</span> Console
    <span class="hljs-keyword">from</span> rich.panel <span class="hljs-keyword">import</span> Panel
    <span class="hljs-keyword">from</span> rich.progress <span class="hljs-keyword">import</span> Progress
    <span class="hljs-keyword">from</span> ui_components <span class="hljs-keyword">import</span> ui  <span class="hljs-comment"># Enhanced UI components</span>
    UI_AVAILABLE = <span class="hljs-literal">True</span>
<span class="hljs-keyword">except</span> ImportError:
    UI_AVAILABLE = <span class="hljs-literal">False</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">print_tool_execution</span>(<span class="hljs-params">name, status, duration=None</span>):</span>
    <span class="hljs-keyword">if</span> UI_AVAILABLE:
        <span class="hljs-keyword">if</span> status == <span class="hljs-string">"executing"</span>:
            ui.print(<span class="hljs-string">f"🔧 [yellow]Executing:[/yellow] <span class="hljs-subst">{name}</span>"</span>)
        <span class="hljs-keyword">elif</span> status == <span class="hljs-string">"completed"</span>:
            ui.print(<span class="hljs-string">f"🔧 [green]Completed:[/green] <span class="hljs-subst">{name}</span> (<span class="hljs-subst">{duration:<span class="hljs-number">.1</span>f}</span>s)"</span>)
    <span class="hljs-keyword">else</span>:
        <span class="hljs-comment"># Fallback to basic text</span>
        print(<span class="hljs-string">f"[tool_use:<span class="hljs-subst">{name}</span>] <span class="hljs-subst">{status}</span>"</span>)
</code></pre>
<p>The slash commands system was particularly fun to build:</p>
<pre><code class="lang-python">command_words = [<span class="hljs-string">'/help'</span>, <span class="hljs-string">'/tools'</span>, <span class="hljs-string">'/refresh'</span>, <span class="hljs-string">'/config'</span>, <span class="hljs-string">'/exit'</span>]

user_input = ui.get_input(<span class="hljs-string">"Enter command or message"</span>, command_words)

<span class="hljs-keyword">if</span> user_input.startswith(<span class="hljs-string">'/'</span>):
    command_parts = user_input[<span class="hljs-number">1</span>:].split()
    command = command_parts[<span class="hljs-number">0</span>].lower()
    args = command_parts[<span class="hljs-number">1</span>:]

    <span class="hljs-keyword">if</span> command == <span class="hljs-string">'tools'</span>:
        show_tools_command()  <span class="hljs-comment"># Beautiful table of all tools</span>
    <span class="hljs-keyword">elif</span> command == <span class="hljs-string">'config'</span>:
        handle_config_command(args)  <span class="hljs-comment"># Interactive configuration</span>
</code></pre>
<p>You get tab completion, command history with arrow keys, and rich formatting throughout. But if you don't have Rich installed, everything still works - it just falls back to plain text.</p>
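<p>If you're wondering where the tab completion and history come from, <code>prompt_toolkit</code> gives you both almost for free. A minimal sketch of the pattern (I'm not showing ThinkChain's exact wiring behind <code>ui.get_input</code> here):</p>
<pre><code class="lang-python">from prompt_toolkit import PromptSession
from prompt_toolkit.completion import WordCompleter

session = PromptSession()  # arrow-key history persists for the session
completer = WordCompleter(['/help', '/tools', '/refresh', '/config', '/exit'])

user_input = session.prompt('Enter command or message: ', completer=completer)
</code></pre>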
<h2 id="heading-lessons-learned-and-developer-insights">Lessons Learned and Developer Insights</h2>
<p>Building ThinkChain taught me things about AI tool integration that I never expected. Here are the biggest insights:</p>
<h3 id="heading-what-worked-incredibly-well">What Worked Incredibly Well</h3>
<p><strong>Tool result injection is a game-changer.</strong> I cannot overstate this. When Claude can think about tool results before responding, the quality of responses improves dramatically. It's not just using tools - it's reasoning about their outputs.</p>
<p><strong>Automatic tool discovery scales effortlessly.</strong> I started with 2 tools, now have 17, and adding new ones is still just "drop file in folder, restart." The discovery system handles all the complexity.</p>
<p><strong>Rich descriptions make Claude smarter.</strong> The difference between a tool with a basic description and one with rich context about when to use it is night and day. Claude makes much better tool selection decisions with good descriptions.</p>
<p><strong>MCP integration unlocks unlimited possibilities.</strong> Once I connected to MCP servers, I realized this isn't just about the tools I build - it's about connecting to an entire ecosystem.</p>
<h3 id="heading-challenges-that-surprised-me">Challenges That Surprised Me</h3>
<p><strong>Managing async MCP connections was trickier than expected.</strong> MCP servers run as separate processes, and coordinating their lifecycle with the main application required careful async handling:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">cleanup_mcp_servers</span>():</span>
    <span class="hljs-string">"""Gracefully shutdown all MCP server connections"""</span>
    <span class="hljs-keyword">for</span> server_name, client <span class="hljs-keyword">in</span> self.active_clients.items():
        <span class="hljs-keyword">try</span>:
            <span class="hljs-keyword">await</span> client.close()
        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.error(<span class="hljs-string">f"Error during cleanup of MCP server <span class="hljs-subst">{server_name}</span>: <span class="hljs-subst">{e}</span>"</span>)
</code></pre>
<p><strong>Tool failure handling needs to be bulletproof.</strong> When a tool fails, you can't just crash - Claude needs to understand what went wrong and potentially try alternative approaches:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">execute_tool_sync</span>(<span class="hljs-params">name: str, args: dict</span>) -&gt; str:</span>
    <span class="hljs-keyword">try</span>:
        tool_function = TOOL_REGISTRY[name]  <span class="hljs-comment"># registry built at discovery time (name illustrative)</span>
        result = tool_function(args)
        <span class="hljs-keyword">return</span> result
    <span class="hljs-keyword">except</span> requests.RequestException <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"❌ Network error calling <span class="hljs-subst">{name}</span>: <span class="hljs-subst">{str(e)}</span>"</span>
    <span class="hljs-keyword">except</span> ValidationError <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"❌ Invalid input for <span class="hljs-subst">{name}</span>: <span class="hljs-subst">{str(e)}</span>"</span>
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"❌ Unexpected error in <span class="hljs-subst">{name}</span>: <span class="hljs-subst">{str(e)}</span>"</span>
</code></pre>
<p><strong>Thinking budget optimization matters more than I thought.</strong> Initially I set the thinking budget to 16,000 tokens, but I found that 1,024-2,048 tokens often works better. Too much thinking budget and Claude overthinks simple problems. Too little and it can't reason through complex tool chains.</p>
<h3 id="heading-performance-insights">Performance Insights</h3>
<p><strong>Tool execution timing varies wildly.</strong> Weather API calls take 0.5-1 second, web scraping can take 3-5 seconds, and database operations are nearly instant. The UI progress indicators help users understand what's happening.</p>
<p><strong>Streaming vs. batch processing trade-offs.</strong> Streaming gives better user experience but requires more complex error handling. I ended up with a hybrid approach - streaming for the conversation, but batch processing for tool discovery at startup.</p>
<p><strong>Memory usage grows with tool count.</strong> Each tool keeps its schema in memory, and MCP connections maintain persistent state. With 17 tools I'm using about 50MB, which is totally reasonable, but it's something to watch.</p>
<h3 id="heading-code-patterns-that-emerged">Code Patterns That Emerged</h3>
<p>Here are some patterns I found myself using repeatedly:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Tool result validation pattern</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">validate_and_format_result</span>(<span class="hljs-params">result: str, tool_name: str</span>) -&gt; str:</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> result:
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"❌ <span class="hljs-subst">{tool_name}</span> returned empty result"</span>

    <span class="hljs-comment"># Try to parse as JSON for structured data</span>
    <span class="hljs-keyword">try</span>:
        parsed = json.loads(result)
        <span class="hljs-keyword">return</span> json.dumps(parsed, indent=<span class="hljs-number">2</span>)  <span class="hljs-comment"># Pretty print</span>
    <span class="hljs-keyword">except</span>:
        <span class="hljs-keyword">return</span> result  <span class="hljs-comment"># Return as-is if not JSON</span>

<span class="hljs-comment"># Graceful degradation pattern  </span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">safe_tool_execution</span>(<span class="hljs-params">tool_func, *args, **kwargs</span>):</span>
    <span class="hljs-keyword">try</span>:
        <span class="hljs-keyword">return</span> tool_func(*args, **kwargs)
    <span class="hljs-keyword">except</span> ImportError <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"❌ Missing dependency: <span class="hljs-subst">{e}</span>"</span>
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"❌ Tool execution failed: <span class="hljs-subst">{e}</span>"</span>

<span class="hljs-comment"># Configuration management pattern</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">update_config</span>(<span class="hljs-params">key: str, value: Any</span>) -&gt; bool:</span>
    <span class="hljs-keyword">if</span> key <span class="hljs-keyword">in</span> ALLOWED_CONFIG_KEYS:
        CONFIG[key] = value
        save_config_to_file()  <span class="hljs-comment"># Persist changes</span>
        <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>
    <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
</code></pre>
<p>What I realized is that building AI tools isn't just about the AI part - it's about creating robust, developer-friendly systems that handle edge cases gracefully and provide great experiences.</p>
<h2 id="heading-fork-it-and-make-it-yours">Fork It and Make It Yours</h2>
<p>Here's the thing - ThinkChain is designed to be forked and extended. I built it with MIT license specifically because I want to see what developers build with it.</p>
<p>The architecture is modular by design. Want to add tools for your domain? Drop Python files in <code>/tools/</code>. Want to connect to specialized MCP servers? Edit <code>mcp_config.json</code>. Want to customize the UI? Modify the Rich components.</p>
<h3 id="heading-ideas-for-domain-specific-forks">Ideas for Domain-Specific Forks</h3>
<p><strong>Data Science ThinkChain</strong>: Add pandas tools for data manipulation, matplotlib for visualization, jupyter tools for notebook integration. Imagine asking Claude to "load this dataset, analyze trends, and create visualizations" and watching it think through each step.</p>
<p><strong>Web Development ThinkChain</strong>: React component generators, npm package managers, git integration tools, deployment automation. "Create a new React component with these props and add it to the project" becomes a conversation, not a manual process.</p>
<p><strong>DevOps ThinkChain</strong>: Docker container tools, Kubernetes deployment tools, AWS/GCP integration, monitoring dashboards. "Check the health of our production services and scale if needed" with full reasoning about the decisions.</p>
<p><strong>Research ThinkChain</strong>: Academic paper search tools, citation managers, data analysis tools, LaTeX generators. "Find recent papers on this topic and summarize their methodologies" with tool-driven research.</p>
<h3 id="heading-getting-started-with-your-fork">Getting Started with Your Fork</h3>
<p>The process is straightforward:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Fork and clone</span>
git <span class="hljs-built_in">clone</span> https://github.com/yourusername/your-thinkchain-fork.git
<span class="hljs-built_in">cd</span> your-thinkchain-fork

<span class="hljs-comment"># Install dependencies</span>
uv pip install -r requirements.txt

<span class="hljs-comment"># Create your first tool</span>
vim tools/yourtool.py

<span class="hljs-comment"># Test it</span>
python thinkchain.py
/refresh  <span class="hljs-comment"># Loads your new tool</span>
<span class="hljs-string">"Use my new tool for X"</span>  <span class="hljs-comment"># Test with Claude</span>
</code></pre>
<h3 id="heading-what-i-hope-you-build">What I Hope You Build</h3>
<p>I'm excited to see domain-specific forks, novel tool combinations, and creative MCP integrations. Maybe someone builds ThinkChain for legal research, or scientific computing, or creative writing. The possibilities are endless.</p>
<p>If you build something cool, let me know! I'd love to feature community forks and see how people extend the system.</p>
<h2 id="heading-whats-next">What's Next</h2>
<p>Building ThinkChain opened my eyes to what's possible when AI tools can think about their own tool use. Here's what I'm excited about for the future:</p>
<h3 id="heading-technical-improvements-im-working-on">Technical Improvements I'm Working On</h3>
<p><strong>Better error recovery</strong>: When tools fail, Claude should be able to suggest alternative approaches or debug the problem. I'm experimenting with giving Claude access to error logs and system state.</p>
<p><strong>Tool composition workflows</strong>: Instead of just chaining tools, what if Claude could compose them into reusable workflows? "Remember this sequence of tools as a 'data analysis workflow' for future use."</p>
<p><strong>Multi-model support</strong>: Claude is amazing, but different models have different strengths. What if you could use GPT-4 for creative tasks and Claude for analytical ones, all in the same conversation?</p>
<p><strong>Performance optimizations</strong>: Some tool chains could run in parallel instead of sequentially. I'm exploring how to let Claude mark which tools can run concurrently.</p>
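<p>The mechanics of that last one are the easy part - once you know two tool calls are independent, <code>asyncio.gather</code> runs them concurrently; the hard part is deciding independence. A sketch of the execution side, reusing the <code>execute_tool</code> coroutine from earlier:</p>
<pre><code class="lang-python">import asyncio

async def execute_parallel(calls: list[tuple[str, dict]]) -&gt; list[str]:
    """Run independent tool calls concurrently; errors come back as strings."""
    results = await asyncio.gather(
        *(execute_tool(name, args) for name, args in calls),
        return_exceptions=True,
    )
    return [f"❌ {r}" if isinstance(r, Exception) else r for r in results]
</code></pre>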
<h3 id="heading-the-bigger-picture">The Bigger Picture</h3>
<p>What excites me most is that ThinkChain represents a shift from "AI that uses tools" to "AI that thinks about tools." When Claude can reason about tool results, it makes fundamentally better decisions about which tools to use and how to use them.</p>
<p>I think this is just the beginning. As more MCP servers come online, as tool ecosystems mature, and as AI models get better at reasoning, we're going to see AI systems that don't just automate tasks - they intelligently orchestrate complex workflows.</p>
<p>The future isn't AI replacing human developers - it's AI becoming incredibly sophisticated development partners that can think through problems, use tools intelligently, and explain their reasoning every step of the way.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Pietro's tweet about "chain of tools" sparked an idea, but what I discovered while building ThinkChain was something bigger: when you let AI think about tool results, everything changes.</p>
<p>Claude doesn't just use tools anymore - it reasons about them, learns from them, and makes intelligent decisions about how to combine them. The feedback loop between thinking and tool execution creates a kind of intelligence I hadn't seen before.</p>
<p>For developers, this means we need to think differently about AI integration. It's not enough to just give AI access to tools - we need to design systems that let AI think about tool results and use that thinking to make better decisions.</p>
<p>The technical patterns are surprisingly straightforward: tool result injection, async streaming, graceful error handling, and rich user experiences. But the implications are profound. We're moving from AI assistants that follow scripts to AI partners that can reason through complex problems.</p>
<p>ThinkChain is my exploration of this idea, but it's really just the beginning. The best AI tools aren't just smart - they're tools that make AI smarter.</p>
<p>Fork it, extend it, and build something amazing. I can't wait to see what you create.</p>
<hr />
<p><strong>Want to try ThinkChain?</strong> Check out the <a target="_blank" href="https://github.com/martinbowling/ThinkChain">GitHub repository</a> and start building your own tool-thinking feedback loops.</p>
<p><strong>Have ideas or questions?</strong> Find me on <a target="_blank" href="https://x.com/martinbowling">Twitter/X</a> - I'd love to hear what you're building!</p>
]]></content:encoded></item><item><title><![CDATA[Building DeepBuild]]></title><description><![CDATA[🚀 DeepBuild: Where AI Meets Code Magic!
Hey there, code enthusiasts! Martin Bowling here! 👋 Today I'm pulling back the curtain on DeepBuild, our AI-powered software wizard. Get ready for a fun journey through the codebase where I'll show you how ev...]]></description><link>https://martinbowling.com/building-deepbuild</link><guid isPermaLink="true">https://martinbowling.com/building-deepbuild</guid><category><![CDATA[DeepSeek-v3]]></category><category><![CDATA[AI]]></category><category><![CDATA[Deepseek]]></category><category><![CDATA[Node.js]]></category><category><![CDATA[TypeScript]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Tue, 31 Dec 2024 23:09:58 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/tikhtH3QRSQ/upload/78d6552f27c51da452c5fa7862e433be.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-deepbuild-where-ai-meets-code-magic">🚀 DeepBuild: Where AI Meets Code Magic!</h1>
<p>Hey there, code enthusiasts! Martin Bowling here! 👋 Today I'm pulling back the curtain on <strong>DeepBuild</strong>, our AI-powered software wizard. Get ready for a fun journey through the codebase where I'll show you how everything ticks, why we made certain choices, and how you can join the party with your own contributions!</p>
<hr />
<h2 id="heading-what-is-deepbuild">🤔 What Is DeepBuild?</h2>
<p>Think of DeepBuild as your AI-powered coding companion that can spawn entire software projects from scratch! We've mixed together some awesome ingredients:</p>
<ul>
<li><p>⚛️ <strong>Next.js</strong> (powering our slick front end and serverless API routes)</p>
</li>
<li><p>📝 <strong>TypeScript</strong> (because who doesn't love type safety?)</p>
</li>
<li><p>🎨 <strong>Tailwind CSS</strong> (for that pixel-perfect styling)</p>
</li>
<li><p>🪝 <strong>Custom React hooks</strong> and context providers (keeping our state game strong)</p>
</li>
<li><p>💾 <strong>IndexedDB</strong> (via a custom file manager) to store and manage project data in your browser</p>
</li>
</ul>
<p>When you put it all together, users can:</p>
<ol>
<li><p>Dream up a project idea</p>
</li>
<li><p>Watch as AI crafts a detailed project blueprint</p>
</li>
<li><p>Have a chat with the AI to refine the details</p>
</li>
<li><p>See complete files materialize like magic! ✨</p>
</li>
</ol>
<hr />
<h2 id="heading-the-core-concept">🎯 The Core Concept</h2>
<p>Here's the secret sauce:</p>
<ol>
<li><p>🗣️ <strong>You</strong> describe your dream project</p>
</li>
<li><p>🤖 <strong>The AI</strong> cooks up some JSON containing:</p>
<ul>
<li><p>📋 A <strong>Project Brief</strong> (the grand plan)</p>
</li>
<li><p>❓ <strong>Clarifying questions</strong> (getting those details just right)</p>
</li>
<li><p>📁 A <strong>file structure</strong> (the blueprint)</p>
</li>
</ul>
</li>
<li><p>💬 <strong>You</strong> chat with the AI to refine everything</p>
</li>
<li><p>⚡ <strong>DeepBuild</strong> transforms those ideas into real, working code!</p>
</li>
</ol>
<p>Here's a peek at the prompts making the magic happen:</p>
<pre><code class="lang-ts"><span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> BRIEF_SYSTEM_PROMPT = <span class="hljs-string">`
  You are DeepBuild, an elite software engineer...
  ...
  Remember: Always wrap your response in &lt;final_json&gt; tags and ensure it's valid JSON.
`</span>;

<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> IMPLEMENTATION_SYSTEM_PROMPT = <span class="hljs-string">`
  You are DeepBuild, an elite software engineer...
  ...
  Remember: Always wrap your response in &lt;final_json&gt; tags and ensure it's valid JSON.
`</span>;
</code></pre>
<hr />
<h2 id="heading-a-tour-of-the-code">🏗️ A Tour of the Code</h2>
<h3 id="heading-1-root-layout-amp-global-styles">1. 🎨 Root Layout &amp; Global Styles</h3>
<ul>
<li><p><strong>File:</strong> <code>app/layout.tsx</code></p>
</li>
<li><p><strong>File:</strong> <code>app/globals.css</code></p>
</li>
</ul>
<p>Our <code>layout.tsx</code> is where the magic begins! It sets up the main HTML structure and brings in our <code>FileProvider</code> context:</p>
<pre><code class="lang-tsx">export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    &lt;html lang="en" suppressHydrationWarning&gt;
      &lt;body className={inter.className}&gt;
        &lt;FileProvider&gt;{children}&lt;/FileProvider&gt;
      &lt;/body&gt;
    &lt;/html&gt;
  );
}
</code></pre>
<p>We're all in on <strong>Tailwind</strong> for styling, with dark mode support baked right into <code>globals.css</code> 🌙</p>
<h3 id="heading-2-main-dashboard">2. 📊 Main Dashboard</h3>
<ul>
<li><strong>File:</strong> <code>app/page.tsx</code></li>
</ul>
<p>The dashboard is your command center! Here's where you can:</p>
<ul>
<li><p>🆕 Create shiny new projects</p>
</li>
<li><p>🗑️ Clean up old ones</p>
</li>
<li><p>📦 Export as <code>.zip</code> files</p>
</li>
<li><p>⚙️ Tweak your AI settings</p>
</li>
</ul>
<pre><code class="lang-tsx">export default function Dashboard() {
  // Hooks, state, event handlers here...
  // ...
  return (
    &lt;div className="container py-8 px-12"&gt;
      {/* Header + new project button */}
      {/* List of projects as ProjectCard */}
      {/* Settings dialog */}
    &lt;/div&gt;
  );
}
</code></pre>
<h3 id="heading-3-components-galore">3. 🧩 Components Galore</h3>
<p>Our <code>components/</code> folder is where all the real action happens! Here's what we're working with:</p>
<ul>
<li><p>🎴 <code>ProjectCard.tsx</code>: Shows off each project with its current status</p>
</li>
<li><p>🆕 <code>NewProjectModal.tsx</code>: A sleek dialog for birthing new projects</p>
</li>
<li><p>🖼️ <code>ProjectView.tsx</code>: The main stage where all the magic happens</p>
</li>
<li><p>✏️ <code>Editor.tsx</code>: Powered by <code>@monaco-editor/react</code> for that pro-level coding experience</p>
</li>
</ul>
<p>Check out how we tie everything together in <code>ProjectView.tsx</code>:</p>
<pre><code class="lang-tsx">&lt;div className="flex-1 flex"&gt;
  &lt;div className="w-64 border-r border-border bg-muted/30"&gt;
    &lt;FileTree
      files={project.files}
      selectedFile={selectedFile}
      onFileSelect={setSelectedFile}
      onFileRefresh={handleFileRefresh}
    /&gt;
  &lt;/div&gt;

  &lt;div className="flex-1 border-r border-border"&gt;
    {selectedFile ? (
      &lt;Editor
        file={selectedFile}
        language={getLanguageFromPath(selectedFile.path)}
      /&gt;
    ) : (
      &lt;div className="h-full flex items-center justify-center text-muted-foreground"&gt;
        Select a file to view or edit
      &lt;/div&gt;
    )}
  &lt;/div&gt;

  &lt;div className="w-96 flex flex-col h-full"&gt;
    &lt;ProjectChat
      // ...
    /&gt;
  &lt;/div&gt;
&lt;/div&gt;
</code></pre>
<h3 id="heading-4-indexeddb-filemanager">4. 💾 IndexedDB 'FileManager'</h3>
<ul>
<li><strong>File:</strong> <code>lib/fileOperations.ts</code></li>
</ul>
<p>We're keeping it local with IndexedDB for data storage! Our <code>FileManager</code> class handles all the heavy lifting:</p>
<pre><code class="lang-ts"><span class="hljs-keyword">class</span> FileManager {
  <span class="hljs-keyword">private</span> db: IDBDatabase | <span class="hljs-literal">null</span> = <span class="hljs-literal">null</span>;

  <span class="hljs-keyword">constructor</span>(<span class="hljs-params"></span>) {
    <span class="hljs-built_in">this</span>.initDB();
  }

  <span class="hljs-keyword">async</span> saveProject(name: <span class="hljs-built_in">string</span>, brief: ProjectBrief): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">string</span>&gt; {
    <span class="hljs-comment">// Creates a new project + files in IndexedDB</span>
  }

  <span class="hljs-comment">// ...</span>
}
</code></pre>
<p>No server required - everything stays right in your browser! 🏠</p>
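<p>The <code>initDB()</code> call is standard IndexedDB boilerplate: open the database, create object stores on first run or version bump, and hold onto the connection. A condensed sketch of what that looks like inside <code>FileManager</code> (database and store names here are illustrative):</p>
<pre><code class="lang-ts">private initDB(): void {
  const request = indexedDB.open('deepbuild', 1);

  request.onupgradeneeded = () =&gt; {
    const db = request.result;
    // Create object stores the first time (or after a version bump)
    if (!db.objectStoreNames.contains('projects')) {
      db.createObjectStore('projects', { keyPath: 'id' });
    }
    if (!db.objectStoreNames.contains('files')) {
      db.createObjectStore('files', { keyPath: 'id' });
    }
  };

  request.onsuccess = () =&gt; {
    this.db = request.result;
  };
}
</code></pre>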
<h3 id="heading-5-prompts-and-ai-hooks">5. 🤖 Prompts and AI Hooks</h3>
<ul>
<li><strong>File:</strong> <code>hooks/useChat.ts</code></li>
</ul>
<p>The AI communication happens through <code>sendChatMessage()</code> in <code>lib/api.ts</code>. We can talk to either the <strong>DeepSeek</strong> or <strong>Hyperbolic</strong> model, depending on your preferences. We scan responses for those special <code>&lt;final_json&gt;</code> tags to extract the structured data we need:</p>
<pre><code class="lang-ts"><span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> sendChatMessage(apiMessages);
<span class="hljs-keyword">const</span> match = response.match(<span class="hljs-regexp">/&lt;final_json&gt;([^]*?)&lt;\/final_json&gt;/</span>);
<span class="hljs-keyword">let</span> parsedResponse = <span class="hljs-literal">null</span>;

<span class="hljs-keyword">if</span> (match) {
  parsedResponse = <span class="hljs-built_in">JSON</span>.parse(match[<span class="hljs-number">1</span>].trim());
} <span class="hljs-keyword">else</span> {
  parsedResponse = <span class="hljs-built_in">JSON</span>.parse(response); <span class="hljs-comment">// fallback</span>
}
</code></pre>
<hr />
<h2 id="heading-example-creating-a-new-project">⚡ Example: Creating a New Project</h2>
<p>Here's what happens when you hit that "New Project" button:</p>
<ol>
<li><p>📝 <strong>You</strong> enter your <strong>Project Name</strong> and describe your vision</p>
</li>
<li><p>🤖 <strong>DeepBuild</strong> sends your description to the AI with our special <code>BRIEF_SYSTEM_PROMPT</code>:</p>
<pre><code class="lang-ts"> <span class="hljs-keyword">const</span> briefResponse = <span class="hljs-keyword">await</span> sendMessage(
   <span class="hljs-string">`Create a project brief for the following utility app: <span class="hljs-subst">${description}</span>`</span>,
   <span class="hljs-string">'brief'</span>
 );
</code></pre>
</li>
<li><p>🎁 The AI returns structured JSON with everything we need</p>
</li>
<li><p>💾 We store it all in IndexedDB with a fresh <code>projectId</code></p>
</li>
<li><p>💬 You can then chat with the AI to generate each file</p>
</li>
</ol>
<p>Here's the code that kicks it all off:</p>
<pre><code class="lang-tsx">const response = await sendMessage(
  `Create a project brief for a ${description}. Include implementation details...`,
  'brief'
);
const match = response.match(/&lt;final_json&gt;([^]*?)&lt;\/final_json&gt;/);
// ...
const parsedResponse = JSON.parse(match[1].trim());
// Then we create the project in IndexedDB via fileManager.saveProject
</code></pre>
<hr />
<h2 id="heading-contributing-to-deepbuild">🌟 Contributing to DeepBuild</h2>
<p>Ready to help make DeepBuild even more awesome? Here are some cool ways to contribute:</p>
<ol>
<li><p>🎯 <strong>New AI prompts</strong>: Got ideas for framework-specific prompts?</p>
</li>
<li><p>🛠️ <strong>Error handling</strong>: Help us make the system more robust</p>
</li>
<li><p>💅 <strong>UI/UX improvements</strong>: Add new language support, better diffs, or other dev-friendly features</p>
</li>
<li><p>🔌 <strong>Plugins</strong>: Help us build a system for specialized generators (Docker, AWS, etc.)</p>
</li>
</ol>
<h2 id="heading-how-to-contribute">🤝 How to Contribute</h2>
<ol>
<li><p>📦 <strong>Clone</strong> the repo and get those dependencies installed</p>
</li>
<li><p>🌱 <strong>Create</strong> your feature branch</p>
</li>
<li><p>✨ <strong>Make</strong> your magic happen</p>
</li>
<li><p>🚀 <strong>Open</strong> a pull request and let our CI pipeline do its thing!</p>
</li>
</ol>
<p>We're on a mission to make software generation not just possible, but absolutely delightful! Join us in building the future of AI-assisted development.</p>
<h2 id="heading-wrapping-up">🎬 Wrapping Up</h2>
<p>And there you have it, friends! That's the inside scoop on how <strong>DeepBuild</strong> works its magic. Remember:</p>
<ol>
<li><p>💭 Dream up your project</p>
</li>
<li><p>🤖 Let AI craft the plan</p>
</li>
<li><p>🔨 Guide the process until your codebase is ready!</p>
</li>
</ol>
<p>Got questions? Want to collaborate? <a target="_blank" href="mailto:martin@martinbowling.com">Drop me a line</a>! And don't forget to smash that ⭐ button if you're thinking about contributing!</p>
<p>Thanks for joining me on this code adventure! Now go forth and build something awesome!</p>
<p><strong>🚀 Martin Bowling</strong></p>
<h2 id="heading-quick-links">🔗 Quick Links</h2>
<ul>
<li><p>🚀 <strong>Try DeepBuild</strong>:</p>
<ul>
<li><p>Template: <a target="_blank" href="https://replit.com/@MartinBowling/deepbuild?v=1">DeepBuild on Replit</a></p>
</li>
<li><p>Hosted Version: <a target="_blank" href="http://deepbuild.app">deepbuild.app</a></p>
</li>
</ul>
</li>
<li><p>💻 <strong>Source Code</strong>: <a target="_blank" href="https://github.com/martinbowling/deepbuild">GitHub Repository</a></p>
</li>
<li><p>🤖 <strong>Powered By</strong>:</p>
<ul>
<li><p><a target="_blank" href="https://deepseek.ai">DeepSeek AI</a> - State-of-the-art code generation</p>
</li>
<li><p><a target="_blank" href="https://hyperbolic.io">Hyperbolic Labs</a> - Decentralized inference</p>
</li>
<li><p><a target="_blank" href="https://replit.com">Replit</a> - Development platform and hosting</p>
</li>
</ul>
</li>
</ul>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/deepbuild">https://replit.com/@MartinBowling/deepbuild</a></div>
]]></content:encoded></item><item><title><![CDATA[Building an llms.txt Generator]]></title><description><![CDATA[As a developer who's deeply interested in both web technologies and AI, I've been fascinated by how we can better connect these two worlds. That's why I created the llms.txt Generator - a tool that helps websites communicate their content more effect...]]></description><link>https://martinbowling.com/building-an-ai-powered-llmstxt-generator</link><guid isPermaLink="true">https://martinbowling.com/building-an-ai-powered-llmstxt-generator</guid><category><![CDATA[Next.js]]></category><category><![CDATA[Python]]></category><category><![CDATA[gradio]]></category><category><![CDATA[AI]]></category><category><![CDATA[groq]]></category><category><![CDATA[#GroqAI]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Mon, 25 Nov 2024 21:43:44 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/cckf4TsHAuw/upload/8b2a3f1351b009cb58b819e03972cdc1.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As a developer who's deeply interested in both web technologies and AI, I've been fascinated by how we can better connect these two worlds. That's why I created the llms.txt Generator - a tool that helps websites communicate their content more effectively with AI models. The project is based on the <a target="_blank" href="https://llmstxt.org/">llms.txt specification</a>, a standard proposed in September 2024 that aims to help websites provide structured, AI-friendly content, similar to how robots.txt helps search engines.</p>
<p>The llms.txt specification, created by <a target="_blank" href="https://jeremy.fast.ai/">Jeremy Howard</a>, introduces a standardized way for websites to provide AI-readable content through a <code>/llms.txt</code> file in their root directory. This file contains markdown-formatted summaries, documentation, and links to key resources, making it easier for AI models to understand and process website content during inference time. Today, I want to share the journey of building this tool and the technical decisions that went into it.</p>
<h2 id="heading-understanding-llmstxt">Understanding llms.txt</h2>
<p>Before diving into the implementation, let's talk about what llms.txt is and why it matters. Similar to robots.txt, llms.txt is a proposed standard that helps AI models better understand website content. It provides AI-generated summaries of your website's pages in a structured format. Here's what a basic llms.txt file looks like:</p>
<pre><code class="lang-plaintext"># Example Website

&gt; This is the homepage of Example.com, featuring our main products and services.

## Main Pages

- [/about](/about): About page describing our company history and mission
- [/products](/products): Comprehensive catalog of our product offerings
- [/contact](/contact): Contact information and support resources
</code></pre>
<h2 id="heading-technical-architecture">Technical Architecture</h2>
<p>I built two versions of this tool: a modern Next.js web application and a Python implementation using Gradio. The Next.js version offers a polished UI with real-time updates, while the Gradio version provides a simpler interface that's perfect for quick deployments and testing.</p>
<p>Let's look at both implementations, starting with the Next.js version. Here's how the sitemap discovery process works:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> handleSubmit = <span class="hljs-keyword">async</span> (e: React.FormEvent) =&gt; {
  <span class="hljs-comment">// ... initialization code ...</span>

  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> normalizedUrl = normalizeUrl(url);
    <span class="hljs-keyword">const</span> robotsUrl = <span class="hljs-keyword">new</span> URL(<span class="hljs-string">'/robots.txt'</span>, normalizedUrl).toString();

    <span class="hljs-keyword">let</span> sitemapUrls: string[] = [];

    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> robotsTxt = <span class="hljs-keyword">await</span> fetchWithProxy(robotsUrl);
      sitemapUrls = extractSitemapUrlsFromRobotsTxt(robotsTxt);
    } <span class="hljs-keyword">catch</span> (error) {
      addDebugMessage(<span class="hljs-string">`Failed to fetch robots.txt: <span class="hljs-subst">${error.message}</span>`</span>);
    }

    <span class="hljs-comment">// If no sitemaps found in robots.txt, try common locations</span>
    <span class="hljs-keyword">if</span> (sitemapUrls.length === <span class="hljs-number">0</span>) {
      <span class="hljs-keyword">const</span> commonLocations = getCommonSitemapUrls(normalizedUrl);
      <span class="hljs-comment">// ... sitemap discovery logic ...</span>
    }
  }
  <span class="hljs-comment">// ... rest of the implementation</span>
}
</code></pre>
<h2 id="heading-working-with-ai-models">Working with AI Models</h2>
<p>One of the most exciting aspects of this project was integrating multiple AI providers. I chose to support both Hyperbolic and Groq, as they offer distinct and complementary advantages:</p>
<p><strong>Hyperbolic</strong> specializes in decentralized AI computing by aggregating idle computing resources, making AI development more accessible and cost-effective. Their open-access platform enables individuals and organizations to train and host AI models collaboratively.</p>
<p><strong>Groq</strong> focuses on delivering exceptional AI inference speed and efficiency through their Language Processing Unit (LPU) technology. Designed specifically for AI inference, Groq's hardware and software platform offers instant speed and energy efficiency at scale.</p>
<p>Here's how we generate summaries using each provider.</p>
<p>First, the Hyperbolic implementation:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">generateHyperbolicSummary</span>(<span class="hljs-params">url: string, apiKey: string</span>): <span class="hljs-title">Promise</span>&lt;<span class="hljs-title">string</span>&gt; </span>{
  <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> fetch(<span class="hljs-string">'https://api.hyperbolic.xyz/v1/chat/completions'</span>, {
    <span class="hljs-attr">method</span>: <span class="hljs-string">'POST'</span>,
    <span class="hljs-attr">headers</span>: {
      <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/json'</span>,
      <span class="hljs-string">'Authorization'</span>: <span class="hljs-string">`Bearer <span class="hljs-subst">${apiKey}</span>`</span>,
    },
    <span class="hljs-attr">body</span>: <span class="hljs-built_in">JSON</span>.stringify({
      <span class="hljs-attr">model</span>: <span class="hljs-string">'meta-llama/Meta-Llama-3.1-8B-Instruct'</span>,
      <span class="hljs-attr">messages</span>: [
        {
          <span class="hljs-attr">role</span>: <span class="hljs-string">'user'</span>,
          <span class="hljs-attr">content</span>: <span class="hljs-string">`Generate a concise 1-sentence summary of the purpose of this webpage: <span class="hljs-subst">${url}</span>`</span>
        }
      ],
      <span class="hljs-attr">max_tokens</span>: <span class="hljs-number">200</span>,
      <span class="hljs-attr">temperature</span>: <span class="hljs-number">0.7</span>,
      <span class="hljs-attr">top_p</span>: <span class="hljs-number">0.9</span>,
      <span class="hljs-attr">stream</span>: <span class="hljs-literal">false</span>
    }),
  });

  <span class="hljs-keyword">const</span> json = <span class="hljs-keyword">await</span> response.json();
  <span class="hljs-keyword">return</span> json.choices[<span class="hljs-number">0</span>].message.content;
}
</code></pre>
<p>And here's the Groq implementation:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">generateGroqSummary</span>(<span class="hljs-params">url: string, content: str, api_key: string</span>): <span class="hljs-title">Promise</span>&lt;<span class="hljs-title">string</span>&gt; </span>{
  <span class="hljs-keyword">const</span> groq = <span class="hljs-keyword">new</span> Groq({ apiKey });

  <span class="hljs-keyword">const</span> completion = <span class="hljs-keyword">await</span> groq.chat.completions.create({
    <span class="hljs-attr">messages</span>: [{
      <span class="hljs-attr">role</span>: <span class="hljs-string">'user'</span>,
      <span class="hljs-attr">content</span>: <span class="hljs-string">`Generate a concise 1-sentence summary of this webpage content:\n\nURL: <span class="hljs-subst">${url}</span>\n\nContent: <span class="hljs-subst">${content}</span>`</span>
    }],
    <span class="hljs-attr">model</span>: <span class="hljs-string">"llama-3.2-1b-preview"</span>,
    <span class="hljs-attr">temperature</span>: <span class="hljs-number">0.7</span>,
    <span class="hljs-attr">max_tokens</span>: <span class="hljs-number">200</span>
  });

  <span class="hljs-keyword">return</span> completion.choices[<span class="hljs-number">0</span>]?.message?.content || <span class="hljs-string">''</span>;
}
</code></pre>
<h2 id="heading-content-extraction-challenge">Content Extraction Challenge</h2>
<p>One of the biggest challenges was extracting clean, meaningful content from web pages. Raw HTML is messy and full of navigation elements, ads, and other noise. That's where the Markdowner API comes in - it helps us get clean, formatted content that's perfect for AI processing.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">getPageContent</span>(<span class="hljs-params">url: string, markdownerKey?: string</span>): <span class="hljs-title">Promise</span>&lt;<span class="hljs-title">string</span>&gt; </span>{
  <span class="hljs-keyword">const</span> headers = {<span class="hljs-string">"Accept"</span>: <span class="hljs-string">"text/plain"</span>};
  <span class="hljs-keyword">if</span> (markdownerKey) {
    headers[<span class="hljs-string">"Authorization"</span>] = <span class="hljs-string">`Bearer <span class="hljs-subst">${markdownerKey}</span>`</span>;
  }

  <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> fetch(<span class="hljs-string">"https://md.dhr.wtf/"</span>, {
    <span class="hljs-attr">method</span>: <span class="hljs-string">"POST"</span>,
    headers,
    <span class="hljs-attr">body</span>: <span class="hljs-built_in">JSON</span>.stringify({ url })
  });

  <span class="hljs-keyword">return</span> response.text();
}
</code></pre>
<h2 id="heading-python-implementation-with-gradio">Python Implementation with Gradio</h2>
<p>In addition to the Next.js version, I also built a Python implementation using Gradio for a more streamlined deployment option. The Gradio version offers a simple yet powerful interface for generating llms.txt files. Here's how the core processing pipeline works:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">process_website</span>(<span class="hljs-params">
    url: str,
    hyperbolic_key: str = <span class="hljs-string">""</span>,
    groq_key: str = <span class="hljs-string">""</span>,
    markdowner_key: str = <span class="hljs-string">""</span>,
    use_hyperbolic: bool = True,
    progress=gr.Progress(<span class="hljs-params"></span>)
</span>) -&gt; Tuple[str, str, List[str], str]:</span>
    <span class="hljs-keyword">try</span>:
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> (use_hyperbolic <span class="hljs-keyword">and</span> hyperbolic_key) <span class="hljs-keyword">and</span> <span class="hljs-keyword">not</span> (<span class="hljs-keyword">not</span> use_hyperbolic <span class="hljs-keyword">and</span> groq_key):
            <span class="hljs-keyword">return</span> <span class="hljs-string">"Error: Please provide an API key for the selected AI provider"</span>, <span class="hljs-literal">None</span>, [], <span class="hljs-string">""</span>

        base_url = normalize_url(url)
        progress(<span class="hljs-number">0</span>, desc=<span class="hljs-string">"Initializing..."</span>)

        <span class="hljs-comment"># Try robots.txt first</span>
        sitemap_urls = []
        <span class="hljs-keyword">try</span>:
            robots_url = urljoin(base_url, <span class="hljs-string">'/robots.txt'</span>)
            robots_content = fetch_with_proxy(robots_url)
            sitemap_urls = extract_sitemap_urls_from_robots(robots_content)
        <span class="hljs-keyword">except</span>:
            <span class="hljs-keyword">pass</span>

        <span class="hljs-comment"># Process sitemaps and generate summaries</span>
        progress(<span class="hljs-number">0.4</span>, desc=<span class="hljs-string">"Processing sitemaps..."</span>)

        <span class="hljs-comment"># ... rest of processing logic ...</span>

        <span class="hljs-keyword">return</span> llms_txt, json.dumps(summaries), all_urls, llms_full_txt

    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        print(<span class="hljs-string">f"Error in process_website: <span class="hljs-subst">{str(e)}</span>"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"Processing failed: <span class="hljs-subst">{str(e)}</span>"</span>, <span class="hljs-literal">None</span>, [], <span class="hljs-string">""</span>
</code></pre>
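<p>The <code>extract_sitemap_urls_from_robots</code> helper called above isn't shown in the post; a minimal sketch, assuming it simply scans robots.txt for <code>Sitemap:</code> directives, might look like this:</p>
<pre><code class="lang-python">from typing import List

def extract_sitemap_urls_from_robots(robots_content: str) -&gt; List[str]:
    """Collect URLs from 'Sitemap:' directives in a robots.txt file."""
    urls = []
    for line in robots_content.splitlines():
        # The directive is case-insensitive: "Sitemap: https://example.com/sitemap.xml"
        if line.strip().lower().startswith("sitemap:"):
            # Split on the first colon only, so the URL's own colons survive
            value = line.split(":", 1)[1].strip()
            if value:
                urls.append(value)
    return urls
</code></pre>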
<p>The Gradio interface is built with a clean, user-friendly design that includes features like:</p>
<pre><code class="lang-python"><span class="hljs-keyword">with</span> gr.Blocks(title=<span class="hljs-string">"llms.txt Generator"</span>, theme=gr.themes.Soft()) <span class="hljs-keyword">as</span> demo:
    gr.Markdown(<span class="hljs-string">"""
    # llms.txt Generator 🤖✨
    Generate AI-powered llms.txt files for any website 🌐
    """</span>)

    <span class="hljs-keyword">with</span> gr.Row():
        url_input = gr.Textbox(
            label=<span class="hljs-string">"Website URL"</span>,
            placeholder=<span class="hljs-string">"Enter website URL"</span>
        )
        markdowner_key = gr.Textbox(
            label=<span class="hljs-string">"Markdowner API Key (Optional)"</span>,
            placeholder=<span class="hljs-string">"For higher rate limits"</span>,
            type=<span class="hljs-string">"password"</span>,
            container=<span class="hljs-literal">True</span>,
            scale=<span class="hljs-number">2</span>
        )

    <span class="hljs-comment"># AI Provider Selection</span>
    <span class="hljs-keyword">with</span> gr.Row():
        <span class="hljs-keyword">with</span> gr.Column():
            use_hyperbolic = gr.Checkbox(
                label=<span class="hljs-string">"Use Hyperbolic"</span>,
                value=<span class="hljs-literal">True</span>,
                interactive=<span class="hljs-literal">True</span>
            )
            hyperbolic_key = gr.Textbox(
                label=<span class="hljs-string">"Hyperbolic API Key"</span>,
                type=<span class="hljs-string">"password"</span>,
                visible=<span class="hljs-literal">True</span>,
                placeholder=<span class="hljs-string">"Enter your Hyperbolic API key"</span>
            )

        <span class="hljs-keyword">with</span> gr.Column():
            use_groq = gr.Checkbox(
                label=<span class="hljs-string">"Use Groq"</span>,
                value=<span class="hljs-literal">False</span>,
                interactive=<span class="hljs-literal">True</span>
            )
            groq_key = gr.Textbox(
                label=<span class="hljs-string">"Groq API Key"</span>,
                type=<span class="hljs-string">"password"</span>,
                visible=<span class="hljs-literal">False</span>,
                placeholder=<span class="hljs-string">"Enter your Groq API key"</span>
            )
</code></pre>
<p>One of the key features of the Gradio version is its robust sitemap processing and URL extraction:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">extract_urls_from_sitemap</span>(<span class="hljs-params">content: str</span>) -&gt; List[str]:</span>
    urls = []
    <span class="hljs-keyword">try</span>:
        root = ET.fromstring(content)
        ns = {
            <span class="hljs-string">'ns'</span>: root.tag.split(<span class="hljs-string">'}'</span>)[<span class="hljs-number">0</span>].strip(<span class="hljs-string">'{'</span>)
        } <span class="hljs-keyword">if</span> <span class="hljs-string">'}'</span> <span class="hljs-keyword">in</span> root.tag <span class="hljs-keyword">else</span> {}

        <span class="hljs-comment"># Handle sitemap index</span>
        <span class="hljs-keyword">if</span> <span class="hljs-string">'sitemapindex'</span> <span class="hljs-keyword">in</span> root.tag:
            <span class="hljs-keyword">for</span> sitemap <span class="hljs-keyword">in</span> root.findall(<span class="hljs-string">'.//ns:loc'</span>, ns):
                <span class="hljs-keyword">try</span>:
                    sitemap_content = fetch_with_proxy(sitemap.text.strip())
                    urls.extend(extract_urls_from_sitemap(sitemap_content))
                <span class="hljs-keyword">except</span> Exception:
                    <span class="hljs-keyword">continue</span>
        <span class="hljs-comment"># Handle urlset</span>
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">for</span> url <span class="hljs-keyword">in</span> root.findall(<span class="hljs-string">'.//ns:loc'</span>, ns):
                urls.append(url.text.strip())
    <span class="hljs-keyword">except</span> ET.ParseError:
        <span class="hljs-keyword">pass</span>
    <span class="hljs-keyword">return</span> urls
</code></pre>
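<p>As a quick usage sketch (the URL is a placeholder, and <code>fetch_with_proxy</code> is the same helper used earlier in the pipeline), feeding the function a sitemap or sitemap index yields a flat list of page URLs:</p>
<pre><code class="lang-python"># Expand a sitemap (or sitemap index) into individual page URLs
sitemap_xml = fetch_with_proxy("https://example.com/sitemap.xml")
page_urls = extract_urls_from_sitemap(sitemap_xml)
print(f"Found {len(page_urls)} pages")
</code></pre>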
<p>The Gradio implementation offers several advantages over the web version:</p>
<ol>
<li><p>Easy deployment to platforms like Hugging Face Spaces</p>
</li>
<li><p>Built-in rate limiting and progress tracking</p>
</li>
<li><p>Simplified dependency management</p>
</li>
<li><p>Automatic file download handling</p>
</li>
<li><p>Real-time processing feedback</p>
</li>
</ol>
<p>Both versions of the tool complement each other - the Next.js version provides a polished web experience, while the Gradio version offers a more straightforward path for quick deployments and testing.</p>
<h2 id="heading-implementation-details">Implementation Details</h2>
<p>The project includes several key features that make it production-ready:</p>
<ul>
<li><p>Automatic sitemap discovery through robots.txt and common locations</p>
</li>
<li><p>Rate limiting to respect API constraints</p>
</li>
<li><p>Progress tracking for long-running operations</p>
</li>
<li><p>Error handling and retry logic (sketched after this list)</p>
</li>
<li><p>Clean, modern UI with dark mode support</p>
</li>
</ul>
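<p>The post doesn't show the retry logic itself, so here's a generic sketch of the pattern; the function name and signature are illustrative, not taken from the project:</p>
<pre><code class="lang-python">import time
from typing import Callable

def fetch_with_retry(fetch_fn: Callable[[str], str], url: str, attempts: int = 3) -&gt; str:
    """Illustrative retry wrapper with exponential backoff (1s, 2s, 4s, ...)."""
    for attempt in range(attempts):
        try:
            return fetch_fn(url)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries, surface the error
            time.sleep(2 ** attempt)
</code></pre>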
<h2 id="heading-future-improvements">Future Improvements</h2>
<p>While the current version is fully functional, I have several improvements planned:</p>
<ul>
<li><p>Support for more AI providers</p>
</li>
<li><p>Batch processing for large websites</p>
</li>
<li><p>Custom summary templates</p>
</li>
<li><p>API endpoint for programmatic access</p>
</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Building the llms.txt Generator has been an exciting journey into the intersection of web technologies and AI. It's open source and available on GitHub, and I'm looking forward to seeing how the community uses and improves it.</p>
<p>The llms.txt standard is still evolving, and tools like this will help shape how websites and AI models interact in the future.</p>
<h2 id="heading-getting-started">Getting Started</h2>
<p>There are several ways to start using the llms.txt Generator:</p>
<ol>
<li><p><strong>Hosted Version</strong>: If you want to try it out immediately without any setup, visit <a target="_blank" href="http://llms.website">llms.website</a> for a fully managed version that's always up to date. This is the easiest way to get started and requires no technical knowledge.</p>
</li>
<li><p><strong>Self-Hosting</strong>: For developers who want full control, you can self-host either the Next.js or Gradio version. The complete source code is available on <a target="_blank" href="https://github.com/martinbowling/llms.website">GitHub</a>.</p>
</li>
<li><p><strong>Quick Deploy Options</strong>:</p>
<ul>
<li><p><strong>Next.js Version</strong>: Use our <a target="_blank" href="https://replit.com/@MartinBowling/llmswebsite?v=1">Next.js Replit template</a> to get a web version running in minutes</p>
</li>
<li><p><strong>Gradio Version</strong>: Try our <a target="_blank" href="https://replit.com/@MartinBowling/llmstxt-generator-Gradio-Version?v=1">Gradio Replit template</a> for a Python-based implementation</p>
</li>
<li><p>Both templates can also be easily deployed to platforms like Hugging Face Spaces or your preferred hosting provider</p>
</li>
</ul>
</li>
</ol>
<p>Each option has its advantages:</p>
<ul>
<li><p>The hosted version at <a target="_blank" href="http://llms.website">llms.website</a> is perfect for content teams and non-technical users</p>
</li>
<li><p>Self-hosting gives you complete control over the environment and API keys</p>
</li>
<li><p>Platform deployments offer a middle ground with some customization while maintaining ease of use</p>
</li>
</ul>
<h2 id="heading-resources-amp-links">Resources &amp; Links</h2>
<ul>
<li><p><strong>Main Project</strong></p>
<ul>
<li><p>🌐 Hosted Version: <a target="_blank" href="http://llms.website">llms.website</a></p>
</li>
<li><p>📦 GitHub Repository: <a target="_blank" href="http://github.com/martinbowling/llms.website">github.com/martinbowling/llms.website</a></p>
</li>
</ul>
</li>
<li><p><strong>Templates</strong></p>
<ul>
<li><p>⚡ Next.js Template: <a target="_blank" href="https://replit.com/@MartinBowling/llmswebsite?v=1">Replit Template</a></p>
</li>
<li><p>🐍 Gradio Template: <a target="_blank" href="https://replit.com/@MartinBowling/llmstxt-generator-Gradio-Version?v=1">Replit Template</a></p>
</li>
</ul>
</li>
</ul>
<p>Choose the option that best fits your needs and technical comfort level. The hosted version is recommended for most users, while the templates and self-hosting options are perfect for developers who want to customize the tool or integrate it into their own workflows.</p>
<p>If you have any questions about deployment options or need help getting started, feel free to reach out to me on <a target="_blank" href="https://twitter.com/martinbowling">Twitter</a> or through the GitHub repository. I'm always happy to help users get started with whichever deployment option they choose!</p>
]]></content:encoded></item><item><title><![CDATA[Llamaberry - Cultivating Chain of Thought Reasoning in AI]]></title><description><![CDATA[Hey there, AI adventurers, I'm ready to take you on a journey through the fascinating world of AI reasoning. Today, we're diving deep into Llamaberry, my implementation of a multi-turn Chain of Thought (CoT) reasoning system. Buckle up, because this...]]></description><link>https://martinbowling.com/llamaberry-cultivating-chain-of-thought-reasoning-in-ai</link><guid isPermaLink="true">https://martinbowling.com/llamaberry-cultivating-chain-of-thought-reasoning-in-ai</guid><category><![CDATA[AI]]></category><category><![CDATA[chain of thought]]></category><category><![CDATA[groq]]></category><category><![CDATA[Python]]></category><category><![CDATA[gradio]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Sat, 14 Sep 2024 19:13:19 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1726340324179/a171d010-bc47-49dd-a23b-d68485fac1b4.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there, AI adventurers, I'm ready to take you on a journey through the fascinating world of AI reasoning. Today, we're diving deep into Llamaberry, my implementation of a multi-turn Chain of Thought (CoT) reasoning system. Buckle up, because this is going to be one berry interesting ride! 🚀</p>
<h2 id="heading-the-evolution-of-ai-reasoning">The Evolution of AI Reasoning 📚</h2>
<p>Before we sink our teeth into Llamaberry, let's take a quick trip down memory lane. AI reasoning has come a long way, baby!</p>
<ul>
<li><p><strong>1950s-1960s</strong>: Rule-based systems. Think of these as the "If This, Then That" of the AI world.</p>
</li>
<li><p><strong>1980s-1990s</strong>: Expert systems. These were like having a virtual expert in a specific field.</p>
</li>
<li><p><strong>2000s-2010s</strong>: Machine learning models. Now we're talking! AI that can learn from data.</p>
</li>
<li><p><strong>2020s</strong>: Large Language Models (LLMs) and Chain of Thought reasoning. This is where things get really exciting!</p>
</li>
</ul>
<p>Chain of Thought reasoning is like giving AI a notepad to show its work. Instead of just spitting out an answer, it walks us through its thinking process. And that's where Llamaberry comes in! 🌟</p>
<h2 id="heading-whats-llamaberry-you-ask">What's Llamaberry, You Ask? 🤔</h2>
<p>Imagine if you could teach an AI to think not just fast, but thoroughly - like a human expert pondering a complex problem. That's exactly what I set out to do with Llamaberry. It's my implementation of a multi-turn Chain of Thought (CoT) reasoning system, powered by the Llama 3.1 70B model on Groq. Fancy name aside, it's all about making AI think more like us... only faster and without needing coffee breaks! ☕</p>
<h2 id="heading-the-secret-sauce-multi-turn-reasoning">The Secret Sauce: Multi-Turn Reasoning 🥣</h2>
<p>Here's where it gets interesting. Instead of asking the AI to give us an answer in one go, we're asking it to think through the problem in multiple steps, or "turns." It's like watching a master chef prepare a complex dish - each step builds on the last, resulting in something far more refined than a quick microwave meal.</p>
<p>Let's break it down:</p>
<ol>
<li><p><strong>Turn 1</strong> 🥚: The AI takes a first crack at the problem.</p>
</li>
<li><p><strong>Turn 2</strong> 🍳: It looks back at its first attempt and tries to improve or refine its thinking.</p>
</li>
<li><p><strong>Turn 3</strong> 🧑‍🍳: One more round of reflection and refinement.</p>
</li>
<li><p><strong>Synthesis</strong> 🍽️: Finally, it puts all these thoughts together into one coherent, well-reasoned answer.</p>
</li>
</ol>
<p>To visualize this process, here's a delightful little diagram that shows how each turn builds on the previous one:</p>
<pre><code class="lang-plaintext">    Turn 1         Turn 2         Turn 3
   🧠💡📝         🔍🤔📊         🔬🧪🎯
 ┌─────────┐    ┌─────────┐    ┌─────────┐
 │  Input  │    │ Output  │───▶│  Input  │
 └─────────┘    └─────────┘    └─────────┘
      │              ▲              |
      │              │              │
      ▼              │              ▼
 ┌─────────┐    ┌─────────┐    ┌─────────┐
 │Reasoning│    │Reasoning│    │Reasoning│
 └─────────┘    └─────────┘    └─────────┘
      │              ▲              │
      │              │              │
      ▼              │              ▼
 ┌─────────┐    ┌─────────┐    ┌─────────┐
 │ Output  │───▶│  Input  │    │ Output  │
 └─────────┘    └─────────┘    └─────────┘
                                    │
                                    │
                                    ▼
                               ┌─────────┐
                               │Synthesis│
                               └─────────┘
                                    │
                                    │
                                    ▼
                            🎉 Final Answer 🎉
</code></pre>
<p>Isn't that neat? You can see how each turn's output becomes the input for the next turn, allowing the AI to refine its thinking at each stage. And at the end, all of this thoughtful pondering gets distilled into one final, well-reasoned answer. It's like watching a berry ripen to perfection! 🍓</p>
<h2 id="heading-the-recipe-how-it-all-works">The Recipe: How It All Works 📝</h2>
<p>Now, let's get into the nitty-gritty. Don't worry; I promise to keep it as smooth as a berry smoothie!</p>
<h3 id="heading-step-1-setting-the-stage">Step 1: Setting the Stage 🎭</h3>
<p>We start by setting up our AI assistant with some ground rules. Here's a snippet of what that looks like:</p>
<pre><code class="lang-python">initial_system_prompt = <span class="hljs-string">"""You are an AI assistant capable of detailed, step-by-step thinking. When presented with a question or problem, break down your thought process into clear, logical steps. For each step, explain your reasoning. Conclude with a final answer. Use the following markdown structure:

## Reasoning
1. [First step]
   **Explanation:** [Detailed explanation of this step]
2. [Second step]
   **Explanation:** [Detailed explanation of this step]
...

## Answer
[Final answer]

Be comprehensive and show your reasoning clearly."""</span>
</code></pre>
<p>This is like giving our AI chef a recipe to follow. It knows it needs to break down its thinking into steps and explain each one, all neatly formatted in markdown.</p>
<h3 id="heading-step-2-the-thinking-process">Step 2: The Thinking Process 🤔💭</h3>
<p>For each turn of reasoning, we ask the AI to think about the problem. But here's the cool part - after the first turn, we also ask it to consider what it thought before. It's like asking a friend, "Hey, remember what you said earlier? Let's think about that again."</p>
<p>Here's how we generate each turn:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_turn</span>(<span class="hljs-params">query: str, previous_turns: list = None</span>) -&gt; str:</span>
    is_first_turn = previous_turns <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span> <span class="hljs-keyword">or</span> len(previous_turns) == <span class="hljs-number">0</span>
    <span class="hljs-keyword">if</span> is_first_turn:
        messages = [{
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
            <span class="hljs-string">"content"</span>: initial_system_prompt
        }, {
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
            <span class="hljs-string">"content"</span>: query
        }]
    <span class="hljs-keyword">else</span>:
        previous_content = <span class="hljs-string">"\n\n"</span>.join(previous_turns)
        messages = [{
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
            <span class="hljs-string">"content"</span>: followup_system_prompt
        }, {
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
            <span class="hljs-string">"content"</span>: <span class="hljs-string">f"Original Query: <span class="hljs-subst">{query}</span>\n\nPrevious Turns:\n<span class="hljs-subst">{previous_content}</span>\n\nProvide the next turn of reasoning."</span>
        }]

    <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> call_llm(messages)
</code></pre>
<p>This function is the heart of our multi-turn reasoning. It decides whether we're on the first turn or a subsequent one, and structures the prompt accordingly. For turns after the first, it includes all previous turns in the context, allowing the AI to build on its earlier thoughts.</p>
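<p>The <code>followup_system_prompt</code> referenced here never appears in the post, so here's a plausible sketch of what it could look like, assuming it mirrors the markdown structure of the initial prompt:</p>
<pre><code class="lang-python"># Assumed prompt: the real followup_system_prompt isn't shown in the post
followup_system_prompt = """You are an AI assistant reviewing your own previous reasoning. Examine the earlier turns for gaps, errors, or overlooked angles, then produce the next turn of reasoning that refines or extends them. Use the same markdown structure:

## Reasoning
1. [Step]
   **Explanation:** [Detailed explanation of this step]
...

## Answer
[Refined answer]"""
</code></pre>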
<h3 id="heading-step-3-putting-it-all-together">Step 3: Putting It All Together 🧩</h3>
<p>After three turns of thinking, we ask our AI to look at all its thoughts and come up with a final, well-reasoned answer. It's like asking a group of experts to discuss a problem and then write a report together.</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">synthesize_turns</span>(<span class="hljs-params">query: str, turns: list</span>) -&gt; str:</span>
    turns_text = <span class="hljs-string">"\n\n"</span>.join(
        [<span class="hljs-string">f"Turn <span class="hljs-subst">{i+<span class="hljs-number">1</span>}</span>:\n<span class="hljs-subst">{turn}</span>"</span> <span class="hljs-keyword">for</span> i, turn <span class="hljs-keyword">in</span> enumerate(turns)])
    messages = [{
        <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
        <span class="hljs-string">"content"</span>: synthesis_prompt
    }, {
        <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
        <span class="hljs-string">"content"</span>: <span class="hljs-string">f"Original Query: <span class="hljs-subst">{query}</span>\n\nTurns of Reasoning:\n<span class="hljs-subst">{turns_text}</span>"</span>
    }]
    <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> call_llm(messages)
</code></pre>
<p>This synthesis step is crucial. It's where our AI takes all its thoughts from the previous turns and distills them into a coherent, final answer.</p>
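<p>One more missing piece: <code>call_llm</code> is used in both functions but never defined in the post. A minimal sketch, assuming the Groq Python SDK and the Llama 3.1 70B model mentioned earlier (the exact model id and the defaults are assumptions), could look like this:</p>
<pre><code class="lang-python">import os

from groq import AsyncGroq

# Assumed setup: reads GROQ_API_KEY from the environment
client = AsyncGroq(api_key=os.environ.get("GROQ_API_KEY"))

async def call_llm(messages: list) -&gt; str:
    """Send a chat completion request and return the text of the reply."""
    response = await client.chat.completions.create(
        model="llama-3.1-70b-versatile",  # assumed id for Llama 3.1 70B on Groq
        messages=messages,
        temperature=0.7,   # illustrative defaults
        max_tokens=2048,
    )
    return response.choices[0].message.content
</code></pre>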
<h2 id="heading-single-turn-vs-multi-turn-a-tasty-comparison">Single-Turn vs Multi-Turn: A Tasty Comparison 🍎🍏</h2>
<p>Let's compare single-turn and multi-turn reasoning with a simple example. Imagine we ask the AI: "What are the potential impacts of artificial intelligence on the job market?"</p>
<p><strong>Single-Turn Reasoning:</strong></p>
<pre><code class="lang-plaintext">AI: Artificial intelligence will likely have significant impacts on the job market, including:
1. Automation of routine tasks
2. Creation of new job roles in AI development and maintenance
3. Potential job displacement in certain industries
4. Increased demand for skills in data analysis and machine learning

While AI may eliminate some jobs, it will also create new opportunities and increase productivity in many fields.
</code></pre>
<p><strong>Multi-Turn Reasoning with Llamaberry:</strong></p>
<p>Turn 1:</p>
<pre><code class="lang-plaintext">AI: Let's approach this step-by-step:

1. Identify current AI capabilities
   Explanation: AI can perform data analysis, pattern recognition, and automate routine tasks.

2. Analyze vulnerable job sectors
   Explanation: Manufacturing, customer service, and data entry are likely to be affected.

3. Consider new job opportunities
   Explanation: AI development, data science, and AI ethics consultation may see growth.

4. Examine potential economic effects
   Explanation: Increased productivity might lead to economic growth, but wealth distribution could be a challenge.

Initial Answer: AI will likely cause significant shifts in the job market, with both job losses in some sectors and new opportunities in others.
</code></pre>
<p>Turn 2:</p>
<pre><code class="lang-plaintext">AI: Let's refine our thinking:

1. Reassess the timeline of AI impact
   Explanation: The effects won't be immediate but will unfold over years or decades.

2. Consider the adaptability of the workforce
   Explanation: Retraining and education will be crucial for workers to transition to new roles.

3. Analyze potential societal changes
   Explanation: The nature of work itself might change, with more emphasis on creativity and emotional intelligence.

4. Examine global differences
   Explanation: AI's impact will vary across different countries and economies.

Refined Answer: AI's impact on the job market will be complex and gradual, requiring proactive measures in education and economic policy to ensure a smooth transition.
</code></pre>
<p>Turn 3:</p>
<pre><code class="lang-plaintext">AI: Let's delve deeper:

1. Explore the concept of human-AI collaboration
   Explanation: Many jobs may evolve to involve working alongside AI rather than being replaced by it.

2. Consider the ethical implications
   Explanation: Issues like algorithmic bias in hiring processes need to be addressed.

3. Analyze the potential for new economic models
   Explanation: Universal Basic Income or reduced working hours might become more prominent discussions.

4. Examine historical parallels
   Explanation: Previous technological revolutions, like the Industrial Revolution, can offer insights into potential outcomes.

Further Refined Answer: While AI will undoubtedly transform the job market, its impact will likely be nuanced, involving collaboration between humans and AI, ethical considerations, and potentially new economic paradigms.
</code></pre>
<p>Synthesis:</p>
<pre><code class="lang-plaintext">AI: After careful consideration across multiple turns of reasoning, we can synthesize our thoughts on the potential impacts of artificial intelligence on the job market:

1. Job Transformation: AI will lead to significant changes in the job market, but these changes will be gradual and complex. While some jobs may be automated, many others will be transformed to involve collaboration between humans and AI. Industries like manufacturing, customer service, and data entry are likely to see the most immediate impacts.

2. New Opportunities: The rise of AI will create new job categories, particularly in fields like AI development, data science, AI ethics consultation, and human-AI interaction design. There will likely be an increased demand for skills in creativity, emotional intelligence, and complex problem-solving that AI cannot easily replicate.

3. Economic and Social Implications: The integration of AI into the workforce has the potential to increase productivity and economic growth. However, it also raises concerns about wealth distribution and may necessitate discussions about new economic models, such as Universal Basic Income or reduced working hours.

4. Education and Adaptation: The changing job market will require a strong emphasis on continuous learning and adaptation. Educational systems and workforce training programs will need to evolve to prepare people for the new reality of working alongside AI.

5. Ethical Considerations: As AI becomes more prevalent in hiring and work processes, addressing issues like algorithmic bias and ensuring fair opportunities for all workers will be crucial.

6. Global Variations: The impact of AI on the job market will not be uniform across the globe. Different countries and economies will experience varying effects based on their economic structures, technological readiness, and policy responses.

7. Historical Context: While AI represents a significant shift, it's important to view it in the context of previous technological revolutions. Like the Industrial Revolution, the AI revolution will likely lead to short-term disruptions but potentially long-term benefits if managed properly.

In conclusion, while AI will undoubtedly have profound impacts on the job market, these impacts will be multifaceted and evolve over time. The key to navigating this transition successfully lies in proactive measures in education, economic policy, and ethical governance, ensuring that the benefits of AI are broadly shared while mitigating potential negative effects.
</code></pre>
<p>As you can see, the multi-turn approach allows for a much more nuanced, comprehensive analysis of the question. It considers multiple angles, refines its thinking, and produces a more thoughtful and well-rounded final answer.</p>
<h2 id="heading-why-is-this-cool">Why Is This Cool? 😎</h2>
<ol>
<li><p><strong>Depth of Thought</strong>: By thinking in multiple turns, the AI can dig deeper into problems, considering angles it might miss in a single pass.</p>
</li>
<li><p><strong>Self-Improvement</strong>: The AI gets to critique and improve its own thinking, just like we do when we mull over a problem.</p>
</li>
<li><p><strong>Transparency</strong>: We can see the AI's thought process, making it easier to understand and trust its conclusions.</p>
</li>
<li><p><strong>Flexibility</strong>: This approach can handle a wide range of questions, from simple queries to complex problems that require nuanced thinking.</p>
</li>
<li><p><strong>Markdown Magic</strong>: By using markdown formatting, our AI's thoughts are not just thorough, but also beautifully structured and easy to read!</p>
</li>
</ol>
<h2 id="heading-real-world-applications">Real-World Applications 🌍</h2>
<p>The multi-turn reasoning approach of Llamaberry has numerous potential applications across various fields:</p>
<ol>
<li><p><strong>Medical Diagnosis</strong>: Doctors could use this system to get a second opinion, with the AI considering multiple angles of a patient's symptoms and medical history.</p>
</li>
<li><p><strong>Legal Analysis</strong>: Lawyers could use it to analyze complex cases, considering multiple interpretations of the law and previous case precedents.</p>
</li>
<li><p><strong>Scientific Research</strong>: Researchers could use it to generate hypotheses, considering multiple angles and potential experimental approaches.</p>
</li>
<li><p><strong>Business Strategy</strong>: Executives could use it to analyze market trends and make strategic decisions, considering multiple scenarios and potential outcomes.</p>
</li>
<li><p><strong>Education</strong>: Teachers could use it to create personalized learning plans, considering multiple aspects of a student's learning style and needs.</p>
</li>
</ol>
<h2 id="heading-limitations-and-ethical-considerations">Limitations and Ethical Considerations 🚧</h2>
<p>While Llamaberry and similar multi-turn reasoning systems are powerful, it's important to acknowledge their limitations and potential ethical concerns:</p>
<ol>
<li><p><strong>Bias</strong>: Like all AI systems, Llamaberry can potentially perpetuate biases present in its training data.</p>
</li>
<li><p><strong>Hallucination</strong>: Even with multiple turns, AI can sometimes generate plausible-sounding but incorrect information.</p>
</li>
<li><p><strong>Computational Cost</strong>: Multi-turn reasoning requires more computational resources than single-turn approaches.</p>
</li>
<li><p><strong>Over-reliance</strong>: There's a risk of people relying too heavily on AI for decision-making without appropriate human oversight.</p>
</li>
<li><p><strong>Privacy</strong>: When used with sensitive information, proper data protection measures must be in place.</p>
</li>
</ol>
<h2 id="heading-how-to-get-started">How to Get Started 🚀</h2>
<p>Want to try out Llamaberry for yourself? I've got great news – it's now easier than ever! Thanks to the magic of Replit, you can start experimenting with Llamaberry in just a few clicks. Here's how:</p>
<ol>
<li><p><strong>Head to Replit</strong>: Click on this link to access the Llamaberry template: <a target="_blank" href="https://replit.com/@MartinBowling/Llamaberry-Powered-By-Groq?v=1">https://replit.com/@MartinBowling/Llamaberry-Powered-By-Groq?v=1</a></p>
</li>
<li><p><strong>Fork the Template</strong>: Once you're on the Replit page, click the "Fork" button to create your own copy of the Llamaberry project.</p>
</li>
<li><p><strong>Get Your Groq API Key</strong>: If you haven't already, sign up for a Groq account and obtain your API key.</p>
</li>
<li><p><strong>Set Up Your Environment</strong>: In your forked Replit project, find the "Secrets" tab (it looks like a lock icon). Add a new secret with the key <code>GROQ_API_KEY</code> and your Groq API key as the value.</p>
</li>
<li><p><strong>Run the Project</strong>: Click the "Run" button at the top of the Replit interface. This will start up the Llamaberry application.</p>
</li>
<li><p><strong>Start Experimenting</strong>: Once the application is running, you'll see a Gradio interface where you can input your questions and see Llamaberry's multi-turn reasoning in action!</p>
</li>
</ol>
<p>That's it! No need to worry about Python versions, installing dependencies, or configuring your local environment. Replit takes care of all that for you, so you can focus on the fun part – exploring AI reasoning!</p>
<p>Remember, the output will be in beautiful markdown format, making it easy to read and understand the AI's thought process. You can ask questions, analyze the reasoning steps, and even modify the code to experiment with different prompts or parameters.</p>
<p>So what are you waiting for? Dive in and start your AI reasoning adventure with Llamaberry! 🚀🧠🍓</p>
<h2 id="heading-wrapping-up">Wrapping Up 🎁</h2>
<p>And there you have it, folks! Llamaberry is my attempt at making AI reasoning a bit more... well, reasonable. It's not perfect (what is?), but it's a step towards AI that can think more like us humans - iteratively, reflectively, and thoroughly. And now, with markdown formatting, it's not just smart thinking, it's pretty thinking too!</p>
<p>Next time you're pondering a tough question, remember: even AIs sometimes need to sleep on it (or at least think about it three times) before coming up with their best answer. And isn't that just deliciously human of them?</p>
<p>Until next time, keep your thoughts flowing and your berries sweet!</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/Llamaberry-Powered-By-Groq?v=1">https://replit.com/@MartinBowling/Llamaberry-Powered-By-Groq?v=1</a></div>
]]></content:encoded></item><item><title><![CDATA[Unleashing the Power of Speculative RAG]]></title><description><![CDATA[🎉🚀 FriYAY Demo: Speculative RAG 🤖🔮
Today we're diving into an intriguing development in the world of artificial intelligence: Speculative Retrieval-Augmented Generation, commonly known as Speculative RAG. This technique represents an interesting ...]]></description><link>https://martinbowling.com/unleashing-the-power-of-speculative-rag</link><guid isPermaLink="true">https://martinbowling.com/unleashing-the-power-of-speculative-rag</guid><category><![CDATA[AI]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[Speculative RAG]]></category><category><![CDATA[Python]]></category><category><![CDATA[groq]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Fri, 23 Aug 2024 16:44:48 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/UZe35tk5UoA/upload/9ddfcfe1dafb8240755e16506ea69b83.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-friyay-demo-speculative-rag">🎉🚀 FriYAY Demo: Speculative RAG 🤖🔮</h1>
<p>Today we're diving into an intriguing development in the world of artificial intelligence: Speculative Retrieval-Augmented Generation, commonly known as Speculative RAG. This technique represents an interesting step forward in how AI systems can process and respond to complex queries. Let's explore how Speculative RAG works and why it's garnering attention in AI circles.</p>
<h2 id="heading-whats-the-big-deal-with-speculative-rag">What's the Big Deal with Speculative RAG?</h2>
<p>Imagine you're at a party, and someone asks a question. You've got a few friends there: the Quick Thinker, the Bookworm, and the Judge. Speculative RAG is like having these three friends work together to give the best possible answer. Let's break it down:</p>
<ol>
<li><p><strong>The Quick Thinker (Generalist LLM)</strong>: This is the friend who quickly decides if the question needs a simple answer or if it's time to call in the big guns.</p>
</li>
<li><p><strong>The Bookworm (Specialist LLM)</strong>: If the question is tricky, this friend dives into the books (or in our case, the provided documents) and comes up with detailed answers.</p>
</li>
<li><p><strong>The Judge (Evaluator LLM)</strong>: This friend listens to all the answers from the Bookworm and picks the best one.</p>
</li>
</ol>
<p>Now, let's take a look at the technical implementation of Speculative RAG and see how these components work together in practice.</p>
<h2 id="heading-the-code-where-the-magic-happens">The Code: Where the Magic Happens</h2>
<p>Let's walk through the key parts of our Speculative RAG implementation. Don't worry if you're not a coding guru – I'll break it down for you!</p>
<h3 id="heading-setting-the-stage">Setting the Stage</h3>
<p>First, we need to import our tools and set up our AI models:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> asyncio
<span class="hljs-keyword">from</span> groq <span class="hljs-keyword">import</span> AsyncGroq
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Tuple

<span class="hljs-comment"># Initialize Groq client</span>
client = AsyncGroq(api_key=os.environ.get(<span class="hljs-string">"GROQ_API_KEY"</span>))

<span class="hljs-comment"># Define models</span>
generalist_model = <span class="hljs-string">"llama-3.1-8b-instant"</span>
specialist_model = <span class="hljs-string">"mixtral-8x7b-32768"</span>
</code></pre>
<p>Here, we're using the Groq API to access our AI models. The generalist is quick on its feet, while the specialist is the heavyweight champion of knowledge.</p>
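<p>The <code>call_llm</code> helper used throughout the snippets below isn't shown in the post; a minimal sketch matching the call sites (model name first, then messages, temperature, and max tokens), built on the <code>client</code> initialized above, might be:</p>
<pre><code class="lang-python"># Assumed wrapper around the AsyncGroq client initialized above
async def call_llm(model: str, messages: List[dict], temperature: float, max_tokens: int) -&gt; str:
    response = await client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content
</code></pre>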
<h3 id="heading-the-quick-thinker-in-action">The Quick Thinker in Action</h3>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generalist_llm</span>(<span class="hljs-params">query: str</span>) -&gt; Tuple[bool, str]:</span>
    messages = [
        {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: generalist_system_prompt},
        {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: query}
    ]
    response = <span class="hljs-keyword">await</span> call_llm(generalist_model, messages, temperature=<span class="hljs-number">0.5</span>, max_tokens=<span class="hljs-number">10</span>)
    is_complex = response.lower().strip() == <span class="hljs-string">'yes'</span>
    <span class="hljs-keyword">return</span> is_complex, <span class="hljs-string">f"Generalist decision: <span class="hljs-subst">{<span class="hljs-string">'Knowledge-intensive'</span> <span class="hljs-keyword">if</span> is_complex <span class="hljs-keyword">else</span> <span class="hljs-string">'Simple'</span>}</span>"</span>
</code></pre>
<p>This function is our Quick Thinker. It looks at the question and decides if we need to call in the Bookworm or if it can handle it solo. The <code>temperature=0.5</code> keeps it focused, and <code>max_tokens=10</code> ensures a quick yes/no response.</p>
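<p>The <code>generalist_system_prompt</code> itself isn't included in the post, but since the caller checks for a literal lowercase "yes", it presumably demands a bare yes/no answer. A hedged sketch:</p>
<pre><code class="lang-python"># Assumed prompt: the real one isn't shown in the post
generalist_system_prompt = (
    "You are a query classifier. Decide whether answering the user's question "
    "requires consulting the provided documents (knowledge-intensive) or can "
    "be answered directly. Respond with only 'yes' if it is knowledge-intensive, "
    "or 'no' if it is simple."
)
</code></pre>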
<h3 id="heading-the-bookworms-deep-dive">The Bookworm's Deep Dive</h3>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">specialist_llm</span>(<span class="hljs-params">query: str, document: str</span>) -&gt; Tuple[str, str, str]:</span>
    messages = [
        {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: specialist_system_prompt},
        {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">f"Query: <span class="hljs-subst">{query}</span>\n\nDocument:\n<span class="hljs-subst">{document}</span>"</span>}
    ]
    response = <span class="hljs-keyword">await</span> call_llm(specialist_model, messages, temperature=<span class="hljs-number">0.8</span>, max_tokens=<span class="hljs-number">2048</span>)

    parts = response.split(<span class="hljs-string">"Rationale:"</span>, <span class="hljs-number">1</span>)
    draft = parts[<span class="hljs-number">0</span>].strip()
    rationale = parts[<span class="hljs-number">1</span>].strip() <span class="hljs-keyword">if</span> len(parts) &gt; <span class="hljs-number">1</span> <span class="hljs-keyword">else</span> <span class="hljs-string">"No explicit rationale provided."</span>

    <span class="hljs-keyword">return</span> draft, rationale, <span class="hljs-string">f"Specialist generated a draft."</span>
</code></pre>
<p>Here's where the Bookworm shines. It takes the question and a document, then generates a detailed answer (draft) and explains its thinking (rationale). We use a higher <code>temperature=0.8</code> to encourage creative thinking and allow for longer responses with <code>max_tokens=2048</code>.</p>
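<p>The <code>specialist_system_prompt</code> isn't reproduced in the post either, but the <code>split("Rationale:", 1)</code> above tells us its required output shape. A plausible sketch under that assumption:</p>
<pre><code class="lang-python"># Assumed prompt: reconstructed from the "Rationale:" split above
specialist_system_prompt = """You are a careful research assistant. Using only the provided document, write a detailed draft answer to the query. After the draft, explain your reasoning in a section that begins with the word "Rationale:"."""
</code></pre>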
<h3 id="heading-the-judges-wise-decision">The Judge's Wise Decision</h3>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">evaluator_llm</span>(<span class="hljs-params">query: str, drafts_and_rationales: List[Tuple[str, str]]</span>) -&gt; Tuple[int, str, str]:</span>
    drafts_text = <span class="hljs-string">"\n\n"</span>.join([<span class="hljs-string">f"Draft <span class="hljs-subst">{i+<span class="hljs-number">1</span>}</span>:\n<span class="hljs-subst">{draft}</span>\nRationale:\n<span class="hljs-subst">{rationale}</span>"</span> 
                               <span class="hljs-keyword">for</span> i, (draft, rationale) <span class="hljs-keyword">in</span> enumerate(drafts_and_rationales)])
    messages = [
        {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: evaluator_system_prompt},
        {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">f"Query: <span class="hljs-subst">{query}</span>\n\nDrafts and Rationales:\n<span class="hljs-subst">{drafts_text}</span>"</span>}
    ]
    response = <span class="hljs-keyword">await</span> call_llm(generalist_model, messages, temperature=<span class="hljs-number">0.3</span>, max_tokens=<span class="hljs-number">512</span>)

    lines = response.split(<span class="hljs-string">'\n'</span>)
    best_draft_num = int(lines[<span class="hljs-number">0</span>].split(<span class="hljs-string">':'</span>)[<span class="hljs-number">1</span>].strip()) - <span class="hljs-number">1</span>
    rationale = <span class="hljs-string">'\n'</span>.join(lines[<span class="hljs-number">1</span>:]).strip()

    <span class="hljs-keyword">return</span> best_draft_num, rationale, <span class="hljs-string">f"Evaluator selected Draft <span class="hljs-subst">{best_draft_num + <span class="hljs-number">1</span>}</span> as the best."</span>
</code></pre>
<p>The Judge (Evaluator) reviews all the drafts, picks the best one, and explains why. We use a lower <code>temperature=0.3</code> here because we want a more deterministic, focused decision.</p>
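<p>Again, <code>evaluator_system_prompt</code> isn't shown, but the parsing code implies its output format: a first line like <code>Best Draft: 2</code>, followed by the rationale on subsequent lines. A sketch consistent with that:</p>
<pre><code class="lang-python"># Assumed prompt: reconstructed from how the response is parsed above
evaluator_system_prompt = """You are an impartial judge. Review each draft answer and its rationale, then choose the one that best answers the query.

Respond in exactly this format:
Best Draft: &lt;number&gt;
&lt;your rationale for the choice&gt;"""
</code></pre>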
<h3 id="heading-putting-it-all-together">Putting It All Together</h3>
<p>The heart of our Speculative RAG system is the <code>speculative_rag</code> function:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">speculative_rag</span>(<span class="hljs-params">query: str, documents: List[str]</span>) -&gt; Tuple[str, str]:</span>
    process_log = []

    <span class="hljs-comment"># Step 1: Quick Thinker decides</span>
    is_complex, gen_log = <span class="hljs-keyword">await</span> generalist_llm(query)
    process_log.append(gen_log)

    <span class="hljs-keyword">if</span> is_complex:
        <span class="hljs-comment"># Step 2: Bookworm generates drafts</span>
        tasks = [process_document(query, doc) <span class="hljs-keyword">for</span> doc <span class="hljs-keyword">in</span> documents]
        drafts_and_rationales = <span class="hljs-keyword">await</span> asyncio.gather(*tasks)
        process_log.append(<span class="hljs-string">f"Specialist generated <span class="hljs-subst">{len(drafts_and_rationales)}</span> drafts."</span>)

        <span class="hljs-comment"># Step 3: Judge selects the best draft</span>
        best_draft_num, eval_rationale, eval_log = <span class="hljs-keyword">await</span> evaluator_llm(query, drafts_and_rationales)
        process_log.append(eval_log)

        <span class="hljs-comment"># Step 4: Craft final response using the best draft</span>
        best_draft = drafts_and_rationales[best_draft_num][<span class="hljs-number">0</span>]
        final_response, final_log = <span class="hljs-keyword">await</span> final_response_llm(query, best_draft, eval_rationale)
        process_log.append(final_log)
    <span class="hljs-keyword">else</span>:
        <span class="hljs-comment"># For simple queries, Quick Thinker handles it</span>
        final_response = <span class="hljs-keyword">await</span> call_llm(generalist_model, [
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"You are a helpful assistant. Please answer the following query concisely."</span>},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: query}
        ], temperature=<span class="hljs-number">0.7</span>, max_tokens=<span class="hljs-number">512</span>)
        process_log.append(<span class="hljs-string">f"Simple query: Generalist provided the response."</span>)

    <span class="hljs-keyword">return</span> final_response, <span class="hljs-string">"\n"</span>.join(process_log)
</code></pre>
<p>This function orchestrates the whole show. It's like the party host making sure everyone plays their part:</p>
<ol>
<li><p>The Quick Thinker assesses the question.</p>
</li>
<li><p>If it's complex, the Bookworm generates answers for each document.</p>
</li>
<li><p>The Judge picks the best answer.</p>
</li>
<li><p>We use the best answer to craft a final, polished response.</p>
</li>
</ol>
<p>If the question is simple, the Quick Thinker handles it directly. Efficiency at its finest!</p>
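<p>One helper worth calling out: <code>process_document</code> isn't defined in the post (nor is <code>final_response_llm</code>). Judging by how <code>speculative_rag</code> consumes its output, it's likely just a thin async wrapper that runs the specialist on one document and returns the (draft, rationale) pair; a sketch under that assumption:</p>
<pre><code class="lang-python"># Assumed glue: run the specialist on one document, keep draft + rationale
async def process_document(query: str, document: str) -&gt; Tuple[str, str]:
    draft, rationale, _log = await specialist_llm(query, document)
    return draft, rationale
</code></pre>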
<h2 id="heading-why-this-is-cool">Why This is Cool</h2>
<p>Speculative RAG is like having a team of AI experts working together to answer your questions. It's smart enough to handle simple queries quickly and to dive deep when needed. Plus, it can use multiple documents to generate comprehensive answers.</p>
<p>The best part? It's all happening behind the scenes. To the user, it looks like they're just asking a question and getting a great answer. But we know the truth – there's a whole party of AI friends working hard to make it happen!</p>
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>And there you have it, folks! We've just taken a whirlwind tour of Speculative RAG. We've seen how it combines the speed of a quick-thinking generalist with the depth of a knowledgeable specialist and the wisdom of a fair judge.</p>
<p>This approach opens up exciting possibilities for more efficient, accurate, and context-aware AI interactions. Whether you're building a chatbot, a question-answering system, or just exploring the frontiers of AI, Speculative RAG is definitely a technique worth keeping in your toolbox.</p>
<p>Remember, the key to great AI is not just about having a big brain – it's about knowing when to think quick, when to dig deep, and when to call for a second opinion. Speculative RAG does all of that and more!</p>
<p>So, next time you're chatting with an AI and it gives you a surprisingly good answer, you can smile and think, "I know what's going on behind the curtain!" And who knows? Maybe you'll be inspired to create your own AI party with Speculative RAG as the guest of honor.</p>
<p>Until next time, keep coding, keep learning, and keep pushing the boundaries of what's possible with AI!</p>
<p>Grab the code at <a target="_blank" href="https://replit.com/@MartinBowling/Speculative-RAG-with-Groq?v=1">replit</a></p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/Speculative-RAG-with-Groq?v=1">https://replit.com/@MartinBowling/Speculative-RAG-with-Groq?v=1</a></div>
]]></content:encoded></item><item><title><![CDATA[Building Secret Shopper:]]></title><description><![CDATA[FriYAY: Meet Secret Shopper
Happy Friday 🎉 I'm excited to share a project I've been working on that combines two powerful tools: Highlight and MultiOn. 🛠️💡But before we dive into the code, I want to take a moment to highlight some amazing opportun...]]></description><link>https://martinbowling.com/building-secret-shopper</link><guid isPermaLink="true">https://martinbowling.com/building-secret-shopper</guid><category><![CDATA[MultiOn]]></category><category><![CDATA[Next.js]]></category><category><![CDATA[highlight]]></category><category><![CDATA[OCR ]]></category><category><![CDATA[#agent]]></category><category><![CDATA[AI]]></category><category><![CDATA[ai agents]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Fri, 16 Aug 2024 20:36:24 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/vHvUqnt31wU/upload/e84c169c2e6f491358f2cbea2df9cb44.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-friyay-meet-secret-shopper">FriYAY: Meet Secret Shopper</h1>
<p>Happy Friday 🎉 I'm excited to share a project I've been working on that combines two powerful tools: Highlight and MultiOn. 🛠️💡But before we dive into the code, I want to take a moment to highlight some amazing opportunities for developers in the Highlight ecosystem.</p>
<h2 id="heading-getting-started-with-highlight">Getting Started with Highlight</h2>
<p>First off, if you're not already familiar with Highlight, their comprehensive <a target="_blank" href="https://docs.highlight.ing">developer docs</a> are an absolute goldmine of information. Whether you're just getting started or looking to level up your Highlight skills, these docs are your go-to resource.</p>
<p>For those who love to connect and collaborate, the <a target="_blank" href="https://highlight.ing/discord">Highlight Discord community</a> is the place to be. It's a vibrant hub where developers share ideas, troubleshoot issues, and showcase their latest creations.</p>
<h3 id="heading-highlight-grant-program">Highlight Grant Program</h3>
<p>Now, here's something that might really pique your interest: Highlight's incredible grant program for developers. They're offering $10,000 per developer, with $5,000 upfront and another $5,000 upon app publication. The best part? No equity is taken! But wait, there's more:</p>
<ul>
<li><p>Free office space in NYC's Upper West Side whenever you need it (perfect for those intense coding sessions or team meetups)</p>
</li>
<li><p>An additional $20,000 follow-on grant available, depending on your app's post-launch traction</p>
</li>
</ul>
<p>If you're ready to take your development journey to the next level with Highlight, you can <a target="_blank" href="https://docs.google.com/forms/d/e/1FAIpQLSc02_q5xyG_VeIQuc0YXfoR2RiVwuBNGkh8WmUTENasUQ3oVQ/viewform">sign up for the grant program here</a>.</p>
<h2 id="heading-introducing-multion-your-ai-agent-api">Introducing MultiOn: Your AI Agent API</h2>
<p>To get started with this project, you'll also need a MultiOn API key. MultiOn offers an Agent API that allows you to build and embed AI agents that can complete tasks and workflows on the web for your users. You can <a target="_blank" href="https://www.multion.ai/api">get your API key here</a>.</p>
<h3 id="heading-what-multion-offers">What MultiOn Offers:</h3>
<ul>
<li><p><strong>Versatile Actions</strong>: From ordering on Amazon to sending calendar invites, booking flights, or controlling the AWS console, MultiOn's general-purpose agents can handle a wide range of tasks.</p>
</li>
<li><p><strong>Cross-Sector Applications</strong>: Whether you're in e-commerce, travel, financial services, or developing smart devices, MultiOn can automate tasks and assist customers.</p>
</li>
<li><p><strong>Seamless Integration</strong>: MultiOn allows users to complete purchases or bookings without leaving your website, fill complex forms, and aggregate data from various online sources.</p>
</li>
</ul>
<p>With these resources and support at your fingertips, the possibilities are endless. And speaking of possibilities, let me show you what I've been cooking up with Highlight and MultiOn...</p>
<h2 id="heading-the-power-of-context-aware-computing">The Power of Context-Aware Computing</h2>
<p>Before we dive into the code, let's talk about why this project is so cool. We're living in an age where our computers can understand what's on our screens and act on that information in real-time. That's where Highlight comes in – it's a game-changing platform that allows developers to create apps that can "see" and interpret what's on a user's screen.</p>
<p>Pair that with MultiOn, a tool that can interact with web pages programmatically, and you've got a recipe for some seriously powerful applications.</p>
<h2 id="heading-introducing-secret-shopper">Introducing Secret Shopper</h2>
<p>Secret Shopper is a background shopping tool that works quietly while you browse. It detects when you're looking at a product, then automatically fetches price comparisons from other retailers. Let's break down how it works:</p>
<h3 id="heading-1-screen-content-analysis">1. Screen Content Analysis</h3>
<p>The heart of Secret Shopper is its ability to understand what's on your screen. We use Highlight's OCR capabilities to read the screen content, then pass that to a language model for classification. Here's how it works in detail:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">classifyProductPage</span>(<span class="hljs-params">ocrContent: <span class="hljs-built_in">string</span></span>): <span class="hljs-title">Promise</span>&lt;<span class="hljs-title">ProductClassificationResult</span>&gt; </span>{
  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> messages: LLMMessage[] = [
      {
        role: <span class="hljs-string">"system"</span>,
        content: <span class="hljs-string">"You are an AI trained to analyze OCR content and determine if it's from a product page..."</span>
      },
      {
        role: <span class="hljs-string">"user"</span>,
        content: <span class="hljs-string">`Analyze this OCR content: <span class="hljs-subst">${ocrContent}</span>`</span>
      }
    ];

    <span class="hljs-keyword">const</span> textPrediction = <span class="hljs-keyword">await</span> Highlight.inference.getTextPrediction(messages);
    <span class="hljs-keyword">let</span> fullResponse = <span class="hljs-string">''</span>;
    <span class="hljs-keyword">for</span> <span class="hljs-keyword">await</span> (<span class="hljs-keyword">const</span> chunk <span class="hljs-keyword">of</span> textPrediction) {
      fullResponse += chunk;
    }

    <span class="hljs-keyword">const</span> jsonMatch = fullResponse.match(<span class="hljs-regexp">/\{[\s\S]*\}/</span>);

    <span class="hljs-keyword">if</span> (jsonMatch) {
      <span class="hljs-keyword">const</span> jsonString = jsonMatch[<span class="hljs-number">0</span>];
      <span class="hljs-keyword">return</span> <span class="hljs-built_in">JSON</span>.parse(jsonString);
    } <span class="hljs-keyword">else</span> {
      <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Failed to extract JSON from the response'</span>);
    }
  } <span class="hljs-keyword">catch</span> (error) {
    <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error classifying the page:'</span>, error);
    <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Failed to classify the page.'</span>);
  }
}
</code></pre>
<p>This function does several important things:</p>
<ol>
<li><p>It sets up a conversation with the AI model, giving it instructions on how to analyze the OCR content.</p>
</li>
<li><p>It uses Highlight's <code>getTextPrediction</code> method to get a response from the AI.</p>
</li>
<li><p>It processes the response, extracting the JSON part that contains the classification results.</p>
</li>
</ol>
<p>We also have an alternative implementation using the Groq API, which shows how you can easily swap out different AI providers:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">classifyProductPageGroq</span>(<span class="hljs-params">ocrContent: <span class="hljs-built_in">string</span></span>): <span class="hljs-title">Promise</span>&lt;<span class="hljs-title">ProductClassificationResult</span>&gt; </span>{
  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> chatCompletion = <span class="hljs-keyword">await</span> groq.chat.completions.create({
      messages: [
        {
          role: <span class="hljs-string">"system"</span>,
          content: <span class="hljs-string">"You are an AI trained to analyze OCR content..."</span>
        },
        {
          role: <span class="hljs-string">"user"</span>,
          content: <span class="hljs-string">`Analyze this OCR content: <span class="hljs-subst">${ocrContent}</span>`</span>
        }
      ],
      model: <span class="hljs-string">"llama-3.1-70b-versatile"</span>,
      temperature: <span class="hljs-number">0.5</span>,
      max_tokens: <span class="hljs-number">1024</span>,
      top_p: <span class="hljs-number">1</span>,
      stream: <span class="hljs-literal">false</span>,
      stop: <span class="hljs-literal">null</span>
    });

    <span class="hljs-comment">// ... process and return the result</span>
  } <span class="hljs-keyword">catch</span> (error) {
    <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error classifying the page:'</span>, error);
    <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Failed to classify the page.'</span>);
  }
}
</code></pre>
<h3 id="heading-2-fetching-price-comparisons">2. Fetching Price Comparisons</h3>
<p>Once we know we're looking at a product, it's time to fetch some price comparisons. This is where MultiOn comes in:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">retrieveProductListings</span>(<span class="hljs-params">product: <span class="hljs-built_in">string</span>, multionKey: <span class="hljs-built_in">string</span></span>): <span class="hljs-title">Promise</span>&lt;<span class="hljs-title">ProductListing</span>[]&gt; </span>{
  <span class="hljs-keyword">const</span> multion = <span class="hljs-keyword">new</span> MultiOnClient({ apiKey: multionKey });

  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> multion.retrieve({
      cmd: <span class="hljs-string">`Get all non-sponsored listings on Google for <span class="hljs-subst">${product}</span> with the price, description, and URL...`</span>,
      url: <span class="hljs-string">`https://www.google.com/search?q=<span class="hljs-subst">${<span class="hljs-built_in">encodeURIComponent</span>(product)}</span>`</span>,
      fields: [<span class="hljs-string">"price"</span>, <span class="hljs-string">"description"</span>, <span class="hljs-string">"url"</span>]
    });

    <span class="hljs-comment">// ... process and return the results</span>
  } <span class="hljs-keyword">catch</span> (error) {
    <span class="hljs-built_in">console</span>.error(<span class="hljs-string">"Error retrieving product listings:"</span>, error);
    <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">"Failed to retrieve product listings."</span>);
  }
}
</code></pre>
<p>This function uses MultiOn to search for our product and return a list of prices and retailers. It's like having a little web-scraping genie at your command! 🧞‍♂️</p>
<h3 id="heading-3-managing-state-with-react-context">3. Managing State with React Context</h3>
<p>We use React's Context API to manage the state of our application. Here's a snippet from our <code>ProductContext</code>:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> ProductProvider: React.FC&lt;{ children: React.ReactNode }&gt; = <span class="hljs-function">(<span class="hljs-params">{ children }</span>) =&gt;</span> {
  <span class="hljs-keyword">const</span> [productGroups, setProductGroups] = useState&lt;ProductGroup[]&gt;([]);
  <span class="hljs-keyword">const</span> [checkInterval, setCheckInterval] = useState(<span class="hljs-number">60</span>);
  <span class="hljs-keyword">const</span> [multionKey, setMultionKey] = useState(<span class="hljs-string">''</span>);
  <span class="hljs-comment">// ... other state variables</span>

  <span class="hljs-keyword">const</span> addProductGroup = useCallback(<span class="hljs-function">(<span class="hljs-params">name: <span class="hljs-built_in">string</span>, products: ProductListing[]</span>) =&gt;</span> {
    setProductGroups(<span class="hljs-function">(<span class="hljs-params">prevGroups</span>) =&gt;</span> {
      <span class="hljs-keyword">const</span> existingGroup = prevGroups.find(<span class="hljs-function">(<span class="hljs-params">group</span>) =&gt;</span> group.name === name);
      <span class="hljs-keyword">if</span> (existingGroup) {
        <span class="hljs-comment">// Update existing group</span>
      } <span class="hljs-keyword">else</span> {
        <span class="hljs-comment">// Add new group</span>
      }
    });
  }, []);

  <span class="hljs-comment">// ... other methods</span>

  <span class="hljs-keyword">const</span> fetchProductUpdates = useCallback(<span class="hljs-keyword">async</span> () =&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> context = <span class="hljs-keyword">await</span> Highlight.user.getContext();
      <span class="hljs-keyword">const</span> ocrContent = context.environment?.ocrScreenContents || <span class="hljs-string">''</span>;

      <span class="hljs-keyword">if</span> (ocrContent) {
        <span class="hljs-keyword">const</span> classification = <span class="hljs-keyword">await</span> classifyProductPage(ocrContent);

        <span class="hljs-keyword">if</span> (classification.isProductPage || classification.isProductDiscussion || classification.isProductMention) {
          <span class="hljs-keyword">const</span> productName = classification.productName;

          <span class="hljs-keyword">if</span> (productName) {
            <span class="hljs-comment">// Fetch and store product listings</span>
            <span class="hljs-keyword">const</span> listings = <span class="hljs-keyword">await</span> retrieveProductListings(productName, multionKey);
            addProductGroup(productName, listings);
          }
        }
      }
    } <span class="hljs-keyword">catch</span> (error) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error fetching product updates:'</span>, error);
    }
  }, [productGroups, addProductGroup, multionKey, backgroundPermissionGranted]);

  <span class="hljs-comment">// ... context value and provider</span>
};
</code></pre>
<p>This context manages the state of our application, including the list of product groups, settings, and the main <code>fetchProductUpdates</code> function that ties everything together.</p>
<h3 id="heading-4-the-user-interface">4. The User Interface</h3>
<p>Our main application component, <code>SecretShopperApp</code>, uses this context to render the UI and manage the background processing:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> SecretShopperApp: React.FC = <span class="hljs-function">() =&gt;</span> {
  <span class="hljs-keyword">const</span> { 
    productGroups, 
    removeProductGroup, 
    fetchProductUpdates, 
    checkInterval,
    showOnboarding,
    setShowOnboarding
  } = useContext(ProductContext);

  <span class="hljs-keyword">const</span> [isPaused, setIsPaused] = useState(<span class="hljs-literal">false</span>);

  useEffect(<span class="hljs-function">() =&gt;</span> {
    <span class="hljs-keyword">let</span> intervalId: NodeJS.Timeout | <span class="hljs-literal">null</span> = <span class="hljs-literal">null</span>;

    <span class="hljs-keyword">if</span> (!isPaused) {
      intervalId = <span class="hljs-built_in">setInterval</span>(<span class="hljs-function">() =&gt;</span> {
        fetchProductUpdates();
      }, checkInterval * <span class="hljs-number">1000</span>);
    }

    <span class="hljs-keyword">return</span> <span class="hljs-function">() =&gt;</span> {
      <span class="hljs-keyword">if</span> (intervalId) <span class="hljs-built_in">clearInterval</span>(intervalId);
    };
  }, [fetchProductUpdates, checkInterval, isPaused]);

  <span class="hljs-comment">// ... render UI</span>
};
</code></pre>
<p>This component sets up an interval to fetch product updates regularly, but only when the app isn't paused. It's a simple way to give users control while still leveraging the power of background processing.</p>
<h2 id="heading-the-power-of-background-processing">The Power of Background Processing</h2>
<p>One of the coolest things about Secret Shopper is that it runs in the background, thanks to Highlight's background permission feature. This is managed in our <code>OnboardingScreen</code> component:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> requestBackgroundPermission = useCallback(<span class="hljs-keyword">async</span> () =&gt; {
  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> granted = <span class="hljs-keyword">await</span> Highlight.permissions.requestBackgroundPermission();
    <span class="hljs-keyword">if</span> (granted) {
      <span class="hljs-keyword">await</span> Highlight.app.setBackgroundStatus(<span class="hljs-literal">true</span>);
      setBackgroundPermissionGranted(<span class="hljs-literal">true</span>);
      <span class="hljs-keyword">await</span> saveToAppStorage(<span class="hljs-string">"backgroundPermissionGranted"</span>, <span class="hljs-string">"true"</span>);
      setError(<span class="hljs-literal">null</span>);
    } <span class="hljs-keyword">else</span> {
      setError(<span class="hljs-string">"Background permission was not granted. This app requires background access to function properly."</span>);
    }
    <span class="hljs-keyword">return</span> granted;
  } <span class="hljs-keyword">catch</span> (error) {
    <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error requesting background permission:'</span>, error);
    setError(<span class="hljs-string">"An error occurred while requesting background permission. Please try again."</span>);
    <span class="hljs-keyword">return</span> <span class="hljs-literal">false</span>;
  }
}, []);
</code></pre>
<p>This function requests background permission from the user and sets up the app to run in the background if granted.</p>
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>Secret Shopper demonstrates the power of context-aware computing. By leveraging Highlight's screen analysis capabilities and MultiOn's web interaction abilities, we've created an app that understands what we're looking at and acts on that information automatically.</p>
<p>The possibilities for this kind of technology are endless. Imagine an app that automatically takes notes in your meetings, or one that suggests relevant documents based on what you're working on. The future of computing is context-aware, and tools like Highlight and MultiOn are making it possible for developers to build these kinds of intelligent, responsive applications.</p>
<p>That's all for now, folks! Feel free to dive <a target="_blank" href="https://replit.com/@MartinBowling/Secret-Shopper-Highlight-and-MultiOn-Power-Shopping-Agent?v=1">into the code</a>, experiment with it, and see what you can build. Happy coding! 👨‍💻🚀</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/Secret-Shopper-Highlight-and-MultiOn-Power-Shopping-Agent?v=1">https://replit.com/@MartinBowling/Secret-Shopper-Highlight-and-MultiOn-Power-Shopping-Agent?v=1</a></div>
]]></content:encoded></item><item><title><![CDATA[Building a Safe and Fun AI Chat Experience with Llama Guard 3 🦙🛡️]]></title><description><![CDATA[Hey there, tech enthusiasts and curious minds! 👋 Today, I'm excited to share a project I've been working on that combines the power of AI with the importance of online safety. Let's dive into how we can create a moderated AI chat experience using Ll...]]></description><link>https://martinbowling.com/building-a-safe-and-fun-ai-chat-experience-with-llama-guard-3</link><guid isPermaLink="true">https://martinbowling.com/building-a-safe-and-fun-ai-chat-experience-with-llama-guard-3</guid><category><![CDATA[AI]]></category><category><![CDATA[Llama3]]></category><category><![CDATA[LLaMa]]></category><category><![CDATA[Python]]></category><category><![CDATA[gradio]]></category><category><![CDATA[repl.it]]></category><category><![CDATA[llm]]></category><category><![CDATA[AI Safety]]></category><category><![CDATA[groq]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Tue, 13 Aug 2024 00:54:41 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/QOL_IwPmj6E/upload/27fa13119b57e7159da694b1dd61980b.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there, tech enthusiasts and curious minds! 👋 Today, I'm excited to share a project I've been working on that combines the power of AI with the importance of online safety. Let's dive into how we can create a moderated AI chat experience using Llama Guard 3 and the Groq API. 🚀</p>
<h2 id="heading-why-content-moderation-matters">Why Content Moderation Matters 🤔</h2>
<p>In the world of AI and large language models, it's crucial to ensure that our digital conversations remain respectful and safe. That's where content moderation comes in. It helps us filter out potentially harmful or inappropriate content, making sure our AI assistants behave responsibly. Think of it as a friendly bouncer for your chat app! 🕴️</p>
<h2 id="heading-enter-llama-guard-3">Enter Llama Guard 3 🦙</h2>
<p>Llama Guard 3 is our superhero in this story. It's a powerful 8B parameter model designed to classify content in both AI inputs and outputs. It's like having a wise old llama watching over our conversations, making sure everything stays friendly and appropriate. 🦸‍♂️</p>
<h2 id="heading-lets-build-our-safe-chat-app">Let's Build Our Safe Chat App! 🛠️</h2>
<p>I've put together a fun little project using Python and the Gradio library to create a web-based chat interface. Here's a sneak peek at how we set things up:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> asyncio
<span class="hljs-keyword">import</span> gradio <span class="hljs-keyword">as</span> gr
<span class="hljs-keyword">from</span> groq <span class="hljs-keyword">import</span> AsyncGroq

client = AsyncGroq(api_key=os.environ.get(<span class="hljs-string">"GROQ_API_KEY"</span>))

chat_model = <span class="hljs-string">"llama3-70b-8192"</span>
moderation_model = <span class="hljs-string">"llama-guard-3-8b"</span>

system_prompt = <span class="hljs-string">"""You are a helpful, respectful, and honest assistant. Always strive to answer the user's questions to the best of your ability, while maintaining a polite and professional tone."""</span>
</code></pre>
<p>This sets up our connection to the Groq API and defines which models we'll use for chat and moderation. The system prompt gives our AI assistant its personality – think of it as its morning pep talk! ☀️</p>
<h2 id="heading-the-magic-of-moderation">The Magic of Moderation ✨</h2>
<p>Here's where Llama Guard 3 really shines. We use it to check each message before our main AI responds:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">moderate_message</span>(<span class="hljs-params">message</span>):</span>
    <span class="hljs-string">"""Moderate the user message using Llama Guard 3."""</span>
    response = <span class="hljs-keyword">await</span> client.chat.completions.create(
        model=moderation_model,
        messages=[{<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: message}],
        temperature=<span class="hljs-number">0.1</span>,
        max_tokens=<span class="hljs-number">10</span>,
    )
    <span class="hljs-keyword">return</span> response.choices[<span class="hljs-number">0</span>].message.content.strip().lower() == <span class="hljs-string">"safe"</span>
</code></pre>
<p>If Llama Guard 3 gives us the thumbs up, we proceed with generating a response. If not, we politely decline to engage with the unsafe content. It's like having a wise friend who knows when to change the subject! 😉</p>
<h2 id="heading-streaming-responses-for-a-smooth-experience">Streaming Responses for a Smooth Experience 🌊</h2>
<p>To make our chat feel more natural and responsive, we stream the AI's responses:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chat_response</span>(<span class="hljs-params">message, history</span>):</span>
    <span class="hljs-string">"""Generate a chat response, including moderation."""</span>
    is_safe = <span class="hljs-keyword">await</span> moderate_message(message)

    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> is_safe:
        <span class="hljs-keyword">yield</span> <span class="hljs-string">"I apologize, but I can't respond to that type of message. Let's keep our conversation respectful and appropriate."</span>
        <span class="hljs-keyword">return</span>

    <span class="hljs-comment"># ... (code for generating and streaming the response)</span>
</code></pre>
<p>This way, the response appears gradually, just like a real person typing. It's much more engaging than waiting for the whole message to pop up at once!</p>
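<p>The generation-and-streaming code is elided above, so here's one way that gap could be filled in. Treat this as a sketch rather than the template's exact code: it assumes Groq's streaming chat API and Gradio's <code>(user, bot)</code> pair format for <code>history</code>:</p>
<pre><code class="lang-python">async def chat_response(message, history):
    """Generate a chat response, including moderation (sketch)."""
    if not await moderate_message(message):
        yield "I apologize, but I can't respond to that type of message. Let's keep our conversation respectful and appropriate."
        return

    # Rebuild the conversation so the model has full context
    messages = [{"role": "system", "content": system_prompt}]
    for user_msg, bot_msg in history:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": bot_msg})
    messages.append({"role": "user", "content": message})

    stream = await client.chat.completions.create(
        model=chat_model,
        messages=messages,
        temperature=0.7,
        max_tokens=1024,
        stream=True,
    )

    partial = ""
    async for chunk in stream:
        partial += chunk.choices[0].delta.content or ""
        yield partial  # Gradio re-renders the growing reply
</code></pre>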
<h2 id="heading-a-pretty-interface-with-gradio">A Pretty Interface with Gradio 🎨</h2>
<p>Gradio makes it super easy to create a good-looking web interface for our chat app. Here's a snippet of how we set it up:</p>
<pre><code class="lang-python"><span class="hljs-keyword">with</span> gr.Blocks(theme=gr.themes.Soft(), fill_height=<span class="hljs-literal">True</span>, head=js) <span class="hljs-keyword">as</span> demo:
    gr.Markdown(<span class="hljs-string">"# SafeChat: Moderated AI Conversation"</span>)
    gr.Markdown(<span class="hljs-string">"This chat interface uses Llama Guard 3 to moderate messages and ensure safe interactions."</span>)

    chatbot = gr.Chatbot(height=<span class="hljs-number">450</span>, show_label=<span class="hljs-literal">False</span>)
    message = gr.Textbox(
        placeholder=<span class="hljs-string">"Type your message here..."</span>,
        label=<span class="hljs-string">"User Input"</span>,
        show_label=<span class="hljs-literal">False</span>,
        container=<span class="hljs-literal">False</span>
    )
    submit = gr.Button(<span class="hljs-string">"Send"</span>, variant=<span class="hljs-string">"primary"</span>)
    clear = gr.Button(<span class="hljs-string">"Clear"</span>)

    <span class="hljs-comment"># ... (code for handling user input and bot responses)</span>
</code></pre>
<p>This creates a clean, user-friendly interface with a chat window, input box, and buttons for sending messages and clearing the chat.</p>
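<p>The event-handling code is elided above; one plausible wiring looks like the sketch below. The <code>respond</code> helper is my own illustrative name: it streams by yielding successive chat states, which Gradio renders incrementally (this goes inside the <code>gr.Blocks()</code> context):</p>
<pre><code class="lang-python">    async def respond(message_text, chat_history):
        chat_history = chat_history + [(message_text, "")]
        async for partial in chat_response(message_text, chat_history[:-1]):
            chat_history[-1] = (message_text, partial)
            yield "", chat_history

    submit.click(respond, [message, chatbot], [message, chatbot])
    message.submit(respond, [message, chatbot], [message, chatbot])
    clear.click(lambda: None, None, chatbot, queue=False)
</code></pre>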
<h2 id="heading-try-it-out">Try It Out! 🚀</h2>
<p>I've made this project a template on Replit, so you can give it a spin right now! Just head over to <a target="_blank" href="https://replit.com/@MartinBowling/SafeChat-with-Llama-Guard-3?v=1">Replit</a> and start chatting. Feel free to ask about quantum physics, plan a trip to Japan, or discuss the finer points of machine learning algorithms. Our AI assistant is ready for a fun and safe conversation!</p>
<h2 id="heading-wrapping-up">Wrapping Up 🎁</h2>
<p>Building this safe chat experience has been a blast, and I hope you enjoy using it as much as I enjoyed creating it. It's amazing to see how we can combine cutting-edge AI technology with responsible safeguards to create something both powerful and trustworthy.</p>
<p>Remember, in the world of AI, with great power comes great responsibility. By using tools like Llama Guard 3, we can ensure that our AI interactions remain positive, informative, and safe for everyone.</p>
<p>Happy chatting, and may your conversations be ever insightful and respectful! 🌟💬</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/SafeChat-with-Llama-Guard-3?v=1">https://replit.com/@MartinBowling/SafeChat-with-Llama-Guard-3?v=1</a></div>
]]></content:encoded></item><item><title><![CDATA[Groq Schwag goes hard]]></title><description><![CDATA[Opened the door today and saw a package from Groq inc. I ripped it open and man was I blown away. Look at the sweet stuff:


Pop into their discord to join the ever growing community of Groq devs. The future is bright and thanks to Groq's LPU's it 🚀...]]></description><link>https://martinbowling.com/groq-schwag-goes-hard</link><guid isPermaLink="true">https://martinbowling.com/groq-schwag-goes-hard</guid><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Mon, 12 Aug 2024 21:19:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1723497564795/508b5edf-2544-404f-a68f-ad06b6de6193.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Opened the door today and saw a package from Groq inc. I ripped it open and man was I blown away. Look at the sweet stuff:</p>
<p><img src="https://pbs.twimg.com/media/GUz246xWcAEtt6T?format=jpg&amp;name=large" alt="Image" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1723498211767/3ff54736-14da-4489-b14f-98313f816639.jpeg" alt class="image--center mx-auto" /></p>
<p>Pop into their <a target="_blank" href="http://discord.com/invite/groq">Discord</a> to join the ever-growing community of Groq devs. The future is bright, and thanks to Groq's LPUs it's 🚀🚀🚀🚀🚀</p>
]]></content:encoded></item><item><title><![CDATA[Optimizing Prompts from Llama 3.1 70B to 8B]]></title><description><![CDATA[As an AI developer and enthusiast, I'm always looking for ways to push the boundaries of what's possible with language models. Today, I'm thrilled to share my latest project: a tool that optimizes prompts from Groq's Llama 3.1 70B model to work effic...]]></description><link>https://martinbowling.com/optimizing-prompts-from-llama-31-70b-to-8b</link><guid isPermaLink="true">https://martinbowling.com/optimizing-prompts-from-llama-31-70b-to-8b</guid><category><![CDATA[Python]]></category><category><![CDATA[groq]]></category><category><![CDATA[#Llama3_1]]></category><category><![CDATA[llama 3.1]]></category><category><![CDATA[#PromptEngineering]]></category><category><![CDATA[#AIEfficiency]]></category><category><![CDATA[#AIOptimization]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Tue, 23 Jul 2024 04:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/ulUnRNuC_ok/upload/d405d382349e07e13adbe8b6a53afd71.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As an AI developer and enthusiast, I'm always looking for ways to push the boundaries of what's possible with language models. Today, I'm thrilled to share my latest project: a tool that optimizes prompts from Groq's Llama 3.1 70B model to work efficiently with their 8B model. Built on Replit and inspired by Matt Shumer's gpt-prompt-engineer, this tool is designed to make advanced prompt optimization accessible to everyone.</p>
<h2 id="heading-the-challenge-balancing-power-and-efficiency">The Challenge: Balancing Power and Efficiency</h2>
<p>Large language models like Llama 3.1 70B are incredibly powerful, but they come with significant computational costs. On the other hand, smaller models like Llama 3.1 8B are more efficient but may lack some of the capabilities of their larger counterparts. The challenge lies in bridging this gap – how can we leverage the power of larger models to optimize prompts for smaller, more efficient ones?</p>
<h2 id="heading-the-solution-a-prompt-optimization-pipeline">The Solution: A Prompt Optimization Pipeline</h2>
<p>My tool addresses this challenge through a sophisticated prompt optimization pipeline. Here's how it works:</p>
<ol>
<li><p><strong>Task Description</strong>: Users input a description of the task they want to accomplish.</p>
</li>
<li><p><strong>Initial Prompt Generation</strong>: The tool uses the 70B model to generate an initial prompt based on the task description.</p>
</li>
<li><p><strong>Prompt Variation</strong>: It then creates multiple variations of the prompt, aiming for diversity and improvement.</p>
</li>
<li><p><strong>System Prompt Generation</strong>: A system prompt is created to guide the 8B model in understanding the task.</p>
</li>
<li><p><strong>8B Model Testing</strong>: The tool tests the optimized prompts on the 8B model, providing detailed results.</p>
</li>
</ol>
<p>Let's dive into the key components of this tool:</p>
<h3 id="heading-generating-the-initial-prompt">Generating the Initial Prompt</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_prompt</span>(<span class="hljs-params">task</span>):</span>
    completion = client.chat.completions.create(
        model=MODEL_70B,
        messages=[{
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
            <span class="hljs-string">"content"</span>: <span class="hljs-string">"You are an AI assistant that generates example prompts for given tasks. Respond with only the prompt, nothing else."</span>
        }, {
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
            <span class="hljs-string">"content"</span>: <span class="hljs-string">f"Generate a sample prompt for the following task: <span class="hljs-subst">{task}</span>"</span>
        }],
        temperature=<span class="hljs-number">0.7</span>,
        max_tokens=<span class="hljs-number">500</span>,
    )
    <span class="hljs-keyword">return</span> completion.choices[<span class="hljs-number">0</span>].message.content.strip()
</code></pre>
<p>This function uses the 70B model to generate an initial prompt based on the user's task description. It's designed to be focused and concise, providing a solid starting point for optimization.</p>
<h3 id="heading-creating-prompt-variations">Creating Prompt Variations</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_candidate_prompts</span>(<span class="hljs-params">task, prompt_example</span>):</span>
    completion = client.chat.completions.create(
        model=MODEL_70B,
        messages=[{
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
            <span class="hljs-string">"content"</span>: <span class="hljs-string">"Given an example prompt, create seven additional prompts for the same task that are even better. Ensure the new prompts are diverse and unique from one another."</span>
        }, {
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
            <span class="hljs-string">"content"</span>: <span class="hljs-string">f"Task: <span class="hljs-subst">{task}</span>\nExample prompt: <span class="hljs-subst">{prompt_example}</span>"</span>
        }],
        temperature=<span class="hljs-number">0.5</span>,
        max_tokens=<span class="hljs-number">4000</span>,
    )
    prompts = completion.choices[<span class="hljs-number">0</span>].message.content.strip().split(<span class="hljs-string">'\n'</span>)
    <span class="hljs-keyword">return</span> prompts[:<span class="hljs-number">7</span>]  <span class="hljs-comment"># Ensure we only return 7 prompts</span>
</code></pre>
<p>This function generates multiple variations of the initial prompt, aiming for diversity and improvement. By creating a range of options, we increase the chances of finding an optimal prompt for the 8B model.</p>
<h3 id="heading-generating-the-system-prompt">Generating the System Prompt</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_system_prompt</span>(<span class="hljs-params">task, prompts</span>):</span>
    completion = client.chat.completions.create(
        model=MODEL_70B,
        messages=[{
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
            <span class="hljs-string">"content"</span>: <span class="hljs-string">"Given a user-description of their task and a set of prompts, write a fantastic system prompt that describes the task to be done perfectly."</span>
        }, {
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
            <span class="hljs-string">"content"</span>: <span class="hljs-string">f"Task: <span class="hljs-subst">{task}</span>\nPrompts: <span class="hljs-subst">{json.dumps(prompts)}</span>"</span>
        }],
        temperature=<span class="hljs-number">0.5</span>,
        max_tokens=<span class="hljs-number">1000</span>,
    )
    <span class="hljs-keyword">return</span> completion.choices[<span class="hljs-number">0</span>].message.content.strip()
</code></pre>
<p>This function creates a system prompt that helps guide the 8B model in understanding and executing the task. It's a crucial step in ensuring that the smaller model can perform as effectively as possible.</p>
<h3 id="heading-testing-on-the-8b-model">Testing on the 8B Model</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_8b</span>(<span class="hljs-params">system_prompt, prompt</span>):</span>
    completion = client.chat.completions.create(
        model=MODEL_8B,
        messages=[{
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
            <span class="hljs-string">"content"</span>: system_prompt
        }, {
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
            <span class="hljs-string">"content"</span>: prompt
        }],
        temperature=<span class="hljs-number">0.5</span>,
        max_tokens=<span class="hljs-number">2000</span>,
    )
    <span class="hljs-keyword">return</span> completion.choices[<span class="hljs-number">0</span>].message.content
</code></pre>
<p>This function tests the optimized prompts on the 8B model, allowing us to evaluate the effectiveness of our optimization process.</p>
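<p>Before we get to the UI, it's worth seeing how these four helpers chain together. Here's a sketch of the end-to-end pass; the <code>optimize</code> wrapper is mine, and the actual template wires these steps through Gradio click handlers instead:</p>
<pre><code class="lang-python">def optimize(task):
    # 1. The 70B model drafts an initial prompt for the task
    initial_prompt = generate_prompt(task)

    # 2. The 70B model proposes seven diverse variations
    candidates = generate_candidate_prompts(task, initial_prompt)

    # 3. The 70B model writes a system prompt to guide the 8B model
    system_prompt = generate_system_prompt(task, candidates)

    # 4. Every candidate is run against the 8B model
    results = [(p, test_8b(system_prompt, p)) for p in candidates]
    return system_prompt, results

system_prompt, results = optimize("Summarize a news article in three bullet points")
for prompt, answer in results:
    print(prompt, "->", answer[:80])
</code></pre>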
<h2 id="heading-the-user-interface-simplifying-complexity">The User Interface: Simplifying Complexity</h2>
<p>To make this tool accessible to a wide range of users, I've implemented a Gradio interface. This provides a user-friendly way to interact with the complex underlying processes:</p>
<pre><code class="lang-python"><span class="hljs-keyword">with</span> gr.Blocks() <span class="hljs-keyword">as</span> iface:
    gr.Markdown(<span class="hljs-string">"# Llama 3.1 70B to 8B Prompt Conversion"</span>)
    gr.Markdown(<span class="hljs-string">"Convert prompts from Llama 3.1 70B to Llama 3.1 8B"</span>)

    <span class="hljs-keyword">with</span> gr.Row():
        task_input = gr.Textbox(label=<span class="hljs-string">"Task Description"</span>)
        generate_button = gr.Button(<span class="hljs-string">"Generate &amp; Test Example Prompt"</span>)

    <span class="hljs-keyword">with</span> gr.Row():
        prompt_input = gr.Textbox(label=<span class="hljs-string">"Prompt Example"</span>, lines=<span class="hljs-number">5</span>)
        initial_response = gr.Textbox(label=<span class="hljs-string">"70B Model Response"</span>, lines=<span class="hljs-number">5</span>)

    output = gr.Textbox(label=<span class="hljs-string">"Conversion Output"</span>, lines=<span class="hljs-number">30</span>)
    groq_code_output = gr.Code(
        label=<span class="hljs-string">"Groq Code with 8B System Prompt and New Prompt"</span>,
        language=<span class="hljs-string">"python"</span>)
    submit_button = gr.Button(<span class="hljs-string">"Convert and Test"</span>)

    <span class="hljs-comment"># ... (click event handlers)</span>
</code></pre>
<p>This interface allows users to input their task, generate an initial prompt, and then run the full optimization process with just a few clicks.</p>
<h2 id="heading-why-this-matters">Why This Matters</h2>
<p>This tool has significant implications for AI development and deployment:</p>
<ol>
<li><p><strong>Efficiency</strong>: By optimizing prompts for smaller models, we can reduce computational costs without sacrificing too much capability.</p>
</li>
<li><p><strong>Accessibility</strong>: It makes advanced prompt engineering techniques available to a wider audience, fostering innovation.</p>
</li>
<li><p><strong>Research</strong>: It provides a platform for studying the differences between large and small language models, potentially leading to new insights in AI development.</p>
</li>
</ol>
<h2 id="heading-try-it-yourself">Try It Yourself</h2>
<p>I've made this tool available as a Replit template, so you can try it out for yourself. Just add your Groq API key, and you're ready to go! Check it out here: </p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/Llama-31-70b-to-8b-Prompt-Conversion?v=1">https://replit.com/@MartinBowling/Llama-31-70b-to-8b-Prompt-Conversion?v=1</a></div>
<h2 id="heading-conclusion">Conclusion</h2>
<p>As we continue to push the boundaries of AI, tools like this will play a crucial role in making advanced technologies more accessible and efficient. I'm excited to see how the community will use and build upon this project. Whether you're optimizing AI deployments, studying model capabilities, or developing new prompt engineering strategies, I hope this tool will be a valuable resource in your AI journey.</p>
<p>Feel free to fork the project, contribute, or share your results. Let's work together to unlock the full potential of AI, making it more efficient and accessible for everyone!</p>
]]></content:encoded></item><item><title><![CDATA[Building Infinite Fourth Trivia]]></title><description><![CDATA[Hey there, fellow developers and trivia enthusiasts! Martin Bowling here, and I'm excited to share the story behind my latest project: Infinite Fourth Trivia. As we celebrate Independence Day, I wanted to create something that combines my love for te...]]></description><link>https://martinbowling.com/building-infinite-fourth-trivia</link><guid isPermaLink="true">https://martinbowling.com/building-infinite-fourth-trivia</guid><category><![CDATA[Gemma2]]></category><category><![CDATA[AI]]></category><category><![CDATA[aidemos]]></category><category><![CDATA[Gemma AI]]></category><category><![CDATA[groq]]></category><category><![CDATA[python development]]></category><category><![CDATA[Python]]></category><category><![CDATA[FastAPI]]></category><category><![CDATA[Google gemma]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Thu, 04 Jul 2024 04:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/1i8xRkE8gXo/upload/1c1fb817be8e25f2c7189d491cbfb746.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there, fellow developers and trivia enthusiasts! Martin Bowling here, and I'm excited to share the story behind my latest project: Infinite Fourth Trivia. As we celebrate Independence Day, I wanted to create something that combines my love for technology, American history, and good old-fashioned fun. Let me take you through the journey of building this never-ending patriotic quiz game!</p>
<h2 id="heading-the-spark-of-inspiration">The Spark of Inspiration</h2>
<p>It all started when I was thinking about how to make this year's Fourth of July celebration a bit more special. As an indie developer always looking to push the boundaries, I thought, "What if we could have an AI-powered trivia game that never runs out of questions?" And thus, Infinite Fourth Trivia was born!</p>
<h2 id="heading-leveraging-cutting-edge-ai">Leveraging Cutting-Edge AI</h2>
<p>The heart of this project lies in its ability to generate an endless stream of USA-themed questions. To make this possible, I turned to two incredible technologies:</p>
<ol>
<li><p><strong>Groq's Lightning-Fast Inference</strong>: Groq's API allowed me to generate questions at breakneck speeds, ensuring a smooth user experience.</p>
</li>
<li><p><strong>Google's Gemma 2 9b Model</strong>: This powerful language model provides the brains behind our questions, ensuring they're diverse, accurate, and engaging.</p>
</li>
</ol>
<p>Here's a snippet of how I set up the Groq client in my FastAPI backend:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> groq <span class="hljs-keyword">import</span> AsyncGroq
<span class="hljs-keyword">import</span> os

client = AsyncGroq(api_key=os.environ.get(<span class="hljs-string">"GROQ_API_KEY"</span>))
GROQ_MODEL = <span class="hljs-string">"gemma2-9b-it"</span>
</code></pre>
<h2 id="heading-crafting-the-perfect-question">Crafting the Perfect Question</h2>
<p>One of the most crucial parts of the project was designing the prompt for our AI model. I wanted to ensure that questions were not only factual but also engaging and appropriate for a Fourth of July celebration. Here's a simplified version of the prompt I used:</p>
<pre><code class="lang-python">trivia_prompt = <span class="hljs-string">f"""Generate a Fourth of July themed trivia multiple-choice question. The question should be <span class="hljs-subst">{difficulty}</span> difficulty, focusing on US independence, American Revolution, patriotic symbols, or Fourth of July traditions.

Format the response as a JSON object with the following structure:
{{
    "question": "The full text of the question",
    "options": ["A. First option", "B. Second option", "C. Third option", "D. Fourth option"],
    "correct_answer": "The letter of the correct answer (A, B, C, or D)",
    "explanation": "A brief explanation of the correct answer"
}}

Ensure that the question is factual, engaging, and appropriate for a Fourth of July celebration. Aim for a mix of historical facts, cultural traditions, and interesting tidbits about American independence and patriotism.

IMPORTANT: Do not repeat any of the following recent questions:
<span class="hljs-subst">{recent_questions_str}</span>

Reply only with the JSON object, no additional comments or explanations.
"""</span>
</code></pre>
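<p>On the backend, a FastAPI route feeds this prompt to Groq and parses the JSON reply before handing it to the frontend. The endpoint isn't reproduced here, but a minimal sketch might look like this; note that <code>build_trivia_prompt</code> and <code>recent_questions</code> are illustrative names, not the app's actual identifiers:</p>
<pre><code class="lang-python">import json

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
recent_questions = []  # rolling buffer used to avoid repeat questions

class TriviaRequest(BaseModel):
    difficulty: str

@app.post("/fourth_trivia")
async def fourth_trivia(req: TriviaRequest):
    recent_questions_str = "\n".join(recent_questions[-20:])
    # build_trivia_prompt renders the f-string template shown above (hypothetical helper)
    trivia_prompt = build_trivia_prompt(req.difficulty, recent_questions_str)

    response = await client.chat.completions.create(
        model=GROQ_MODEL,
        messages=[{"role": "user", "content": trivia_prompt}],
        temperature=0.8,
    )
    try:
        question = json.loads(response.choices[0].message.content)
    except json.JSONDecodeError:
        return {"error": "Model returned malformed JSON"}

    recent_questions.append(question["question"])
    return question
</code></pre>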
<h2 id="heading-building-a-responsive-frontend">Building a Responsive Frontend</h2>
<p>For the frontend, I wanted something that was both visually appealing and easy to use. I opted for a combination of HTML, CSS (with Tailwind for quick styling), and JavaScript. Here's a small taste of the JavaScript that powers the game logic:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">fetchAndDisplayQuestion</span>(<span class="hljs-params">difficulty</span>) </span>{
    <span class="hljs-built_in">console</span>.log(<span class="hljs-string">'Fetching and displaying question with difficulty:'</span>, difficulty);
    <span class="hljs-keyword">try</span> {
        <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> axios.post(<span class="hljs-string">'/fourth_trivia'</span>, { difficulty }, {
            <span class="hljs-attr">headers</span>: {
                <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/json'</span>
            }
        });
        <span class="hljs-built_in">console</span>.log(<span class="hljs-string">'Response received:'</span>, response.data);
        <span class="hljs-keyword">if</span> (response.data.error) {
            <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error from server:'</span>, response.data.error);
            displayError(<span class="hljs-string">"Sorry, there was an error fetching the question. Please try again."</span>);
        } <span class="hljs-keyword">else</span> {
            displayQuestion(response.data);
            prefetchNextQuestion(difficulty);  <span class="hljs-comment">// Prefetch the next question after displaying current one</span>
        }
    } <span class="hljs-keyword">catch</span> (error) {
        <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error fetching question:'</span>, error);
        displayError(<span class="hljs-string">"Sorry, there was an error connecting to the server. Please check your internet connection and try again."</span>);
    }
}
</code></pre>
<h2 id="heading-challenges-and-learnings">Challenges and Learnings</h2>
<p>Building Infinite Fourth Trivia wasn't without its challenges. Ensuring question uniqueness, handling various edge cases, and optimizing for performance were all hurdles I had to overcome. But with each challenge came a learning opportunity, and I'm excited to apply these lessons to future projects.</p>
<h2 id="heading-the-road-ahead">The Road Ahead</h2>
<p>This is just the beginning for Infinite Fourth Trivia. I have plans to expand the game with more features, such as multiplayer modes, themed question sets, and perhaps even voice-activated gameplay. The possibilities are as endless as the questions themselves!</p>
<h2 id="heading-try-it-out">Try It Out!</h2>
<p>I'm thrilled to share Infinite Fourth Trivia with all of you. Whether you're a history buff, a casual trivia fan, or just looking for a fun way to celebrate the Fourth, I hope you'll give it a try. You can find the game at <a target="_blank" href="https://infinite4thtrivia.replit.app">https://infinite4thtrivia.replit.app</a>, and the source code is available on Replit.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/Infinite-Fourth-Trivia-powered-By-Groq?v=1">https://replit.com/@MartinBowling/Infinite-Fourth-Trivia-powered-By-Groq?v=1</a></div>
<p>Happy Fourth of July, and happy trivia-ing!</p>
]]></content:encoded></item><item><title><![CDATA[Evolving Tables for Better Reasoning]]></title><description><![CDATA[As a developer always on the lookout for exciting AI advancements, I recently stumbled upon a fascinating ICLR 2024 paper titled "Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding". The concept immediately grabbed my atte...]]></description><link>https://martinbowling.com/evolving-tables-for-better-reasoning</link><guid isPermaLink="true">https://martinbowling.com/evolving-tables-for-better-reasoning</guid><category><![CDATA[ChainOfTable]]></category><category><![CDATA[groq]]></category><category><![CDATA[Python]]></category><category><![CDATA[llm]]></category><category><![CDATA[AI]]></category><category><![CDATA[Python 3]]></category><category><![CDATA[Dataanalysis]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Wed, 03 Jul 2024 04:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/s0-jOb62ibI/upload/d1ed1d0f805562a567cd3ccbf8b150f5.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As a developer always on the lookout for exciting AI advancements, I recently stumbled upon a fascinating ICLR 2024 paper titled "<a target="_blank" href="https://arxiv.org/pdf/2401.04398">Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding</a>". The concept immediately grabbed my attention, and I couldn't resist diving in to implement it myself. Let me take you through my journey of bringing this innovative approach to life.</p>
<h2 id="heading-the-spark-of-inspiration">The Spark of Inspiration</h2>
<p>The core idea behind Chain-of-Table is brilliantly simple yet powerful: instead of relying solely on textual reasoning chains, why not leverage the structured nature of tables themselves to represent intermediate thoughts? This approach allows large language models (LLMs) to dynamically plan and execute a series of table operations, effectively "evolving" the table to better answer complex questions.</p>
<p>Excited by the potential, I fired up Replit and got to work. Here's a glimpse into how I implemented the key components:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chain_of_table_reasoning</span>(<span class="hljs-params">df, question</span>):</span>
    messages = [{
        <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
        <span class="hljs-string">"content"</span>: system_prompt
    }, {
        <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
        <span class="hljs-string">"content"</span>: <span class="hljs-string">f"Table:\n<span class="hljs-subst">{df.to_string()}</span>\n\nQuestion: <span class="hljs-subst">{question}</span>"</span>
    }]

    tools = [{
        <span class="hljs-string">"type"</span>: <span class="hljs-string">"function"</span>,
        <span class="hljs-string">"function"</span>: {
            <span class="hljs-string">"name"</span>: func_name,
            <span class="hljs-string">"description"</span>: <span class="hljs-string">f"Perform <span class="hljs-subst">{func_name}</span> operation on the table"</span>,
            <span class="hljs-string">"parameters"</span>: {
                <span class="hljs-string">"type"</span>: <span class="hljs-string">"object"</span>,
                <span class="hljs-string">"properties"</span>: {
                    <span class="hljs-string">"args"</span>: {
                        <span class="hljs-string">"type"</span>: <span class="hljs-string">"object"</span>,
                        <span class="hljs-string">"description"</span>: <span class="hljs-string">f"Arguments for <span class="hljs-subst">{func_name}</span> operation"</span>
                    }
                },
                <span class="hljs-string">"required"</span>: [<span class="hljs-string">"args"</span>]
            }
        }
    } <span class="hljs-keyword">for</span> func_name <span class="hljs-keyword">in</span> [
        <span class="hljs-string">"add_col"</span>, <span class="hljs-string">"select_row"</span>, <span class="hljs-string">"select_col"</span>, <span class="hljs-string">"group_by"</span>, <span class="hljs-string">"sort_by"</span>,
        <span class="hljs-string">"get_table_info"</span>
    ]]

    <span class="hljs-comment"># ... (loop for iterative reasoning)</span>
</code></pre>
<p>This function sets up the core reasoning loop, allowing the LLM to iteratively transform the table using a set of predefined operations.</p>
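<p>The elided loop is where the evolution happens. Here's a minimal sketch of how such a loop can be structured: it continues inside <code>chain_of_table_reasoning</code>, and <code>apply_operation</code> is a hypothetical helper (not shown in the snippet above) that maps each tool name to the matching pandas transformation:</p>
<pre><code class="lang-python"># Continuing inside chain_of_table_reasoning (json is imported at module level).
# apply_operation(df, name, args) is a hypothetical helper that maps each
# tool name to the matching pandas transformation.
for _ in range(10):  # assumed safety cap on reasoning steps
    response = await client.chat.completions.create(
        model=MODEL, messages=messages, tools=tools, tool_choice="auto")
    msg = response.choices[0].message

    if not msg.tool_calls:
        return msg.content  # no more operations requested: this is the answer

    messages.append(msg)  # keep the tool-call turn in the transcript
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        df = apply_operation(df, call.function.name, args["args"])
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": f"Table is now:\n{df.to_string()}",
        })
</code></pre>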
<h2 id="heading-supercharging-with-groq">Supercharging with Groq</h2>
<p>To really push the boundaries of performance, I decided to leverage Groq's lightning-fast inference API. Their function calling feature turned out to be a game-changer for this project. The speed is truly remarkable, making the Chain-of-Table reasoning feel almost real-time.</p>
<p>Here's a snippet showing how I integrated Groq:</p>
<pre><code class="lang-python">client = AsyncGroq(api_key=os.environ.get(<span class="hljs-string">"GROQ_API_KEY"</span>))

MODEL = <span class="hljs-string">"llama3-70b-8192"</span>

<span class="hljs-comment"># ... (in the reasoning loop)</span>
response = <span class="hljs-keyword">await</span> client.chat.completions.create(
    model=MODEL,
    messages=messages,
    tools=tools,
    tool_choice=<span class="hljs-string">"auto"</span>,
)
</code></pre>
<h2 id="heading-bringing-it-all-together-with-gradio">Bringing It All Together with Gradio</h2>
<p>To make this project accessible and interactive, I built a simple Gradio interface. This allows users to input their own tables and questions, seeing the Chain-of-Table reasoning in action. Here's a glimpse of the Gradio setup:</p>
<pre><code class="lang-python"><span class="hljs-keyword">with</span> gr.Blocks() <span class="hljs-keyword">as</span> demo:
    gr.Markdown(<span class="hljs-string">"# Chain-of-Table Reasoning"</span>)

    <span class="hljs-keyword">with</span> gr.Row():
        <span class="hljs-keyword">with</span> gr.Column():
            table_input = gr.TextArea(label=<span class="hljs-string">"Input Table (CSV format)"</span>)
            upload_button = gr.UploadButton(<span class="hljs-string">"Upload CSV"</span>, file_types=[<span class="hljs-string">"csv"</span>])
            upload_button.upload(<span class="hljs-keyword">lambda</span> file: file.decode(<span class="hljs-string">'utf-8'</span>),
                                 upload_button, table_input)

        <span class="hljs-keyword">with</span> gr.Column():
            question_input = gr.TextArea(label=<span class="hljs-string">"Question"</span>)
            submit_button = gr.Button(<span class="hljs-string">"Submit"</span>)

    output = gr.TextArea(label=<span class="hljs-string">"Answer"</span>)

    submit_button.click(process_question,
                        inputs=[table_input, question_input],
                        outputs=output)

    gr.Examples(examples=examples, inputs=[table_input, question_input])
</code></pre>
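<p>The <code>process_question</code> handler wired up here lives elsewhere in the Repl; a minimal sketch of that glue (my assumption, since Gradio happily accepts async callbacks) just parses the CSV text into a DataFrame and hands it to the reasoning loop:</p>
<pre><code class="lang-python">import io

import pandas as pd

async def process_question(table_text, question):
    # Parse the pasted/uploaded CSV text into a DataFrame, then let the
    # Chain-of-Table loop evolve it until it can answer the question
    df = pd.read_csv(io.StringIO(table_text))
    return await chain_of_table_reasoning(df, question)
</code></pre>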
<h2 id="heading-the-power-of-open-development">The Power of Open Development</h2>
<p>One of the most exciting aspects of this project has been the ability to rapidly prototype and deploy using Replit. Their platform is truly a godsend for going from idea to MVP at lightning speed. Plus, deployment is a breeze!</p>
<h2 id="heading-whats-next">What's Next?</h2>
<p>I'm thrilled to see what the community builds with this Chain-of-Table implementation. The combination of structured table reasoning, Groq's speed, and Replit's accessibility creates a powerful foundation for innovation.</p>
<p>I've made a template available so you can easily get started with your own experiments. Who knows? Your next AI breakthrough might be just a few table operations away!</p>
<p>Remember, the world of AI moves fast, and sometimes the most exciting developments happen when we combine cutting-edge research with practical implementation. So grab that template, fire up your Replit, and let's see what amazing things we can build together!</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/Chain-of-Tables-Chat-with-Groq?v=1#main.py">https://replit.com/@MartinBowling/Chain-of-Tables-Chat-with-Groq?v=1#main.py</a></div>
]]></content:encoded></item><item><title><![CDATA[Building a Mixture of Agents (MoA) Chatbot]]></title><description><![CDATA[I'm excited to share with you a project I've been working on that combines my love for AI, rapid prototyping, and pushing the boundaries of what's possible with current language models. Let's dive into the world of Mixture of Agents (MoA) chatbots!
T...]]></description><link>https://martinbowling.com/building-a-mixture-of-agents-moa-chatbot</link><guid isPermaLink="true">https://martinbowling.com/building-a-mixture-of-agents-moa-chatbot</guid><category><![CDATA[#MixtureofAgents]]></category><category><![CDATA[groq]]></category><category><![CDATA[llm]]></category><category><![CDATA[AI]]></category><category><![CDATA[#ai-tools]]></category><category><![CDATA[Python]]></category><category><![CDATA[#AIDevelopment ]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Sat, 29 Jun 2024 04:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/fEgt5QRI-rA/upload/47d97958f960a9821f5fa1d51a8839fb.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I'm excited to share with you a project I've been working on that combines my love for AI, rapid prototyping, and pushing the boundaries of what's possible with current language models. Let's dive into the world of Mixture of Agents (MoA) chatbots!</p>
<h2 id="heading-the-concept-why-settle-for-one-ai-when-you-can-have-four">The Concept: Why Settle for One AI When You Can Have Four?</h2>
<p>The idea behind this project is simple yet powerful: instead of relying on a single AI model to answer user queries, why not leverage the strengths of multiple models? This approach allows us to tap into the diverse knowledge and capabilities of different AI architectures, potentially leading to more comprehensive and nuanced responses.</p>
<p>Here's how it works:</p>
<ol>
<li><p>We take the user's question and send it to four different language models.</p>
</li>
<li><p>Each model processes the query and generates its own response.</p>
</li>
<li><p>A fifth model, acting as an "aggregator," synthesizes these responses into a single, coherent answer.</p>
</li>
</ol>
<p>The result? A chatbot that provides multi-perspective, well-rounded answers to user queries.</p>
<h2 id="heading-the-tech-stack-groq-replit-and-gradio">The Tech Stack: Groq, Replit, and Gradio</h2>
<p>For this project, I've leveraged some cutting-edge tools:</p>
<ul>
<li><p><strong>Groq</strong>: This incredible inference API allows us to run multiple AI models with lightning-fast speed. It's the secret sauce that makes real-time processing of multiple models feasible.</p>
</li>
<li><p><strong>Replit</strong>: I've set up the project as a Replit template, making it super easy for anyone to clone and run their own instance of the chatbot.</p>
</li>
<li><p><strong>Gradio</strong>: This library provides a simple yet powerful interface for our chatbot, allowing users to interact with it through a clean, web-based UI.</p>
</li>
</ul>
<h2 id="heading-the-code-lets-break-it-down">The Code: Let's Break It Down</h2>
<p>Let's walk through some key parts of the code to understand how this MoA chatbot works.</p>
<h3 id="heading-setting-up-the-models">Setting Up the Models</h3>
<pre><code class="lang-python">reference_models = [
    <span class="hljs-string">"gemma2-9b-it"</span>, <span class="hljs-string">"llama3-70b-8192"</span>, <span class="hljs-string">"llama3-8b-8192"</span>, <span class="hljs-string">"mixtral-8x7b-32768"</span>
]

aggregator_model = <span class="hljs-string">"llama3-70b-8192"</span>
</code></pre>
<p>Here, we define our four reference models and the aggregator model. I've chosen a mix of different architectures and sizes to get a diverse range of responses.</p>
<h3 id="heading-the-aggregators-system-prompt">The Aggregator's System Prompt</h3>
<p>The aggregator model needs to know its job. We provide it with a detailed system prompt:</p>
<pre><code class="lang-python">aggregator_system_prompt = <span class="hljs-string">"""As an advanced language model, your task is to synthesize responses from multiple AI models into a single, coherent, and high-quality answer. Follow these guidelines:

1. Analyze and compare: Carefully examine the responses from different models, noting similarities, differences, and unique insights.
2. Evaluate accuracy: Critically assess the information provided, identifying and discarding any inaccuracies or biases.
...
"""</span>
</code></pre>
<p>This prompt is crucial as it guides the aggregator in combining the various model outputs effectively.</p>
<h3 id="heading-running-the-models">Running the Models</h3>
<p>The <code>run_llm</code> function handles individual model calls:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">run_llm</span>(<span class="hljs-params">model, message, history</span>):</span>
    messages = [{
        <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span> <span class="hljs-keyword">if</span> i % <span class="hljs-number">2</span> == <span class="hljs-number">0</span> <span class="hljs-keyword">else</span> <span class="hljs-string">"assistant"</span>,
        <span class="hljs-string">"content"</span>: str(msg)
    } <span class="hljs-keyword">for</span> i, msg <span class="hljs-keyword">in</span> enumerate(history + [message])]

    response = <span class="hljs-keyword">await</span> client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=<span class="hljs-number">0.7</span>,
        max_tokens=<span class="hljs-number">512</span>,
    )
    <span class="hljs-keyword">return</span> response.choices[<span class="hljs-number">0</span>].message.content
</code></pre>
<p>We use async calls to Groq's API for optimal performance.</p>
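<p>One note: the snippets assume a <code>client</code> created earlier in the file. The setup is the standard Groq async client:</p>
<pre><code class="lang-python">import asyncio
import os

from groq import AsyncGroq

# Async client shared by run_llm and the aggregator call below
client = AsyncGroq(api_key=os.environ.get("GROQ_API_KEY"))
</code></pre>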
<h3 id="heading-the-main-chat-function">The Main Chat Function</h3>
<p>The heart of our chatbot is the <code>chat_with_mixture_of_agents</code> function:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chat_with_mixture_of_agents</span>(<span class="hljs-params">message, history</span>):</span>
    <span class="hljs-comment"># Run reference models</span>
    tasks = [run_llm(model, message, history) <span class="hljs-keyword">for</span> model <span class="hljs-keyword">in</span> reference_models]
    results = <span class="hljs-keyword">await</span> asyncio.gather(*tasks)

    <span class="hljs-comment"># Prepare aggregator input</span>
    aggregator_input = <span class="hljs-string">"\n\n"</span>.join([
        <span class="hljs-string">f"<span class="hljs-subst">{model}</span>:\n<span class="hljs-subst">{response}</span>"</span>
        <span class="hljs-keyword">for</span> model, response <span class="hljs-keyword">in</span> zip(reference_models, results)
    ])

    <span class="hljs-comment"># Run aggregator model</span>
    aggregator_messages = [
        {
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
            <span class="hljs-string">"content"</span>: aggregator_system_prompt
        },
        {
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
            <span class="hljs-string">"content"</span>: <span class="hljs-string">f"User query: <span class="hljs-subst">{message}</span>\n\n<span class="hljs-subst">{aggregator_input}</span>"</span>
        },
    ]

    response_content = <span class="hljs-string">""</span>
    stream = <span class="hljs-keyword">await</span> client.chat.completions.create(
        model=aggregator_model,
        messages=aggregator_messages,
        temperature=<span class="hljs-number">0.7</span>,
        max_tokens=<span class="hljs-number">1024</span>,
        top_p=<span class="hljs-number">1</span>,
        stop=<span class="hljs-literal">None</span>,
        stream=<span class="hljs-literal">True</span>,
    )

    <span class="hljs-keyword">async</span> <span class="hljs-keyword">for</span> chunk <span class="hljs-keyword">in</span> stream:
        content = chunk.choices[<span class="hljs-number">0</span>].delta.content
        <span class="hljs-keyword">if</span> content:
            response_content += content
        <span class="hljs-keyword">yield</span> response_content
</code></pre>
<p>This function orchestrates the entire process:</p>
<ol>
<li><p>It runs all reference models concurrently.</p>
</li>
<li><p>It prepares the input for the aggregator model.</p>
</li>
<li><p>It streams the aggregator's response back to the user in real-time.</p>
</li>
</ol>
<h2 id="heading-the-user-interface">The User Interface</h2>
<p>We use Gradio to create a simple chat interface:</p>
<pre><code class="lang-python"><span class="hljs-keyword">with</span> gr.Blocks(fill_height=<span class="hljs-literal">True</span>, head=js) <span class="hljs-keyword">as</span> demo:
    gr.ChatInterface(
        chat_with_mixture_of_agents,
        clear_btn=<span class="hljs-literal">None</span>,
        undo_btn=<span class="hljs-literal">None</span>,
        retry_btn=<span class="hljs-literal">None</span>,
        fill_height=<span class="hljs-literal">True</span>,
        examples=[
            <span class="hljs-string">"Explain the concept of quantum entanglement in simple terms."</span>,
            <span class="hljs-string">"What are the potential implications of artificial general intelligence on society?"</span>,
            <span class="hljs-comment"># ... more example questions ...</span>
        ])
</code></pre>
<p>This creates a clean, user-friendly interface with some example questions to get users started.</p>
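<p>Two quick notes on the snippet: the <code>js</code> passed to <code>head=</code> is a small head snippet defined earlier in the file, and the Blocks context is followed by the usual launch call, roughly:</p>
<pre><code class="lang-python"># Serve the UI; queue() also covers streaming generators on older Gradio versions
demo.queue()
demo.launch()
</code></pre>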
<h2 id="heading-conclusion-and-future-directions">Conclusion and Future Directions</h2>
<p>This MoA chatbot is just the beginning. There are so many exciting directions we could take this:</p>
<ul>
<li><p>Experimenting with different combinations of models</p>
</li>
<li><p>Implementing more sophisticated aggregation techniques</p>
</li>
<li><p>Adding user controls for model selection or weighting</p>
</li>
<li><p>Integrating external knowledge sources</p>
</li>
</ul>
<p>The possibilities are endless, and I can't wait to see what the community does with this concept. Remember, the project is open-source, so feel free to fork it, modify it, and make it your own!</p>
<p>If you want to try it out yourself, head over to the Replit template and start chatting with our AI panel. And don't forget to let me know what you think or if you have any cool ideas for improvements.</p>
<p>Happy coding, and may your AI adventures be ever exciting! 🚀🤖</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/MoA-Chat-with-Groq?v=1#main.py">https://replit.com/@MartinBowling/MoA-Chat-with-Groq?v=1#main.py</a></div>
]]></content:encoded></item><item><title><![CDATA[Build Your Own AI Language Tutor]]></title><description><![CDATA[Hey there, fellow developers! 👋 I'm excited to share with you my latest project: the AI Language Learning Assistant. This little gem came to life during a #Hack@Home session when I couldn't make it to the Berkeley AI Hackathon. But who says you need...]]></description><link>https://martinbowling.com/build-your-own-ai-language-tutor</link><guid isPermaLink="true">https://martinbowling.com/build-your-own-ai-language-tutor</guid><category><![CDATA[AI]]></category><category><![CDATA[llm]]></category><category><![CDATA[languages]]></category><category><![CDATA[Python]]></category><category><![CDATA[groq]]></category><category><![CDATA[repl.it]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Sat, 22 Jun 2024 04:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/zFSo6bnZJTw/upload/e70967f81b3c6ff017d4972cabce776e.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there, fellow developers! 👋 I'm excited to share with you my latest project: the AI Language Learning Assistant. This little gem came to life during a #Hack@Home session when I couldn't make it to the Berkeley AI Hackathon. But who says you need to be at a physical event to innovate, right? 🚀</p>
<p>Let's dive into the nitty-gritty of this AI-powered polyglot helper!</p>
<h2 id="heading-the-tech-stack">The Tech Stack 🛠️</h2>
<p>First things first, let's talk about the awesome tech that makes this possible:</p>
<ul>
<li><p><strong>Groq API</strong>: The powerhouse behind our natural language processing</p>
</li>
<li><p><strong>Gradio</strong>: For that slick, user-friendly interface</p>
</li>
<li><p><strong>Replit</strong>: Our trusty hosting platform</p>
</li>
<li><p><strong>Python</strong>: The glue that holds it all together</p>
</li>
</ul>
<h2 id="heading-key-features">Key Features 🌟</h2>
<ol>
<li><p><strong>Multilingual Support</strong>: We're talking 9 languages here! From English to Portuguese, we've got you covered.</p>
</li>
<li><p><strong>Real-time Speech Recognition</strong>: Thanks to Groq's Whisper API, we can turn your speech into text in a snap.</p>
</li>
<li><p><strong>Intelligent Feedback</strong>: Grammar, vocabulary, pronunciation - we analyze it all!</p>
</li>
<li><p><strong>Contextual Responses</strong>: Our AI doesn't just respond; it converses naturally.</p>
</li>
<li><p><strong>Instant Translations</strong>: Because sometimes you need that extra help to understand.</p>
</li>
</ol>
<h2 id="heading-the-code-breakdown">The Code Breakdown 💻</h2>
<p>Let's look at some of the cool parts of our code:</p>
<h3 id="heading-speech-recognition">Speech Recognition</h3>
<p>We're using Groq's Whisper API for transcription. Check out this async function:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">transcribe_audio</span>(<span class="hljs-params">audio_file</span>):</span>
    <span class="hljs-keyword">try</span>:
        form_data = FormData()
        form_data.add_field(<span class="hljs-string">'file'</span>, audio_file, filename=<span class="hljs-string">'audio.wav'</span>, content_type=<span class="hljs-string">'audio/wav'</span>)
        form_data.add_field(<span class="hljs-string">'model'</span>, <span class="hljs-string">'whisper-large-v3'</span>)
        <span class="hljs-comment"># ... more fields ...</span>

        <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> aiohttp.ClientSession() <span class="hljs-keyword">as</span> session:
            <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> session.post(<span class="hljs-string">"https://api.groq.com/openai/v1/audio/transcriptions"</span>,
                                    headers={<span class="hljs-string">"Authorization"</span>: <span class="hljs-string">f"Bearer <span class="hljs-subst">{GROQ_API_KEY}</span>"</span>},
                                    data=form_data) <span class="hljs-keyword">as</span> response:
                <span class="hljs-comment"># ... handle response ...</span>
                <span class="hljs-keyword">return</span> result[<span class="hljs-string">"text"</span>]
    <span class="hljs-keyword">except</span> ClientError <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">raise</span> GroqError(<span class="hljs-string">f"Network error during transcription: <span class="hljs-subst">{str(e)}</span>"</span>)
</code></pre>
<p>This function handles the audio file, sends it to Groq's API, and returns the transcribed text. Pretty neat, huh?</p>
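<p>The elided response handling is the usual aiohttp dance. As a hypothetical sketch (the real code raises the same <code>GroqError</code> you see in the except branch), it boils down to:</p>
<pre><code class="lang-python">async def read_transcription(response):
    # Hypothetical helper showing the elided "# ... handle response ..." step:
    # fail loudly on non-200s, otherwise pull the text out of the JSON body
    if response.status != 200:
        detail = await response.text()
        raise GroqError(f"Transcription failed ({response.status}): {detail}")
    result = await response.json()
    return result["text"]
</code></pre>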
<h3 id="heading-language-analysis">Language Analysis</h3>
<p>Once we have the text, we analyze it:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">analyze_language</span>(<span class="hljs-params">text, target_language, history</span>):</span>
    prompt = <span class="hljs-string">f"""Analyze the following text in <span class="hljs-subst">{target_language}</span>, considering the conversation history:
    # ... prompt details ...
    """</span>

    response = <span class="hljs-keyword">await</span> async_groq_client.chat.completions.create(
        model=<span class="hljs-string">"llama3-70b-8192"</span>,
        messages=[{<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"You are an expert language tutor."</span>},
                  {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: prompt}],
        temperature=<span class="hljs-number">0.3</span>,
        max_tokens=<span class="hljs-number">1000</span>,
        response_format={<span class="hljs-string">"type"</span>: <span class="hljs-string">"json_object"</span>})
    <span class="hljs-keyword">return</span> json.loads(response.choices[<span class="hljs-number">0</span>].message.content)
</code></pre>
<p>This function crafts a prompt for Groq's language model, asking it to analyze grammar, vocabulary, and more. We get back a structured JSON response with all the juicy details.</p>
<h3 id="heading-conversation-management">Conversation Management</h3>
<p>To keep things flowing naturally, we maintain a conversation history:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">update_conversation_history</span>(<span class="hljs-params">user_id, target_language, new_entry</span>):</span>
    key = <span class="hljs-string">f"<span class="hljs-subst">{user_id}</span>_<span class="hljs-subst">{target_language}</span>"</span>
    history = get_conversation_history(user_id, target_language)
    history.append(new_entry)
    history = history[<span class="hljs-number">-5</span>:]  <span class="hljs-comment"># Keep only the last 5 interactions</span>
    db[key] = history
</code></pre>
<p>This function updates our conversation history, keeping the last 5 interactions for context. It's like giving our AI a short-term memory!</p>
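<p>Its counterpart, <code>get_conversation_history</code>, isn't shown above; here's a minimal sketch, assuming Replit's built-in key-value store (the same <code>db</code> being written to):</p>
<pre><code class="lang-python">from replit import db  # Replit's built-in key-value store

def get_conversation_history(user_id, target_language):
    # Fetch prior turns for this user/language pair, defaulting to empty
    key = f"{user_id}_{target_language}"
    return list(db.get(key, []))
</code></pre>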
<h2 id="heading-the-user-experience">The User Experience 🎨</h2>
<p>Using Gradio, we've created an interface that's both powerful and easy to use:</p>
<pre><code class="lang-python">iface = gr.Interface(
    fn=language_tutor,
    inputs=[
        gr.Audio(type=<span class="hljs-string">"filepath"</span>, label=<span class="hljs-string">"Speak here"</span>),
        gr.Dropdown([<span class="hljs-string">"English"</span>, <span class="hljs-string">"Chinese"</span>, <span class="hljs-string">"German"</span>, <span class="hljs-string">"Spanish"</span>, <span class="hljs-string">"Russian"</span>, <span class="hljs-string">"Korean"</span>,
                     <span class="hljs-string">"French"</span>, <span class="hljs-string">"Japanese"</span>, <span class="hljs-string">"Portuguese"</span>],
                    label=<span class="hljs-string">"Select target language"</span>),
        <span class="hljs-comment"># ... more inputs ...</span>
    ],
    outputs=[
        gr.Textbox(label=<span class="hljs-string">"Session ID"</span>),
        gr.Textbox(label=<span class="hljs-string">"Target Language"</span>),
        gr.Textbox(label=<span class="hljs-string">"Transcription"</span>),
        gr.Textbox(label=<span class="hljs-string">"Language Analysis"</span>),
        gr.Textbox(label=<span class="hljs-string">"AI Response"</span>),
        gr.Textbox(label=<span class="hljs-string">"Translation"</span>),
        gr.Textbox(label=<span class="hljs-string">"Session Management Message"</span>)
    ],
    title=<span class="hljs-string">"AI Language Learning Assistant"</span>,
    description=<span class="hljs-string">"Speak in your target language and get instant feedback and responses!"</span>
)
</code></pre>
<p>This creates a web interface where users can speak, select their language, and get instant feedback. It's like having a language tutor in your browser! 🌐👩‍🏫</p>
<h2 id="heading-wrapping-up">Wrapping Up 🎁</h2>
<p>And there you have it, folks! A full-fledged AI language tutor that you can spin up with just a few clicks on Replit. It's been a blast putting this together, and I hope it inspires you to create your own AI-powered tools.</p>
<p>Remember, you don't need to be at a fancy hackathon to innovate. Sometimes, the best ideas come when you're #Hack@Home! 🏠💡</p>
<p>So, what are you waiting for? Fork the <a target="_blank" href="https://replit.com/@MartinBowling/Language-Trainer-with-Groq?v=1#readme.md">Replit template</a>, add your Groq API key, and start your polyglot journey today!</p>
<p>Happy coding, and may your conversations be ever multilingual! 🌍🗣️</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/Language-Trainer-with-Groq?v=1#readme.md">https://replit.com/@MartinBowling/Language-Trainer-with-Groq?v=1#readme.md</a></div>
]]></content:encoded></item><item><title><![CDATA[Building a Fact-Checking AI Chat Interface]]></title><description><![CDATA[Hey there, fellow developers 👋, coming at you with a super cool project I whipped up inspired by Andrej Karpathy's tweet about LLM interfaces and proof. Let's dive into this fun little creation that combines the power of AI with fact-checking capabi...]]></description><link>https://martinbowling.com/building-a-fact-checking-ai-chat-interface</link><guid isPermaLink="true">https://martinbowling.com/building-a-fact-checking-ai-chat-interface</guid><category><![CDATA[factcheck]]></category><category><![CDATA[Python]]></category><category><![CDATA[llm]]></category><category><![CDATA[chatbot]]></category><category><![CDATA[claude.ai]]></category><category><![CDATA[Claude 3.5]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Fri, 21 Jun 2024 04:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/gcsNOsPEXfs/upload/90f466b48255c9eb426b1591145d5ad8.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there, fellow developers 👋, coming at you with a super cool project I whipped up inspired by <a target="_blank" href="https://x.com/karpathy/status/1804187473167421798">Andrej Karpathy's tweet about LLM interfaces and proof</a>. Let's dive into this fun little creation that combines the power of AI with fact-checking capabilities! 🚀</p>
<h2 id="heading-the-inspiration">The Inspiration 💡</h2>
<p>So, Karpathy tweeted about wanting a built-in feature in LLM interfaces that provides proof for the information they give. You know how it is - we often find ourselves manually verifying things like command switches or API details. Well, that got my gears turning, and I thought, "Why not create something that does this automatically?" 🤔</p>
<h2 id="heading-the-solution-fact-checked-llm-chat-interface">The Solution: Fact-Checked LLM Chat Interface 🧠🔍</h2>
<p>I threw together a quick and dirty (but oh-so-cool) Fact-Checked LLM Chat interface. Here's how it works:</p>
<ol>
<li><p>We use Anthropic's Claude 3.5 Sonnet to generate responses.</p>
</li>
<li><p>The response is streamed back to the user in real-time.</p>
</li>
<li><p>We then use Brave Search to find relevant sources for fact-checking.</p>
</li>
<li><p>The content from these sources is fed back to Claude to verify its original claims.</p>
</li>
<li><p>Finally, we present the user with the fact-check results and the source URL.</p>
</li>
</ol>
<p>Let's break down some of the key components of this code, shall we? 🛠️</p>
<h3 id="heading-setting-up-the-environment">Setting Up the Environment</h3>
<p>First things first, we import all the necessary libraries and set up our API keys:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> asyncio
<span class="hljs-keyword">import</span> httpx
<span class="hljs-keyword">import</span> gradio <span class="hljs-keyword">as</span> gr
<span class="hljs-keyword">from</span> anthropic <span class="hljs-keyword">import</span> AsyncAnthropic
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">import</span> re

async_client = AsyncAnthropic(api_key=os.environ[<span class="hljs-string">"ANTHROPIC_API_KEY"</span>])
BRAVE_API_KEY = os.environ.get(<span class="hljs-string">"BRAVE_API_KEY"</span>)
CLAUDE_MODEL = <span class="hljs-string">"claude-3-5-sonnet-20240620"</span>
</code></pre>
<h3 id="heading-fetching-search-results">Fetching Search Results</h3>
<p>We use Brave Search API to find relevant sources:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fetch_brave_search_results</span>(<span class="hljs-params">query: str, count: int = <span class="hljs-number">5</span></span>) -&gt; dict:</span>
    url = <span class="hljs-string">"https://api.search.brave.com/res/v1/web/search"</span>
    headers = {
        <span class="hljs-string">"Accept"</span>: <span class="hljs-string">"application/json"</span>,
        <span class="hljs-string">"Accept-Encoding"</span>: <span class="hljs-string">"gzip"</span>,
        <span class="hljs-string">"X-Subscription-Token"</span>: BRAVE_API_KEY
    }
    params = {
        <span class="hljs-string">"q"</span>: query,
        <span class="hljs-string">"count"</span>: count,
        <span class="hljs-string">"country"</span>: <span class="hljs-string">"us"</span>,
        <span class="hljs-string">"search_lang"</span>: <span class="hljs-string">"en"</span>,
    }

    <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> httpx.AsyncClient() <span class="hljs-keyword">as</span> client:
        response = <span class="hljs-keyword">await</span> client.get(url, headers=headers, params=params)
        response.raise_for_status()
        <span class="hljs-keyword">return</span> response.json()
</code></pre>
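<p>From those results we really only need a page to check the claim against. Picking the top organic hit looks roughly like this (field names per Brave's web search response, where organic results live under <code>web.results</code>; treat the exact schema as an assumption):</p>
<pre><code class="lang-python">def top_result_url(search_results):
    # Organic hits live under web.results in Brave's response payload
    results = search_results.get("web", {}).get("results", [])
    return results[0]["url"] if results else None
</code></pre>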
<h3 id="heading-streaming-claudes-responses">Streaming Claude's Responses</h3>
<p>We stream Claude's responses for a snappy user experience:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">ask_claude_streaming</span>(<span class="hljs-params">message</span>):</span>
    completion = async_client.messages.stream(
        max_tokens=<span class="hljs-number">4096</span>,
        messages=[{
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
            <span class="hljs-string">"content"</span>: message
        }],
        model=CLAUDE_MODEL,
    )
    <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> completion <span class="hljs-keyword">as</span> stream:
        <span class="hljs-keyword">async</span> <span class="hljs-keyword">for</span> text <span class="hljs-keyword">in</span> stream.text_stream:
            <span class="hljs-keyword">yield</span> text
</code></pre>
<h3 id="heading-fact-checking-magic">Fact-Checking Magic ✨</h3>
<p>The real magic happens in the <code>fact_check</code> function. We fetch content from the search results, have Claude analyze it, and return the fact-check results:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fact_check</span>(<span class="hljs-params">llm_response: str, query: str, search_results: dict,
                     first_page_content: str</span>) -&gt; str:</span>
    <span class="hljs-comment"># ... (check the full code for details)</span>
    fact_check_prompt = <span class="hljs-string">f"""
    Given the following claim and markdown content from a webpage, determine if the claim is supported by the content. 
    Respond with a JSON object containing two fields: "claim_supported" (boolean) and "supporting_facts" (string).
    Ensure the "supporting_facts" field is a single line string with no line breaks.

    Claim: <span class="hljs-subst">{llm_response}</span>

    Query: <span class="hljs-subst">{query}</span>

    Markdown Content:
    <span class="hljs-subst">{markdown_content}</span>

    JSON Response:
    """</span>

    fact_check_result = <span class="hljs-string">''</span>.join(
        [chunk <span class="hljs-keyword">async</span> <span class="hljs-keyword">for</span> chunk <span class="hljs-keyword">in</span> ask_claude_streaming(fact_check_prompt)])
    <span class="hljs-comment"># ... (process the result)</span>
</code></pre>
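<p>Claude doesn't always return bare JSON, which is exactly why <code>re</code> and <code>json</code> are imported up top. Here's a sketch of the elided post-processing (my reconstruction, including the fallback shape):</p>
<pre><code class="lang-python">def parse_fact_check(raw):
    # Pull the first JSON object out of Claude's reply, tolerating any
    # surrounding prose, and degrade gracefully if parsing still fails
    match = re.search(r'\{.*\}', raw, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    return {"claim_supported": None, "supporting_facts": raw.strip()}
</code></pre>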
<h3 id="heading-putting-it-all-together">Putting It All Together</h3>
<p>The <code>chat_with_fact_check</code> function orchestrates the whole process:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chat_with_fact_check</span>(<span class="hljs-params">message, history</span>):</span>
    <span class="hljs-comment"># ... (see full code for details)</span>
    llm_response = <span class="hljs-string">""</span>
    <span class="hljs-keyword">async</span> <span class="hljs-keyword">for</span> chunk <span class="hljs-keyword">in</span> ask_claude_streaming(json.dumps(messages)):
        llm_response += chunk
        <span class="hljs-keyword">yield</span> history + [[message, llm_response]]

    <span class="hljs-comment"># Fact-checking after the full response</span>
    fact_check_result = <span class="hljs-keyword">await</span> fact_check(llm_response, message, search_results,
                                         first_page_content)

    <span class="hljs-keyword">yield</span> history + [[message, llm_response],
                     [<span class="hljs-string">"Fact Check"</span>, fact_check_result]]
</code></pre>
<h2 id="heading-the-result">The Result 🎉</h2>
<p>What we end up with is a chat interface that not only provides AI-generated responses but also fact-checks them on the fly! It's like having a super-smart assistant with a built-in fact-checker. How cool is that? 😎</p>
<h2 id="heading-try-it-yourself">Try It Yourself!</h2>
<p>I've made a Replit template so you can fork, build, and improve upon this project. Feel free to play around with it and see what awesome features you can add!</p>
<p>Remember, this is just a quick demo, but the possibilities are endless. You could extend this to support more sources, improve the fact-checking algorithm, or even add visual elements to make the proof more engaging.</p>
<p>So, what do you think? Ready to take your LLM interactions to the next level with some automatic fact-checking? Let me know what cool modifications you come up with!</p>
<p>Happy coding, and may your AI assistants always be truthful! 🤖✅</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/Karpathy-Fact-Check-LLM-Chat-with-Sonnet-35?v=1#main.py">https://replit.com/@MartinBowling/Karpathy-Fact-Check-LLM-Chat-with-Sonnet-35?v=1#main.py</a></div>
]]></content:encoded></item><item><title><![CDATA[Crafting a Blazing-Fast Audio Transcription App with Groq's Whisper v3]]></title><description><![CDATA[Hey there, fellow developers! 👋 Today, I'm excited to share with you a project I've been working on that's all about pushing the boundaries of real-time audio transcription. Let's dive into the code and see how we can harness the power of Groq's Whi...]]></description><link>https://martinbowling.com/crafting-a-blazing-fast-audio-transcription-app-with-groqs-whisper-v3</link><guid isPermaLink="true">https://martinbowling.com/crafting-a-blazing-fast-audio-transcription-app-with-groqs-whisper-v3</guid><category><![CDATA[Whisperv3]]></category><category><![CDATA[speech to text]]></category><category><![CDATA[Audio Transcription]]></category><category><![CDATA[FastAPI]]></category><category><![CDATA[groq]]></category><category><![CDATA[Python]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[HTML5]]></category><category><![CDATA[whisper]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Thu, 20 Jun 2024 04:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/Y20JJ_ddy9M/upload/322bba8de230c8c4befbf88ec0ac57d0.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there, fellow developers! 👋 Today, I'm excited to share with you a project I've been working on that's all about pushing the boundaries of real-time audio transcription. Let's dive into the code and see how we can harness the power of Groq's Whisper v3 model to create a blazing-fast transcription service!</p>
<h2 id="heading-the-backend-magic">The Backend Magic 🧙‍♂️</h2>
<p>Let's start by looking at the heart of our application - the FastAPI backend. Here's where the real magic happens:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> io
<span class="hljs-keyword">import</span> logging
<span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI, File, UploadFile, Request, WebSocket
<span class="hljs-keyword">from</span> fastapi.responses <span class="hljs-keyword">import</span> HTMLResponse
<span class="hljs-keyword">from</span> fastapi.templating <span class="hljs-keyword">import</span> Jinja2Templates
<span class="hljs-keyword">from</span> groq <span class="hljs-keyword">import</span> AsyncGroq

app = FastAPI()
templates = Jinja2Templates(directory=<span class="hljs-string">"templates"</span>)

GROQ_API_KEY = os.environ.get(<span class="hljs-string">"GROQ_API_KEY"</span>)
groq_client = AsyncGroq(api_key=GROQ_API_KEY)
</code></pre>
<p>We're using FastAPI for its speed and ease of use, and we're integrating with Groq's AsyncGroq client to tap into their lightning-fast Whisper v3 model.</p>
<p>The star of the show is our <code>/transcribe</code> endpoint:</p>
<pre><code class="lang-python"><span class="hljs-meta">@app.post("/transcribe")</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">transcribe_audio</span>(<span class="hljs-params">file: UploadFile = File(<span class="hljs-params">...</span>)</span>):</span>
    contents = <span class="hljs-keyword">await</span> file.read()
    audio_file = io.BytesIO(contents)
    audio_file.name = <span class="hljs-string">"audio.wav"</span>

    <span class="hljs-keyword">try</span>:
        transcription = <span class="hljs-keyword">await</span> groq_client.audio.transcriptions.create(
            file=audio_file,
            model=<span class="hljs-string">"whisper-large-v3"</span>,
            response_format=<span class="hljs-string">"verbose_json"</span>
        )
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"transcription"</span>: transcription.text}
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"error"</span>: <span class="hljs-string">f"Error during transcription: <span class="hljs-subst">{str(e)}</span>"</span>}
</code></pre>
<p>This endpoint receives audio chunks, processes them using Groq's Whisper v3 model, and returns the transcription. It's asynchronous, which means it can handle multiple requests efficiently without blocking.</p>
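<p>If you want to sanity-check the endpoint without the browser UI, you can post a WAV file at it directly. Here's a quick test script (assuming the app is running locally on uvicorn's default port 8000 and you have a <code>sample.wav</code> handy):</p>
<pre><code class="lang-python">import httpx

# Manual smoke test for the /transcribe endpoint
with open("sample.wav", "rb") as f:
    resp = httpx.post(
        "http://localhost:8000/transcribe",
        files={"file": ("audio.wav", f, "audio/wav")},
        timeout=60,
    )
print(resp.json())
</code></pre>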
<h2 id="heading-the-frontend-experience">The Frontend Experience 🎨</h2>
<p>On the frontend, we're keeping things simple yet effective with a straightforward HTML/JS setup:</p>
<pre><code class="lang-html"><span class="hljs-tag">&lt;<span class="hljs-name">button</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"startButton"</span>&gt;</span>Start Recording<span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">button</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"stopButton"</span> <span class="hljs-attr">disabled</span>&gt;</span>Stop Recording<span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"transcription"</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"status"</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
</code></pre>
<p>The real action happens in our JavaScript:</p>
<pre><code class="lang-javascript">startButton.onclick = <span class="hljs-keyword">async</span> () =&gt; {
    <span class="hljs-keyword">try</span> {
        <span class="hljs-keyword">const</span> stream = <span class="hljs-keyword">await</span> navigator.mediaDevices.getUserMedia({ <span class="hljs-attr">audio</span>: <span class="hljs-literal">true</span> });
        mediaRecorder = <span class="hljs-keyword">new</span> MediaRecorder(stream);

        mediaRecorder.ondataavailable = <span class="hljs-function">(<span class="hljs-params">event</span>) =&gt;</span> {
            audioChunks.push(event.data);
            <span class="hljs-keyword">if</span> (audioChunks.length === <span class="hljs-number">5</span>) {  <span class="hljs-comment">// Send every 5 chunks</span>
                sendAudioChunks();
            }
        };

        mediaRecorder.start(<span class="hljs-number">1000</span>);  <span class="hljs-comment">// Capture in 1-second intervals</span>
        <span class="hljs-comment">// ... (update UI)</span>
    } <span class="hljs-keyword">catch</span> (error) {
        updateStatus(<span class="hljs-string">`Error starting recording: <span class="hljs-subst">${error.message}</span>`</span>);
    }
};
</code></pre>
<p>We're using the MediaRecorder API to capture audio in 1-second chunks. Every 5 chunks, we send the data to our backend for transcription.</p>
<h2 id="heading-the-secret-sauce-groqs-whisper-v3">The Secret Sauce: Groq's Whisper v3 🚀</h2>
<p>The real power behind this application is Groq's Whisper v3 model. It's incredibly fast and accurate, allowing us to transcribe audio in near real-time. By leveraging Groq's API, we're able to offload the heavy lifting of transcription to their specialized hardware, resulting in lightning-fast results.</p>
<h2 id="heading-putting-it-all-together">Putting It All Together 🏗️</h2>
<p>When a user starts recording, we capture audio chunks and send them to our backend. The backend then uses Groq's Whisper v3 model to transcribe the audio and sends the results back to the frontend, where it's displayed in real-time.</p>
<p>This architecture allows for a smooth, responsive user experience while taking advantage of the incredible speed of Groq's technology.</p>
<h2 id="heading-wrapping-up">Wrapping Up 🎁</h2>
<p>And there you have it! A real-time audio transcription service that leverages the power of Groq's Whisper v3 model. This project demonstrates how we can combine modern web technologies with cutting-edge AI to create powerful, user-friendly applications.</p>
<p>Feel free to clone the repo, experiment with it, and let me know what cool features you add! Happy coding, everyone! 🚀👨‍💻👩‍💻</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/YT-Transcriber-with-Groq-Whisper-v3?v=1#templates/index.html">https://replit.com/@MartinBowling/YT-Transcriber-with-Groq-Whisper-v3?v=1#templates/index.html</a></div>
]]></content:encoded></item><item><title><![CDATA[Building an AI-Enhanced Chatbot with FastAPI and Groq]]></title><description><![CDATA[I was tinkering with a fun idea - an AI-enhanced chatbot system that leverages the power of FastAPI and Groq's lightning-fast inference capabilities. This chatbot is designed to provide a more natural and efficient conversation experience by incorpor...]]></description><link>https://martinbowling.com/building-an-ai-enhanced-chatbot-with-fastapi-and-groq</link><guid isPermaLink="true">https://martinbowling.com/building-an-ai-enhanced-chatbot-with-fastapi-and-groq</guid><category><![CDATA[groq]]></category><category><![CDATA[Python]]></category><category><![CDATA[AI]]></category><category><![CDATA[llm]]></category><category><![CDATA[chatbot]]></category><dc:creator><![CDATA[Martin Bowling]]></dc:creator><pubDate>Sun, 17 Mar 2024 04:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/4XlvusRVTU0/upload/5d4f520065fdf0339dbf7fd3bffcb205.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I was tinkering with a fun idea - an AI-enhanced chatbot system that leverages the power of FastAPI and Groq's lightning-fast inference capabilities. This chatbot is designed to provide a more natural and efficient conversation experience by incorporating several advanced features. Let me walk you through the key components and concepts!</p>
<ol>
<li>FastAPI Framework</li>
</ol>
<p>We're using FastAPI as our web framework. It's fast, easy to use, and perfect for building APIs. Here's how we set up our basic app:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI
<span class="hljs-keyword">import</span> uvicorn

app = FastAPI()

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    uvicorn.run(app, port=<span class="hljs-number">8080</span>, host=<span class="hljs-string">"0.0.0.0"</span>)
</code></pre>
<ol start="2">
<li>Message Completion Prediction</li>
</ol>
<p>One of the coolest features of this chatbot is its ability to predict message completions. We use Groq's API to assess the likelihood that a user's message is complete:</p>
<pre><code class="lang-python">response = <span class="hljs-keyword">await</span> client.chat.completions.create(
    model=MODEL_TO_USE,
    messages=[{
        <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
        <span class="hljs-string">"content"</span>: <span class="hljs-string">"It is extremely important that you are brief and that you only respond with a number between 0-100 to show likelihood that the user message is complete and can be responded to..."</span>
    }, {
        <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
        <span class="hljs-string">"content"</span>: <span class="hljs-string">f"<span class="hljs-subst">{text}</span>"</span>
    }],
)
</code></pre>
<p>If the likelihood is below 90%, we generate a predicted ending for the message.</p>
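<p>Here's a sketch of that follow-up step, continuing the snippet above (my reconstruction; the prediction prompt wording is assumed):</p>
<pre><code class="lang-python">likelihood_text = response.choices[0].message.content.strip()
try:
    likelihood = int(likelihood_text)
except ValueError:
    likelihood = 100  # if the model misbehaves, treat the message as complete

if likelihood &lt; 90:
    # Ask the model for a plausible ending to the partial message
    prediction = await client.chat.completions.create(
        model=MODEL_TO_USE,
        messages=[{
            "role": "system",
            "content": "Briefly complete the user's unfinished message."
        }, {
            "role": "user",
            "content": text
        }],
    )
    predicted_ending = prediction.choices[0].message.content
</code></pre>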
<ol start="3">
<li>Similarity Detection</li>
</ol>
<p>To avoid redundant processing, we implement a similarity check using embeddings:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">check_if_same</span>(<span class="hljs-params">combined_message</span>):</span>
    <span class="hljs-keyword">global</span> last_message_embedding, last_response
    embedding_response = <span class="hljs-keyword">await</span> embed_client.embeddings.create(
        model=EMBEDDING_MODEL_TO_USE, input=combined_message)
    current_embedding = np.array(embedding_response.data[<span class="hljs-number">0</span>].embedding)

    <span class="hljs-keyword">if</span> last_message_embedding <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>:
        similarity = <span class="hljs-number">1</span> - cosine(last_message_embedding, current_embedding)
        <span class="hljs-keyword">if</span> similarity &gt; <span class="hljs-number">0.9</span>:
            <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>, similarity, last_response

    last_message_embedding = current_embedding
    <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>, <span class="hljs-number">0</span>, <span class="hljs-literal">None</span>
</code></pre>
<p>This function helps us identify if the current message is very similar to the previous one, allowing us to reuse the last response if appropriate.</p>
<ol start="4">
<li>Response Generation</li>
</ol>
<p>If the message is unique or complete, we generate a new response using Groq's API:</p>
<pre><code class="lang-python">response = <span class="hljs-keyword">await</span> client.chat.completions.create(
    model=MODEL_TO_USE,
    messages=[{
        <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
        <span class="hljs-string">"content"</span>: <span class="hljs-string">"Generate a response to the user message"</span>
    }, {
        <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
        <span class="hljs-string">"content"</span>: combined_message
    }],
)
</code></pre>
<ol start="5">
<li>User Interface</li>
</ol>
<p>We've created a simple yet effective UI using HTML, Bootstrap, and jQuery. The interface allows users to input their messages and displays the AI's responses in real-time.</p>
<pre><code class="lang-html"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"container text-center"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">h2</span>&gt;</span>Enter Your Message<span class="hljs-tag">&lt;/<span class="hljs-name">h2</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"d-flex justify-content-center"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"text"</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"inputBox"</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"userInput"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"spinner-border text-primary d-none"</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"loadingSpinner"</span> <span class="hljs-attr">role</span>=<span class="hljs-string">"status"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">span</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"sr-only"</span>&gt;</span>Loading...<span class="hljs-tag">&lt;/<span class="hljs-name">span</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"response-area"</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"responseArea"</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
</code></pre>
<p>The JavaScript code handles user input and updates the UI with the AI's responses:</p>
<pre><code class="lang-javascript">$(<span class="hljs-string">'#userInput'</span>).on(<span class="hljs-string">'keyup'</span>, <span class="hljs-function"><span class="hljs-keyword">function</span>(<span class="hljs-params">event</span>) </span>{
    <span class="hljs-keyword">if</span> (event.key === <span class="hljs-string">' '</span> || event.key === <span class="hljs-string">'.'</span> || event.key === <span class="hljs-string">'?'</span> || event.key === <span class="hljs-string">'!'</span>) {
        <span class="hljs-comment">// Send request to server</span>
        <span class="hljs-comment">// ...</span>
    }
});
</code></pre>
<p>This project demonstrates how we can combine multiple AI models and fast inference engines like Groq to create a more intelligent and responsive chatbot. The system's ability to predict message completions, detect similarities, and generate context-aware responses makes for a much more natural conversation flow.</p>
<p>I'm constantly working on improving this system, and I'd love to hear your thoughts or suggestions. Feel free to try it out and let me know what you think!</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://replit.com/@MartinBowling/GroqPredictiveChat?v=1#index.html">https://replit.com/@MartinBowling/GroqPredictiveChat?v=1#index.html</a></div>
]]></content:encoded></item></channel></rss>