
Airtable + GPT: Prototyping a Lightweight RAG System with No-Code Tools



Introduction

 
Ready for a practical walkthrough with little to no code involved, depending on the approach you choose? This tutorial shows how to tie together two formidable tools — OpenAI’s GPT models and the Airtable cloud-based database — to prototype a simple, toy-sized retrieval-augmented generation (RAG) system. The system accepts question-based prompts and uses text data stored in Airtable as the knowledge base to produce grounded answers. If you’re unfamiliar with RAG systems, or want a refresher, don’t miss this article series on understanding RAG.

 

The Ingredients

 
To practice this tutorial yourself, you’ll need:

  • An Airtable account with a base created in your workspace.
  • An OpenAI API key (ideally a paid plan for flexibility in model choice).
  • A Pipedream account — an orchestration and automation app that enables experimentation under a free tier (with limits on daily runs).

 

The Retrieval-Augmented Generation Recipe

 
The process to build our RAG system isn’t purely linear, and some steps can be taken in different ways. Depending on your level of programming knowledge, you may opt for a code-free or nearly code-free approach, or create the workflow programmatically.

In essence, we’ll create an orchestration workflow consisting of three parts, using Pipedream:

  1. Trigger: similar to a web service request, this element initiates an action flow that passes through the next elements in the workflow. Once deployed, this is where you specify the request, i.e., the user prompt for our prototype RAG system.
  2. Airtable block: establishes a connection to our Airtable base and specific table to use its data as the RAG system’s knowledge base. We’ll add some text data to it shortly within Airtable.
  3. OpenAI block: connects to OpenAI’s GPT-based language models using an API key and passes the user prompt alongside the context (retrieved Airtable data) to the model to obtain a response.
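The three parts above can be sketched as plain JavaScript. This is a simplified, hypothetical outline rather than Pipedream code: `callModel` stands in for the OpenAI call, and retrieval simply forwards all records as context.

```javascript
// Simplified sketch of the three-stage flow (hypothetical helper names).
// "Retrieval" here just forwards every Airtable record as context,
// which is what the "List records" action amounts to in this toy setup.
function buildContext(records) {
  return records
    .map((record) => (record.fields?.Content ?? "").trim())
    .filter((text) => text.length > 0)
    .join("\n\n---\n\n");
}

async function ragAnswer(question, records, callModel) {
  const messages = [
    {
      role: "system",
      content: `Answer using only this knowledge base:\n${buildContext(records)}`,
    },
    { role: "user", content: question },
  ];
  return callModel(messages); // e.g. a chat-completion request
}
```

A production RAG system would rank records by relevance to the question (for example, with embeddings) instead of forwarding the entire table.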

But first, we need to create a new table in our Airtable base containing text data. For this example, I created an empty table with three fields (ID: one-line text, Source: one-line text, Content: long text), and then imported data from this publicly available small dataset containing text with basic knowledge about Asian countries. Use the CSV and link options to import the data into the table. More information about creating tables and importing data is in this article.
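After the import, each row becomes an Airtable record whose fields mirror the table’s columns. The sample values below are hypothetical illustrations of that shape (the real dataset’s wording will differ); the workflow’s final step reads the `fields.Content` property of each record.

```javascript
// Hypothetical records illustrating the three-field schema
// (ID, Source, Content); actual dataset content will differ.
const sampleRecords = [
  {
    id: "recXXXXXXXXXXXXXX", // Airtable-assigned record ID (placeholder)
    fields: {
      ID: "1",
      Source: "asian-countries.csv",
      Content: "Japan is an island country in East Asia; its capital is Tokyo.",
    },
  },
  {
    id: "recYYYYYYYYYYYYYY",
    fields: {
      ID: "2",
      Source: "asian-countries.csv",
      Content: "Vietnam is in Southeast Asia; its capital is Hanoi.",
    },
  },
];
```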

Once you have your knowledge base ready in Airtable, let’s switch to Pipedream for the rest of the process! Register if you haven’t done so yet, and log in. There are several ways to initiate a workflow. One of them is by describing what you want to create to Pipedream’s AI agent, which is intuitive but may result in errors needing fixes, as we’ll illustrate shortly. Pipedream has two parallel UIs: one for manually building and editing workflows, and one for conversations with its AI agent, called String.

 

Pipedream’s AI conversational interface (String)

 

If you opt for letting String build an approximate version of your desired workflow, describe clearly what to build, with a prompt that may look something like:

I’d like to orchestrate a simple toy RAG application that uses an Airtable table containing text, and OpenAI’s GPT models (I have a paid API key).

 

Pipedream’s language model may return an answer for you to approve (at your own risk), like this:

I’ll create a simple RAG (Retrieval-Augmented Generation) agent that connects your Airtable table with OpenAI’s GPT models. Here’s what I’ll build:

Set up a trigger to accept queries (via webhook or manual execution)
Search and retrieve relevant text from your Airtable table based on the query
Use the retrieved text as context for OpenAI GPT to generate an informed response
Return the AI-generated answer that’s grounded in your Airtable data
This will give you a basic RAG system where you can ask questions and get answers based on the content stored in your Airtable, enhanced by GPT’s reasoning capabilities.

 

For the sake of experimentation and “seeing what happens,” approve the suggested course of action to obtain a three-block workflow similar to the one in the image shown above.

Since this isn’t the core Pipedream editor UI, switch to it by clicking “Open Pipedream” in the upper right corner. A new tab will open with the actual Pipedream workflow editor.

For the trigger block, a URL is automatically generated with a syntax similar to this one I got for mine: https://eoupscprutt37xx.m.pipedream.net. Click it and, in the settings pane that opens on the right-hand side, ensure the first couple of options are set to “Full HTTP request” and “Return a static response.”

For the second block (Airtable action) there may be a little work to do. First, connect to your Airtable base. If you’re working in the same browser, this might be straightforward: sign in to Airtable from the pop-up window that appears after clicking “Connect new account,” then follow the on-screen steps to specify the base and table to access:

 

Pipedream workflow editor: connecting to Airtable

 

Here comes the tricky part (and a reason I intentionally left an imperfect prompt earlier when asking the AI agent to build the skeleton workflow): there are several types of Airtable actions to choose from, and the specific one we need for a RAG-style retrieval mechanism is “List records.” Chances are, this isn’t the action you see in the second block of your workflow. If that’s the case, remove it, add a new block in the middle, select “Airtable,” and choose “List records.” Then reconnect to your table and test the connection to ensure it works.
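For reference, the “List records” action wraps Airtable’s Web API list-records endpoint. The sketch below shows roughly what the equivalent direct call looks like; the base ID, table name, and token are placeholders for your own values.

```javascript
// Build a request against Airtable's Web API list-records endpoint.
// All three arguments are placeholders for your own values.
function listRecordsRequest(baseId, tableName, token) {
  return {
    url: `https://api.airtable.com/v0/${baseId}/${encodeURIComponent(tableName)}`,
    options: {
      headers: { Authorization: `Bearer ${token}` },
    },
  };
}

// Usage (commented out: requires real credentials):
// const { url, options } = listRecordsRequest("appXXXX", "Knowledge", process.env.AIRTABLE_TOKEN);
// const { records } = await (await fetch(url, options)).json();
```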

This is what a successfully tested connection looks like:

 

Pipedream workflow editor: testing connection to Airtable

 

Finally, set up and configure OpenAI access to GPT. Keep your API key handy. If your third block’s secondary label isn’t “Generate RAG response,” remove the block and replace it with a new OpenAI block of that subtype.

Start by establishing an OpenAI connection using your API key:

 

Establishing OpenAI connection

 

The user question field should be set as {{ steps.trigger.event.body.test }}, and the knowledge base records (your text “documents” for RAG from Airtable) must be set as {{ steps.list_records.$return_value }}.

You can keep the rest as default and test, but you may encounter parsing errors common to these kinds of workflows, prompting you to jump back to String for support and automatic fixes using the AI agent. Alternatively, you can directly copy and paste the following into the OpenAI component’s code field at the bottom for a robust solution:

import openai from "@pipedream/openai"

export default defineComponent({
  name: "Generate RAG Response",
  description: "Generate a response using OpenAI based on user question and Airtable knowledge base content",
  type: "action",
  props: {
    openai,
    model: {
      propDefinition: [
        openai,
        "chatCompletionModelId",
      ],
    },
    question: {
      type: "string",
      label: "User Question",
      description: "The question from the webhook trigger",
      default: "{{ steps.trigger.event.body.test }}",
    },
    knowledgeBaseRecords: {
      type: "any",
      label: "Knowledge Base Records",
      description: "The Airtable records containing the knowledge base content",
      default: "{{ steps.list_records.$return_value }}",
    },
  },
  async run({ $ }) {
    // Extract user question
    const userQuestion = this.question;
    
    if (!userQuestion) {
      throw new Error("No question provided from the trigger");
    }

    // Process Airtable records to extract content
    const records = this.knowledgeBaseRecords;
    let knowledgeBaseContent = "";
    
    if (records && Array.isArray(records)) {
      knowledgeBaseContent = records
        .map(record => {
          // Extract content from fields.Content
          const content = record.fields?.Content;
          return content ? content.trim() : "";
        })
        .filter(content => content.length > 0) // Remove empty content
        .join("\n\n---\n\n"); // Separate different knowledge base entries
    }

    if (!knowledgeBaseContent) {
      throw new Error("No content found in knowledge base records");
    }

    // Create system prompt with knowledge base context
    const systemPrompt = `You are a helpful assistant that answers questions based on the provided knowledge base. Use only the information from the knowledge base below to answer questions. If the information is not available in the knowledge base, please say so.

Knowledge Base:
${knowledgeBaseContent}

Instructions:
- Answer based only on the provided knowledge base content
- Be accurate and concise
- If the answer is not in the knowledge base, clearly state that the information is not available
- Cite relevant parts of the knowledge base when possible`;

    // Prepare messages for OpenAI
    const messages = [
      {
        role: "system",
        content: systemPrompt,
      },
      {
        role: "user",
        content: userQuestion,
      },
    ];

    // Call OpenAI chat completion
    const response = await this.openai.createChatCompletion({
      $,
      data: {
        model: this.model,
        messages: messages,
        temperature: 0.7,
        max_tokens: 1000,
      },
    });

    const generatedResponse = response.generated_message?.content;

    if (!generatedResponse) {
      throw new Error("Failed to generate response from OpenAI");
    }

    // Export summary for user feedback
    $.export("$summary", `Generated RAG response for question: "${userQuestion.substring(0, 50)}${userQuestion.length > 50 ? '...' : ''}"`);

    // Return the generated response
    return {
      question: userQuestion,
      response: generatedResponse,
      model_used: this.model,
      knowledge_base_entries: records ? records.length : 0,
      full_openai_response: response,
    };
  },
})

 

If no errors or warnings appear, you should be ready to test and deploy. Deploy first, and then test by passing a user query like this in the newly opened deployment tab:

 

Testing deployed workflow with a prompt asking what is the capital of Japan

 

If the request is handled and everything runs correctly, scroll down to see the response returned by the GPT model accessed in the last stage of the workflow:

 

GPT model response

 

Well done! This response is grounded in the knowledge base we built in Airtable, so we now have a simple prototype RAG system that combines Airtable and GPT models via Pipedream.
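You can also exercise the deployed trigger from a short script instead of the deployment tab. A minimal sketch (the URL is a placeholder for your own trigger URL, and the `test` body field matches the `steps.trigger.event.body.test` mapping configured earlier):

```javascript
// Build a POST request for the deployed Pipedream trigger.
// The "test" body field matches the mapping used in the workflow.
function buildRequest(question) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ test: question }),
  };
}

async function askWorkflow(url, question) {
  const res = await fetch(url, buildRequest(question));
  return res.json(); // includes the grounded response from the final step
}

// Example (commented out: substitute your own trigger URL):
// askWorkflow(process.env.TRIGGER_URL, "What is the capital of Japan?")
//   .then((out) => console.log(out));
```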

 

Wrapping Up

 
This article showed how to build, with little or no coding, an orchestration workflow that prototypes a RAG system, using an Airtable table of text as the knowledge base for retrieval and OpenAI’s GPT models for response generation. Pipedream lets you define such workflows programmatically, manually, or with the help of its conversational AI agent, and along the way we weighed the pros and cons of each approach.
 
 

Iván Palomares Carrascosa is a leader, writer, speaker, and adviser in AI, machine learning, deep learning & LLMs. He trains and guides others in harnessing AI in the real world.



