Sponsored Content
Traditional data platforms have long excelled at structured queries on tabular data – think “how many units did the West region sell last quarter?” This underlying relational foundation is powerful. But with the growing volume and importance of multimodal data (e.g. images, audio, unstructured text), answering nuanced semantic questions by relying on traditional, external machine learning pipelines has become a significant bottleneck.
Consider a common e-commerce scenario: “identify electronics products with high return rates linked to customer photos showing signs of damage upon arrival.” Historically, this meant using SQL for structured product data, sending images to a separate ML pipeline for analysis, and finally attempting to combine the disparate results. This was a multi-step, time-consuming process in which AI was essentially bolted onto the dataflow rather than natively integrated within the analytical environment.
Imagine tackling this task – combining structured data with insights derived from unstructured visual media — using a single elegant SQL statement. This leap is possible by integrating generative AI directly into the core of the modern data platform. It introduces a new era where sophisticated, multimodal analyses can be executed with familiar SQL.
Let’s explore how generative AI is fundamentally reshaping data platforms and allowing practitioners to deliver multimodal insights with the versatility of SQL.
Relational Algebra Meets Generative AI
Traditional data warehouses derive their power from a foundation in relational algebra. This provides a mathematically defined and consistent framework to query structured, tabular data, excelling where schemas are well-defined.
But multimodal data contains rich semantic content that relational algebra, by itself, cannot directly interpret. Generative AI integration acts as a semantic bridge. This enables queries that tap into an AI’s capacity to interpret complex signals embedded in multimodal data, allowing it to reason much like humans do, thereby transcending the constraints of traditional data types and SQL functions.
To fully appreciate this evolution, let’s first explore the architectural components that enable these capabilities.
Generative AI in Action
Modern Data to AI platforms allow businesses to interact with data by embedding generative AI capabilities at their core. Instead of ETL pipelines to external services, functions like BigQuery’s AI.GENERATE and AI.GENERATE_TABLE allow users to leverage powerful large language models (LLMs) using familiar SQL. These functions send data from an existing table, along with a user-defined prompt, to an LLM and return its response.
Unstructured Text Analysis
Consider an e-commerce business with a table containing millions of product reviews across thousands of items. Manual analysis at this volume to understand customer opinion is prohibitively time-consuming. Instead, AI functions can automatically extract key themes from each review and generate concise summaries. These summaries can offer potential customers quick and insightful overviews.
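As a hedged sketch of what this looks like in practice (the table name, connection ID, and prompt are hypothetical, and the parameter style mirrors the AI.GENERATE_BOOL call shown later in this piece):

SQL
SELECT
  product_id,
  -- .result holds the generated text; the full model response and status are also returned
  AI.GENERATE(
    prompt => ('Summarize this review in one sentence: ', review_text),
    connection_id => 'us-central1.conn').result AS review_summary
FROM `my_dataset.product_reviews`;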
Multimodal Analysis
And these functions extend beyond text. Modern LLMs can extract insights from multimodal data, which typically lives in cloud object stores like Google Cloud Storage (GCS). BigQuery simplifies access to these objects with ObjectRef. ObjectRef columns reside within standard BigQuery tables and securely reference objects in GCS for analysis.
Consider the possibilities of combining structured and unstructured data for the e-commerce example:
- Identify all phones sold in 2024 with frequent customer complaints of “Bluetooth pairing issues” and cross-reference the product user manual (PDF) to see if troubleshooting steps are missing.
- List shipping carriers most frequently associated with “damaged on arrival” incidents for the western region by analyzing customer-submitted photos showing transit-related damage.
To address situations where insights depend on external file analysis alongside structured table data, BigQuery uses ObjectRef. Let’s see how ObjectRef enhances a standard BigQuery table. Consider a table with basic product information:
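For illustration, such a table might be defined as follows (the schema and names here are hypothetical):

SQL
CREATE TABLE `my_dataset.products` (
  product_id INT64,
  product_name STRING,
  category STRING,
  customer_reviews STRING  -- aggregated review text for the product
);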
We can easily add an ObjectRef column, named manuals in this example, to reference the official product manual PDF stored in GCS. This allows the ObjectRef to live side-by-side with structured data.
This integration powers sophisticated multimodal analysis. Let’s take a look at an example where we generate Q&A pairs using customer reviews (text) and product manuals (PDF):
SQL
SELECT
  product_id,
  product_name,
  question_answer
FROM
  AI.GENERATE_TABLE(
    MODEL `my_dataset.gemini`,
    (
      SELECT
        product_id,
        product_name,
        ('Use reviews and product manual PDF to generate common question/answers',
         customer_reviews,
         manuals) AS prompt
      FROM `my_dataset.reviews_multimodal`
    ),
    STRUCT("question_answer ARRAY<STRING>" AS output_schema)
  );
The prompt argument of AI.GENERATE_TABLE in this query uses three main inputs:
- A textual instruction asking the model to generate frequently asked questions and answers
- The customer_reviews column (a STRING with aggregated textual commentary)
- The manuals ObjectRef column, linking directly to the product manual PDF
The function uses an unstructured text column and the underlying PDF stored in GCS to perform the AI operation. The output is a set of valuable Q&A pairs that help potential customers better understand the product.
Extending ObjectRef’s Utility
We can easily incorporate additional multimodal assets by adding more ObjectRef columns to our table. Continuing with the e-commerce scenario, we add an ObjectRef column called product_image, which refers to the official product image displayed on the website.
And since ObjectRefs are STRUCT data types, they support nesting within ARRAYs. This is particularly powerful for scenarios where one primary record relates to multiple unstructured objects. For instance, a customer_images column could be an array of ObjectRefs, each pointing to a different customer-uploaded product image stored in GCS.
This ability to flexibly model one-to-one and one-to-many relationships between structured records and various unstructured data objects (within BigQuery and using SQL!) opens analytical possibilities that previously required several external tools.
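As a rough sketch of where this leaves our example table’s schema (ObjectRef is shown expanded into a STRUCT of uri, version, authorizer, and details; treat that exact field list, like the column names, as an illustrative assumption):

SQL
-- Hypothetical schema: one manual and one product image per product (one-to-one),
-- many customer photos per product (one-to-many via ARRAY).
product_id       INT64
product_name     STRING
customer_reviews STRING
manuals          STRUCT<uri STRING, version STRING, authorizer STRING, details JSON>
product_image    STRUCT<uri STRING, version STRING, authorizer STRING, details JSON>
customer_images  ARRAY<STRUCT<uri STRING, version STRING, authorizer STRING, details JSON>>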
Type-specific AI Functions
AI.GENERATE functions offer flexibility in defining output schemas, but for common analytical tasks that require strongly typed outputs, BigQuery provides type-specific AI functions. These functions can analyze text or ObjectRefs with an LLM and return the response as a STRUCT directly in BigQuery.
Here are a few examples:
- AI.GENERATE_BOOL: processes input (text or ObjectRefs) and returns a BOOL value, useful for sentiment analysis or any true/false determination.
- AI.GENERATE_INT: returns an integer value, useful for extracting numerical counts, ratings, or quantifiable integer-based attributes from data.
- AI.GENERATE_DOUBLE: returns a floating point number, useful for extracting scores, measurements, or financial values.
The primary advantage of these type-specific functions is their enforcement of output data types, ensuring predictable scalar results (e.g. booleans, integers, doubles) from unstructured inputs using simple SQL.
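For instance, AI.GENERATE_INT could extract an implied star rating from free-text reviews (the table, column, and connection names are hypothetical):

SQL
SELECT
  product_id,
  AI.GENERATE_INT(
    prompt => ('On a scale of 1 to 5, what star rating does this review imply? ',
               customer_reviews),
    connection_id => 'us-central1.conn').result AS implied_rating
FROM `my_dataset.reviews_table`;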
Building upon our e-commerce example, imagine we want to quickly flag product reviews that mention shipping or packaging issues. We can use AI.GENERATE_BOOL for this binary classification:
SQL
SELECT *
FROM `my_dataset.reviews_table`
WHERE AI.GENERATE_BOOL(
  prompt => ("The review mentions a shipping or packaging problem", customer_reviews),
  connection_id => "us-central1.conn").result;
The query filters records and returns only the rows that mention issues with shipping or packaging. Note that we did not have to specify keywords (e.g. “broken”, “damaged”); instead, the LLM interprets the semantic meaning of each review.
Bringing It All Together: A Unified Multimodal Query
We’ve explored how generative AI enhances data platform capabilities. Now, let’s revisit the e-commerce challenge posed in the introduction: “identify electronics products with high return rates linked to customer photos showing signs of damage upon arrival.” Historically, this required distinct pipelines and often spanned multiple personas (data scientist, data analyst, data engineer).
With integrated AI capabilities, an elegant SQL query can now address this question:
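One possible sketch of such a query follows; the table, columns, connection ID, and the 10% return-rate threshold are all illustrative assumptions, with customer_images assumed to be an ARRAY of ObjectRefs as described earlier:

SQL
SELECT DISTINCT
  p.product_id,
  p.product_name,
  p.return_rate
FROM `my_dataset.products` AS p,
  UNNEST(p.customer_images) AS photo
WHERE p.category = 'Electronics'
  AND p.return_rate > 0.10  -- "high return rate" threshold, chosen arbitrarily
  AND AI.GENERATE_BOOL(
        prompt => ('Does this photo show signs of damage on arrival?', photo),
        connection_id => 'us-central1.conn').result;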
This unified query demonstrates a significant evolution in how data platforms function. Instead of merely storing and retrieving varied data types, the platform becomes an active environment where users can ask business questions and return answers by directly analyzing structured and unstructured data side-by-side, using a familiar SQL interface. This integration offers a more direct path to insights that previously required specialized expertise and tooling.
Semantic Reasoning with AI Query Engine (Coming Soon)
While functions like AI.GENERATE_TABLE are powerful for row-wise AI processing (enriching individual records or generating new data from them), BigQuery also aims to integrate more holistic, semantic reasoning with AI Query Engine (AIQE).
AIQE’s goal is to empower data analysts, even those without deep AI expertise, to perform complex semantic reasoning across entire datasets. AIQE achieves this by abstracting complexities like prompt engineering and allows users to focus on business logic.
Sample AIQE functions may include:
- AI.IF: for semantic filtering. An LLM evaluates if a row’s data aligns with a natural language condition in the prompt (e.g. “return product reviews that raise concerns about overheating”).
- AI.JOIN: joins tables based on semantic similarity or relationships expressed in natural language, not just explicit key equality (e.g. “link customer support tickets to relevant sections in your product knowledge base”).
- AI.SCORE: ranks or orders rows by how well they match a semantic condition, useful for “top-k” scenarios (e.g. “find the top 10 best customer support calls”).
Conclusion: The Evolving Data Platform
Data platforms remain in a continuous state of evolution. From origins centered on managing structured, relational data, they now embrace the opportunities presented by unstructured, multimodal data. The direct integration of AI-powered SQL operators, together with support for references to arbitrary files in object stores through mechanisms like ObjectRef, represents a fundamental shift in how we interact with data.
As the lines between data management and AI continue to converge, the data warehouse stands to remain the central hub for enterprise data — now infused with the ability to understand in richer, more human-like ways. Complex multimodal questions that once required disparate tools and extensive AI expertise can now be addressed with greater simplicity. This evolution toward more capable data platforms continues to democratize sophisticated analytics and allows a broader range of SQL-proficient users to derive deep insights.
To explore these capabilities and start working with multimodal data in BigQuery:
Author: Jeff Nelson, Developer Relations Engineer, Google Cloud