This is the final post in a four-part series on building a true agentic analytics platform. Part 1 introduced the three pillars; Part 2 covered data unification; Part 3 covered data meaning and the semantic layer. This post focuses on the third pillar: governed agentic access.
You've unified your data across sources, and you've built a semantic layer that gives agents the context to form accurate queries. That's a strong foundation. But there's a third layer of the agentic analytics stack that's easy to underestimate until something goes wrong.
Governed agentic access is the set of capabilities that determine how agents connect to your data, what they're allowed to do with it, how fast they can get answers, and what they can do with data that isn't neatly structured in a SQL table.
Get this layer wrong, and you end up with one of two failure modes: agents that are properly secured but so slow and friction-heavy that agentic workflows aren't practical, or agents that work fast but with no meaningful access controls, a security and compliance risk no serious enterprise can accept. The goal is both safe and fast, governed and capable.
The Access Control Problem for Agents
When a human analyst runs a query in a BI tool, access control is relatively straightforward. They authenticate with their credentials, and the platform applies their permissions. They might be restricted to specific schemas, rows within a table, or columns that mask PII. This is standard role-based access control (RBAC), and most platforms handle it reasonably well.
AI agents complicate this picture in several ways.
First, agents often run as service accounts rather than as individual users. A service account that runs "as the analytics platform" has no natural permission boundary: it can potentially query anything the platform has access to. Without careful configuration, an agent connected through a broadly scoped service account could, in theory, return HR data that the sales analysts it serves were never intended to touch.
Second, agents execute queries autonomously. When a human analyst runs a query and gets data they weren't expecting, they notice. When an agent queries data outside its intended scope and incorporates it into a downstream workflow, the violation might not surface until significant damage is done.
Third, the query volume from agents is different from human query volume. Agents can generate many queries in rapid succession as part of a multi-step reasoning process. A governance model built around human query patterns may buckle under this kind of workload, and a poorly governed high-volume query pattern is a compliance auditor's nightmare.
The governance model for agentic analytics has to extend cleanly from the human analytics case to cover agents specifically. That means row-level security that applies based on the credentials and role the agent is operating under, column masking that prevents agents from returning PII they're not authorized to handle, and policy enforcement that's consistent regardless of whether the caller is a human analyst or an LLM running a tool call.
Dremio's access control model applies the same governance rules to every query, regardless of the client. An agent connecting via Dremio's MCP Server inherits the permissions of the credential it authenticates with, not a blanket service account. Row-level policies, column masks, and schema-level access restrictions all apply. The platform doesn't have a separate "agent mode" with looser rules.
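As a concrete sketch of what credential-scoped governance looks like, the snippet below shows the UDF-based policy pattern Dremio uses for row-level security and column masking. The function names, table names, and group names are illustrative assumptions, and exact DDL varies by Dremio version; the point is that the same policies gate every caller, agent or human.

```sql
-- Illustrative sketch; exact policy DDL varies by Dremio version.
-- A boolean UDF decides row visibility based on the querying identity.
CREATE FUNCTION region_filter (region VARCHAR)
RETURNS BOOLEAN
RETURN SELECT is_member('global_admins') OR region = 'EMEA';

-- Attach the policy to a table; it applies to agents and humans alike.
ALTER TABLE sales.orders ADD ROW ACCESS POLICY region_filter(region);

-- A masking UDF redacts PII for callers outside an authorized group.
CREATE FUNCTION mask_email (email VARCHAR)
RETURNS VARCHAR
RETURN SELECT CASE WHEN is_member('pii_readers') THEN email ELSE '***' END;

ALTER TABLE sales.customers
  MODIFY COLUMN email SET MASKING POLICY mask_email(email);
```

Because the policy is attached to the table rather than to the client, an agent authenticating through the MCP Server hits the same filter and mask as an analyst in a BI tool.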
MCP: Purpose-Built Connectivity for AI Agents
For years, connecting an AI agent to a data platform meant building a custom integration: a REST API wrapper, a custom tool definition, connection pooling, error handling, and retry logic. Every agent framework did this differently. The result was a fragmented ecosystem where each new AI tool or framework required its own connector built from scratch.
Figure 1: Every agent query flows through Dremio's MCP Server, through access controls, through the semantic layer, and out to the right data source, governed at every step.
The Model Context Protocol (MCP) changes this. MCP is an open standard for how AI agents discover and interact with external tools and services. A platform that implements an MCP Server exposes its capabilities as a set of structured tools that any MCP-compatible AI agent (Claude, LangChain, Codex, or custom frameworks) can call without custom integration work.
For data platforms, MCP is particularly well-suited because it maps cleanly onto the workflow agents need to follow: discover what data is available, understand its structure, compose a query, run it, and interpret results. These are well-defined steps, and MCP provides the protocol for each.
Dremio's MCP Server exposes exactly this set of tools:
RunSqlQuery: The agent passes a SQL query (or a natural language question that Dremio translates to SQL) and gets back query results. Governance rules apply at the platform layer, before results are returned.
GetSchemaOfTable: The agent can inspect the structure of any table it has access to, including column names, types, and semantic descriptions from Dremio's AI Semantic Layer. This is how agents orient themselves to a new dataset without hallucinating schema details.
RunSemanticSearch: The agent can search across Dremio's semantic layer to find datasets relevant to a question, even when it doesn't know the exact table name. This is crucial for navigating large, complex data environments where the agent can't know every table in advance.
The practical impact of MCP is that an agent can connect to Dremio and start answering complex questions with no custom integration code. The AI framework calls the MCP tool; Dremio handles authentication, query execution, governance enforcement, and result formatting. For teams building agentic applications, this cuts integration time from weeks to hours.
The open-standards approach matters here, too. MCP is not a Dremio-proprietary protocol. It's an emerging standard backed by Anthropic and adopted across the AI developer ecosystem. An investment in MCP connectivity to Dremio isn't locked to any particular AI framework. As the ecosystem evolves, the same connection point works with new agents and new tools.
AI SQL Functions: Bringing Unstructured Data Into the Agent's Reach
Figure 2: AI SQL Functions let agents analyze unstructured text (ticket bodies, reviews, documents) inside the same SQL query as structured data. No separate pipeline, no context switching.
Here's a practical challenge that agentic analytics quickly surfaces: much of enterprise data isn't structured.
Support ticket narratives. Customer feedback surveys. Contract documents stored as PDFs. Product review comments. Log files with free-text error messages. Sales call transcripts. These datasets contain real signals, often exactly the signal that makes the difference between a good business question and a great one. But they don't fit neatly into SQL tables with typed columns.
The traditional approach is to run a preprocessing pipeline: extract text, chunk it, embed it, store it in a vector database, and query it separately from the structured data. This works, but it means the agent has to switch contexts: run SQL for structured queries, run vector search for unstructured content, and then combine the results. It's complex to build and fragile at runtime.
AI SQL Functions handle this differently. Instead of moving unstructured data analysis outside the SQL layer, they bring AI capabilities inside it. An AI function is a SQL function that calls an LLM on a column value at query time.
Consider a table of customer support tickets with a ticket_body text column. A standard SQL query can count tickets or filter by timestamp. An AI SQL function can classify each ticket by topic, extract a sentiment score, or determine whether it represents a bug report or a feature request, all within a single SQL statement that an agent can compose naturally.
SELECT
ticket_id,
customer_id,
AI_CLASSIFY(ticket_body, ['billing', 'technical', 'product_feedback', 'other']) AS category,
AI_SENTIMENT(ticket_body) AS sentiment_score
FROM support_tickets
WHERE created_date >= CURRENT_DATE - INTERVAL '30' DAY;
For an agent, this is significant. It means unstructured data becomes a first-class citizen of the query environment, accessible through the same SQL interface the agent uses for everything else, with no context switching, no separate tool call, no custom glue code. The agent can join structured and unstructured analysis in a single query.
Dremio's AI Functions bring this capability directly into the SQL layer. Agents querying Dremio can apply LLM-based analysis to text columns, PDFs, logs, and other unstructured content as part of normal query composition. The results can be joined to structured tables, filtered, aggregated, and returned like any other SQL output.
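As a sketch of that joint analysis, the query below aggregates AI-derived ticket categories against structured order data in a single statement. The `orders` table, `order_total` column, and exact AI function signatures are hypothetical, used here only to illustrate the pattern.

```sql
-- Classify unstructured ticket text, then join the result to structured orders.
WITH classified AS (
  SELECT
    customer_id,
    AI_CLASSIFY(ticket_body, ['billing', 'technical', 'product_feedback', 'other']) AS category
  FROM support_tickets
  WHERE created_date >= CURRENT_DATE - INTERVAL '30' DAY
)
SELECT
  c.category,
  COUNT(*) AS ticket_count,
  SUM(o.order_total) AS revenue_at_risk  -- structured signal alongside AI output
FROM classified c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.category
ORDER BY revenue_at_risk DESC;
```

An agent composing this query needs no separate vector store or retrieval pipeline; the LLM-based classification is just another column expression it can filter, join, and aggregate.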
Autonomous Performance: Governance Without the Speed Tax
There's an implicit tension in governed agentic access: security and governance add overhead. Authentication checks, row-level policy evaluation, column masking: all of this takes compute. In a human analytics context, where individual queries are relatively infrequent and users can wait a few seconds, this overhead is acceptable. In an agentic context, where a single user-facing response might require five or ten underlying queries executed in rapid succession, it compounds.
Sub-second query latency isn't a nice-to-have for agentic analytics. It's a requirement for conversational workflows to feel natural. A multi-step agent workflow that takes 30 seconds per query step will take minutes to complete, which breaks the user experience even if the final answer is correct.
Autonomous Reflections address this. As described in Part 2, Reflections are smart materializations that Dremio builds and maintains automatically based on observed query patterns. An agent that runs a particular query pattern repeatedly will benefit from an automatically created acceleration structure the next time, without any configuration from a data engineer.
The key property for agentic access is that Reflections are transparent. The agent continues to query the same tables with the same SQL. The platform routes the query to the Reflection when one is available. From the agent's perspective, nothing has changed. From the user's perspective, answers arrive much faster.
Governance rules still apply on top of Reflections. The materialized data is subject to the same access controls as the underlying tables: a Reflection built on a table with row-level security can't be used to bypass that security for a credential that shouldn't see the restricted rows.
Figure 3: Autonomous Reflections require zero configuration. Dremio detects patterns, builds the acceleration structure, and routes future queries transparently. Agents see the same SQL, just much faster results.
Putting All Three Pillars Together
This series has walked through the three layers of a complete agentic analytics platform:
Data Unification gives agents comprehensive access to enterprise data across clouds, on-premises systems, and data lakes without ETL pipelines or data movement. Federation means no source is off-limits. The lakehouse means core data is managed, versioned, and performant.
Data Meaning gives agents the business context to form accurate queries: metric definitions, table documentation, synonym resolution, and relationship encoding. Without this layer, agents have access to data but no understanding of it.
Governed Agentic Access gives agents a safe, fast, purpose-built interface to the platform. Access controls extend cleanly to agent workloads. MCP removes integration friction. AI SQL Functions bring unstructured data analysis into the same query layer. Autonomous Reflections ensure that governance overhead doesn't come at the cost of prohibitive latency.
Dremio delivers all three as a single integrated platform, not as separate products stitched together, but as an architecture designed with agentic workloads in mind from the ground up.
The result is what Dremio calls The Agentic Lakehouse: a data platform where AI agents can discover, query, analyze, and act on enterprise data reliably, accurately, and safely, at the speed users need for real agentic workflows.
If you're building or planning an agentic analytics capability, the three-pillar framework in this series gives you a useful lens for evaluating what your current platform does well and where the gaps are. Most platforms have one pillar; some have two. Getting to all three is what makes the difference between a demo that works and a production system that earns trust.