Snowflake Cortex: AI and LLM Functions in Your Data Warehouse

How Snowflake Cortex brings large language model capabilities directly into SQL — summarisation, sentiment analysis, translation, classification, and document extraction — without moving data out of Snowflake or managing AI infrastructure.

Snowflake Cortex represents a significant architectural shift in how organisations can use AI in their analytics workflows. Rather than extracting data from Snowflake, sending it to an external AI API, and bringing results back, Cortex brings AI functions directly into Snowflake SQL — allowing you to run language model inference on your data in place, without data movement, without API integrations, and with the same access controls that govern your existing Snowflake data.

For data teams that have been watching the rise of LLMs and wondering how to apply them to their analytical data without building complex infrastructure, Cortex is the pragmatic answer: SQL functions that call large language models as naturally as CONCAT() or DATEDIFF().

What Snowflake Cortex Is

Snowflake Cortex is a set of AI services built into Snowflake. It has two main components:

**Cortex LLM Functions**: SQL functions that call large language models. You pass text (or a text column) to the function and get a result back. These are standard Snowflake SQL functions — they can be used in SELECT statements, WHERE clauses, CTEs, and dbt models exactly like any other SQL function.

**Cortex AI**: a higher-level interface for building conversational AI applications on Snowflake data, including a natural language to SQL capability (Cortex Analyst) and document processing pipelines. This is more relevant for application developers building analytics applications than for data engineers writing batch pipelines.

This guide focuses on the LLM functions, which are the most immediately applicable for analytics engineering and BI workloads.

Core LLM Functions

**SNOWFLAKE.CORTEX.COMPLETE**: the foundational function. Pass a model name and a prompt, get a text response. Used for freeform text generation — summarisation, classification, transformation.

SELECT

SNOWFLAKE.CORTEX.COMPLETE(

'mistral-large',

CONCAT('Classify this customer feedback as Positive, Neutral, or Negative. Return only the word. Feedback: ', review_text)

) AS sentiment

FROM customer_reviews

WHERE review_date >= CURRENT_DATE - 30;

**SNOWFLAKE.CORTEX.SENTIMENT**: a purpose-built sentiment analysis function that returns a sentiment score between -1 (most negative) and 1 (most positive) without requiring prompt engineering.

SELECT

review_id,

review_text,

SNOWFLAKE.CORTEX.SENTIMENT(review_text) AS sentiment_score

FROM customer_reviews;

**SNOWFLAKE.CORTEX.SUMMARIZE**: generates a concise summary of a longer text.

SELECT

ticket_id,

SNOWFLAKE.CORTEX.SUMMARIZE(support_transcript) AS case_summary

FROM support_tickets

WHERE LENGTH(support_transcript) > 1000;

**SNOWFLAKE.CORTEX.TRANSLATE**: translates text between languages.

**SNOWFLAKE.CORTEX.EXTRACT_ANSWER**: extracts a specific answer from a text given a question — useful for extracting structured information from unstructured documents.

SELECT

document_id,

SNOWFLAKE.CORTEX.EXTRACT_ANSWER(

contract_text,

'What is the contract termination notice period?'

) AS notice_period

FROM contracts;

Practical Use Cases

**Customer feedback analysis at scale.** Most organisations have large backlogs of unstructured customer feedback — support tickets, survey responses, reviews — that are not systematically analysed because doing so manually is impractical. Cortex makes batch sentiment analysis and topic classification a single SQL query.

**Document information extraction.** Contracts, invoices, and compliance documents contain structured information buried in unstructured text. EXTRACT_ANSWER can pull specific fields from document text at scale — contract values, renewal dates, governing law provisions — making document data available for analytical queries without manual extraction.

**Support ticket classification and routing.** Automatically classify support tickets by issue type, urgency, and product area using COMPLETE with a classification prompt. Use the classifications to build dashboards, route tickets to the right team, or feed training data for a fine-tuned model.

**Product catalogue enrichment.** Product descriptions sourced from multiple suppliers vary in format, completeness, and language. SUMMARIZE and TRANSLATE produce normalised descriptions; COMPLETE can extract structured attributes (dimensions, materials, compatibility) from free-text descriptions.

**Meeting notes and call transcript processing.** Sales call transcripts and meeting notes can be summarised (SUMMARIZE), action items extracted (EXTRACT_ANSWER), and customer sentiment classified (SENTIMENT) — automatically enriching CRM data from unstructured conversation records.

Available Models

Cortex LLM functions support multiple models, each with different price, capability, and latency characteristics. As of 2025, available models include:

- **mistral-large** and **mistral-7b**: Mistral AI models, good balance of capability and cost

- **llama3-70b** and **llama3-8b**: Meta's Llama 3, strong general-purpose capability

- **mixtral-8x7b**: Mistral's mixture-of-experts model, cost-efficient for many tasks

- **reka-flash** and **reka-core**: Reka models with strong multimodal capabilities

- **arctic-embed-m**: Snowflake's embedding model for semantic search use cases

Model selection involves tradeoffs: larger models produce better results but have higher per-token costs and higher latency. For classification tasks with well-crafted prompts, smaller models (mistral-7b, llama3-8b) often match larger models at a fraction of the cost. Test with the task before committing to a model.

Cost and Performance Considerations

Cortex LLM functions are charged per token processed — the length of the input and output text determines the cost. For large-scale batch processing (millions of records), token costs accumulate quickly. Before running Cortex functions at full scale:

- Test with a sample (1,000–10,000 rows) to validate result quality and estimate token consumption

- Use the smallest model that produces acceptable quality for the specific task

- Keep prompts concise — every word in a system prompt is charged on every row

- For classification tasks, consider whether a purpose-built function (SENTIMENT) is cheaper than COMPLETE

Cortex queries run on Snowflake's AI compute capacity, separate from your warehouse compute. They do not consume warehouse credits. Response times vary by model and load — expect 1–5 seconds per function call in typical workloads. For real-time use cases, batch processing overnight and querying cached results is more reliable than on-demand inference.

Security and Governance

Cortex inherits Snowflake's access control model. Running SENTIMENT on a customer reviews table requires the same data access permissions as any other query on that table. There is no additional access grant required to use Cortex functions — users who can read the data can apply Cortex functions to it.

Data processed by Cortex LLM functions stays within Snowflake. Snowflake's published architecture (as of 2025) routes data to the model inference layer without leaving the Snowflake environment. For organisations with strict data residency requirements, verify Cortex's data processing model against your compliance requirements before adoption.

For data engineering that incorporates AI-powered analytics and Snowflake architecture design, our data architecture consulting team can advise on AI readiness and Cortex implementation — contact us to discuss your requirements.

Get your data architecture audit in 30 minutes.

A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.

Book a Call →