Galileo AI Evals vs Phoenix

Side-by-side comparison of features, pricing, and ratings

Galileo AI Evals

AI observability and eval engineering platform that turns evals into production guardrails.


Phoenix

Open-source platform for AI agent tracing and evaluation.


Pricing

Galileo AI Evals: Contact Sales

Phoenix: Freemium

Plans

$0/month

$100/month (billed yearly, saves 33%)

$0/mo (open-source)

$0/mo (managed cloud)

Custom

Popularity

Galileo AI Evals: 6.2k views

Phoenix: 7.0k views

Skill Level

Intermediate

API Available

Platforms

Web, API, CLI

Categories

💻 Code & Development, 📊 Data & Analytics, 🔒 Security & Privacy

💻 Code & Development, 📊 Data & Analytics

Features

Galileo AI Evals

Eval engineering platform for AI systems

20+ out-of-the-box evals for RAG, agents, safety, and security

Custom evaluators to encode domain expertise

Auto-tune metrics from live feedback

Distill LLM judges into compact Luna models

Production guardrails at 97% lower cost

Pre-production evals become production guardrails

Guardrail policies to block harmful responses

Insights engine for failure mode analysis and prescription

Capture ground truth from synthetic, dev, and production data

Subject matter expert annotations

Eval scores control agent actions, tool access, and escalation

Run guardrails on L4 GPUs

Deployment options: SaaS, VPC, On-Premises

Supports millions of signals (models, prompts, functions, context, datasets, traces)

Phoenix

Agent tracing: prompts, retrievals, tool calls, outputs

LLM-as-judge evaluations for scoring outputs

Annotation workflows with human review

Dataset creation from traces

Experiment runner for hypothesis testing

Prompt IDE for iterative prompt optimization

Evaluation on cost, latency, and performance

Self-hosting on Docker or Kubernetes

Cloud instances with no infrastructure setup

Vendor-agnostic: support for any model or framework

OpenTelemetry native support

Integrations

NVIDIA NeMo

NVIDIA NIM

MongoDB

CrewAI

HP AI Studio

LlamaIndex

OpenTelemetry

NVIDIA NeMo Agent Toolkit

Docker

Kubernetes