
Langfuse - LLM Observability Platform

What is Langfuse?

Langfuse is an open-source LLM engineering platform designed to help teams debug, analyze, and iterate on their LLM applications. It provides comprehensive observability, analytics, and prompt management capabilities for large language model applications.

Key Features

🔍 Tracing & Observability

Complete visibility into LLM application behavior:

  • Detailed traces of all LLM calls
  • Nested spans for complex workflows
  • Request/response logging
  • Latency tracking
  • Error monitoring

📝 Prompt Management

Centralized prompt engineering:

  • Version control for prompts
  • A/B testing capabilities
  • Template management
  • Collaboration tools
  • Production deployment

📊 Analytics & Metrics

Comprehensive performance insights:

  • Cost tracking per model and user
  • Latency analysis with percentiles
  • Usage statistics and trends
  • Model comparison
  • Custom dashboards
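Per-model cost tracking amounts to multiplying logged token counts by per-token prices and aggregating. A minimal sketch (the price table below is illustrative, not actual provider pricing):

```python
# Illustrative per-1K-token prices (input, output); NOT actual provider pricing.
PRICES = {
    "gpt-4": (0.03, 0.06),
    "gpt-3.5-turbo": (0.0005, 0.0015),
}

def generation_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Compute the cost of one generation from its token counts."""
    in_price, out_price = PRICES[model]
    return input_tokens / 1000 * in_price + output_tokens / 1000 * out_price

def cost_by_model(generations: list[dict]) -> dict[str, float]:
    """Aggregate total cost per model across a batch of logged generations."""
    totals: dict[str, float] = {}
    for g in generations:
        totals[g["model"]] = totals.get(g["model"], 0.0) + generation_cost(
            g["model"], g["input_tokens"], g["output_tokens"]
        )
    return totals

print(cost_by_model([
    {"model": "gpt-4", "input_tokens": 1000, "output_tokens": 500},
    {"model": "gpt-3.5-turbo", "input_tokens": 2000, "output_tokens": 1000},
]))
```

The same aggregation keyed by user or session ID instead of model gives per-user and per-use-case cost breakdowns.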

✅ Evaluation & Testing

Quality assurance for LLM outputs:

  • Manual scoring interface
  • Automated evaluations
  • Test dataset management
  • Regression testing
  • Quality metrics
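An automated evaluation is, at its core, a function mapping an LLM output to one or more scores. A minimal sketch with two toy checks (these are illustrative, not Langfuse's built-in evaluators):

```python
def evaluate_answer(output: str, max_words: int = 50) -> dict:
    """Score an LLM output on a few simple automated checks (each in 0.0-1.0)."""
    words = output.split()
    scores = {
        # Penalize answers that exceed the length budget.
        "conciseness": 1.0 if len(words) <= max_words else max_words / len(words),
        # Flag empty answers / likely refusals.
        "non_empty": 1.0 if words else 0.0,
    }
    # Overall score: mean of the individual checks.
    scores["overall"] = sum(scores.values()) / len(scores)
    return scores

print(evaluate_answer("Langfuse is an open-source LLM observability platform."))
```

Running such evaluators over a fixed test dataset on every prompt or model change is what turns them into a regression test.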

🎯 User Segmentation

Understand user behavior:

  • User-level analytics
  • Session tracking
  • Cohort analysis
  • Feedback collection

Architecture

Langfuse consists of several integrated services:

  • Web Application: User interface for visualization
  • Worker Service: Background processing
  • PostgreSQL: Primary data store
  • ClickHouse: Analytics database
  • Redis: Caching and queuing
  • MinIO: Object storage

Quick Start

# Clone repository
git clone https://github.com/langfuse/langfuse
cd langfuse
 
# Start all services
docker compose up -d
 
# Access at http://localhost:3000

Integration Options

Python

pip install langfuse

from langfuse import Langfuse
 
langfuse = Langfuse()
trace = langfuse.trace(name="llm-query")

JavaScript/TypeScript

npm install langfuse

import { Langfuse } from "langfuse";
 
const langfuse = new Langfuse();
const trace = langfuse.trace({ name: "llm-query" });

OpenAI

from langfuse.openai import openai
 
# Automatic tracking
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

LangChain

from langfuse.callback import CallbackHandler
 
handler = CallbackHandler()
 
# Use with LangChain
chain.invoke({"input": "Hello"}, config={"callbacks": [handler]})

Use Cases

Production Monitoring

Monitor LLM applications in production with real-time alerts and dashboards.

Cost Optimization

Track and optimize LLM costs across models, users, and use cases.

Quality Assurance

Evaluate output quality and catch regressions before they reach users.

Prompt Engineering

Iterate on prompts with version control and A/B testing.

Debugging

Quickly identify and fix issues in complex LLM workflows.

Compliance

Maintain audit logs of all LLM interactions for regulatory compliance.

Key Concepts

Traces

Top-level container for a complete LLM interaction or workflow.

Spans

Nested steps within a trace (e.g., retrieval, processing, generation).

Generations

LLM API calls with input, output, and metadata.

Scores

Quality ratings (manual or automated) for evaluating outputs.

Datasets

Collections of test cases for evaluation and regression testing.
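The hierarchy these concepts form (a trace contains spans, spans contain generations, scores attach to the trace) can be sketched as plain data structures. This is a conceptual model for illustration, not the SDK's actual classes:

```python
from dataclasses import dataclass, field

@dataclass
class Generation:
    """One LLM API call: input, output, and metadata."""
    model: str
    input: str
    output: str

@dataclass
class Span:
    """A nested step within a trace, e.g. retrieval or generation."""
    name: str
    generations: list[Generation] = field(default_factory=list)

@dataclass
class Trace:
    """Top-level container for one complete interaction or workflow."""
    name: str
    spans: list[Span] = field(default_factory=list)
    scores: dict[str, float] = field(default_factory=dict)

# A RAG request: a retrieval span, then a generation span with one LLM call.
trace = Trace(name="rag-query")
trace.spans.append(Span(name="retrieval"))
gen = Generation(model="gpt-4", input="Hello!", output="Hi there!")
trace.spans.append(Span(name="generation", generations=[gen]))
trace.scores["helpfulness"] = 0.9  # manual or automated score
```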

Integration with Oversight

Langfuse seamlessly integrates with other Oversight components:

  • Keycloak: SSO and user authentication
  • MinIO: Object storage for events and media
  • DataHub: Cross-reference with data lineage

Supported Models

Langfuse works with all major LLM providers:

  • OpenAI (GPT-3.5, GPT-4, GPT-4-turbo)
  • Anthropic (Claude)
  • Google (PaLM, Gemini)
  • Cohere
  • Hugging Face
  • Azure OpenAI
  • Custom models

Supported Frameworks

  • LangChain: Native callback integration
  • LlamaIndex: Built-in observability
  • Haystack: Custom integration
  • AutoGPT: Agent monitoring
  • Raw API calls: Direct SDK integration

Dashboard Features

Overview

  • Request volume over time
  • Cost trends
  • Latency percentiles
  • Error rates

Traces

  • Searchable trace list
  • Detailed trace viewer
  • Span visualization
  • Timeline view

Generations

  • Model usage statistics
  • Token consumption
  • Cost per generation
  • Performance metrics

Prompts

  • Prompt library
  • Version history
  • Usage statistics
  • Performance comparison

Users

  • User-level analytics
  • Session tracking
  • Feedback scores
  • Usage patterns

Security & Privacy

  • Self-hosted: Full data control
  • Encryption: At-rest and in-transit
  • Access control: Role-based permissions
  • Audit logs: Complete activity tracking
  • Data retention: Configurable policies

Pricing

Langfuse is 100% open-source and free to self-host. For managed cloud hosting, see Langfuse Cloud.
