# MinIO - High-Performance Object Storage

## What is MinIO?
MinIO is a high-performance, S3-compatible object storage system designed for cloud-native and containerized environments. It’s used within Oversight as the primary storage backend for Langfuse and can be integrated with other components for scalable data storage.
## Key Features

### 🚀 High Performance

- Sub-millisecond latency for object operations
- Multi-threaded for concurrent access
- Read-after-write consistency
- Optimized for NVMe storage

### ☁️ S3 Compatible

- Full S3 API compatibility
- Works with existing S3 tools and SDKs
- Easy migration from AWS S3
- Standard AWS SDK integration

### 🔒 Security First

- Encryption at-rest and in-transit
- IAM-style access control
- Multi-tenancy support
- Audit logging
- Versioning support

### 📊 Enterprise Features

- Distributed mode for high availability
- Erasure coding for data protection
- Replication across sites
- Lifecycle management
- Event notifications

### 🎯 Cloud Native

- Kubernetes native
- Containerized deployment
- Horizontal scaling
- Stateless design
## Installation in Oversight

MinIO is automatically installed as part of the Langfuse stack:

```bash
cd langfuse
docker compose up -d
```

MinIO is configured with:

- API endpoint: http://localhost:9090
- Console: http://localhost:9091
- Default credentials: minio/miniosecret
## Accessing MinIO Console

1. Open a browser to http://localhost:9091
2. Log in with the default credentials:
   - Username: minio
   - Password: miniosecret
3. View buckets, objects, and metrics
## Buckets in Oversight

MinIO automatically creates the `langfuse` bucket with the following prefixes:

- `events/`: LLM trace events
- `media/`: Uploaded media files
- `exports/`: Batch export data
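Which prefix a given object lands under can be illustrated with a small helper. This is only a sketch of the layout described above, not Langfuse's actual key scheme:

```python
# Prefixes created in the langfuse bucket, as listed above.
PREFIXES = {
    "event": "events/",
    "media": "media/",
    "export": "exports/",
}

def object_key(kind: str, name: str) -> str:
    """Build a full object key for the langfuse bucket (illustrative only)."""
    try:
        return PREFIXES[kind] + name
    except KeyError:
        raise ValueError(f"unknown object kind: {kind!r}")

print(object_key("event", "trace-123.json"))  # events/trace-123.json
```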
## Configuration

### Environment Variables

MinIO in Langfuse is configured via environment variables:

```bash
# Event Upload (Internal)
LANGFUSE_S3_EVENT_UPLOAD_BUCKET=langfuse
LANGFUSE_S3_EVENT_UPLOAD_ENDPOINT=http://minio:9000
LANGFUSE_S3_EVENT_UPLOAD_ACCESS_KEY_ID=minio
LANGFUSE_S3_EVENT_UPLOAD_SECRET_ACCESS_KEY=miniosecret

# Media Upload (External)
LANGFUSE_S3_MEDIA_UPLOAD_BUCKET=langfuse
LANGFUSE_S3_MEDIA_UPLOAD_ENDPOINT=http://localhost:9090
LANGFUSE_S3_MEDIA_UPLOAD_ACCESS_KEY_ID=minio
LANGFUSE_S3_MEDIA_UPLOAD_SECRET_ACCESS_KEY=miniosecret
```

Note that the event endpoint uses the internal Docker hostname (`minio:9000`), while the media endpoint must be reachable from the browser (`localhost:9090`).

### Standalone Installation
For standalone MinIO deployment:
```bash
# Using Docker
docker run -d \
  -p 9000:9000 \
  -p 9001:9001 \
  --name minio \
  -e "MINIO_ROOT_USER=admin" \
  -e "MINIO_ROOT_PASSWORD=adminpassword" \
  -v /mnt/data:/data \
  quay.io/minio/minio server /data --console-address ":9001"
```

## SDK Integration
### Python

```bash
pip install minio
```

```python
from minio import Minio

client = Minio(
    "localhost:9090",
    access_key="minio",
    secret_key="miniosecret",
    secure=False,
)

# Upload object
client.fput_object(
    "mybucket",
    "myobject.txt",
    "/path/to/file.txt",
)

# Download object
client.fget_object(
    "mybucket",
    "myobject.txt",
    "/path/to/download.txt",
)
```

### JavaScript
```bash
npm install minio
```

```javascript
const Minio = require('minio');

const minioClient = new Minio.Client({
  endPoint: 'localhost',
  port: 9090,
  useSSL: false,
  accessKey: 'minio',
  secretKey: 'miniosecret'
});

// Upload file
minioClient.fPutObject('mybucket', 'myobject.txt', '/path/to/file.txt');
```

### AWS SDK (S3 Compatible)
```python
import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9090',
    aws_access_key_id='minio',
    aws_secret_access_key='miniosecret',
)

# Upload
s3.upload_file('/path/to/file.txt', 'mybucket', 'myobject.txt')

# Download
s3.download_file('mybucket', 'myobject.txt', '/path/to/download.txt')
```

## Use Cases in Oversight
### LLM Trace Storage

Store detailed traces and events from Langfuse for long-term analysis.

### Media File Storage

Handle images, audio, and video files used in LLM interactions.

### Data Lake Storage

Build a data lake for analytics with DataHub metadata.

### Backup Storage

Store backups of databases and configurations.

### Artifact Storage

Store ML model artifacts, datasets, and experiment results.
## Advanced Features

### Bucket Policies

Create fine-grained access policies:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": ["*"]},
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::public-bucket/*"]
    }
  ]
}
```

### Lifecycle Rules
Automatically expire or transition objects:
```xml
<LifecycleConfiguration>
  <Rule>
    <ID>expire-old-files</ID>
    <Status>Enabled</Status>
    <Expiration>
      <Days>90</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>
```

### Event Notifications
Trigger webhooks on object events:
```bash
mc event add local/mybucket arn:minio:sqs::primary:webhook \
  --event put,delete
```

### Replication
Set up cross-region replication:
```bash
mc replicate add local/source-bucket \
  --remote-bucket remote-bucket \
  --arn arn:minio:replication::replica:dest
```

## Monitoring
### Metrics
MinIO exposes Prometheus-compatible metrics:
- Request rate
- Throughput
- Latency percentiles
- Error rates
- Storage capacity
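MinIO serves these metrics on the API port at `/minio/v2/metrics/cluster`. A minimal Prometheus scrape job might look like the following, assuming metrics have been made public with `MINIO_PROMETHEUS_AUTH_TYPE=public` (otherwise generate a bearer token with `mc admin prometheus generate`):

```yaml
scrape_configs:
  - job_name: minio
    metrics_path: /minio/v2/metrics/cluster
    static_configs:
      - targets: ['localhost:9090']  # MinIO API port
```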
### Health Checks

```bash
# Liveness check
curl http://localhost:9090/minio/health/live

# Readiness check
curl http://localhost:9090/minio/health/ready
```

## Integration with Other Components
### With DataHub

Store large dataset artifacts and reference them in DataHub metadata.

### With Langfuse

Automatic integration for event and media storage (already configured).

### With Applications

Use as S3-compatible storage for any application data.
## Management Tools

### MinIO Client (mc)

```bash
# Install mc
brew install minio/stable/mc

# Configure alias
mc alias set local http://localhost:9090 minio miniosecret

# List buckets
mc ls local

# Copy files
mc cp myfile.txt local/mybucket/

# Mirror directories
mc mirror /local/dir local/mybucket/prefix
```

### Web Console
Access the browser-based console at http://localhost:9091 for:
- Bucket management
- Object browsing
- User management
- Monitoring dashboards
- Configuration
## Performance Tuning

### Storage Configuration

```bash
# Use multiple drives for better performance
minio server /data{1...4}
```

### Network Optimization

```bash
# Raise the cap on concurrent API requests
export MINIO_API_REQUESTS_MAX=1000
```

### Cache Settings

Note: drive caching has been deprecated in recent MinIO releases; check your version's documentation before relying on it.

```bash
# Enable cache for frequently accessed objects
export MINIO_CACHE="on"
export MINIO_CACHE_DRIVES="/mnt/cache"
```

## Security Best Practices
- Change the default credentials immediately
- Enable HTTPS in production
- Use IAM policies for access control
- Enable encryption at-rest
- Apply security updates regularly
- Enable audit logging
- Isolate MinIO on a private network where possible
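Access policies like the one shown under Bucket Policies can also be generated programmatically instead of hand-edited. A sketch that builds a read-only policy document (the bucket name is a placeholder):

```python
import json

def read_only_policy(bucket: str) -> str:
    """Return a bucket policy JSON allowing anonymous reads only (illustrative)."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": ["*"]},
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }
    return json.dumps(policy, indent=2)

print(read_only_policy("public-bucket"))
```

The resulting string can be applied with the Python SDK's `set_bucket_policy`, keeping the grant limited to `s3:GetObject` on a single bucket.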
## High Availability

### Distributed Mode

Deploy MinIO in distributed mode:

```bash
# 4-node cluster
minio server \
  http://host{1...4}/data{1...4}
```

### Kubernetes Deployment

```bash
# Using the MinIO Operator
kubectl apply -k github.com/minio/operator

# Create a tenant
kubectl apply -f minio-tenant.yaml
```

## Backup & Recovery
### Backup

```bash
# Mirror a bucket to the backup location
mc mirror local/mybucket backup/mybucket
```

### Restore

```bash
# Restore from backup
mc mirror backup/mybucket local/mybucket
```
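For scheduled backups, the mirror call can be wrapped in a small script. The helper below only assembles the `mc` command line (the alias names match the examples above); actually running it requires `mc` to be installed and configured:

```python
import subprocess

def mirror_cmd(source: str, destination: str, watch: bool = False) -> list[str]:
    """Build an `mc mirror` invocation; --watch keeps syncing as objects change."""
    cmd = ["mc", "mirror", source, destination]
    if watch:
        cmd.append("--watch")
    return cmd

def run_backup(source: str = "local/mybucket",
               destination: str = "backup/mybucket") -> None:
    # check=True raises CalledProcessError if the mirror fails.
    subprocess.run(mirror_cmd(source, destination), check=True)

print(mirror_cmd("local/mybucket", "backup/mybucket", watch=True))
```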