Open-source / Freemium

CVAT Review

CVAT is the go-to open-source choice for computer vision teams who want control and flexibility, though you'll need technical resources to self-host or accept the limitations of the free cloud tier.

Monthly Visitors
6.7K
Pricing
Open-source / Freemium
Free (self-hosted) / $23/mo+
Data Types
Images, Video, 3D/LiDAR
Best For
Computer vision teams wanting open-source flexibility with optional managed cloud hosting

What is CVAT?

CVAT (Computer Vision Annotation Tool) is the most widely adopted open-source annotation platform for computer vision. Originally developed by Intel and now backed by the OpenCV Foundation, it has over 14,000 GitHub stars and a 4.8/5 rating on G2. The platform supports images, video, and 3D point clouds with annotation types including bounding boxes, polygons, polylines, points, skeletons, and cuboids.

What sets CVAT apart is the combination of open-source flexibility with professional cloud options. You can self-host for free with full functionality, or use the cloud service starting at $23/month. The platform exports to 19 formats (YOLO, COCO, PASCAL VOC, KITTI, and more), which avoids lock-in to any specific ML framework. AI-assisted annotation, through SAM 2 integration and auto-annotation, can speed up labeling by up to 10x. The tradeoff is a steeper learning curve than simpler tools, and performance can suffer with very large video files or projects containing thousands of images. For computer vision teams who want control over their annotation infrastructure without paying enterprise prices, CVAT is the default choice.
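To make the "no lock-in" point concrete, here is a minimal sketch of how the same bounding box is written in three of the formats named above. It relies on the commonly documented conventions (COCO: pixel `[x_min, y_min, width, height]`; PASCAL VOC: pixel `[x_min, y_min, x_max, y_max]`; YOLO: center and size normalized by image dimensions). The function names are illustrative and are not part of CVAT.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def coco_to_voc(box: Box) -> Box:
    """COCO [x_min, y_min, w, h] in pixels -> VOC [x_min, y_min, x_max, y_max]."""
    x_min, y_min, w, h = box
    return (x_min, y_min, x_min + w, y_min + h)


def coco_to_yolo(box: Box, img_w: int, img_h: int) -> Box:
    """COCO pixel box -> YOLO [x_center, y_center, w, h], all normalized to 0..1."""
    x_min, y_min, w, h = box
    return ((x_min + w / 2) / img_w, (y_min + h / 2) / img_h, w / img_w, h / img_h)


def yolo_to_coco(box: Box, img_w: int, img_h: int) -> Box:
    """YOLO normalized box -> COCO pixel box (inverse of coco_to_yolo)."""
    x_c, y_c, w, h = box
    box_w, box_h = w * img_w, h * img_h
    return (x_c * img_w - box_w / 2, y_c * img_h - box_h / 2, box_w, box_h)


if __name__ == "__main__":
    coco_box = (100.0, 50.0, 200.0, 100.0)  # hypothetical box in a 640x480 image
    print(coco_to_voc(coco_box))
    print(coco_to_yolo(coco_box, 640, 480))
```

Because the formats differ only in how the same geometry is encoded, exporting to whichever one a training framework expects does not lose information for plain boxes.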

Key Features

  • Open-source with 14K+ GitHub stars and active community
  • Auto-annotation with AI models (up to 10x faster labeling)
  • SAM 2/3 segmentation integration
  • 19 export formats: PASCAL VOC, YOLO, COCO, KITTI, and more
  • 3D point cloud and cuboid annotation
  • Cloud storage integration: AWS S3, Google Cloud, Azure Blob
  • Three deployment options: cloud, self-hosted, or enterprise on-premise
  • Backed by the OpenCV Foundation

Pros & Cons

Pros

  • Truly open-source: self-host for free with full functionality
  • 4.8/5 G2 rating, the highest among open-source annotation tools
  • Extensive export format support (19 formats) avoids vendor lock-in
  • 3D/LiDAR annotation included, unlike many competitors
  • Active community and continuous development (14K+ GitHub stars)
  • Cloud plans more affordable than enterprise competitors ($23/mo starting)

Cons

  • Performance degrades with very large video files or thousands of images
  • Steeper learning curve — complex interface for beginners
  • Self-hosted deployment requires technical expertise to maintain
  • Free cloud tier is very limited: 1 project, 3 tasks, 1GB storage
  • No in-app notifications when annotators complete tasks
  • Computer vision only — no text, audio, or document annotation

Pricing

Pricing model: Open-source / Freemium

Free (Cloud): $0 (1 project, 3 tasks, 1GB)
Solo: $23-33/month
Team: $23-33/user/month
Enterprise: ~$12,000/year
Self-Hosted: Free (open-source)
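
As a rough way to compare the Team and Enterprise rows, the arithmetic below estimates the seat count at which per-user Team pricing reaches the approximate Enterprise figure. This is only a sketch. It assumes the Enterprise price is a flat annual fee independent of seat count, which the listing above does not state, and it treats the two ends of the $23-33 range as separate scenarios.

```python
import math

TEAM_PRICE_RANGE = (23, 33)   # USD per user per month, from the table above
ENTERPRISE_PER_YEAR = 12_000  # approximate USD per year, from the table above


def team_cost_per_year(users: int, price_per_user_month: float) -> float:
    """Annual cost of the Team plan for a given seat count."""
    return users * price_per_user_month * 12


def break_even_users(price_per_user_month: float,
                     enterprise_per_year: float = ENTERPRISE_PER_YEAR) -> int:
    """Smallest seat count at which Team costs at least the Enterprise figure."""
    return math.ceil(enterprise_per_year / (price_per_user_month * 12))


if __name__ == "__main__":
    for price in TEAM_PRICE_RANGE:
        print(f"${price}/user/month: 10 users cost "
              f"${team_cost_per_year(10, price):,.0f}/year, "
              f"break-even at {break_even_users(price)} users")
```

Under these assumptions the break-even lands at 44 users for the $23 rate and 31 users for the $33 rate, so small teams stay well below the Enterprise figure.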

Who Is CVAT Best For?

CVAT is ideal for computer vision teams who value open-source flexibility and want to avoid vendor lock-in. The self-hosted option suits organizations whose security requirements prevent cloud storage of training data, as well as teams with engineering resources who want full control. The cloud plans work well for smaller teams who want managed infrastructure without enterprise pricing. CVAT is less suited for teams needing multi-modal annotation (text, audio, documents); Label Studio or Labelbox is a better fit there. It's also not ideal for non-technical teams without engineering support to handle self-hosted deployment, or for organizations needing extensive managed services and dedicated support.

Frequently Asked Questions

Is CVAT free?
Yes, in two ways. You can self-host CVAT for free using the open-source version with full functionality. The cloud version also has a free tier, but it's limited to 1 project, 3 tasks, 1GB storage, and 100 AI calls per month.
What data types does CVAT support?
CVAT supports images, video, and 3D data (point clouds, LiDAR, cuboids). It's specialized for computer vision and does not support text, audio, or document annotation.
How does CVAT compare to Label Studio?
Both are popular open-source options. CVAT is specialized for computer vision with better 3D/video support and more CV-specific export formats (YOLO, COCO, KITTI). Label Studio supports more data types (text, audio, HTML) and is more flexible for non-CV use cases. Choose CVAT for pure computer vision, Label Studio for multi-modal projects.
Is CVAT good for large datasets?
CVAT handles large datasets well in most cases, but users report performance issues with very large video files or projects containing thousands of images. For enterprise-scale workloads, consider CVAT Enterprise or ensure your self-hosted infrastructure has adequate resources.
What export formats does CVAT support?
CVAT supports 19 export formats including PASCAL VOC, YOLO, COCO, KITTI, CVAT XML, LabelMe, ImageNet, CamVid, and more. This broad format support makes it easy to integrate with most ML frameworks.
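
As an illustration of how little glue code is needed to move between formats, the sketch below reads boxes from a small CVAT XML snippet and emits YOLO-style label lines. The sample XML follows the layout of the "CVAT for images" export as commonly documented (`<image>` elements with `width`/`height`, `<box>` elements with `xtl`, `ytl`, `xbr`, `ybr`); treat that layout as an assumption and check it against a real export. The file name, the label, and the `class_ids` mapping are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical sample; a real CVAT export carries additional metadata.
SAMPLE_XML = """<annotations>
  <image id="0" name="frame_000.jpg" width="640" height="480">
    <box label="car" xtl="100" ytl="50" xbr="300" ybr="150" occluded="0"/>
  </image>
</annotations>"""


def cvat_xml_to_yolo(xml_text: str, class_ids: Dict[str, int]) -> Dict[str, List[str]]:
    """Map each image name to its YOLO label lines (class x_c y_c w h, normalized)."""
    root = ET.fromstring(xml_text)
    result: Dict[str, List[str]] = {}
    for image in root.iter("image"):
        img_w = float(image.get("width"))
        img_h = float(image.get("height"))
        lines = []
        for box in image.iter("box"):
            xtl, ytl = float(box.get("xtl")), float(box.get("ytl"))
            xbr, ybr = float(box.get("xbr")), float(box.get("ybr"))
            # Corner coordinates in pixels -> normalized center and size.
            x_c = (xtl + xbr) / 2 / img_w
            y_c = (ytl + ybr) / 2 / img_h
            w = (xbr - xtl) / img_w
            h = (ybr - ytl) / img_h
            cls = class_ids[box.get("label")]
            lines.append(f"{cls} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
        result[image.get("name")] = lines
    return result


if __name__ == "__main__":
    print(cvat_xml_to_yolo(SAMPLE_XML, {"car": 0}))
```

In practice CVAT's own YOLO export makes this conversion unnecessary; the point is only that the exported annotations are plain, documented data.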
Does CVAT have AI-assisted labeling?
Yes. CVAT includes auto-annotation that can speed up labeling by up to 10x using integrated AI models or your own custom models. The cloud version also includes SAM 2/3 segmentation for interactive AI-assisted annotation.
Who maintains CVAT?
CVAT is the official annotation tool supported by the OpenCV Foundation. It was originally developed by Intel and is now maintained by cvat.ai with an active open-source community contributing to development.

Alternatives to CVAT

Appen
Enterprise

Enterprise teams needing high-volume, multi-language labeling with managed workforce

Roboflow
Freemium

Computer vision teams wanting fast AI-assisted annotation with training and deployment built in

Scale AI
Enterprise

Frontier AI labs and enterprises needing LLM training data, RLHF, or autonomous vehicle annotation at scale

V7 Labs
Enterprise

Teams needing AI-assisted annotation for images, video, or medical imaging with compliance requirements

Labelbox
Freemium

Teams needing multimodal annotation with a strong free tier and path to enterprise scale

Labellerr
Freemium

Small teams wanting AI-assisted annotation with transparent pricing and no minimum commitment

Dataloop
Enterprise

Enterprise teams building production AI pipelines who need annotation, model training, and deployment in one platform

Label Studio
Open-source

Teams needing multi-modal annotation flexibility who can invest time in template configuration

Supervisely
Enterprise

Computer vision teams needing specialized support for medical imaging, LiDAR, or 3D data with built-in AI models

Deepen AI
Enterprise

Autonomous vehicle and robotics teams needing LiDAR annotation with integrated multi-sensor calibration

Tasq.ai
Enterprise

Enterprise teams deploying production LLMs who need human-in-the-loop evaluation and hallucination detection

Amazon SageMaker Ground Truth
Enterprise

AWS-native ML teams who want a managed labeling service integrated with SageMaker training pipelines

Hasty.ai (CloudFactory)
Enterprise

Computer vision teams who want AI-assisted annotation combined with optional managed workforce services

SuperAnnotate
Freemium

Enterprise teams needing multimodal annotation with strong compliance, custom workflows, and optional managed labeling services

Encord
Enterprise

Enterprise teams building physical AI (robotics, autonomous vehicles) or medical AI who need multimodal annotation with 3D/LiDAR and DICOM support

Snorkel AI
Enterprise

Large enterprises with dedicated AI teams who want to replace manual labeling with programmatic weak supervision for text and structured data