MapAnything V1.1: Metric 3D Reconstruction for Gaussian Splatting Workflows

Research Date: 2026-01-29
Source URL: https://x.com/nik__v__/status/2016556649092165792

Summary

MapAnything V1.1, released January 18, 2026, is a unified transformer-based feed-forward model for metric 3D reconstruction developed by Meta Reality Labs and Carnegie Mellon University. The system directly regresses 3D scene geometry and camera parameters from images, producing output compatible with Gaussian splatting pipelines through COLMAP format export. This release introduces a model factory supporting multiple reconstruction backends, improved checkpoints, and practical tooling for COLMAP integration.

For users seeking to test MapAnything against their own data with minimal friction, two primary paths exist: a zero-setup Hugging Face demo requiring no local installation, and a local deployment with conda environment setup. Both paths can produce COLMAP-compatible output suitable for downstream Gaussian splatting training.

Relationship to Gaussian Splatting Workflows

Traditional 3DGS Pipeline

Standard 3D Gaussian Splatting (3DGS) workflows follow this sequence:

  1. COLMAP performs Structure-from-Motion (SfM) to estimate camera poses and a sparse point cloud. This step can take minutes to hours depending on image count and requires careful parameter tuning for challenging scenes.
  2. The 3DGS trainer initializes Gaussians from the sparse points and optimizes them against the posed images.

MapAnything as COLMAP Replacement

MapAnything operates as a feed-forward alternative to iterative SfM: a single network pass regresses camera parameters and scene geometry directly, with no feature-matching or bundle-adjustment stage.

Key differences from traditional COLMAP:

| Aspect | COLMAP | MapAnything |
| --- | --- | --- |
| Processing Model | Iterative optimization | Single feed-forward pass |
| Speed | Minutes to hours | Under 1 second for up to 50 views |
| Metric Scale | Up-to-scale reconstruction | True metric scale via learned prior |
| Failure Modes | Feature matching failures, drift | Out-of-distribution inputs |
| GPU Requirements | Primarily CPU-bound | Requires CUDA GPU |

Output Format Compatibility

MapAnything exports standard COLMAP format:

output_dir/
├── images/           # Processed images at model resolution
│   ├── img1.jpg
│   └── img2.jpg
└── sparse/
    ├── cameras.bin   # Camera intrinsics
    ├── images.bin    # Camera extrinsics
    ├── points3D.bin  # Sparse point cloud
    └── points.ply    # Point cloud in PLY format

This output structure is directly compatible with:

  • Official 3DGS implementation
  • gsplat library
  • Nerfstudio
  • Instant-NGP
  • Any tool accepting COLMAP sparse reconstruction
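
Because the binary files follow COLMAP's published layout, an export can be spot-checked without COLMAP installed. As a minimal sketch using only the standard library (the model-id table below covers only the most common camera models; extend it as needed), a `cameras.bin` reader might look like:

```python
import struct

# Parameter counts for common COLMAP camera models (model_id -> count),
# per the documented COLMAP binary format.
CAMERA_MODEL_PARAMS = {
    0: 3,  # SIMPLE_PINHOLE: f, cx, cy
    1: 4,  # PINHOLE: fx, fy, cx, cy
    2: 4,  # SIMPLE_RADIAL: f, cx, cy, k
    3: 5,  # RADIAL: f, cx, cy, k1, k2
    4: 8,  # OPENCV: fx, fy, cx, cy, k1, k2, p1, p2
}

def read_cameras_bin(path):
    """Parse a COLMAP cameras.bin file into {camera_id: camera dict}."""
    cameras = {}
    with open(path, "rb") as f:
        (num_cameras,) = struct.unpack("<Q", f.read(8))
        for _ in range(num_cameras):
            cam_id, model_id = struct.unpack("<ii", f.read(8))
            width, height = struct.unpack("<QQ", f.read(16))
            n = CAMERA_MODEL_PARAMS[model_id]
            params = struct.unpack(f"<{n}d", f.read(8 * n))
            cameras[cam_id] = {
                "model_id": model_id,
                "width": width,
                "height": height,
                "params": params,
            }
    return cameras
```

Printing the parsed intrinsics is a quick way to confirm that focal lengths and image sizes look plausible before handing the export to a trainer.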

Practical Usage Options

Option 1: Zero-Setup Testing via Hugging Face

The Hugging Face Space provides immediate access without local installation:

URL: https://huggingface.co/spaces/facebook/map-anything

Workflow:

  1. Upload video or images via drag-and-drop
  2. Adjust video sample interval if using video input
  3. Click “Reconstruct”
  4. View 3D reconstruction with depth, normal, and measurement visualizations
  5. Download GLB file for external use

Limitations:

  • No COLMAP format export from the web demo
  • Limited to image-only inference mode
  • Queue times during high demand
  • Maximum input constraints imposed by Hugging Face Zero GPU allocation

Best for: Quick validation of reconstruction quality on custom data before committing to local setup.

Option 2: Local Installation for Full Pipeline

Local deployment provides COLMAP export and full feature access.

Minimum Requirements:

  • CUDA-capable GPU (tested on A100, RTX 4090)
  • Python 3.12
  • ~10GB disk space for model weights

Installation:

git clone https://github.com/facebookresearch/map-anything.git
cd map-anything

conda create -n mapanything python=3.12 -y
conda activate mapanything

pip install -e .

Basic Inference with COLMAP Export:

import torch
from mapanything.models import MapAnything

device = "cuda" if torch.cuda.is_available() else "cpu"
model = MapAnything.from_pretrained("facebook/map-anything").to(device)

# Load images
images = [...]  # List of PIL images or paths

# Run inference
pred = model.infer(images)

# Access outputs
pts3d = pred["pts3d"]           # 3D points in world coordinates
cam_trans = pred["cam_trans"]   # Camera translations
cam_quats = pred["cam_quats"]   # Camera rotations as quaternions
depth = pred["depth"]           # Per-image depth maps
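
Downstream tools usually want camera poses as 4x4 matrices rather than quaternion/translation pairs. A pure-Python conversion sketch is below; it assumes w-x-y-z quaternion ordering and that the quaternion/translation pair encodes a camera-to-world transform, so verify the repository's actual conventions before relying on it:

```python
import math

def quat_to_pose(quat, trans):
    """Build a 4x4 camera-to-world matrix from a unit quaternion
    (w, x, y, z) and a translation vector (tx, ty, tz)."""
    w, x, y, z = quat
    # Normalize defensively so near-unit inputs still yield a rotation.
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    # Standard quaternion-to-rotation-matrix expansion.
    R = [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
    return [
        R[0] + [trans[0]],
        R[1] + [trans[1]],
        R[2] + [trans[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]
```

For batched tensors from `pred`, the same math applies per camera after moving the arrays to CPU.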

COLMAP Export from Existing Data:

python scripts/demo_inference_on_colmap_outputs.py \
    --colmap_path /path/to/your/images \
    --save_colmap \
    --viz

Additional flags:

  • --apache: Use Apache 2.0 licensed model for commercial applications
  • --stride N: Process every Nth image
  • --ignore_pose_inputs: Use images only, ignore existing poses

Option 3: Local Gradio Demo

For interactive local testing with GUI:

pip install -e ".[demo]"
python scripts/demo_gradio.py

Opens a browser interface at http://127.0.0.1:7860 with the same functionality as the Hugging Face demo, plus a confidence slider and local file access.

Option 4: Rerun Visualization

For 3D visualization during development:

pip install -e ".[rerun]"
rerun --serve --port 2004 --web-viewer-port 2006

# In another terminal
python scripts/demo_rerun.py --input_dir /path/to/images

Comparison with Alternative Models

V1.1 introduces a model factory supporting multiple 3D reconstruction backends through a unified API:

| Model | Full Name | Parameters | Notes |
| --- | --- | --- | --- |
| mapanything | MapAnything | 518M | Default, best overall performance |
| da3 | Depth Anything 3 | 504M | ByteDance, 35.7% improved pose accuracy |
| pi3x | Pi3-X | 518M | Permutation-equivariant, no reference-view bias |
| vggt | VGGT 1B | 518M | Visual Geometry Grounded Transformer |
| dust3r | DUSt3R + Global BA | 512M | Dense unconstrained stereo |
| mast3r | MASt3R + SGA | 512M | Matching-aware stereo |
| moge | MoGe | 518M | Monocular geometry estimation |

Using Alternative Models:

pip install -e ".[pi3x]"  # Install optional dependencies

python -c "
from mapanything.model_factory import init_model_from_config
model = init_model_from_config({'model_name': 'pi3x'})
"

Performance Comparison

Based on profiling data from the repository, MapAnything achieves the best speed and memory profile across view counts from 2 to 1000:

| Model | Inference Speed | GPU Memory | Notes |
| --- | --- | --- | --- |
| MapAnything | Fastest | Lowest | Best overall efficiency |
| VGGT | Fast | Moderate | Good balance |
| Pi3-X | Fast | Moderate | Permutation-equivariant |
| DA3 | Moderate | Moderate | Strong pose accuracy |
| DUSt3R | Slowest | Highest | Dense stereo reconstruction |

The memory-efficient mode (memory_efficient_inference=True) enables processing up to 2,000 views on a GPU with 140 GB of VRAM.

V1.1 Release Features

Released January 18, 2026, V1.1 introduces:

Model Factory:

  • Unified API for running MapAnything, VGGT, DUSt3R, MASt3R, MUSt3R, Pi3-X, Pow3R, MoGe, AnyCalib, and Depth Anything 3
  • Consistent output format across all wrappers
  • Optional dependencies via pip extras

Improved Checkpoints:

  • V1.1 checkpoints on Hugging Face Hub
  • V1 preserved as facebook/map-anything-v1 for backward compatibility

Profiling:

  • GPU memory and inference speed benchmarking scripts
  • Comparison profiling against all supported external models
  • Visualization outputs for memory and speed plots

COLMAP Integration:

  • Demo script for inference on existing COLMAP outputs
  • Bidirectional workflow: ingest COLMAP data or export to COLMAP format
  • Voxelization tooling for point cloud processing
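
The repository ships its own voxelization tooling; the underlying idea is simple enough to sketch. The following is a hypothetical pure-Python voxel downsampler, not the repo's implementation: it snaps each point to a cubic grid cell and returns one centroid per occupied cell.

```python
import math
from collections import defaultdict

def voxel_downsample(points, voxel_size):
    """Average all points that fall into the same cubic voxel.

    points: iterable of (x, y, z) tuples; voxel_size: voxel edge length.
    Returns one centroid per occupied voxel.
    """
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel index along each axis identifies the cell.
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    # Centroid of the points collected in each occupied voxel.
    return [
        tuple(sum(c) / len(pts) for c in zip(*pts))
        for pts in buckets.values()
    ]
```

Thinning a dense point cloud this way before 3DGS initialization trades point density for memory; production tooling typically adds attribute averaging (color, confidence) on top of the same bucketing.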

WAI Format Benchmarking:

  • AerialMegaDepth dataset integration
  • ScanNet++V2 rendering and config updates

Workflow: MapAnything to Gaussian Splatting

Complete pipeline for using MapAnything output with 3DGS:

# 1. Run MapAnything on your images
python scripts/demo_inference_on_colmap_outputs.py \
    --colmap_path /path/to/your/images \
    --save_colmap \
    --output_dir /path/to/output

# 2. Train Gaussian Splatting using gsplat
pip install gsplat

python -m gsplat.examples.simple_trainer \
    --data_dir /path/to/output \
    --result_dir /path/to/splat_output

# Or use official 3DGS
git clone https://github.com/graphdeco-inria/gaussian-splatting
cd gaussian-splatting
python train.py -s /path/to/output
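
Before launching training, it can save a failed run to confirm the export matches the directory layout shown earlier. A small sanity-check sketch (`check_colmap_export` is a hypothetical helper; adjust the expected paths if your trainer wants a different layout):

```python
from pathlib import Path

# Files the COLMAP-format export is expected to contain
# (sparse/points.ply is optional for most trainers).
REQUIRED = [
    "sparse/cameras.bin",
    "sparse/images.bin",
    "sparse/points3D.bin",
]

def check_colmap_export(output_dir):
    """Return a list of missing pieces in a COLMAP-style export."""
    root = Path(output_dir)
    missing = [rel for rel in REQUIRED if not (root / rel).is_file()]
    images = root / "images"
    if not images.is_dir() or not any(images.iterdir()):
        missing.append("images/ (no image files found)")
    return missing
```

An empty return value means the layout is plausible; it does not validate the binary contents themselves.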

Licensing Considerations

Two model variants are available:

| Variant | License | Use Case |
| --- | --- | --- |
| facebook/map-anything | CC-BY-NC 4.0 | Research and academic |
| facebook/map-anything-apache | Apache 2.0 | Commercial applications |

For commercial Gaussian splatting workflows, use the Apache-licensed variant:

model = MapAnything.from_pretrained("facebook/map-anything-apache")

Key Findings

  • MapAnything provides a practical feed-forward alternative to COLMAP for Gaussian splatting initialization
  • Zero-setup testing is available via Hugging Face Space for rapid validation
  • Local deployment requires standard Python environment with CUDA GPU
  • COLMAP export enables direct integration with any 3DGS training tool
  • V1.1 model factory allows benchmarking against DA3, Pi3X, and other recent alternatives
  • Commercial use requires the Apache-licensed model variant
  • Processing speed of under 1 second for up to 50 views significantly accelerates iteration compared to traditional SfM

References

  1. MapAnything Project Page - Accessed 2026-01-29
  2. MapAnything arXiv Paper - Submitted September 2025, revised January 2026
  3. MapAnything GitHub Repository - Apache 2.0 License
  4. Depth Anything 3 Project - ByteDance, November 2025
  5. Pi3: Permutation-Equivariant Visual Geometry Learning - arXiv:2507.13347
  6. 3D Gaussian Splatting - Kerbl et al., SIGGRAPH 2023