MapAnything V1.1: Metric 3D Reconstruction for Gaussian Splatting Workflows

Research Date: 2026-01-29
Source URL: https://x.com/nik__v__/status/2016556649092165792

Summary

MapAnything V1.1, released January 18, 2026, is a unified transformer-based feed-forward model for metric 3D reconstruction developed by Meta Reality Labs and Carnegie Mellon University. The system directly regresses 3D scene geometry and camera parameters from images, producing output compatible with Gaussian splatting pipelines through COLMAP format export. This release introduces a model factory supporting multiple reconstruction backends, improved checkpoints, and practical tooling for COLMAP integration.

For users seeking to test MapAnything against their own data with minimal friction, two primary paths exist: a zero-setup Hugging Face demo requiring no local installation, and a local deployment with conda environment setup. Both paths can produce COLMAP-compatible output suitable for downstream Gaussian splatting training.

Relationship to Gaussian Splatting Workflows

Traditional 3DGS Pipeline

Standard 3D Gaussian Splatting (3DGS) workflows follow this sequence:

  1. COLMAP performs Structure-from-Motion (SfM) to estimate camera poses and a sparse point cloud. This step can take minutes to hours depending on image count and requires careful parameter tuning for challenging scenes.
  2. The 3DGS trainer initializes Gaussians from the sparse points and optimizes them against the posed images.

MapAnything as COLMAP Replacement

MapAnything operates as a feed-forward alternative to iterative SfM: a single network pass regresses camera parameters and scene geometry directly, with no feature-matching or bundle-adjustment stage.

Key differences from traditional COLMAP:

| Aspect | COLMAP | MapAnything |
| --- | --- | --- |
| Processing Model | Iterative optimization | Single feed-forward pass |
| Speed | Minutes to hours | Under 1 second for up to 50 views |
| Metric Scale | Up-to-scale reconstruction | True metric scale via learned prior |
| Failure Modes | Feature matching failures, drift | Out-of-distribution inputs |
| GPU Requirements | Primarily CPU-bound | Requires CUDA GPU |

Output Format Compatibility

MapAnything exports standard COLMAP format:

output_dir/
├── images/           # Processed images at model resolution
│   ├── img1.jpg
│   └── img2.jpg
└── sparse/
    ├── cameras.bin   # Camera intrinsics
    ├── images.bin    # Camera extrinsics
    ├── points3D.bin  # Sparse point cloud
    └── points.ply    # Point cloud in PLY format

This output structure is directly compatible with:

  • Official 3DGS implementation
  • gsplat library
  • Nerfstudio
  • Instant-NGP
  • Any tool accepting COLMAP sparse reconstruction
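
Because the binary files follow COLMAP's published layout, an export can be spot-checked without COLMAP installed. As a minimal sketch using only the standard library (the model-id table below covers only the most common camera models; extend it as needed), a `cameras.bin` reader might look like:

```python
import struct

# Parameter counts for common COLMAP camera models (model_id -> count),
# per the documented COLMAP binary format.
CAMERA_MODEL_PARAMS = {
    0: 3,  # SIMPLE_PINHOLE: f, cx, cy
    1: 4,  # PINHOLE: fx, fy, cx, cy
    2: 4,  # SIMPLE_RADIAL: f, cx, cy, k
    3: 5,  # RADIAL: f, cx, cy, k1, k2
    4: 8,  # OPENCV: fx, fy, cx, cy, k1, k2, p1, p2
}

def read_cameras_bin(path):
    """Parse a COLMAP cameras.bin file into {camera_id: camera dict}."""
    cameras = {}
    with open(path, "rb") as f:
        (num_cameras,) = struct.unpack("<Q", f.read(8))
        for _ in range(num_cameras):
            cam_id, model_id = struct.unpack("<ii", f.read(8))
            width, height = struct.unpack("<QQ", f.read(16))
            n = CAMERA_MODEL_PARAMS[model_id]
            params = struct.unpack(f"<{n}d", f.read(8 * n))
            cameras[cam_id] = {
                "model_id": model_id,
                "width": width,
                "height": height,
                "params": params,
            }
    return cameras
```

Printing the parsed intrinsics is a quick way to confirm that focal lengths and image sizes look plausible before handing the export to a trainer.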

Practical Usage Options

Option 1: Zero-Setup Testing via Hugging Face

The Hugging Face Space provides immediate access without local installation:

URL: https://huggingface.co/spaces/facebook/map-anything

Workflow:

  1. Upload video or images via drag-and-drop
  2. Adjust video sample interval if using video input
  3. Click “Reconstruct”
  4. View 3D reconstruction with depth, normal, and measurement visualizations
  5. Download GLB file for external use

Limitations:

  • No COLMAP format export from the web demo
  • Limited to image-only inference mode
  • Queue times during high demand
  • Maximum input constraints imposed by Hugging Face Zero GPU allocation

Best for: Quick validation of reconstruction quality on custom data before committing to local setup.

Option 2: Local Installation for Full Pipeline

Local deployment provides COLMAP export and full feature access.

Minimum Requirements:

  • CUDA-capable GPU (tested on A100, RTX 4090)
  • Python 3.12
  • ~10GB disk space for model weights

Installation:

git clone https://github.com/facebookresearch/map-anything.git
cd map-anything

conda create -n mapanything python=3.12 -y
conda activate mapanything

pip install -e .

Basic Inference with COLMAP Export:

import torch
from mapanything.models import MapAnything

device = "cuda" if torch.cuda.is_available() else "cpu"
model = MapAnything.from_pretrained("facebook/map-anything").to(device)

# Load images
images = [...]  # List of PIL images or paths

# Run inference
pred = model.infer(images)

# Access outputs
pts3d = pred["pts3d"]           # 3D points in world coordinates
cam_trans = pred["cam_trans"]   # Camera translations
cam_quats = pred["cam_quats"]   # Camera rotations as quaternions
depth = pred["depth"]           # Per-image depth maps
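
Downstream tools usually want camera poses as 4x4 matrices rather than quaternion/translation pairs. A pure-Python conversion sketch is below; it assumes w-x-y-z quaternion ordering and that the quaternion/translation pair encodes a camera-to-world transform, so verify the repository's actual conventions before relying on it:

```python
import math

def quat_to_pose(quat, trans):
    """Build a 4x4 camera-to-world matrix from a unit quaternion
    (w, x, y, z) and a translation vector (tx, ty, tz)."""
    w, x, y, z = quat
    # Normalize defensively so near-unit inputs still yield a rotation.
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    # Standard quaternion-to-rotation-matrix expansion.
    R = [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
    return [
        R[0] + [trans[0]],
        R[1] + [trans[1]],
        R[2] + [trans[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]
```

For batched tensors from `pred`, the same math applies per camera after moving the arrays to CPU.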

COLMAP Export from Existing Data:

python scripts/demo_inference_on_colmap_outputs.py \
    --colmap_path /path/to/your/images \
    --save_colmap \
    --viz

Additional flags:

  • --apache: Use Apache 2.0 licensed model for commercial applications
  • --stride N: Process every Nth image
  • --ignore_pose_inputs: Use images only, ignore existing poses

Option 3: Local Gradio Demo

For interactive local testing with GUI:

pip install -e ".[demo]"
python scripts/demo_gradio.py

Opens a browser interface at http://127.0.0.1:7860 with the same functionality as the Hugging Face demo, plus a confidence slider and local file access.

Option 4: Rerun Visualization

For 3D visualization during development:

pip install -e ".[rerun]"
rerun --serve --port 2004 --web-viewer-port 2006

# In another terminal
python scripts/demo_rerun.py --input_dir /path/to/images

Comparison with Alternative Models

V1.1 introduces a model factory supporting multiple 3D reconstruction backends through a unified API:

| Model | Full Name | Parameters | Notes |
| --- | --- | --- | --- |
| mapanything | MapAnything | 518M | Default, best overall performance |
| da3 | Depth Anything 3 | 504M | ByteDance, 35.7% improved pose accuracy |
| pi3x | Pi3-X | 518M | Permutation-equivariant, no reference-view bias |
| vggt | VGGT 1B | 518M | Visual Geometry Grounded Transformer |
| dust3r | DUSt3R + Global BA | 512M | Dense unconstrained stereo |
| mast3r | MASt3R + SGA | 512M | Matching-aware stereo |
| moge | MoGe | 518M | Monocular geometry estimation |

Using Alternative Models:

pip install -e ".[pi3x]"  # Install optional dependencies

python -c "
from mapanything.model_factory import init_model_from_config
model = init_model_from_config({'model_name': 'pi3x'})
"

Performance Comparison

Based on profiling data from the repository, MapAnything achieves the best speed and memory profile across view counts from 2 to 1000:

| Model | Inference Speed | GPU Memory | Notes |
| --- | --- | --- | --- |
| MapAnything | Fastest | Lowest | Best overall efficiency |
| VGGT | Fast | Moderate | Good balance |
| Pi3-X | Fast | Moderate | Permutation-equivariant |
| DA3 | Moderate | Moderate | Strong pose accuracy |
| DUSt3R | Slowest | Highest | Dense stereo reconstruction |

The memory-efficient mode (memory_efficient_inference=True) enables processing up to 2,000 views on a GPU with 140 GB of VRAM.

V1.1 Release Features

Released January 18, 2026, V1.1 introduces:

Model Factory:

  • Unified API for running MapAnything, VGGT, DUSt3R, MASt3R, MUSt3R, Pi3-X, Pow3R, MoGe, AnyCalib, and Depth Anything 3
  • Consistent output format across all wrappers
  • Optional dependencies via pip extras

Improved Checkpoints:

  • V1.1 checkpoints on Hugging Face Hub
  • V1 preserved as facebook/map-anything-v1 for backward compatibility

Profiling:

  • GPU memory and inference speed benchmarking scripts
  • Comparison profiling against all supported external models
  • Visualization outputs for memory and speed plots

COLMAP Integration:

  • Demo script for inference on existing COLMAP outputs
  • Bidirectional workflow: ingest COLMAP data or export to COLMAP format
  • Voxelization tooling for point cloud processing
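
The repository ships its own voxelization tooling; the underlying idea is simple enough to sketch. The following is a hypothetical pure-Python voxel downsampler, not the repo's implementation: it snaps each point to a cubic grid cell and returns one centroid per occupied cell.

```python
import math
from collections import defaultdict

def voxel_downsample(points, voxel_size):
    """Average all points that fall into the same cubic voxel.

    points: iterable of (x, y, z) tuples; voxel_size: voxel edge length.
    Returns one centroid per occupied voxel.
    """
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel index along each axis identifies the cell.
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    # Centroid of the points collected in each occupied voxel.
    return [
        tuple(sum(c) / len(pts) for c in zip(*pts))
        for pts in buckets.values()
    ]
```

Thinning a dense point cloud this way before 3DGS initialization trades point density for memory; production tooling typically adds attribute averaging (color, confidence) on top of the same bucketing.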

WAI Format Benchmarking:

  • AerialMegaDepth dataset integration
  • ScanNet++V2 rendering and config updates

Workflow: MapAnything to Gaussian Splatting

Complete pipeline for using MapAnything output with 3DGS:

# 1. Run MapAnything on your images
python scripts/demo_inference_on_colmap_outputs.py \
    --colmap_path /path/to/your/images \
    --save_colmap \
    --output_dir /path/to/output

# 2. Train Gaussian Splatting using gsplat
pip install gsplat

python -m gsplat.examples.simple_trainer \
    --data_dir /path/to/output \
    --result_dir /path/to/splat_output

# Or use official 3DGS
git clone https://github.com/graphdeco-inria/gaussian-splatting
cd gaussian-splatting
python train.py -s /path/to/output
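
Before launching training, it can save a failed run to confirm the export matches the directory layout shown earlier. A small sanity-check sketch (`check_colmap_export` is a hypothetical helper; adjust the expected paths if your trainer wants a different layout):

```python
from pathlib import Path

# Files the COLMAP-format export is expected to contain
# (sparse/points.ply is optional for most trainers).
REQUIRED = [
    "sparse/cameras.bin",
    "sparse/images.bin",
    "sparse/points3D.bin",
]

def check_colmap_export(output_dir):
    """Return a list of missing pieces in a COLMAP-style export."""
    root = Path(output_dir)
    missing = [rel for rel in REQUIRED if not (root / rel).is_file()]
    images = root / "images"
    if not images.is_dir() or not any(images.iterdir()):
        missing.append("images/ (no image files found)")
    return missing
```

An empty return value means the layout is plausible; it does not validate the binary contents themselves.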

Licensing Considerations

Two model variants are available:

| Variant | License | Use Case |
| --- | --- | --- |
| facebook/map-anything | CC-BY-NC 4.0 | Research and academic |
| facebook/map-anything-apache | Apache 2.0 | Commercial applications |

For commercial Gaussian splatting workflows, use the Apache-licensed variant:

model = MapAnything.from_pretrained("facebook/map-anything-apache")

Key Findings

  • MapAnything provides a practical feed-forward alternative to COLMAP for Gaussian splatting initialization
  • Zero-setup testing is available via Hugging Face Space for rapid validation
  • Local deployment requires standard Python environment with CUDA GPU
  • COLMAP export enables direct integration with any 3DGS training tool
  • V1.1 model factory allows benchmarking against DA3, Pi3X, and other recent alternatives
  • Commercial use requires the Apache-licensed model variant
  • Processing speed of under 1 second for up to 50 views significantly accelerates iteration compared to traditional SfM

References

  1. MapAnything Project Page - Accessed 2026-01-29
  2. MapAnything arXiv Paper - Submitted September 2025, revised January 2026
  3. MapAnything GitHub Repository - Apache 2.0 License
  4. Depth Anything 3 Project - ByteDance, November 2025
  5. Pi3: Permutation-Equivariant Visual Geometry Learning - arXiv:2507.13347
  6. 3D Gaussian Splatting - Kerbl et al., SIGGRAPH 2023