Reconstruct a 3D scene and simultaneously obtain the camera poses using Structure from Motion.
Phase 1: Reconstructed a 3D scene and simultaneously obtained the camera poses of a monocular camera from a set of images with different viewpoints using feature point correspondences (classical CV).
Phase 2: Used Neural Radiance Fields (NeRF) to synthesize novel views of complex scenes by optimizing a continuous volumetric scene function from a sparse set of input views (deep learning).
There are a few steps that collectively form SfM:
- Feature Matching and Outlier Rejection using RANSAC
- Estimating the Fundamental Matrix
- Estimating the Essential Matrix from the Fundamental Matrix
- Estimating the Camera Pose from the Essential Matrix
- Checking the Cheirality Condition using Triangulation
- Perspective-n-Point
- Bundle Adjustment
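The fundamental-matrix step can be illustrated with the normalized eight-point algorithm. The sketch below is not the repository's implementation (function names are illustrative); it is a minimal NumPy version of the linear estimate that a RANSAC loop would run on each sampled set of correspondences:

```python
import numpy as np

def normalize_points(pts):
    """Shift points to zero mean and scale so the mean distance is sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0.0, -scale * mean[0]],
                  [0.0, scale, -scale * mean[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T

def estimate_fundamental(pts1, pts2):
    """Normalized eight-point algorithm: solve x2^T F x1 = 0 for each match."""
    x1, T1 = normalize_points(pts1)
    x2, T2 = normalize_points(pts2)
    # Each correspondence contributes one row of the linear system A f = 0,
    # with f the row-major entries of F.
    A = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0], x1[:, 1], np.ones(len(x1)),
    ])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt
    # Undo the normalization and fix the scale (F is defined up to scale).
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)
```

The essential matrix then follows as E = K^T F K (with the singular values of E cleaned up to (1, 1, 0)), and the camera pose is recovered by decomposing E and picking, via the cheirality check, the one of the four candidate poses that triangulates the points in front of both cameras.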
The data given to us is a set of 5 images of Unity Hall at WPI, captured with a Samsung S22 Ultra’s primary camera at f/1.8 aperture, ISO 50, and 1/500 sec shutter speed.
The data folder contains 4 matching files named matching*.txt, where * is a number from 1 to 4. For example, matching3.txt contains the matches between the third image and the images that come after it, i.e., I3↔I4 and I3↔I5. This is why image 5 does not have a matching file.
- Run
python3 Wrapper.py
- All intermediate output images are saved in Phase1 -> Data -> IntermideateOutputs
- You can change the save path and the data path via the argument parser
Implementing the original NeRF method from this paper.
Download the lego data for NeRF from the original author’s link here
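NeRF's positional encoding lifts each input coordinate into sin/cos features at increasing frequencies before feeding it to the MLP, which lets the network represent high-frequency detail. A rough NumPy illustration of that mapping (the training script presumably implements it in its own framework's tensors; the function name here is ours):

```python
import numpy as np

def positional_encoding(p, num_freqs=10):
    """Map each coordinate to sin/cos features at num_freqs octaves:
    gamma(p) = (sin(2^0 pi p), cos(2^0 pi p), ..., sin(2^(L-1) pi p), cos(2^(L-1) pi p)).
    The NeRF paper uses L = 10 for positions and L = 4 for view directions."""
    p = np.asarray(p, dtype=np.float64)          # shape (..., d)
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi  # 2^k * pi for k = 0..L-1
    angles = p[..., None] * freqs                # shape (..., d, L)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(*p.shape[:-1], -1)        # shape (..., 2*L*d)
```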
- Change the directory to Phase 2.
- To train the NeRF model on GPU:
python3 NeRF_train.py
- The loss plot will be saved in the Results folder.
- Change the directory to Phase 2.
- To test the model:
python3 NeRF_test.py
- The output video will be saved in the same directory.
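At test time, NeRF renders each pixel by compositing the predicted densities and colors along the ray with the paper's quadrature rule: each sample contributes its color weighted by its opacity and by the transmittance accumulated in front of it. A minimal NumPy sketch of this compositing for a single ray (function name is ours, not the script's):

```python
import numpy as np

def composite_ray(sigmas, colors, deltas):
    """Quadrature of the volume rendering integral along one ray:
    C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i,
    where T_i = exp(-sum_{j<i} sigma_j * delta_j) is the transmittance."""
    alphas = 1.0 - np.exp(-sigmas * deltas)   # opacity of each sampled segment
    trans = np.cumprod(1.0 - alphas)          # transmittance after each segment
    T = np.concatenate([[1.0], trans[:-1]])   # transmittance reaching each segment
    weights = T * alphas
    color = (weights[:, None] * colors).sum(axis=0)
    return color, weights
```

The weights sum to at most 1; any remainder is the transmittance that escapes past the last sample (the background).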