3D Visual Reconstruction from 2D and 3D Images

    Abstract: Half of computer vision is about 3D reconstruction, that is, using many images of the same scene to build a 3D model of it (the other half is recognition). Two recent developments involving Microsoft Research have brought 3D reconstruction into the mainstream. The project on Building Rome in a Day produced a system that reconstructs the 3D shape of famous buildings from tourist pictures harvested from the web. The KinectFusion project takes relatively noisy and narrow-field-of-view depth images from a $200 depth sensor made by Israeli company PrimeSense and stitches them into gorgeous, detailed, and accurate 3D models in real time. We will discuss underlying technologies, achievements, trade-offs, and opportunities. I will also clarify why it is still useful to do reconstruction the hard way, from 2D images, and why one still needs to do reconstruction even with a 3D sensor.

    Important: Please bring questions to the discussion.

    Required Readings:

  • Sameer Agarwal, Yasutaka Furukawa, Noah Snavely, Ian Simon, Brian Curless, Steven M. Seitz, and Richard Szeliski. Building Rome in a Day. Communications of the ACM 54(10):105-112, October 2011.
  • Shahram Izadi, David Kim, Otmar Hilliges, David Molyneaux, Richard Newcombe, Pushmeet Kohli, Jamie Shotton, Steve Hodges, Dustin Freeman, Andrew Davison, and Andrew Fitzgibbon. KinectFusion: Real-time 3D Reconstruction and Interaction Using a Moving Depth Camera. ACM Symposium on User Interface Software and Technology, October 2011.

    Optional Readings:

  • Project page for Building Rome in a Day
  • Project page for KinectFusion
  • Carlo Dal Mutto, Pietro Zanuttigh, and Guido M. Cortelazzo. Microsoft Kinect Range Camera. Chapter 3, pages 33-47 in Time-of-Flight Cameras and Microsoft Kinect, SpringerBriefs in Electrical and Computer Engineering, 2012.
  • Carlo Tomasi. Visual Reconstruction: Technical Perspective. Communications of the ACM 54(10):104, October 2011.