attached to a zip folder
Multiple View Geometry in Computer Vision, Second Edition http://www.cambridge.org/9780521540513 This page intentionally left blank Multiple View Geometry in Computer Vision Second Edition Richard Hartley Australian National University, Canberra, Australia Andrew Zisserman University of Oxford, UK cambridge university press Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo Cambridge University Press The Edinburgh Building, Cambridge cb2 2ru, UK First published in print format isbn-13 978-0-521-54051-3 isbn-13 978-0-511-18618-9 © Cambridge University Press 2000, 2003 2004 Information on this title: www.cambridge.org/9780521540513 This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. isbn-10 0-511-18618-5 isbn-10 0-521-54051-8 Cambridge University Press has no responsibility for the persistence or accuracy of urls for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate. Published in the United States of America by Cambridge University Press, New York www.cambridge.org paperback eBook (EBL) eBook (EBL) paperback http://www.cambridge.org http://www.cambridge.org/9780521540513 Dedication This book is dedicated to Joe Mundy whose vision and constant search for new ideas led us into this field. Contents Foreword page xi Preface xiii 1 Introduction – a Tour of Multiple View Geometry 1 1.1 Introduction – the ubiquitous projective geometry 1 1.2 Camera projections 6 1.3 Reconstruction from more than one view 10 1.4 Three-view geometry 12 1.5 Four view geometry and n-view reconstruction 13 1.6 Transfer 14 1.7 Euclidean reconstruction 16 1.8 Auto-calibration 17 1.9 The reward I : 3D graphical models 18 1.10 The reward II: video augmentation 19 PART 0: The Background: Projective Geometry, Transformations and Esti- mation 23 Outline 24 2 Projective Geometry and Transformations of 2D 25 2.1 Planar geometry 25 2.2 The 2D projective plane 26 2.3 Projective transformations 32 2.4 A hierarchy of transformations 37 2.5 The projective geometry of 1D 44 2.6 Topology of the projective plane 46 2.7 Recovery of affine and metric properties from images 47 2.8 More properties of conics 58 2.9 Fixed points and lines 61 2.10 Closure 62 3 Projective Geometry and Transformations of 3D 65 3.1 Points and projective transformations 65 3.2 Representing and transforming planes, lines and quadrics 66 v vi Contents 3.3 Twisted cubics 75 3.4 The hierarchy of transformations 77 3.5 The plane at infinity 79 3.6 The absolute conic 81 3.7 The absolute dual quadric 83 3.8 Closure 85 4 Estimation – 2D Projective Transformations 87 4.1 The Direct Linear Transformation (DLT) algorithm 88 4.2 Different cost functions 93 4.3 Statistical cost functions and Maximum Likelihood estimation 102 4.4 Transformation invariance and normalization 104 4.5 Iterative minimization methods 110 4.6 Experimental comparison of the algorithms 115 4.7 Robust estimation 116 4.8 Automatic computation of a homography 123 4.9 Closure 127 5 Algorithm Evaluation and Error Analysis 132 5.1 Bounds on performance 132 5.2 Covariance of the estimated transformation 138 5.3 Monte Carlo estimation of covariance 149 5.4 Closure 150 PART I: Camera Geometry and Single View Geometry 151 Outline 152 6 Camera Models 153 6.1 Finite cameras 153 6.2 The projective camera 158 6.3 Cameras at infinity 166 6.4 Other camera models 174 6.5 Closure 176 7 Computation of the Camera Matrix P 178 7.1 Basic equations 178 7.2 Geometric error 180 7.3 Restricted camera estimation 184 7.4 Radial distortion 189 7.5 Closure 193 8 More Single View Geometry 195 8.1 Action of a projective camera on planes, lines, and conics 195 8.2 Images of smooth surfaces 200 8.3 Action of a projective camera on quadrics 201 8.4 The importance of the camera centre 202 8.5 Camera calibration and the image of the absolute conic 208 Contents vii 8.6 Vanishing points and vanishing lines 213 8.7 Affine 3D measurements and reconstruction 220 8.8 Determining camera calibration K from a single view 223 8.9 Single view reconstruction 229 8.10 The calibrating conic 231 8.11 Closure 233 PART II: Two-View Geometry 237 Outline 238 9 Epipolar Geometry and the Fundamental Matrix 239 9.1 Epipolar geometry 239 9.2 The fundamental matrix F 241 9.3 Fundamental matrices arising from special motions 247 9.4 Geometric representation of the fundamental matrix 250 9.5 Retrieving the camera matrices 253 9.6 The essential matrix 257 9.7 Closure 259 10 3D Reconstruction of Cameras and Structure 262 10.1 Outline of reconstruction method 262 10.2 Reconstruction ambiguity 264 10.3 The projective reconstruction theorem 266 10.4 Stratified reconstruction 267 10.5 Direct reconstruction – using ground truth 275 10.6 Closure 276 11 Computation of the Fundamental Matrix F 279 11.1 Basic equations 279 11.2 The normalized 8-point algorithm 281 11.3 The algebraic minimization algorithm 282 11.4 Geometric distance 284 11.5 Experimental evaluation of the algorithms 288 11.6 Automatic computation of F 290 11.7 Special cases of F-computation 293 11.8 Correspondence of other entities 294 11.9 Degeneracies 295 11.10 A geometric interpretation of F-computation 297 11.11 The envelope of epipolar lines 298 11.12 Image rectification 302 11.13 Closure 308 12 Structure Computation 310 12.1 Problem statement 310 12.2 Linear triangulation methods 312 12.3 Geometric error cost function 313 12.4 Sampson approximation (first-order geometric correction) 314 viii Contents 12.5 An optimal solution 315 12.6 Probability distribution of the estimated 3D point 321 12.7 Line reconstruction 321 12.8 Closure 323 13 Scene planes and homographies 325 13.1 Homographies given the plane and vice versa 326 13.2 Plane induced homographies given F and image correspondences 329 13.3 Computing F given the homography induced by a plane 334 13.4 The infinite homography H∞ 338 13.5 Closure 340 14 Affine Epipolar Geometry 344 14.1 Affine epipolar geometry 344 14.2 The affine fundamental matrix 345 14.3 Estimating FA from image point correspondences 347 14.4 Triangulation 353 14.5 Affine reconstruction 353 14.6 Necker reversal and the bas-relief ambiguity 355 14.7 Computing the motion 357 14.8 Closure 360 PART III: Three-View Geometry 363 Outline 364 15 The Trifocal Tensor 365 15.1 The geometric basis for the trifocal tensor 365 15.2 The trifocal tensor and tensor notation 376 15.3 Transfer 379 15.4 The fundamental matrices for three views 383 15.5 Closure 387 16 Computation of the Trifocal Tensor T 391 16.1 Basic equations 391 16.2 The normalized linear algorithm 393 16.3 The algebraic minimization algorithm 395 16.4 Geometric distance 396 16.5 Experimental evaluation of the algorithms 399 16.6 Automatic computation of T 400 16.7 Special cases of T -computation 404 16.8 Closure 406 PART IV: N-View Geometry 409 Outline 410 17 N -Linearities and Multiple View Tensors 411 17.1 Bilinear relations 411 17.2 Trilinear relations 414 Contents ix 17.3 Quadrilinear relations 418 17.4 Intersections of four planes 421 17.5 Counting arguments 422 17.6 Number of independent equations 428 17.7 Choosing equations 431 17.8 Closure 432 18 N -View Computational Methods 434 18.1 Projective reconstruction – bundle adjustment 434 18.2 Affine reconstruction – the factorization algorithm 436 18.3 Non-rigid factorization 440 18.4 Projective factorization 444 18.5 Projective reconstruction using planes 447 18.6 Reconstruction from sequences 452 18.7 Closure 456 19 Auto-Calibration 458 19.1 Introduction 458 19.2 Algebraic framework and problem statement 459 19.3 Calibration using the absolute dual quadric 462 19.4 The Kruppa equations 469 19.5 A stratified solution 473 19.6 Calibration from rotating cameras 481 19.7 Auto-calibration from planes 485 19.8 Planar motion 486 19.9 Single axis rotation – turntable motion 490 19.10 Auto-calibration of a stereo rig 493 19.11 Closure 497 20 Duality 502 20.1 Carlsson–Weinshall duality 502 20.2 Reduced reconstruction 508 20.3 Closure 513 21 Cheirality 515 21.1 Quasi-affine transformations 515 21.2 Front and back of a camera 518 21.3 Three-dimensional point sets 519 21.4 Obtaining a quasi-affine reconstruction 520 21.5 Effect of transformations on cheirality 521 21.6 Orientation 523 21.7 The cheiral inequalities 525 21.8 Which points are visible in a third view 528 21.9 Which points are in front of which 530 21.10 Closure 531 x Contents 22 Degenerate Configurations 533 22.1 Camera resectioning 533 22.2 Degeneracies in two views 539 22.3 Carlsson–Weinshall duality 546 22.4 Three-view critical configurations 553 22.5 Closure 558 PART V : Appendices 561 Appendix 1 Tensor Notation 562 Appendix 2 Gaussian (Normal) and χ2 Distributions 565 Appendix 3 Parameter Estimation 568 Appendix 4 Matrix Properties and Decompositions 578 Appendix 5 Least-squares Minimization 588 Appendix 6 Iterative Estimation Methods 597 Appendix 7 Some Special Plane Projective Transformations 628 Bibliography 634 Index 646 Foreword By Olivier Faugeras Making a computer see was something that leading experts in the field of Artificial Intelligence thought to be at the level of difficulty of a summer student’s project back in the sixties. Forty years later the task is still unsolved and seems formidable. A whole field, called Computer Vision, has emerged as a discipline in itself with strong connections to mathematics and computer science and looser connections to physics, the psychology of perception and the neuro sciences. One of the likely reasons for this half-failure is the fact that researchers had over- looked the fact, perhaps because of this plague called naive introspection, that percep- tion in general and visual perception in particular are far more complex in animals and humans than was initially thought. There is of course no reason why we should pattern Computer Vision algorithms after biological ones, but the fact of the matter is that (i) the way biological vision works is still largely unknown and therefore hard to emulate on computers, and (ii) attempts to ignore biological vision and reinvent a sort of silicon-based vision have not been so successful as initially expected. Despite these negative remarks, Computer Vision researchers have obtained some outstanding successes, both practical and theoretical. On the side of practice, and to single out one example, the possibility of guiding vehi- cles such as cars and trucks on regular roads or on rough terrain using computer vision technology was demonstrated many years ago in Europe, the USA and Japan. This requires capabilities for real-time three-dimensional dynamic scene analysis which are quite elaborate. Today, car manufacturers are slowly incorporating some of these func- tions in their products. On the theoretical side some remarkable progress has been achieved in the area of what one could call geometric Computer Vision. This includes the description of the way the appearance of objects changes when viewed from different viewpoints as a function of the objects’ shape and the cameras parameters. This endeavour would not have been achieved without the use of fairly sophisticated mathematical techniques en- compassing many areas of geometry, ancient and novel. This book deals in particular with the intricate and beautiful geometric relations that exist between the images of ob- jects in the world. These relations are