Algebraic Vision
Taken from a talk given on 11/15/2024 by Jessie Loucks-Tavitas of Sac State.
1: What is Computer Vision?
Specifically, from a mathematician's perspective. Two motivating questions:
- Given cameras (positions, angles) + images (color data), recover the object: Triangulation
- Given objects + images, recover the camera: Resectioning
2: What is a pinhole camera?
This is the idea of a hole in a box projecting an image of the world onto the back wall of the box. Mathematically, the camera is a map from 3D world points to 2D image points.
But notice that this map is not invertible, and if
The other thing is that in 3D space two parallel lines don't intersect, but their images under the camera map can intersect.
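As a rough illustration of both observations, here is a minimal Python sketch. It assumes a common normalization that these notes do not state: the pinhole sits at the origin and the image plane is z = 1, so the camera sends (x, y, z) to (x/z, y/z). The talk may have used a different convention, and the names `pinhole`, `line_a`, and `line_b` are made up for this example.

```python
from fractions import Fraction


def pinhole(point):
    """Project a 3D point (x, y, z) onto the image plane z = 1.

    Assumed convention: pinhole at the origin, image point (x/z, y/z).
    """
    x, y, z = (Fraction(c) for c in point)
    if z == 0:
        # Points at the same depth as the pinhole have no image under this map.
        raise ValueError("points with z = 0 have no image under this map")
    return (x / z, y / z)


# Not invertible: every point on one ray through the pinhole has the same image,
# so the image alone cannot tell us which of these points we were looking at.
same_ray = [pinhole((1, 2, 4)), pinhole((2, 4, 8)), pinhole((5, 10, 20))]


# Two parallel 3D lines, both with direction (0, 0, 1), offset from each other in x.
def line_a(t):
    return (1, -1, t)


def line_b(t):
    return (-1, -1, t)


# The lines never meet in 3D, but their images approach the same image point
# as t grows, like railroad tracks running toward the horizon.
for t in (1, 10, 100, 1000):
    image_a = tuple(float(c) for c in pinhole(line_a(t)))
    image_b = tuple(float(c) for c in pinhole(line_b(t)))
    print(t, image_a, image_b)
```

In this setup the two images are (1/t, -1/t) and (-1/t, -1/t), so both head toward the image point (0, 0) without either 3D line ever reaching a point that maps there.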
The fix...
3: Perspective & Projective Geometry
The idea is to look at this through perspective geometry, where parallel lines really do appear to converge. This is in contrast to orthographic geometry, where lines that are parallel in 3D remain parallel in the 2D image.
You can think of the left set as
where . - At least one non-zero coordinate
For example, if a line through
So the end point of this line is just this limit point
For
With projective space then we get linearity. The map
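A small Python sketch of what linearity buys us, under assumptions these notes do not spell out: a point of projective space is written as a nonzero coordinate tuple considered up to nonzero scaling, and the pinhole camera is taken to be the 3x4 matrix [I | 0]. The helper names (`to_homogeneous`, `same_projective_point`, `apply_camera`, `P0`) are hypothetical and only serve the illustration.

```python
from fractions import Fraction


def to_homogeneous(p):
    """Append a 1 to an affine point: (x, y, z) becomes (x, y, z, 1)."""
    return tuple(Fraction(c) for c in p) + (Fraction(1),)


def same_projective_point(u, v):
    """True if u and v are nonzero and proportional (all 2x2 minors vanish)."""
    if not any(u) or not any(v):
        raise ValueError("the zero tuple is not a point of projective space")
    n = len(u)
    return all(u[i] * v[j] == u[j] * v[i] for i in range(n) for j in range(i + 1, n))


def apply_camera(P, X):
    """Multiply a 3x4 camera matrix P by a homogeneous 4-tuple X."""
    return tuple(sum(P[i][j] * X[j] for j in range(4)) for i in range(3))


# Assumed standard pinhole camera [I | 0]; it sends (x, y, z, 1) to (x, y, z),
# which is the same projective point as (x/z, y/z, 1) whenever z != 0.
P0 = ((1, 0, 0, 0),
      (0, 1, 0, 0),
      (0, 0, 1, 0))

X = to_homogeneous((1, 2, 4))
x = apply_camera(P0, X)

# Linearity: the image of a sum is the sum of the images.
Y = to_homogeneous((3, -1, 2))
image_of_sum = apply_camera(P0, tuple(a + b for a, b in zip(X, Y)))
sum_of_images = tuple(a + b for a, b in zip(apply_camera(P0, X), apply_camera(P0, Y)))

# A "point at infinity" has last coordinate 0 and records a direction.
# All lines with direction (0, 0, 1) share it, and the camera sends it to
# an ordinary image point, the vanishing point of those lines.
direction = (0, 0, 1, 0)
vanishing = apply_camera(P0, direction)

print(x, same_projective_point(x, (Fraction(1, 4), Fraction(1, 2), 1)))
print(image_of_sum == sum_of_images, vanishing)
```

The division by z from the affine picture has been absorbed into the "up to scaling" rule, which is why the camera can be a plain matrix here.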
4: Triangulation and Resectioning
Triangulation
Say we have
Here
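To make the triangulation problem concrete, here is a hedged Python sketch of one standard linear approach, not necessarily the formulation used in the talk. It assumes two known 3x4 camera matrices and exact, noise-free image points; each image point contributes two linear equations in the unknown world point, and the stacked system is solved through its normal equations. The cameras `P1`, `P2` and all function names are invented for the example.

```python
from fractions import Fraction


def project(P, X):
    """Image (u, v) of the world point X = (x, y, z) under a 3x4 camera P."""
    Xh = list(X) + [1]
    a, b, c = (sum(Fraction(P[i][j]) * Xh[j] for j in range(4)) for i in range(3))
    return (a / c, b / c)


def solve3(M, rhs):
    """Solve a 3x3 linear system exactly by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular (no pivot can be found).
    """
    A = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(3):
        pivot = next(r for r in range(col, 3) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        lead = A[col][col]
        A[col] = [v / lead for v in A[col]]
        for r in range(3):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return tuple(row[3] for row in A)


def triangulate(cameras, images):
    """Recover (x, y, z) from camera matrices and matching image points."""
    rows, rhs = [], []
    for P, (u, v) in zip(cameras, images):
        for coord, k in ((u, 0), (v, 1)):
            # coord * (P[2] . Xh) = P[k] . Xh, rearranged so the unknowns
            # (x, y, z) are on the left and the constants are on the right.
            rows.append([coord * P[2][j] - P[k][j] for j in range(3)])
            rhs.append(P[k][3] - coord * P[2][3])
    # Normal equations (A^T A) X = A^T b for the stacked system A X = b.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(3)]
    return solve3(AtA, Atb)


# Two assumed cameras: one at the origin, one shifted by 1 along the x-axis.
P1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
P2 = [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0]]

world_point = (1, 2, 4)
images = [project(P1, world_point), project(P2, world_point)]
recovered = triangulate([P1, P2], images)
print([float(c) for c in recovered])
```

With exact data the four equations are consistent, so the least-squares solution is the true point. With noisy images the same code returns a least-squares estimate, which is only one of several reasonable notions of "best" point.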
Resectioning
This is the meat and potatoes here. A hypercamera configuration is a tuple of world points
5: Duality
As with duality in linear algebra (see Chapter 3 (cont.) - Products and Quotients of Vector Spaces, 3.F Duality), many structures have a notion of duality, graphs for example.
This gets into Carlsson-Weinshall Duality.