CS 512

CS 512 - Computer Vision

Cameras and computers are everywhere. Can we make computers see? Can we make computers respond to our behavior? Automated sensing by computers is a key element in ubiquitous computing. Much as vision is central to humans, cameras are the most prominent sensors a computer may have. By using cameras, computers can interact with people, inspect and monitor various processes, and identify people and behavior. Computer vision is a rapidly evolving field of computer science, facilitated by recent developments in computing hardware. Advanced computing hardware is now a standard component in off-the-shelf PCs, and digital cameras are cheap and abundant. The consequence is a surge in graphics and imaging applications.

Computer vision aims at analyzing the real world based on images captured by one or more cameras, or on similar sensory data, for various purposes. Such analysis is common practice in industrial, medical, military, and scientific applications such as product inspection and quality control, surveillance and monitoring, automated recognition of objects, robot guidance, human motion tracking, document/handwriting analysis, and augmented and virtualized reality.

Topics covered by CS 512 this semester include: overview of computer vision and related areas, extraction of features from images, probabilistic modeling in images, camera calibration, epipolar geometry estimation, statistical estimation, model reconstruction from images, statistical filtering and tracking in video sequences, motion estimation in video sequences, optical flow, and object recognition. The course serves as a good starting point for graduate students interested in getting acquainted with the area of computer vision, as well as for students interested in pursuing visual computing in greater detail. The course assumes basic knowledge of calculus and linear algebra and some programming experience.
Programming assignments in the course will use the OpenCV library to support capturing and manipulating images and videos. For further details, please refer to the course website or contact the course instructor.

Instructor

Gady Agam
SB 237e, x7-583

Office hours:
Tuesday, Thursday
6:30-7:30pm

TA

Ying Chen, Lawrence Amadi
SB-110, x7-5705
Office hours:

Mon 10-11am (Chen)
Tue 11-12pm (Chen)
Wed 3:30-4:30pm (Amadi)
Thu 3:30-4:30pm (Amadi)

Sections

CS-512-01: (SB-104)

CS-512-02: (Internet)

CS-512-03: (Internet)

Class hours:
Tuesday, Thursday
5:00-6:15pm

Syllabus

Course outline

What to expect from this course

Computer vision can be covered at different levels. The focus of this course is on understanding the algorithms and techniques used in computer vision. Students are expected to write computer programs implementing the different techniques taught in the course. The course requires a mathematical background and some programming experience. It is not intended to teach the use of any specific application software.

Objectives

Introduce the fundamental problems of computer vision.
Provide an understanding of the techniques, mathematical concepts, and algorithms used in computer vision to facilitate further study in this area.
Provide pointers into the literature and carry out a project based on a literature search and one or more research papers.
Practice software implementation of different concepts and techniques covered in the course.
Utilize programming and scientific tools for relevant software implementation.

Overview

Introduction: overview of computer vision, related areas, and applications; overview of software tools; overview of course objectives; introduction to OpenCV.
Image formation and representation: imaging geometry, radiometry, digitization, cameras and projections, rigid and affine transformations.
Filtering: convolution, smoothing, differencing, and scale space.
Feature detection: edge detection, corner detection, line and curve detection, active contours, SIFT and HOG descriptors, shape context descriptors.
Model fitting: Hough transform, line fitting, ellipse and conic sections fitting, algebraic and Euclidean distance measures.
Camera calibration: camera models; intrinsic and extrinsic parameters; radial lens distortion; direct parameter calibration; camera parameters from projection matrices; orthographic, weak perspective, affine, and perspective camera models.
Epipolar geometry: introduction to projective geometry; epipolar constraints; the essential and fundamental matrices; estimation of the essential/fundamental matrix.
Model reconstruction: reconstruction by triangulation; Euclidean reconstruction; affine and projective reconstruction.
Motion analysis: the motion field of rigid objects; motion parallax; optical flow, the image brightness constancy equation, affine flow; differential techniques; feature-based techniques; regularization and robust estimation; motion segmentation through EM.
Motion tracking: statistical filtering; iterated estimation; observability and linear systems; the Kalman filter; the extended Kalman filter.
Object recognition and shape representation: alignment, appearance-based methods, invariants, image eigenspaces, data-based techniques.
Final presentation: students present selected topics and develop a software implementation of related techniques based on a review of the relevant literature. The work should be summarized in a concluding report, which should include simulation results. A list of possible topics will be advertised prior to the project selection due date.

Grading

component	description	weight
participation	up to 4 unjustified missed classes and all quizzes ⇒ full credit	5%
assignments	5 TBD	25%
project	presentation (5%) project (15%)	20%
midterm exam	open notes (1 double sided 8.5x11" page)	10%
final exam	open notes (2 double sided 8.5x11" pages)	40%
total		100%
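
As a rough illustration of how the weights in the table combine, the sketch below computes a course total from per-component scores. The function name `course_total`, the dictionary keys, and the 0-100 scale for each component are assumptions made for this example; only the weights come from the table. The assignment 0 penalty and the late-day rules are not modeled here.

```python
# Weights from the grading table, expressed as fractions of the course total.
WEIGHTS = {
    "participation": 0.05,
    "assignments": 0.25,
    "project": 0.20,
    "midterm": 0.10,
    "final": 0.40,
}


def course_total(scores):
    """Return the weighted total for component scores given on a 0-100 scale (assumed)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"participation": 100, "assignments": 90, "project": 85, "midterm": 70, "final": 80}
print(round(course_total(example), 2))  # prints 83.5
```

With these made-up scores the total is 5 + 22.5 + 17 + 7 + 32 = 83.5.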

There is an additional mandatory assignment (assignment 0) which does not carry any credit. There is a penalty of 5% for not submitting this assignment. The best 5 assignment grades will be used for the combined assignment score.
A certain percentage of the students may be invited to discuss their assignments.
Late days: there is a total of 6 "free late days" with no grade penalty across all the assignments, to cover various reasons such as not feeling well, being busy, etc. Up to 2 free late days may be applied to each assignment. Being late beyond what is allowed by the free late days will result in a grade reduction of 25% per day. Late days are counted past midnight on the day an assignment is due and include weekends and holidays. The final project cannot be late, and no submissions will be accepted beyond the last day of class.
Each member of this course bears responsibility for maintaining the highest standards of academic integrity. All breaches of academic integrity must be reported immediately. Copying of programs from any source (e.g. other students or the web) is considered to be a serious breach of academic integrity.
The usual grade scale applies: A > 90, B > 80, C > 70, etc.
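
The late-day rules above can be sketched as simple arithmetic. This is one possible reading, not the official procedure: it assumes free late days are applied automatically in submission order, and that the 25% reduction is taken in percentage points of the assignment grade and capped at 100. The function name `apply_late_policy` is hypothetical.

```python
FREE_LATE_DAYS_TOTAL = 6     # free late days across all assignments
FREE_LATE_DAYS_PER_ITEM = 2  # at most this many free days per assignment
PENALTY_PER_DAY = 25         # grade reduction per penalized late day


def apply_late_policy(days_late_per_assignment):
    """Return (penalties, free_days_left) for days-late counts listed in submission order."""
    free_left = FREE_LATE_DAYS_TOTAL
    penalties = []
    for days_late in days_late_per_assignment:
        # Use as many free days as the per-assignment cap and the remaining pool allow.
        free_used = min(days_late, FREE_LATE_DAYS_PER_ITEM, free_left)
        free_left -= free_used
        penalized_days = days_late - free_used
        # Assumption: the reduction is capped so a grade cannot drop below zero.
        penalties.append(min(100, PENALTY_PER_DAY * penalized_days))
    return penalties, free_left


# Hypothetical example: six assignments submitted 1, 3, 0, 2, 2, and 2 days late.
print(apply_late_policy([1, 3, 0, 2, 2, 2]))  # prints ([0, 25, 0, 0, 25, 50], 0)
```

In the example, the second assignment is 3 days late, so 2 days are free and 1 day is penalized. By the fifth assignment only one free day remains, and the sixth has none left.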

Books

Computer Vision: Algorithms and Applications, R. Szeliski, Springer, 2011.
Computer Vision: A Modern Approach, D. Forsyth and J. Ponce, Prentice Hall, 2nd ed., 2011.
Introductory Techniques for 3-D Computer Vision, E. Trucco and A. Verri, Prentice Hall, 1998.

Tentative schedule


class	date	topic	assignment

1	08/21	Introduction	AS0
2	08/23
3	08/28	Geometric image formation
4	08/30
5	09/04	Filtering	AS1
6	09/06
7	09/11
8	09/13	Feature detection
9	09/18
10	09/20		AS2
11	09/25	Object recognition
12	09/27
13	10/02	Midterm exam 5-6pm (SB104)
14	10/04
15	10/09	Camera calibration	PROJ
16	10/11
17	10/16
18	10/18	Multiple view geometry	AS3
19	10/23
20	10/25
21	10/30
22	11/01	Motion and tracking	AS4
23	11/06
24	11/08
25	11/13
26	11/15	Project presentations	AS5
27	11/20
28	11/22	No class (Thanksgiving)
29	11/27
30	11/29
31	12/04	Final exam 5-7pm (SB104)

Books

Szeliski's book (IIT library copy, author's draft)
The Forsyth and Ponce book (2nd ed.) website

Software

OpenCV: download, manuals, source, docs
Java to C++: tutorial 1, tutorial 2, tutorial 3
GUI: wxWidgets, FLTK
Numerical computation: Octave, Matlab, SciPy
Graphics: OpenGL, GLUT, VRML viewers
Linux: Fedora, Ubuntu, VirtualBox
Office suite: Open Office or Libre Office
Other: mexopencv

Python

Download: Conda, Canopy, Python.org
IDE: PyCharm, Canopy, notebook, Eric, Spyder, pyDev (eclipse)

Data

Video sequences
Image database
Stereo pairs
Face recognition
Face detection
CMU data page

C++ IDE

Code::Blocks
Dev-C++
Eclipse
NetBeans
Visual Studio express

LaTeX

LaTeX home page, CTAN
Editors: Lyx, Texmaker, Kile
Slides: Beamer, Prosper
Posters: Example 1, Example 2
Writing: The science of scientific writing
Grammar checker: Grammarly, lacheck, chktex

Papers

CiteSeer
IEEE (full access from within the IIT network)
ACM (full access from within the IIT network)
Computer vision bibliography

General

Image processing demos (HIPR2)
CV Online
Optical illusions
Computer vision companies
The computer vision homepage

FAQ

Q1: C++ IDE

Q1.1: Instructions for downloading MS Visual Studio for Windows

Download MS Visual Studio from here

Q1.2: Instructions for installing DEV-C++ for Windows

Install the latest Dev-C++ package

Q1.3: Instructions for installing IDE for Linux

Download one of the following: Code::Blocks, Eclipse, NetBeans

Q2: OpenCV

Q2.1: Installing OpenCV in Windows

Download OpenCV from sourceforge
Execute the file you downloaded and follow the instructions
For additional help read the installation guide

Q2.2: Using OpenCV in Windows

Read the OpenCV notes on creating a new OpenCV project
Additional instructions for VS.NET (old)

Q2.3: Specifying the location of DLL files if the compiler cannot find them in Windows

Option 1:

Add to the PATH environment variable the path to C:\Program Files\OpenCV\bin

Follow the instructions in the following figure

Option 2:

Edit the file C:\AUTOEXEC.BAT and add the following line to it:
set PATH=%PATH%;C:\Program Files\OpenCV\bin;
Reboot your computer

Option 3:

Copy all the DLL files from C:\Program Files\OpenCV\bin to C:\WINDOWS\system32 and/or C:\WINDOWS\system

Option 4:

Copy all the DLL files from C:\Program Files\OpenCV\bin into the directory where your program resides

Q2.4: Installing OpenCV in Linux

Download the source code from sourceforge
Read the installation guide

Q2.5: Using OpenCV in Linux

Compile the program using

g++ prog.cpp -o prog `pkg-config --cflags --libs opencv`

Q3: wxWidgets

Q3.1: Instructions on how to use wxWidgets in Visual C++

Install wxpack (wxWidgets v2.8.12)
Create an empty Win32 console project and change the following properties of the project:

Configuration Properties - C/C++ - General - Additional Include Directories

    %InstalledPath%\wxWidgets-2.8.12\include
    %InstalledPath%\wxWidgets-2.8.12\lib\vc_lib\mswd

Configuration Properties - Linker - General - Additional Library Directories

    %InstalledPath%\wxWidgets-2.8.12\lib\vc_lib

Configuration Properties - Linker - Input - Additional Dependencies

    comctl32.lib rpcrt4.lib winmm.lib advapi32.lib wsock32.lib wxbase28d.lib wxmsw28d_core.lib wxmsw28d_gl.lib wxjpegd.lib wxpngd.lib wxregexd.lib wxtiffd.lib wxzlibd.lib wxexpatd.lib

Code Generation - Runtime Library

    Multi-threaded Debug (/MTd)

Linker - System

    Windows (/SUBSYSTEM:WINDOWS)

Build Events - Post Build Events (this starts a command line window together with your wxWidgets frame)

    editbin $(TargetPath) /SUBSYSTEM:CONSOLE

Q3.2: Instructions on how to use wxWidgets in Linux

    Compile the source or install the appropriate package.
    Compile with the following flags: `wx-config --cxxflags`
    Link with the following libraries: `wx-config --libs` -lwx_gtk2_gl-2.8

Q3.3: Specific problems with wxWidgets

Conversion between wxString and a normal string: http://wiki.wxwidgets.org/WxString
Global Keyboard events in wxWidgets: http://wiki.wxwidgets.org/Catching_key_events_globally

Topic	Reading
Introduction to computer vision	Ch. 1
Image formation	Ch. 2
Filtering	Ch. 3
Feature detection	Ch. 4
Segmentation	Ch. 5
Camera calibration	Ch. 6
Epipolar geometry	Ch. 11
Model reconstruction	Ch. 7
Motion analysis	Ch. 8
Recognition	Ch. 14

CS 512
Computer Vision (Fall 2018)