Computer Vision Assignment 1

Interactive Computer Vision Tool with GUI

Submission: PDF documentation (Primary resource) + .py script + demo video link

Objective

Design and implement a standalone desktop computer vision tool using OpenCV and Tkinter. The tool must provide a graphical user interface (GUI) that allows users to open a local image or access the webcam, apply image processing and vision operations, **adjust parameters interactively**, and save the output. The emphasis is on correct implementation, usability, parameter control, and clear documentation.

Functional Requirements (What your tool must do)

A. Input & I/O

Open Local Image

i) File → Open… (Tkinter menu), ii) Supported formats: at least JPG, PNG, BMP

Access Live Webcam

i) File → Access Live Webcam, ii) Display the live feed in the main window, iii) Take Snapshot button to freeze a frame and switch to static-image mode

Save Output

i) File → Save As…, ii) Save the currently displayed image (processed result)

B. GUI & Interaction

Implement **at least** the following GUI features:

Menu bar (File / Tools)

Buttons (e.g., Apply, Snapshot)

Trackbars / sliders for real-time parameter control

Text box for numeric input (with validation)

Clean exit and proper resource release (camera, windows)

C. Image Processing & Vision Operations (Choose 10 total, with constraints)

You must implement at least 10 operations overall, satisfying the minimum coverage below. You may implement more for bonus credit.

1. Foundations & Color (Choose 2)

RGB channel access and manipulation

Grayscale conversion

Brightness / contrast adjustment

HSV (Hue/Saturation/Value) adjustment

Color blindness simulation (matrix-based)
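As an illustration of the brightness / contrast item above, here is a minimal NumPy sketch. It assumes the common linear model out = alpha * pixel + beta; the function name `adjust_brightness_contrast` and the parameter names `alpha` and `beta` are chosen for this example and are not prescribed by the assignment. OpenCV images are NumPy arrays, so NumPy is already available wherever OpenCV is installed.

```python
import numpy as np

def adjust_brightness_contrast(image, alpha=1.0, beta=0.0):
    """Apply out = alpha * image + beta and saturate to the 0..255 range.

    alpha (contrast gain) and beta (brightness offset) are illustrative
    names, e.g. values read from two sliders.
    """
    scaled = image.astype(np.float32) * alpha + beta
    return np.clip(scaled, 0, 255).astype(np.uint8)
```

In a real tool, `cv2.convertScaleAbs(image, alpha=alpha, beta=beta)` performs a similar scale-and-saturate step in one call.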

2. Image Statistics (Implement all)

Histogram computation and display

Histogram equalization
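To make the two statistics operations concrete, the following is a rough NumPy sketch of histogram computation and CDF-based equalization for an 8-bit grayscale image. The function names are hypothetical, and the rounding details may differ slightly from what `cv2.calcHist` and `cv2.equalizeHist` do internally.

```python
import numpy as np

def gray_histogram(gray):
    """Count how many pixels take each of the 256 intensity levels."""
    return np.bincount(gray.ravel(), minlength=256)

def equalize(gray):
    """Remap intensities so the cumulative distribution becomes roughly linear."""
    hist = gray_histogram(gray)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]      # CDF value at the darkest level present
    total = cdf[-1]
    if total == cdf_min:           # single-valued image: nothing to equalize
        return gray.copy()
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]               # look up the new value of every pixel
```

The 256-entry array returned by `gray_histogram` is what a histogram display would plot, for example as bars drawn on a Tkinter canvas.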

3. Point & Local Operators (Choose 2)

Contrast stretching

Median filter (kernel size via text box)

Gaussian smoothing (kernel size + σ)

Sharpening (Laplacian or custom kernels)
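For contrast stretching, one possible sketch is shown below. It assumes a percentile-based variant, in which the range between two percentiles is mapped linearly onto 0..255; the function name and the default percentiles (2 and 98) are assumptions for this example, and a plain min/max stretch corresponds to percentiles 0 and 100.

```python
import numpy as np

def stretch_contrast(gray, low_pct=2.0, high_pct=98.0):
    """Linearly map the [low, high] percentile range onto [0, 255]."""
    lo, hi = np.percentile(gray, [low_pct, high_pct])
    if hi <= lo:                   # degenerate range, e.g. a flat image
        return gray.copy()
    out = (gray.astype(np.float32) - lo) / (hi - lo) * 255.0
    return np.clip(out, 0, 255).astype(np.uint8)
```

The two percentiles are natural candidates for sliders, since moving them changes the result continuously.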

4. Edge Detection (Choose 2)

Sobel (kernel size + threshold defined as a ratio of the mean)

Canny (control t1, t2, aperture size)

Laplacian of Gaussian (LoG) (σ, kernel size, threshold)
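The Sobel item asks for a threshold defined as a ratio of the mean. One plausible reading, assumed here, is that a pixel counts as an edge when its gradient magnitude exceeds `ratio` times the mean gradient magnitude of the image. The sketch below uses a fixed 3x3 kernel written out in NumPy; the function name and the default `ratio` are illustrative, and a real tool would more likely call `cv2.Sobel` with a user-selected kernel size.

```python
import numpy as np

def sobel_edges(gray, ratio=1.5):
    """3x3 Sobel gradient magnitude, thresholded at ratio * mean magnitude."""
    g = np.pad(gray.astype(np.float32), 1, mode="edge")
    # Horizontal derivative: kernel [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]].
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]
          - g[:-2, :-2] - 2 * g[1:-1, :-2] - g[2:, :-2])
    # Vertical derivative: the transposed kernel.
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]
          - g[:-2, :-2] - 2 * g[:-2, 1:-1] - g[:-2, 2:])
    magnitude = np.hypot(gx, gy)
    threshold = ratio * magnitude.mean()   # assumed meaning of "ratio of the mean"
    return (magnitude > threshold).astype(np.uint8) * 255
```

Because the threshold scales with the image's own mean gradient, the same ratio tends to behave comparably on dark and bright images.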

5. Segmentation (Choose 2)

Global thresholding

Adaptive thresholding (block size, method, etc.)

Contour detection

OR one advanced method:

Mean Shift (sp, sr, pyramid levels)

Superpixels (SLIC: n_segments, compactness)
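Of the segmentation options, global thresholding is the simplest to sketch. The version below is a NumPy illustration with a hypothetical function name; it clamps the threshold to 0..255 so that an out-of-range slider or text value cannot cause an error.

```python
import numpy as np

def global_threshold(gray, t=127):
    """Pixels brighter than t become 255, all others 0."""
    t = min(max(int(t), 0), 255)   # keep the user-supplied value in range
    return np.where(gray > t, 255, 0).astype(np.uint8)
```

This follows the same rule as `cv2.threshold` with `cv2.THRESH_BINARY` and a maximum value of 255.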

Implementation Rules:

**i) Python only**

**ii) OpenCV + Tkinter only (no PyQt, no web frameworks)**

**iii) Must run locally (not Colab); your tool will be tested by the instructor by calling your script from the terminal**

AI tools may be used, but **you must understand and explain your code in the technical interview**

Code must be well-structured (functions/classes, not one giant script)

Your tool should handle edge cases (such as no image loaded or invalid parameters). These should not cause a crash; instead, the user should be informed (for example: "Please load an image before running the function"). Where necessary, parameter values that fall outside the acceptable range (e.g., kernel size = -1) should be reset to their defaults.
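As an illustration of this fallback behaviour, a minimal sketch of text-box validation for a kernel size might look as follows. The function name `parse_kernel_size`, the default of 3, and the 1 to 31 range are assumptions made for this example, not part of the requirements.

```python
def parse_kernel_size(text, default=3, minimum=1, maximum=31):
    """Convert text-box input to a valid odd kernel size.

    Returns (value, message). message is None when the input was accepted
    unchanged, otherwise a short explanation to show to the user.
    """
    try:
        value = int(text.strip())
    except (ValueError, AttributeError):
        return default, f"'{text}' is not a whole number; using {default}."
    if value < minimum or value > maximum:
        return default, (f"Kernel size must be between {minimum} and "
                         f"{maximum}; using {default}.")
    if value % 2 == 0:
        # Many filters require an odd size, so move to a neighbouring odd value.
        adjusted = value + 1 if value + 1 <= maximum else value - 1
        return adjusted, f"Kernel size must be odd; using {adjusted}."
    return value, None
```

The returned message can be passed to `tkinter.messagebox.showwarning` so that the user is informed instead of the tool crashing.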

Deliverables:

IMPORTANT: In your submission, upload the PDF as the primary resource and a zip file containing both the PDF and the script as a secondary resource.

1. **PDF Documentation (36 pages): Primary resource to upload**

Your PDF must include:

**Overview of the Tool**

A short description of the main features of the tool

**List of Implemented Functionalities**

One subsection per operation

**Parameters Table**

For each operation:

Parameter name

Type (slider/textbox/menu)

Valid range

Effect on output

**GUI Screenshots**

Annotated where appropriate

2. **Python Script (.py): To be packaged with the PDF file and uploaded as a secondary resource**

Your script must:

Launch the GUI

Support image and webcam modes

Provide interactive controls for selected operations

Save outputs correctly

Be readable and commented

**File name example:** cv_gui_assignment1_.py

3. Video Demonstration (24 minutes): To be included on the first page of the PDF

Include a **link** (Google Drive) at the beginning of your PDF document to a video showing the following user operations:

Opening an image

Using **at least 4 different operations**

Adjusting parameters live

Accessing webcam and taking a snapshot

Saving the output

Bonus:

Side-by-side original vs processed view

Preset buttons (e.g., Low Noise, Strong Edges)

Keyboard shortcuts

Below is an exercise document to give you an idea.
