Interactive Computer Vision Tool with GUI
Submission: PDF documentation (Primary resource) + .py script + demo video link
Objective
Design and implement a standalone desktop computer vision tool using OpenCV and Tkinter. The tool must provide a graphical user interface (GUI) that allows users to open a local image or access the webcam, apply image processing and vision operations, **adjust parameters interactively**, and save the output. The emphasis is on correct implementation, usability, parameter control, and clear documentation.
Functional Requirements (What your tool must do)
A. Input & I/O
Open Local Image
i) File → Open… (Tkinter menu), ii) Supported formats: JPG, PNG, BMP (at least)
Access Live Webcam
i) File → Access Live Webcam, ii) Display the live feed in the main window, iii) A Take Snapshot button freezes a frame and switches to static-image mode
Save Output
i) File → Save As…, ii) Save the currently displayed image (processed result)
B. GUI & Interaction
Implement **at least** the following GUI features:
Menu bar (File / Tools)
Buttons (e.g., Apply, Snapshot)
Trackbars / sliders for real-time parameter control
Text box for numeric input (with validation)
Clean exit and proper resource release (camera, windows)
C. Image Processing & Vision Operations (Choose 10 total, with constraints)
You must implement at least 10 operations overall, satisfying the minimum coverage below. You may implement more for bonus credit.
1. Foundations & Color (Choose 2)
RGB channel access and manipulation
Grayscale conversion
Brightness / contrast adjustment
HSV (Hue/Saturation/Value) adjustment
Color blindness simulation (matrix-based)
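As a minimal sketch of the brightness/contrast option listed above, assuming the common linear model `out = alpha * pixel + beta`, the adjustment can be written with NumPy alone. The names `adjust_brightness_contrast`, `alpha`, and `beta` are illustrative and not prescribed by the assignment.

```python
import numpy as np


def adjust_brightness_contrast(image, alpha=1.0, beta=0.0):
    """Scale an 8-bit image by alpha (contrast) and shift it by beta (brightness)."""
    # Work in float so values above 255 or below 0 do not wrap around.
    adjusted = image.astype(np.float32) * alpha + beta
    # Clamp to the valid 8-bit range before converting back.
    return np.clip(adjusted, 0, 255).astype(np.uint8)


# Tiny 1x3 grayscale "image" used only for illustration.
sample = np.array([[0, 100, 200]], dtype=np.uint8)
brighter = adjust_brightness_contrast(sample, alpha=1.5, beta=10)
```

In a GUI, `alpha` and `beta` would typically be driven by two sliders. OpenCV's `cv2.convertScaleAbs` performs a similar scale-and-shift with saturation, but it takes the absolute value first, so results differ when `alpha * pixel + beta` is negative.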
2. Image Statistics (Implement all)
Histogram computation and display
Histogram equalization
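The two statistics operations above can be sketched for an 8-bit grayscale image as follows. This is an illustrative NumPy version of the textbook CDF-remapping approach, not a required implementation, and the function names are hypothetical.

```python
import numpy as np


def compute_histogram(gray):
    """Count how many pixels take each of the 256 possible 8-bit intensities."""
    return np.bincount(gray.ravel(), minlength=256)


def equalize_histogram(gray):
    """Remap intensities so the cumulative distribution becomes roughly linear."""
    hist = compute_histogram(gray)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # cumulative count at the darkest intensity present
    total = cdf[-1]            # total number of pixels
    if total == cdf_min:
        # Constant image: there is no contrast to spread out.
        return gray.copy()
    # Build a 256-entry lookup table mapping old intensities to new ones.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]


# Low-contrast 1x4 sample used only for illustration.
sample = np.array([[50, 100, 150, 200]], dtype=np.uint8)
equalized = equalize_histogram(sample)
```

OpenCV provides `cv2.calcHist` and `cv2.equalizeHist` for the same purposes. Writing the steps out by hand mainly helps when explaining the operation in the documentation or the interview.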
3. Point & Local Operators (Choose 2)
Contrast stretching
Median filter (kernel size via text box)
Gaussian smoothing (kernel size + σ)
Sharpening (Laplacian or custom kernels)
4. Edge Detection (Choose 2)
Sobel (kernel size + threshold defined as a ratio of the mean)
Canny (control t1, t2, aperture size)
Laplacian of Gaussian (LoG) (σ, kernel size, threshold)
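One possible reading of the Sobel "threshold defined as a ratio of mean" is that a pixel counts as an edge when its gradient magnitude exceeds `ratio` times the mean magnitude of the whole image. That interpretation is an assumption, as are the names below. In a real tool the magnitude would usually be built from `cv2.Sobel` derivatives in x and y, while this sketch uses a hand-made array so it stays self-contained.

```python
import numpy as np


def threshold_by_mean_ratio(magnitude, ratio=1.5):
    """Return a binary edge map (0 or 255) from a gradient-magnitude array."""
    # The threshold adapts to the image: it scales with the average edge strength.
    threshold = ratio * float(magnitude.mean())
    return np.where(magnitude > threshold, 255, 0).astype(np.uint8)


# Hand-made magnitudes standing in for real Sobel output (mean is 30).
magnitude = np.array([[0.0, 10.0], [20.0, 90.0]])
edges = threshold_by_mean_ratio(magnitude, ratio=1.5)
```

Tying the threshold to the mean lets one slider value behave consistently across dark and bright images, which a fixed absolute threshold would not.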
5. Segmentation (Choose 2)
Global thresholding
Adaptive thresholding (block size, method, etc.)
Contour detection
OR one advanced method:
Mean Shift (sp, sr, pyramid levels)
Superpixels (SLIC: n_segments, compactness)
Implementation Rules:
**i) Python only**
**ii) OpenCV + Tkinter only (no PyQt, no web frameworks)**
**iii) Must run locally (not Colab); your tool will be tested by the instructor by running your script from a terminal**
AI tools may be used, but **you must understand your code and be able to explain it in the technical interview**
Code must be well-structured (functions/classes, not one giant script)
Your tool should handle edge cases such as no image being loaded or invalid parameters. These must not cause a crash; instead, the user should be informed (for example: "Please load an image before running the function"), and, where necessary, parameter values that fall outside the acceptable range (e.g., kernel size = -1) should be reset to their defaults.
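The fallback-to-default behaviour for invalid parameters could look like the following sketch for a kernel-size text box. The function name, the default of 3, and the 1 to 31 range are illustrative assumptions, not values fixed by the assignment.

```python
def parse_kernel_size(text, default=3, minimum=1, maximum=31):
    """Turn text-box input into an odd kernel size, falling back to a default.

    Returns (value, message); message is None when the input was accepted as-is.
    Assumes `maximum` is odd so that rounding an even value up stays in range.
    """
    try:
        value = int(text.strip())
    except (ValueError, AttributeError):
        # Non-numeric text, or no text at all (None).
        return default, f"'{text}' is not an integer; using default {default}."
    if value < minimum or value > maximum:
        return default, (
            f"Kernel size must be between {minimum} and {maximum}; "
            f"using default {default}."
        )
    if value % 2 == 0:
        # OpenCV's median and Gaussian filters require odd kernel sizes.
        value += 1
        return value, f"Kernel size must be odd; using {value}."
    return value, None


size, warning = parse_kernel_size("-1")
```

In the GUI, a non-empty `warning` could be shown with `tkinter.messagebox.showwarning`, and the same pattern applies to a "no image loaded" check before any operation runs.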
Deliverables:
IMPORTANT: In your submission, upload the PDF as the primary resource and a zip file containing both the PDF and the script as a secondary resource.
1. **PDF Documentation (3–6 pages): Primary resource to upload**
Your PDF must include:
**Overview of the Tool**
A short description of the main features of the tool
**List of Implemented Functionalities**
One subsection per operation
**Parameters Table**
For each operation:
Parameter name
Type (slider/textbox/menu)
Valid range
Effect on output
**GUI Screenshots**
Annotated where appropriate
2. **Python Script (.py): To be zipped together with the PDF file and uploaded as a secondary resource**
Your script must:
Launch the GUI
Support image and webcam modes
Provide interactive controls for selected operations
Save outputs correctly
Be readable and commented
**File name example:** cv_gui_assignment1_.py
3. **Video Demonstration (2–4 minutes): To be included on the first page of the PDF**
Include a **link** (Google Drive) at the beginning of your PDF document to a video showing the following user operations:
Opening an image
Using **at least 4 different operations**
Adjusting parameters live
Accessing webcam and taking a snapshot
Saving the output
Bonus:
Side-by-side original vs processed view
Preset buttons (e.g., Low Noise, Strong Edges)
Keyboard shortcuts
Below is an exercise document to give you an idea.
