Pipe an arbitrary YouTube livestream to an HTML page
(using yt-dlp and ffmpeg for extraction and any format changes)
and detect objects in it with YOLOv5, YOLOv8, or YOLOv9 object detection models.
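As a rough illustration of the extraction step, the sketch below asks yt-dlp for the direct media URL and builds an ffmpeg command that decodes the stream into raw BGR frames on stdout. The function names, the frame size, and the frame rate are assumptions made for this example and are not taken from the project; it also assumes `yt-dlp` and `ffmpeg` are on the PATH.

```python
import subprocess
from typing import Iterator, List


def resolve_stream_url(page_url: str) -> str:
    """Ask yt-dlp for the direct media URL of a video or livestream page."""
    # -g (--get-url) prints the media URL instead of downloading;
    # "-f b" selects the best single pre-merged format.
    result = subprocess.run(
        ["yt-dlp", "-g", "-f", "b", page_url],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip().splitlines()[0]


def ffmpeg_rawvideo_cmd(stream_url: str, width: int, height: int, fps: int) -> List[str]:
    """Build an ffmpeg command that writes raw BGR24 frames to stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-i", stream_url,
        "-vf", f"fps={fps},scale={width}:{height}",  # resample and resize
        "-f", "rawvideo", "-pix_fmt", "bgr24",
        "-",  # write to stdout
    ]


def frame_size_bytes(width: int, height: int) -> int:
    """One BGR24 frame uses 3 bytes per pixel."""
    return width * height * 3


def read_frames(stream_url: str, width: int = 640, height: int = 360,
                fps: int = 5) -> Iterator[bytes]:
    """Yield raw frames (bytes) until the stream ends. Hypothetical helper."""
    size = frame_size_bytes(width, height)
    proc = subprocess.Popen(
        ffmpeg_rawvideo_cmd(stream_url, width, height, fps),
        stdout=subprocess.PIPE,
    )
    try:
        while True:
            buf = proc.stdout.read(size)
            if len(buf) < size:  # EOF or truncated final frame
                break
            yield buf
    finally:
        proc.kill()
        proc.wait()
```

Each yielded buffer could then be reshaped to `(height, width, 3)` and handed to whichever YOLO model is loaded; that step is left out here because it depends on the model version in use.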
Do various things like detect only classes on a list, skip classes on a NOT list, save images with bounding boxes (bbox), etc.
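The list / NOT list filtering could look something like the sketch below. The detection format (a dict with `label`, `conf`, and `box` keys) and the function names are assumptions for illustration only; the real model output would need to be converted into this shape first.

```python
from typing import Dict, Iterable, List, Optional, Set


def keep_detection(label: str,
                   allow: Optional[Set[str]] = None,
                   deny: Optional[Set[str]] = None) -> bool:
    """Return True if a class label passes the list and NOT-list filters."""
    if deny is not None and label in deny:
        return False  # the NOT list always wins
    if allow is not None and label not in allow:
        return False  # an allow list is set and the label is not on it
    return True


def filter_detections(detections: Iterable[Dict],
                      allow: Optional[Set[str]] = None,
                      deny: Optional[Set[str]] = None) -> List[Dict]:
    """Keep only the detections whose label passes both filters."""
    return [d for d in detections if keep_detection(d["label"], allow, deny)]


# Hypothetical detections: label, confidence, and (x1, y1, x2, y2) box.
detections = [
    {"label": "person", "conf": 0.91, "box": (10, 20, 110, 220)},
    {"label": "car", "conf": 0.80, "box": (200, 50, 400, 180)},
    {"label": "dog", "conf": 0.55, "box": (50, 150, 120, 210)},
]

only_listed = filter_detections(detections, allow={"person", "dog"})
not_listed = filter_detections(detections, deny={"car"})
print([d["label"] for d in only_listed])  # ['person', 'dog']
print([d["label"] for d in not_listed])   # ['person', 'dog']
```

Passing `None` for a list disables that filter, so the same function covers the "detect everything" case too.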
To do:
Try persistence with video.