How is the data collected from the videos?

I used Python with the following libraries:

Library       Reason
pytube        Extracts metadata from YouTube videos and downloads them
numpy
cv2           VideoCapture is used to read frames from the video
Pillow        Image manipulation, needed to crop images before they are passed to the OCR software
pytesseract   Python wrapper for the Tesseract-OCR software
re            Regular expressions
scrapetube    Gets all the video identifiers from a YouTube channel
json          Dumps and loads data in JSON format

The procedure*

*Main idea

First, I get all the videos uploaded to the DGR channel.
Then I check whether I have already extracted data from the current video; if I processed it in the past, the video is skipped.
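
A minimal sketch of these two steps, assuming the IDs of already processed videos are kept in a JSON list on disk. The function names and the file layout are hypothetical, not the ones used in the repository. scrapetube is imported lazily so the filtering logic can be tried without network access.

```python
import json
from pathlib import Path


def load_processed_ids(path):
    """Return the set of video IDs handled in earlier runs (empty if no file yet)."""
    file = Path(path)
    if not file.exists():
        return set()
    return set(json.loads(file.read_text(encoding="utf-8")))


def pending_videos(video_ids, processed_ids):
    """Keep only the IDs that were not processed before, preserving their order."""
    return [video_id for video_id in video_ids if video_id not in processed_ids]


def channel_video_ids(channel_id):
    """List the video IDs of a channel. Needs scrapetube and network access."""
    import scrapetube  # imported here so the helpers above work without it

    # scrapetube.get_channel yields one dict per video; "videoId" holds the identifier.
    return [video["videoId"] for video in scrapetube.get_channel(channel_id)]
```

Typical use would be `pending_videos(channel_video_ids(channel_id), load_processed_ids("processed.json"))`, where the channel ID and the file name are placeholders.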

Download the video to analyse the frames

After the video is downloaded, a frame is extracted from the video and analysed.
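
As a rough sketch, a single frame can be read with cv2.VideoCapture as shown below. The timestamp at which the level code is visible is an assumption here, since the text does not say which frame is used, and the helper names are hypothetical.

```python
def frame_index(timestamp_s, fps):
    """Convert a timestamp in seconds to the nearest frame number."""
    return int(round(timestamp_s * fps))


def read_frame(video_path, timestamp_s):
    """Return the frame at the given timestamp as a BGR numpy array, or None on failure."""
    import cv2  # imported here so frame_index can be used without OpenCV

    capture = cv2.VideoCapture(video_path)
    try:
        fps = capture.get(cv2.CAP_PROP_FPS)
        # Seek to the wanted frame, then decode it.
        capture.set(cv2.CAP_PROP_POS_FRAMES, frame_index(timestamp_s, fps))
        ok, frame = capture.read()
    finally:
        capture.release()
    return frame if ok else None
```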

The frame is cropped at the two specific places where the level code is located.
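
The cropping step could look like the sketch below. The two regions are placeholder values, not the real coordinates used in the project; they are given as fractions of the frame size so they do not depend on the video resolution. The `image` argument is assumed to be a Pillow `Image`.

```python
# Hypothetical regions as (left, top, right, bottom) fractions of the frame size.
CODE_REGIONS = [
    (0.05, 0.05, 0.35, 0.15),
    (0.65, 0.85, 0.95, 0.95),
]


def fraction_box(width, height, region):
    """Turn a fractional region into the pixel box expected by Image.crop."""
    left, top, right, bottom = region
    return (int(left * width), int(top * height),
            int(right * width), int(bottom * height))


def crop_regions(image, regions=CODE_REGIONS):
    """Return one cropped Pillow image per region."""
    width, height = image.size
    return [image.crop(fraction_box(width, height, region)) for region in regions]
```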

OCR (optical character recognition) is run on the cropped images.
Because OCR can misrecognize letters, a regular expression is used to filter the recognized text.
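
A sketch of the OCR and filtering step. The code format is an assumption: three groups of three letters or digits separated by hyphens. The pattern actually used in the repository may differ, and the function names are hypothetical.

```python
import re

# Assumed format: XXX-XXX-XXX, tolerating stray spaces around the hyphens.
CODE_PATTERN = re.compile(
    r"\b([0-9A-Z]{3})\s*-\s*([0-9A-Z]{3})\s*-\s*([0-9A-Z]{3})\b"
)


def read_text(image):
    """Run Tesseract on a cropped image. Needs pytesseract and Tesseract-OCR installed."""
    import pytesseract  # imported here so the regex helper works without it

    return pytesseract.image_to_string(image)


def extract_level_code(ocr_text):
    """Return the first code found in the OCR output, normalized, or None."""
    match = CODE_PATTERN.search(ocr_text.upper())
    if match is None:
        return None
    return "-".join(match.groups())
```

If both crops are processed, the first non-None result of `extract_level_code(read_text(crop))` can be taken as the level code.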

The level code is extracted and saved.

The extracted code is used on the webpage WEBPAGE to retrieve all the metadata of the level.

Finally, all the data is saved together with the YouTube metadata [Thumbnail, Description, URL].
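
One possible shape for the saved data, as a sketch only. The key names and the one-file layout are assumptions and not necessarily what the repository uses.

```python
import json


def build_record(level_code, level_metadata, thumbnail, description, url):
    """Combine the level data with the YouTube metadata in one dict."""
    return {
        "level_code": level_code,
        "level": level_metadata,
        "youtube": {
            "thumbnail": thumbnail,
            "description": description,
            "url": url,
        },
    }


def save_records(records, path):
    """Write all records to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, ensure_ascii=False, indent=2)


def load_records(path):
    """Read the records back from the JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```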

Visit the repository here