Complete o8NPllzkFhE linux is in millions of computers example

Benjamin Loison 2022-12-21 22:48:56 +01:00
parent 33cc99ee88
commit 7682fdd505

16
Home.md

@ -4,9 +4,11 @@ As described on [the project proposal page](https://gitea.lemnoslife.com/Benjami
At first glance, it seems that YouTube only returns videos whose auto-generated captions match the query.
Let's consider [`7TXEZ4tP06c`](https://www.youtube.com/watch?v=7TXEZ4tP06c) at [0:18](https://www.youtube.com/watch?v=7TXEZ4tP06c&t=18s) the auto-generated captions are `how many people here would say they can draw`.
### - [`7TXEZ4tP06c`](https://www.youtube.com/watch?v=7TXEZ4tP06c) `how many people here would say they can draw`
[Passing this sentence to YouTube Data API v3 Search: list endpoint](https://yt.lemnoslife.com/noKey/search?part=snippet&q=%22how%20many%20people%20here%20would%20say%20they%20can%20draw%22&maxResults=50) returns these videos:
Let's consider [`7TXEZ4tP06c`](https://www.youtube.com/watch?v=7TXEZ4tP06c) at [0:18](https://www.youtube.com/watch?v=7TXEZ4tP06c&t=18s): both the auto-generated and the non-auto-generated captions are `how many people here would say they can draw`.
[Passing this sentence to YouTube Data API v3 Search: list endpoint](https://yt.lemnoslife.com/noKey/search?part=snippet&q="how many people here would say they can draw"&maxResults=50) returns these videos:
- [`7TXEZ4tP06c`](https://www.youtube.com/watch?v=7TXEZ4tP06c): is the original video ([`7TXEZ4tP06c`](https://www.youtube.com/watch?v=7TXEZ4tP06c)) ([0:18](https://www.youtube.com/watch?v=7TXEZ4tP06c&t=18s))
- [`qH-yY7UZW_k`](https://www.youtube.com/watch?v=qH-yY7UZW_k): reupload part of the original video at [1:13](https://www.youtube.com/watch?v=qH-yY7UZW_k&t=73s) ([1:16](https://www.youtube.com/watch?v=qH-yY7UZW_k&t=76s))
@ -16,3 +18,13 @@ Let's consider [`7TXEZ4tP06c`](https://www.youtube.com/watch?v=7TXEZ4tP06c) at [
- [`ZI7XTsGTl34`](https://www.youtube.com/watch?v=ZI7XTsGTl34): reupload part of the original video at [23:36](https://youtu.be/ZI7XTsGTl34?t=1416s) ([23:43](https://www.youtube.com/watch?v=ZI7XTsGTl34&t=1423s))
Note that all of these videos are partial uploads of the original video, all have auto-generated captions, and all contain exactly `how many people here would say they can draw`.
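The search links above can be rebuilt programmatically. The following is a minimal sketch, assuming that `https://yt.lemnoslife.com/noKey/search` accepts the same `part`, `q` and `maxResults` parameters as the links on this page; the helper name `build_search_url` is hypothetical, and it only builds the URL without sending any request.

```python
from urllib.parse import quote, urlencode

# Base URL copied from the links above; treating it as a stand-in for the
# YouTube Data API v3 Search: list endpoint is an assumption of this sketch.
SEARCH_ENDPOINT = "https://yt.lemnoslife.com/noKey/search"


def build_search_url(phrase: str, max_results: int = 50) -> str:
    """Return a search URL that queries `phrase` wrapped in double quotes."""
    params = {
        "part": "snippet",
        # The links above wrap the phrase in double quotes; that this asks
        # for an exact-phrase match is an assumption, not documented behavior.
        "q": f'"{phrase}"',
        "maxResults": max_results,
    }
    # quote_via=quote encodes spaces as %20 (the default quote_plus uses +).
    return f"{SEARCH_ENDPOINT}?{urlencode(params, quote_via=quote)}"


print(build_search_url("how many people here would say they can draw"))
```

Percent-encoding the query keeps the link valid in Markdown renderers that stop parsing a URL at the first space or double quote.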
### - [`o8NPllzkFhE`](https://www.youtube.com/watch?v=o8NPllzkFhE) `linux is in millions of computers`
Completing [the project proposal example](https://gitea.lemnoslife.com/Benjamin_Loison/YouTube_captions_search_engine/wiki/Project-proposal): [`Vo9KPk-gqKk`](https://www.youtube.com/watch?v=Vo9KPk-gqKk) reuploads part of the original video and has only auto-generated captions, which contain `your software Linux is in millions of computers`, while [`o8NPllzkFhE`](https://www.youtube.com/watch?v=o8NPllzkFhE), the original video, has auto-generated captions that contain `your software uh linux is in millions of computers` and non-auto-generated captions that contain `Your software, Linux, is in millions of computers`.
The weird thing is that [passing `linux is in millions of computers` to YouTube Data API v3 Search: list endpoint](https://yt.lemnoslife.com/noKey/search?part=snippet&q="linux is in millions of computers") returns only these videos:
- [`Vo9KPk-gqKk`](https://www.youtube.com/watch?v=Vo9KPk-gqKk): reupload part of the original video (`your software Linux is in millions of computers`)
- [`krakddj30eU`](https://www.youtube.com/watch?v=krakddj30eU): reupload part of the original video at [0:05](https://www.youtube.com/watch?v=krakddj30eU&t=5s) (`your software uh linux is in millions of computers`)
Note that all of these videos are partial uploads of the original video, all have auto-generated captions, and all contain exactly, case-insensitively, `Linux is in millions of computers`.
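The case-insensitive containment mentioned above can be checked directly on the caption excerpts quoted in this section. This is a minimal sketch: the helper `contains_phrase` is hypothetical, it does plain substring matching, and nothing here claims that YouTube's search works this way.

```python
# Caption excerpts quoted in this section, keyed by video id and caption kind.
CAPTIONS = {
    "Vo9KPk-gqKk (auto-generated)": "your software Linux is in millions of computers",
    "krakddj30eU (auto-generated)": "your software uh linux is in millions of computers",
    "o8NPllzkFhE (auto-generated)": "your software uh linux is in millions of computers",
    "o8NPllzkFhE (not auto-generated)": "Your software, Linux, is in millions of computers",
}

PHRASE = "linux is in millions of computers"


def contains_phrase(caption: str, phrase: str) -> bool:
    """Return True if `phrase` occurs in `caption`, ignoring case only."""
    return phrase.casefold() in caption.casefold()


# Map each caption label to whether it contains the searched phrase.
matches = {label: contains_phrase(text, PHRASE) for label, text in CAPTIONS.items()}

for label, matched in matches.items():
    print(f"{label}: {matched}")
```

Under this check the non-auto-generated captions of `o8NPllzkFhE` do not match, because of the comma after `Linux`, while its auto-generated captions do match even though the original video is absent from the results.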