diff --git a/Home.md b/Home.md
index c055750..16bd5c8 100644
--- a/Home.md
+++ b/Home.md
@@ -1 +1,18 @@
-As described on [the project proposal page](https://gitea.lemnoslife.com/Benjamin_Loison/YouTube_captions_search_engine/wiki/Project-proposal), there is a discovery process consisting in going through comments, so we will try to also keep comments. That way we could end up, potentially after the project, doing interesting stuff such as listing all comments written by a given user, as [my French only without discovery process project](https://github.com/Benjamin-Loison/YouTube-comments-graph) was doing.
\ No newline at end of file
+As described on [the project proposal page](https://gitea.lemnoslife.com/Benjamin_Loison/YouTube_captions_search_engine/wiki/Project-proposal), there is a discovery process consisting in going through comments, so we will try to also keep comments. That way we could end up, potentially after the project, doing interesting stuff such as listing all comments written by a given user, as [my French only without discovery process project](https://github.com/Benjamin-Loison/YouTube-comments-graph) was doing.
+
+## Dive into YouTube search results
+
+At first glance, it seems that YouTube search only returns videos whose auto-generated captions match the query.
+
+Let's consider [`7TXEZ4tP06c`](https://www.youtube.com/watch?v=7TXEZ4tP06c): at [0:18](https://www.youtube.com/watch?v=7TXEZ4tP06c&t=18s), the auto-generated captions read `how many people here would say they can draw`.
+
+[Passing this sentence to the YouTube Data API v3 Search: list endpoint](https://yt.lemnoslife.com/noKey/search?part=snippet&q=%22how%20many%20people%20here%20would%20say%20they%20can%20draw%22&maxResults=50) returns these videos:
+
+- [`7TXEZ4tP06c`](https://www.youtube.com/watch?v=7TXEZ4tP06c): the original video ([0:18](https://www.youtube.com/watch?v=7TXEZ4tP06c&t=18s))
+- [`qH-yY7UZW_k`](https://www.youtube.com/watch?v=qH-yY7UZW_k): reuploads part of the original video at [1:13](https://www.youtube.com/watch?v=qH-yY7UZW_k&t=73s) ([1:16](https://www.youtube.com/watch?v=qH-yY7UZW_k&t=76s))
+- [`cOwYXnpW-8A`](https://www.youtube.com/watch?v=cOwYXnpW-8A): reuploads part of the original video ([0:05](https://www.youtube.com/watch?v=cOwYXnpW-8A&t=5s))
+- [`vzH9Fo9GI9Y`](https://www.youtube.com/watch?v=vzH9Fo9GI9Y): reuploads part of the original video at [3:31](https://youtu.be/vzH9Fo9GI9Y?t=211s) ([3:39](https://www.youtube.com/watch?v=vzH9Fo9GI9Y&t=219s))
+- [`gpMp6tz3d7w`](https://www.youtube.com/watch?v=gpMp6tz3d7w): reuploads part of the original video at [0:37](https://youtu.be/gpMp6tz3d7w?t=37s) ([0:41](https://www.youtube.com/watch?v=gpMp6tz3d7w&t=41s))
+- [`ZI7XTsGTl34`](https://www.youtube.com/watch?v=ZI7XTsGTl34): reuploads part of the original video at [23:36](https://youtu.be/ZI7XTsGTl34?t=1416s) ([23:43](https://www.youtube.com/watch?v=ZI7XTsGTl34&t=1423s))
+
+Note that, apart from the original itself, all of these videos are partial reuploads of the original video; they all have auto-generated captions, and all contain exactly `how many people here would say they can draw`.
\ No newline at end of file
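
The query linked in the patch above could be reproduced with a short script. The following is only a minimal sketch: the base URL and the `part`, `q` and `maxResults` parameters come from the link in the patch, while the function names (`build_search_url`, `extract_video_ids`, `search_video_ids`) are hypothetical helpers introduced for illustration. It assumes the proxy returns the usual Search: list JSON, where each video result carries its ID under `items[].id.videoId`.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Base URL taken from the link in the patch (a no-key proxy to YouTube Data API v3).
SEARCH_URL = "https://yt.lemnoslife.com/noKey/search"


def build_search_url(phrase, max_results=50):
    """Build a Search: list URL that looks for `phrase` as a quoted (exact) query."""
    params = {"part": "snippet", "q": f'"{phrase}"', "maxResults": max_results}
    # quote_via=quote encodes spaces as %20 rather than +, matching the linked URL.
    return f"{SEARCH_URL}?{urlencode(params, quote_via=quote)}"


def extract_video_ids(response):
    """Return the video IDs of a Search: list response, skipping channels and playlists."""
    return [
        item["id"]["videoId"]
        for item in response.get("items", [])
        if item.get("id", {}).get("kind") == "youtube#video"
    ]


def search_video_ids(phrase, max_results=50):
    """Hypothetical helper: send the request and return the matching video IDs."""
    with urlopen(build_search_url(phrase, max_results)) as reply:
        return extract_video_ids(json.load(reply))
```

Calling `search_video_ids("how many people here would say they can draw")` performs the network request; the results may differ from the list in the patch, since search results can change over time.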