diff --git a/README.md b/README.md
index e258783..54fd522 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,12 @@
 As explained in the project proposal, the idea to retrieve all video ids is to start from a starting set of channels, then list their videos using YouTube Data API v3 PlaylistItems: list, then list the comments on their videos and then restart the process, as we potentially retrieved new channels thanks to comment authors on videos from already known channels.
 
 For a given channel, there is a single way to list comments users published on it:
 
-As explained, YouTube Data API v3 PlaylistItems: list endpoint enables us to list the channel videos **(isn't there any limit?)** and CommentThreads: list and Comments: list endpoints enable us to retrieve their comments
+As explained, the YouTube Data API v3 PlaylistItems: list endpoint enables us to list a channel's videos (up to 20,000 of them), and the CommentThreads: list and Comments: list endpoints enable us to retrieve their comments.
 
 We can multi-thread this process by channel or we can multi-thread per video of a given channel. As we would like to proceed channel per channel, the question is: **how much time does it take to retrieve all comments from the biggest YouTube channel? If the answer is a long period of time, then multi-threading per video of a given channel may make sense.**
 
+**In fact, we should proceed quickly with CommentThreads: list with `allThreads...` whenever possible.**
+**Do I have an example of a channel where CommentThreads: list works but doesn't list a comment on one of its videos...?**
+
 Have to proceed with a breadth-first search approach, as treating all *child* channels might take a time equivalent to treating the whole original tree.
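
As a rough, hedged sketch of the crawl described above (assuming plain key-based access to the YouTube Data API v3, and ignoring quota costs and error handling): channels: list resolves a channel's uploads playlist, PlaylistItems: list enumerates its videos, CommentThreads: list returns each video's comments, and comment authors feed the breadth-first queue. `API_KEY`, the `paged` helper, and the example seed channel are illustrative placeholders, not project code.

```python
import requests
from collections import deque

API_BASE = "https://www.googleapis.com/youtube/v3"
API_KEY = "YOUR_API_KEY"  # placeholder: any key with YouTube Data API v3 enabled


def paged(endpoint, **params):
    """Yield every item across all pages of a YouTube Data API v3 list endpoint."""
    params["key"] = API_KEY
    while True:
        response = requests.get(f"{API_BASE}/{endpoint}", params=params).json()
        yield from response.get("items", [])
        token = response.get("nextPageToken")
        if not token:
            break
        params["pageToken"] = token


def uploads_playlist_id(channel_id):
    """channels: list exposes the 'uploads' playlist holding the channel's public videos."""
    items = list(paged("channels", part="contentDetails", id=channel_id))
    return items[0]["contentDetails"]["relatedPlaylists"]["uploads"]


def crawl(seed_channels):
    """Breadth-first search: channel -> videos -> comment authors -> new channels."""
    queue = deque(seed_channels)
    seen_channels = set(seed_channels)
    video_ids = set()
    while queue:
        channel_id = queue.popleft()
        playlist = uploads_playlist_id(channel_id)
        # PlaylistItems: list enumerates the uploads playlist (capped around 20,000 videos).
        for item in paged("playlistItems", part="contentDetails",
                          playlistId=playlist, maxResults=50):
            video_id = item["contentDetails"]["videoId"]
            video_ids.add(video_id)
            # CommentThreads: list per video; authors of top-level comments may be new channels.
            for thread in paged("commentThreads", part="snippet",
                                videoId=video_id, maxResults=100):
                author = (thread["snippet"]["topLevelComment"]["snippet"]
                          .get("authorChannelId", {}).get("value"))
                if author and author not in seen_channels:
                    seen_channels.add(author)
                    queue.append(author)
    return video_ids


if __name__ == "__main__":
    # Example seed: the Google Developers channel id.
    print(len(crawl(["UC_x5XG1OV2P6uZZ5FSM9Ttw"])))
```

This per-video loop is what the multi-threading question above is about: for the biggest channels, the inner CommentThreads: list loop dominates, so parallelising across a channel's videos is where it would pay off.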
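The `allThreads...` note is left truncated above; assuming it refers to the `allThreadsRelatedToChannelId` filter of CommentThreads: list (an assumption, not something the notes confirm), the fast path would be a single paginated call per channel rather than one call per video, keeping the per-video loop as a fallback for channels where it misses comments — which is exactly the open question in the second note. A sketch, reusing the hypothetical `paged` helper above:

```python
def channel_comment_threads(channel_id):
    """One paginated CommentThreads: list call for a whole channel.

    Assumes `allThreads...` means the allThreadsRelatedToChannelId filter.
    Whether this really lists every comment of every video is the open
    question above, so crawl() keeps the per-video path as a fallback.
    """
    for thread in paged("commentThreads", part="snippet",
                        allThreadsRelatedToChannelId=channel_id, maxResults=100):
        snippet = thread["snippet"]["topLevelComment"]["snippet"]
        # videoId is absent for threads on the channel itself rather than on a video.
        yield (thread["snippet"].get("videoId"),
               snippet.get("authorChannelId", {}).get("value"))
```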