Add support for multiple keys to be resilient against exceeded quota errors #6

Closed
opened 2023-01-02 19:50:26 +01:00 by Benjamin_Loison · 2 comments

While #4 makes this improvement unnecessary, implementing it would avoid going through a proxy and the latency that adds (in addition, the https://yt.lemnoslife.com no-key service has only two logical cores, so it wouldn't be able to keep up with requests from multi-threaded processes).

I have a personal YouTube Data API v3 key with a quota of 16,000,000 per day and 180,000 per minute. As 180,000 * 60 * 24 = 259,200,000 >> 16,000,000, the per-day quota is our limiting factor, since our algorithm has on average no special quota-burning period.
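A quick sanity check of that arithmetic (plain Python, using the quota figures above):

```python
# Quota figures for the personal YouTube Data API v3 key quoted above.
QUOTA_PER_DAY = 16_000_000
QUOTA_PER_MINUTE = 180_000

# If we could sustain the per-minute rate for a whole day:
per_minute_ceiling = QUOTA_PER_MINUTE * 60 * 24
print(per_minute_ceiling)  # 259200000, far above the 16,000,000 daily quota

# So the per-day quota is the binding limit.
assert per_minute_ceiling > QUOTA_PER_DAY
```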

From my home server and my VPS it takes 250 ms and 175 ms respectively to complete (tested twice):

```sh
time curl 'https://www.googleapis.com/youtube/v3/commentThreads?part=snippet,replies&allThreadsRelatedToChannelId=UC4QobU6STFB0P71PMvOGN5A&maxResults=100&key=AIzaSy...' > /dev/null
```

So we can assume that in the best case it takes on average 50 ms to complete a request to YouTube Data API v3.

Both YouTube Data API v3 [CommentThreads: list](https://developers.google.com/youtube/v3/docs/commentThreads/list) and [Comments: list](https://developers.google.com/youtube/v3/docs/comments/list) endpoints cost 1 quota unit per call.

So per day it limits us to 16,000,000 / (3,600 * 24 * 20) = 9(.259) threads, the 20 being requests per second per thread at 50 ms each.
I will work with this multi-threading limitation for the moment, as it's already a good start. Even with 8 logical cores we could go far beyond this limit, since threads are idle while waiting for responses from the YouTube Data API v3 servers, and treating the responses is quite trivial (only some mutex and logging work).
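The thread-count estimate, spelled out (50 ms per request means 20 requests per second per thread):

```python
QUOTA_PER_DAY = 16_000_000
SECONDS_PER_DAY = 3_600 * 24
REQUESTS_PER_SECOND_PER_THREAD = 20  # one request every 50 ms, 1 quota unit each

# Quota one thread burns in a day, then how many threads the daily quota allows.
quota_per_thread_per_day = SECONDS_PER_DAY * REQUESTS_PER_SECOND_PER_THREAD  # 1,728,000
max_threads = QUOTA_PER_DAY / quota_per_thread_per_day
print(round(max_threads, 3))  # 9.259
```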

Related to #7 and #9.

Benjamin_Loison changed title from Add support for multiple keys to Add support for multiple keys to be resilient against exceeded quota errors 2023-01-02 19:50:43 +01:00
Author
Owner

With 10 threads, for no clear reason the 8 logical cores are used at maximum capacity according to `htop`; that should be investigated (finally treated in fdfec17817 thanks to a different search approach for already treated channels). However, with this number of threads I observed that in practice we treat about 100,000 comments per minute.

This might highlight the importance of being more precise with the data we store (cf. #2).

Benjamin_Loison added the
medium
label 2023-01-06 18:50:46 +01:00
Benjamin_Loison added the
enhancement
label 2023-01-06 18:52:52 +01:00
Benjamin_Loison added the
medium priority
label 2023-01-06 19:33:01 +01:00
Author
Owner

Making statistics of the estimated quota per key on the official instance's no-key service would be interesting, to know how many threads I can use.

There is no particular need to make statistics about quota usage, as we can't do better anyway. However, it may become interesting if we reach the quota limit of all keys of the YouTube operational API no-key service. Otherwise we could use a dichotomy approach on the number of threads, but we have to keep in mind that these keys are also used by their original owners and by the no-key service, so we can't guarantee that they have any quota left at all.
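The dichotomy approach on the number of threads could be sketched as follows (a hypothetical illustration, not project code): binary search for the largest thread count whose trial run completes without a `quotaExceeded` error.

```python
def find_max_threads(run_without_quota_error, low=1, high=64):
    """Binary search the largest thread count whose trial run hits no quota error.

    `run_without_quota_error(n)` is assumed to run the crawler with n threads
    for a while and return True if no quotaExceeded error occurred."""
    best = low
    while low <= high:
        mid = (low + high) // 2
        if run_without_quota_error(mid):
            best = mid
            low = mid + 1   # this count is fine, try more threads
        else:
            high = mid - 1  # quota exceeded, try fewer threads
    return best
```

Since the keys are shared with their original owners and with the no-key service, the remaining quota fluctuates, so the result of such a search would only be a rough, temporary estimate.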

Benjamin_Loison added
high priority
and removed
medium
labels 2023-01-08 16:59:49 +01:00
Reference: Benjamin_Loison/YouTube_captions_search_engine#6