From df3af2780b1211ee10dd8a1f4b6749f3bdff9aa5 Mon Sep 17 00:00:00 2001
From: Benjamin Loison
Date: Thu, 23 Feb 2023 23:12:18 +0100
Subject: [PATCH] #19: Detail how to run the website and reference `channels.txt` on it

---
 README.md         | 73 ++++++++++++++++++++++++++++++++++++++++++++---
 website/index.php |  3 +-
 2 files changed, 71 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 3705c18..c1c3574 100644
--- a/README.md
+++ b/README.md
@@ -9,26 +9,91 @@ A ready to be used by the end-user website instance of this project is hosted at
 
 See more details on [the Wiki](https://gitea.lemnoslife.com/Benjamin_Loison/YouTube_captions_search_engine/wiki).
 
-# Running the algorithm:
+# Running the YouTube graph discovery algorithm:
 
 Because of [the current compression mechanism](https://gitea.lemnoslife.com/Benjamin_Loison/YouTube_captions_search_engine/issues/30), Linux is the only known OS able to run this algorithm.
 
+To install the dependencies on an `apt` based Linux distribution of this project, run:
+
 ```sh
 sudo apt install nlohmann-json3-dev yt-dlp
+```
+
+To compile the YouTube discovery graph algorithm, run:
+
+```sh
 make
+```
+
+To see the command line arguments of the algorithm, run:
+
+```sh
 ./youtubeCaptionsSearchEngine -h
 ```
 
-If you plan to use the front-end website, also run:
+To run the YouTube discovery graph algorithm, run:
 
 ```sh
-pip install webvtt-py
+./youtubeCaptionsSearchEngine
 ```
 
 Except if you provide the argument `--youtube-operational-api-instance-url https://yt.lemnoslife.com`, you have [to host your own instance of the YouTube operational API](https://github.com/Benjamin-Loison/YouTube-operational-API/#install-your-own-instance-of-the-api).
 
 Except if you provide the argument `--no-keys`, you have to provide at least one [YouTube Data API v3 key](https://developers.google.com/youtube/v3/getting-started) in `keys.txt`.
 
+# Hosting the website enabling users to make requests:
+
+To install its dependencies make sure to have [`pip`](https://pip.pypa.io/en/stable/installation/) and [`composer`](https://getcomposer.org/doc/00-intro.md) installed and run:
+
 ```sh
-./youtubeCaptionsSearchEngine
+composer install
+pip install webvtt-py
+```
+
+Add the following configuration to your Nginx website one:
+
+```nginx
+    # Make the default webpage of your website to be `index.php`.
+    index index.php;
+
+    # Allow end-users to retrieve the content of a file within a channel zip.
+    location /channels {
+        rewrite ^(.*).zip$ /channels.php;
+        rewrite ^(.*).zip/(.*).json$ /channels.php;
+        rewrite ^(.*).zip/(.*).txt$ /channels.php;
+        rewrite ^(.*).zip/(.*).vtt$ /channels.php;
+        # Allow end-users to list `channels/` content.
+        autoindex on;
+    }
+
+    # Disable end-users to access to other end-users requests.
+    location /users {
+        deny all;
+    }
+
+    # Configure the websocket endpoint.
+    location /websocket {
+        # switch off logging
+        access_log off;
+
+        # redirect all HTTP traffic to localhost
+        proxy_pass http://localhost:4430;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header Host $host;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+
+        # WebSocket support (nginx 1.4)
+        proxy_http_version 1.1;
+        proxy_set_header Upgrade $http_upgrade;
+        proxy_set_header Connection "upgrade";
+
+        # timeout extension, possibly keep this short if using a ping strategy
+        proxy_read_timeout 99999s;
+    }
+```
+
+Start the websocket worker by running:
+
+```sh
+php websockets.php
 ```
diff --git a/website/index.php b/website/index.php
index 3359f10..2d7fd93 100644
--- a/website/index.php
+++ b/website/index.php
@@ -9,7 +9,8 @@ See for more information.
-Access raw data with: .
+Access raw data with: .
+Access found channels with: .
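
Note on the `location /channels` block added to the README in this patch: the four `rewrite` patterns decide which request paths are handed to `channels.php` and which are left to Nginx itself (static files and the `autoindex` listing). The sketch below is only an illustration of that matching, assuming `grep -E` behaves like Nginx's regex engine for these simple patterns. The helper name `matches_channels_rewrite` and the sample paths are hypothetical and not part of the project.

```sh
#!/bin/sh
# Hypothetical helper (not part of the project): print "channels.php" when a
# request path matches one of the four rewrite patterns copied from the
# README's `location /channels` block, and "static" otherwise.
matches_channels_rewrite() {
    path="$1"
    for pattern in \
        '^(.*).zip$' \
        '^(.*).zip/(.*).json$' \
        '^(.*).zip/(.*).txt$' \
        '^(.*).zip/(.*).vtt$'
    do
        # grep -E is assumed to be close enough to Nginx's regex matching here.
        if printf '%s\n' "$path" | grep -Eq "$pattern"; then
            echo "channels.php"
            return 0
        fi
    done
    echo "static"
    return 0
}

# Sample paths (made up for illustration).
matches_channels_rewrite "/channels/UCabc.zip"
matches_channels_rewrite "/channels/UCabc.zip/UCabc.json"
matches_channels_rewrite "/channels/"
```

One thing the sketch makes visible: the `.` before `zip`, `json`, `txt` and `vtt` is unescaped in the patterns, so it matches any character, not only a literal dot. If that is not intended, `\.` would restrict the match to real file extensions.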