Commit Graph

422 Commits

Author SHA1 Message Date
Akash Mahanty 4fd9d142e7 Merge pull request #112 from akamhy/fix
escape '.' before 'archive.org'
2022-01-21 19:52:55 +05:30
Akash Mahanty 5e9fdb40ce escape '.' before 'archive.org'
escape '.' before 'archive.org' on line 88 so it does not match more hosts than expected.
2022-01-21 19:51:08 +05:30
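A minimal sketch of why the escape in the commit above matters. The patterns here are hypothetical and are not the regex on line 88 of the project; they only show that an unescaped `.` matches any character, so a lookalike host passes the check.

```python
import re

# Hypothetical patterns for illustration; not the project's actual regex.
# In UNESCAPED the '.' before 'archive' matches ANY single character.
UNESCAPED = re.compile(r"https?://web.archive\.org/web/\d{14}/")
ESCAPED = re.compile(r"https?://web\.archive\.org/web/\d{14}/")

lookalike = "https://webXarchive.org/web/20220121195108/"
genuine = "https://web.archive.org/web/20220121195108/"

# The unescaped pattern accepts the lookalike host; the escaped one does not.
unescaped_matches_lookalike = UNESCAPED.match(lookalike) is not None
escaped_matches_lookalike = ESCAPED.match(lookalike) is not None
escaped_matches_genuine = ESCAPED.match(genuine) is not None
```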
Akash Mahanty fa72098270 _get_response is not used anymore
- datashaman (<https://stackoverflow.com/users/401467/datashaman>) for <https://stackoverflow.com/a/35504626>. _get_response is based on this amazing answer.
2022-01-21 19:43:35 +05:30
Akash Mahanty d18f955044 date year range 2020-2022 2022-01-21 11:55:42 +05:30
Akash Mahanty 9c340d6967 Create codeql-analysis.yml 2022-01-21 11:12:59 +05:30
Akash Mahanty 78d0e0c126 Update README.md 2022-01-21 09:54:04 +05:30
Akash Mahanty 564101e6f5 🐳 for docker image 2022-01-21 01:23:05 +05:30
Akash Mahanty de5a3e1561 improve usage code 3.0.0 2022-01-18 21:18:17 +05:30
Akash Mahanty 52e46fecc2 more usage example 2022-01-18 20:58:39 +05:30
Akash Mahanty 3b6415abc7 updating examples 2022-01-18 20:44:47 +05:30
Akash Mahanty 66e16d6d89 define __repr__ for the Availability API class 2022-01-18 20:34:21 +05:30
Akash Mahanty 16b9bdd7f9 output the file name if known_url and file flag are passed. 2022-01-18 20:14:44 +05:30
Akash Mahanty 7adc01bff2 implement known_urls for cli from the newer interface. Although use of CDX is recommended, backward compatibility matters. 2022-01-18 20:07:12 +05:30
Akash Mahanty 9bbd056268 Update README.md 2022-01-17 02:15:38 +05:30
Akash Mahanty 2ab44391cf close #107, added link to SecSI/Docker image 2022-01-16 23:01:31 +05:30
Akash Mahanty cc3628ae18 define __str__ for objects of WaybackMachineAvailabilityAPI class, the check for self.JSON ensures that the API was at least called. 2022-01-16 22:28:12 +05:30
Akash Mahanty 1d751b942b invoke json; removing it in the earlier commit was a bad idea, as the end user should not have to call it 2022-01-16 22:15:25 +05:30
Akash Mahanty 261a867a21 near() method of WaybackMachineAvailabilityAPI returns self to preserve past behaviour 2022-01-16 21:53:54 +05:30
Akash Mahanty 2e487e88d3 define __len__ on Url objects, if any method not used prior to len op then default to len of oldest archive. 2022-01-16 21:29:43 +05:30
Akash Mahanty c8d0ad493a defined __str__ for Url objects, print func should print the url. 2022-01-16 21:22:43 +05:30
Akash Mahanty ce869177fd Merge pull request #103 from akamhy/whitesource/configure
Configure WhiteSource Bolt for GitHub
2022-01-02 16:04:15 +05:30
whitesource-bolt-for-github[bot] 58616fb986 Add .whitesource configuration file 2022-01-02 08:45:07 +00:00
Akash Mahanty 4e68cd5743 Create separate module for the 3 different APIs also CDX is now CLI supported. 2022-01-02 14:14:45 +05:30
akamhy a7b805292d changes made for v2.4.4 (update download_url) (#100)
* v2.4.4 (update download_url)

* v2.4.4 (update __version__)

* +1

add jonasjancarik
2.4.4
2021-09-03 11:28:26 +05:30
Jonáš Jančařík 6dc6124dc4 Raise error on a 509 response (too many sessions) (#99)
* Raise error on a 509 response (too many sessions)

When the response code is 509, raise an error with an explanation (based on the actual error message contained in the response HTML).

* Raise error on a 509 response (too many sessions) - linting
2021-09-03 08:04:36 +05:30
Jens Finkhaeuser 5a7fc7d568 Fix typo (#95) 2021-04-13 16:58:34 +05:30
Akash Mahanty 5a9c861cad v2.4.3 (#94)
* 2.4.3

* 2.4.3
2.4.3
2021-04-02 10:41:59 +05:30
Akash Mahanty dd1917c77e added RedirectSaveError - for failed saves if the URL is a permanent … (#93)
* added RedirectSaveError - for failed saves if the URL is a permanent redirect.

* check if url is redirect before throwing exceptions, res.url is the redirect url if redirected at all

* update tests and cli errors
2021-04-02 10:38:17 +05:30
Akash Mahanty db8f902cff Add doc strings (#90)
* Added some docstrings in utils.py

* renamed some func/meth to better names and added doc strings + lint

* added more docstrings

* more docstrings

* improve docstrings

* docstrings

* added more docstrings, lint

* fix import error
2021-01-26 11:56:03 +05:30
Akash Mahanty 88cda94c0b v2.4.2 (#89)
* v2.4.2

* v2.4.2
2.4.2
2021-01-24 17:03:35 +05:30
Akash Mahanty 09290f88d1 fix one more error 2021-01-24 16:58:53 +05:30
Akash Mahanty e5835091c9 import re 2021-01-24 16:56:59 +05:30
Akash Mahanty 7312ed1f4f set cached_save to True if archive older than 3 mins. 2021-01-24 16:53:36 +05:30
Akash Mahanty 6ae8f843d3 add --file to --known_urls 2021-01-24 16:15:11 +05:30
Akash Mahanty 36b936820b known urls now yields, more reliable. And save the file in chunks wrt the response. --file arg can be used to create output file, if --file not used no output will be saved in any file. (#88) 2021-01-24 16:11:39 +05:30
Akash Mahanty a3bc6aad2b too much API usage by duplicate tests was causing too many test failures 2021-01-23 21:08:21 +05:30
Akash Mahanty edc2f63d93 Output valid JSON, dumps python dict. Make JSON valid. 2021-01-23 20:43:52 +05:30
Akash Mahanty ffe0810b12 flag to check whether the saved archive is more than 30 mins old 2021-01-16 12:06:08 +05:30
Akash Mahanty 40233eb115 improve code quality, remove unused imports, use system randomness etc 2021-01-16 11:35:13 +05:30
Akash Mahanty d549d31421 improve save method, now we know that a 302 response indicates that the Wayback Machine is archiving the URL and hasn't yet archived it. We construct an artificial archive with the current UTC time and check for HTTP status code 20* or 30*. If we verify the archival, we return the artificial archive. The artificial archive will automatically point to the new archive or in the best case will be the new archive after some time. 2021-01-16 10:47:43 +05:30
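The approach described in the commit above can be sketched roughly as follows. This is not the project's code: the helper names `artificial_archive_url` and `looks_archived` are hypothetical, and the sketch assumes the usual Wayback Machine URL layout of a 14-digit UTC timestamp followed by the original URL.

```python
from datetime import datetime, timezone


def artificial_archive_url(url, now=None):
    """Build a Wayback Machine style URL stamped with a UTC time.

    Hypothetical helper; `now` defaults to the current UTC time.
    """
    now = now or datetime.now(timezone.utc)
    timestamp = now.strftime("%Y%m%d%H%M%S")  # 14-digit timestamp
    return "https://web.archive.org/web/{}/{}".format(timestamp, url)


def looks_archived(status_code):
    """Return True for status codes of the form 20* or 30*."""
    return status_code // 10 in (20, 30)
```

A caller would request the constructed URL and pass the resulting status code to `looks_archived`; the request itself is left out so the sketch stays free of network access.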
Akash Mahanty 0725163af8 minify the logo, remove ugly old logos 2021-01-15 18:14:48 +05:30
Akash Mahanty 712471176b better error messages(str), check latest version before asking for an upgrade and rm alive checking 2021-01-15 16:47:26 +05:30
Akash Mahanty dcd7b03302 getting rid of c style str formatting, now using .format 2021-01-14 19:30:07 +05:30
Akash Mahanty 76205d9cf6 backoff_factor=2 for save, incr success by 25% 2021-01-13 10:13:16 +05:30
Akash Mahanty ec0a0d04cc + dequeued0
dequeued0 (https://github.com/dequeued0) for reporting bugs and useful feature requests.
2021-01-12 10:52:41 +05:30
Akash Mahanty 7bb01df846 v2.4.1 2.4.1 2021-01-12 10:18:09 +05:30
Akash Mahanty 6142e0b353 get should retrieve the last fetched archive by default 2021-01-12 10:07:14 +05:30
Akash Mahanty a65990aee3 don't use pagination API if total pages <= 2 2021-01-12 09:46:07 +05:30
Akash Mahanty 259a024eb1 joke? they changed their robots.txt 2021-01-11 23:17:01 +05:30
Akash Mahanty 91402792e6 + Supported Features
tell what the package can do, many users probably do not read the full usage.
2021-01-11 23:01:18 +05:30