Akash Mahanty
|
31e3eaeab8
|
fix import error
|
2021-01-26 11:53:01 +05:30 |
|
Akash Mahanty
|
acebfabc3e
|
added more docstrings, lint
|
2021-01-26 11:42:29 +05:30 |
|
Akash Mahanty
|
817c0ee844
|
docstrings
|
2021-01-26 02:13:52 +05:30 |
|
Akash Mahanty
|
7e2b51b155
|
improve docstrings
|
2021-01-25 23:47:28 +05:30 |
|
Akash Mahanty
|
36ab6405be
|
more docstrings
|
2021-01-25 23:18:09 +05:30 |
|
Akash Mahanty
|
5e2fac666a
|
added more docstrings
|
2021-01-25 22:50:17 +05:30 |
|
Akash Mahanty
|
0a241818ff
|
renamed some func/meth to better names and added doc strings + lint
|
2021-01-25 21:11:49 +05:30 |
|
Akash Mahanty
|
d1061bfdc8
|
Added some docstrings in utils.py
|
2021-01-25 20:37:59 +05:30 |
|
Akash Mahanty
|
88cda94c0b
|
v2.4.2 (#89)
* v2.4.2
* v2.4.2
2.4.2
|
2021-01-24 17:03:35 +05:30 |
|
Akash Mahanty
|
09290f88d1
|
fix one more error
|
2021-01-24 16:58:53 +05:30 |
|
Akash Mahanty
|
e5835091c9
|
import re
|
2021-01-24 16:56:59 +05:30 |
|
Akash Mahanty
|
7312ed1f4f
|
set cached_save to True if archive older than 3 mins.
|
2021-01-24 16:53:36 +05:30 |
|
Akash Mahanty
|
6ae8f843d3
|
add --file to --known_urls
|
2021-01-24 16:15:11 +05:30 |
|
Akash Mahanty
|
36b936820b
|
known urls now yields, more reliable. Saves the file in chunks with respect to the response. The --file arg can be used to create an output file; if --file is not used, no output is saved to any file. (#88)
|
2021-01-24 16:11:39 +05:30 |
|
Akash Mahanty
|
a3bc6aad2b
|
too much API usage by duplicate tests was causing too many test failures
|
2021-01-23 21:08:21 +05:30 |
|
Akash Mahanty
|
edc2f63d93
|
Output valid JSON, dumps python dict. Make JSON valid.
|
2021-01-23 20:43:52 +05:30 |
|
Akash Mahanty
|
ffe0810b12
|
flag to check whether the saved archive is older than 30 mins
|
2021-01-16 12:06:08 +05:30 |
|
Akash Mahanty
|
40233eb115
|
improve code quality, remove unused imports, use system randomness etc
|
2021-01-16 11:35:13 +05:30 |
|
Akash Mahanty
|
d549d31421
|
improve save method, now we know that 302 errors indicate that the Wayback Machine is archiving the URL and hasn't archived it yet. We construct an artificial archive with the current UTC time and check for HTTP status code 20* or 30*. If we verify the archival, we return the artificial archive. The artificial archive will automatically point to the new archive or, in the best case, will be the new archive after some time.
|
2021-01-16 10:47:43 +05:30 |
|
Akash Mahanty
|
0725163af8
|
minify the logo, remove ugly old logos
|
2021-01-15 18:14:48 +05:30 |
|
Akash Mahanty
|
712471176b
|
better error messages(str), check latest version before asking for an upgrade and rm alive checking
|
2021-01-15 16:47:26 +05:30 |
|
Akash Mahanty
|
dcd7b03302
|
getting rid of c style str formatting, now using .format
|
2021-01-14 19:30:07 +05:30 |
|
Akash Mahanty
|
76205d9cf6
|
backoff_factor=2 for save, incr success by 25%
|
2021-01-13 10:13:16 +05:30 |
|
Akash Mahanty
|
ec0a0d04cc
|
+ dequeued0
dequeued0 (https://github.com/dequeued0) for reporting bugs and useful feature requests.
|
2021-01-12 10:52:41 +05:30 |
|
Akash Mahanty
|
7bb01df846
|
v2.4.1
2.4.1
|
2021-01-12 10:18:09 +05:30 |
|
Akash Mahanty
|
6142e0b353
|
get should retrieve the last fetched archive by default
|
2021-01-12 10:07:14 +05:30 |
|
Akash Mahanty
|
a65990aee3
|
don't use pagination API if total pages <= 2
|
2021-01-12 09:46:07 +05:30 |
|
Akash Mahanty
|
259a024eb1
|
joke? they changed their robots.txt
|
2021-01-11 23:17:01 +05:30 |
|
Akash Mahanty
|
91402792e6
|
+ Supported Features
tell what the package can do, many users probably do not read the full usage.
|
2021-01-11 23:01:18 +05:30 |
|
Akash Mahanty
|
eabf4dc046
|
don't fetch more pages if >=2 pages are empty
|
2021-01-11 22:43:14 +05:30 |
|
Akash Mahanty
|
5a7bd73565
|
support unix ts as an arg in near
|
2021-01-11 19:53:37 +05:30 |
|
Akash Mahanty
|
4693dbf9c1
|
change str repr of cdxsnapshot to cdx line
|
2021-01-11 09:34:37 +05:30 |
|
Akash Mahanty
|
f4f2e51315
|
V2.4.0 (#62)
* v 2.4.0
* v 2.4.0
2.4.0
|
2021-01-10 11:53:45 +05:30 |
|
Akash Mahanty
|
d6b7df6837
|
no need to de-duplicate as we are collapsing the results by urlkey
Same urls aren't received
|
2021-01-10 11:36:46 +05:30 |
|
Akash Mahanty
|
dafba5d0cb
|
collapses=["urlkey"] for known urls
|
2021-01-10 11:34:06 +05:30 |
|
Akash Mahanty
|
6c71dfbe41
|
use cdx matchtype for domain and host
|
2021-01-10 11:10:49 +05:30 |
|
Akash Mahanty
|
a6470b1036
|
not passing dict to cdxsnapshot
|
2021-01-10 10:40:32 +05:30 |
|
Akash Mahanty
|
04cda4558e
|
fix test
|
2021-01-10 03:18:09 +05:30 |
|
Akash Mahanty
|
625ed63482
|
remove assert statements
|
2021-01-10 03:05:48 +05:30 |
|
Akash Mahanty
|
a03813315f
|
full cdx api support
|
2021-01-10 02:23:53 +05:30 |
|
Akash Mahanty
|
a2550f17d7
|
retries support for get requests
|
2021-01-06 01:58:38 +05:30 |
|
Akash Mahanty
|
15ef5816db
|
Always cast url to string, avoid passing waybackpy objects to _get_response
|
2021-01-05 19:46:17 +05:30 |
|
Akash Mahanty
|
93b52bd0fe
|
FIX : don't use self.user_agent if user_agent passed in get()
|
2021-01-05 19:31:27 +05:30 |
|
Akash Mahanty
|
28ff877081
|
Update README.md
|
2021-01-05 19:08:35 +05:30 |
|
Akash Mahanty
|
3e3ecff9df
|
l2 heading and lint
2.3.3
|
2021-01-05 01:59:29 +05:30 |
|
Akash Mahanty
|
ce64135ba8
|
ce
|
2021-01-05 01:52:35 +05:30 |
|
Akash Mahanty
|
2af6580ffb
|
docs link
|
2021-01-05 01:51:53 +05:30 |
|
Akash Mahanty
|
8a3c515176
|
v2.3.3
|
2021-01-05 01:49:26 +05:30 |
|
Akash Mahanty
|
d98c4f32ad
|
v2.3.3
|
2021-01-05 01:48:54 +05:30 |
|
Akash Mahanty
|
e0a4b007d5
|
improve docs
|
2021-01-05 01:46:12 +05:30 |
|