Akash Mahanty
ffe0810b12
flag to check if the saved archive is older than 30 mins or not
2021-01-16 12:06:08 +05:30
Akash Mahanty
40233eb115
improve code quality, remove unused imports, use system randomness etc
2021-01-16 11:35:13 +05:30
Akash Mahanty
d549d31421
improve save method: we now know that a 302 status indicates that the Wayback Machine is still archiving the URL and hasn't archived it yet. We construct an artificial archive with the current UTC time and check for an HTTP status code of 20* or 30*. If we verify the archival, we return the artificial archive. The artificial archive will automatically point to the new archive or, in the best case, will be the new archive after some time.
2021-01-16 10:47:43 +05:30
Akash Mahanty
712471176b
better error messages (str), check latest version before asking for an upgrade, and rm alive checking
2021-01-15 16:47:26 +05:30
Akash Mahanty
dcd7b03302
getting rid of c style str formatting, now using .format
2021-01-14 19:30:07 +05:30
Akash Mahanty
76205d9cf6
backoff_factor=2 for save, incr success by 25%
2021-01-13 10:13:16 +05:30
Akash Mahanty
7bb01df846
v2.4.1
2021-01-12 10:18:09 +05:30
Akash Mahanty
6142e0b353
get should retrieve the last fetched archive by default
2021-01-12 10:07:14 +05:30
Akash Mahanty
a65990aee3
don't use pagination API if total pages <= 2
2021-01-12 09:46:07 +05:30
Akash Mahanty
eabf4dc046
don't fetch more pages if >=2 pages are empty
2021-01-11 22:43:14 +05:30
Akash Mahanty
5a7bd73565
support unix ts as an arg in near
2021-01-11 19:53:37 +05:30
Akash Mahanty
4693dbf9c1
change str repr of cdxsnapshot to cdx line
2021-01-11 09:34:37 +05:30
Akash Mahanty
f4f2e51315
V2.4.0 ( #62 )
* v 2.4.0
* v 2.4.0
2021-01-10 11:53:45 +05:30
Akash Mahanty
d6b7df6837
no need to de-duplicate as we are collapsing the results by urlkey
Same urls aren't received
2021-01-10 11:36:46 +05:30
Akash Mahanty
dafba5d0cb
collapses=["urlkey"] for known urls
2021-01-10 11:34:06 +05:30
Akash Mahanty
6c71dfbe41
use cdx matchtype for domain and host
2021-01-10 11:10:49 +05:30
Akash Mahanty
a6470b1036
not passing dict to cdxsnapshot
2021-01-10 10:40:32 +05:30
Akash Mahanty
625ed63482
remove assert statements
2021-01-10 03:05:48 +05:30
Akash Mahanty
a03813315f
full cdx api support
2021-01-10 02:23:53 +05:30
Akash Mahanty
a2550f17d7
retries support for get requests
2021-01-06 01:58:38 +05:30
Akash Mahanty
15ef5816db
Always cast url to string, avoid passing waybackpy objects to _get_response
2021-01-05 19:46:17 +05:30
Akash Mahanty
93b52bd0fe
FIX : don't use self.user_agent if user_agent passed in get()
2021-01-05 19:31:27 +05:30
Akash Mahanty
d98c4f32ad
v2.3.3
2021-01-05 01:48:54 +05:30
Akash Mahanty
e0a4b007d5
improve docs
2021-01-05 01:46:12 +05:30
Akash Mahanty
1882862992
now using cdx Pagination API
2021-01-04 20:46:54 +05:30
Akash Mahanty
0c6107e675
increase coverage
2021-01-04 01:54:40 +05:30
Akash Mahanty
5dec4927cd
refactoring, try to reduce code complexity
2021-01-04 00:14:38 +05:30
Akash Mahanty
62e5217b9e
reduce code complexity: refactoring, less flow breaking structures
2021-01-03 19:38:25 +05:30
Akash Mahanty
9823c809e9
Added doc strings in wrapper.py, documenting code and improving docs.
2021-01-03 17:11:32 +05:30
Akash Mahanty
db5737a857
JSON is now available for near and other methods that call it
2021-01-02 18:52:46 +05:30
Akash Mahanty
bb4dbc7d3c
rm url = obj.url
2021-01-02 11:19:09 +05:30
Akash Mahanty
7c7fd75376
No need to fetch archive_url and timestamp from availability API on init ( #55 )
* No need to fetch archive_url and timestamp from availability API on init.
Not useful if all I want is to archive a page
* Update test_wrapper.py
* Update wrapper.py
* Update test_wrapper.py
* Update wrapper.py
* Update cli.py
* Update wrapper.py
* Update __version__.py
* Update __version__.py
* Update __version__.py
* Update __version__.py
* Update setup.py
* Update README.md
2021-01-02 11:10:23 +05:30
Akash Mahanty
0b71433667
v2.3.1 ( #54 )
* 2.3.1
* 2.3.1
2021-01-01 19:15:23 +05:30
Akash Mahanty
1b499a7594
removed JSON from init, this was resulting in too much unnecessary traffic. Some users working with thousands of URLs were blocked by IA ( #53 )
closes #52
2021-01-01 16:38:57 +05:30
Akash Mahanty
da390ee8a3
improve maintainability and reduce code cognitive complexity ( #49 )
2020-12-15 10:24:13 +05:30
Akash Mahanty
d3e68d0e70
code formatted with black ( #47 )
2020-12-14 01:18:04 +05:30
Akash Mahanty
93ef60ecd2
v2.3.0 ( #46 )
* v2.3.0
* v2.3.0
* decrease line length
2020-12-14 00:14:54 +05:30
Akash Mahanty
fd163f3d36
Update wrapper.py
2020-12-13 17:12:32 +05:30
Akash Mahanty
a0a918cf0d
.
2020-12-13 17:10:28 +05:30
Akash Mahanty
4943cf6873
remove print stmnt, update ci
2020-12-13 16:37:35 +05:30
Akash Mahanty
bc3efc7d63
now using requests lib as it handles errors nicely ( #42 )
* now using requests lib as it handles errors nicely
* remove unused import (urllib)
* FIX : replaced full_url with endpoint (not using urllib)
* LINT : Found in waybackpy\wrapper.py:88 Unnecessary else after return
2020-12-13 15:44:37 +05:30
Akash Mahanty
f89368f16d
LINT : Found in waybackpy\wrapper.py:88 Unnecessary else after return
2020-12-13 15:39:23 +05:30
Akash Mahanty
c919a6a605
FIX : replaced full_url with endpoint (not using urllib)
2020-12-13 15:22:56 +05:30
Akash Mahanty
60ee8b95a8
now using requests lib as it handles errors nicely
2020-12-13 15:05:57 +05:30
Akash Mahanty
ca51c14332
deleted .travis.yml, link with flake ( #41 )
close #38
2020-11-26 13:06:50 +05:30
Akash Mahanty
58cd9c28e7
Threading enabled checking for URLs
2020-11-26 06:15:42 +05:30
Akash Mahanty
5088305a58
removed python2 compatibility code
2020-11-21 17:00:11 +05:30
Akash Mahanty
925be7b17e
V2.2.0
2020-10-17 17:10:46 +05:30
Akash Mahanty
2b132456ac
updated index.rst and minor docs updated.
2020-10-17 16:56:51 +05:30
Akash Mahanty
0a2f97c034
Update README, drop python 2 support
* Drop python 2 support
* updated docs
* added new docs
2020-10-16 22:37:32 +05:30