Commit Graph

406 Commits

Author SHA1 Message Date
Akash Mahanty
552967487e
Merge pull request #114 from rafaelrdealmeida/patch-1
Update setup.py

See also <https://github.com/akamhy/waybackpy/issues/111#issuecomment-1020673814>
2022-01-25 10:30:34 +05:30
Rafael de Almeida
86a90a3840
Update setup.py
pep8
2022-01-24 22:03:28 -03:00
Rafael de Almeida
759874cdc6
Update setup.py
see: https://github.com/akamhy/waybackpy/issues/111#issuecomment-1020673814
2022-01-24 21:23:31 -03:00
Akash Mahanty
06095202fe BUG FIX : forgot to use the endpoint from the instance and also assign payload to param. Bug caught by the flake8 in the CI tests. 2022-01-24 23:35:48 +05:30
Akash Mahanty
06fc7855bf waybackpy/cdx_api.py : default user agent is now DEFAULT_USER_AGENT, get_response now takes url and headers as arguments and the request url is generated by the full_url function. max_tries added as a parameter for the WaybackMachineCDXServerAPI class with a default value of 3. 2022-01-24 23:20:49 +05:30
Akash Mahanty
c49fe971fd update the older deprecation notice for the Url class; the newer date is now 2025 instead of 2024. 2022-01-24 23:15:59 +05:30
Akash Mahanty
d6783d5525 added tests for cdx_utils.py 2022-01-24 23:05:47 +05:30
Akash Mahanty
9262f5da21 improve functions get_total_pages, get_response and lint check_filters, check_collapses and check_match_type
get_total_pages : default user agent is now DEFAULT_USER_AGENT
                  and now instead of str formatting passing payload
                  as param to full_url to generate the request url
                  also get_response make the request instead of directly
                  using requests.get()

get_response : get_response is now not taking param as keyword arguments
               instead the invoker is supposed to pass the full url which
               may be generated by the full_url function; therefore return_full_url=False
               is deprecated as well.
               Also now closing the session via session.close()
               No need to check 'Exceeded 30 redirects' as save API uses a
               different method.

check_filters : Not assigning the return of match groups to variables
                because we won't be using them and the linter picks up these
                unused assignments.

check_collapses : Same reason as for check_filters but also removed a foolish
                  test that checks equality with objects that are guaranteed
                  to be same.

check_match_type : Updated the text of the WaybackError
2022-01-24 22:57:20 +05:30
Akash Mahanty
d1a1cf2546 added tests for utils.py at tests/test_utils.py; also changed a keyword argument from headers to user_agent for latest_version of utils.py with the usual default value. 2022-01-24 17:50:36 +05:30
Akash Mahanty
cd8a32ed1f added tests for cdx_snapshot.py at tests/test_cdx_snapshot.py 2022-01-24 16:29:44 +05:30
Akash Mahanty
57512c65ff change test oldest method from google.com to example.com; the oldest on google is, for some unknown reason, not very stable. 2022-01-24 16:27:35 +05:30
Akash Mahanty
d9ea26e11c added code style black badge 2022-01-24 13:46:31 +05:30
Akash Mahanty
2bea92b348 fix bug with the third matching case of the archive_url_parser, caught while writing more tests for the save API interface. 2022-01-24 13:31:30 +05:30
Akash Mahanty
d506685f68 added some tests for save_api interface 2022-01-23 18:35:54 +05:30
Akash Mahanty
7844d15d99 close the session in save api interface 2022-01-23 18:34:06 +05:30
Akash Mahanty
c0252edff2 updated tests for availability_api.py and also added max_tries (default value is 3) with delay (sleep) between successive API calls. The delay actually improves the performance of the availability_api interface. 2022-01-23 15:05:10 +05:30
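A minimal sketch of the retry-with-delay idea described in the commit above. It is not the code from availability_api.py; the helper name call_with_retries and its delay parameter are hypothetical, and only the max_tries default of 3 comes from the commit message.

```python
import time


def call_with_retries(func, max_tries=3, delay=0.0):
    """Call func() up to max_tries times, sleeping between failed attempts.

    Hypothetical helper for illustration; not part of waybackpy.
    """
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            return func()
        except Exception as error:
            last_error = error
            # Sleep only when another attempt will follow.
            if attempt < max_tries:
                time.sleep(delay)
    # Every attempt failed, so surface the most recent error.
    raise last_error
```

A caller would wrap the API request in a function and pass it in, for example call_with_retries(fetch, max_tries=3, delay=1.0), where fetch is the caller's own function.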
Akash Mahanty
e7488f3a3e added test badge, rename test to Tests from ubuntu and fix the Incomplete URL substring sanitization(or trying to) 2022-01-23 02:26:53 +05:30
Akash Mahanty
aed75ad1db Make modules importable as part of a Python package, waybackpy, by creating __init__.py file in tests 2022-01-23 02:14:38 +05:30
Akash Mahanty
d740959c34 more dev reqs 2022-01-23 02:10:12 +05:30
Akash Mahanty
2d83043ef7 + flake8 in requirements-dev.txt 2022-01-23 02:05:08 +05:30
Akash Mahanty
31b1056217 fix typo in CI 2022-01-23 02:03:30 +05:30
Akash Mahanty
97712b2c1e add CI unit_test.yml 2022-01-23 02:00:15 +05:30
Akash Mahanty
a8acc4c4d8 Fix Incomplete URL substring sanitization in the last commit. 2022-01-23 01:42:48 +05:30
Akash Mahanty
1bacd73002 created pytest.ini, added test for waybackpy/availability_api.py, new exceptions all of which inherit from the main WaybackError and created requirements-dev.txt 2022-01-23 01:29:07 +05:30
Akash Mahanty
79901ba968 updated README.md 2022-01-22 03:08:26 +05:30
Akash Mahanty
df64e839d7 added trove classifiers for python 3.10 2022-01-22 00:57:10 +05:30
Akash Mahanty
405e9a2a79 waybackpy/save_api.py : Added doc strings and also lint with black. 2022-01-22 00:41:10 +05:30
Akash Mahanty
db551abbf6 lint waybackpy/cdx_api.py and added some doc strings 2022-01-22 00:11:35 +05:30
Akash Mahanty
d13dd4db1a added notice on waybackpy/wrapper.py that the Url class will cease to exist after 2024-01-01 and also removed unused imports. 2022-01-21 23:14:20 +05:30
Akash Mahanty
d3bb8337a1 make setup.py smarter, now no need to update the URL again and also added more keywords. And in __version__.py updated the __author__ 2022-01-21 23:01:09 +05:30
Akash Mahanty
fd5e85420c waybackpy/availability_api.py : removed unused imports, added doc strings, removed redundant function. 2022-01-21 22:47:44 +05:30
Akash Mahanty
5c685ef5d7
upload logo and make p path not text
I was dumb to forget to convert the p to path.
2022-01-21 21:11:42 +05:30
Akash Mahanty
6a3d96b453
Logo (#113)
* Create logo.txt

* Delete waybackpy_logo.svg

* Add files via upload

* Delete logo.txt
2022-01-21 21:02:38 +05:30
Akash Mahanty
afe1b15a5f
Add files via upload 2022-01-21 20:58:53 +05:30
Akash Mahanty
4fd9d142e7
Merge pull request #112 from akamhy/fix
escape '.' before 'archive.org'
2022-01-21 19:52:55 +05:30
Akash Mahanty
5e9fdb40ce
escape '.' before 'archive.org'
escape '.' before 'archive.org' on line 88 so it does not match more hosts than expected.
2022-01-21 19:51:08 +05:30
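A short illustration of why the unescaped '.' matters. The two patterns below are hypothetical stand-ins, not the regex from line 88 of the project; they differ only in whether the dot before 'archive' is escaped.

```python
import re

# Hypothetical patterns for illustration only.
# Unescaped '.' before "archive" matches ANY single character.
UNESCAPED = re.compile(r"https?://web.archive\.org/web/\d{14}/")
# Escaped '\.' matches only a literal dot.
ESCAPED = re.compile(r"https?://web\.archive\.org/web/\d{14}/")

genuine = "https://web.archive.org/web/20220121195108/https://example.com"
lookalike = "https://webXarchive.org/web/20220121195108/https://example.com"

# The unescaped pattern accepts the lookalike host as well.
unescaped_accepts_lookalike = UNESCAPED.match(lookalike) is not None
# The escaped pattern accepts only the genuine host.
escaped_accepts_lookalike = ESCAPED.match(lookalike) is not None
escaped_accepts_genuine = ESCAPED.match(genuine) is not None
```

With the dot escaped, a host such as webXarchive.org no longer passes the check, which is the "more hosts than expected" problem the commit message refers to.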
Akash Mahanty
fa72098270
_get_response is not used anymore
- datashaman (<https://stackoverflow.com/users/401467/datashaman>) for <https://stackoverflow.com/a/35504626>. _get_response is based on this amazing answer.
2022-01-21 19:43:35 +05:30
Akash Mahanty
d18f955044
date year range 2020-2022 2022-01-21 11:55:42 +05:30
Akash Mahanty
9c340d6967
Create codeql-analysis.yml 2022-01-21 11:12:59 +05:30
Akash Mahanty
78d0e0c126
Update README.md 2022-01-21 09:54:04 +05:30
Akash Mahanty
564101e6f5
🐳 for docker image 2022-01-21 01:23:05 +05:30
Akash Mahanty
de5a3e1561
improve usage code 2022-01-18 21:18:17 +05:30
Akash Mahanty
52e46fecc2
more usage example 2022-01-18 20:58:39 +05:30
Akash Mahanty
3b6415abc7
updating examples 2022-01-18 20:44:47 +05:30
Akash Mahanty
66e16d6d89 define __repr__ for the Availability API class 2022-01-18 20:34:21 +05:30
Akash Mahanty
16b9bdd7f9 output the file name if known_url and file flag are passed. 2022-01-18 20:14:44 +05:30
Akash Mahanty
7adc01bff2 implement known_urls for cli from the newer interface. Although use of CDX is recommended, backward compatibility matters. 2022-01-18 20:07:12 +05:30
Akash Mahanty
9bbd056268
Update README.md 2022-01-17 02:15:38 +05:30
Akash Mahanty
2ab44391cf
close #107, added link to SecSI/Docker image 2022-01-16 23:01:31 +05:30
Akash Mahanty
cc3628ae18 define __str__ for objects of WaybackMachineAvailabilityAPI class, the check for self.JSON ensures that the API was at least called. 2022-01-16 22:28:12 +05:30