Commit Graph

73 Commits

Author SHA1 Message Date
Akash Mahanty
15ef5816db Always cast url to string, avoid passing waybackpy objects to _get_response 2021-01-05 19:46:17 +05:30
Akash Mahanty
93b52bd0fe FIX : don't use self.user_agent if user_agent passed in get() 2021-01-05 19:31:27 +05:30
Akash Mahanty
e0a4b007d5 improve docs 2021-01-05 01:46:12 +05:30
Akash Mahanty
1882862992 now using cdx Pagination API 2021-01-04 20:46:54 +05:30
Akash Mahanty
0c6107e675 increase coverage 2021-01-04 01:54:40 +05:30
Akash Mahanty
5dec4927cd refactoring, try to reduce code complexity 2021-01-04 00:14:38 +05:30
Akash Mahanty
9823c809e9 Added doc strings in wrapper.py, documenting code and improving docs. 2021-01-03 17:11:32 +05:30
Akash Mahanty
db5737a857 JSON is now available for near and other methods that call it 2021-01-02 18:52:46 +05:30
Akash Mahanty
7c7fd75376 No need to fetch archive_url and timestamp from availability API on init (#55)
* No need to fetch archive_url and timestamp from availability API on init. 

Not useful if all I want is to archive a page

* Update test_wrapper.py

* Update wrapper.py

* Update test_wrapper.py

* Update wrapper.py

* Update cli.py

* Update wrapper.py

* Update __version__.py

* Update __version__.py

* Update __version__.py

* Update __version__.py

* Update setup.py

* Update README.md
2021-01-02 11:10:23 +05:30
Akash Mahanty
1b499a7594 removed JSON from init, this was resulting in too much unnecessary traffic. Some users handling thousands of URLs were blocked by IA (#53)
closes #52
2021-01-01 16:38:57 +05:30
Akash Mahanty
d3e68d0e70 code formatted with black (#47) 2020-12-14 01:18:04 +05:30
Akash Mahanty
fd163f3d36 Update wrapper.py 2020-12-13 17:12:32 +05:30
Akash Mahanty
a0a918cf0d . 2020-12-13 17:10:28 +05:30
Akash Mahanty
4943cf6873 remove print stmnt, update ci 2020-12-13 16:37:35 +05:30
Akash Mahanty
bc3efc7d63 now using requests lib as it handles errors nicely (#42)
* now using requests lib as it handles errors nicely

* remove unused import (urllib)

* FIX : replaced full_url with endpoint (not using urllib)

* LINT :  Found in waybackpy\wrapper.py:88  Unnecessary else after return
2020-12-13 15:44:37 +05:30
Akash Mahanty
f89368f16d LINT : Found in waybackpy\wrapper.py:88 Unnecessary else after return 2020-12-13 15:39:23 +05:30
Akash Mahanty
c919a6a605 FIX : replaced full_url with endpoint (not using urllib) 2020-12-13 15:22:56 +05:30
Akash Mahanty
60ee8b95a8 now using requests lib as it handles errors nicely 2020-12-13 15:05:57 +05:30
Akash Mahanty
ca51c14332 deleted .travis.yml, link with flake (#41)
close #38
2020-11-26 13:06:50 +05:30
Akash Mahanty
58cd9c28e7 Threading enabled checking for URLs 2020-11-26 06:15:42 +05:30
Akash Mahanty
5088305a58 removed python2 compatibility code 2020-11-21 17:00:11 +05:30
Akash Mahanty
7f927ec7be added tests for json and archive_url, updated broken tests (#34)
* added tests for json and archive_url, updated broken tests

* drop 2.7 support
2020-10-16 19:25:45 +05:30
danvalen1
91e7f65617 Fixing len() bug (#32)
* added class functionality

* Update wrapper.py

* style edits

* fixed bug with len() of url()

* fixing len() bug

* fixing len() bug

* squashing bug

* removed test notebook
2020-10-16 10:04:13 +05:30
danvalen1
d465454019 Adding attributes to Url class (#28)
* added class functionality

* Update wrapper.py

* style edits
2020-10-15 22:10:32 +05:30
Akash Mahanty
1a81eb97fb lint 2020-10-03 16:58:11 +05:30
Akash Mahanty
ce7294d990 Implemented new feature, known urls for domain. 2020-10-02 20:27:28 +05:30
Akash
ca9186c301 update message, sometimes raised for poor performance by wayback machine even if the url is archived. 2020-08-09 10:43:16 +05:30
Akash
8a4b631c13 new regex to parse archive, IA changed the header again :( 2020-08-09 10:36:25 +05:30
Akash
56116551ac Coverage improvements (#22)
* Update cli.py

* improved tests

* changes for proper testing

* Type check using isinstance

* Replace elifs with if when used after return

* twitter.com --> www.ibm.com

* fix typo

* test archive url parser and dunders

* Update test_wrapper.py
2020-07-24 15:31:21 +05:30
Akash
ed24184b99 Remove duplicate get response method 2020-07-24 00:57:22 +05:30
Akash
dee9105794 command_line support (#18)
* Update wrapper.py

* entry points cli

* Suppress the urllib2/3 Exception

* rm cli code, will create a new cli.py file

* Create cli.py

* update cli entry pts

* Update cli.py

* Update cli.py

* import print_function

* Update cli.py

* Update cli.py

* Delete pypi_uploader.sh

* resolve conflicts with the master

* update the test ; resolve the conflicts

* decrease code complexity

* cli method changed to main

* get is not for just local usage

* get method should be available from interface

* get is used in the interface

* Update cli.py
2020-07-22 16:40:13 +05:30
Akash
b3a7e714a5 Update wrapper.py 2020-07-22 10:57:43 +05:30
Akash
cd9841713c Update wrapper.py 2020-07-22 10:52:43 +05:30
AntiCompositeNumber
1ea9548d46 Raise WaybackError from URLError and include URL (#19)
* Raise WaybackError from URLError and include URL

* python2 compatibility

Co-authored-by: Akash <64683866+akamhy@users.noreply.github.com>
2020-07-22 10:51:44 +05:30
AntiCompositeNumber
be7642c837 Code style improvements (#20)
* Add sane line length to setup.cfg

* Use Black for quick readability improvements

* Clean up exceptions, docstrings, and comments

Docstrings on dunder functions are redundant and typically ignored
Limit to reasonable line length
General grammar and style corrections
Clarify docstrings and exceptions
Format docstrings per PEP 257 -- Docstring Conventions

* Move archive_url_parser out of Url.save()

It's generally poor form to define a function in a function, as it will
be re-defined each time the function is run.

archive_url_parser does not depend on anything in Url, so it makes sense
to move it out of the class.

* move wayback_timestamp out of class, mark private functions

* DRY in _wayback_timestamp

* Url._url_check should return None

There's no point in returning True if it's never checked and won't ever
be False.
Implicitly returning None or raising an exception is more idiomatic.

* Default parameters should be type-consistent with expected values

* Specify parameters to near

* Use datetime.datetime in _wayback_timestamp

* cleanup __init__.py

* Cleanup formatting in tests

* Fix names in tests

* Revert "Use datetime.datetime in _wayback_timestamp"

This reverts commit 5b30380865.

Introduced unnecessary complexity

* Move _get_response outside of Url

Because Codacy reminded me that I missed it.

* fix imports in tests
2020-07-22 10:09:14 +05:30
Akash
8fd4462025 Update wrapper.py 2020-07-20 20:17:18 +05:30
Akash
f3bb9a8540 Update wrapper.py 2020-07-20 10:11:36 +05:30
Akash
bb94e0d1c5 Update index.rst and remove dupes 2020-07-20 10:07:31 +05:30
Akash
83c962166d Raise 2020-07-19 23:02:04 +05:30
Akash
8ab116f276 API changed again. updated
* Update wrapper.py

* Update wrapper.py

* Update wrapper.py

* Update wrapper.py

* Update wrapper.py

* api changed; fix archive url parser

* Update wrapper.py

* - Trailing whitespace

* include the header in exception
2020-07-19 20:39:07 +05:30
Akash
58d2d585c8 No timeout for final try 2020-07-18 18:29:41 +05:30
Akash
0ad27f5ecc update readme for newer oop and some test changes (#12)
* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* docstrings

* user agent ; more variants

* description update

* Update __init__.py

* # -*- coding: utf-8 -*-

* Update test_1.py

* update docs for get()

* Update README.md
2020-07-18 16:22:09 +05:30
Akash
f2112c73f6 Python 2 support 2020-07-17 21:08:32 +05:30
Akash
9860527d96 OOP (#10)
* Update wrapper.py

* Update exceptions.py

* Update __init__.py

* test adjusted for new changes

* Update wrapper.py
2020-07-17 20:50:00 +05:30
Akash
f881705d00 detect python version with sys.version_info (#9) 2020-06-26 15:48:01 +05:30
akamhy
42ac399362 Most efficient method to count (yet) 2020-05-08 09:47:13 +05:30
akamhy
e9d010c793 just count the status code, consumes less memory 2020-05-08 09:28:18 +05:30
akamhy
0c4f119981 Update wrapper.py 2020-05-07 17:25:34 +05:30
akamhy
afded51a04 Update wrapper.py 2020-05-07 17:20:23 +05:30
akamhy
b950616561 Update wrapper.py 2020-05-07 17:17:17 +05:30