Compare commits

...

77 Commits

Author SHA1 Message Date
3b3e78d901 Before and After methods (#175)
* Added before and after functions

* add tests

* formatting
2022-11-17 07:58:46 +05:30
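PR #175's before and after methods are only named in the commit message above; as an illustration of the idea, here is a pure-Python sketch that picks the snapshot timestamp immediately before or after a reference timestamp (the function names and exact behavior are assumptions, not waybackpy's actual API):

```python
# Hypothetical sketch of the before/after idea from #175: given archive
# timestamps in the 14-digit Wayback format, pick the snapshot immediately
# before or after a reference timestamp. Names are illustrative only.
from bisect import bisect_left, bisect_right

def snapshot_before(timestamps, ts):
    """Return the latest timestamp strictly earlier than ts, or None."""
    stamps = sorted(timestamps)
    i = bisect_left(stamps, ts)
    return stamps[i - 1] if i > 0 else None

def snapshot_after(timestamps, ts):
    """Return the earliest timestamp strictly later than ts, or None."""
    stamps = sorted(timestamps)
    i = bisect_right(stamps, ts)
    return stamps[i] if i < len(stamps) else None
```

Because the 14-digit timestamps are zero-padded, lexicographic comparison matches chronological order, so `bisect` works directly on the strings.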
0202efd39d Add Python 3.11 to setup.cfg classifiers list (#179) 2022-11-17 07:56:19 +05:30
25c0adacb0 create CONTRIBUTING.md 2022-03-29 11:30:43 +05:30
5bd16a42e7 lint 2022-03-29 10:42:57 +05:30
57f4be53d5 ignore line 474 because 'error: <nothing> not callable'. 2022-03-29 10:24:55 +05:30
64a4ce88af Minor copyediting and also deleted CONTRIBUTORS.md moved content to README.md 2022-03-29 03:39:50 +05:30
5407681c34 v3.0.6 (#170)
* remove the license section from readme

This does not mean that I'm waiving the copyright; it is just formatting the README

* remove useless external links from the README lead

and also add a line about how recent the newest method's results are for the Availability API versus the CDX server API.

* incr version to 3.0.6 and change date to today's date, that is 15th of March, 2022.

* update secsi and DI section

* v3.0.5 --> v3.0.6
2022-03-15 20:33:51 +05:30
cfd977135d Update CITATION.cff (#169) 2022-03-04 11:48:49 +05:30
7a5e0bfdaf fix: cff format (#168) 2022-03-04 03:10:27 +05:30
48dcda8020 add: typed marker (PEP561) (#167) 2022-03-03 19:05:43 +05:30
3ed2170a32 add: CITATION.cff (#166) 2022-03-03 19:05:23 +05:30
d6ef55020c undo drop python3.6, see #162 (#163) 2022-02-18 21:54:33 +05:30
2650943f9d v3.0.4 (#160)
* Update README.md

* Update README.md

* update asciinema link

* v3.0.4

* update video link
2022-02-18 16:05:58 +05:30
4b218d35cb Cdx based oldest newest and near (#159)
* implement oldest, newest and near methods in the cdx interface class; now the cli uses the cdx methods instead of the availability api methods.

* handle the closest parameter derivative methods more efficiently and also handle exceptions gracefully.

* update test code
2022-02-18 13:17:40 +05:30
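The near method described in the commit above has to map a requested moment to the closest available snapshot. A minimal sketch of that selection logic over 14-digit Wayback timestamps (illustrative only; this is not waybackpy's internal implementation):

```python
from datetime import datetime

def closest_snapshot(timestamps, year, month=1, day=1, hour=0, minute=0):
    """Pick the 14-digit Wayback timestamp closest to the requested moment."""
    target = datetime(year, month, day, hour, minute)

    def distance(ts):
        # Parse YYYYMMDDhhmmss and measure the absolute time difference.
        return abs(datetime.strptime(ts, "%Y%m%d%H%M%S") - target)

    return min(timestamps, key=distance)
```

The real CDX server performs this lookup server-side via the `closest` query parameter; the sketch only shows the selection principle.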
f990b93f8a Add sort, use_pagination and closest (#158)
* add sort param support in CDX API class

see https://nla.github.io/outbackcdx/api.html#operation/query

sort takes string input which must be one of the following:
- default
- closest
- reverse

This commit shall help in closing issue at https://github.com/akamhy/waybackpy/issues/155

* add BlockedSiteError for cases when archiving is blocked by site's robots.txt

* create check_for_blocked_site for handling the BlockedSiteError for sites that are blocking wayback machine by their robots.txt policy

* add attrs use_pagination and closest, which can be used to use the pagination API and to look up an archive close to a timestamp, respectively. And now, to get out of the infinite blank-pages loop while using the CDX server API, check for two successive blank pages instead of two blank pages in total.

* added cli support for sort, use-pagination and closest

* added tests

* fix codeql warnings, nothing to worry about here.

* fix save test for archive_url
2022-02-18 00:24:14 +05:30
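The blank-pages fix in the commit above changes the stop condition from two blank pages in total to two successive blank pages. A toy sketch of such a loop (assumed logic with a hypothetical collect_pages helper, not the actual waybackpy code):

```python
def collect_pages(pages):
    """Consume an iterable of page payloads, stopping only after two
    successive blank pages, as described in the commit message."""
    results = []
    consecutive_blanks = 0
    for page in pages:
        if not page.strip():
            consecutive_blanks += 1
            if consecutive_blanks >= 2:
                break  # two blank pages in a row: assume no more data
        else:
            consecutive_blanks = 0  # a non-blank page resets the counter
            results.append(page)
    return results
```

Counting successive rather than total blanks matters because the pagination API can interleave an occasional empty page with pages that still contain data.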
3a44a710d3 add sort param support in CDX API class (#156)
see https://nla.github.io/outbackcdx/api.html#operation/query

sort takes string input which must be one of the following:
- default
- closest
- reverse

This commit shall help in closing issue at https://github.com/akamhy/waybackpy/issues/155
2022-02-17 12:17:23 +05:30
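Per the commit message above, sort accepts exactly one of default, closest, and reverse. A small sketch of validating that constraint (the check_sort name is illustrative, not waybackpy's API):

```python
# Accepted values are taken from the commit message above.
VALID_SORTS = ("default", "closest", "reverse")

def check_sort(sort):
    """Reject any sort value the CDX query would not accept."""
    if sort not in VALID_SORTS:
        raise ValueError(f"sort must be one of {VALID_SORTS}, got {sort!r}")
    return sort
```

Failing fast like this surfaces a bad parameter before any request is made to the CDX server.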
f63c6adf79 Trigger Build 2022-02-09 17:29:19 +05:30
b4d3393ef1 fix: move metadata from __init__.py into setup.cfg (#153) 2022-02-09 17:20:23 +05:30
cd5c3c61a5 fix imports with isort 2022-02-09 16:18:25 +05:30
87fb5ecd58 remove latest version funcs from utils, they were unused. 2022-02-09 16:12:30 +05:30
5954fcc646 format with black 2022-02-09 15:51:11 +05:30
89016d433c added trove Typing :: Typed and Development Status :: 5 - Production/Stable 2022-02-09 15:47:38 +05:30
edaa1d5d54 update value to the new limit. 2022-02-09 15:40:38 +05:30
16f94db144 incr version to v3.0.3 2022-02-09 14:33:16 +05:30
25eb709ade improve doc strings and comments and remove useless exceptions. 2022-02-09 14:32:15 +05:30
6d233f24fc apply isort 2022-02-09 11:20:59 +05:30
ec341fa8b3 refactor code in cli module 2022-02-09 11:20:10 +05:30
cf18090f90 fix typo 2022-02-09 09:52:20 +05:30
81162eebd0 issues with HN 2022-02-08 21:28:25 +05:30
ca4f79a2e3 + jfinkhaeuser and rafael (#150) 2022-02-08 20:34:34 +05:30
27f2727049 add cli alias for --start-timestamp(--from) and --end-timestamp(--to) to conform with the CDX API docs. 2022-02-08 20:12:19 +05:30
118dc6c523 add test for wrapper module 2022-02-08 20:08:44 +05:30
1216ffbc70 lint and refactor cli module 2022-02-08 20:06:17 +05:30
d58a5f0ee5 explicitly exclude some dirs from flake8 check 2022-02-08 18:59:13 +05:30
7e7412d9d1 remove deepsource, LGTM is better and has fewer False Positives. 2022-02-08 18:49:44 +05:30
70c38c5a60 + codecov badge 2022-02-08 17:49:05 +05:30
f8bf9c16f9 Add tests (#149)
* enable codecov

* fix save_urls_on_file

* increase the limit of CDX to 25000 from 5000. 5X increase.

* added test for the CLI module

* make flake 8 happy

* make mypy happy
2022-02-08 17:46:59 +05:30
2bbfee7b2f replace non-ASCII emojis with GitHub hosted equivalent images (#148) 2022-02-08 11:43:32 +05:30
7317bd7183 Remove blank lines after docstring (#146)
Co-authored-by: deepsource-autofix[bot] <62050782+deepsource-autofix[bot]@users.noreply.github.com>
2022-02-08 10:14:20 +05:30
e0dfbe0b7d Fix comparison constant position (#145)
* Fix comparison constant position

* format with black

Co-authored-by: deepsource-autofix[bot] <62050782+deepsource-autofix[bot]@users.noreply.github.com>
Co-authored-by: Akash Mahanty <akamhy@yahoo.com>
2022-02-08 10:06:23 +05:30
0b631592ea Improve pylint score (#142)
* fix: errors to improve pylint scores

* fix: test

* fix

* add: flake ignore rule to pip8speaks conf

* fix

* add: test patterns to deepsource conf
2022-02-08 06:42:20 +09:00
d3a8f343f8 + [eggplants](https://github.com/eggplants) (#143) 2022-02-08 01:41:10 +05:30
97f8b96411 added docstrings, added some static type hints and also lint. (#141)
* added docstrings, added some static type hints and also lint.

* added doc strings and changed some internal variable names for more clarity.

* make flake8 happy

* add descriptive docstrings and type hints in waybackpy/cdx_snapshot.py

* remove useless code and add docstrings and also lint using pylint.

* remove unwarranted test

* added docstrings, lint using pylint and add a raise on 509 SC

* added docstrings and lint with pylint

* lint

* add doc strings and lint

* add docstrings and lint
2022-02-07 19:40:37 +05:30
004ff26196 Add .deepsource.toml 2022-02-07 12:55:57 +00:00
a772c22431 explicitly tell pep8speaks that max-line-length is 88. 2022-02-06 21:00:15 +05:30
b79f1c471e Merge pull request #135 from eggplants/fix_cli
Fix cli.py
2022-02-05 16:54:36 +05:30
f49d67a411 Merge pull request #136 from eggplants/429_error
Add TooManyRequestsError
2022-02-05 11:28:27 +05:30
ad8bd25633 added badge of codacy (#139) 2022-02-05 10:05:17 +05:30
d2a3946425 fix: escape banner 2022-02-05 10:12:27 +09:00
7b6401d59b fix: delete useless conds 2022-02-05 06:20:03 +09:00
ed6160c54f add: TooManyRequestsError 2022-02-05 06:19:02 +09:00
fcab19a40a fix: cli
print error message to stderr and specify defaults of url
2022-02-05 05:55:04 +09:00
5f3cd28046 Fix Pylint errors were pointed out by codacy (#133)
* fix: pylint errors were pointed out by codacy

* fix: line length

* fix: help text

* fix: revert

https://stackoverflow.com/a/64477857 makes cli unusable

* fix: cli error and refactor codes
2022-02-05 05:25:40 +09:00
9d9cc3328b add .pep8speaks.yml, override default 2022-02-05 00:53:38 +05:30
b69e4dff37 rename params of main in cli.py to avoid using built-ins (#132)
* rename params of main in cli.py to avoid using built-ins

* Fix Line 32:80: E501 line too long (102 > 79 characters)
2022-02-05 00:30:35 +05:30
d8cabdfdb5 Typing (#128)
* fix: CI yml name

* add: mypy configuraion

* add: type annotation to waybackpy modules

* add: type annotation to test modules

* fix: mypy command

* add: types-requests to dev deps

* fix: disable max-line-length

* fix: move pytest.ini into setup.cfg

* add: urllib3 to deps

* fix: Retry (ref: https://github.com/python/typeshed/issues/6893)

* fix: f-string

* fix: shorten long lines

* add: staticmethod decorator to no-self-use methods

* fix: str(headers)->headers_str

* fix: error message

* fix: revert "str(headers)->headers_str" and ignore assignment CaseInsensitiveDict with str

* fix: mypy error
2022-02-05 03:23:36 +09:00
320ef30371 fix: format md and yml (#129) 2022-02-04 22:31:46 +05:30
e61447effd Format and lint codes and fix packaging (#125)
* add: configure files (setup.py->setup.py+setup.cfg+pyproject.toml)

* add: __download_url__

* format with black and isort

* fix: flake8 section in setup.cfg

* add: E501 to flake ignore

* fix: metadata.name does not accept attr

* fix: merge __version__.py into __init__.py

* fix: flake8 errors in tests/

* fix: datetime.datetime -> datetime

* fix: banner

* fix: ignore W605 for banner

* fix: way to install deps in CI

* add: versem to setuptools

* fix: drop python<=3.6 (#126) from package and CI
2022-02-03 19:13:39 +05:30
947647f2e7 Merge pull request #124 from eggplants/fix_save_retry
Fix save retry mechanism
2022-02-03 18:01:51 +05:30
bc1dc4dc96 fix: save retry mechanism 2022-02-03 19:45:16 +09:00
5cbdfc040b waybackpy/cli.py : remove duplicate original_string from output_string in cdx 2022-01-30 21:02:25 +05:30
3be6ac01fc created tests/test_cdx_api.py: added tests for cdx_api.py 2022-01-30 20:03:40 +05:30
b8b9bc098f tests/test_utils.py: test latest_version_pypi and latest_version_github of waybackpy.utils 2022-01-30 20:02:17 +05:30
946c28eddf waybackpy/cli.py: Added help text, fix bug in the cdx_print parameter and lots of other stuff
parameter --filters is now --filter

parameter --collapses is now --collapse

added a new --license flag for fetching the license from GitHub repo and printing it.
2022-01-30 20:00:50 +05:30
004027f73b waybackpy/utils.py : Add a new function (latest_version_github) to fetch the latest release from the GitHub API, and rename latest_version to latest_version_pypi now that we have two functions to get the latest release. 2022-01-30 13:28:13 +05:30
e86dd93b29 Delete custom.md 2022-01-30 11:45:51 +05:30
988568e8f0 Update issue templates 2022-01-30 11:44:30 +05:30
f4c32a44fd Merge pull request #123 from akamhy/add-code-of-conduct-1
Create CODE_OF_CONDUCT.md
2022-01-30 11:39:22 +05:30
7755e6391c Create CODE_OF_CONDUCT.md 2022-01-30 11:39:11 +05:30
9dbe3b3bf4 In waybackpy/wrapper.py set self.timestamp to None on init.
In the older interface (2.x.x) we had timestamp set to None in the constructor, so it is best to set it to None in the backwards-compatibility module.
2022-01-29 22:12:02 +05:30
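The backwards-compatibility point above can be sketched as follows; the Url class here is a hypothetical stand-in for the 2.x-style wrapper, not the actual waybackpy/wrapper.py code:

```python
# Illustrative sketch: defaulting timestamp to None in the constructor
# means code that probes the attribute before any API call sees None
# instead of raising AttributeError, matching the 2.x behavior.
class Url:  # hypothetical stand-in for the wrapper module's class
    def __init__(self, url, user_agent="waybackpy"):
        self.url = url
        self.user_agent = user_agent
        self.timestamp = None  # populated only after an API call
```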
e84ba9f2c3 Merge pull request #122 from akamhy/update-readme
add conda install and related links and tell users that they can copy…
2022-01-27 00:25:49 +05:30
1250d105b4 update install command for conda and replace the link to conda-forge.org with https://anaconda.org/conda-forge/waybackpy 2022-01-27 00:17:36 +05:30
f03b2cb6cb fix formatting of ASCII art 2022-01-26 18:24:24 +05:30
5e0ea023e6 update CLI help text 2022-01-26 16:23:24 +05:30
8dff6f349e add maintainers 2022-01-26 15:45:03 +05:30
e04cfdfeaf add conda install and related links and tell users that they can copy text from asciinema.org 2022-01-26 15:40:33 +05:30
0e2cc8f5ba + asciicast https://asciinema.org/a/464367
[![asciicast](https://asciinema.org/a/464367.svg)](https://asciinema.org/a/464367)
2022-01-26 14:51:06 +05:30
39 changed files with 2406 additions and 924 deletions

.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file

@@ -0,0 +1,34 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: bug
assignees: akamhy
---
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Version:**
- OS: [e.g. iOS]
- Version [e.g. 22]
- Is latest version? [e.g. Yes/No]
**Additional context**
Add any other context about the problem here.


@@ -0,0 +1,19 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: enhancement
assignees: akamhy
---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the solution you'd like**
A clear and concise description of what you want to happen.
**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.
**Additional context**
Add any other context or screenshots about the feature request here.


@@ -14,7 +14,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.6', '3.10']
python-version: ['3.7', '3.10']
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
@@ -24,7 +24,7 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install setuptools wheel
pip install -U setuptools wheel
- name: Build test the package
run: |
python setup.py sdist bdist_wheel

.github/workflows/unit-test.yml vendored Normal file

@@ -0,0 +1,43 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
name: Tests
on:
push:
branches: [ master ]
pull_request:
branches: [ master ]
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.10']
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install '.[dev]'
- name: Lint with flake8
run: |
flake8 . --count --show-source --statistics
- name: Lint with black
run: |
black . --check --diff
- name: Static type test with mypy
run: |
mypy -p waybackpy -p tests
- name: Test with pytest
run: |
pytest
- name: Upload coverage to Codecov
run: |
bash <(curl -s https://codecov.io/bash) -t ${{ secrets.CODECOV_TOKEN }}


@@ -1,44 +0,0 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
name: Tests
on:
push:
branches: [ master ]
pull_request:
branches: [ master ]
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.9']
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
if [ -f requirements-dev.txt ]; then pip install -r requirements-dev.txt; fi
- name: Lint with flake8
run: |
# stop the build if there are Python syntax errors or undefined names
flake8 waybackpy/ --count --select=E9,F63,F7,F82 --show-source --statistics
# exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
# flake8 waybackpy/ --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics --per-file-ignores="waybackpy/__init__.py:F401"
# - name: Static type test with mypy
# run: |
# mypy
- name: Test with pytest
run: |
pytest
# - name: Upload coverage to Codecov
# run: |
# bash <(curl -s https://codecov.io/bash) -t ${{ secrets.CODECOV_TOKEN }}

.pep8speaks.yml Normal file

@@ -0,0 +1,7 @@
scanner:
diff_only: True
linter: flake8
flake8:
max-line-length: 88
extend-ignore: W503,W605


@@ -9,4 +9,4 @@
"issueSettings": {
"minSeverityLevel": "LOW"
}
}
}

CITATION.cff Normal file

@@ -0,0 +1,25 @@
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: waybackpy
abstract: "Python package that interfaces with the Internet Archive's Wayback Machine APIs. Archive pages and retrieve archived pages easily."
version: '3.0.6'
doi: 10.5281/ZENODO.3977276
date-released: 2022-03-15
type: software
authors:
- given-names: Akash
family-names: Mahanty
email: akamhy@yahoo.com
orcid: https://orcid.org/0000-0003-2482-8227
keywords:
- Archive Website
- Wayback Machine
- Internet Archive
- Wayback Machine CLI
- Wayback Machine Python
- Internet Archiving
- Availability API
- CDX API
- savepagenow
license: MIT
repository-code: "https://github.com/akamhy/waybackpy"

CODE_OF_CONDUCT.md Normal file

@@ -0,0 +1,128 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
akamhy@yahoo.com.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series
of actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or
permanent ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within
the community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
<https://www.contributor-covenant.org/version/2/0/code_of_conduct.html>.
Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see the FAQ at
<https://www.contributor-covenant.org/faq>. Translations are available at
<https://www.contributor-covenant.org/translations>.

CONTRIBUTING.md Normal file

@@ -0,0 +1,54 @@
# Welcome to waybackpy contributing guide
## Getting started
Read our [Code of Conduct](./CODE_OF_CONDUCT.md).
## Creating an issue
It's a good idea to open an issue and discuss suspected bugs and new feature ideas with the maintainers. Somebody might already be working on your bug/idea, and discussing it first avoids wasting your time. This is only a recommendation; you may skip creating an issue and directly open a pull request.
## Fork this repository
Fork this repository. See '[Fork a repo](https://docs.github.com/en/get-started/quickstart/fork-a-repo)' for help forking this repository on GitHub.
## Make changes to the forked copy
Make the required changes to your forked copy of waybackpy; please don't forget to add or update comments and docstrings.
## Add tests for your changes
Once you have made the required changes to the codebase, go ahead and add tests for newly written methods/functions and update the tests for any code that you changed.
## Testing and Linting
You must run the tests and linter on your changes before opening a pull request.
### pytest
Runs all tests from the tests directory. pytest is a mature, full-featured Python testing tool.
```bash
pytest
```
### mypy
Mypy is a static type checker for Python. Type checkers help ensure that you're using variables and functions in your code correctly.
```bash
mypy -p waybackpy -p tests
```
### black
After testing with pytest and type checking with mypy, run black on the code base. The code style used by the project is 'black'.
```bash
black .
```
## Create a pull request
Read [Creating a pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request).
Try to make sure that all automated tests pass. If some of them do not, don't worry: tests are meant to catch bugs, and a failing test is better than introducing bugs into the master branch.


@@ -1,9 +0,0 @@
## AUTHORS
- akamhy (<https://github.com/akamhy>)
- danvalen1 (<https://github.com/danvalen1>)
- AntiCompositeNumber (<https://github.com/AntiCompositeNumber>)
- jonasjancarik (<https://github.com/jonasjancarik>)
## ACKNOWLEDGEMENTS
- mhmdiaa (<https://github.com/mhmdiaa>) for <https://gist.github.com/mhmdiaa/adf6bff70142e5091792841d4b372050>. known_urls is based on this gist.
- dequeued0 (<https://github.com/dequeued0>) for reporting bugs and useful feature requests.

README.md

@@ -1,61 +1,75 @@
<!-- markdownlint-disable MD033 MD041 -->
<div align="center">
<img src="https://raw.githubusercontent.com/akamhy/waybackpy/master/assets/waybackpy_logo.svg"><br>
<h3>A Python package & CLI tool that interfaces with the Wayback Machine API</h3>
<h3>Python package & CLI tool that interfaces the Wayback Machine APIs</h3>
</div>
<p align="center">
<a href="https://github.com/akamhy/waybackpy/actions?query=workflow%3ATests"><img alt="Unit Tests" src="https://github.com/akamhy/waybackpy/workflows/Tests/badge.svg"></a>
<a href="https://codecov.io/gh/akamhy/waybackpy"><img alt="codecov" src="https://codecov.io/gh/akamhy/waybackpy/branch/master/graph/badge.svg"></a>
<a href="https://pypi.org/project/waybackpy/"><img alt="pypi" src="https://img.shields.io/pypi/v/waybackpy.svg"></a>
<a href="https://pepy.tech/project/waybackpy?versions=2*&versions=1*&versions=3*"><img alt="Downloads" src="https://pepy.tech/badge/waybackpy/month"></a>
<a href="https://app.codacy.com/gh/akamhy/waybackpy?utm_source=github.com&utm_medium=referral&utm_content=akamhy/waybackpy&utm_campaign=Badge_Grade_Settings"><img alt="Codacy Badge" src="https://api.codacy.com/project/badge/Grade/6d777d8509f642ac89a20715bb3a6193"></a>
<a href="https://github.com/akamhy/waybackpy/commits/master"><img alt="GitHub lastest commit" src="https://img.shields.io/github/last-commit/akamhy/waybackpy?color=blue&style=flat-square"></a>
<a href="#"><img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/waybackpy?style=flat-square"></a>
<a href="https://github.com/psf/black"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
</p>
-----------------------------------------------------------------------------------------------------------------------------------------------
---
## ⭐️ Introduction
Waybackpy is a [Python package](https://www.udacity.com/blog/2021/01/what-is-a-python-package.html) and a [CLI](https://www.w3schools.com/whatis/whatis_cli.asp) tool that interfaces with the [Wayback Machine](https://en.wikipedia.org/wiki/Wayback_Machine) API.
# <img src="https://github.githubassets.com/images/icons/emoji/unicode/2b50.png" width="30"></img> Introduction
Wayback Machine has 3 client side [API](https://www.redhat.com/en/topics/api/what-are-application-programming-interfaces)s.
Waybackpy is a Python package and a CLI tool that interfaces with the Wayback Machine APIs.
- [Save API](https://github.com/akamhy/waybackpy/wiki/Wayback-Machine-APIs#save-api)
- [Availability API](https://github.com/akamhy/waybackpy/wiki/Wayback-Machine-APIs#availability-api)
- [CDX API](https://github.com/akamhy/waybackpy/wiki/Wayback-Machine-APIs#cdx-api)
Internet Archive's Wayback Machine has 3 useful public APIs.
These three APIs can be accessed via the waybackpy either by importing it in a script or from the CLI.
- SavePageNow or Save API
- CDX Server API
- Availability API
These three APIs can be accessed via waybackpy either by importing it in a Python file/module or from the command-line interface.
### 🏗 Installation
## <img src="https://github.githubassets.com/images/icons/emoji/unicode/1f3d7.png" width="20"></img> Installation
Using [pip](https://en.wikipedia.org/wiki/Pip_(package_manager)), from [PyPI](https://pypi.org/) (recommended):
**Using [pip](https://en.wikipedia.org/wiki/Pip_(package_manager)), from [PyPI](https://pypi.org/) (recommended)**:
```bash
pip install waybackpy
pip install waybackpy -U
```
Install directly from [this git repository](https://github.com/akamhy/waybackpy) (NOT recommended):
**Using [conda](https://en.wikipedia.org/wiki/Conda_(package_manager)), from [conda-forge](https://anaconda.org/conda-forge/waybackpy) (recommended)**:
See also [waybackpy feedstock](https://github.com/conda-forge/waybackpy-feedstock), maintainers are [@rafaelrdealmeida](https://github.com/rafaelrdealmeida/),
[@labriunesp](https://github.com/labriunesp/)
and [@akamhy](https://github.com/akamhy/).
```bash
conda install -c conda-forge waybackpy
```
**Install directly from [this git repository](https://github.com/akamhy/waybackpy) (NOT recommended)**:
```bash
pip install git+https://github.com/akamhy/waybackpy.git
```
### 🐳 Docker Image
Docker Hub : <https://hub.docker.com/r/secsi/waybackpy>
## <img src="https://github.githubassets.com/images/icons/emoji/unicode/1f433.png" width="20"></img> Docker Image
[Docker image](https://searchitoperations.techtarget.com/definition/Docker-image) is automatically updated on every release by [Regularly and Automatically Updated Docker Images](https://github.com/cybersecsi/RAUDI) (RAUDI).
Docker Hub: [hub.docker.com/r/secsi/waybackpy](https://hub.docker.com/r/secsi/waybackpy)
RAUDI is a tool by SecSI (<https://secsi.io>), an Italian cybersecurity startup.
Docker image is automatically updated on every release by [Regularly and Automatically Updated Docker Images](https://github.com/cybersecsi/RAUDI) (RAUDI).
RAUDI is a tool by [SecSI](https://secsi.io), an Italian cybersecurity startup.
### 🚀 Usage
## <img src="https://github.githubassets.com/images/icons/emoji/unicode/1f680.png" width="20"></img> Usage
#### As a Python package
### As a Python package
#### Save API aka SavePageNow
##### Save API aka SavePageNow
```python
>>> from waybackpy import WaybackMachineSaveAPI
>>> url = "https://github.com"
@@ -70,26 +84,68 @@ False
datetime.datetime(2022, 1, 18, 12, 52, 49)
```
##### Availability API
```python
>>> from waybackpy import WaybackMachineAvailabilityAPI
>>>
>>> url = "https://google.com"
>>> user_agent = "Mozilla/5.0 (Windows NT 5.1; rv:40.0) Gecko/20100101 Firefox/40.0"
>>>
>>> availability_api = WaybackMachineAvailabilityAPI(url, user_agent)
>>>
>>> availability_api.oldest()
https://web.archive.org/web/19981111184551/http://google.com:80/
>>>
>>> availability_api.newest()
https://web.archive.org/web/20220118150444/https://www.google.com/
>>>
>>> availability_api.near(year=2010, month=10, day=10, hour=10)
https://web.archive.org/web/20101010101708/http://www.google.com/
```
#### CDX API aka CDXServerAPI
##### CDX API aka CDXServerAPI
```python
>>> from waybackpy import WaybackMachineCDXServerAPI
>>> url = "https://google.com"
>>> user_agent = "my new app's user agent"
>>> cdx_api = WaybackMachineCDXServerAPI(url, user_agent)
```
##### oldest
```python
>>> cdx_api.oldest()
com,google)/ 19981111184551 http://google.com:80/ text/html 200 HOQ2TGPYAEQJPNUA6M4SMZ3NGQRBXDZ3 381
>>> oldest = cdx_api.oldest()
>>> oldest
com,google)/ 19981111184551 http://google.com:80/ text/html 200 HOQ2TGPYAEQJPNUA6M4SMZ3NGQRBXDZ3 381
>>> oldest.archive_url
'https://web.archive.org/web/19981111184551/http://google.com:80/'
>>> oldest.original
'http://google.com:80/'
>>> oldest.urlkey
'com,google)/'
>>> oldest.timestamp
'19981111184551'
>>> oldest.datetime_timestamp
datetime.datetime(1998, 11, 11, 18, 45, 51)
>>> oldest.statuscode
'200'
>>> oldest.mimetype
'text/html'
```
##### newest
```python
>>> newest = cdx_api.newest()
>>> newest
com,google)/ 20220217234427 http://@google.com/ text/html 301 Y6PVK4XWOI3BXQEXM5WLLWU5JKUVNSFZ 563
>>> newest.archive_url
'https://web.archive.org/web/20220217234427/http://@google.com/'
>>> newest.timestamp
'20220217234427'
```
##### near
```python
>>> near = cdx_api.near(year=2010, month=10, day=10, hour=10, minute=10)
>>> near.archive_url
'https://web.archive.org/web/20101010101435/http://google.com/'
>>> near
com,google)/ 20101010101435 http://google.com/ text/html 301 Y6PVK4XWOI3BXQEXM5WLLWU5JKUVNSFZ 391
>>> near.timestamp
'20101010101435'
>>> near = cdx_api.near(wayback_machine_timestamp=2008080808)
>>> near.archive_url
'https://web.archive.org/web/20080808051143/http://google.com/'
>>> near = cdx_api.near(unix_timestamp=1286705410)
>>> near
com,google)/ 20101010101435 http://google.com/ text/html 301 Y6PVK4XWOI3BXQEXM5WLLWU5JKUVNSFZ 391
>>> near.archive_url
'https://web.archive.org/web/20101010101435/http://google.com/'
```
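Both `wayback_machine_timestamp` and `unix_timestamp` resolve to the same 14-digit Wayback Machine timestamp format (`YYYYMMDDhhmmss`), and the CDX server returns the capture closest to it. A minimal sketch of the Unix-to-Wayback conversion (an illustration, not necessarily waybackpy's internal code):

```python
from datetime import datetime, timezone


def unix_to_wayback_timestamp(unix_timestamp: int) -> str:
    """Convert a Unix timestamp to the 14-digit Wayback Machine format."""
    dt = datetime.fromtimestamp(unix_timestamp, tz=timezone.utc)
    return dt.strftime("%Y%m%d%H%M%S")


print(unix_to_wayback_timestamp(1286705410))
# 20101010101010
```

This explains why `near(unix_timestamp=1286705410)` above returns the capture at `20101010101435`: it is the snapshot closest to 2010-10-10 10:10:10 UTC.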
##### snapshots
```python
>>> from waybackpy import WaybackMachineCDXServerAPI
>>> url = "https://pypi.org"
https://web.archive.org/web/20171127171549/https://pypi.org/
https://web.archive.org/web/20171206002737/http://pypi.org:80/
```
#### Availability API
Use of the Availability API is discouraged because of performance issues. All methods of the Availability API interface class, `WaybackMachineAvailabilityAPI`, are also implemented in the CDX server API interface class, `WaybackMachineCDXServerAPI`. Note, however, that the `newest()` method of `WaybackMachineAvailabilityAPI` can return a more recent archive than the corresponding method of `WaybackMachineCDXServerAPI`.
```python
>>> from waybackpy import WaybackMachineAvailabilityAPI
>>>
>>> url = "https://google.com"
>>> user_agent = "Mozilla/5.0 (Windows NT 5.1; rv:40.0) Gecko/20100101 Firefox/40.0"
>>>
>>> availability_api = WaybackMachineAvailabilityAPI(url, user_agent)
```
##### oldest
```python
>>> availability_api.oldest()
https://web.archive.org/web/19981111184551/http://google.com:80/
```
##### newest
```python
>>> availability_api.newest()
https://web.archive.org/web/20220118150444/https://www.google.com/
```
##### near
```python
>>> availability_api.near(year=2010, month=10, day=10, hour=10)
https://web.archive.org/web/20101010101708/http://www.google.com/
```
> Documentation is at <https://github.com/akamhy/waybackpy/wiki/Python-package-docs>.
### As a CLI tool
A demo video is on [asciinema.org](https://asciinema.org/a/469890); you can copy the text from the video.
Saving a webpage:
```bash
waybackpy --save --url "https://en.wikipedia.org/wiki/Social_media" --user_agent "my-unique-user-agent"
```
```bash
Archive URL:
https://web.archive.org/web/20220121193801/https://en.wikipedia.org/wiki/Social_media
Cached save:
False
```
[![asciicast](https://asciinema.org/a/469890.svg)](https://asciinema.org/a/469890)
Retrieving the oldest archive and also printing the JSON response of the Availability API:
```bash
waybackpy --oldest --json --url "https://en.wikipedia.org/wiki/Humanoid" --user_agent "my-unique-user-agent"
```
```bash
Archive URL:
https://web.archive.org/web/20040415020811/http://en.wikipedia.org:80/wiki/Humanoid
JSON response:
{"url": "https://en.wikipedia.org/wiki/Humanoid", "archived_snapshots": {"closest": {"status": "200", "available": true, "url": "http://web.archive.org/web/20040415020811/http://en.wikipedia.org:80/wiki/Humanoid", "timestamp": "20040415020811"}}, "timestamp": "199401212126"}
```
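The JSON printed by `--json` mirrors the raw Availability API response; the closest snapshot can be extracted with the standard `json` module. A sketch against the sample response above:

```python
import json

# Sample response copied from the --json output above.
response = json.loads(
    '{"url": "https://en.wikipedia.org/wiki/Humanoid", '
    '"archived_snapshots": {"closest": {"status": "200", "available": true, '
    '"url": "http://web.archive.org/web/20040415020811/'
    'http://en.wikipedia.org:80/wiki/Humanoid", '
    '"timestamp": "20040415020811"}}, "timestamp": "199401212126"}'
)

# The closest snapshot holds the archive URL and its 14-digit timestamp.
closest = response["archived_snapshots"]["closest"]
print(closest["timestamp"])
# 20040415020811
```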
Retrieving an archive close to a given time; minute-level precision is supported:
```bash
waybackpy --url google.com --user_agent "my-unique-user-agent" --near --year 2008 --month 8 --day 8
```
```bash
Archive URL:
https://web.archive.org/web/20080808014003/http://www.google.com:80/
```
> CLI documentation is at <https://github.com/akamhy/waybackpy/wiki/CLI-docs>.
### 🛡 License
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/akamhy/waybackpy/blob/master/LICENSE)
Copyright (c) 2020-2022 Akash Mahanty Et al.
Released under the MIT License. See [license](https://github.com/akamhy/waybackpy/blob/master/LICENSE) for details.
## CONTRIBUTORS
### AUTHORS
- akamhy (<https://github.com/akamhy>)
- eggplants (<https://github.com/eggplants>)
- danvalen1 (<https://github.com/danvalen1>)
- AntiCompositeNumber (<https://github.com/AntiCompositeNumber>)
- rafaelrdealmeida (<https://github.com/rafaelrdealmeida>)
- jonasjancarik (<https://github.com/jonasjancarik>)
- jfinkhaeuser (<https://github.com/jfinkhaeuser>)
### ACKNOWLEDGEMENTS
- mhmdiaa (<https://github.com/mhmdiaa>) `--known-urls` is based on [this](https://gist.github.com/mhmdiaa/adf6bff70142e5091792841d4b372050) gist.
- dequeued0 (<https://github.com/dequeued0>) for reporting bugs and useful feature requests.

pyproject.toml Normal file
@ -0,0 +1,3 @@
[build-system]
requires = ["wheel", "setuptools"]
build-backend = "setuptools.build_meta"


@ -1,11 +0,0 @@
[pytest]
addopts =
# show summary of all tests that did not pass
-ra
# enable all warnings
-Wd
# coverage and html report
--cov=waybackpy
--cov-report=html
testpaths =
tests


@ -1,8 +1,10 @@
black
click
requests
pytest
pytest-cov
codecov
flake8
mypy
black
pytest
pytest-cov
requests
setuptools>=46.4.0
types-requests


@ -1,2 +1,3 @@
click
requests
urllib3

setup.cfg

@ -1,7 +1,103 @@
[metadata]
description-file = README.md
license_file = LICENSE
name = waybackpy
version = attr: waybackpy.__version__
description = Python package that interfaces with the Internet Archive's Wayback Machine APIs. Archive pages and retrieve archived pages easily.
long_description = file: README.md
long_description_content_type = text/markdown
license = MIT
author = Akash Mahanty
author_email = akamhy@yahoo.com
url = https://akamhy.github.io/waybackpy/
download_url = https://github.com/akamhy/waybackpy/releases
project_urls =
Documentation = https://github.com/akamhy/waybackpy/wiki
Source = https://github.com/akamhy/waybackpy
Tracker = https://github.com/akamhy/waybackpy/issues
keywords =
Archive Website
Wayback Machine
Internet Archive
Wayback Machine CLI
Wayback Machine Python
Internet Archiving
Availability API
CDX API
savepagenow
classifiers =
Development Status :: 5 - Production/Stable
Intended Audience :: Developers
Intended Audience :: End Users/Desktop
Natural Language :: English
Typing :: Typed
License :: OSI Approved :: MIT License
Programming Language :: Python
Programming Language :: Python :: 3
Programming Language :: Python :: 3.6
Programming Language :: Python :: 3.7
Programming Language :: Python :: 3.8
Programming Language :: Python :: 3.9
Programming Language :: Python :: 3.10
Programming Language :: Python :: 3.11
Programming Language :: Python :: Implementation :: CPython
[options]
packages = find:
include-package-data = True
python_requires = >= 3.6
install_requires =
click
requests
urllib3
[options.package_data]
waybackpy = py.typed
[options.extras_require]
dev =
black
codecov
flake8
mypy
pytest
pytest-cov
setuptools>=46.4.0
types-requests
[options.entry_points]
console_scripts =
waybackpy = waybackpy.cli:main
[isort]
profile = black
[flake8]
indent-size = 4
max-line-length = 88
extend-ignore = E203,W503
extend-ignore = W503,W605
exclude =
venv
__pycache__
.venv
./env
venv/
env
.env
./build
[mypy]
python_version = 3.9
show_error_codes = True
pretty = True
strict = True
[tool:pytest]
addopts =
# show summary of all tests that did not pass
-ra
# enable all warnings
-Wd
# coverage and html report
--cov=waybackpy
--cov-report=html
testpaths =
tests


@ -1,66 +1,3 @@
import os.path
from setuptools import setup
readme_path = os.path.join(os.path.dirname(__file__), "README.md")
with open(readme_path, encoding="utf-8") as f:
long_description = f.read()
about = {}
version_path = os.path.join(os.path.dirname(__file__), "waybackpy", "__version__.py")
with open(version_path, encoding="utf-8") as f:
exec(f.read(), about)
version = str(about["__version__"])
download_url = "https://github.com/akamhy/waybackpy/archive/{version}.tar.gz".format(
version=version
)
setup(
name=about["__title__"],
packages=["waybackpy"],
version=version,
description=about["__description__"],
long_description=long_description,
long_description_content_type="text/markdown",
license=about["__license__"],
author=about["__author__"],
author_email=about["__author_email__"],
url=about["__url__"],
download_url=download_url,
keywords=[
"Archive Website",
"Wayback Machine",
"Internet Archive",
"Wayback Machine CLI",
"Wayback Machine Python",
"Internet Archiving",
"Availability API",
"CDX API",
"savepagenow",
],
install_requires=["requests", "click"],
python_requires=">=3.4",
classifiers=[
"Development Status :: 4 - Beta",
"Intended Audience :: Developers",
"Natural Language :: English",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.4",
"Programming Language :: Python :: 3.5",
"Programming Language :: Python :: 3.6",
"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: Implementation :: CPython",
],
entry_points={"console_scripts": ["waybackpy = waybackpy.cli:main"]},
project_urls={
"Documentation": "https://github.com/akamhy/waybackpy/wiki",
"Source": "https://github.com/akamhy/waybackpy",
"Tracker": "https://github.com/akamhy/waybackpy/issues",
},
)
setup()


@ -1,10 +1,10 @@
name: waybackpy
summary: Wayback Machine API interface and a command-line tool
description: |
Waybackpy is a CLI tool that interfaces with the Wayback Machine APIs.
Wayback Machine has three client side public APIs, Save API,
Availability API and CDX API. These three APIs can be accessed via
the waybackpy from the terminal.
Waybackpy is a CLI tool that interfaces with the Wayback Machine APIs.
Wayback Machine has three client side public APIs, Save API,
Availability API and CDX API. These three APIs can be accessed via
the waybackpy from the terminal.
version: git
grade: stable
confinement: strict


@ -1,41 +1,53 @@
import pytest
import random
import string
from datetime import datetime, timedelta
import pytest
from waybackpy.availability_api import WaybackMachineAvailabilityAPI
from waybackpy.exceptions import (
InvalidJSONInAvailabilityAPIResponse,
ArchiveNotInAvailabilityAPIResponse,
InvalidJSONInAvailabilityAPIResponse,
)
now = datetime.utcnow()
url = "https://example.com/"
user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36"
rndstr = lambda n: "".join(
random.choice(string.ascii_uppercase + string.digits) for _ in range(n)
user_agent = (
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
"(KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36"
)
def test_oldest():
def rndstr(n: int) -> str:
return "".join(
random.choice(string.ascii_uppercase + string.digits) for _ in range(n)
)
def test_oldest() -> None:
"""
Test the oldest archive of Google.com and also checks the attributes.
"""
url = "https://example.com/"
user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36"
user_agent = (
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
"(KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36"
)
availability_api = WaybackMachineAvailabilityAPI(url, user_agent)
oldest = availability_api.oldest()
oldest_archive_url = oldest.archive_url
assert "2002" in oldest_archive_url
oldest_timestamp = oldest.timestamp()
assert abs(oldest_timestamp - now) > timedelta(days=7000) # More than 19 years
assert availability_api.JSON["archived_snapshots"]["closest"]["available"] is True
assert (
availability_api.json is not None
and availability_api.json["archived_snapshots"]["closest"]["available"] is True
)
assert repr(oldest).find("example.com") != -1
assert "2002" in str(oldest)
def test_newest():
def test_newest() -> None:
"""
Assuming that the recent most Google Archive was made no more earlier than
last one day which is 86400 seconds.
@ -51,16 +63,17 @@ def test_newest():
assert abs(newest_timestamp - now) < timedelta(seconds=86400 * 3)
def test_invalid_json():
def test_invalid_json() -> None:
"""
When the API is malfunctioning or we don't pass a URL it may return invalid JSON data.
When the API is malfunctioning or we don't pass a URL,
it may return invalid JSON data.
"""
with pytest.raises(InvalidJSONInAvailabilityAPIResponse):
availability_api = WaybackMachineAvailabilityAPI(url="", user_agent=user_agent)
archive_url = availability_api.archive_url
_ = availability_api.archive_url
def test_no_archive():
def test_no_archive() -> None:
"""
ArchiveNotInAvailabilityAPIResponse may be raised if Wayback Machine did not
replied with the archive despite the fact that we know the site has million
@ -71,12 +84,12 @@ def test_no_archive():
"""
with pytest.raises(ArchiveNotInAvailabilityAPIResponse):
availability_api = WaybackMachineAvailabilityAPI(
url="https://%s.cn" % rndstr(30), user_agent=user_agent
url=f"https://{rndstr(30)}.cn", user_agent=user_agent
)
archive_url = availability_api.archive_url
_ = availability_api.archive_url
def test_no_api_call_str_repr():
def test_no_api_call_str_repr() -> None:
"""
Some entitled users maybe want to see what is the string representation
if they dont make any API requests.
@ -84,17 +97,17 @@ def test_no_api_call_str_repr():
str() must not return None so we return ""
"""
availability_api = WaybackMachineAvailabilityAPI(
url="https://%s.gov" % rndstr(30), user_agent=user_agent
url=f"https://{rndstr(30)}.gov", user_agent=user_agent
)
assert "" == str(availability_api)
assert str(availability_api) == ""
def test_no_call_timestamp():
def test_no_call_timestamp() -> None:
"""
If no API requests were made the bound timestamp() method returns
the datetime.max as a default value.
"""
availability_api = WaybackMachineAvailabilityAPI(
url="https://%s.in" % rndstr(30), user_agent=user_agent
url=f"https://{rndstr(30)}.in", user_agent=user_agent
)
assert datetime.max == availability_api.timestamp()

tests/test_cdx_api.py Normal file

@ -0,0 +1,214 @@
import random
import string
import pytest
from waybackpy.cdx_api import WaybackMachineCDXServerAPI
from waybackpy.exceptions import NoCDXRecordFound
def rndstr(n: int) -> str:
return "".join(
random.choice(string.ascii_uppercase + string.digits) for _ in range(n)
)
def test_a() -> None:
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 "
"(KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
url = "https://twitter.com/jack"
wayback = WaybackMachineCDXServerAPI(
url=url,
user_agent=user_agent,
match_type="prefix",
collapses=["urlkey"],
start_timestamp="201001",
end_timestamp="201002",
)
# timeframe bound prefix matching enabled along with active urlkey based collapsing
snapshots = wayback.snapshots() # <class 'generator'>
for snapshot in snapshots:
assert snapshot.timestamp.startswith("2010")
def test_b() -> None:
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) "
"AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
url = "https://www.google.com"
wayback = WaybackMachineCDXServerAPI(
url=url,
user_agent=user_agent,
start_timestamp="202101",
end_timestamp="202112",
collapses=["urlkey"],
)
# timeframe bound prefix matching enabled along with active urlkey based collapsing
snapshots = wayback.snapshots() # <class 'generator'>
for snapshot in snapshots:
assert snapshot.timestamp.startswith("2021")
def test_c() -> None:
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) "
"AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
url = "https://www.google.com"
cdx = WaybackMachineCDXServerAPI(
url=url,
user_agent=user_agent,
closest="201010101010",
sort="closest",
limit="1",
)
snapshots = cdx.snapshots()
for snapshot in snapshots:
archive_url = snapshot.archive_url
timestamp = snapshot.timestamp
break
assert str(archive_url).find("google.com")
assert "20101010" in timestamp
def test_d() -> None:
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) "
"AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
cdx = WaybackMachineCDXServerAPI(
url="akamhy.github.io",
user_agent=user_agent,
match_type="prefix",
use_pagination=True,
filters=["statuscode:200"],
)
snapshots = cdx.snapshots()
count = 0
for snapshot in snapshots:
count += 1
assert str(snapshot.archive_url).find("akamhy.github.io")
assert count > 50
def test_oldest() -> None:
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) "
"AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
cdx = WaybackMachineCDXServerAPI(
url="google.com",
user_agent=user_agent,
filters=["statuscode:200"],
)
oldest = cdx.oldest()
assert "1998" in oldest.timestamp
assert "google" in oldest.urlkey
assert oldest.original.find("google.com") != -1
assert oldest.archive_url.find("google.com") != -1
def test_newest() -> None:
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) "
"AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
cdx = WaybackMachineCDXServerAPI(
url="google.com",
user_agent=user_agent,
filters=["statuscode:200"],
)
newest = cdx.newest()
assert "google" in newest.urlkey
assert newest.original.find("google.com") != -1
assert newest.archive_url.find("google.com") != -1
def test_near() -> None:
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) "
"AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
cdx = WaybackMachineCDXServerAPI(
url="google.com",
user_agent=user_agent,
filters=["statuscode:200"],
)
near = cdx.near(year=2010, month=10, day=10, hour=10, minute=10)
assert "2010101010" in near.timestamp
assert "google" in near.urlkey
assert near.original.find("google.com") != -1
assert near.archive_url.find("google.com") != -1
near = cdx.near(wayback_machine_timestamp="201010101010")
assert "2010101010" in near.timestamp
assert "google" in near.urlkey
assert near.original.find("google.com") != -1
assert near.archive_url.find("google.com") != -1
near = cdx.near(unix_timestamp=1286705410)
assert "2010101010" in near.timestamp
assert "google" in near.urlkey
assert near.original.find("google.com") != -1
assert near.archive_url.find("google.com") != -1
with pytest.raises(NoCDXRecordFound):
dne_url = f"https://{rndstr(30)}.in"
cdx = WaybackMachineCDXServerAPI(
url=dne_url,
user_agent=user_agent,
filters=["statuscode:200"],
)
cdx.near(unix_timestamp=1286705410)
def test_before() -> None:
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) "
"AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
cdx = WaybackMachineCDXServerAPI(
url="http://www.google.com/",
user_agent=user_agent,
filters=["statuscode:200"],
)
before = cdx.before(wayback_machine_timestamp=20160731235949)
assert "20160731233347" in before.timestamp
assert "google" in before.urlkey
assert before.original.find("google.com") != -1
assert before.archive_url.find("google.com") != -1
def test_after() -> None:
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) "
"AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
cdx = WaybackMachineCDXServerAPI(
url="http://www.google.com/",
user_agent=user_agent,
filters=["statuscode:200"],
)
after = cdx.after(wayback_machine_timestamp=20160731235949)
assert "20160801000917" in after.timestamp, after.timestamp
assert "google" in after.urlkey
assert after.original.find("google.com") != -1
assert after.archive_url.find("google.com") != -1


@ -1,11 +1,13 @@
import pytest
from datetime import datetime
from waybackpy.cdx_snapshot import CDXSnapshot
def test_CDXSnapshot():
sample_input = "org,archive)/ 20080126045828 http://github.com text/html 200 Q4YULN754FHV2U6Q5JUT6Q2P57WEWNNY 1415"
def test_CDXSnapshot() -> None:
sample_input = (
"org,archive)/ 20080126045828 http://github.com "
"text/html 200 Q4YULN754FHV2U6Q5JUT6Q2P57WEWNNY 1415"
)
prop_values = sample_input.split(" ")
properties = {}
(
@ -39,3 +41,4 @@ def test_CDXSnapshot():
)
assert archive_url == snapshot.archive_url
assert sample_input == str(snapshot)
assert sample_input == repr(snapshot)


@ -1,73 +1,78 @@
from typing import Any, Dict, List
import pytest
from waybackpy.exceptions import WaybackError
from waybackpy.cdx_utils import (
get_total_pages,
check_collapses,
check_filters,
check_match_type,
check_sort,
full_url,
get_response,
check_filters,
check_collapses,
check_match_type,
get_total_pages,
)
from waybackpy.exceptions import WaybackError
def test_get_total_pages():
def test_get_total_pages() -> None:
url = "twitter.com"
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.2 Safari/605.1.15"
user_agent = (
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 "
"(KHTML, like Gecko) Version/14.0.2 Safari/605.1.15"
)
assert get_total_pages(url=url, user_agent=user_agent) >= 56
def test_full_url():
params = {}
def test_full_url() -> None:
endpoint = "https://web.archive.org/cdx/search/cdx"
params: Dict[str, Any] = {}
assert endpoint == full_url(endpoint, params)
params = {"a": "1"}
assert "https://web.archive.org/cdx/search/cdx?a=1" == full_url(endpoint, params)
assert "https://web.archive.org/cdx/search/cdx?a=1" == full_url(
endpoint + "?", params
assert full_url(endpoint, params) == "https://web.archive.org/cdx/search/cdx?a=1"
assert (
full_url(endpoint + "?", params) == "https://web.archive.org/cdx/search/cdx?a=1"
)
params["b"] = 2
assert "https://web.archive.org/cdx/search/cdx?a=1&b=2" == full_url(
endpoint + "?", params
assert (
full_url(endpoint + "?", params)
== "https://web.archive.org/cdx/search/cdx?a=1&b=2"
)
params["c"] = "foo bar"
assert "https://web.archive.org/cdx/search/cdx?a=1&b=2&c=foo%20bar" == full_url(
endpoint + "?", params
assert (
full_url(endpoint + "?", params)
== "https://web.archive.org/cdx/search/cdx?a=1&b=2&c=foo%20bar"
)
def test_get_response():
def test_get_response() -> None:
url = "https://github.com"
user_agent = (
"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0"
)
headers = {"User-Agent": "%s" % user_agent}
headers = {"User-Agent": str(user_agent)}
response = get_response(url, headers=headers)
assert response.status_code == 200
url = "http/wwhfhfvhvjhmom"
with pytest.raises(WaybackError):
get_response(url, headers=headers)
assert not isinstance(response, Exception) and response.status_code == 200
def test_check_filters():
filters = []
def test_check_filters() -> None:
filters: List[str] = []
check_filters(filters)
filters = ["statuscode:200", "timestamp:20215678901234", "original:https://url.com"]
check_filters(filters)
with pytest.raises(WaybackError):
check_filters("not-list")
check_filters("not-list") # type: ignore[arg-type]
with pytest.raises(WaybackError):
check_filters(["invalid"])
def test_check_collapses():
collapses = []
def test_check_collapses() -> None:
collapses: List[str] = []
check_collapses(collapses)
collapses = ["timestamp:10"]
@ -76,7 +81,7 @@ def test_check_collapses():
collapses = ["urlkey"]
check_collapses(collapses)
collapses = "urlkey" # NOT LIST
collapses = "urlkey" # type: ignore[assignment]
with pytest.raises(WaybackError):
check_collapses(collapses)
@ -85,11 +90,11 @@ def test_check_collapses():
check_collapses(collapses)
def test_check_match_type():
assert None == check_match_type(None, "url")
def test_check_match_type() -> None:
assert check_match_type(None, "url")
match_type = "exact"
url = "test_url"
assert None == check_match_type(match_type, url)
assert check_match_type(match_type, url)
url = "has * in it"
with pytest.raises(WaybackError):
@ -97,3 +102,12 @@ def test_check_match_type():
with pytest.raises(WaybackError):
check_match_type("not a valid type", "url")
def test_check_sort() -> None:
assert check_sort("default")
assert check_sort("closest")
assert check_sort("reverse")
with pytest.raises(WaybackError):
assert check_sort("random crap")

tests/test_cli.py Normal file

@ -0,0 +1,136 @@
import requests
from click.testing import CliRunner
from waybackpy import __version__
from waybackpy.cli import main
def test_oldest() -> None:
runner = CliRunner()
result = runner.invoke(main, ["--url", " https://github.com ", "--oldest"])
assert result.exit_code == 0
assert (
result.output
== "Archive URL:\nhttps://web.archive.org/web/2008051421\
0148/http://github.com/\n"
)
def test_near() -> None:
runner = CliRunner()
result = runner.invoke(
main,
[
"--url",
" https://facebook.com ",
"--near",
"--year",
"2010",
"--month",
"5",
"--day",
"10",
"--hour",
"6",
],
)
assert result.exit_code == 0
assert (
result.output
== "Archive URL:\nhttps://web.archive.org/web/2010051008\
2647/http://www.facebook.com/\n"
)
def test_newest() -> None:
runner = CliRunner()
result = runner.invoke(main, ["--url", " https://microsoft.com ", "--newest"])
assert result.exit_code == 0
assert (
result.output.find("microsoft.com") != -1
and result.output.find("Archive URL:\n") != -1
)
def test_cdx() -> None:
runner = CliRunner()
result = runner.invoke(
main,
"--url https://twitter.com/jack --cdx --user-agent some-user-agent \
--start-timestamp 2010 --end-timestamp 2012 --collapse urlkey \
--match-type prefix --cdx-print archiveurl --cdx-print length \
--cdx-print digest --cdx-print statuscode --cdx-print mimetype \
--cdx-print original --cdx-print timestamp --cdx-print urlkey".split(
" "
),
)
assert result.exit_code == 0
assert result.output.count("\n") > 3000
def test_save() -> None:
runner = CliRunner()
result = runner.invoke(
main,
"--url https://yahoo.com --user_agent my-unique-user-agent \
--save --headers".split(
" "
),
)
assert result.exit_code == 0
assert result.output.find("Archive URL:") != -1
assert (result.output.find("Cached save:\nTrue") != -1) or (
result.output.find("Cached save:\nFalse") != -1
)
assert result.output.find("Save API headers:\n") != -1
assert result.output.find("yahoo.com") != -1
def test_version() -> None:
runner = CliRunner()
result = runner.invoke(main, ["--version"])
assert result.exit_code == 0
assert result.output == f"waybackpy version {__version__}\n"
def test_license() -> None:
runner = CliRunner()
result = runner.invoke(main, ["--license"])
assert result.exit_code == 0
assert (
result.output
== requests.get(
url="https://raw.githubusercontent.com/akamhy/waybackpy/master/LICENSE"
).text
+ "\n"
)
def test_only_url() -> None:
runner = CliRunner()
result = runner.invoke(main, ["--url", "https://google.com"])
assert result.exit_code == 0
assert (
result.output
== "NoCommandFound: Only URL passed, but did not specify what to do with the URL. Use \
--help flag for help using waybackpy.\n"
)
def test_known_url() -> None:
# with file generator enabled
runner = CliRunner()
result = runner.invoke(
main, ["--url", "https://akamhy.github.io", "--known-urls", "--file"]
)
assert result.exit_code == 0
assert result.output.count("\n") > 40
assert result.output.count("akamhy.github.io") > 40
assert result.output.find("in the current working directory.\n") != -1
# without file
runner = CliRunner()
result = runner.invoke(main, ["--url", "https://akamhy.github.io", "--known-urls"])
assert result.exit_code == 0
assert result.output.count("\n") > 40
assert result.output.count("akamhy.github.io") > 40


@ -1,20 +1,28 @@
import pytest
import time
import random
import string
import time
from datetime import datetime
from typing import cast
import pytest
from requests.structures import CaseInsensitiveDict
from waybackpy.save_api import WaybackMachineSaveAPI
from waybackpy.exceptions import MaximumSaveRetriesExceeded
rndstr = lambda n: "".join(
random.choice(string.ascii_uppercase + string.digits) for _ in range(n)
)
from waybackpy.save_api import WaybackMachineSaveAPI
def test_save():
def rndstr(n: int) -> str:
return "".join(
random.choice(string.ascii_uppercase + string.digits) for _ in range(n)
)
def test_save() -> None:
url = "https://github.com/akamhy/waybackpy"
user_agent = "Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 "
"(KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
save_api = WaybackMachineSaveAPI(url, user_agent)
save_api.save()
archive_url = save_api.archive_url
@ -23,19 +31,23 @@ def test_save():
cached_save = save_api.cached_save
assert cached_save in [True, False]
assert archive_url.find("github.com/akamhy/waybackpy") != -1
assert timestamp is not None
assert str(headers).find("github.com/akamhy/waybackpy") != -1
assert type(save_api.timestamp()) == type(datetime(year=2020, month=10, day=2))
assert isinstance(save_api.timestamp(), datetime)
def test_max_redirect_exceeded():
def test_max_redirect_exceeded() -> None:
with pytest.raises(MaximumSaveRetriesExceeded):
url = "https://%s.gov" % rndstr
user_agent = "Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
url = f"https://{rndstr}.gov"
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 "
"(KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
save_api = WaybackMachineSaveAPI(url, user_agent, max_tries=3)
save_api.save()
def test_sleep():
def test_sleep() -> None:
"""
sleeping is actually very important for SaveAPI
interface stability.
@ -43,7 +55,10 @@ def test_sleep():
is as intended.
"""
url = "https://example.com"
user_agent = "Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 "
"(KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
save_api = WaybackMachineSaveAPI(url, user_agent)
s_time = int(time.time())
save_api.sleep(6) # multiple of 3 sleep for 10 seconds
@ -56,78 +71,153 @@ def test_sleep():
assert (e_time - s_time) >= 5
def test_timestamp():
def test_timestamp() -> None:
url = "https://example.com"
user_agent = "Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
save_api = WaybackMachineSaveAPI(url, user_agent)
now = datetime.utcnow()
save_api._archive_url = (
"https://web.archive.org/web/%s/" % now.strftime("%Y%m%d%H%M%S") + url
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 "
"(KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
save_api = WaybackMachineSaveAPI(url, user_agent)
now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
save_api._archive_url = f"https://web.archive.org/web/{now}/{url}/"
save_api.timestamp()
assert save_api.cached_save is False
save_api._archive_url = "https://web.archive.org/web/%s/" % "20100124063622" + url
now = "20100124063622"
save_api._archive_url = f"https://web.archive.org/web/{now}/{url}/"
save_api.timestamp()
assert save_api.cached_save is True
def test_archive_url_parser():
def test_archive_url_parser() -> None:
"""
Testing three regex for matches and also tests the response URL.
"""
url = "https://example.com"
user_agent = "Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/604.1"
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 "
"(KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
save_api = WaybackMachineSaveAPI(url, user_agent)
save_api.headers = """
START
Content-Location: /web/20201126185327/https://www.scribbr.com/citing-sources/et-al
END
"""
assert (
save_api.archive_url_parser()
== "https://web.archive.org/web/20201126185327/https://www.scribbr.com/citing-sources/et-al"
h = (
"\nSTART\nContent-Location: "
"/web/20201126185327/https://www.scribbr.com/citing-sources/et-al"
"\nEND\n"
)
save_api.headers = h # type: ignore[assignment]
save_api.headers = """
{'Server': 'nginx/1.15.8', 'Date': 'Sat, 02 Jan 2021 09:40:25 GMT', 'Content-Type': 'text/html; charset=UTF-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'X-Archive-Orig-Server': 'nginx', 'X-Archive-Orig-Date': 'Sat, 02 Jan 2021 09:40:09 GMT', 'X-Archive-Orig-Transfer-Encoding': 'chunked', 'X-Archive-Orig-Connection': 'keep-alive', 'X-Archive-Orig-Vary': 'Accept-Encoding', 'X-Archive-Orig-Last-Modified': 'Fri, 01 Jan 2021 12:19:00 GMT', 'X-Archive-Orig-Strict-Transport-Security': 'max-age=31536000, max-age=0;', 'X-Archive-Guessed-Content-Type': 'text/html', 'X-Archive-Guessed-Charset': 'utf-8', 'Memento-Datetime': 'Sat, 02 Jan 2021 09:40:09 GMT', 'Link': '<https://www.scribbr.com/citing-sources/et-al/>; rel="original", <https://web.archive.org/web/timemap/link/https://www.scribbr.com/citing-sources/et-al/>; rel="timemap"; type="application/link-format", <https://web.archive.org/web/https://www.scribbr.com/citing-sources/et-al/>; rel="timegate", <https://web.archive.org/web/20200601082911/https://www.scribbr.com/citing-sources/et-al/>; rel="first memento"; datetime="Mon, 01 Jun 2020 08:29:11 GMT", <https://web.archive.org/web/20201126185327/https://www.scribbr.com/citing-sources/et-al/>; rel="prev memento"; datetime="Thu, 26 Nov 2020 18:53:27 GMT", <https://web.archive.org/web/20210102094009/https://www.scribbr.com/citing-sources/et-al/>; rel="memento"; datetime="Sat, 02 Jan 2021 09:40:09 GMT", <https://web.archive.org/web/20210102094009/https://www.scribbr.com/citing-sources/et-al/>; rel="last memento"; datetime="Sat, 02 Jan 2021 09:40:09 GMT"', 'Content-Security-Policy': "default-src 'self' 'unsafe-eval' 'unsafe-inline' data: blob: archive.org web.archive.org analytics.archive.org pragma.archivelab.org", 'X-Archive-Src': 'spn2-20210102092956-wwwb-spn20.us.archive.org-8001.warc.gz', 'Server-Timing': 'captures_list;dur=112.646325, exclusion.robots;dur=0.172010, exclusion.robots.policy;dur=0.158205, RedisCDXSource;dur=2.205932, esindex;dur=0.014647, 
LoadShardBlock;dur=82.205012, PetaboxLoader3.datanode;dur=70.750239, CDXLines.iter;dur=24.306278, load_resource;dur=26.520179', 'X-App-Server': 'wwwb-app200', 'X-ts': '200', 'X-location': 'All', 'X-Cache-Key': 'httpsweb.archive.org/web/20210102094009/https://www.scribbr.com/citing-sources/et-al/IN', 'X-RL': '0', 'X-Page-Cache': 'MISS', 'X-Archive-Screenname': '0', 'Content-Encoding': 'gzip'}
"""
expected_url = (
"https://web.archive.org/web/20201126185327/"
"https://www.scribbr.com/citing-sources/et-al"
)
assert save_api.archive_url_parser() == expected_url
headers = {
"Server": "nginx/1.15.8",
"Date": "Sat, 02 Jan 2021 09:40:25 GMT",
"Content-Type": "text/html; charset=UTF-8",
"Transfer-Encoding": "chunked",
"Connection": "keep-alive",
"X-Archive-Orig-Server": "nginx",
"X-Archive-Orig-Date": "Sat, 02 Jan 2021 09:40:09 GMT",
"X-Archive-Orig-Transfer-Encoding": "chunked",
"X-Archive-Orig-Connection": "keep-alive",
"X-Archive-Orig-Vary": "Accept-Encoding",
"X-Archive-Orig-Last-Modified": "Fri, 01 Jan 2021 12:19:00 GMT",
"X-Archive-Orig-Strict-Transport-Security": "max-age=31536000, max-age=0;",
"X-Archive-Guessed-Content-Type": "text/html",
"X-Archive-Guessed-Charset": "utf-8",
"Memento-Datetime": "Sat, 02 Jan 2021 09:40:09 GMT",
"Link": (
'<https://www.scribbr.com/citing-sources/et-al/>; rel="original", '
"<https://web.archive.org/web/timemap/link/https://www.scribbr.com/"
'citing-sources/et-al/>; rel="timemap"; type="application/link-format", '
"<https://web.archive.org/web/https://www.scribbr.com/citing-sources/"
'et-al/>; rel="timegate", <https://web.archive.org/web/20200601082911/'
'https://www.scribbr.com/citing-sources/et-al/>; rel="first memento"; '
'datetime="Mon, 01 Jun 2020 08:29:11 GMT", <https://web.archive.org/web/'
"20201126185327/https://www.scribbr.com/citing-sources/et-al/>; "
'rel="prev memento"; datetime="Thu, 26 Nov 2020 18:53:27 GMT", '
"<https://web.archive.org/web/20210102094009/https://www.scribbr.com/"
'citing-sources/et-al/>; rel="memento"; datetime="Sat, 02 Jan 2021 '
'09:40:09 GMT", <https://web.archive.org/web/20210102094009/'
"https://www.scribbr.com/citing-sources/et-al/>; "
'rel="last memento"; datetime="Sat, 02 Jan 2021 09:40:09 GMT"'
),
"Content-Security-Policy": (
"default-src 'self' 'unsafe-eval' 'unsafe-inline' "
"data: blob: archive.org web.archive.org analytics.archive.org "
"pragma.archivelab.org",
),
"X-Archive-Src": "spn2-20210102092956-wwwb-spn20.us.archive.org-8001.warc.gz",
"Server-Timing": (
"captures_list;dur=112.646325, exclusion.robots;dur=0.172010, "
"exclusion.robots.policy;dur=0.158205, RedisCDXSource;dur=2.205932, "
"esindex;dur=0.014647, LoadShardBlock;dur=82.205012, "
"PetaboxLoader3.datanode;dur=70.750239, CDXLines.iter;dur=24.306278, "
"load_resource;dur=26.520179"
),
"X-App-Server": "wwwb-app200",
"X-ts": "200",
"X-location": "All",
"X-Cache-Key": (
"httpsweb.archive.org/web/20210102094009/"
"https://www.scribbr.com/citing-sources/et-al/IN",
),
"X-RL": "0",
"X-Page-Cache": "MISS",
"X-Archive-Screenname": "0",
"Content-Encoding": "gzip",
}
save_api.headers = cast(CaseInsensitiveDict[str], headers)
expected_url2 = (
"https://web.archive.org/web/20210102094009/"
"https://www.scribbr.com/citing-sources/et-al/"
)
assert save_api.archive_url_parser() == expected_url2
expected_url_3 = (
"https://web.archive.org/web/20171128185327/"
"https://www.scribbr.com/citing-sources/et-al/US"
)
h = f"START\nX-Cache-Key: {expected_url_3}\nEND\n"
save_api.headers = h # type: ignore[assignment]
expected_url4 = (
"https://web.archive.org/web/20171128185327/"
"https://www.scribbr.com/citing-sources/et-al/"
)
assert save_api.archive_url_parser() == expected_url4
h = "TEST TEST TEST AND NO MATCH - TEST FOR RESPONSE URL MATCHING"
save_api.headers = h # type: ignore[assignment]
save_api.response_url = (
"https://web.archive.org/web/20171128185327/"
"https://www.scribbr.com/citing-sources/et-al"
)
expected_url5 = (
"https://web.archive.org/web/20171128185327/"
"https://www.scribbr.com/citing-sources/et-al"
)
assert save_api.archive_url_parser() == expected_url5
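The header fixtures above exercise a fallback chain for recovering the archive URL. A hypothetical standalone parser consistent with those fixtures (the real `archive_url_parser`'s regexes and helper names may differ) can be sketched as:

```python
import re

def parse_archive_url(header_text, response_url=None):
    # 1) X-Cache-Key header: grab everything after "https" up to the
    #    trailing two-letter country code (e.g. "US", "IN"), which is dropped.
    match = re.search(r"X-Cache-Key:\s*https(.*)[A-Z]{2}", header_text)
    if match:
        return "https" + match.group(1)
    # 2) Fall back to the response URL when it already points at an archive.
    if response_url and "web.archive.org/web/" in response_url:
        return response_url.strip()
    return None
```

Note how the greedy `(.*)` backtracks just enough for `[A-Z]{2}` to consume the country code, so the captured group ends at the trailing slash of the archive URL.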
def test_archive_url() -> None:
"""
Checks the attribute archive_url's value when the save method was not
explicitly invoked by the end-user, but was invoked implicitly by
accessing archive_url, which is a property.
"""
url = "https://example.com"
user_agent = (
"Mozilla/5.0 (MacBook Air; M1 Mac OS X 11_4) AppleWebKit/605.1.15 "
"(KHTML, like Gecko) Version/14.1.1 Safari/604.1"
)
save_api = WaybackMachineSaveAPI(url, user_agent)
save_api.saved_archive = (
"https://web.archive.org/web/20220124063056/https://example.com/"
)
save_api._archive_url = save_api.saved_archive
assert save_api.archive_url == save_api.saved_archive


@@ -1,9 +1,9 @@
from waybackpy import __version__
from waybackpy.utils import DEFAULT_USER_AGENT
def test_default_user_agent() -> None:
assert (
DEFAULT_USER_AGENT
== f"waybackpy {__version__} - https://github.com/akamhy/waybackpy"
)

tests/test_wrapper.py Normal file

@@ -0,0 +1,45 @@
from waybackpy.wrapper import Url
def test_oldest() -> None:
url = "https://bing.com"
oldest_archive = (
"https://web.archive.org/web/20030726111100/http://www.bing.com:80/"
)
wayback = Url(url).oldest()
assert wayback.archive_url == oldest_archive
assert str(wayback) == oldest_archive
assert len(wayback) > 365 * 15 # days in a year times years
def test_newest() -> None:
url = "https://www.youtube.com/"
wayback = Url(url).newest()
assert "youtube" in str(wayback.archive_url)
assert "archived_snapshots" in str(wayback.json)
def test_near() -> None:
url = "https://www.google.com"
wayback = Url(url).near(year=2010, month=10, day=10, hour=10, minute=10)
assert "20101010" in str(wayback.archive_url)
def test_total_archives() -> None:
wayback = Url("https://akamhy.github.io")
assert wayback.total_archives() > 10
wayback = Url("https://gaha.ef4i3n.m5iai3kifp6ied.cima/gahh2718gs/ahkst63t7gad8")
assert wayback.total_archives() == 0
def test_known_urls() -> None:
wayback = Url("akamhy.github.io")
assert len(list(wayback.known_urls(subdomain=True))) > 40
def test_Save() -> None:
wayback = Url("https://en.wikipedia.org/wiki/Asymptotic_equipartition_property")
wayback.save()
archive_url = str(wayback.archive_url)
assert archive_url.find("Asymptotic_equipartition_property") != -1


@@ -1,14 +1,16 @@
"""Module initializer and provider of static information."""
__version__ = "3.0.6"
from .availability_api import WaybackMachineAvailabilityAPI
from .cdx_api import WaybackMachineCDXServerAPI
from .save_api import WaybackMachineSaveAPI
from .wrapper import Url
__all__ = [
"__version__",
"WaybackMachineAvailabilityAPI",
"WaybackMachineCDXServerAPI",
"WaybackMachineSaveAPI",
"Url",
]


@@ -1,11 +0,0 @@
__title__ = "waybackpy"
__description__ = (
"Python package that interfaces with the Internet Archive's Wayback Machine APIs. "
"Archive pages and retrieve archived pages easily."
)
__url__ = "https://akamhy.github.io/waybackpy/"
__version__ = "3.0.2"
__author__ = "Akash Mahanty"
__author_email__ = "akamhy@yahoo.com"
__license__ = "MIT"
__copyright__ = "Copyright 2020-2022 Akash Mahanty et al."


@@ -1,61 +1,100 @@
"""
This module interfaces the Wayback Machine's availability API.
The interface is useful for looking up archives and finding archives
that are close to a specific date and time.
It has a class WaybackMachineAvailabilityAPI, and the class has
methods like:
near() for retrieving archives close to a specific date and time.
oldest() for retrieving the first archive URL of the webpage.
newest() for retrieving the latest archive of the webpage.
The Wayback Machine Availability API response must be valid JSON;
if it is not, then InvalidJSONInAvailabilityAPIResponse is raised.
If the Availability API returned valid JSON but the archive URL could
not be found in it, then ArchiveNotInAvailabilityAPIResponse is raised.
"""
import json
import time
from datetime import datetime
from typing import Any, Dict, Optional
import requests
from requests.models import Response
from .exceptions import (
ArchiveNotInAvailabilityAPIResponse,
InvalidJSONInAvailabilityAPIResponse,
)
from .utils import (
DEFAULT_USER_AGENT,
unix_timestamp_to_wayback_timestamp,
wayback_timestamp,
)
ResponseJSON = Dict[str, Any]
class WaybackMachineAvailabilityAPI:
"""
Class that interfaces the Wayback Machine's availability API.
"""
def __init__(
self, url: str, user_agent: str = DEFAULT_USER_AGENT, max_tries: int = 3
) -> None:
self.url = str(url).strip().replace(" ", "%20")
self.user_agent = user_agent
self.headers: Dict[str, str] = {"User-Agent": self.user_agent}
self.payload: Dict[str, str] = {"url": self.url}
self.endpoint: str = "https://archive.org/wayback/available"
self.max_tries: int = max_tries
self.tries: int = 0
self.last_api_call_unix_time: int = int(time.time())
self.api_call_time_gap: int = 5
self.json: Optional[ResponseJSON] = None
self.response: Optional[Response] = None
def __repr__(self) -> str:
"""
Same as string representation, just return the archive URL as a string.
"""
return str(self)
def __str__(self) -> str:
"""
String representation of the class. If at least one API
call was successfully made then return the archive URL
as a string. Else returns "" (empty string literal).
"""
# __str__ can not return anything other than a string object,
# so if a string repr is asked for even before making an API request,
# just return "".
if not self.json:
return ""
return self.archive_url
def setup_json(self) -> Optional[ResponseJSON]:
"""
Makes the API call to the availability API and sets the JSON response
to the json attribute of the instance, and also returns the json
attribute.
time_diff and sleep_time make sure that you are not making too many
requests in a short interval of time; making too many requests is bad
as the Wayback Machine may reject them above a certain threshold.
The end-user can change the api_call_time_gap attribute of the instance
to increase or decrease the default time gap between two successive API
calls, but it is not recommended to increase it.
"""
time_diff = int(time.time()) - self.last_api_call_unix_time
sleep_time = self.api_call_time_gap - time_diff
@@ -69,44 +108,56 @@ class WaybackMachineAvailabilityAPI:
self.last_api_call_unix_time = int(time.time())
self.tries += 1
try:
self.json = None if self.response is None else self.response.json()
except json.decoder.JSONDecodeError as json_decode_error:
raise InvalidJSONInAvailabilityAPIResponse(
f"Response data:\n{self.response.text}"
) from json_decode_error
return self.json
def timestamp(self) -> datetime:
"""
Converts the timestamp from the JSON response to a datetime object.
If the json attribute of the instance is None, it implies that either
the last API call failed or one was never made.
If there is no JSON, or there is JSON but no timestamp in it, then
the maximum possible value for a datetime object is returned.
If you get a URL as a response from the availability API, it is
guaranteed that you can get the datetime object from the timestamp.
"""
if self.json is None or "archived_snapshots" not in self.json:
return datetime.max
if (
self.json is not None
and "archived_snapshots" in self.json
and self.json["archived_snapshots"] is not None
and "closest" in self.json["archived_snapshots"]
and self.json["archived_snapshots"]["closest"] is not None
and "timestamp" in self.json["archived_snapshots"]["closest"]
):
return datetime.strptime(
self.json["archived_snapshots"]["closest"]["timestamp"], "%Y%m%d%H%M%S"
)
raise ValueError("Timestamp not found in the Availability API's JSON response.")
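The guarded branch above relies on the fact that a 14-digit Wayback timestamp parses directly with strptime; a minimal demonstration (the value is illustrative, taken from the header fixtures earlier in this diff):

```python
from datetime import datetime

# "YYYYMMDDHHMMSS" maps one-to-one onto the strptime format string.
dt = datetime.strptime("20210102094009", "%Y%m%d%H%M%S")
```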
@property
def archive_url(self) -> str:
"""
Reads the JSON response data and returns the archive URL if found;
raises ArchiveNotInAvailabilityAPIResponse if it is not found.
"""
archive_url = ""
data = self.json
# If the user didn't invoke oldest, newest or near but tries to access
# the archive_url attribute, then assume that they are fine with any
# archive and invoke the oldest method.
if not data:
self.oldest()
@@ -116,18 +167,21 @@ class WaybackMachineAvailabilityAPI:
while (self.tries < self.max_tries) and (
not data or not data["archived_snapshots"]
):
self.setup_json() # It makes a new API call
data = self.json # setup_json() updates value of json attribute
# If exhausted max_tries, then give up and
# raise ArchiveNotInAvailabilityAPIResponse.
if not data or not data["archived_snapshots"]:
raise ArchiveNotInAvailabilityAPIResponse(
"Archive not found in the availability "
"API response, the URL you requested may not have any archives "
"yet. You may retry after some time or archive the webpage now.\n"
"Response data:\n"
""
if self.response is None
else self.response.text
)
else:
archive_url = data["archived_snapshots"]["closest"]["url"]
@@ -136,63 +190,57 @@ class WaybackMachineAvailabilityAPI:
)
return archive_url
def oldest(self) -> "WaybackMachineAvailabilityAPI":
"""
Passes the date 1994-01-01 to near which should return the oldest archive
because Wayback Machine was started in May, 1996 and it is assumed that
there would be no archive older than January 1, 1994.
"""
return self.near(year=1994, month=1, day=1)
def newest(self) -> "WaybackMachineAvailabilityAPI":
"""
Passes the current UNIX time to near() for retrieving the newest archive
from the availability API.
Remember UNIX time is UTC and Wayback Machine is also UTC based.
"""
return self.near(unix_timestamp=int(time.time()))
def near(
self,
year: Optional[int] = None,
month: Optional[int] = None,
day: Optional[int] = None,
hour: Optional[int] = None,
minute: Optional[int] = None,
unix_timestamp: Optional[int] = None,
) -> "WaybackMachineAvailabilityAPI":
"""
The most important method of this Class, oldest() and newest() are
dependent on it.
It generates the timestamp based on the input either by calling the
unix_timestamp_to_wayback_timestamp or wayback_timestamp method with
appropriate arguments for their respective parameters.
Adds the timestamp to the payload dictionary.
And finally invokes the setup_json method to make the API call, then
returns the instance.
"""
if unix_timestamp:
timestamp = unix_timestamp_to_wayback_timestamp(unix_timestamp)
else:
now = datetime.utcnow().timetuple()
timestamp = wayback_timestamp(
year=now.tm_year if year is None else year,
month=now.tm_mon if month is None else month,
day=now.tm_mday if day is None else day,
hour=now.tm_hour if hour is None else hour,
minute=now.tm_min if minute is None else minute,
)
self.payload["timestamp"] = timestamp
self.setup_json()
return self
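near() above builds the timestamp through the two helpers that this release moves into waybackpy.utils. A self-contained sketch of what those helpers presumably look like, reconstructed from the removed method bodies in this same diff (signatures assumed, not verified against utils.py):

```python
from datetime import datetime

def unix_timestamp_to_wayback_timestamp(unix_timestamp):
    # UNIX time (UTC) -> 14-digit YYYYMMDDHHMMSS string.
    return datetime.utcfromtimestamp(int(unix_timestamp)).strftime("%Y%m%d%H%M%S")

def wayback_timestamp(**kwargs):
    # Zero-pads month, day, hour and minute so the parts concatenate
    # into the Wayback Machine's YYYYMMDDhhmm prefix format.
    return "".join(
        str(kwargs[key]).zfill(2)
        for key in ("year", "month", "day", "hour", "minute")
    )
```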


@@ -1,88 +1,144 @@
"""
This module interfaces the Wayback Machine's CDX server API.
The module has WaybackMachineCDXServerAPI which should be used by the users of
this module to consume the CDX server API.
WaybackMachineCDXServerAPI has a snapshot method that yields the snapshots, and
the snapshots are yielded as instances of the CDXSnapshot class.
"""
import time
from datetime import datetime
from typing import Dict, Generator, List, Optional, Union, cast
from .cdx_snapshot import CDXSnapshot
from .cdx_utils import (
check_collapses,
check_filters,
check_match_type,
check_sort,
full_url,
get_response,
get_total_pages,
)
from .exceptions import NoCDXRecordFound, WaybackError
from .utils import (
DEFAULT_USER_AGENT,
unix_timestamp_to_wayback_timestamp,
wayback_timestamp,
)
class WaybackMachineCDXServerAPI:
"""
Class that interfaces the CDX server API of the Wayback Machine.
snapshot() returns a generator that can be iterated upon by the end-user;
the generator yields the snapshots/entries as instances of CDXSnapshot to
make usage easy. Just use "." to access any attribute, as the attributes
are accessible via dot notation.
"""
# start_timestamp: from, can not use from as it's a keyword
# end_timestamp: to, not using to as can not use from
def __init__(
self,
url: str,
user_agent: str = DEFAULT_USER_AGENT,
start_timestamp: Optional[str] = None,
end_timestamp: Optional[str] = None,
filters: Optional[List[str]] = None,
match_type: Optional[str] = None,
sort: Optional[str] = None,
gzip: Optional[str] = None,
collapses: Optional[List[str]] = None,
limit: Optional[str] = None,
max_tries: int = 3,
use_pagination: bool = False,
closest: Optional[str] = None,
) -> None:
self.url = str(url).strip().replace(" ", "%20")
self.user_agent = user_agent
self.start_timestamp = None if start_timestamp is None else str(start_timestamp)
self.end_timestamp = None if end_timestamp is None else str(end_timestamp)
self.filters = [] if filters is None else filters
check_filters(self.filters)
self.match_type = None if match_type is None else str(match_type).strip()
check_match_type(self.match_type, self.url)
self.sort = None if sort is None else str(sort).strip()
check_sort(self.sort)
self.gzip = gzip
self.collapses = [] if collapses is None else collapses
check_collapses(self.collapses)
self.limit = 25000 if limit is None else limit
self.max_tries = max_tries
self.use_pagination = use_pagination
self.closest = None if closest is None else str(closest)
self.last_api_request_url: Optional[str] = None
self.endpoint = "https://web.archive.org/cdx/search/cdx"
def cdx_api_manager(
self, payload: Dict[str, str], headers: Dict[str, str]
) -> Generator[str, None, None]:
"""
This method uses the pagination API of the CDX server if
use_pagination attribute is True else uses the standard
CDX server response data.
"""
# When using the pagination API of the CDX server.
if self.use_pagination is True:
total_pages = get_total_pages(self.url, self.user_agent)
successive_blank_pages = 0
for i in range(total_pages):
payload["page"] = str(i)
url = full_url(self.endpoint, params=payload)
res = get_response(url, headers=headers)
if isinstance(res, Exception):
raise res
self.last_api_request_url = url
text = res.text
# Reset the counter if the last page was blank
# but the current page is not.
if successive_blank_pages == 1:
if len(text) != 0:
successive_blank_pages = 0
# Increase the successive page counter on encountering a
# blank page.
if len(text) == 0:
successive_blank_pages += 1
# If two successive pages are blank
# then we don't have any more pages left to
# iterate.
if successive_blank_pages >= 2:
break
yield text
# When not using the pagination API of the CDX server
else:
payload["showResumeKey"] = "true"
payload["limit"] = str(self.limit)
resume_key = None
more = True
while more:
if resume_key:
payload["resumeKey"] = resume_key
url = full_url(self.endpoint, params=payload)
res = get_response(url, headers=headers)
if isinstance(res, Exception):
raise res
self.last_api_request_url = url
@@ -97,63 +153,235 @@ class WaybackMachineCDXServerAPI:
if len(second_last_line) == 0:
resume_key = lines[-1].strip()
text = text.replace(resume_key, "", 1).strip()
more = True
yield text
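The pagination termination rule above (stop after two successive blank pages) can be sketched independently of the HTTP layer; here `fetch_page` is a hypothetical stand-in for the request to the CDX server, and the reset/increment logic is slightly condensed but equivalent:

```python
def iter_pages(fetch_page, total_pages):
    successive_blank_pages = 0
    for i in range(total_pages):
        text = fetch_page(i)
        # A non-blank page resets the counter; a blank one increments it.
        if len(text) != 0:
            successive_blank_pages = 0
        else:
            successive_blank_pages += 1
        # Two successive blank pages mean no more data is coming.
        if successive_blank_pages >= 2:
            break
        yield text
```

Note that a single blank page is still yielded; only the second consecutive blank stops iteration.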
def add_payload(self, payload: Dict[str, str]) -> None:
"""
Populates the payload dictionary with the request parameters.
"""
if self.start_timestamp:
payload["from"] = self.start_timestamp
if self.end_timestamp:
payload["to"] = self.end_timestamp
if self.gzip is None:
payload["gzip"] = "false"
if self.closest:
payload["closest"] = self.closest
if self.match_type:
payload["matchType"] = self.match_type
if self.sort:
payload["sort"] = self.sort
if self.filters and len(self.filters) > 0:
for i, _filter in enumerate(self.filters):
payload["filter" + str(i)] = _filter
if self.collapses and len(self.collapses) > 0:
for i, collapse in enumerate(self.collapses):
payload["collapse" + str(i)] = collapse
# No need to return anything, as the dictionary is mutated in place.
payload["url"] = self.url
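The numbering scheme used above for repeated query parameters (filter0, filter1, ... and collapse0, collapse1, ...) can be shown with a small standalone sketch (the function name is illustrative, not part of the library):

```python
def build_payload(url, filters=(), collapses=(), closest=None, sort=None):
    payload = {}
    if closest:
        payload["closest"] = closest
    if sort:
        payload["sort"] = sort
    # Each filter/collapse gets its own numbered key.
    for i, _filter in enumerate(filters):
        payload["filter" + str(i)] = _filter
    for i, collapse in enumerate(collapses):
        payload["collapse" + str(i)] = collapse
    payload["url"] = url
    return payload
```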
def before(
self,
year: Optional[int] = None,
month: Optional[int] = None,
day: Optional[int] = None,
hour: Optional[int] = None,
minute: Optional[int] = None,
unix_timestamp: Optional[int] = None,
wayback_machine_timestamp: Optional[Union[int, str]] = None,
) -> CDXSnapshot:
"""
Gets the nearest archive before the given datetime.
"""
if unix_timestamp:
timestamp = unix_timestamp_to_wayback_timestamp(unix_timestamp)
elif wayback_machine_timestamp:
timestamp = str(wayback_machine_timestamp)
else:
now = datetime.utcnow().timetuple()
timestamp = wayback_timestamp(
year=now.tm_year if year is None else year,
month=now.tm_mon if month is None else month,
day=now.tm_mday if day is None else day,
hour=now.tm_hour if hour is None else hour,
minute=now.tm_min if minute is None else minute,
)
self.closest = timestamp
self.sort = "closest"
self.limit = 25000
for snapshot in self.snapshots():
if snapshot.timestamp < timestamp:
return snapshot
# If a snapshot isn't returned, then none were found.
raise NoCDXRecordFound(
"No records were found before the given date for the query. "
"Either there are no archives before the given date, "
"the URL may not have any archives, or the URL may have been "
"recently archived and is still not available on the CDX server."
)
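The selection rule used by before() (and mirrored by after()) can be isolated from the CDX machinery: snapshots arrive sorted by closeness to the target (sort="closest"), and the first one strictly earlier wins. A hypothetical standalone sketch:

```python
def first_before(timestamps, target):
    # Timestamps are YYYYMMDDhhmmss strings, so lexicographic
    # comparison equals chronological comparison.
    for ts in timestamps:  # assumed sorted by closeness to target
        if ts < target:
            return ts
    raise LookupError("no snapshot earlier than the target")
```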
def after(
self,
year: Optional[int] = None,
month: Optional[int] = None,
day: Optional[int] = None,
hour: Optional[int] = None,
minute: Optional[int] = None,
unix_timestamp: Optional[int] = None,
wayback_machine_timestamp: Optional[Union[int, str]] = None,
) -> CDXSnapshot:
"""
Gets the nearest archive after the given datetime.
"""
if unix_timestamp:
timestamp = unix_timestamp_to_wayback_timestamp(unix_timestamp)
elif wayback_machine_timestamp:
timestamp = str(wayback_machine_timestamp)
else:
now = datetime.utcnow().timetuple()
timestamp = wayback_timestamp(
year=now.tm_year if year is None else year,
month=now.tm_mon if month is None else month,
day=now.tm_mday if day is None else day,
hour=now.tm_hour if hour is None else hour,
minute=now.tm_min if minute is None else minute,
)
self.closest = timestamp
self.sort = "closest"
self.limit = 25000
for snapshot in self.snapshots():
if snapshot.timestamp > timestamp:
return snapshot
# If a snapshot isn't returned, then none were found.
raise NoCDXRecordFound(
"No records were found after the given date for the query. "
"Either there are no archives after the given date, "
"the URL may not have any archives, or the URL may have been "
"recently archived and is still not available on the CDX server."
)
def near(
self,
year: Optional[int] = None,
month: Optional[int] = None,
day: Optional[int] = None,
hour: Optional[int] = None,
minute: Optional[int] = None,
unix_timestamp: Optional[int] = None,
wayback_machine_timestamp: Optional[Union[int, str]] = None,
) -> CDXSnapshot:
"""
Fetches the archive closest to the given datetime; it can only return
a single snapshot. If you want more snapshots, do not use this method;
use the snapshots() generator instead.
"""
if unix_timestamp:
timestamp = unix_timestamp_to_wayback_timestamp(unix_timestamp)
elif wayback_machine_timestamp:
timestamp = str(wayback_machine_timestamp)
else:
now = datetime.utcnow().timetuple()
timestamp = wayback_timestamp(
year=now.tm_year if year is None else year,
month=now.tm_mon if month is None else month,
day=now.tm_mday if day is None else day,
hour=now.tm_hour if hour is None else hour,
minute=now.tm_min if minute is None else minute,
)
self.closest = timestamp
self.sort = "closest"
self.limit = 1
first_snapshot = None
for snapshot in self.snapshots():
first_snapshot = snapshot
break
if not first_snapshot:
raise NoCDXRecordFound(
"Wayback Machine's CDX server did not return any records "
"for the query. The URL may not have any archives "
"on the Wayback Machine, or the URL may have been recently "
"archived and is still not available on the CDX server."
)
return first_snapshot
def newest(self) -> CDXSnapshot:
"""
Passes the current UNIX time to near() for retrieving the newest archive
from the CDX server.
Remember UNIX time is UTC and Wayback Machine is also UTC based.
"""
return self.near(unix_timestamp=int(time.time()))
def oldest(self) -> CDXSnapshot:
"""
Passes the date 1994-01-01 to near which should return the oldest archive
because Wayback Machine was started in May, 1996 and it is assumed that
there would be no archive older than January 1, 1994.
"""
return self.near(year=1994, month=1, day=1)
def snapshots(self) -> Generator[CDXSnapshot, None, None]:
"""
This function yields the CDX data lines as snapshots.
As it is a generator, it is exhaustible. The reasons that this is
a generator and not a list are:
a) The CDX server API can return millions of entries for a query and
a list is not suitable for such cases.
b) Preventing memory usage issues: as said before, this method may yield
millions of records for some queries and your system may not have enough
memory for such a big list. Also remember this if outputting to Jupyter
Notebooks.
The objects yielded by this method are instances of the CDXSnapshot class;
you can access the attributes of the entries as attributes of the
instance itself.
payload: Dict[str, str] = {}
headers = {"User-Agent": self.user_agent}
self.add_payload(payload)
entries = self.cdx_api_manager(payload, headers)
for entry in entries:
if entry.isspace() or len(entry) <= 1 or not entry:
continue
# Each line is a snapshot aka entry of the CDX server API.
# We are able to split the page by lines because it only
# splits the lines of a single page and not all the entries
# at once, thus there should be no issues of too much memory usage.
snapshot_list = entry.split("\n")
for snapshot in snapshot_list:
# 14 + 32 == 46 (timestamp + digest); ignore the invalid entries.
# They are invalid if their length is smaller than the sum of the
# lengths of a standard wayback timestamp and a standard digest.
if len(snapshot) < 46:
continue
properties = {
properties: Dict[str, Optional[str]] = {
"urlkey": None,
"timestamp": None,
"original": None,
@ -163,22 +391,16 @@ class WaybackMachineCDXServerAPI:
"length": None,
}
property_value = snapshot.split(" ")
total_property_values = len(property_value)
warranted_total_property_values = len(properties)
if total_property_values != warranted_total_property_values:
raise WaybackError(
f"Snapshot returned by CDX API has {total_property_values} prop"
f"erties instead of expected {warranted_total_property_values} "
f"properties.\nProblematic Snapshot: {snapshot}"
)
(
@ -189,6 +411,6 @@ class WaybackMachineCDXServerAPI:
properties["statuscode"],
properties["digest"],
properties["length"],
) = property_value
yield CDXSnapshot(cast(Dict[str, str], properties))
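The parsing loop above can be sketched in isolation. This is a minimal, self-contained stand-in (not the library code) using a hypothetical sample CDX line:

```python
from typing import Dict, Generator

def parse_cdx_page(text: str) -> Generator[Dict[str, str], None, None]:
    """Yield one property dict per valid CDX line (a sketch of the loop above)."""
    fields = ["urlkey", "timestamp", "original", "mimetype",
              "statuscode", "digest", "length"]
    for line in text.split("\n"):
        # 14 (timestamp) + 32 (digest) == 46; shorter lines are invalid.
        if len(line) < 46:
            continue
        values = line.split(" ")
        if len(values) != len(fields):
            raise ValueError(f"Expected {len(fields)} properties, got {len(values)}")
        yield dict(zip(fields, values))

# Hypothetical one-line CDX page.
page = ("org,archive)/ 20220101000000 https://archive.org/ text/html "
        "200 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 1234\n")
snapshots = list(parse_cdx_page(page))
```

Because the real method is a generator, millions of records can be consumed one at a time instead of being materialized as a list.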


@ -1,35 +1,90 @@
"""
Module that contains the CDXSnapshot class. The CDX index format is plain
text data; each line ('record') indicates a crawled document, and these
lines are cast to CDXSnapshot objects for easier access.
"""
from datetime import datetime
from typing import Dict
class CDXSnapshot:
"""
Class for the CDX snapshot lines ('records') returned by the CDX API.
Each valid line of the CDX API is cast to a CDXSnapshot object
by the CDX API interface; just use "." to access any attribute of the
CDX server API snapshot.
This provides the end-user the ease of using the data as attributes
of the CDXSnapshot.
The string representation of the class is identical to the line returned
by the CDX server API.
Besides all the attributes of the CDX server API this class also provides
the archive_url attribute, which is the archive URL of the snapshot.
Attributes of this class, what they represent and what they are useful for:
urlkey: The document captured, expressed as a SURT
SURT stands for Sort-friendly URI Reordering Transform, and is a
transformation applied to URIs which makes their left-to-right
representation better match the natural hierarchy of domain names.
A URI <scheme://domain.tld/path?query> has SURT
form <scheme://(tld,domain,)/path?query>.
timestamp: The timestamp of the archive, format is yyyyMMddhhmmss and type
is string.
datetime_timestamp: The timestamp as a datetime object.
original: The original URL of the archive. If archive_url is
https://web.archive.org/web/20220113130051/https://google.com then the
original URL is https://google.com
mimetype: The document's file type, e.g. text/html
statuscode: HTTP response code for the document at the time of its crawling
digest: Base32-encoded SHA-1 checksum of the document, for distinguishing
it from other captures
length: The document's size in bytes in the WARC file
archive_url: The archive url of the snapshot, this is not returned by the
CDX server API but created by this class on init.
"""
def __init__(self, properties: Dict[str, str]) -> None:
self.urlkey: str = properties["urlkey"]
self.timestamp: str = properties["timestamp"]
self.datetime_timestamp: datetime = datetime.strptime(
self.timestamp, "%Y%m%d%H%M%S"
)
self.original: str = properties["original"]
self.mimetype: str = properties["mimetype"]
self.statuscode: str = properties["statuscode"]
self.digest: str = properties["digest"]
self.length: str = properties["length"]
self.archive_url: str = (
f"https://web.archive.org/web/{self.timestamp}/{self.original}"
)
def __repr__(self) -> str:
"""
Same as __str__()
"""
return str(self)
def __str__(self) -> str:
"""
The string representation is the same as the line returned by the
CDX server API for this snapshot.
"""
return (
f"{self.urlkey} {self.timestamp} {self.original} "
f"{self.mimetype} {self.statuscode} {self.digest} {self.length}"
)
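A quick end-to-end sketch of what the class does with a line: split the seven space-separated properties, derive archive_url (which the CDX server does not return), and parse the timestamp. All values below are hypothetical:

```python
from datetime import datetime

# A hypothetical CDX line split into its seven properties.
line = ("com,google)/ 20220113130051 https://google.com text/html "
        "200 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 5432")
keys = ["urlkey", "timestamp", "original", "mimetype",
        "statuscode", "digest", "length"]
props = dict(zip(keys, line.split(" ")))

# archive_url is derived on init, not returned by the CDX server.
archive_url = f"https://web.archive.org/web/{props['timestamp']}/{props['original']}"
# datetime_timestamp is the yyyyMMddhhmmss string parsed into a datetime.
dt = datetime.strptime(props["timestamp"], "%Y%m%d%H%M%S")
```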


@ -1,128 +1,201 @@
"""
Utility functions required for accessing the CDX server API.
These are here in this module so that we don't make any module too
long.
"""
import re
from typing import Any, Dict, List, Optional, Union
from urllib.parse import quote
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
from .exceptions import BlockedSiteError, WaybackError
from .utils import DEFAULT_USER_AGENT
def get_total_pages(url: str, user_agent: str = DEFAULT_USER_AGENT) -> int:
"""
When using pagination, adding showNumPages=true to the request URL
makes the CDX server return an integer, which is the number of CDX
pages available for us to query using the pagination API.
"""
endpoint = "https://web.archive.org/cdx/search/cdx?"
payload = {"showNumPages": "true", "url": str(url)}
headers = {"User-Agent": user_agent}
request_url = full_url(endpoint, params=payload)
response = get_response(request_url, headers=headers)
check_for_blocked_site(response, url)
if isinstance(response, requests.Response):
return int(response.text.strip())
raise response
def check_for_blocked_site(
response: Union[requests.Response, Exception], url: Optional[str] = None
) -> None:
"""
Checks whether the URL can be archived by the Wayback Machine or not.
The robots.txt policy of the site may prevent the Wayback Machine.
"""
# see https://github.com/akamhy/waybackpy/issues/157
# the following if block is to make mypy happy.
if isinstance(response, Exception):
raise response
if not url:
url = "The requested content"
if (
"org.archive.util.io.RuntimeIOException: "
+ "org.archive.wayback.exception.AdministrativeAccessControlException: "
+ "Blocked Site Error"
in response.text.strip()
):
raise BlockedSiteError(
f"{url} is excluded from Wayback Machine by the site's robots.txt policy."
)
def full_url(endpoint: str, params: Dict[str, Any]) -> str:
"""
As the function's name implies, it returns the full URL. But why do we
need a function for generating the full URL? The CDX server supports
multiple arguments for parameters such as filter and collapse, and this
function adds them without overwriting earlier added arguments.
"""
if not params:
return endpoint
_full_url = endpoint if endpoint.endswith("?") else (endpoint + "?")
for key, val in params.items():
key = "filter" if key.startswith("filter") else key
key = "collapse" if key.startswith("collapse") else key
amp = "" if _full_url.endswith("?") else "&"
val = quote(str(val), safe="")
_full_url += f"{amp}{key}={val}"
return _full_url
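The parameter builder above can be exercised in isolation; this sketch reproduces its logic with a hypothetical query, showing how filter0/filter1 keys collapse into repeated filter= arguments:

```python
from typing import Any, Dict
from urllib.parse import quote

def full_url(endpoint: str, params: Dict[str, Any]) -> str:
    """Append params; filterN/collapseN keys become repeated filter/collapse args."""
    if not params:
        return endpoint
    _full_url = endpoint if endpoint.endswith("?") else (endpoint + "?")
    for key, val in params.items():
        key = "filter" if key.startswith("filter") else key
        key = "collapse" if key.startswith("collapse") else key
        amp = "" if _full_url.endswith("?") else "&"
        _full_url += f"{amp}{key}={quote(str(val), safe='')}"
    return _full_url

# Hypothetical query with two filters.
url = full_url(
    "https://web.archive.org/cdx/search/cdx?",
    {"url": "example.com",
     "filter0": "statuscode:200",
     "filter1": "mimetype:text/html"},
)
```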
def get_response(
url: str,
headers: Optional[Dict[str, str]] = None,
retries: int = 5,
backoff_factor: float = 0.5,
) -> Union[requests.Response, Exception]:
"""
Makes get request to the CDX server and returns the response.
"""
session = requests.Session()
retries_ = Retry(
total=retries,
backoff_factor=backoff_factor,
status_forcelist=[500, 502, 503, 504],
)
session.mount("https://", HTTPAdapter(max_retries=retries_))
response = session.get(url, headers=headers)
session.close()
check_for_blocked_site(response)
return response
def check_filters(filters: List[str]) -> None:
"""
Check that the filter arguments passed by the end-user are valid.
If not valid then raise WaybackError.
"""
if not isinstance(filters, list):
raise WaybackError("filters must be a list.")
# [!]field:regex
for _filter in filters:
match = re.search(
r"(\!?(?:urlkey|timestamp|original|mimetype|statuscode|digest|length)):(.*)",
_filter,
)
if match is None or len(match.groups()) != 2:
exc_message = f"Filter '{_filter}' is not following the cdx filter syntax."
raise WaybackError(exc_message)
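The filter syntax check above is a single regex; a small sketch with hypothetical filter strings shows which inputs pass:

```python
import re

# [!]field:regex — the same pattern the validator above uses.
FILTER_RE = re.compile(
    r"(\!?(?:urlkey|timestamp|original|mimetype|statuscode|digest|length)):(.*)"
)

def is_valid_filter(cdx_filter: str) -> bool:
    """Return True if the string follows the cdx filter syntax (sketch)."""
    match = FILTER_RE.search(cdx_filter)
    return match is not None and len(match.groups()) == 2

ok = is_valid_filter("statuscode:200")
negated = is_valid_filter("!mimetype:text/html")
bad = is_valid_filter("not-a-field:200")
```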
def check_collapses(collapses: List[str]) -> bool:
"""
Check that the collapse arguments passed by the end-user are valid.
If not valid then raise WaybackError.
"""
if not isinstance(collapses, list):
raise WaybackError("collapses must be a list.")
if len(collapses) == 0:
return True
for collapse in collapses:
match = re.search(
r"(urlkey|timestamp|original|mimetype|statuscode|digest|length)"
r"(:?[0-9]{1,99})?",
collapse,
)
if match is None or len(match.groups()) != 2:
exc_message = (
f"collapse argument '{collapse}' "
"is not following the cdx collapse syntax."
)
raise WaybackError(exc_message)
return True
def check_match_type(match_type: Optional[str], url: str) -> bool:
"""
Check that the match_type argument passed by the end-user is valid.
If not valid then raise WaybackError.
"""
legal_match_type = ["exact", "prefix", "host", "domain"]
if not match_type:
return True
if "*" in url:
raise WaybackError(
"Can not use wildcard in the URL along with the match_type arguments."
)
if match_type not in legal_match_type:
exc_message = (
f"{match_type} is not an allowed match type.\n"
"Use one from 'exact', 'prefix', 'host' or 'domain'"
)
raise WaybackError(exc_message)
return True
def check_sort(sort: Optional[str]) -> bool:
"""
Check that the sort argument passed by the end-user is valid.
If not valid then raise WaybackError.
"""
legal_sort = ["default", "closest", "reverse"]
if not sort:
return True
if sort not in legal_sort:
exc_message = (
f"{sort} is not an allowed argument for sort.\n"
"Use one from 'default', 'closest' or 'reverse'"
)
raise WaybackError(exc_message)
return True
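The sort validator above follows the same allow-list pattern; this sketch reproduces it, raising ValueError where the library raises WaybackError:

```python
from typing import Optional

def check_sort(sort: Optional[str]) -> bool:
    """Validate the sort argument (sketch; the library raises WaybackError)."""
    legal_sort = ["default", "closest", "reverse"]
    if not sort:
        return True
    if sort not in legal_sort:
        raise ValueError(
            f"{sort} is not an allowed argument for sort.\n"
            "Use one from 'default', 'closest' or 'reverse'"
        )
    return True

accepted = check_sort("closest") and check_sort(None)
try:
    check_sort("ascending")
    rejected = False
except ValueError:
    rejected = True
```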


@ -1,17 +1,168 @@
"""
Module responsible for enabling waybackpy to function as a CLI tool.
"""
import os
import json as JSON
import random
import re
import string
from typing import Any, Dict, Generator, List, Optional
import click
import requests
from . import __version__
from .cdx_api import WaybackMachineCDXServerAPI
from .exceptions import BlockedSiteError, NoCDXRecordFound
from .save_api import WaybackMachineSaveAPI
from .utils import DEFAULT_USER_AGENT
from .wrapper import Url
def handle_cdx_closest_derivative_methods(
cdx_api: "WaybackMachineCDXServerAPI",
oldest: bool,
near: bool,
newest: bool,
near_args: Optional[Dict[str, int]] = None,
) -> None:
"""
Handles the closest parameter derivative methods.
near, newest and oldest use the closest parameter with active
closest based sorting.
"""
try:
if near:
if near_args:
archive_url = cdx_api.near(**near_args).archive_url
else:
archive_url = cdx_api.near().archive_url
elif newest:
archive_url = cdx_api.newest().archive_url
elif oldest:
archive_url = cdx_api.oldest().archive_url
click.echo("Archive URL:")
click.echo(archive_url)
except NoCDXRecordFound as exc:
click.echo(click.style("NoCDXRecordFound: ", fg="red") + str(exc), err=True)
except BlockedSiteError as exc:
click.echo(click.style("BlockedSiteError: ", fg="red") + str(exc), err=True)
def handle_cdx(data: List[Any]) -> None:
"""
Handles the CDX CLI options and output format.
"""
url = data[0]
user_agent = data[1]
start_timestamp = data[2]
end_timestamp = data[3]
cdx_filter = data[4]
collapse = data[5]
cdx_print = data[6]
limit = data[7]
gzip = data[8]
match_type = data[9]
sort = data[10]
use_pagination = data[11]
closest = data[12]
filters = list(cdx_filter)
collapses = list(collapse)
cdx_print = list(cdx_print)
cdx_api = WaybackMachineCDXServerAPI(
url,
user_agent=user_agent,
start_timestamp=start_timestamp,
end_timestamp=end_timestamp,
closest=closest,
filters=filters,
match_type=match_type,
sort=sort,
use_pagination=use_pagination,
gzip=gzip,
collapses=collapses,
limit=limit,
)
snapshots = cdx_api.snapshots()
for snapshot in snapshots:
if len(cdx_print) == 0:
click.echo(snapshot)
else:
output_string = []
if any(val in cdx_print for val in ["urlkey", "url-key", "url_key"]):
output_string.append(snapshot.urlkey)
if any(
val in cdx_print for val in ["timestamp", "time-stamp", "time_stamp"]
):
output_string.append(snapshot.timestamp)
if "original" in cdx_print:
output_string.append(snapshot.original)
if any(val in cdx_print for val in ["mimetype", "mime-type", "mime_type"]):
output_string.append(snapshot.mimetype)
if any(
val in cdx_print for val in ["statuscode", "status-code", "status_code"]
):
output_string.append(snapshot.statuscode)
if "digest" in cdx_print:
output_string.append(snapshot.digest)
if "length" in cdx_print:
output_string.append(snapshot.length)
if any(
val in cdx_print for val in ["archiveurl", "archive-url", "archive_url"]
):
output_string.append(snapshot.archive_url)
click.echo(" ".join(output_string))
def save_urls_on_file(url_gen: Generator[str, None, None]) -> None:
"""
Save output of CDX API on file.
Mainly here because of backwards compatibility.
"""
domain = None
sys_random = random.SystemRandom()
uid = "".join(
sys_random.choice(string.ascii_lowercase + string.digits) for _ in range(6)
)
url_count = 0
file_name = None
for url in url_gen:
url_count += 1
if not domain:
match = re.search("https?://([A-Za-z_0-9.-]+).*", url)
domain = "domain-unknown"
if match:
domain = match.group(1)
file_name = f"{domain}-urls-{uid}.txt"
file_path = os.path.join(os.getcwd(), file_name)
if not os.path.isfile(file_path):
with open(file_path, "w+", encoding="utf-8") as file:
file.close()
with open(file_path, "a", encoding="utf-8") as file:
file.write(f"{url}\n")
click.echo(url)
if url_count > 0:
click.echo(
f"\n\n{url_count} URLs saved inside '{file_name}' in the current "
+ "working directory."
)
else:
click.echo("No known URLs found. Please try a different input!")
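The file-naming scheme above (domain plus a random 6-character uid) can be sketched on its own; the URL below is a hypothetical input:

```python
import random
import re
import string

def output_file_name(url: str) -> str:
    """Derive '<domain>-urls-<uid>.txt' (sketch of the naming scheme above)."""
    match = re.search("https?://([A-Za-z_0-9.-]+).*", url)
    domain = match.group(1) if match else "domain-unknown"
    sys_random = random.SystemRandom()
    uid = "".join(
        sys_random.choice(string.ascii_lowercase + string.digits) for _ in range(6)
    )
    return f"{domain}-urls-{uid}.txt"

name = output_file_name("https://web.archive.org/web/2022/https://example.com/page")
```

SystemRandom draws from the OS entropy source, so concurrent runs are very unlikely to collide on the same file name.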
@click.command()
@click.option(
"-u", "--url", help="URL on which Wayback machine operations are to be performed."
@ -21,10 +172,17 @@ from .wrapper import Url
"--user-agent",
"--user_agent",
default=DEFAULT_USER_AGENT,
help=f"User agent, default value is '{DEFAULT_USER_AGENT}'.",
)
@click.option(
"-v", "--version", is_flag=True, default=False, help="Print waybackpy version."
)
@click.option(
"-l",
"--show-license",
"--show_license",
"--license",
is_flag=True,
default=False,
help="Show license of Waybackpy.",
)
@click.option(
"-n",
@ -34,24 +192,21 @@ from .wrapper import Url
"--archive-url",
default=False,
is_flag=True,
help="Retrieve the newest archive of URL.",
)
@click.option(
"-o",
"--oldest",
default=False,
is_flag=True,
help="Retrieve the oldest archive of URL.",
)
@click.option(
"-N",
"--near",
default=False,
is_flag=True,
help="Archive close to a specified time.",
)
@click.option("-Y", "--year", type=click.IntRange(1994, 9999), help="Year in integer.")
@click.option("-M", "--month", type=click.IntRange(1, 12), help="Month in integer.")
@ -70,7 +225,7 @@ from .wrapper import Url
"--headers",
default=False,
is_flag=True,
help="Headers data of the SavePageNow API.",
)
@click.option(
"-ku",
@ -95,153 +250,178 @@ from .wrapper import Url
help="Use with '--known_urls' to save the URLs in file at current directory.",
)
@click.option(
"-c",
"--cdx",
default=False,
is_flag=True,
help="Flag for using CDX API.",
)
@click.option(
"-st",
"--start-timestamp",
"--start_timestamp",
"--from",
help="Start timestamp for CDX API in yyyyMMddhhmmss format.",
)
@click.option(
"-et",
"--end-timestamp",
"--end_timestamp",
"--to",
help="End timestamp for CDX API in yyyyMMddhhmmss format.",
)
@click.option(
"-C",
"--closest",
help="Archive that is closest to the timestamp passed as argument to this "
+ "parameter.",
)
@click.option(
"-f",
"--filters",
"--cdx-filter",
"--cdx_filter",
"--filter",
multiple=True,
help="Filter on a specific field or all the CDX fields.",
)
@click.option(
"-mt",
"--match-type",
"--match_type",
help="The default behavior is to return matches for an exact URL. "
+ "However, the CDX server can also return results matching a certain prefix, "
+ "a certain host, or all sub-hosts by using the match_type parameter.",
)
@click.option(
"-st",
"--sort",
help="Choose one from default, closest or reverse. It returns sorted CDX entries "
+ "in the response.",
)
@click.option(
"-up",
"--use-pagination",
"--use_pagination",
default=False,
is_flag=True,
help="Use the pagination API of the CDX server instead of the default one.",
)
@click.option(
"-gz",
"--gzip",
help="To disable gzip compression pass false as argument to this parameter. "
+ "The default behavior is gzip compression enabled.",
)
@click.option(
"-c",
"--collapses",
"--collapse",
multiple=True,
help="Filtering or 'collapse' results based on a field, or a substring of a field.",
)
@click.option(
"-l",
"--limit",
help="Number of maximum record that CDX API is asked to return per API call, "
+ "default value is 25000 records.",
)
@click.option(
"-cp",
"--cdx-print",
"--cdx_print",
multiple=True,
help="Print only certain fields of the CDX API response, "
+ "if this parameter is not used then the plain text response of the CDX API "
+ "will be printed.",
)
def main( # pylint: disable=no-value-for-parameter
user_agent: str,
version: bool,
show_license: bool,
newest: bool,
oldest: bool,
near: bool,
save: bool,
headers: bool,
known_urls: bool,
subdomain: bool,
file: bool,
cdx: bool,
use_pagination: bool,
cdx_filter: List[str],
collapse: List[str],
cdx_print: List[str],
url: Optional[str] = None,
year: Optional[int] = None,
month: Optional[int] = None,
day: Optional[int] = None,
hour: Optional[int] = None,
minute: Optional[int] = None,
start_timestamp: Optional[str] = None,
end_timestamp: Optional[str] = None,
closest: Optional[str] = None,
match_type: Optional[str] = None,
sort: Optional[str] = None,
gzip: Optional[str] = None,
limit: Optional[str] = None,
) -> None:
"""\b
_ _
| | | |
__ ____ _ _ _| |__ __ _ ___| | ___ __ _ _
\\ \\ /\\ / / _` | | | | '_ \\ / _` |/ __| |/ / '_ \\| | | |
\\ V V / (_| | |_| | |_) | (_| | (__| <| |_) | |_| |
\\_/\\_/ \\__,_|\\__, |_.__/ \\__,_|\\___|_|\\_\\ .__/ \\__, |
__/ | | | __/ |
|___/ |_| |___/
waybackpy : Python package & CLI tool that interfaces the Wayback Machine API
https://github.com/akamhy/waybackpy
https://pypi.org/project/waybackpy
Released under the MIT License. Use the flag --license for license.
"""
if version:
click.echo(f"waybackpy version {__version__}")
elif show_license:
click.echo(
requests.get(
url="https://raw.githubusercontent.com/akamhy/waybackpy/master/LICENSE"
).text
)
elif url is None:
click.echo(
click.style("NoURLDetected: ", fg="red")
+ "No URL detected. "
+ "Please provide an URL.",
err=True,
)
elif oldest:
cdx_api = WaybackMachineCDXServerAPI(url, user_agent=user_agent)
handle_cdx_closest_derivative_methods(cdx_api, oldest, near, newest)
elif newest:
cdx_api = WaybackMachineCDXServerAPI(url, user_agent=user_agent)
handle_cdx_closest_derivative_methods(cdx_api, oldest, near, newest)
elif near:
cdx_api = WaybackMachineCDXServerAPI(url, user_agent=user_agent)
near_args = {}
keys = ["year", "month", "day", "hour", "minute"]
args_arr = [year, month, day, hour, minute]
for key, arg in zip(keys, args_arr):
if arg:
near_args[key] = arg
handle_cdx_closest_derivative_methods(
cdx_api, oldest, near, newest, near_args=near_args
)
elif save:
save_api = WaybackMachineSaveAPI(url, user_agent=user_agent)
save_api.save()
click.echo("Archive URL:")
@ -251,99 +431,44 @@ def main(
if headers:
click.echo("Save API headers:")
click.echo(save_api.headers)
elif known_urls:
wayback = Url(url, user_agent)
url_gen = wayback.known_urls(subdomain=subdomain)
if file:
save_urls_on_file(url_gen)
else:
for url_ in url_gen:
click.echo(url_)
elif cdx:
data = [
url,
user_agent,
start_timestamp,
end_timestamp,
cdx_filter,
collapse,
cdx_print,
limit,
gzip,
match_type,
sort,
use_pagination,
closest,
]
handle_cdx(data)
else:
click.echo(
click.style("NoCommandFound: ", fg="red")
+ "Only URL passed, but did not specify what to do with the URL. "
+ "Use --help flag for help using waybackpy.",
err=True,
)
if __name__ == "__main__":
main() # type: ignore # pylint: disable=no-value-for-parameter


@ -8,23 +8,35 @@ This module contains the set of Waybackpy's exceptions.
class WaybackError(Exception):
"""
Raised when Waybackpy can not return what you asked for.
1) Wayback Machine API Service is unreachable/down.
2) You passed illegal arguments.
All other exceptions are inherited from this main exception.
"""
class NoCDXRecordFound(WaybackError):
"""
No records returned by the CDX server for a query.
Raised when the user invokes near(), newest() or oldest() methods
and there are no archives.
"""
class BlockedSiteError(WaybackError):
"""
Raised when the archives for website/URLs that was excluded from Wayback
Machine are requested via the CDX server API.
"""
class TooManyRequestsError(WaybackError):
"""
Raised when you make more than 15 requests per
minute and the Wayback Machine returns 429.
See https://github.com/akamhy/waybackpy/issues/131
"""

waybackpy/py.typed Normal file


@ -1,48 +1,69 @@
"""
This module interfaces the Wayback Machine's SavePageNow (SPN) API.
The module has WaybackMachineSaveAPI class which should be used by the users of
this module to use the SavePageNow API.
"""
import re
import time
from datetime import datetime
from typing import Dict, Optional
import requests
from requests.adapters import HTTPAdapter
from requests.models import Response
from requests.structures import CaseInsensitiveDict
from urllib3.util.retry import Retry
from .exceptions import MaximumSaveRetriesExceeded, TooManyRequestsError, WaybackError
from .utils import DEFAULT_USER_AGENT
class WaybackMachineSaveAPI:
"""
WaybackMachineSaveAPI class provides an interface for saving URLs on the
Wayback Machine.
"""
def __init__(
self,
url: str,
user_agent: str = DEFAULT_USER_AGENT,
max_tries: int = 8,
) -> None:
self.url = str(url).strip().replace(" ", "%20")
self.request_url = "https://web.archive.org/save/" + self.url
self.user_agent = user_agent
self.request_headers: Dict[str, str] = {"User-Agent": self.user_agent}
if max_tries < 1:
raise ValueError("max_tries should be positive")
self.max_tries = max_tries
self.total_save_retries = 5
self.backoff_factor = 0.5
self.status_forcelist = [500, 502, 503, 504]
self._archive_url: Optional[str] = None
self.instance_birth_time = datetime.utcnow()
self.response: Optional[Response] = None
self.headers: Optional[CaseInsensitiveDict[str]] = None
self.status_code: Optional[int] = None
self.response_url: Optional[str] = None
self.cached_save: Optional[bool] = None
self.saved_archive: Optional[str] = None
@property
def archive_url(self) -> str:
"""
Returns the archive URL is already cached by _archive_url
else invoke the save method to save the archive which returns the
archive thus we return the methods return value.
"""
if self._archive_url:
return self._archive_url
return self.save()
def get_save_request_headers(self) -> None:
"""
Creates a session and tries 'retries' number of times to
retrieve the archive.
@ -66,20 +87,37 @@ class WaybackMachineSaveAPI:
)
session.mount("https://", HTTPAdapter(max_retries=retries))
self.response = session.get(self.request_url, headers=self.request_headers)
# requests.response.headers is requests.structures.CaseInsensitiveDict
self.headers = self.response.headers
self.status_code = self.response.status_code
self.response_url = self.response.url
session.close()
if self.status_code == 429:
# why wait 5 minutes and 429?
# see https://github.com/akamhy/waybackpy/issues/97
raise TooManyRequestsError(
f"Can not save '{self.url}'. "
f"Save request refused by the server. "
f"Save Page Now limits saving 15 URLs per minutes. "
f"Try waiting for 5 minutes and then try again."
)
# why 509?
# see https://github.com/akamhy/waybackpy/pull/99
# also https://t.co/xww4YJ0Iwc
if self.status_code == 509:
raise WaybackError(
f"Can not save '{self.url}'. You have probably reached the "
f"limit of active sessions."
)
def archive_url_parser(self) -> Optional[str]:
"""
Three regexes are used, in order, to search the response headers
for the archive URL; if none of them match, the response URL itself
is checked for the archive URL.
"""
regex1 = r"Content-Location: (/web/[0-9]{14}/.*)"
match = re.search(regex1, str(self.headers))
if match:
@@ -87,23 +125,26 @@ class WaybackMachineSaveAPI:
regex2 = r"rel=\"memento.*?(web\.archive\.org/web/[0-9]{14}/.*?)>"
match = re.search(regex2, str(self.headers))
if match:
if match is not None and len(match.groups()) == 1:
return "https://" + match.group(1)
regex3 = r"X-Cache-Key:\shttps(.*)[A-Z]{2}"
match = re.search(regex3, str(self.headers))
if match:
if match is not None and len(match.groups()) == 1:
return "https" + match.group(1)
if self.response_url:
self.response_url = self.response_url.strip()
if "web.archive.org/web" in self.response_url:
regex = r"web\.archive\.org/web/(?:[0-9]*?)/(?:.*)$"
match = re.search(regex, self.response_url)
if match:
return "https://" + match.group(0)
self.response_url = (
"" if self.response_url is None else self.response_url.strip()
)
regex4 = r"web\.archive\.org/web/(?:[0-9]*?)/(?:.*)$"
match = re.search(regex4, self.response_url)
if match is not None:
return "https://" + match.group(0)
def sleep(self, tries):
return None
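The first of those patterns can be exercised in isolation; the header string below is a made-up stand-in for `str(self.headers)`, not a real Save Page Now response:

```python
import re

# Hypothetical header dump for illustration only.
headers_text = "Content-Location: /web/20220315123456/https://example.com/"

# Same pattern as regex1 above: capture the /web/<14-digit-timestamp>/ path.
regex1 = r"Content-Location: (/web/[0-9]{14}/.*)"
match = re.search(regex1, headers_text)

archive_url = None
if match:
    # Prefix the captured path with the Wayback Machine host.
    archive_url = "https://web.archive.org" + match.group(1)
```

The other two patterns work the same way, only against different header fields.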
@staticmethod
def sleep(tries: int) -> None:
"""
Ensure that we wait some time between successive retries so that we
don't waste retries before the page is even captured by the Wayback
@@ -112,13 +153,12 @@ class WaybackMachineSaveAPI:
If tries is a multiple of 3, sleep 10 seconds; else sleep 5 seconds.
"""
sleep_seconds = 5
if tries % 3 == 0:
sleep_seconds = 10
time.sleep(sleep_seconds)
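The schedule is easy to check in isolation; a pure sketch of the same rule (the function name is illustrative, and `time.sleep` is left out so the rule can be inspected without waiting):

```python
def backoff_seconds(tries: int) -> int:
    # Every third try waits 10 seconds, the rest wait 5.
    return 10 if tries % 3 == 0 else 5


# The waits produced for tries 1 through 6.
schedule = [backoff_seconds(t) for t in range(1, 7)]
```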
def timestamp(self):
def timestamp(self) -> datetime:
"""
Read the timestamp off the archive URL and convert the Wayback Machine
timestamp to datetime object.
@@ -126,17 +166,22 @@ class WaybackMachineSaveAPI:
Also read the time from the archive URL and compare it to the instance
birth time.
If time on the archive is older than the instance creation time set the cached_save
to True else set it to False. The flag can be used to check if the Wayback Machine
didn't serve a Cached URL. It is quite common for the Wayback Machine to serve
cached archive if last archive was captured before last 45 minutes.
If the time on the archive is older than the instance creation time,
set cached_save to True, else set it to False. The flag can be used to
check whether the Wayback Machine served a cached archive instead of
making a fresh capture; it is quite common for the Wayback Machine to
serve a cached archive if the last capture was within the last 45 minutes.
"""
m = re.search(
r"https?://web\.archive.org/web/([0-9]{14})/http", self._archive_url
)
string_timestamp = m.group(1)
timestamp = datetime.strptime(string_timestamp, "%Y%m%d%H%M%S")
regex = r"https?://web\.archive.org/web/([0-9]{14})/http"
match = re.search(regex, str(self._archive_url))
if match is None or len(match.groups()) != 1:
raise ValueError(
f"Can not parse timestamp from archive URL, '{self._archive_url}'."
)
string_timestamp = match.group(1)
timestamp = datetime.strptime(string_timestamp, "%Y%m%d%H%M%S")
timestamp_unixtime = time.mktime(timestamp.timetuple())
instance_birth_time_unixtime = time.mktime(self.instance_birth_time.timetuple())
@@ -147,7 +192,7 @@ class WaybackMachineSaveAPI:
return timestamp
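Extracting the 14-digit timestamp from an archive URL can be sketched with a made-up URL (the URL below is illustrative only):

```python
import re
from datetime import datetime

# Illustrative archive URL, not a real capture.
archive_url = "https://web.archive.org/web/20220315123456/https://example.com/"

# Same shape as the regex above: capture the 14-digit Wayback timestamp.
match = re.search(r"https?://web\.archive\.org/web/([0-9]{14})/http", archive_url)
assert match is not None and len(match.groups()) == 1

# Convert yyyyMMddhhmmss into a datetime object.
ts = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
```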
def save(self):
def save(self) -> str:
"""
Calls the SavePageNow API of the Wayback Machine with required parameters
and headers to save the URL.
@@ -155,32 +200,26 @@ class WaybackMachineSaveAPI:
Raises MaximumSaveRetriesExceeded if the maximum retries are exhausted
and we were still unable to retrieve the archive from the Wayback Machine.
"""
self.saved_archive = None
tries = 0
while True:
if tries >= 1:
self.sleep(tries)
self.get_save_request_headers()
self.saved_archive = self.archive_url_parser()
if isinstance(self.saved_archive, str):
self._archive_url = self.saved_archive
self.timestamp()
return self.saved_archive
tries += 1
if tries >= self.max_tries:
raise MaximumSaveRetriesExceeded(
"Tried %s times but failed to save and retrieve the" % str(tries)
+ " archive for %s.\nResponse URL:\n%s \nResponse Header:\n%s\n"
% (self.url, self.response_url, str(self.headers)),
f"Tried {tries} times but failed to save "
f"and retrieve the archive for {self.url}.\n"
f"Response URL:\n{self.response_url}\n"
f"Response Header:\n{self.headers}"
)
if not self.saved_archive:
if tries > 1:
self.sleep(tries)
self.get_save_request_headers()
self.saved_archive = self.archive_url_parser()
if not self.saved_archive:
continue
else:
self._archive_url = self.saved_archive
self.timestamp()
return self.saved_archive
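Stripped of the HTTP details, the loop above is: bump the try counter, raise once the limit is hit, back off before every retry after the first, fetch, and return as soon as an archive URL is parsed. A standalone sketch with the network call replaced by a stub (`fetch`, `sleep`, and `save_with_retries` are illustrative names, not waybackpy API):

```python
class MaximumSaveRetriesExceeded(Exception):
    pass


def save_with_retries(fetch, sleep, max_tries=8):
    # fetch() stands in for get_save_request_headers() followed by
    # archive_url_parser(); it returns an archive URL string or None.
    tries = 0
    while True:
        tries += 1
        if tries >= max_tries:
            raise MaximumSaveRetriesExceeded(f"Tried {tries} times but failed.")
        if tries > 1:
            # Back off before every retry after the first attempt.
            sleep(tries)
        result = fetch()
        if result is not None:
            return result
```

For example, a stub that fails twice before succeeding exercises both the backoff calls and the success path.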

View File

@@ -1,12 +1,29 @@
import requests
from .__version__ import __version__
"""
Utility functions and shared variables like DEFAULT_USER_AGENT are here.
"""
DEFAULT_USER_AGENT = "waybackpy %s - https://github.com/akamhy/waybackpy" % __version__
from datetime import datetime
from . import __version__
DEFAULT_USER_AGENT: str = (
f"waybackpy {__version__} - https://github.com/akamhy/waybackpy"
)
def latest_version(package_name, user_agent=DEFAULT_USER_AGENT):
request_url = "https://pypi.org/pypi/" + package_name + "/json"
headers = {"User-Agent": user_agent}
response = requests.get(request_url, headers=headers)
data = response.json()
return data["info"]["version"]
def unix_timestamp_to_wayback_timestamp(unix_timestamp: int) -> str:
"""
Converts Unix time to a Wayback Machine timestamp; the Wayback
Machine timestamp format is yyyyMMddhhmmss.
"""
return datetime.utcfromtimestamp(int(unix_timestamp)).strftime("%Y%m%d%H%M%S")
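For example (1647346496 is an arbitrary Unix time chosen for illustration; it corresponds to 2022-03-15 12:14:56 UTC):

```python
from datetime import datetime

# Convert a Unix timestamp to the 14-digit Wayback Machine format.
wayback = datetime.utcfromtimestamp(1647346496).strftime("%Y%m%d%H%M%S")
# -> "20220315121456"
```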
def wayback_timestamp(**kwargs: int) -> str:
"""
Prepends zero before the year, month, day, hour and minute so that they
are conformable with the YYYYMMDDhhmmss Wayback Machine timestamp format.
"""
return "".join(
str(kwargs[key]).zfill(2) for key in ["year", "month", "day", "hour", "minute"]
)
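Note that the helper joins only down to the minute, so it actually yields a 12-digit prefix of the full 14-digit format (seconds are omitted). The zero-padding behaves like this:

```python
def wayback_timestamp(**kwargs: int) -> str:
    # Zero-pad each component to two digits; a 4-digit year is left as-is
    # because zfill never truncates.
    return "".join(
        str(kwargs[key]).zfill(2) for key in ["year", "month", "day", "hour", "minute"]
    )


stamp = wayback_timestamp(year=2022, month=3, day=5, hour=9, minute=7)
# -> "202203050907"
```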

View File

@@ -1,51 +1,73 @@
from .save_api import WaybackMachineSaveAPI
from .availability_api import WaybackMachineAvailabilityAPI
from .cdx_api import WaybackMachineCDXServerAPI
from .utils import DEFAULT_USER_AGENT
"""
This module exists because backwards compatibility matters.
Don't touch this or add any new functionality here and don't use
the Url class.
"""
from datetime import datetime, timedelta
from typing import Generator, Optional
"""
The Url class is not recommended to be used anymore, instead use the
WaybackMachineSaveAPI, WaybackMachineAvailabilityAPI and WaybackMachineCDXServerAPI.
from requests.structures import CaseInsensitiveDict
The reason it is still in the code is backwards compatibility with 2.x.x versions.
If you were using Url before the update to version 3.x.x, your code should still
work fine and there is no hurry to update the interface, but it is recommended
that you do not use the Url class for new code, as it will be removed after 2025.
The first 3.x.x version was released in January 2022, and three years are more
than enough time to update the older interface code.
"""
from .availability_api import ResponseJSON, WaybackMachineAvailabilityAPI
from .cdx_api import WaybackMachineCDXServerAPI
from .save_api import WaybackMachineSaveAPI
from .utils import DEFAULT_USER_AGENT
class Url:
def __init__(self, url, user_agent=DEFAULT_USER_AGENT):
"""
The Url class is not recommended to be used anymore, instead use:
- WaybackMachineSaveAPI
- WaybackMachineAvailabilityAPI
- WaybackMachineCDXServerAPI
The reason it is still in the code is backwards compatibility with 2.x.x
versions.
If you were using Url before the update to version 3.x.x, your code should
still work fine and there is no hurry to update the interface, but it is
recommended that you do not use the Url class for new code, as it will be
removed after 2025. The first 3.x.x version was released in January 2022,
and three years are more than enough time to update the older interface code.
"""
def __init__(self, url: str, user_agent: str = DEFAULT_USER_AGENT) -> None:
self.url = url
self.user_agent = str(user_agent)
self.archive_url = None
self.archive_url: Optional[str] = None
self.timestamp: Optional[datetime] = None
self.wayback_machine_availability_api = WaybackMachineAvailabilityAPI(
self.url, user_agent=self.user_agent
)
self.wayback_machine_save_api: Optional[WaybackMachineSaveAPI] = None
self.headers: Optional[CaseInsensitiveDict[str]] = None
self.json: Optional[ResponseJSON] = None
def __str__(self):
def __str__(self) -> str:
if not self.archive_url:
self.newest()
return self.archive_url
return str(self.archive_url)
def __len__(self):
def __len__(self) -> int:
td_max = timedelta(
days=999999999, hours=23, minutes=59, seconds=59, microseconds=999999
)
if not self.timestamp:
if not isinstance(self.timestamp, datetime):
self.oldest()
if not isinstance(self.timestamp, datetime):
raise TypeError("timestamp must be a datetime")
if self.timestamp == datetime.max:
return td_max.days
return (datetime.utcnow() - self.timestamp).days
def save(self):
def save(self) -> "Url":
"""Save the URL on the Wayback Machine."""
self.wayback_machine_save_api = WaybackMachineSaveAPI(
self.url, user_agent=self.user_agent
)
@@ -56,14 +78,14 @@ class Url:
def near(
self,
year=None,
month=None,
day=None,
hour=None,
minute=None,
unix_timestamp=None,
):
year: Optional[int] = None,
month: Optional[int] = None,
day: Optional[int] = None,
hour: Optional[int] = None,
minute: Optional[int] = None,
unix_timestamp: Optional[int] = None,
) -> "Url":
"""Returns the archive of the URL close to a date and time."""
self.wayback_machine_availability_api.near(
year=year,
month=month,
@@ -75,22 +97,32 @@ class Url:
self.set_availability_api_attrs()
return self
def oldest(self):
def oldest(self) -> "Url":
"""Returns the oldest archive of the URL."""
self.wayback_machine_availability_api.oldest()
self.set_availability_api_attrs()
return self
def newest(self):
def newest(self) -> "Url":
"""Returns the newest archive of the URL."""
self.wayback_machine_availability_api.newest()
self.set_availability_api_attrs()
return self
def set_availability_api_attrs(self):
def set_availability_api_attrs(self) -> None:
"""Set the attributes for total backwards compatibility."""
self.archive_url = self.wayback_machine_availability_api.archive_url
self.JSON = self.wayback_machine_availability_api.JSON
self.json = self.wayback_machine_availability_api.json
self.JSON = self.json # for backwards compatibility, do not remove it.
self.timestamp = self.wayback_machine_availability_api.timestamp()
def total_archives(self, start_timestamp=None, end_timestamp=None):
def total_archives(
self, start_timestamp: Optional[str] = None, end_timestamp: Optional[str] = None
) -> int:
"""
Returns an integer indicating the total number of archives for a URL.
Not very useful in my opinion; only here for backwards compatibility.
"""
cdx = WaybackMachineCDXServerAPI(
self.url,
user_agent=self.user_agent,
@@ -105,12 +137,13 @@ class Url:
def known_urls(
self,
subdomain=False,
host=False,
start_timestamp=None,
end_timestamp=None,
match_type="prefix",
):
subdomain: bool = False,
host: bool = False,
start_timestamp: Optional[str] = None,
end_timestamp: Optional[str] = None,
match_type: str = "prefix",
) -> Generator[str, None, None]:
"""Yields known URLs for any URL."""
if subdomain:
match_type = "domain"
if host:
@@ -126,4 +159,4 @@ class Url:
)
for snapshot in cdx.snapshots():
yield (snapshot.original)
yield snapshot.original