Update README.md (#21)
* Update README.md
* example bash oldest newest
* total archives bash example
* near bash example
* format the list
* ce
* get bash example
* pip git install example
* Update index.rst
* + argparse
* + argparse
parent
b8b2d6dfa9
commit
e231228721
64
README.md
@ -28,13 +28,20 @@ Table of contents
|
|||||||
* [Installation](#installation)
|
* [Installation](#installation)
|
||||||
|
|
||||||
* [Usage](#usage)
|
* [Usage](#usage)
|
||||||
|
* [As a python package](#as-a-python-package)
|
||||||
* [Saving an url using save()](#capturing-aka-saving-an-url-using-save)
|
* [Saving an url using save()](#capturing-aka-saving-an-url-using-save)
|
||||||
* [Receiving the oldest archive for an URL Using oldest()](#receiving-the-oldest-archive-for-an-url-using-oldest)
|
* [Receiving the oldest archive for an URL Using oldest()](#receiving-the-oldest-archive-for-an-url-using-oldest)
|
||||||
* [Receiving the recent most/newest archive for an URL using newest()](#receiving-the-newest-archive-for-an-url-using-newest)
|
* [Receiving the recent most/newest archive for an URL using newest()](#receiving-the-newest-archive-for-an-url-using-newest)
|
||||||
* [Receiving archive close to a specified year, month, day, hour, and minute using near()](#receiving-archive-close-to-a-specified-year-month-day-hour-and-minute-using-near)
|
* [Receiving archive close to a specified year, month, day, hour, and minute using near()](#receiving-archive-close-to-a-specified-year-month-day-hour-and-minute-using-near)
|
||||||
* [Get the content of webpage using get()](#get-the-content-of-webpage-using-get)
|
* [Get the content of webpage using get()](#get-the-content-of-webpage-using-get)
|
||||||
* [Count total archives for an URL using total_archives()](#count-total-archives-for-an-url-using-total_archives)
|
* [Count total archives for an URL using total_archives()](#count-total-archives-for-an-url-using-total_archives)
|
||||||
|
* [With CLI](#with-the-cli)
|
||||||
|
* [Save](#save)
|
||||||
|
* [Oldest archive](#oldest-archive)
|
||||||
|
* [Newest archive](#newest-archive)
|
||||||
|
* [Total archives](#total-number-of-archives)
|
||||||
|
* [Archive near a time](#archive-near-time)
|
||||||
|
* [Get the source code](#get-the-source-code)
|
||||||
|
|
||||||
* [Tests](#tests)
|
* [Tests](#tests)
|
||||||
|
|
||||||
@ -49,10 +56,15 @@ Using [pip](https://en.wikipedia.org/wiki/Pip_(package_manager)):
|
|||||||
```bash
|
```bash
|
||||||
pip install waybackpy
|
pip install waybackpy
|
||||||
```
|
```
|
||||||
|
or direct from this repository using git.
|
||||||
|
```bash
|
||||||
|
pip install git+https://github.com/akamhy/waybackpy.git
|
||||||
|
```
|
||||||
|
|
||||||
## Usage
|
## Usage
|
||||||
|
|
||||||
|
### As a python package
|
||||||
|
|
||||||
#### Capturing aka Saving an url using save()
|
#### Capturing aka Saving an url using save()
|
||||||
```python
|
```python
|
||||||
import waybackpy
|
import waybackpy
|
||||||
@ -218,12 +230,58 @@ print(archive_count) # total_archives() returns an int
|
|||||||
```
|
```
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyTotalArchivesExample></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyTotalArchivesExample></sub>
|
||||||
|
|
||||||
|
### With the CLI
|
||||||
|
|
||||||
|
#### Save
|
||||||
|
```bash
|
||||||
|
$ waybackpy --url "https://en.wikipedia.org/wiki/Social_media" --user_agent "my-unique-user-agent" --save
|
||||||
|
https://web.archive.org/web/20200719062108/https://en.wikipedia.org/wiki/Social_media
|
||||||
|
```
|
||||||
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashSave></sub>
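The archive URL printed above embeds a 14-digit capture timestamp (`YYYYMMDDhhmmss`) followed by the original URL. A minimal sketch of splitting such a URL is shown below; `split_archive_url` is a hypothetical helper written for illustration and is not part of waybackpy.

```python
import re
from datetime import datetime

# Hypothetical helper for illustration only; waybackpy does not ship this.
# Matches https://web.archive.org/web/<14-digit timestamp>/<original URL>
_ARCHIVE_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def split_archive_url(archive_url):
    """Return (capture_time, original_url) for a Wayback Machine archive URL."""
    match = _ARCHIVE_RE.match(archive_url)
    if match is None:
        raise ValueError("not a recognised archive URL: %r" % archive_url)
    timestamp, original_url = match.groups()
    # The timestamp is YYYYMMDDhhmmss.
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S"), original_url


captured, original = split_archive_url(
    "https://web.archive.org/web/20200719062108/"
    "https://en.wikipedia.org/wiki/Social_media"
)
print(captured.isoformat())  # 2020-07-19T06:21:08
print(original)              # https://en.wikipedia.org/wiki/Social_media
```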
|
||||||
|
|
||||||
|
#### Oldest archive
|
||||||
|
```bash
|
||||||
|
$ waybackpy --url "https://en.wikipedia.org/wiki/SpaceX" --user_agent "my-unique-user-agent" --oldest
|
||||||
|
https://web.archive.org/web/20040803000845/http://en.wikipedia.org:80/wiki/SpaceX
|
||||||
|
```
|
||||||
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashOldest></sub>
|
||||||
|
|
||||||
|
#### Newest archive
|
||||||
|
```bash
|
||||||
|
$ waybackpy --url "https://en.wikipedia.org/wiki/YouTube" --user_agent "my-unique-user-agent" --newest
|
||||||
|
https://web.archive.org/web/20200606044708/https://en.wikipedia.org/wiki/YouTube
|
||||||
|
```
|
||||||
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashNewest></sub>
|
||||||
|
|
||||||
|
#### Total number of archives
|
||||||
|
```bash
|
||||||
|
$ waybackpy --url "https://en.wikipedia.org/wiki/Linux_kernel" --user_agent "my-unique-user-agent" --total
|
||||||
|
853
|
||||||
|
```
|
||||||
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashTotal></sub>
|
||||||
|
|
||||||
|
#### Archive near time
|
||||||
|
```bash
|
||||||
|
$ waybackpy --url facebook.com --user_agent "my-unique-user-agent" --near --year 2012 --month 5 --day 12
|
||||||
|
https://web.archive.org/web/20120512142515/https://www.facebook.com/
|
||||||
|
```
|
||||||
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashNear></sub>
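As a rough illustration of what a "near" lookup involves, the sketch below turns `--year`, `--month` and `--day` style values into a timestamp and builds a query for the Wayback Machine availability endpoint. It is written for Python 3, `near_query` is a hypothetical name, and both the defaults for missing fields and the choice of endpoint are assumptions rather than a description of how waybackpy does it.

```python
from urllib.parse import urlencode

# Hypothetical helper; the defaults for omitted fields are an assumption.
def near_query(url, year, month=1, day=1, hour=0, minute=0):
    """Build an availability-API query URL for the capture closest to a time."""
    # Zero-pad each field so the result reads YYYYMMDDhhmm.
    timestamp = "%04d%02d%02d%02d%02d" % (year, month, day, hour, minute)
    params = urlencode({"url": url, "timestamp": timestamp})
    return "https://archive.org/wayback/available?" + params


print(near_query("facebook.com", 2012, 5, 12))
# https://archive.org/wayback/available?url=facebook.com&timestamp=201205120000
```

The snippet only builds the query string; sending the request and reading the response is left out.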
|
||||||
|
|
||||||
|
#### Get the source code
|
||||||
|
```bash
|
||||||
|
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get url # Prints the source code of the url
|
||||||
|
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get oldest # Prints the source code of the oldest archive
|
||||||
|
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get newest # Prints the source code of the newest archive
|
||||||
|
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get save # Save a new archive on wayback machine then print the source code of this archive.
|
||||||
|
```
|
||||||
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashGet></sub>
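The flags used in the examples above could be declared with `argparse` roughly as follows. This is a sketch under stated assumptions, not waybackpy's actual CLI code: the flag names come from the examples, while `build_parser`, the help texts and treating the actions as mutually exclusive are illustrative choices.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the CLI examples."""
    parser = argparse.ArgumentParser(prog="waybackpy")
    parser.add_argument("--url", required=True, help="URL to look up or save")
    parser.add_argument("--user_agent", help="User-Agent string to send")

    # Assumption: exactly one action may be requested per invocation.
    action = parser.add_mutually_exclusive_group(required=True)
    action.add_argument("--save", action="store_true")
    action.add_argument("--oldest", action="store_true")
    action.add_argument("--newest", action="store_true")
    action.add_argument("--total", action="store_true")
    action.add_argument("--near", action="store_true")
    action.add_argument("--get", choices=["url", "oldest", "newest", "save"])

    # Date parts that only matter together with --near.
    for part in ("year", "month", "day", "hour", "minute"):
        parser.add_argument("--" + part, type=int)
    return parser


args = build_parser().parse_args(
    ["--url", "facebook.com", "--user_agent", "my-unique-user-agent",
     "--near", "--year", "2012", "--month", "5", "--day", "12"]
)
print(args.near, args.year, args.month, args.day)  # True 2012 5 12
```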
|
||||||
|
|
||||||
## Tests
|
## Tests
|
||||||
* [Here](https://github.com/akamhy/waybackpy/tree/master/tests)
|
* [Here](https://github.com/akamhy/waybackpy/tree/master/tests)
|
||||||
|
|
||||||
|
|
||||||
## Dependency
|
## Dependency
|
||||||
* None, just python standard libraries (re, json, urllib and datetime). Both python 2 and 3 are supported :)
|
* None, just python standard libraries (re, json, urllib, argparse and datetime). Both python 2 and 3 are supported :)
|
||||||
|
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
97
index.rst
@ -22,20 +22,31 @@ Table of contents
|
|||||||
- `Installation <#installation>`__
|
- `Installation <#installation>`__
|
||||||
|
|
||||||
- `Usage <#usage>`__
|
- `Usage <#usage>`__
|
||||||
|
- `As a python package <#as-a-python-package>`__
|
||||||
|
|
||||||
- `Saving an url using
|
- `Saving an url using
|
||||||
save() <#capturing-aka-saving-an-url-using-save>`__
|
save() <#capturing-aka-saving-an-url-using-save>`__
|
||||||
- `Receiving the oldest archive for an URL Using
|
- `Receiving the oldest archive for an URL Using
|
||||||
oldest() <#receiving-the-oldest-archive-for-an-url-using-oldest>`__
|
oldest() <#receiving-the-oldest-archive-for-an-url-using-oldest>`__
|
||||||
- `Receiving the recent most/newest archive for an URL using
|
- `Receiving the recent most/newest archive for an URL using
|
||||||
newest() <#receiving-the-newest-archive-for-an-url-using-newest>`__
|
newest() <#receiving-the-newest-archive-for-an-url-using-newest>`__
|
||||||
- `Receiving archive close to a specified year, month, day, hour, and
|
- `Receiving archive close to a specified year, month, day, hour,
|
||||||
minute using
|
and minute using
|
||||||
near() <#receiving-archive-close-to-a-specified-year-month-day-hour-and-minute-using-near>`__
|
near() <#receiving-archive-close-to-a-specified-year-month-day-hour-and-minute-using-near>`__
|
||||||
- `Get the content of webpage using
|
- `Get the content of webpage using
|
||||||
get() <#get-the-content-of-webpage-using-get>`__
|
get() <#get-the-content-of-webpage-using-get>`__
|
||||||
- `Count total archives for an URL using
|
- `Count total archives for an URL using
|
||||||
total\_archives() <#count-total-archives-for-an-url-using-total_archives>`__
|
total\_archives() <#count-total-archives-for-an-url-using-total_archives>`__
|
||||||
|
|
||||||
|
- `With CLI <#with-the-cli>`__
|
||||||
|
|
||||||
|
- `Save <#save>`__
|
||||||
|
- `Oldest archive <#oldest-archive>`__
|
||||||
|
- `Newest archive <#newest-archive>`__
|
||||||
|
- `Total archives <#total-number-of-archives>`__
|
||||||
|
- `Archive near a time <#archive-near-time>`__
|
||||||
|
- `Get the source code <#get-the-source-code>`__
|
||||||
|
|
||||||
- `Tests <#tests>`__
|
- `Tests <#tests>`__
|
||||||
|
|
||||||
- `Dependency <#dependency>`__
|
- `Dependency <#dependency>`__
|
||||||
@ -55,9 +66,18 @@ Using `pip <https://en.wikipedia.org/wiki/Pip_(package_manager)>`__:
|
|||||||
|
|
||||||
pip install waybackpy
|
pip install waybackpy
|
||||||
|
|
||||||
|
or direct from this repository using git.
|
||||||
|
|
||||||
|
.. code:: bash
|
||||||
|
|
||||||
|
pip install git+https://github.com/akamhy/waybackpy.git
|
||||||
|
|
||||||
Usage
|
Usage
|
||||||
-----
|
-----
|
||||||
|
|
||||||
|
As a python package
|
||||||
|
~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
Capturing aka Saving an url using save()
|
Capturing aka Saving an url using save()
|
||||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
@ -249,6 +269,77 @@ Count total archives for an URL using total\_archives()
|
|||||||
Try this out in your browser @
|
Try this out in your browser @
|
||||||
https://repl.it/@akamhy/WaybackPyTotalArchivesExample\
|
https://repl.it/@akamhy/WaybackPyTotalArchivesExample\
|
||||||
|
|
||||||
|
With the CLI
|
||||||
|
~~~~~~~~~~~~
|
||||||
|
|
||||||
|
Save
|
||||||
|
^^^^
|
||||||
|
|
||||||
|
.. code:: bash
|
||||||
|
|
||||||
|
$ waybackpy --url "https://en.wikipedia.org/wiki/Social_media" --user_agent "my-unique-user-agent" --save
|
||||||
|
https://web.archive.org/web/20200719062108/https://en.wikipedia.org/wiki/Social_media
|
||||||
|
|
||||||
|
Try this out in your browser @
|
||||||
|
https://repl.it/@akamhy/WaybackPyBashSave\
|
||||||
|
|
||||||
|
Oldest archive
|
||||||
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
.. code:: bash
|
||||||
|
|
||||||
|
$ waybackpy --url "https://en.wikipedia.org/wiki/SpaceX" --user_agent "my-unique-user-agent" --oldest
|
||||||
|
https://web.archive.org/web/20040803000845/http://en.wikipedia.org:80/wiki/SpaceX
|
||||||
|
|
||||||
|
Try this out in your browser @
|
||||||
|
https://repl.it/@akamhy/WaybackPyBashOldest\
|
||||||
|
|
||||||
|
Newest archive
|
||||||
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
.. code:: bash
|
||||||
|
|
||||||
|
$ waybackpy --url "https://en.wikipedia.org/wiki/YouTube" --user_agent "my-unique-user-agent" --newest
|
||||||
|
https://web.archive.org/web/20200606044708/https://en.wikipedia.org/wiki/YouTube
|
||||||
|
|
||||||
|
Try this out in your browser @
|
||||||
|
https://repl.it/@akamhy/WaybackPyBashNewest\
|
||||||
|
|
||||||
|
Total number of archives
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
.. code:: bash
|
||||||
|
|
||||||
|
$ waybackpy --url "https://en.wikipedia.org/wiki/Linux_kernel" --user_agent "my-unique-user-agent" --total
|
||||||
|
853
|
||||||
|
|
||||||
|
Try this out in your browser @
|
||||||
|
https://repl.it/@akamhy/WaybackPyBashTotal\
|
||||||
|
|
||||||
|
Archive near time
|
||||||
|
^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
.. code:: bash
|
||||||
|
|
||||||
|
$ waybackpy --url facebook.com --user_agent "my-unique-user-agent" --near --year 2012 --month 5 --day 12
|
||||||
|
https://web.archive.org/web/20120512142515/https://www.facebook.com/
|
||||||
|
|
||||||
|
Try this out in your browser @
|
||||||
|
https://repl.it/@akamhy/WaybackPyBashNear\
|
||||||
|
|
||||||
|
Get the source code
|
||||||
|
^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
.. code:: bash
|
||||||
|
|
||||||
|
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get url # Prints the source code of the url
|
||||||
|
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get oldest # Prints the source code of the oldest archive
|
||||||
|
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get newest # Prints the source code of the newest archive
|
||||||
|
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get save # Save a new archive on wayback machine then print the source code of this archive.
|
||||||
|
|
||||||
|
Try this out in your browser @
|
||||||
|
https://repl.it/@akamhy/WaybackPyBashGet\
|
||||||
|
|
||||||
Tests
|
Tests
|
||||||
-----
|
-----
|
||||||
|
|
||||||
@ -257,7 +348,7 @@ Tests
|
|||||||
Dependency
|
Dependency
|
||||||
----------
|
----------
|
||||||
|
|
||||||
- None, just python standard libraries (re, json, urllib and datetime).
|
- None, just python standard libraries (re, json, urllib, argparse and datetime).
|
||||||
Both python 2 and 3 are supported :)
|
Both python 2 and 3 are supported :)
|
||||||
|
|
||||||
License
|
License
|
||||||
|