lint markdown

Akash Mahanty 2020-10-02 23:34:06 +05:30
parent ede251afb3
commit 2cd991a54e

@@ -50,11 +50,15 @@ Table of contents
<!--te-->
## Installation
Using [pip](https://en.wikipedia.org/wiki/Pip_(package_manager)):
```bash
pip install waybackpy
```
or direct from this repository using git.
```bash
pip install git+https://github.com/akamhy/waybackpy.git
```
@@ -64,6 +68,7 @@ pip install git+https://github.com/akamhy/waybackpy.git
### As a Python package
#### Capturing aka Saving an url using save()
```python
import waybackpy
@@ -76,14 +81,15 @@ new_archive_url = waybackpy.Url(
print(new_archive_url)
```
```bash
https://web.archive.org/web/20200504141153/https://github.com/akamhy/waybackpy
```
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPySaveExample></sub>
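The archive URL printed by `save()` embeds a 14-digit capture timestamp (`YYYYMMDDhhmmss`) right after `/web/`. As a minimal sketch, the helper below pulls that timestamp out and turns it into a `datetime`. The name `archive_timestamp` is hypothetical and used only for illustration; it is not part of waybackpy.

```python
import re
from datetime import datetime

# Matches the 14-digit capture timestamp that follows "/web/" in an archive URL.
_TIMESTAMP_RE = re.compile(r"/web/(\d{14})/")


def archive_timestamp(archive_url):
    """Return the capture time embedded in a Wayback Machine archive URL.

    Hypothetical helper for illustration; it is not part of waybackpy.
    """
    match = _TIMESTAMP_RE.search(archive_url)
    if match is None:
        raise ValueError("no 14-digit timestamp found in %r" % archive_url)
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


captured = archive_timestamp(
    "https://web.archive.org/web/20200504141153/https://github.com/akamhy/waybackpy"
)
print(captured.isoformat())  # 2020-05-04T14:11:53
```

The same helper should work on the URLs returned by `oldest()`, `newest()` and `near()`, since they share the same layout.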
#### Retrieving the oldest archive for an URL using oldest()
```python
import waybackpy
@@ -91,19 +97,19 @@ oldest_archive_url = waybackpy.Url(
"https://www.google.com/",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:40.0) Gecko/20100101 Firefox/40.0"
).oldest()
print(oldest_archive_url)
```
```bash
http://web.archive.org/web/19981111184551/http://google.com:80/
```
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyOldestExample></sub>
#### Retrieving the newest archive for an URL using newest()
```python
import waybackpy
@@ -116,14 +122,15 @@ newest_archive_url = waybackpy.Url(
print(newest_archive_url)
```
```bash
https://web.archive.org/web/20200714013225/https://www.facebook.com/
```
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyNewestExample></sub>
#### Retrieving archive close to a specified year, month, day, hour, and minute using near()
```python
from waybackpy import Url
@@ -135,35 +142,43 @@ github_wayback_obj = Url(github_url, user_agent)
# Do not pad (don't use zeros in the month, year, day, minute, and hour arguments). e.g. For January, set month = 1 and not month = 01.
```
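The no-padding note matches how Python reads integer literals: on Python 3 a decimal literal with a leading zero, such as `01`, is a syntax error (Python 2 reads it as octal). Plain integers are enough, because zero-padding can be applied when the values are formatted. The sketch below shows both points. The helper `to_timestamp` and its defaults are assumptions made for illustration and are not part of waybackpy.

```python
def to_timestamp(year, month=1, day=1, hour=0, minute=0):
    """Format unpadded integer date parts as a zero-padded YYYYMMDDhhmm string.

    Hypothetical helper for illustration; it is not part of waybackpy,
    and the default values are assumptions.
    """
    return "{:04d}{:02d}{:02d}{:02d}{:02d}".format(year, month, day, hour, minute)


# Pass plain integers: month=1, not month=01.
print(to_timestamp(2015, 1, 26))  # 201501260000

# A zero-padded literal does not even compile on Python 3.
try:
    compile("month = 01", "<example>", "exec")
except SyntaxError:
    print("month = 01 is rejected by the parser")
```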
```python
github_archive_near_2010 = github_wayback_obj.near(year=2010)
print(github_archive_near_2010)
```
```bash
https://web.archive.org/web/20100719134402/http://github.com/
```
```python
github_archive_near_2011_may = github_wayback_obj.near(year=2011, month=5)
print(github_archive_near_2011_may)
```
```bash
https://web.archive.org/web/20110519185447/https://github.com/
```
```python
github_archive_near_2015_january_26 = github_wayback_obj.near(
year=2015, month=1, day=26
)
print(github_archive_near_2015_january_26)
```
```bash
https://web.archive.org/web/20150127031159/https://github.com
```
```python
github_archive_near_2018_4_july_9_2_am = github_wayback_obj.near(
year=2018, month=7, day=4, hour = 9, minute = 2
)
print(github_archive_near_2018_4_july_9_2_am)
```
```bash
https://web.archive.org/web/20180704090245/https://github.com/
```
@@ -173,9 +188,8 @@ https://web.archive.org/web/20180704090245/https://github.com/
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyNearExample></sub>
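To make "close to a specified time" concrete, here is a small sketch that picks, from a list of capture timestamps, the one nearest to a target date. It only illustrates the idea of a nearest match; it is not confirmed to be how waybackpy or the Wayback Machine selects an archive, and `closest_capture` is a hypothetical name.

```python
from datetime import datetime


def closest_capture(timestamps, target):
    """Return the 14-digit timestamp whose capture time is nearest to target.

    Hypothetical helper for illustration; it is not part of waybackpy.
    """
    def distance(ts):
        return abs(datetime.strptime(ts, "%Y%m%d%H%M%S") - target)
    return min(timestamps, key=distance)


# Capture timestamps taken from the near() example outputs.
captures = ["20100719134402", "20110519185447", "20150127031159", "20180704090245"]

print(closest_capture(captures, datetime(2015, 1, 26)))  # 20150127031159
```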
#### Get the content of webpage using get()
```python
import waybackpy
@@ -205,10 +219,11 @@ google_oldest_archive_source = waybackpy_url_object.get(
)
print(google_oldest_archive_source)
```
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyGetExample#main.py></sub>
#### Count total archives for an URL using total_archives()
```python
import waybackpy
@@ -223,63 +238,79 @@ archive_count = waybackpy.Url(
print(archive_count) # total_archives() returns an int
```
```bash
2440
```
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyTotalArchivesExample></sub>
### With the Command-line interface
#### Save
```bash
$ waybackpy --url "https://en.wikipedia.org/wiki/Social_media" --user_agent "my-unique-user-agent" --save
https://web.archive.org/web/20200719062108/https://en.wikipedia.org/wiki/Social_media
```
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashSave></sub>
#### Oldest archive
```bash
$ waybackpy --url "https://en.wikipedia.org/wiki/SpaceX" --user_agent "my-unique-user-agent" --oldest
https://web.archive.org/web/20040803000845/http://en.wikipedia.org:80/wiki/SpaceX
```
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashOldest></sub>
#### Newest archive
```bash
$ waybackpy --url "https://en.wikipedia.org/wiki/YouTube" --user_agent "my-unique-user-agent" --newest
https://web.archive.org/web/20200606044708/https://en.wikipedia.org/wiki/YouTube
```
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashNewest></sub>
#### Total number of archives
```bash
$ waybackpy --url "https://en.wikipedia.org/wiki/Linux_kernel" --user_agent "my-unique-user-agent" --total
853
```
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashTotal></sub>
#### Archive near time
```bash
$ waybackpy --url facebook.com --user_agent "my-unique-user-agent" --near --year 2012 --month 5 --day 12
https://web.archive.org/web/20120512142515/https://www.facebook.com/
```
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashNear></sub>
#### Get the source code
```bash
-$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get url # Prints the source code of the url
-$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get oldest # Prints the source code of the oldest archive
-$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get newest # Prints the source code of the newest archive
-$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get save # Save a new archive on wayback machine then print the source code of this archive.
+waybackpy --url google.com --user_agent "my-unique-user-agent" --get url # Prints the source code of the url
+waybackpy --url google.com --user_agent "my-unique-user-agent" --get oldest # Prints the source code of the oldest archive
+waybackpy --url google.com --user_agent "my-unique-user-agent" --get newest # Prints the source code of the newest archive
+waybackpy --url google.com --user_agent "my-unique-user-agent" --get save # Save a new archive on wayback machine then print the source code of this archive.
```
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashGet></sub>
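For readers curious how flags like these can be wired up, the following is a minimal `argparse` sketch that accepts the options used in the command-line examples. It is an illustration only and is not waybackpy's actual parser. Making `--url` required, the `int` types and the `choices` list for `--get` are assumptions inferred from the examples.

```python
import argparse


def build_parser():
    """Build a parser that accepts the flags used in the CLI examples.

    Illustrative sketch only; this is not waybackpy's real parser.
    """
    parser = argparse.ArgumentParser(prog="waybackpy")
    parser.add_argument("--url", required=True, help="URL to operate on")
    parser.add_argument("--user_agent", help="User-Agent string to send")
    # Actions that take no value.
    for flag in ("--save", "--oldest", "--newest", "--total", "--near"):
        parser.add_argument(flag, action="store_true")
    # Date parts consumed together with --near.
    for flag in ("--year", "--month", "--day"):
        parser.add_argument(flag, type=int)
    parser.add_argument("--get", choices=["url", "oldest", "newest", "save"])
    return parser


args = build_parser().parse_args(
    ["--url", "facebook.com", "--near", "--year", "2012", "--month", "5", "--day", "12"]
)
print(args.near, args.year, args.month, args.day)  # True 2012 5 12
```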
## Tests
-* [Here](https://github.com/akamhy/waybackpy/tree/master/tests)
+[Here](https://github.com/akamhy/waybackpy/tree/master/tests)
## Dependency
-* None, just python standard libraries (re, json, urllib, argparse and datetime). Both python 2 and 3 are supported :)
+None, just python standard libraries (re, json, urllib, argparse and datetime). Both python 2 and 3 are supported :)
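Supporting both Python 2 and 3 with only the standard library usually means importing the HTTP helpers from whichever module the running interpreter provides. The sketch below shows one common pattern for this; it is not confirmed to be how waybackpy itself handles the difference, and it builds a request without sending it.

```python
try:
    # Python 3: the request helpers live in urllib.request.
    from urllib.request import Request, urlopen
except ImportError:
    # Python 2: the same names are provided by urllib2.
    from urllib2 import Request, urlopen

# Build (but do not send) a request carrying a custom User-Agent header.
request = Request(
    "https://web.archive.org/",
    headers={"User-Agent": "my-unique-user-agent"},
)
print(request.get_header("User-agent"))
```

Header names are stored capitalized, which is why the lookup uses `User-agent`.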
## Packaging
@@ -290,7 +321,6 @@ $ waybackpy --url google.com --user_agent "my-unique-user-agent" --get save # Sa
3. Sign & upload the package ``twine upload -s dist/*``.
## License
Released under the MIT License. See
[license](https://github.com/akamhy/waybackpy/blob/master/LICENSE) for details.