lint markdown
parent
ede251afb3
commit
2cd991a54e
66
README.md
@ -50,11 +50,15 @@ Table of contents
|
|||||||
<!--te-->
|
<!--te-->
|
||||||
|
|
||||||
## Installation
|
## Installation
|
||||||
|
|
||||||
Using [pip](https://en.wikipedia.org/wiki/Pip_(package_manager)):
|
Using [pip](https://en.wikipedia.org/wiki/Pip_(package_manager)):
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
pip install waybackpy
|
pip install waybackpy
|
||||||
```
|
```
|
||||||
|
|
||||||
or direct from this repository using git.
|
or direct from this repository using git.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
pip install git+https://github.com/akamhy/waybackpy.git
|
pip install git+https://github.com/akamhy/waybackpy.git
|
||||||
```
|
```
|
||||||
@ -64,6 +68,7 @@ pip install git+https://github.com/akamhy/waybackpy.git
|
|||||||
### As a Python package
|
### As a Python package
|
||||||
|
|
||||||
#### Capturing aka Saving an url using save()
|
#### Capturing aka Saving an url using save()
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import waybackpy
|
import waybackpy
|
||||||
|
|
||||||
@ -76,14 +81,15 @@ new_archive_url = waybackpy.Url(
|
|||||||
|
|
||||||
print(new_archive_url)
|
print(new_archive_url)
|
||||||
```
|
```
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
https://web.archive.org/web/20200504141153/https://github.com/akamhy/waybackpy
|
https://web.archive.org/web/20200504141153/https://github.com/akamhy/waybackpy
|
||||||
```
|
```
|
||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPySaveExample></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPySaveExample></sub>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Retrieving the oldest archive for an URL using oldest()
|
#### Retrieving the oldest archive for an URL using oldest()
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import waybackpy
|
import waybackpy
|
||||||
|
|
||||||
@ -91,19 +97,19 @@ oldest_archive_url = waybackpy.Url(
|
|||||||
|
|
||||||
"https://www.google.com/",
|
"https://www.google.com/",
|
||||||
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:40.0) Gecko/20100101 Firefox/40.0"
|
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:40.0) Gecko/20100101 Firefox/40.0"
|
||||||
|
|
||||||
).oldest()
|
).oldest()
|
||||||
|
|
||||||
print(oldest_archive_url)
|
print(oldest_archive_url)
|
||||||
```
|
```
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
http://web.archive.org/web/19981111184551/http://google.com:80/
|
http://web.archive.org/web/19981111184551/http://google.com:80/
|
||||||
```
|
```
|
||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyOldestExample></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyOldestExample></sub>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Retrieving the newest archive for an URL using newest()
|
#### Retrieving the newest archive for an URL using newest()
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import waybackpy
|
import waybackpy
|
||||||
|
|
||||||
@ -116,14 +122,15 @@ newest_archive_url = waybackpy.Url(
|
|||||||
|
|
||||||
print(newest_archive_url)
|
print(newest_archive_url)
|
||||||
```
|
```
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
https://web.archive.org/web/20200714013225/https://www.facebook.com/
|
https://web.archive.org/web/20200714013225/https://www.facebook.com/
|
||||||
```
|
```
|
||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyNewestExample></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyNewestExample></sub>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Retrieving archive close to a specified year, month, day, hour, and minute using near()
|
#### Retrieving archive close to a specified year, month, day, hour, and minute using near()
|
||||||
|
|
||||||
```python
|
```python
|
||||||
from waybackpy import Url
|
from waybackpy import Url
|
||||||
|
|
||||||
@ -135,35 +142,43 @@ github_wayback_obj = Url(github_url, user_agent)
|
|||||||
|
|
||||||
# Do not pad (don't use zeros in the month, year, day, minute, and hour arguments). e.g. For January, set month = 1 and not month = 01.
|
# Do not pad (don't use zeros in the month, year, day, minute, and hour arguments). e.g. For January, set month = 1 and not month = 01.
|
||||||
```
|
```
|
||||||
|
|
||||||
```python
|
```python
|
||||||
github_archive_near_2010 = github_wayback_obj.near(year=2010)
|
github_archive_near_2010 = github_wayback_obj.near(year=2010)
|
||||||
print(github_archive_near_2010)
|
print(github_archive_near_2010)
|
||||||
```
|
```
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
https://web.archive.org/web/20100719134402/http://github.com/
|
https://web.archive.org/web/20100719134402/http://github.com/
|
||||||
```
|
```
|
||||||
|
|
||||||
```python
|
```python
|
||||||
github_archive_near_2011_may = github_wayback_obj.near(year=2011, month=5)
|
github_archive_near_2011_may = github_wayback_obj.near(year=2011, month=5)
|
||||||
print(github_archive_near_2011_may)
|
print(github_archive_near_2011_may)
|
||||||
```
|
```
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
https://web.archive.org/web/20110519185447/https://github.com/
|
https://web.archive.org/web/20110519185447/https://github.com/
|
||||||
```
|
```
|
||||||
|
|
||||||
```python
|
```python
|
||||||
github_archive_near_2015_january_26 = github_wayback_obj.near(
|
github_archive_near_2015_january_26 = github_wayback_obj.near(
|
||||||
year=2015, month=1, day=26
|
year=2015, month=1, day=26
|
||||||
)
|
)
|
||||||
print(github_archive_near_2015_january_26)
|
print(github_archive_near_2015_january_26)
|
||||||
```
|
```
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
https://web.archive.org/web/20150127031159/https://github.com
|
https://web.archive.org/web/20150127031159/https://github.com
|
||||||
```
|
```
|
||||||
|
|
||||||
```python
|
```python
|
||||||
github_archive_near_2018_4_july_9_2_am = github_wayback_obj.near(
|
github_archive_near_2018_4_july_9_2_am = github_wayback_obj.near(
|
||||||
year=2018, month=7, day=4, hour = 9, minute = 2
|
year=2018, month=7, day=4, hour = 9, minute = 2
|
||||||
)
|
)
|
||||||
print(github_archive_near_2018_4_july_9_2_am)
|
print(github_archive_near_2018_4_july_9_2_am)
|
||||||
```
|
```
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
https://web.archive.org/web/20180704090245/https://github.com/
|
https://web.archive.org/web/20180704090245/https://github.com/
|
||||||
|
|
||||||
@ -173,9 +188,8 @@ https://web.archive.org/web/20180704090245/https://github.com/
|
|||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyNearExample></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyNearExample></sub>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Get the content of webpage using get()
|
#### Get the content of webpage using get()
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import waybackpy
|
import waybackpy
|
||||||
|
|
||||||
@ -205,10 +219,11 @@ google_oldest_archive_source = waybackpy_url_object.get(
|
|||||||
)
|
)
|
||||||
print(google_oldest_archive_source)
|
print(google_oldest_archive_source)
|
||||||
```
|
```
|
||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyGetExample#main.py></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyGetExample#main.py></sub>
|
||||||
|
|
||||||
|
|
||||||
#### Count total archives for an URL using total_archives()
|
#### Count total archives for an URL using total_archives()
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import waybackpy
|
import waybackpy
|
||||||
|
|
||||||
@ -223,63 +238,79 @@ archive_count = waybackpy.Url(
|
|||||||
|
|
||||||
print(archive_count) # total_archives() returns an int
|
print(archive_count) # total_archives() returns an int
|
||||||
```
|
```
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
2440
|
2440
|
||||||
```
|
```
|
||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyTotalArchivesExample></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyTotalArchivesExample></sub>
|
||||||
|
|
||||||
### With the Command-line interface
|
### With the Command-line interface
|
||||||
|
|
||||||
#### Save
|
#### Save
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
$ waybackpy --url "https://en.wikipedia.org/wiki/Social_media" --user_agent "my-unique-user-agent" --save
|
$ waybackpy --url "https://en.wikipedia.org/wiki/Social_media" --user_agent "my-unique-user-agent" --save
|
||||||
https://web.archive.org/web/20200719062108/https://en.wikipedia.org/wiki/Social_media
|
https://web.archive.org/web/20200719062108/https://en.wikipedia.org/wiki/Social_media
|
||||||
```
|
```
|
||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashSave></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashSave></sub>
|
||||||
|
|
||||||
#### Oldest archive
|
#### Oldest archive
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
$ waybackpy --url "https://en.wikipedia.org/wiki/SpaceX" --user_agent "my-unique-user-agent" --oldest
|
$ waybackpy --url "https://en.wikipedia.org/wiki/SpaceX" --user_agent "my-unique-user-agent" --oldest
|
||||||
https://web.archive.org/web/20040803000845/http://en.wikipedia.org:80/wiki/SpaceX
|
https://web.archive.org/web/20040803000845/http://en.wikipedia.org:80/wiki/SpaceX
|
||||||
```
|
```
|
||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashOldest></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashOldest></sub>
|
||||||
|
|
||||||
#### Newest archive
|
#### Newest archive
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
$ waybackpy --url "https://en.wikipedia.org/wiki/YouTube" --user_agent "my-unique-user-agent" --newest
|
$ waybackpy --url "https://en.wikipedia.org/wiki/YouTube" --user_agent "my-unique-user-agent" --newest
|
||||||
https://web.archive.org/web/20200606044708/https://en.wikipedia.org/wiki/YouTube
|
https://web.archive.org/web/20200606044708/https://en.wikipedia.org/wiki/YouTube
|
||||||
```
|
```
|
||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashNewest></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashNewest></sub>
|
||||||
|
|
||||||
#### Total number of archives
|
#### Total number of archives
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
$ waybackpy --url "https://en.wikipedia.org/wiki/Linux_kernel" --user_agent "my-unique-user-agent" --total
|
$ waybackpy --url "https://en.wikipedia.org/wiki/Linux_kernel" --user_agent "my-unique-user-agent" --total
|
||||||
853
|
853
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashTotal></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashTotal></sub>
|
||||||
|
|
||||||
#### Archive near time
|
#### Archive near time
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
$ waybackpy --url facebook.com --user_agent "my-unique-user-agent" --near --year 2012 --month 5 --day 12
|
$ waybackpy --url facebook.com --user_agent "my-unique-user-agent" --near --year 2012 --month 5 --day 12
|
||||||
https://web.archive.org/web/20120512142515/https://www.facebook.com/
|
https://web.archive.org/web/20120512142515/https://www.facebook.com/
|
||||||
```
|
```
|
||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashNear></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashNear></sub>
|
||||||
|
|
||||||
#### Get the source code
|
#### Get the source code
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get url # Prints the source code of the url
|
waybackpy --url google.com --user_agent "my-unique-user-agent" --get url # Prints the source code of the url
|
||||||
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get oldest # Prints the source code of the oldest archive
|
waybackpy --url google.com --user_agent "my-unique-user-agent" --get oldest # Prints the source code of the oldest archive
|
||||||
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get newest # Prints the source code of the newest archive
|
waybackpy --url google.com --user_agent "my-unique-user-agent" --get newest # Prints the source code of the newest archive
|
||||||
$ waybackpy --url google.com --user_agent "my-unique-user-agent" --get save # Save a new archive on wayback machine then print the source code of this archive.
|
waybackpy --url google.com --user_agent "my-unique-user-agent" --get save # Save a new archive on wayback machine then print the source code of this archive.
|
||||||
```
|
```
|
||||||
|
|
||||||
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashGet></sub>
|
<sub>Try this out in your browser @ <https://repl.it/@akamhy/WaybackPyBashGet></sub>
|
||||||
|
|
||||||
## Tests
|
## Tests
|
||||||
* [Here](https://github.com/akamhy/waybackpy/tree/master/tests)
|
|
||||||
|
|
||||||
|
[Here](https://github.com/akamhy/waybackpy/tree/master/tests)
|
||||||
|
|
||||||
## Dependency
|
## Dependency
|
||||||
* None, just python standard libraries (re, json, urllib, argparse and datetime). Both python 2 and 3 are supported :)
|
|
||||||
|
None, just python standard libraries (re, json, urllib, argparse and datetime). Both python 2 and 3 are supported :)
|
||||||
|
|
||||||
## Packaging
|
## Packaging
|
||||||
|
|
||||||
@ -290,7 +321,6 @@ $ waybackpy --url google.com --user_agent "my-unique-user-agent" --get save # Sa
|
|||||||
3. Sign & upload the package ``twine upload -s dist/*``.
|
3. Sign & upload the package ``twine upload -s dist/*``.
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
|
||||||
Released under the MIT License. See
|
Released under the MIT License. See
|
||||||
[license](https://github.com/akamhy/waybackpy/blob/master/LICENSE) for details.
|
[license](https://github.com/akamhy/waybackpy/blob/master/LICENSE) for details.
|
||||||
|
|
||||||
|
|
||||||
|
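Note on the `near()` example in this diff: the README comment says not to zero-pad the `year`, `month`, `day`, `hour`, and `minute` arguments, but it does not say why. One likely reason, offered here as an assumption and not as the author's stated rationale, is that Python 3 rejects zero-padded decimal integer literals such as `01` outright. The sketch below makes no waybackpy calls; the helper `is_valid_int_literal` is a hypothetical name used only for illustration.

```python
# Illustrative sketch only: checks whether Python 3 accepts a given
# integer literal. Nothing here is part of waybackpy.

def is_valid_int_literal(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# An unpadded literal such as month=1 compiles.
print(is_valid_int_literal("1"))

# A zero-padded literal such as month=01 is a SyntaxError in Python 3.
print(is_valid_int_literal("01"))

# Padded values that arrive as strings (for example from user input)
# can be converted explicitly before being passed as keyword arguments.
month = int("01")
print(month)
```

If padded values come from a form or a file name, converting them with `int()` first keeps the call sites in the README examples unchanged.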