From 654d5c0b7da594ee7f2508a970c3a11ce81e8e81 Mon Sep 17 00:00:00 2001
From: The <149513282+the9655a@users.noreply.github.com>
Date: Fri, 3 May 2024 10:51:53 +0100
Subject: [PATCH] Update

---
 Internet-Tools.md |  1 -
 STORAGE.md        | 35 ++++++++++++++++++++---------------
 2 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/Internet-Tools.md b/Internet-Tools.md
index a123c5155..412405fdb 100644
--- a/Internet-Tools.md
+++ b/Internet-Tools.md
@@ -303,7 +303,6 @@
 * [AnyImage](https://anyimage.io/) - Create Social Card Links
 * [urlportal](https://raw.githubusercontent.com/gotbletu/shownotes/master/urlportal.sh) - Custom URL Handler
 * [Web Check](https://web-check.xyz/), [NSLookup](https://www.nslookup.io/) or [dog](https://github.com/ogham/dog) - DNS Information Tool
-* [Awesome Piracy Bot](https://github.com/Igglybuff/awesome-piracy-bot) - URL Scraping Tools
 * [Site Worth Traffic](https://www.siteworthtraffic.com/) - Calculate Website Worth
 * [XML-Sitemaps](https://www.xml-sitemaps.com/) - Sitemap Creator
 * [CarbonDates](https://carbondate.cs.odu.edu/) - Check Site Creation Date
diff --git a/STORAGE.md b/STORAGE.md
index 07324beb9..cc760d468 100644
--- a/STORAGE.md
+++ b/STORAGE.md
@@ -1151,27 +1151,32 @@
 
 ***
 
-## Web Archiving
+## Web Scraping / Crawling
 
-* 🌐 **[awesome-web-scraping](https://github.com/lorien/awesome-web-scraping)** / [2](https://github.com/iipc/awesome-web-archiving) / [3](https://github.com/BruceDone/awesome-crawler)
+* 🌐 **[Awesome Web Scraping](https://github.com/lorien/awesome-web-scraping)** - Web Scraping Tools
+* 🌐 **[Awesome-crawler](https://github.com/BruceDone/awesome-crawler)** - Crawling Resources
+* [Heritrix](https://heritrix.readthedocs.io/) / [GitHub](https://github.com/internetarchive/heritrix3) - Internet Archive's Web Crawler
+* [80legs](https://80legs.com/) - Cloud-Based
+* [Crawly](https://crawly.diffbot.com/) - Online Scraper
+
+## Web Archiving Tools
+
+* 🌐 **[Awesome Web Archiving](https://github.com/iipc/awesome-web-archiving)** - Web Archiving Tools
+* 🌐 **[Webrecorder](https://webrecorder.net/)** - Open source Archiving Tools
 * ⭐ **[datahoarder-website-to-markdown](https://github.com/evilsh3ll/datahoarder-website-to-markdown)** - Index to Markdown Archiving Tool
-* [webrecorder](https://webrecorder.net/)
-* [Heritrix](https://heritrix.readthedocs.io/) / [GitHub](https://github.com/internetarchive/heritrix3)
-* [wail](https://matkelly.com/wail) / [GitHub](https://github.com/machawk1/wail)
-* [80legs](https://80legs.com/)
-* [crawly](https://crawly.diffbot.com/)
-* [replayweb](https://replayweb.page/) - View Archive Format Files
+* [WAIL](https://matkelly.com/wail) / [GitHub](https://github.com/machawk1/wail) - GUI For Archiving Tools
+* [ReplayWeb.page](https://replayweb.page/) - View Web Archive Files
 
 ### Archiving Services
 
-* ⭐ **[Wayback Machine](https://web.archive.org/)**
-* ⭐ **Wayback Machine Tools** - [ArchiveTeam Contribute](https://tracker.archiveteam.org/) / [Downloader](https://github.com/hartator/wayback-machine-downloader), [2](https://github.com/jsvine/waybackpack) / [Classic Frontend](https://wayback-classic.net/) / [Extension](https://github.com/internetarchive/wayback-machine-webextension), [2](https://vegetableman.github.io/vandal/) / [Addon](https://www.reddit.com/r/FREEMEDIAHECKYEAH/wiki/storage#wiki_wayback_machine_extension) / [Script](https://github.com/overcast07/wayback-machine-spn-scripts) / [Toolkit](https://docs.wabarc.eu.org/) / [Multi-URL](https://liamswayne.github.io/Super-Archiver/) / [Auto Load](https://gitlab.com/gkrishnaks/WaybackEverywhere-Firefox)
-* ⭐ **[Archive.is](https://archive.is/)** / [.li](https://archive.li/) / [.ph](https://archive.ph/) / [.vn](https://archive.vn/) / [.fo](https://archive.fo/) / [.md](https://archive.md/)
+* ⭐ **[Archive.org](https://archive.org/)** - Internet Archive
+* ⭐ **[Wayback Machine](https://web.archive.org/)** - Archive Web Pages
+* ⭐ **Wayback Machine Tools** - [Downloader](https://github.com/jsvine/waybackpack) / [Classic Frontend](https://wayback-classic.net/) / [Extension](https://github.com/internetarchive/wayback-machine-webextension), [2](https://vegetableman.github.io/vandal/) / [Addon](https://www.reddit.com/r/FREEMEDIAHECKYEAH/wiki/storage#wiki_wayback_machine_extension) / [Script](https://github.com/overcast07/wayback-machine-spn-scripts) / [Toolkit](https://docs.wabarc.eu.org/) / [Multi-URL](https://liamswayne.github.io/Super-Archiver/) / [Auto Load](https://gitlab.com/gkrishnaks/WaybackEverywhere-Firefox)
+* ⭐ **[Archive.is](https://archive.is/)** / [.li](https://archive.li/) / [.ph](https://archive.ph/) / [.vn](https://archive.vn/) / [.fo](https://archive.fo/) / [.md](https://archive.md/) - Archive Web Pages
 * ⭐ **[cachedview](https://cachedview.nl/)**, **[Web Archives](https://github.com/dessant/web-archives)**, [quickcache](https://cipher387.github.io/quickcacheandarchivesearch/), [resurrect-pages](https://github.com/Albirew/resurrect-pages-isup-edition) - Aggregate Cache Results
-* [Perma.cc](https://perma.cc/)
-* [archiveforever](https://www.archiveforever.xyz/)
-* [ghostarchive](https://ghostarchive.org/)
-* [hozon](https://hozon.site/)
+* [ArchiveTeam](https://wiki.archiveteam.org/index.php/Main_Page) - Archive Projects
+* [Perma.cc](https://perma.cc/) - Create Permalinks
+
 * [Arquivo.pt](https://arquivo.pt/?l=en)
 
 ### Local Archiving