GitHub - oduwsdl/archivenow: A Tool To Push Web Resources Into Web Archives

A Tool To Push Web Resources Into Web Archives. Contribute to oduwsdl/archivenow development by creating an account on GitHub.


Last Updated: 9 October 2025


36 Projects and apps Similar to "GitHub - oduwsdl/archivenow: A Tool To Push Web Resources Into Web Archives" in October 2025

22120 - A non-WARC-based tool which hooks into the Chrome browser and archives everything you browse making it available for offline replay.
ArchiveBox - A tool which maintains an additive archive from RSS feeds, bookmarks, and links using wget, Chrome headless, and other methods (formerly
archiveweb.page
GitHub - webrecorder/browsertrix-crawler: Run a high-fidelity browser-based crawler in a single Docker container
GitHub - internetarchive/brozzler: brozzler - distributed browser-based web crawler
GitHub - wabarc/cairn: NPM package and CLI tool for saving webpages
GitHub - CGamesPlay/chronicler: Offline-first web browser
ale / crawl
Simple web crawler in Go
GitHub - PromyLOPh/crocoite: Web archiving using Google Chrome
GitHub - justinlittman/fbarc: A commandline tool and Python library for archiving data from Facebook using the Graph API.
GitHub - WebMemex/freeze-dry: Snapshots a web page to get it as a static, self-contained HTML document.
GitHub - ArchiveTeam/grab-site: The archivist’s web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
Home · internetarchive/heritrix3 Wiki
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
GitHub - web-archive-group/heritrix-walkthrough
GitHub - steffenfritz/html2warc: simple script to convert web resources to a single warc file
HTTrack Website Copier - Free Software Offline Browser (GNU GPL)
HTTrack is a free (GPL, libre/free software) and easy-to-use offline browser utility. It allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges…
GitHub - Y2Z/monolith: ⬛️ CLI tool for saving complete web pages as a single HTML file
GitHub - go-shiori/obelisk: Go package and CLI tool for saving web page as single HTML file
GitHub - gildas-lormeau/SingleFile: Web Extension for Firefox/Chrome/MS Edge and CLI tool to save a faithful copy of an entire web page in a single HTML file
SiteStory - A transactional archive that selectively captures and stores transactions that take place between a web client (browser) and a web server.
Home
Social Feed Manager is open-source software that harvests social media data and related content from Twitter, Tumblr, Flickr, and Sina Weibo.
GitHub - N0taN3rd/Squidwarc: Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head
StormCrawler
StormCrawler is a collection of resources for building low-latency, scalable web crawlers on Apache Storm.
GitHub - DocNow/twarc: A command line tool (and Python library) for archiving Twitter JSON
GitHub - machawk1/wail: Web Archiving Integration Layer: One-Click User Instigated Preservation
GitHub - internetarchive/warcprox: WARC writing MITM HTTP/S proxy
WARCreate - Create WARC files from any webpage!
GitHub - peterk/warcworker: A dockerized, queued high fidelity web archiver based on Squidwarc
GitHub - wabarc/wayback: A toolkit for snapshot webpage to Internet Archive, archive.today, IPFS and beyond
GitHub - helgeho/Web2Warc: An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)
Web Curator Tool
Open-source workflow management for selective web archiving
WebMemex
WebMemex has 3 repositories available. Follow their code on GitHub.
Conifer
Collect and revisit web pages. Free, open-source web archiving service.
Wget - GNU Project - Free Software Foundation
GitHub - alard/wget-lua: Wget with Lua extension
Wpull - A Wget-compatible (or remake/clone/replacement/alternative) web downloader and crawler.

Rackpiper Technology Inc
