From f17c54a840dfbe92b564d61f93eab2f2f377d599 Mon Sep 17 00:00:00 2001
From: "Erik Winter (aider)"
Date: Mon, 25 Nov 2024 10:22:05 +0100
Subject: [PATCH] docs: Add README.md with project overview, installation, and usage instructions

---
 README.md | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..50c5f13
--- /dev/null
+++ b/README.md
@@ -0,0 +1,70 @@
+# Link Checker
+
+A recursive link checker that crawls websites to find broken links and redirects. It helps maintain website health by identifying:
+
+- Broken links (HTTP 4xx, 5xx status codes)
+- Network/DNS errors
+- HTTP redirects (3xx status codes)
+
+## Features
+
+- Recursive crawling of websites
+- Handles both absolute and relative URLs
+- Detects and reports HTTP redirects
+- Shows progress during scanning
+- Normalizes URLs for consistent checking
+- Stays within the same domain
+- Detailed reporting of issues found
+
+## Installation
+
+1. Make sure you have Go installed (version 1.16 or later)
+2. Clone this repository:
+```bash
+git clone
+cd link-checker
+```
+
+3. Install dependencies:
+```bash
+go mod download
+```
+
+## Usage
+
+Run the link checker by providing a starting URL:
+
+```bash
+go run . -url="https://example.com"
+```
+
+The tool will:
+1. Crawl all pages on the same domain
+2. Check all links found (both internal and external)
+3. Display progress during the scan
+4. Generate a report showing:
+   - Total pages checked
+   - List of redirected links
+   - List of broken links
+   - Summary statistics
+
+## Example Output
+
+```
+Checking page 1: https://example.com
+Checking page 2: https://example.com/about
+...
+
+Total pages checked: 15
+
+Redirects found:
+- http://example.com/old-page (Redirect 301 -> https://example.com/new-page)
+- http://example.com/blog (Redirect 302 -> https://blog.example.com)
+
+Broken links found:
+- https://example.com/missing-page (Status: 404)
+- https://example.com/server-error (Status: 500)
+- https://external-site.com/broken (Error: connection refused)
+
+Total issues: 5 (2 redirects, 3 broken)
+```