2024-11-25 10:22:05 +01:00
# Link Checker
A recursive link checker that crawls websites to find broken links and redirects. It helps maintain website health by identifying:
- Broken links (HTTP 4xx, 5xx status codes)
- Network/DNS errors
- HTTP redirects (3xx status codes)
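
The three categories above can be sketched as a small classification function. This is only an illustration of the status-code ranges listed here, not the tool's actual code; the names `classify` and `errDemo` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errDemo stands in for a network or DNS failure (hypothetical value).
var errDemo = errors.New("connection refused")

// classify maps the outcome of a request to one of the categories
// described in this README. A non-nil error means no HTTP response
// was received at all.
func classify(status int, err error) string {
	switch {
	case err != nil:
		return "network error"
	case status >= 300 && status < 400:
		return "redirect"
	case status >= 400 && status < 600:
		return "broken"
	default:
		return "ok"
	}
}

func main() {
	fmt.Println(200, classify(200, nil))
	fmt.Println(301, classify(301, nil))
	fmt.Println(404, classify(404, nil))
	fmt.Println(500, classify(500, nil))
	fmt.Println("err", classify(0, errDemo))
}
```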
## Features
- Recursive crawling of websites
- Handles both absolute and relative URLs
- Detects and reports HTTP redirects
- Shows progress during scanning
- Normalizes URLs for consistent checking
- Stays within the same domain
- Detailed reporting of issues found
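
To illustrate what resolving relative URLs, normalizing them, and staying on the same domain could look like, here is a minimal sketch using Go's `net/url` package. The normalization rules shown (lowercasing the host, dropping the fragment) and the exact-hostname comparison are assumptions made for the example; the README does not say which rules the tool applies. `normalize` and `sameHost` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalize resolves href against base and returns an absolute URL.
// Assumed rules: lowercase the host and drop the fragment.
func normalize(base, href string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	u := b.ResolveReference(ref)
	u.Host = strings.ToLower(u.Host)
	u.Fragment = ""
	u.RawFragment = ""
	return u.String(), nil
}

// sameHost reports whether two absolute URLs share a hostname.
// Ports are ignored; subdomains count as different hosts (assumption).
func sameHost(a, b string) bool {
	ua, errA := url.Parse(a)
	ub, errB := url.Parse(b)
	if errA != nil || errB != nil {
		return false
	}
	return ua.Hostname() != "" && strings.EqualFold(ua.Hostname(), ub.Hostname())
}

func main() {
	base := "https://example.com/docs/index.html"
	for _, href := range []string{"../about#team", "guide.html", "HTTPS://Example.COM/x"} {
		u, err := normalize(base, href)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(href, "=>", u, "same host:", sameHost(base, u))
	}
}
```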
## Installation
Make sure you have Go installed (version 1.16 or later), then run:
```bash
go install forgejo.ewintr.nl/ewintr/linkchecker@latest
```
## Usage
Run the link checker by providing a starting URL:
```bash
linkchecker -url="https://example.com"
```
The tool will:
1. Crawl all pages on the same domain
2. Check all links found (both internal and external)
3. Display progress during the scan
4. Generate a report showing:
   - Total pages checked
   - List of redirected links
   - List of broken links
   - Summary statistics
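
The first two steps above can be sketched as a breadth-first traversal: every link found is recorded as checked, but only links on the starting domain are queued for crawling. The sketch uses an explicit queue rather than recursion and runs over a small in-memory link graph instead of real HTTP requests. `crawl`, `demoSite`, `demoFetch`, and `demoScope` are hypothetical names, and the tool's real traversal may differ.

```go
package main

import (
	"fmt"
	"strings"
)

const demoRoot = "https://example.com"

// demoSite is an in-memory stand-in for pages and the links they contain.
var demoSite = map[string][]string{
	"https://example.com": {
		"https://example.com/about",
		"https://external-site.com/broken",
	},
	"https://example.com/about": {
		"https://example.com",
		"https://example.com/missing-page",
	},
}

// demoFetch returns the links found on a page (nil for unknown pages).
func demoFetch(page string) []string { return demoSite[page] }

// demoScope reports whether a link belongs to the starting site.
func demoScope(link string) bool {
	return link == demoRoot || strings.HasPrefix(link, demoRoot+"/")
}

// crawl visits each in-scope page once, in breadth-first order.
// It returns the pages visited and the set of all links seen,
// including out-of-scope links that are checked but not crawled.
func crawl(start string, inScope func(string) bool, fetch func(string) []string) ([]string, map[string]bool) {
	checked := map[string]bool{start: true}
	queued := map[string]bool{start: true}
	queue := []string{start}
	var pages []string
	for len(queue) > 0 {
		page := queue[0]
		queue = queue[1:]
		pages = append(pages, page)
		for _, link := range fetch(page) {
			checked[link] = true
			if inScope(link) && !queued[link] {
				queued[link] = true
				queue = append(queue, link)
			}
		}
	}
	return pages, checked
}

func main() {
	pages, checked := crawl(demoRoot, demoScope, demoFetch)
	for i, p := range pages {
		fmt.Printf("Checking page %d: %s\n", i+1, p)
	}
	fmt.Printf("Pages visited: %d, links seen: %d\n", len(pages), len(checked))
}
```

The `queued` set is what prevents pages that link to each other from being visited in a loop.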
## Example Output
```
Checking page 1: https://example.com
Checking page 2: https://example.com/about
...

Total pages checked: 15

Redirects found:
- http://example.com/old-page (Redirect 301 -> https://example.com/new-page)
- http://example.com/blog (Redirect 302 -> https://blog.example.com)

Broken links found:
- https://example.com/missing-page (Status: 404)
- https://example.com/server-error (Status: 500)
- https://external-site.com/broken (Error: connection refused)

Total issues: 5 (2 redirects, 3 broken)
```
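
Reporting redirect entries like the ones above requires seeing the 3xx response itself, whereas Go's default `http.Client` follows redirects automatically. One way to do this, shown here as a sketch and not necessarily how this tool does it, is to return `http.ErrUseLastResponse` from `CheckRedirect`. The helpers `newClient`, `checkLink`, and `demoServer` are hypothetical, the 10-second timeout is an arbitrary choice, and the local test server stands in for a real site.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// newClient returns a client that does not follow redirects,
// so 3xx responses are returned to the caller as-is.
func newClient() *http.Client {
	return &http.Client{
		Timeout: 10 * time.Second,
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// checkLink requests rawURL and returns the status code and the
// Location header (empty unless the server sent one).
func checkLink(client *http.Client, rawURL string) (int, string, error) {
	resp, err := client.Get(rawURL)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	return resp.StatusCode, resp.Header.Get("Location"), nil
}

// demoServer is a local stand-in for a site with one redirect.
func demoServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Path {
		case "/old-page":
			http.Redirect(w, r, "/new-page", http.StatusMovedPermanently)
		case "/new-page":
			fmt.Fprintln(w, "ok")
		default:
			http.NotFound(w, r)
		}
	}))
}

func main() {
	srv := demoServer()
	defer srv.Close()

	client := newClient()
	for _, path := range []string{"/old-page", "/new-page", "/missing-page"} {
		status, location, err := checkLink(client, srv.URL+path)
		switch {
		case err != nil:
			fmt.Printf("- %s (Error: %v)\n", path, err)
		case location != "":
			fmt.Printf("- %s (Redirect %d -> %s)\n", path, status, location)
		default:
			fmt.Printf("- %s (Status: %d)\n", path, status)
		}
	}
}
```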