Architecture
wayparam is intentionally modular. Each module has a single responsibility, which makes the tool easier to test, audit, and package.
High-level data flow
- cli.py
  - Parses args
  - Builds option objects
  - Orchestrates concurrency
- wayback.py
  - Builds CDX query parameters
  - Handles pagination/resumeKey
- http.py
  - Makes resilient HTTP requests (retries, backoff)
- filters.py
  - Drops “boring” URLs (static assets) early
- normalize.py
  - Canonicalizes and normalizes URLs (stable output)
- output.py
  - Writes records to files and/or stdout (txt/jsonl)
- ratelimit.py
  - Global RPS limiter (optional)
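As a rough illustration of how the filter, normalize, and output stages listed above could chain together, here is a minimal sketch. The function names, the extension list, and the normalization rules are all hypothetical; the real filters.py, normalize.py, and output.py may behave differently.

```python
import json
import sys
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical extension list; the real filters.py may use a different set.
BORING_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif", ".svg", ".ico", ".woff2")


def is_boring(url: str) -> bool:
    """Return True when the URL path ends in a static-asset extension."""
    return urlsplit(url).path.lower().endswith(BORING_EXTENSIONS)


def normalize(url: str) -> str:
    """Lowercase scheme and network location, drop the fragment, sort the query."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", query, "")
    )


def run_pipeline(raw_urls, out=None) -> int:
    """Filter early, normalize, de-duplicate, and write one JSON object per line."""
    if out is None:
        out = sys.stdout
    seen = set()
    for raw in raw_urls:
        if is_boring(raw):  # drop static assets before doing any other work
            continue
        url = normalize(raw)
        if url in seen:  # identical normalized URLs are written only once
            continue
        seen.add(url)
        out.write(json.dumps({"url": url}) + "\n")
    return len(seen)
```

Because each stage in this sketch is a pure function over strings, it can be exercised without any network access, and passing a file-like object as `out` keeps the jsonl writer independent of where the records end up.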
Why this structure matters
- Unit tests focus on pure logic (normalize.py, filters.py, parsing)
- Integration tests mock HTTP at the transport layer (httpx MockTransport)
- CLI stays pipeline-friendly: stdout is clean and predictable