Elvis

Elvis: Australian Sales Lead Call List Scraper

Welcome! Elvis is designed for everyone. You don’t need to know how to code. Just follow the step-by-step guides and diagrams below to get started quickly.


How Elvis Works (At a Glance)

Procedure RunElvis()
Begin
    Read seed URLs from srv/urls.txt
    For each URL:
        Fetch job listings
        Extract company and location using SED/AWK
    End For
    Deduplicate and validate results
    Write output to home/calllist.txt
    If --append-history is set:
        Append new companies to history
    End If
End
End Procedure

flowchart TD
    A[Start] --> B[Read seed URLs]
    B --> C[Fetch job listings]
    C --> D[Extract company/location]
    D --> E[Deduplicate & validate]
    E --> F[Write calllist.txt]
    F --> G{Append history?}
    G -- Yes --> H[Update company_history.txt]
    G -- No --> I[Done]
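
The extract and deduplicate steps above can be sketched in POSIX shell. This is a minimal sketch, not Elvis's actual parser: it assumes a hypothetical intermediate format of tab-separated lines (title, company, location), and the function names extract_fields and dedupe_companies are illustrative only.

```shell
#!/bin/sh
# Sketch only: assumes listings arrive as tab-separated lines of
# "title<TAB>company<TAB>location" (a hypothetical intermediate format).

# Print "company|location" for every line that has a non-empty company.
extract_fields() {
    awk -F '\t' '
        {
            company = $2; location = $3
            # Trim leading and trailing whitespace from both fields.
            gsub(/^[ \t]+|[ \t]+$/, "", company)
            gsub(/^[ \t]+|[ \t]+$/, "", location)
            if (company != "") print company "|" location
        }'
}

# Keep the first row seen for each company, comparing names case-insensitively.
dedupe_companies() {
    awk -F '|' '!seen[tolower($1)]++'
}
```

Chaining the two as extract_fields | dedupe_companies keeps each stage small and independently testable, which matches the modular AWK/SED approach described in this README.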

Pseudocode: Validating Output

Procedure ValidateCallList()
Begin
    If home/calllist.txt does not exist or is empty then
        Log error and exit
    End If
    For each row in calllist.txt:
        Check format and required fields
        If invalid, log error
    End For
    If all rows valid then
        Print "Validation successful"
    Else
        Print "Validation failed"
    End If
End
End Procedure
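
The validation procedure above might look like this in POSIX shell. It is a sketch under stated assumptions rather than the contents of lib/validate_calllist.sh: the row format "company|location" and the function name validate_calllist are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes each row is "company|location" (hypothetical format).

# Return 0 if the file exists, is non-empty, and every row has both fields.
validate_calllist() {
    file=$1
    if [ ! -s "$file" ]; then
        echo "ERROR: $file is missing or empty" >&2
        return 1
    fi
    # awk reports each bad row on stderr and exits non-zero if any were found.
    if awk -F '|' '
        NF != 2 || $1 == "" || $2 == "" {
            printf "ERROR: line %d is invalid: %s\n", NR, $0 | "cat 1>&2"
            bad = 1
        }
        END { exit bad }' "$file"
    then
        echo "Validation successful"
    else
        echo "Validation failed"
        return 1
    fi
}
```

Calling validate_calllist home/calllist.txt returns a non-zero status on failure, so a caller or CI job can stop early.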

Mermaid: Elvis Main Pipeline

Mermaid: Elvis System Architecture (C4 Container Diagram)

C4Context
    Person(user, "User", "Runs Elvis and reviews call lists")
    System(elvis, "Elvis", "POSIX shell web scraper")
    Container(bin, "bin/elvis.sh", "Shell Script", "Entrypoint orchestrator")
    Container(dataInput, "lib/data_input.sh", "Shell Script", "Fetches and extracts job data")
    Container(processor, "lib/processor.sh", "Shell Script", "Normalizes and deduplicates")
    Container(validator, "lib/validate_calllist.sh", "Shell Script", "Validates output")
    ContainerDb(output, "home/calllist.txt", "Text File", "Final call list output")
    Rel(user, elvis, "Runs")
    Rel(elvis, bin, "Orchestrates")
    Rel(bin, dataInput, "Invokes")
    Rel(dataInput, processor, "Sends extracted data")
    Rel(processor, validator, "Sends processed data")
    Rel(validator, output, "Writes validated call list")



Elvis is a POSIX shell-based web scraper that generates daily call lists of Australian companies from job boards (e.g., Seek). It is built for reliability, transparency, and easy customization using POSIX utilities only.


Onboarding: Choose Your Path

Start here! Use the flowchart below to find the best onboarding for your needs.

flowchart TD
    A[Start Here] --> B{What do you want to do?}
    B --> C[Just use Elvis to get call lists]
    B --> D[Understand how Elvis works]
    B --> E[Contribute code or docs]
    C --> F[Non-Technical Onboarding]
    D --> G[Technical Onboarding]
    E --> H[Contributor Onboarding]

See the Onboarding Guide for step-by-step help.

Glossary (Quick Reference)

Elvis Project Concepts (Mindmap)

mindmap
  root((Elvis))
    Usage
      "Call List"
      "Seed URL"
      "User Agent"
    Architecture
      "POSIX Shell"
      "Modular Scripts"
      "Config in etc/elvisrc"
    Compliance
      "robots.txt"
      "Ethical scraping"
    Processing
      "Deduplication"
      "Validation"
      "Parser"

See the full Glossary in the Wiki.


Example run: cloning the repo and generating home/calllist.txt.

Add a screenshot or animated GIF at assets/demo.png showing a typical run or home/calllist.txt sample. Keep images small for mobile readability.


Table of Contents

Wiki

The Elvis Wiki is your beginner-friendly guide to using, configuring, and understanding Elvis. It is organized for non-technical users and covers:

Start here: Elvis Wiki Home

Tip: regenerate an up-to-date TOC with:

grep '^#' README.md | sed 's/^#* */- /'

Overview

Elvis fetches job listings from configured seed URLs, extracts company names and locations using modular AWK/SED parsers, deduplicates results (history-aware), validates output format, and writes a daily home/calllist.txt for sales outreach.
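
History-aware deduplication can be illustrated with a short POSIX shell sketch. The assumptions are that the history file holds one company name per line and that call-list rows are "company|location"; both formats, and the function name filter_known_companies, are hypothetical and may differ from what Elvis actually does.

```shell
#!/bin/sh
# Sketch only: assumes the history file holds one company name per line and
# call-list rows are "company|location" (both formats are hypothetical).

# Print rows from stdin whose company is not already in the history file.
filter_known_companies() {
    history_file=$1
    awk -F '|' -v hist="$history_file" '
        BEGIN {
            # Load known companies; a missing or empty file leaves the table empty.
            while ((getline line < hist) > 0) known[tolower(line)] = 1
            close(hist)
        }
        # Pass through only rows whose company name is not in the table.
        !(tolower($1) in known)'
}
```

Loading the history in a BEGIN block, rather than relying on the FNR == NR idiom, keeps the filter correct when the history file is empty.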


Features


Getting Started

Prerequisites

Install & Quick Start

git clone https://github.com/yourusername/elvis.git
cd elvis
chmod +x bin/elvis.sh lib/*.sh
bin/elvis.sh

Run with --append-history to append newly discovered companies to srv/company_history.txt (the default is not to append; change via APPEND_HISTORY_DEFAULT in etc/elvisrc). When history is updated, Elvis writes a company_history-YYYYMMDDTHHMMSS.patch to var/spool/ for auditing.
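
The history update and audit patch described above could be sketched as follows. Only the patch file naming pattern comes from this README; the function name append_history, its arguments, and the one-name-per-line history format are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: the function name, arguments, and one-name-per-line history
# format are assumptions; only the patch naming pattern comes from the README.

# Append companies (one per line on stdin) to the history file and save a
# unified diff of the change as company_history-YYYYMMDDTHHMMSS.patch.
append_history() {
    history_file=$1
    spool_dir=$2
    stamp=$(date +%Y%m%dT%H%M%S)
    backup="$history_file.orig.$$"
    patch_file="$spool_dir/company_history-$stamp.patch"

    mkdir -p "$spool_dir"
    touch "$history_file"
    cp "$history_file" "$backup"
    cat >> "$history_file"

    # diff exits 1 when the files differ, which is the expected case here.
    if diff -u "$backup" "$history_file" > "$patch_file"; then
        # No difference: nothing was appended, so drop the empty patch.
        rm -f "$patch_file"
    fi
    rm -f "$backup"
}
```

For example, printf 'Beta Co\n' | append_history srv/company_history.txt var/spool would add one company and leave a timestamped patch in var/spool/ for review.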


Configuration

All runtime configuration is in etc/elvisrc. Notable keys (see USAGE.md):

Testing / CI hooks:


Usage & Validation


Project Directory Tree

Generate a tree with:

find . -type d | sed 's|[^/]*/| |g'

Key folders:


Additional Documentation

Man Page

You can view the manual with:

man ./docs/man/elvis.1

To install for your user:

sh scripts/build_manpage.sh install --user
man elvis

Or system-wide (may require sudo):

sh scripts/build_manpage.sh install
man elvis

To uninstall:

sh scripts/build_manpage.sh uninstall [--user]

Diátaxis docs (organized)

See the docs/ folder for more content and examples.

Documentation Standards (short)


Roadmap


Contributing

Please see CONTRIBUTING.md for guidelines on reporting issues, proposing changes, and submitting pull requests. Also review CODE_OF_CONDUCT.md, SECURITY.md, and SUPPORT.md for community and security policies.

Basic expectations:


Support & Community


License

This project is licensed under the GNU Affero General Public License v3.0.


Acknowledgements