fix: remove stale mark_notified import, full main.py scheduler refactor, fix scraper datetime+image extraction
@@ -1,63 +1,123 @@
# willhaben-tracker

Telegram bot + scraper for willhaben.at classified ads. Self-hosted on Unraid via Docker Compose.

Telegram bot that monitors classified ads on [willhaben.at](https://www.willhaben.at) and notifies you about new listings matching your keywords, plus price drops on tracked ads. Notifications include photos when available.

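As a rough illustration of the "new listings plus price drops" behaviour described above, the comparison between already-tracked ads and a fresh scrape could look like the sketch below. The `Ad` type, its fields, and `diff_ads` are hypothetical names chosen for this example; the actual worker code is not shown in this README and may be structured differently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    """Hypothetical minimal view of a listing; the real schema is not shown here."""
    ad_id: str
    title: str
    price: float


def diff_ads(tracked: dict[str, Ad], scraped: list[Ad]) -> tuple[list[Ad], list[tuple[Ad, Ad]]]:
    """Return (new_ads, price_drops); price_drops holds (old, new) pairs."""
    new_ads: list[Ad] = []
    price_drops: list[tuple[Ad, Ad]] = []
    for ad in scraped:
        old = tracked.get(ad.ad_id)
        if old is None:
            # Never seen before: candidate for a "new listing" notification.
            new_ads.append(ad)
        elif ad.price < old.price:
            # Already tracked and now cheaper: candidate for a "price drop" notification.
            price_drops.append((old, ad))
    return new_ads, price_drops
```

Keying the tracked ads by ID keeps each lookup constant-time, so one pass over the scrape result is enough.
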
## Stack
## Prerequisites

- **Postgres 15** with logical WAL, init scripts run alphabetically on first boot
- **PostgREST**: auto-generated REST API over Postgres `public` schema
- **Kong**: reverse proxy routing `/rest/v1/` to PostgREST
- **Supabase Studio**: database browser and management UI
- **Python worker**: Telegram long polling + scrape scheduler

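The scrape scheduler in the Python worker is not detailed in this README. One simple way to decide whether a search query should be scraped again, assuming each query stores a timezone-aware "last scraped" timestamp and an interval in minutes (the documented default is `60`), is sketched below. `is_due` and its parameters are hypothetical names, not the worker's actual API.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def is_due(last_scraped_at: datetime | None, interval_minutes: int, now: datetime) -> bool:
    """Return True when a query has never been scraped or its interval has elapsed."""
    if last_scraped_at is None:
        return True
    return now - last_scraped_at >= timedelta(minutes=interval_minutes)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_due(now - timedelta(minutes=90), 60, now))  # interval elapsed
```

Passing `now` in explicitly rather than reading the clock inside the function keeps the check easy to test.
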
## Services

| Service | Image | Port | Description |
|---------|-------|------|-------------|
| db | postgres:15-alpine | `55632` | PostgreSQL with init migrations |
| rest | postgrest/postgrest:v12.2.0 | internal | REST API over Postgres |
| kong | kong:2.8.1 | `55621` | API gateway / reverse proxy |
| studio | supabase/studio | `55630` | Supabase dashboard UI |
| meta | supabase/postgres-meta:v0.84.2 | internal | Database introspection for Studio |
| worker | custom (./worker) | none | Bot + scraper process |
- Docker
- Docker Compose (v2.x)

## Quick Start

```bash
cp .env.example .env
# Edit TELEGRAM_BOT_TOKEN and POSTGRES_PASSWORD in .env
# Edit .env with your bot token and database credentials
docker compose up -d --build
```

On first boot, Postgres init scripts run automatically in order:

1. `00-run-init.sh`: creates roles (authenticator, dashboard_user)
2. `01-init.sql`: creates tables and indexes
3. `post-boot.sql`: applies grants on created tables

## Deployment

### Unraid + Portainer
### Prerequisites

1. Set the Docker Compose project path to `/mnt/user/appdata/willhaben-tracker`
2. Ensure `.env` is present with valid credentials
3. Deploy via Portainer: **Stacks → Add stack**, paste `docker-compose.yml` contents and attach `.env`
4. Postgres data persists at `/mnt/user/appdata/willhaben-tracker/data/db`

- Docker 24+ (or any modern version with compose v2)
- Docker Compose plugin installed (`docker compose` command works)
- ~500 MB disk space for data volumes
- Outbound HTTPS access to `api.telegram.org` and `willhaben.at`

### Manual (Linux)
### Step-by-step Setup

1. Clone the repository and copy the environment file:
```bash
cd /path/to/willhaben-tracker
git clone <repo-url> && cd willhaben-tracker
cp .env.example .env
# Edit .env with your credentials
```

2. Edit `.env`; at minimum, set these values:
- `TELEGRAM_BOT_TOKEN`: get from @BotFather on Telegram
- `POSTGRES_PASSWORD`: any secure password

3. Start all services:
```bash
docker compose up -d --build
```

4. Verify health:
```bash
# All containers should show "Up" and the db container "healthy"
docker compose ps

# Worker logs should show "Bot started with long polling"
docker compose logs worker | grep "started"
```

5. Test the bot: send `/start` to your bot on Telegram, then `/add <keyword>` to create a search.

### Updating the Stack

```bash
git pull && docker compose up -d --build
```

New database migrations are applied automatically when the Postgres container restarts with new migration files in `supabase/migrations/`.

### Backing Up Data

```bash
# Full database dump
docker compose exec db pg_dump -U postgres postgres > backup_$(date +%F).sql

# Restore from backup
psql -h localhost -p 55632 -U postgres -d postgres < backup_2024-01-01.sql
```

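If the dump should run from a script (for example a scheduled job on the host), the same `pg_dump` call can be wrapped in Python. This is only a sketch: `backup_filename`, `pg_dump_command`, and `run_backup` are hypothetical helper names, and it assumes the script runs from the compose project directory. The `-T` flag disables pseudo-TTY allocation in `docker compose exec`, which is generally what you want when capturing output to a file.

```python
import subprocess
from datetime import date
from pathlib import Path


def backup_filename(day: date) -> str:
    """Mirror the shell pattern backup_$(date +%F).sql, e.g. backup_2024-01-01.sql."""
    return f"backup_{day.isoformat()}.sql"


def pg_dump_command(user: str = "postgres", database: str = "postgres") -> list:
    """Build the docker compose command that dumps the database to stdout."""
    return ["docker", "compose", "exec", "-T", "db", "pg_dump", "-U", user, database]


def run_backup(target_dir: Path) -> Path:
    """Write today's dump into target_dir and return the resulting file path."""
    target = target_dir / backup_filename(date.today())
    with target.open("wb") as out:
        # check=True raises CalledProcessError if pg_dump exits with a non-zero status.
        subprocess.run(pg_dump_command(), stdout=out, check=True)
    return target
```
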
### Troubleshooting Common Issues

| Problem | Solution |
|---------|----------|
| Postgres auth error on first boot | Volume may contain stale `pg_hba.conf`. Delete `data/db` and restart. |
| Studio shows "unhealthy" but APIs work | Cosmetic health check issue; ignore unless you can't browse tables. |
| Bot doesn't respond | Check `docker compose logs worker` for errors. Verify token in `.env`. |

## Configuration

Edit `.env` before first startup. All values are read by the worker and database services.

| Variable | Description | Default |
|----------|-------------|---------|
| `TELEGRAM_BOT_TOKEN` | Token from @BotFather | (required) |
| `POSTGRES_USER` | PostgreSQL username | `postgres` |
| `POSTGRES_PASSWORD` | PostgreSQL password | (required) |
| `POSTGRES_DB` | Database name | `postgres` |
| `JWT_SECRET` | PostgREST JWT signing key | auto-generated default |
| `DEFAULT_INTERVAL_MINUTES` | Default scrape interval per search query | `60` |
| `ADMIN_TELEGRAM_IDS` | Comma-separated Telegram IDs with admin privileges | (none) |

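To make the table concrete, here is one way a worker process could read these variables. This is a sketch under assumptions: `Settings` and `load_settings` are hypothetical names, only a subset of the variables is covered, and the real worker may parse its configuration differently.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    telegram_bot_token: str
    postgres_password: str
    default_interval_minutes: int
    admin_telegram_ids: frozenset[int]


def load_settings(env: Mapping[str, str] | None = None) -> Settings:
    """Read settings from a mapping (os.environ by default), applying the documented defaults."""
    env = os.environ if env is None else env
    # The two variables marked "(required)" in the table must be present and non-empty.
    missing = [key for key in ("TELEGRAM_BOT_TOKEN", "POSTGRES_PASSWORD") if not env.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    # ADMIN_TELEGRAM_IDS is comma-separated; empty entries are skipped.
    raw_ids = env.get("ADMIN_TELEGRAM_IDS", "")
    admin_ids = frozenset(int(part) for part in raw_ids.split(",") if part.strip())
    return Settings(
        telegram_bot_token=env["TELEGRAM_BOT_TOKEN"],
        postgres_password=env["POSTGRES_PASSWORD"],
        default_interval_minutes=int(env.get("DEFAULT_INTERVAL_MINUTES", "60")),
        admin_telegram_ids=admin_ids,
    )
```

Failing at startup when a required value is missing tends to be easier to diagnose than a later authentication error from Telegram or Postgres.
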
## Architecture

| Service | Image | Port | Purpose |
|---------|-------|------|---------|
| db | postgres:15-alpine | 55632 | PostgreSQL database |
| rest | postgrest/postgrest:v12.2.0 | - | REST API over the database |
| kong | kong:2.8.1 | 55621 | Reverse proxy / gateway |
| studio | supabase/studio | 55630 | Supabase Studio admin UI |
| meta | supabase/postgres-meta:v0.84.2 | - | Database metadata service |
| worker | (built from ./worker) | - | Scraper + Telegram bot |

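Since Kong routes `/rest/v1/` to PostgREST on port `55621`, tables in the `public` schema should be reachable over HTTP using PostgREST's `column=operator.value` filter syntax. The helper below only builds such a URL. The table name `ads`, the `price` column, and the `localhost` base URL are assumptions for illustration, and this stack may additionally require an API key or JWT header that is not covered here.

```python
from __future__ import annotations

from urllib.parse import urlencode


def rest_url(table: str, filters: dict[str, str] | None = None,
             base: str = "http://localhost:55621") -> str:
    """Build a PostgREST query URL behind the Kong /rest/v1/ route."""
    query = {"select": "*"}
    query.update(filters or {})
    # Keep '*' and '.' unescaped so PostgREST operators such as lt.100 stay readable.
    return f"{base}/rest/v1/{table}?{urlencode(query, safe='*.')}"


if __name__ == "__main__":
    # Hypothetical table and column names.
    print(rest_url("ads", {"price": "lt.100"}))
```
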
## Telegram Commands

- `/start`: Welcome + usage instructions (whitelisted only)
- `/add "keyword"`: Create new search query
- `/list`: Show active queries
- `/pause <id>` / `/resume <id>`: Toggle query
- `/delete <id>`: Remove query
- `/stats`: Tracking statistics

All users must be whitelisted before use. Run `/start` to activate your account.

| Command | Access | Description |
|---------|--------|-------------|
| `/start` | Anyone | Activate account and show help message |
| `/add <keyword>` | Active user | Subscribe to keyword (shared across users) |
| `/list` | Active user | List your subscriptions with subscriber counts |
| `/delete <keyword_id>` | Active user | Unsubscribe from a keyword |
| `/stats` | Active user | Show queries count, ads tracked, notifications |
| `/adduser <id> [admin]` | Admin only | Add or promote a user by Telegram ID |
| `/removeuser <id>` | Admin only | Remove a user from the bot |
| `/users` | Admin only | List all registered users and their roles |

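As an illustration of the `/adduser <id> [admin]` syntax from the table, the argument handling could be sketched as follows. `parse_adduser` is a hypothetical helper, not the bot's actual handler, and it does not handle the `/adduser@botname` form that Telegram uses in group chats.

```python
from __future__ import annotations


def parse_adduser(text: str) -> tuple[int, bool]:
    """Parse '/adduser <id> [admin]' into (telegram_id, is_admin)."""
    usage = "usage: /adduser <id> [admin]"
    parts = text.split()
    if len(parts) not in (2, 3) or parts[0] != "/adduser":
        raise ValueError(usage)
    if len(parts) == 3 and parts[2] != "admin":
        raise ValueError(usage)
    # int() raises ValueError itself when the ID is not numeric.
    return int(parts[1]), len(parts) == 3
```

For example, `/adduser 298181113 admin` would parse to the ID `298181113` with the admin flag set, while `/adduser 42` would register a regular user.
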
## Default Admin

On first boot, Telegram ID `298181113` is seeded as an admin user. Add additional admins via `/adduser <telegram_id> admin`.