fix: remove stale mark_notified import, full main.py scheduler refactor, fix scraper datetime+image extraction

2026-06-17 08:27:34 +02:00
parent b93811bb1b
commit a21c310eeb
5 changed files with 752 additions and 120 deletions
+99 -39
@@ -1,63 +1,123 @@
# willhaben-tracker
Telegram bot + scraper for willhaben.at classified ads. Self-hosted on Unraid via Docker Compose.
Telegram bot that monitors classified ads on [willhaben.at](https://www.willhaben.at) and notifies you about new listings matching your keywords, plus price drops on tracked ads. Notifications include photos when available.
## Stack
## Prerequisites
- **Postgres 15** with logical WAL, init scripts run alphabetically on first boot
- **PostgREST** — auto-generated REST API over Postgres `public` schema
- **Kong** — reverse proxy routing `/rest/v1/` to PostgREST
- **Supabase Studio** — database browser and management UI
- **Python worker** — Telegram long polling + scrape scheduler
## Services
| Service | Image | Port | Description |
|---------|-------|------|-------------|
| db | postgres:15-alpine | `55632` | PostgreSQL with init migrations |
| rest | postgrest/postgrest:v12.2.0 | internal | REST API over Postgres |
| kong | kong:2.8.1 | `55621` | API gateway / reverse proxy |
| studio | supabase/studio | `55630` | Supabase dashboard UI |
| meta | supabase/postgres-meta:v0.84.2 | internal | Database introspection for Studio |
| worker | custom (./worker) | none | Bot + scraper process |
- Docker
- Docker Compose (v2.x)
## Quick Start
```bash
cp .env.example .env
# Edit TELEGRAM_BOT_TOKEN and POSTGRES_PASSWORD in .env
# Edit .env with your bot token and database credentials
docker compose up -d --build
```
On first boot, Postgres init scripts run automatically in order:
1. `00-run-init.sh` — creates roles (authenticator, dashboard_user)
2. `01-init.sql` — creates tables and indexes
3. `post-boot.sql` — applies grants on created tables
## Deployment
### Unraid + Portainer
### Prerequisites
1. Set the Docker Compose project path to `/mnt/user/appdata/willhaben-tracker`
2. Ensure `.env` is present with valid credentials
3. Deploy via Portainer: **Stacks → Add stack**, paste `docker-compose.yml` contents and attach `.env`
4. Postgres data persists at `/mnt/user/appdata/willhaben-tracker/data/db`
- Docker 24+ (or any modern version with compose v2)
- Docker Compose plugin installed (`docker compose` command works)
- ~500MB disk space for data volumes
- Outbound HTTPS access to `api.telegram.org` and `willhaben.at`
### Manual (Linux)
### Step-by-step Setup
1. Clone the repository and copy environment file:
```bash
cd /path/to/willhaben-tracker
git clone <repo-url> && cd willhaben-tracker
cp .env.example .env
# Edit .env with your credentials
```
2. Edit `.env` — at minimum set these values:
- `TELEGRAM_BOT_TOKEN` — get from @BotFather on Telegram
- `POSTGRES_PASSWORD` — any secure password
3. Start all services:
```bash
docker compose up -d --build
```
4. Verify health:
```bash
# All containers should show "Up" and the db container "healthy"
docker compose ps
# Worker logs should show "Bot started with long polling"
docker compose logs worker | grep "started"
```
5. Test the bot: send `/start` to your bot on Telegram, then `/add <keyword>` to create a search.
### Updating the Stack
```bash
git pull && docker compose up -d --build
```
New database migrations are applied automatically when the Postgres container restarts with new migration files in `supabase/migrations/`.
### Backing Up Data
```bash
# Full database dump
docker compose exec -T db pg_dump -U postgres postgres > backup_$(date +%F).sql
# Restore from backup
psql -h localhost -p 55632 -U postgres -d postgres < backup_2024-01-01.sql
```
### Troubleshooting Common Issues
| Problem | Solution |
|---------|----------|
| Postgres auth error on first boot | Volume may contain stale `pg_hba.conf`. Delete `data/db` and restart. |
| Studio shows "unhealthy" but APIs work | Cosmetic health check issue — ignore unless you can't browse tables. |
| Bot doesn't respond | Check `docker compose logs worker` for errors. Verify token in `.env`. |
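For the "Bot doesn't respond" case, one way to confirm the token itself is to call the Telegram Bot API `getMe` method directly, outside the worker. The following is a minimal sketch and not part of the repository: the helper names `get_me_url` and `token_is_valid` are hypothetical, and it assumes outbound HTTPS access to `api.telegram.org`.

```python
import json
import urllib.error
import urllib.request


def get_me_url(token: str) -> str:
    """Build the Bot API getMe URL for a token (hypothetical helper)."""
    return f"https://api.telegram.org/bot{token}/getMe"


def token_is_valid(token: str, timeout: float = 10.0) -> bool:
    """Return True if Telegram accepts the token, False if it rejects it.

    Network failures (DNS, timeout) are not caught and will raise, so a
    connectivity problem is not mistaken for a bad token.
    """
    try:
        with urllib.request.urlopen(get_me_url(token), timeout=timeout) as resp:
            return bool(json.load(resp).get("ok"))
    except urllib.error.HTTPError:
        # Telegram answers with an HTTP error status when the token is rejected.
        return False
```

Calling `token_is_valid(os.environ["TELEGRAM_BOT_TOKEN"])` from a Python shell on the host separates token problems from worker problems: if it returns `True`, the token is fine and the worker logs are the next place to look.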
## Configuration
Edit `.env` before first startup. All values are read by the worker and database services.
| Variable | Description | Default |
|---------------------------|----------------------------------------------------|--------------------------------|
| `TELEGRAM_BOT_TOKEN` | Token from @BotFather | (required) |
| `POSTGRES_USER` | PostgreSQL username | `postgres` |
| `POSTGRES_PASSWORD` | PostgreSQL password | (required) |
| `POSTGRES_DB` | Database name | `postgres` |
| `JWT_SECRET` | PostgREST JWT signing key | auto-generated default |
| `DEFAULT_INTERVAL_MINUTES`| Default scrape interval per search query | `60` |
| `ADMIN_TELEGRAM_IDS` | Comma-separated Telegram IDs with admin privileges | (none) |
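The variables in the table above could be read into a typed settings object along the following lines. This is a sketch under stated assumptions, not the worker's actual code: the `Settings` class and `load_settings` function are hypothetical names, and the parsing of `ADMIN_TELEGRAM_IDS` (whitespace tolerated, empty entries skipped) is an assumption about how a comma-separated list might be handled.

```python
import os
from dataclasses import dataclass
from typing import FrozenSet, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    telegram_bot_token: str
    postgres_user: str
    postgres_password: str
    postgres_db: str
    default_interval_minutes: int
    admin_telegram_ids: FrozenSet[int]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (defaults to os.environ).

    Raises RuntimeError if a variable marked "(required)" is missing or empty.
    """
    env = os.environ if env is None else env

    required = ("TELEGRAM_BOT_TOKEN", "POSTGRES_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))

    # Comma-separated IDs; blank entries (e.g. a trailing comma) are skipped.
    raw_ids = env.get("ADMIN_TELEGRAM_IDS", "")
    admin_ids = frozenset(int(part) for part in raw_ids.split(",") if part.strip())

    return Settings(
        telegram_bot_token=env["TELEGRAM_BOT_TOKEN"],
        postgres_user=env.get("POSTGRES_USER", "postgres"),
        postgres_password=env["POSTGRES_PASSWORD"],
        postgres_db=env.get("POSTGRES_DB", "postgres"),
        default_interval_minutes=int(env.get("DEFAULT_INTERVAL_MINUTES", "60")),
        admin_telegram_ids=admin_ids,
    )
```

Failing fast on the two required variables surfaces a misconfigured `.env` at startup rather than at the first Telegram or database call.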
## Architecture
| Service | Image | Port | Purpose |
|----------|------------------------------------|-------|----------------------------------|
| db | postgres:15-alpine | 55632 | PostgreSQL database |
| rest | postgrest/postgrest:v12.2.0 | — | REST API over the database |
| kong | kong:2.8.1 | 55621 | Reverse proxy / gateway |
| studio | supabase/studio | 55630 | Supabase Studio admin UI |
| meta | supabase/postgres-meta:v0.84.2 | — | Database metadata service |
| worker | (built from ./worker) | — | Scraper + Telegram bot |
## Telegram Commands
- `/start` — Welcome + usage instructions (whitelisted only)
- `/add "keyword"` — Create new search query
- `/list` — Show active queries
- `/pause <id>` / `/resume <id>` — Toggle query
- `/delete <id>` — Remove query
- `/stats` — Tracking statistics
All users must be whitelisted before use. Run `/start` to activate your account.
| Command | Access | Description |
|--------------------|-------------|--------------------------------------------------|
| `/start` | Anyone | Activate account and show help message |
| `/add <keyword>` | Active user | Subscribe to keyword (shared across users) |
| `/list` | Active user | List your subscriptions with subscriber counts |
| `/delete <keyword_id>` | Active user | Unsubscribe from a keyword |
| `/stats`           | Active user | Show query count, ads tracked, and notifications sent |
| `/adduser <id> [admin]` | Admin only | Add or promote a user by Telegram ID |
| `/removeuser <id>` | Admin only | Remove a user from the bot |
| `/users` | Admin only | List all registered users and their roles |
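The Access column above can be read as a small permission check. The sketch below is illustrative only: `Access`, `COMMAND_ACCESS`, `user_level`, and `is_allowed` are hypothetical names, and it assumes the levels are hierarchical (an admin can also use every "Active user" command), which the table does not state explicitly.

```python
from enum import IntEnum
from typing import Collection


class Access(IntEnum):
    ANYONE = 0
    ACTIVE_USER = 1
    ADMIN = 2


# Minimum level per command, taken from the command table.
COMMAND_ACCESS = {
    "/start": Access.ANYONE,
    "/add": Access.ACTIVE_USER,
    "/list": Access.ACTIVE_USER,
    "/delete": Access.ACTIVE_USER,
    "/stats": Access.ACTIVE_USER,
    "/adduser": Access.ADMIN,
    "/removeuser": Access.ADMIN,
    "/users": Access.ADMIN,
}


def user_level(telegram_id: int, active_ids: Collection[int],
               admin_ids: Collection[int]) -> Access:
    """Resolve the highest level a Telegram ID holds."""
    if telegram_id in admin_ids:
        return Access.ADMIN
    if telegram_id in active_ids:
        return Access.ACTIVE_USER
    return Access.ANYONE


def is_allowed(command: str, telegram_id: int, active_ids: Collection[int],
               admin_ids: Collection[int]) -> bool:
    """Unknown commands are denied; otherwise compare levels."""
    required = COMMAND_ACCESS.get(command)
    if required is None:
        return False
    return user_level(telegram_id, active_ids, admin_ids) >= required
```

For example, with `active_ids={10}` and `admin_ids={20}`, user 10 may call `/add` but not `/users`, while an unknown user 30 may only call `/start`.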
## Default Admin
On first boot, Telegram ID `298181113` is seeded as an admin user. Add additional admins via `/adduser <telegram_id> admin`.