Behavior-based AI crawler detection proof of concept built with FastAPI and PostgreSQL.
This system analyzes raw server log files and detects AI-style retrieval behavior using deterministic behavioral scoring.
It does NOT rely on user-agent matching.
Instead, it evaluates:
- Crawl depth
- Burst request patterns
- URL repetition
- Sitemap interaction
- Behavioral clustering (0–100 AI score)
Modern AI systems retrieve content differently than traditional search engine crawlers.
This engine demonstrates how AI-style retrieval behavior can be identified purely through behavioral modeling.
Backend:
- FastAPI
- PostgreSQL
- Deterministic scoring engine
Frontend:
- Vanilla JavaScript
- Chart.js
- Dark-mode intelligence dashboard
- Average URL depth
- Burst rate
- Repeated URL ratio
- HTML vs resource request ratio
- Sitemap interaction
pip install -r requirements.txt
Create database: ai_crawler_detector
Run schema: psql -U postgres -d ai_crawler_detector -f schema.sql
Create .env:
DATABASE_URL=postgresql://USER:PASSWORD@localhost:5432/ai_crawler_detector
uvicorn main:app --reload
cd frontend python -m http.server 5500
Visit: http://localhost:5500
This project demonstrates:
- Behavioral AI crawler scoring
- Visual clustering
- Depth vs score correlation
- Burst activity modeling
- Explainable intelligence summary
This is a proof-of-concept and not production hardened.
MIT (or specify if private use)