
pranala.org — Indonesia-only Search Engine

Positioning: "Mesin pencari web Indonesia yang transparan, lambat tapi jujur" (a transparent Indonesian web search engine: slow but honest). Tagline: Google ranks the web. Pranala ranks Indonesian trust.

Two hard constraints, equal weight:

  1. 100% Cloudflare — Workers + D1 + R2 + KV + Queues + Durable Objects + Cron + Vectorize + Workers AI. Zero origin servers, zero PM2, zero Docker in the request path.
  2. 100% AI automated — every step from crawl to ranking to billing to ad serving to support runs without a human in the loop. The only human surface allowed is a legally-mandated DMCA contact form (scoped below).

Automation Constitution (hard rules)

These rules are enforced in code via src/lib/automation-guard.ts. Any PR that violates them must fail CI.

| # | Rule | Enforcement |
|---|---|---|
| A1 | No admin UI may have an "Approve" or "Reject" button for content, listings, ads, badges, payouts, or rank decisions. | Lint rule: forbid <button>Approve</button> patterns; admin UI is read-only dashboards + AI-action audit log. |
| A2 | No D1 column may be named manual_review, pending_approval, reviewer_id, or equivalent. | Schema lint at migration time. |
| A3 | No worker route may require an admin JWT to write production data. All writes are AI-driven via Queue consumer or Cron. | Route-level test: every POST/PUT/DELETE traces back to cron, queue, webhook, or verified-self-service. |
| A4 | No Slack/email notification may have "approve here" CTAs. Notifications are post-hoc reports only. | Template lint. |
| A5 | Every AI decision must write (decision, score, model, prompt_hash, ts) to ai_decisions D1 table for audit. | Wrapper function aiDecide() is the only callable; ESLint forbids direct env.AI.run(). |
| A6 | DMCA counter-notice handling is the ONLY allowed human surface, and is bounded to a single inbox processed by external counsel weekly, never inside the worker. | Single legal@pranala.org mailbox; nothing else routes to a human. |

If a rule needs to break, the PR description must include AUTOMATION-EXCEPTION: <ticket> and the exception is permanent technical debt visible on /admin/debt.
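
Rule A2's schema lint could be approximated as a scan over migration SQL. The following is a minimal sketch, assuming migrations are available as plain SQL strings; `findForbiddenColumns` and `FORBIDDEN_COLUMNS` are hypothetical names, not the actual contents of src/lib/automation-guard.ts, and the "or equivalent" part of the rule is not covered.

```typescript
// Hypothetical sketch of the A2 schema lint; not the real automation-guard.ts.
const FORBIDDEN_COLUMNS = ["manual_review", "pending_approval", "reviewer_id"];

// Returns every forbidden identifier that appears as a whole word in the SQL.
function findForbiddenColumns(migrationSql: string): string[] {
  const lowered = migrationSql.toLowerCase();
  return FORBIDDEN_COLUMNS.filter((col) =>
    new RegExp(`\\b${col}\\b`).test(lowered),
  );
}

// A CI step could fail the build whenever the returned list is non-empty.
console.log(findForbiddenColumns("CREATE TABLE v (id INTEGER, reviewer_id TEXT)"));
```

Because `_` counts as a word character, the whole-word match does not flag longer identifiers such as `reviewer_id_hash`; a stricter lint would need its own list of equivalents.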


Free-Tier Reality (HARD CONSTRAINT)

Pranala lives entirely on Cloudflare Free tier until revenue forces an upgrade. Slow, honest, cheap. No surprises.

Free-tier ceilings (per the Cloudflare docs)

| Resource | Free quota | Pranala budget |
|---|---|---|
| Workers requests | 100K/day | ≤ 80K/day for SERP + AC + API combined |
| Workers CPU (request) | 10ms | All hot paths must finish < 10ms |
| Workers CPU (cron) | 30s | Crawl/index/rank budget lives here |
| D1 storage | 5GB total, 1GB/DB | One DB only, ≤ 800MB metadata, HTML in R2 |
| D1 reads | 5M/day | SERP cached → ≤ 1M reads/day |
| D1 writes | 100K/day | Crawl ≤ 10K URLs/day → ≤ 50K writes |
| KV reads | 100K/day | AC trie cached in module scope |
| KV writes | 1K/day | Trie rebuilt weekly, not daily |
| KV storage | 1GB | Trie + config only, never per-page |
| R2 storage | 10GB | Gzipped HTML (~5KB avg) → ≤ 2M pages |
| R2 Class A ops | 1M/mo | ≤ 33K writes/day |
| R2 Class B ops | 10M/mo | Reads cheap, fine |
| Workers AI | 10K Neurons/day | Cron-only, ≤ 100 LLM calls/day |
| Vectorize stored dims | 5M/mo | 1024d × 5K vectors max |
| Vectorize query dims | 30M/mo | ≤ 30K queries/mo (cron AC fallback only) |
| Analytics Engine | free w/ caps | Search log + clickthroughs only |
| Cron triggers | 1K invocations/day on free | One `* * * * *` multiplexer = 1440/day → use `*/2 * * * *` (720/day) |
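
Several budget figures in the table follow from simple division. The sketch below re-derives them, assuming decimal units (1GB = 10^9 bytes, 1KB = 10^3 bytes); the helper names are illustrative only.

```typescript
// Sanity checks for the budget column, using only figures from the table.
const MINUTES_PER_DAY = 24 * 60;

// Cron invocations per day for an "every N minutes" schedule.
function cronTicksPerDay(everyMinutes: number): number {
  return Math.floor(MINUTES_PER_DAY / everyMinutes);
}

// How many average-sized objects fit in a storage quota.
function maxObjects(quotaBytes: number, avgObjectBytes: number): number {
  return Math.floor(quotaBytes / avgObjectBytes);
}

// How many vectors of a given width fit in a dimension quota.
function maxVectors(dimensionQuota: number, dimsPerVector: number): number {
  return Math.floor(dimensionQuota / dimsPerVector);
}

console.log(cronTicksPerDay(1)); // 1440, over the 1K/day ceiling
console.log(cronTicksPerDay(2)); // 720, under the ceiling
console.log(maxObjects(10e9, 5e3)); // 2000000 pages in 10GB at ~5KB each
console.log(maxVectors(5_000_000, 1024)); // 4882 stored vectors, roughly 5K
console.log(maxVectors(30_000_000, 1024)); // 29296 queries per month, roughly 30K
```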

Forbidden on free tier

  • Queues (paid only) → replaced with D1 polling work table (task_queue row + claimed_at timestamp + WHERE state='pending' LIMIT 50)
  • Durable Objects (paid only) → replaced with D1 host-throttle row (host_state(host, last_fetch_at, crawl_delay_s), cron checks WHERE last_fetch_at < datetime('now', '-' || crawl_delay_s || ' seconds'))
  • Workers AI per request → all AI moved to cron batch jobs
  • Vectorize on hot path → Vectorize used only in weekly cron to update AC suggestions, never in the request

Growth ladder (upgrade triggers tied to revenue)

| Stage | URLs indexed | Traffic (req/day) | Plan | Why upgrade |
|---|---|---|---|---|
| α (now) | 0 → 100K | < 50K | Free | Bootstrap |
| β | 100K → 500K | 50–200K | Workers Paid $5/mo | Need 30s CPU on requests for autocomplete semantic + Queues |
| γ | 500K → 5M | 200K–2M | + R2 paid + D1 paid | HTML storage exceeds 10GB |
| δ | 5M+ | 2M+ | + Workers AI paid | More Llama capacity |

Until pranala makes ≥ Rp 75K/mo from flio ad revenue (covers $5 Workers Paid), it stays on Free. Premium subscriptions self-fund the next upgrade tier. No cash burn.


Cloudflare Resource Map (Free tier — single project)

  • Worker: pranala-org → custom domain pranala.org (handles routes + cron in one Worker to stay simple and within request budget)
  • D1: pranala_db (single DB ≤ 1GB; sharding deferred to stage γ)
  • R2: pranala-html (gzipped HTML only, hash-keyed)
  • KV: PRANALA_KV (AC trie, config, parsed robots.txt — single namespace)
  • Vectorize: pranala-ac-v1 (≤ 5K query embeddings, used in cron only) — added at stage β
  • Workers AI (cron only): @cf/baai/bge-m3, @cf/meta/llama-3.3-70b-instruct-fp8-fast, @cf/meta/m2m100-1.2b
  • Analytics Engine: pranala_events
  • Cron: */2 * * * * single multiplexer that dispatches by minute-modulo (crawl, index-roll-up, AC trie rebuild, etc.)
  • Queues → D1 task_queue polling
  • Durable Objects → D1 host_state row throttling
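
One way the single cron multiplexer could pick jobs by minute-modulo is sketched below. The plan does not specify which residues map to which jobs, so the alternating crawl/index schedule here is an assumption, as are the names `Job` and `jobsForTick`; the 20:00 UTC check corresponds to the 03:00 WIB (UTC+7) trie rebuild slot used in Phase 1.

```typescript
type Job = "crawl" | "index" | "ac-trie-rebuild";

// Hypothetical schedule: the plan only says jobs are chosen by minute-modulo.
function jobsForTick(now: Date): Job[] {
  const minute = now.getUTCMinutes();
  const jobs: Job[] = [];
  if (minute % 4 === 0) jobs.push("crawl");
  if (minute % 4 === 2) jobs.push("index");
  // 03:00 WIB (UTC+7) is 20:00 UTC.
  if (now.getUTCHours() === 20 && minute === 0) jobs.push("ac-trie-rebuild");
  return jobs;
}

console.log(jobsForTick(new Date(Date.UTC(2025, 0, 1, 20, 0)))); // crawl + trie rebuild
console.log(jobsForTick(new Date(Date.UTC(2025, 0, 1, 9, 2)))); // index only
```

In a Worker, `jobsForTick` would be called from the `scheduled` handler with `new Date(controller.scheduledTime)`, so the decision depends on the scheduled time rather than on when the handler happens to start.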

PHASE 1: Bootstrap (everything on the free tier, all UI in Indonesian)

Scope shipped:

  1. A single Worker with a Hono router (TS).
  2. Minimal D1 schema: urls, documents, links, host_state, task_queue, submissions, ai_decisions, search_log.
  3. Cron */2 * * * * → minute-modulo dispatcher: crawl 5 URLs/tick, index 10 documents/tick, AC trie rebuild at 03:00 WIB.
  4. Mobile-native home page (in Indonesian) with a search box + microphone button + trie autocomplete.
  5. Mobile SERP page (in Indonesian): bottom nav, result cards, horizontal filter tabs.
  6. Endpoint /api/ac: trie + typo (BK-tree) only in Phase 1; semantic is deferred to Phase 2 (Vectorize).
  7. Endpoint /api/submit: public domain/sitemap submission; submissions are enqueued in task_queue.
  8. Crawler cron: take a URL → fetch (UA Mozilla/5.0 (compatible; pranala-bot/1.0; +https://pranala.org/bot)) → store gzipped HTML in R2 → extract title/meta/outlinks → write to D1 with host throttling.
  9. PWA manifest + Service Worker offline fallback.
  10. /dmca page (public form) → dmca_intake D1.
  11. /bot page (crawler info), referenced in the User-Agent.
  12. /transparansi page: public ranking formula, list of trusted seeds, live index statistics.

DEFERRED to Phase 2+ (needs Workers Paid):

  • PageRank shard processing (R2 graph shards): Phase 2
  • Vectorize semantic AC fallback: Phase 2
  • Premium tier billing (Xendit): Phase 2
  • Owner dashboard + entity badge auto-verifier: Phase 3
  • View Transitions API + Speculation Rules: not deferred, ships in Phase 1 (free in the browser)
  • Llama-generated reports: Phase 2 (needs more AI quota)
  • API key issuance: Phase 3

Language: 100% Indonesian. No English text in the public UI. Slogan: "Mesin pencari web Indonesia yang transparan, lambat tapi jujur."



Component Plans + Automation Contracts

1. Free slow crawl

Flow: Cron (1m) → drain N URLs from crawl_queue D1 table → enqueue to pranala-fetch → consumer fetches with Mozilla/5.0 (compatible; pranala-bot/1.0; +https://pranala.org/bot) → store gzipped HTML in R2 (html/{sha256}.gz) → metadata + outlinks to D1 shard → enqueue outlinks back into pranala-discover → discoverer dedupes against seen_urls (D1) → re-enqueues novel ones.

Politeness: HostThrottle DO per host, 1 req/sec default, honors Crawl-delay from robots.txt cached 24h in pranala-robots KV. No per-host concurrency.

Automation contract:

  • Robots.txt parsed by code, never overridden by humans.
  • Recrawl interval is a pure function of (rank_score, last_change_detected, content_hash_age). No manual "force recrawl" button anywhere.
  • Domain blocklist is auto-populated from spam vector hits + repeated 4xx/5xx; entries auto-expire after 30d unless re-flagged.
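
The recrawl interval is specified only as a pure function of (rank_score, last_change_detected, content_hash_age); the plan gives no formula. The sketch below shows the shape such a function might take. The formula itself, the 1 to 30 day clamp, and the assumption that `rankScore` is normalized to 0..1 are all illustrative choices, not the plan's.

```typescript
interface RecrawlInput {
  rankScore: number; // assumed to be normalized to 0..1
  daysSinceLastChange: number; // derived from last_change_detected
  contentHashAgeDays: number; // how long the current content hash has been stable
}

const MIN_INTERVAL_DAYS = 1; // assumption
const MAX_INTERVAL_DAYS = 30; // assumption

// Illustrative formula only: pages that rank higher or changed recently
// are revisited sooner. No I/O, no clock reads, so it stays a pure function.
function recrawlIntervalDays(input: RecrawlInput): number {
  const stability = Math.min(input.daysSinceLastChange, input.contentHashAgeDays);
  const raw = (1 - input.rankScore) * stability;
  return Math.min(MAX_INTERVAL_DAYS, Math.max(MIN_INTERVAL_DAYS, raw));
}
```

Because the function takes no hidden inputs, there is nothing a "force recrawl" button could toggle; changing crawl behavior means changing the code.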

CF limit fit: consumer batch ≤ 100 URLs × 3 subreq each = 300 < 1000 cap. CPU < 30s. R2 PUT < 1000/invocation.

2. Premium indexing (sell speed, not rank)

Tiers (auto-billed via Xendit):

  • Free: best-effort crawl, no SLA
  • Starter Rp 99K/mo: 1K pages, weekly recrawl, indexing report
  • Pro Rp 499K/mo: 10K pages, daily recrawl, structured-data AI report, broken-link AI report
  • Business Rp 2.5M/mo: 100K pages, hourly recrawl, API access, entity badge auto-issuance

Webhook → activation:

  1. Xendit /webhook/xendit verifies x-callback-token.
  2. INSERT into subscriptions, UPDATE site tier, increment crawl_priority integer.
  3. Crawler scheduler reads crawl_priority DESC first.
  4. Indexing report is generated weekly by Cron → Llama 3.3 70B → markdown → R2 → linked from dashboard.

Automation contract: No human ever sees a payment row. Refund flow: Xendit chargeback webhook auto-disables tier and writes ai_decisions row.
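
Step 1 of the webhook flow, checking `x-callback-token`, could look like the following. This is a sketch under assumptions: the expected token is passed in by the caller (in a Worker it would likely come from a secret binding whose name is not given in the plan), and `timingSafeEqualStrings` and `isValidXenditCallback` are hypothetical helper names.

```typescript
// Constant-time comparison so timing does not reveal how many characters matched.
function timingSafeEqualStrings(a: string, b: string): boolean {
  const enc = new TextEncoder();
  const x = enc.encode(a);
  const y = enc.encode(b);
  let diff = x.length ^ y.length;
  const len = Math.max(x.length, y.length);
  for (let i = 0; i < len; i++) {
    const xi = i < x.length ? x[i] : 0;
    const yi = i < y.length ? y[i] : 0;
    diff |= xi ^ yi;
  }
  return diff === 0;
}

// Rejects requests whose x-callback-token header is missing or wrong.
function isValidXenditCallback(headers: Headers, expectedToken: string): boolean {
  const received = headers.get("x-callback-token");
  return received !== null && timingSafeEqualStrings(received, expectedToken);
}
```

A route handler would return 401 before touching `subscriptions` when `isValidXenditCallback` is false, so no unverified payload can change a site's tier.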

3. Ranking algorithm (offline cron, no human levers)

Rank = 0.30·LinkRank + 0.20·TrustRank + 0.15·EntityRank
     + 0.10·Freshness + 0.10·IndoRelevance + 0.10·QueryRel
     + 0.05·Engagement − SpamPenalty

LinkRank (PageRank at 10M scale on Workers):

  • Adjacency stored as R2 JSONL shards (graph/shard-{0..255}.jsonl), 1M edges/shard.
  • Cron tick: load 1 shard → compute partial rank delta → write to D1 rank_partial → atomic merge.
  • One full iteration ≈ 256 ticks ≈ 256 minutes.
  • 30 iterations to convergence ≈ 5–6 days. Fits CF Worker 5-min cron CPU per tick.
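
The per-tick shard step can be sketched as a plain accumulation over one shard's edges, followed by a damping step once all shards are merged. This is an illustrative outline, not the plan's implementation: the damping factor 0.85 is the conventional default rather than a value from the plan, dangling nodes are ignored, and `shardContributions` and `applyDamping` are hypothetical names.

```typescript
type Edge = { from: string; to: string };

// One tick: accumulate the rank carried by the edges of a single shard.
// `rank` and `outDegree` describe the whole graph; only `shardEdges` is loaded per tick.
function shardContributions(
  shardEdges: Edge[],
  rank: Map<string, number>,
  outDegree: Map<string, number>,
): Map<string, number> {
  const partial = new Map<string, number>();
  for (const { from, to } of shardEdges) {
    const degree = outDegree.get(from) ?? 0;
    if (degree === 0) continue;
    const share = (rank.get(from) ?? 0) / degree;
    partial.set(to, (partial.get(to) ?? 0) + share);
  }
  return partial;
}

// After all shards are merged: apply damping to get the next iteration's rank.
function applyDamping(
  merged: Map<string, number>,
  nodes: string[],
  damping = 0.85, // assumed value
): Map<string, number> {
  const next = new Map<string, number>();
  const base = (1 - damping) / nodes.length;
  for (const node of nodes) {
    next.set(node, base + damping * (merged.get(node) ?? 0));
  }
  return next;
}
```

At one shard per one-minute tick, 256 shards × 30 iterations is 7,680 ticks, or about 5.3 days, which is consistent with the 5–6 day estimate above.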

TrustRank: seeds in pranala-config KV trusted_seeds = hardcoded go.id, *.go.id, ojk.go.id, bpom.go.id, *.ac.id, kompas.com, tempo.co, detik.com (curated once at launch, never edited by humans — changes go through a seed_changes.sql migration that requires AUTOMATION-EXCEPTION if added post-launch).

EntityRank: auto-derived from registry verification (see §7).

Freshness: content hash diff timestamp from R2 html/ versioned objects.

IndoRelevance: m2m100 language detect → if id weight = 1.0, if en and .id ccTLD = 0.6, else 0.0. Geo IP of origin server adds bonus.

QueryRel: Vectorize cosine + BM25 over title/h1/anchor.

Engagement: Analytics Engine rollup (CTR, dwell, pogo-stick rate) into D1 engagement_daily.

SpamPenalty: Vectorize cosine to pranala-spam-v1. Threshold ≥ 0.85 = full deindex; 0.70–0.85 = −0.5 rank; < 0.70 = clean. Threshold values in code, not in admin UI.

Automation contract: Weights are constants in src/ranker/weights.ts. Changing them requires a code commit + canary diff report (auto-generated). No runtime knobs.
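
A constants-only weights module might look like the sketch below. The weights and spam thresholds are copied from the formula and the SpamPenalty rule above; the property names, the `Signals` type, and the convention of returning `null` for a full deindex are assumptions, and each signal is assumed to be normalized to 0..1.

```typescript
// Weights copied from the Rank formula; property names are illustrative.
const WEIGHTS = {
  linkRank: 0.30,
  trustRank: 0.20,
  entityRank: 0.15,
  freshness: 0.10,
  indoRelevance: 0.10,
  queryRel: 0.10,
  engagement: 0.05,
} as const;

type Signals = Record<keyof typeof WEIGHTS, number>; // each assumed to be in 0..1

// Returns null for "full deindex", otherwise the penalty to subtract.
function spamPenalty(spamCosine: number): number | null {
  if (spamCosine >= 0.85) return null;
  if (spamCosine >= 0.70) return 0.5;
  return 0;
}

// Weighted sum minus the spam penalty; null means the URL is dropped from the index.
function rankScore(signals: Signals, spamCosine: number): number | null {
  const penalty = spamPenalty(spamCosine);
  if (penalty === null) return null;
  let score = 0;
  for (const key of Object.keys(WEIGHTS) as (keyof typeof WEIGHTS)[]) {
    score += WEIGHTS[key] * signals[key];
  }
  return score - penalty;
}
```

The seven weights sum to 1.00, so a page with perfect signals and no spam similarity scores 1.0, and a page in the 0.70–0.85 band can go negative.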

4. PSE seed graph

One-time D1 seed at launch: ~5K trusted Indonesian domains. Crawler walks outward. No human re-seeds; seed_v2 would be a code commit.

5. SERP (search results page)

  • Worker reads top-K from D1 + Vectorize fan-out.
  • Cache key = sha256(query + lang + region) → KV 60s.
  • Organic results UNION ALL flio ad results, with is_sponsored flag rendered as yellow "Iklan" pill above and below the organic block. Ranker MUST NOT see ad bid as a feature (lint guards bid access in ranker module).
  • Speculation Rules API prerenders top 3 organic links on hover/viewport (<script type="speculationrules">). Result tap feels instant.
  • View Transitions API morphs result card → destination page on tap (where same-origin) and morphs SERP filters (All/Web/Image/News/Q&A) on swipe.
  • Streaming HTML via TransformStream — first 3 results paint < 200ms, rest stream in.
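
The Speculation Rules block for the top organic links could be generated server-side as sketched here. The function names are hypothetical, and `eagerness: "moderate"` is an assumption chosen because it approximates prerender-on-hover; callers are expected to pass organic URLs only, so sponsored results are never prerendered.

```typescript
// Builds the JSON body for <script type="speculationrules">.
function buildSpeculationRules(organicUrls: string[], maxPrerender = 3): string {
  const urls = organicUrls.slice(0, maxPrerender);
  return JSON.stringify({
    prerender: [{ source: "list", urls, eagerness: "moderate" }],
  });
}

// Wraps the rules in a script element for inclusion in the streamed SERP HTML.
function speculationScriptTag(organicUrls: string[]): string {
  // "<" is escaped so a hostile URL cannot close the script element early.
  const json = buildSpeculationRules(organicUrls).replace(/</g, "\\u003c");
  return `<script type="speculationrules">${json}</script>`;
}

console.log(speculationScriptTag(["https://example.id/a", "https://example.id/b"]));
```

Browsers without Speculation Rules support ignore the unknown script type, so the tag degrades to a no-op.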

5a. Autocomplete (instant search) — sub-100ms, fully AI-ranked

Endpoint: GET /api/ac?q=<prefix>&lang=id returns JSON {suggestions: [{text, type, score}]} in ≤ 100ms p95 from edge.

Three-stage suggestion pipeline (all AI, zero human-curated lists):

  1. Prefix trie (KV-backed): popular query log rolled up nightly into a compressed trie stored in pranala-cache KV under ac:trie:v{N}. Worker loads once per cold start, pinned in module scope. Returns top-10 prefix matches in <5ms.
  2. Typo tolerance: Damerau-Levenshtein distance ≤ 2 against a hot-vocabulary set (top 100K queries). Implemented as a BK-tree also in KV. ~10ms.
  3. Semantic completion: if prefix length ≥ 4 chars and trie returns < 5 results, embed prefix with @cf/baai/bge-m3 and fan-out Vectorize query against pranala-content-v1 titles. Returns conceptually-related queries even with novel phrasing. ~60ms. Cached per-prefix in KV 5min.
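
Stage 2's edit distance can be illustrated with the restricted (optimal string alignment) variant of Damerau-Levenshtein, which counts an adjacent transposition as one edit. The sketch below scans the vocabulary linearly instead of using a BK-tree, purely to keep the example short; `osaDistance` and `typoMatches` are hypothetical names.

```typescript
// Optimal string alignment distance: insert, delete, substitute, and
// transpose-adjacent each cost 1.
function osaDistance(a: string, b: string): number {
  const m = a.length;
  const n = b.length;
  const d: number[][] = [];
  for (let i = 0; i <= m; i++) {
    d.push(new Array<number>(n + 1).fill(0));
    d[i][0] = i;
  }
  for (let j = 0; j <= n; j++) d[0][j] = j;
  for (let i = 1; i <= m; i++) {
    for (let j = 1; j <= n; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost);
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[m][n];
}

// Linear scan stand-in for the BK-tree lookup: vocabulary entries within maxDistance.
function typoMatches(query: string, vocabulary: string[], maxDistance = 2): string[] {
  return vocabulary.filter((word) => osaDistance(query, word) <= maxDistance);
}

console.log(typoMatches("apotik", ["apotek", "klinik"])); // only "apotek" is within 2 edits
```

One caveat worth checking before building the BK-tree: the restricted variant does not always satisfy the triangle inequality that BK-tree pruning relies on, so the unrestricted Damerau-Levenshtein distance (or plain Levenshtein) may be the safer choice for the index.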

Indonesian-aware tokenization:

  • Stemmer: lightweight Sastrawi-derived rules ported to TS (handles ber-, me-, pe-, -kan, -an, -i affixes).
  • City/region expansion: jkt → jakarta, sby → surabaya, bdg → bandung; table loaded from KV pranala-config:city_aliases (auto-built once from Wikipedia geo data, never hand-edited).
  • Common typo map: gigi → gigit, klinikgigi → klinik gigi (segmentation), apotik → apotek — auto-generated from query log misspellings via Llama 3.3 weekly cron.
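
City alias expansion is a token-level lookup. The sketch below hardcodes the three aliases named above; in the plan the table comes from KV, and `CITY_ALIASES` and `expandCityAliases` are illustrative names.

```typescript
// The three aliases given in the plan; the real table would be loaded from KV.
const CITY_ALIASES = new Map<string, string>([
  ["jkt", "jakarta"],
  ["sby", "surabaya"],
  ["bdg", "bandung"],
]);

// Lowercases the query and replaces whole tokens that are known aliases.
function expandCityAliases(query: string): string {
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .map((token) => CITY_ALIASES.get(token) ?? token)
    .join(" ");
}

console.log(expandCityAliases("Klinik gigi JKT")); // "klinik gigi jakarta"
```

A `Map` is used rather than a plain object so that tokens such as "constructor" cannot accidentally match inherited properties.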

Personalization (privacy-respecting):

  • Per-user recent queries stored in localStorage only — never sent to server, never stored in D1.
  • Suggestion re-rank on client side: queries the user has searched before float to top.
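
The client-side re-rank can be a stable partition: suggestions the user has searched before come first, and server order is preserved within each group. A minimal sketch, assuming the `{text, type, score}` shape returned by /api/ac; `rerankWithHistory` is a hypothetical name.

```typescript
interface Suggestion {
  text: string;
  type: string;
  score: number;
}

// Stable partition: previously searched queries first, server order otherwise preserved.
function rerankWithHistory(suggestions: Suggestion[], recentQueries: string[]): Suggestion[] {
  const seen = new Set(recentQueries.map((q) => q.trim().toLowerCase()));
  const isRecent = (s: Suggestion): boolean => seen.has(s.text.trim().toLowerCase());
  return [
    ...suggestions.filter(isRecent),
    ...suggestions.filter((s) => !isRecent(s)),
  ];
}
```

In the browser, `recentQueries` would be read from localStorage (the storage key is not specified in the plan), so the history never leaves the device.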

Voice autocomplete:

  • SpeechRecognition API with lang="id-ID" and interimResults=true.
  • As partial transcript arrives, fires /api/ac on each pause.
  • "Tap to talk" button uses navigator.vibrate(20) haptic on press.

Trending suggestions (zero-state, when input is empty):

  • Cron rolls up Analytics Engine top-N queries from last 1h/24h/7d into KV ac:trending:v{N}.
  • Indonesia-only filter via geo of original searches.
  • No human ever picks a trending term. If something abusive trends, the spam-vector classifier auto-suppresses it (cosine to spam corpus on the suggestion text).

Automation contract: there is no autocomplete_blocklist table editable by humans. Suppression is purely vector-similarity and abuse-classifier driven, decisions written to ai_decisions.

CF limit fit: trie payload < 2MB (within 25MB KV value cap, well under cold-start budget). BK-tree similar. Vectorize query < 50ms p95. KV read 1ms.

6. flio ads integration

  • pranala registers itself as a flio publisher unit at startup.
  • SERP renders <div data-flio-key="unit_pranala_serp_top" data-flio-mode="native">.
  • All ad logic (bidding, targeting, fraud, payout) handled by flio.net — already 100% AI.
  • Revenue: flio remits to pranala wallet weekly via existing flio payout cron.

7. Site owner dashboard + entity badge (zero human verification)

Domain ownership verification: DNS TXT record pranala-verify=<token> OR /.well-known/pranala-{token} file fetch. Worker fetches and verifies. Auto-grants ownership.

Entity badge — AI-only verification chain:

  1. NPWP regex match → check against the pajak.go.id public NPWP validator endpoint.
  2. PT/CV → ahu.go.id company name search → fuzzy match.
  3. Fintech → OJK whitelist scrape (cached in KV daily).
  4. Food/cosmetic/drug → BPOM lookup.
  5. Healthcare → Kemenkes faskes registry.
  6. All required passes? → auto-issue badge + write ai_decisions. Any fail? → auto-deny + write reason. Owner sees AI-generated explanation, can re-submit after fixing — but no human ever reviews.

Automation contract: there is no verifications.reviewer_id column.

8. API (auto-issued)

  • Signup → auto-generate API key (32 bytes hex).
  • RateLimitDO enforces per-key QPS and monthly quota.
  • No human ever provisions a key.

9. Anti-abuse (AI-only)

  • Spam detection: Vectorize cosine + Llama classify combo.
  • Cloaking detection: render Worker fetches with Googlebot UA vs pranala-bot UA, diff > 30% by tokens → auto-flag.
  • Click fraud on flio ads: handled by flio's existing CF Turnstile + bot management layer.
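
The plan does not define how the "diff > 30% by tokens" for cloaking detection is measured. One plausible reading, sketched below, is 1 minus the Jaccard similarity of the two responses' token sets; that metric, the ASCII-only tokenizer, and the function names are assumptions.

```typescript
// Lowercased set of alphanumeric tokens (ASCII only, to keep the sketch simple).
function tokenize(text: string): Set<string> {
  return new Set(
    text
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((t) => t.length > 0),
  );
}

// 1 - Jaccard similarity of the two token sets: 0 = same vocabulary, 1 = disjoint.
function tokenDiffRatio(a: string, b: string): number {
  const ta = tokenize(a);
  const tb = tokenize(b);
  if (ta.size === 0 && tb.size === 0) return 0;
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  const union = ta.size + tb.size - shared;
  return 1 - shared / union;
}

// Flags a page when the two fetched versions differ by more than the threshold.
function looksCloaked(textForGooglebot: string, textForPranalaBot: string, threshold = 0.30): boolean {
  return tokenDiffRatio(textForGooglebot, textForPranalaBot) > threshold;
}
```

A set-based metric ignores word order and repetition, so pages that rotate ads or timestamps between fetches stay under the threshold more easily than with a positional diff.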

10. DMCA — the single legal carve-out

Indonesian UU ITE / Permenkominfo 5/2020 requires a reachable contact for takedown counter-notices. This is the ONLY human-readable surface.

  • Public form /dmca posts to dmca_intake D1 table.
  • AI auto-classifies notice validity (URL exists, claimant info complete, sworn statement present). If valid → auto-deindex matching URLs within 1 hour and email claimant.
  • Counter-notice form posts to dmca_counter. The dmca_counter rows are emailed weekly to legal@pranala.org (external counsel) — no in-app review UI exists.
  • Counsel responds via email; their reply is processed by an inbound email worker (Cloudflare Email Routing → Worker) that parses an HMAC-signed verdict token. No human clicks "approve" inside pranala's UI.

This is the maximum tolerated human contact: 1 mailbox, 1 weekly digest, decisions returned via signed token. Everything else is forbidden by Constitution rule A6.


CF Limits Compliance Matrix

| CF limit | Worst-case load | Mitigation |
|---|---|---|
| Worker CPU 30s req | SERP fan-out ≤ 200ms | Vectorize + KV cache 60s |
| Worker CPU 5min cron | PageRank shard tick | 1 shard / tick, 256 shards |
| Subrequests 1000 | Crawler batch | ≤ 100 URLs × 3 subreq |
| D1 10GB/DB | 10M URLs metadata only | Sharded 10 DBs by URL hash |
| D1 ~100 SQL bind vars | Bulk inserts | Chunk to 80 |
| D1 30s query | Joins | Pre-denormalized hot tables |
| KV 1 write/sec/key | Counters | Counters live in DOs, never KV |
| KV 25MB value | HTML | HTML never in KV; goes to R2 |
| Queues 100 msg/batch | Crawl fanout | Re-enqueue, not recursion |
| Cron 250/account | Multiple schedulers | Single `* * * * *` multiplexer + DO routing |
| Workers AI rate-limit/model | Reports | Queue consumer, never inline |
| Vectorize 5M vectors/index | 10M docs | 2 named indexes by year shard |
| R2 list ops cost | URL discovery | Never list; index in D1 |
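
The "chunk to 80" mitigation for D1 bind variables can be sketched as below, reading "80" as a budget of 80 bound parameters per statement (the plan does not say whether it means parameters or rows). `chunkForBindLimit` and `placeholders` are hypothetical helpers.

```typescript
// Splits rows so that rows-per-statement × columns stays within the bind-variable budget.
function chunkForBindLimit<T>(rows: T[], columnsPerRow: number, maxBindVars = 80): T[][] {
  const rowsPerChunk = Math.max(1, Math.floor(maxBindVars / columnsPerRow));
  const chunks: T[][] = [];
  for (let i = 0; i < rows.length; i += rowsPerChunk) {
    chunks.push(rows.slice(i, i + rowsPerChunk));
  }
  return chunks;
}

// Builds "(?, ?, ?), (?, ?, ?)" for one chunk of a multi-row INSERT.
function placeholders(rowCount: number, columnsPerRow: number): string {
  const row = `(${new Array<string>(columnsPerRow).fill("?").join(", ")})`;
  return new Array<string>(rowCount).fill(row).join(", ");
}

console.log(chunkForBindLimit([1, 2, 3, 4, 5], 40).length); // 3 chunks of at most 2 rows
console.log(placeholders(2, 3)); // "(?, ?, ?), (?, ?, ?)"
```

Each chunk then becomes one `INSERT ... VALUES` statement whose flattened parameter list has at most 80 entries.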


Mobile-Native UI Standards

Pranala SERP must feel like a native iOS/Android app, not a desktop search page. All standards from /home/ucok/CLAUDE.md apply, plus search-specific patterns below.

Layout

  • Sticky compact header (52px): logo, search input, voice mic, profile avatar. Auto-hides on scroll-down, reveals on scroll-up via IntersectionObserver.
  • Bottom navigation (fixed, 64px + safe-area-inset-bottom): Cari | Trending | Riwayat | Tersimpan | Akun. Active state = filled icon + label.
  • Search input always front-and-center — never behind a hamburger.
  • NO hamburger menu anywhere.
  • Pull-to-refresh on result lists via touchstart/touchmove + CSS transform.
  • Skeleton loaders (gray shimmer cards matching result-card shape) — no spinners.

Search box (the killer surface)

  • Full-width pill with 16px radius, 48px tall, system font 17px (no zoom-on-focus on iOS).
  • Live autocomplete dropdown drops below input, full-width on mobile, max-height 60vh, scroll-snaps each item to 56px tap target.
  • Each suggestion row: leading icon (clock for recent, fire for trending, sparkle for AI semantic, location for places), text, trailing arrow. Tapping the row fills the input; tapping the arrow submits.
  • Voice mic button inside the pill on the right, 44×44px, animates to pulsing red circle while listening.
  • Long-press on a recent query = bottom sheet with Hapus / Bagikan / Cari di tab baru.
  • Swipe left on a recent query = delete with undo toast.
  • Esc / swipe-down on dropdown closes it; on iOS, inputmode="search" shows the right keyboard with a "Cari" key.

Filter tabs (horizontal swipe, no taps required)

  • Below the search header: Semua · Web · Gambar · Berita · Tanya Jawab · Maps · Toko · Tokoh.
  • Native horizontal scroll-snap container (scroll-snap-type: x mandatory).
  • Buttons render at edges (left/right chevron) when overflow, per CLAUDE.md feedback_horizontal_slide rule. Pure CSS detection via :has(.snap-overflow).
  • Tab swipe triggers View Transition (cross-fade + slide).

Result cards

  • 16px corner radius, subtle shadow, 12px gap, full-bleed thumbnail when available.
  • Each card shows: favicon · domain · title · snippet · meta row (time, geo, entity badge if any).
  • Long-press = bottom-sheet menu: Buka / Buka di tab baru / Bagikan (uses navigator.share()) / Salin tautan / Tidak relevan (writes negative-feedback row → ranker training signal).
  • Swipe right on a card = save to Tersimpan (offline-readable via Service Worker cache).
  • Tap triggers View Transition where the favicon morphs into the destination page header.

Image search

  • Masonry grid via CSS column-count (1 col mobile, 2 tablet, 3 desktop).
  • Tap → fullscreen lightbox with pinch-zoom (CSS touch-action: pinch-zoom).
  • Stories-style horizontal swipe between images, dot indicators top.
  • Long-press = save / share / report.

Maps tab

  • Leaflet + offline tiles served from R2 + Cloudflare cache.
  • "Use my location" = navigator.geolocation with enableHighAccuracy: false (battery-friendly).
  • Result pins clustered, tap = peek card slides up from bottom (50% sheet → drag up = full).

Native cues

  • System font stack: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif.
  • Safe-area insets on top, bottom, left (for landscape notch).
  • Tap targets ≥ 48×48px.
  • Spacing 12px between interactive elements.
  • touch-action: manipulation to kill 300ms tap delay.
  • prefers-reduced-motion: reduce disables View Transitions.
  • Dark mode auto via prefers-color-scheme.
  • Haptic feedback on pull-to-refresh, voice-mic press, save-toast: navigator.vibrate(15).

Advanced web platform features (progressive enhancement)

| Feature | Use |
|---|---|
| View Transitions API | SERP ↔ result page, filter tab switches |
| Speculation Rules API | Prerender top 3 organic links |
| Navigation API | Back/forward feels instant, no white flash |
| navigator.share() | Native share sheet for results |
| SpeechRecognition (id-ID) | Voice input, real-time transcript |
| SpeechSynthesis (id-ID) | Read result snippets aloud (accessibility + voice answers) |
| Web Push API | Optional: notify when a saved query has new results |
| Service Worker | Network-first for `/api/*`, cache-first for static, offline fallback page, saved-results offline access |
| Web App Manifest | display: standalone, theme color, 192/512 icons, share-target API so users can share TO pranala |
| Web Share Target API | Pranala becomes a destination in Android share sheets: share-to-search any URL/text |
| Background Sync | Saved queries refresh in background |
| content-visibility: auto | Off-screen result cards skip layout/paint |
| scroll-snap | Horizontal carousels and image gallery |
| @view-transition CSS | Page-level navigation transitions |
| CSS container queries | Result card adapts to grid/list density |
| CSS :has() | Style header based on whether search has focus, etc. |
| 103 Early Hints | Preload critical CSS + autocomplete trie before HTML response |
| Brotli | Auto on Cloudflare |
| AVIF + WebP with <picture> | Image search thumbs |

PWA installability

  • Manifest at /manifest.webmanifest.
  • After 2nd visit: bottom-sheet promo to "Pasang Pranala" (Add to Home Screen).
  • Standalone display: hide URL bar, full-bleed brand color.
  • Share Target: register pranala as receiver of text/plain and text/uri-list — share any link from Chrome/WA → pranala opens with that URL pre-loaded as a "more like this" semantic search.
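
The Share Target registration lives in the web app manifest, and the receiving route has to turn the shared fields into a search. The sketch below expresses the manifest fragment as a TypeScript object and adds a small parser; the `/bagikan` route and the names `manifestShareTarget` and `queryFromShare` are hypothetical, and preferring `url` over `text` over `title` is an assumption (some Android apps place the shared link in `text`).

```typescript
// Manifest fragment, to be merged into /manifest.webmanifest when it is generated.
const manifestShareTarget = {
  share_target: {
    action: "/bagikan", // hypothetical receiving route
    method: "GET",
    params: { title: "title", text: "text", url: "url" },
  },
};

// Picks the most specific shared field as the seed for a "more like this" search.
function queryFromShare(search: string): string {
  const params = new URLSearchParams(search);
  const url = params.get("url");
  const text = params.get("text");
  const title = params.get("title");
  return (url || text || title || "").trim();
}

console.log(JSON.stringify(manifestShareTarget));
console.log(queryFromShare("?text=https%3A%2F%2Fexample.id%2Fartikel"));
```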

Accessibility (WCAG 2.1 AA)

  • Visible focus rings via :focus-visible.
  • Skip-to-content link as first focusable element.
  • ARIA live region announces "X hasil ditemukan" after each search.
  • Screen reader labels in Indonesian (aria-label="Tombol mikrofon untuk pencarian suara").
  • Contrast ratios audited in CI.
  • Text scales to 200% without horizontal scroll.

Performance budget (enforced in CI)

  • LCP < 2.0s on 3G Indonesia (target, not 2.5s).
  • INP < 100ms (interaction).
  • CLS < 0.05.
  • JS bundle < 50KB gzipped on critical path; rest lazy-loaded.
  • Autocomplete p95 < 100ms edge.
  • Lighthouse Mobile ≥ 95 across all four scores.

Build Order

  1. Worker scaffold + wrangler.toml (multi-env, multi-D1 binding)
  2. D1 schema migrations: ctrl + 10 idx shards (sharding helper module)
  3. Trusted-seed loader + crawl_queue seed
  4. HostThrottle Durable Object
  5. Crawler queue consumer (pranala-org-crawler)
  6. HTML parser + outlink extractor
  7. Indexer queue consumer + Vectorize embedder
  8. Ranker cron multiplexer + R2 graph shard writer
  9. PageRank/TrustRank/EntityRank shard processors
  10. SERP route + KV cache + flio ad slot
  11. Autocomplete pipeline — trie builder cron, BK-tree typo, semantic Vectorize fallback, voice input, trending zero-state
  12. Mobile UI shell — sticky header, bottom nav, skeleton loaders, View Transitions, Speculation Rules, scroll-snap filter tabs, swipe-saveable result cards, long-press bottom sheets
  13. PWA layer — manifest, Service Worker (network-first API, cache-first static), Share Target receiver, Web Push opt-in
  14. Indonesian-aware tokenizer — Sastrawi-derived stemmer + city aliases + auto-typo map (Llama weekly cron)
  15. Owner dashboard (DNS-TXT auto-verify)
  16. Entity badge auto-verifier (5 registry adapters)
  17. Premium tier Xendit webhook + crawl_priority
  18. AI report generator (Llama 3.3 70B → R2 markdown)
  19. API key issuance + RateLimitDO
  20. Spam vectorize index + cron classifier
  21. DMCA intake + AI classifier + email-routing worker
  22. automation-guard.ts lint suite + CI gate
  23. CI perf budget gate (Lighthouse mobile ≥ 95, autocomplete p95 < 100ms)

Automation acceptance gate (must pass before launch)

  • grep -rn "Approve\|Reject\|reviewer_id\|manual_review" src/ → 0 hits
  • All POST/PUT/DELETE routes traced to cron|queue|webhook|verified-self-service (automated trace test)
  • ai_decisions table has rows for last 24h covering ≥ 95% of state changes
  • No human Slack/email alert says "click to approve"
  • DMCA flow: only legal@pranala.org mailbox is human; counter-notice replies are HMAC-signed tokens
  • Lint rule blocks env.AI.run direct calls outside aiDecide() wrapper
  • Admin dashboard renders read-only — <form> count = 0 except /dmca public intake
  • Autocomplete: no autocomplete_blocklist table; suppression is vector-classifier only
  • Trending suggestions sourced from Analytics Engine rollup, not human picks
  • Mobile Lighthouse ≥ 95 across all four categories (CI gate)
  • Autocomplete p95 < 100ms from edge (synthetic check in CI)
  • PWA installable check passes (manifest + SW + HTTPS)
  • Share Target API registered (verified by intent simulation)
  • No hamburger menus anywhere — bottom nav only
  • Voice search works in Chrome Android + Safari iOS (lang=id-ID)

⚙ HARD CONSTRAINTS (enforced for all sites)

This domain MUST operate within these constraints — no exceptions:

  • 100% Cloudflare serverless — Workers + D1 + R2 + KV + Workers AI + Vectorize. NEVER PM2, NEVER VPS, NEVER Docker in production path.
  • 100% AI-automated — every customer interaction, every moderation decision, every transaction reconcile = AI. No manual queue, no live human chat support, no physical fulfillment.
  • 1-operator solo — one person can run the entire operation from a phone. No team meetings, no shared inbox, no shift rotation.
  • WhatsApp AI bot for all support (24/7, instant response, no SLA promises that need humans).
  • Mayar QRIS for all Indonesian payments (subscription auto-renew, no manual invoicing).
  • Indonesian UI primary — bahasa-first, English fallback only where unavoidable.
  • Privacy — opt-in only, delete-on-request honored within 24h (cron-driven).
  • No physical goods, no inventory — digital products + affiliate referrals only.

If the plan above describes any flow that violates these constraints, treat the plan as ASPIRATIONAL only and rework before building. The constraint trifecta wins.
