Edge Architecture for Copyright Documentation

Designing a Quota-Governed Edge Microservice

A private case study in edge-native architecture, abuse-aware quota design, and intentional scope containment.

Problem Framing

YouTube’s copyright dispute system requires structured documentation.

Pixabay provides royalty-free music with clear licensing terms.

Neither system integrates with the other.

Creators using Pixabay tracks who receive automated Content ID claims must manually reconstruct licensing metadata into the format YouTube expects. That process is:

Manual
Inconsistent
Error-prone

TrackDataPad formalizes the transformation layer between these systems. Nothing more.

System Behavior

The system performs a single deterministic transformation:

Pixabay Metadata
      ↓
Input Validation
      ↓
License Tier Check
      ↓
Quota Enforcement
      ↓
Structured Documentation Output

There is:

No probabilistic reasoning
No legal interpretation
No automated dispute submission

This narrow scope is deliberate.

Architecture

Deployment Model

┌────────────────────────┐
│     Cloudflare Pages   │
│      (Static UI)       │
└─────────────┬──────────┘
              │
              ▼
┌────────────────────────┐
│  Cloudflare Worker API │
│  - Validation          │
│  - License Check       │
│  - Quota Enforcement   │
│  - Record Generation   │
└─────────────┬──────────┘
              │
┌─────────────┴─────────────┐
▼                           ▼
┌─────────────────┐   ┌─────────────────┐
│  Cloudflare KV  │   │     Stripe      │
│ - License State │   │  Webhook Events │
│ - Usage Counters│   │ - Plan Upgrade  │
└─────────────────┘   └─────────────────┘

Stack

Layer	Technology	Role
Frontend	Cloudflare Pages	Static UI, CDN-distributed
Compute	Cloudflare Workers	Stateless request handling
State	Cloudflare KV	License state + usage counters
Billing	Stripe Webhooks	Plan activation + upgrades

No centralized server. No containers. No relational database.
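
The billing row above, where Stripe webhooks drive plan activation, might look roughly like the following. This is a minimal sketch under stated assumptions, not the project's code: `LicenseStore`, `applyBillingEvent`, the `license:` key prefix, and the use of `client_reference_id` to carry the application's user ID are all hypothetical choices. A real handler would also verify the `Stripe-Signature` header before trusting the payload, which is omitted here.

```typescript
// Assumed minimal write surface of the license store (a KV binding in production).
interface LicenseStore {
  put(key: string, value: string): Promise<void>;
}

// Only the fields this sketch reads; real Stripe events carry many more.
interface StripeEventLike {
  type: string;
  data: { object: { client_reference_id?: string | null } };
}

// Returns true when the event changed license state, false when it was ignored.
async function applyBillingEvent(
  store: LicenseStore,
  event: StripeEventLike
): Promise<boolean> {
  // Only completed checkouts activate a plan in this sketch.
  if (event.type !== "checkout.session.completed") return false;

  // Assumption: the checkout was created with the app's user ID in this field.
  const userId = event.data.object.client_reference_id;
  if (!userId) return false;

  await store.put(`license:${userId}`, "standard");
  return true;
}
```

Because the webhook only writes license state and the request path only reads it, a delayed webhook postpones activation without blocking record generation.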

Core Design Decisions

1. Stateless Compute

Each request is processed independently:

Validate input
Read license state (KV)
Increment usage counter with server-side quota enforcement
Generate structured record
Return output

No session state. No in-memory persistence.
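
The request steps above can be sketched as one function over injected stages. This is an illustrative outline only: `PipelineDeps`, `runPipeline`, and the status and error values are hypothetical names, and each stage is stubbed by the caller rather than taken from the real service.

```typescript
// Hypothetical dependency surface: every stage is injected, so the handler
// itself keeps nothing between requests.
interface PipelineDeps {
  validate(url: string): boolean;
  readTier(userId: string): Promise<string>;
  consumeQuota(userId: string, tier: string): Promise<boolean>;
  generate(url: string, tier: string): string;
}

type PipelineResult =
  | { status: 200; body: string }
  | { status: 400 | 429; error: string };

async function runPipeline(
  deps: PipelineDeps,
  userId: string,
  url: string
): Promise<PipelineResult> {
  // 1. Validate input before touching any state.
  if (!deps.validate(url)) return { status: 400, error: "invalid_input" };

  // 2. Read license state.
  const tier = await deps.readTier(userId);

  // 3. Increment the usage counter; refuse when the daily cap is reached.
  if (!(await deps.consumeQuota(userId, tier))) {
    return { status: 429, error: "quota_exceeded" };
  }

  // 4 and 5. Generate the structured record and return it.
  return { status: 200, body: deps.generate(url, tier) };
}
```

Ordering matters in this sketch: validation runs first, so malformed requests never consume quota.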

Workers auto-scale at the edge with zero infrastructure management.

2. Quota Governance as a Security Primitive

Daily caps serve two roles that are often conflated but must be kept distinct:

Business logic — tier differentiation, revenue protection, usage signaling.

System integrity — abuse prevention, metadata scraping mitigation, resource protection.

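Because the API exposes a structured metadata transformation endpoint, it would be trivially scrapeable without enforcement. Quota limits are enforced server-side through KV counters that are incremented on every request, not merely checked, so they act as a rate-limiting layer independent of client behavior. KV offers no atomic increment, so concurrent requests can cause minor counter drift; that drift is accepted as within quota tolerance (see Failure Mode Analysis).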
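
A per-day counter of this kind could be sketched as follows. The `QuotaStore` interface mirrors only the `get`/`put` shape of a Workers KV binding (including the `expirationTtl` option); `usageKey`, `consumeQuota`, the key layout, and the two-day TTL are assumptions made for illustration. Workers KV has no atomic increment primitive, so the read-then-write below can undercount under concurrent requests.

```typescript
// Assumed minimal subset of a KV namespace binding.
interface QuotaStore {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// One counter per user per UTC day, so counts reset implicitly at midnight UTC.
function usageKey(userId: string, now: Date): string {
  return `usage:${userId}:${now.toISOString().slice(0, 10)}`;
}

async function consumeQuota(
  store: QuotaStore,
  userId: string,
  dailyCap: number,
  now: Date
): Promise<{ allowed: boolean; used: number }> {
  const key = usageKey(userId, now);
  const parsed = Number((await store.get(key)) ?? "0");
  const used = Number.isFinite(parsed) ? parsed : 0; // treat corrupt values as zero

  if (used >= dailyCap) return { allowed: false, used };

  // Not atomic: two concurrent requests may both read the same value.
  // The TTL lets stale counters expire instead of accumulating forever.
  await store.put(key, String(used + 1), { expirationTtl: 2 * 24 * 60 * 60 });
  return { allowed: true, used: used + 1 };
}

// In-memory stand-in for local experiments; TTL is ignored here.
class MemoryQuotaStore implements QuotaStore {
  private data = new Map<string, string>();
  async get(key: string): Promise<string | null> {
    return this.data.get(key) ?? null;
  }
  async put(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
}
```

The counter is written on every allowed request, so a client cannot bypass the cap by skipping a separate "check" call.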

URL validation is treated the same way. Each incoming Pixabay URL is validated against an allowlist before any fetch is attempted, accepting only pixabay.com and clean single-level subdomains matching /^[a-z0-9-]+\.pixabay\.com$/. This blocks hostname-injection patterns that pass naive suffix checks: 10.0.0.1.pixabay.com ends with .pixabay.com, but its dotted prefix is not a single clean label, so the pattern rejects it. The rule is enforced at three independent points in the request pipeline, not once at the entry point.
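
The allowlist rule can be sketched as a small predicate. The regular expression is the one quoted above; `isAllowedPixabayUrl` is a hypothetical name, and the HTTPS-only restriction is an added assumption rather than something the design states.

```typescript
// Single-level subdomain pattern quoted from the text.
const PIXABAY_SUBDOMAIN = /^[a-z0-9-]+\.pixabay\.com$/;

function isAllowedPixabayUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }

  // Assumption for this sketch: only HTTPS sources are accepted.
  if (url.protocol !== "https:") return false;

  // The URL parser lowercases the hostname and strips userinfo and port,
  // so the check runs against the host that would actually be contacted.
  const host = url.hostname;
  return host === "pixabay.com" || PIXABAY_SUBDOMAIN.test(host);
}

// Hosts a naive endsWith("pixabay.com") check would accept or mishandle.
const samples = [
  "https://pixabay.com/music/example",
  "https://cdn.pixabay.com/audio/example.mp3",
  "https://10.0.0.1.pixabay.com/",
  "https://evilpixabay.com/",
  "https://pixabay.com.evil.example/",
  "https://pixabay.com@evil.example/",
];
for (const s of samples) {
  console.log(s, isAllowedPixabayUrl(s));
}
```

Parsing first and matching the resulting hostname, rather than pattern-matching the raw string, is what keeps tricks such as the userinfo form in the last sample from slipping through.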

This was designed before growth, not retrofitted after abuse.

Illustrative tier model:

Tier	Daily Cap	Annual Price
Free	10	N/A
Standard	200	$29

Lower caps and intentional pricing reduce the abuse surface while preserving margin. The cap-to-price ratio was chosen to raise the cost of automated abuse and to signal intentionality to legitimate users.

3. Deterministic Transformation Philosophy

The system was designed around a single constraint:

Perform a well-defined transformation — nothing more.

TrackDataPad:

Does not interpret legal meaning
Does not generate new claims
Does not automate platform actions

It takes publicly available metadata and returns a structured documentation record suitable for dispute workflows.

Benefits:

Predictable behavior across all inputs
Minimal compute complexity
Clear legal boundary: the system structures data, it does not assert authority
Fully traceable output, directly derived from input

The value lies in precision and constraint, not in expanding beyond the problem it was built to solve.
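
As an illustration of the "directly derived from input" property, record generation can be written as a pure function. The field names, labels, and `buildRecord` are hypothetical and do not reflect the actual record layout or the fields YouTube's form expects.

```typescript
// Hypothetical input shape; the real metadata fields may differ.
interface TrackMetadata {
  title: string;
  artist: string;
  sourceUrl: string;
  licenseName: string;
}

// Pure function: no clock, no randomness, no network. The same input always
// yields the same text, and every output line maps back to one input field.
function buildRecord(meta: TrackMetadata): string {
  const fields: Array<[string, string]> = [
    ["Track title", meta.title.trim()],
    ["Artist", meta.artist.trim()],
    ["Source URL", meta.sourceUrl.trim()],
    ["License", meta.licenseName.trim()],
  ];
  return fields.map(([label, value]) => `${label}: ${value}`).join("\n");
}

const example: TrackMetadata = {
  title: " Example Track ",
  artist: "Example Artist",
  sourceUrl: "https://pixabay.com/music/example",
  licenseName: "Example License",
};
console.log(buildRecord(example));
```

Keeping timestamps and identifiers out of the function body means two runs over the same metadata can be compared byte for byte.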

Failure Mode Analysis

Risk	Likelihood	Impact	Mitigation
Stripe webhook delay	Low	Temporary activation lag	Non-critical path
KV eventual consistency	Very Low	Minor counter drift	Within quota tolerance
Pixabay metadata change	Medium	Record generation failure	Modular parser
YouTube form change	Medium	Documentation mismatch	Field abstraction
SSRF via hostname injection	Low	Internal fetch to attacker-controlled IP	Allowlist regex enforced at three independent pipeline points

Metadata parsing is isolated behind a versioned adapter layer to reduce coupling to upstream changes.
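
A versioned adapter layer of this kind might be organized as below. This is a sketch under assumptions: the two payload shapes are invented for illustration and are not Pixabay's real formats, and `MetadataAdapter`, `parseMetadata`, and the version labels are hypothetical names.

```typescript
interface NormalizedTrack {
  title: string;
  artist: string;
}

interface MetadataAdapter {
  version: string;
  canParse(raw: unknown): boolean;
  parse(raw: unknown): NormalizedTrack;
}

function isRecord(x: unknown): x is Record<string, unknown> {
  return typeof x === "object" && x !== null;
}

// Invented shape A: { title, user }
const adapterV1: MetadataAdapter = {
  version: "v1",
  canParse: (raw) =>
    isRecord(raw) && typeof raw.title === "string" && typeof raw.user === "string",
  parse: (raw) => {
    const r = raw as { title: string; user: string };
    return { title: r.title, artist: r.user };
  },
};

// Invented shape B: { name, author: { name } }
const adapterV2: MetadataAdapter = {
  version: "v2",
  canParse: (raw) => {
    if (!isRecord(raw) || typeof raw.name !== "string") return false;
    const author = raw.author;
    return isRecord(author) && typeof author.name === "string";
  },
  parse: (raw) => {
    const r = raw as { name: string; author: { name: string } };
    return { title: r.name, artist: r.author.name };
  },
};

// Tries adapters in order (newest first) and reports which one matched.
function parseMetadata(
  raw: unknown,
  adapters: MetadataAdapter[]
): { version: string; track: NormalizedTrack } {
  for (const adapter of adapters) {
    if (adapter.canParse(raw)) {
      return { version: adapter.version, track: adapter.parse(raw) };
    }
  }
  throw new Error("unsupported metadata format");
}
```

When the upstream format changes, the change is absorbed by adding one adapter to the list; record generation keeps consuming `NormalizedTrack` unchanged.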

The primary risk vector is upstream dependency, not internal complexity.

Scalability Profile

The system supports significant load growth without architectural changes:

Workers scale automatically
Pages is distributed globally via CDN
KV is optimized for read-heavy workloads

Structured logs and deployment-level monitoring provide visibility into quota enforcement and upstream parsing errors.

Binding constraint: abuse rate, not throughput.

If needed at higher scale:

Durable Objects or D1 for stronger counter consistency
A dedicated observability layer on top of the existing structured logs

What This Project Demonstrates

Edge-native architecture without server management
Abuse-aware quota design as a first-class constraint
Stripe webhook integration for license state transitions
Dependency risk modeling
Scope containment under product pressure
High-margin micro-SaaS design

This is intentionally a small system.
Its strength lies in what was deliberately left out.

Private repository. Architecture discussions and engineering collaborations welcome.