Managed multi-tenant inference for teams running CV on gigapixel aerial imagery. Bring your own model or start with a tuned YOLO baseline. Per-tenant cost attribution — GPU-seconds, tiles, latency — baked into the API, not bolted on in Grafana.
Tile-aware scheduling, georeferenced outputs, GPU sharing — running on Heronflux right now.
Python SDK, HTTP, or async webhook. Tag every job with a tenant — Heronflux meters per-tile and per-GPU-second automatically.
# pip install heronflux
from heronflux import Client

client = Client(api_key=...)
job = client.jobs.create(
    model="heronflux/solar-defects-yolov11",
    inputs=["s3://acme/flights/2026-05/*.tif"],
    tenant="midwest-power-and-light",
    geo=True,
)
result = client.jobs.wait(job.id)
# result.detections: GeoJSON
# result.usage: per-tenant cost & tiles
Horizontal inference platforms (Modal, Replicate, SageMaker) don't know what a tile is. Vertical drone-data SaaS doesn't expose the inference layer. Heronflux is the missing layer between them, built for aerial CV and only aerial CV.
Tile-aware batching, weight caching across tenants, and GPU sharing with no quality loss. The substrate that makes per-customer inference viable instead of margin-eating.
Tiling, projection, and georeferencing are first-class. Inputs can be 50,000×50,000 pixels without you writing a single line of tiling code.
Per-tenant GPU quotas, weight isolation, and line-item cost attribution. Bill your end-customers what they actually consumed — to the cent.
Start in minutes with a tuned YOLO baseline. Or upload your own YOLOv8 / YOLOv11 weights and deploy them on a dedicated GPU endpoint of your chosen tier — small, standard, or fast. Cold-start, weight caching, and per-deployment scale-to-zero handled.
Drops into your existing pipeline instead of replacing it.
Use the tuned YOLOv8n baseline out of the box, or upload your own YOLOv8 / YOLOv11 .pt weights. Pick a GPU tier per deployment — your model gets its own RunPod-backed endpoint with scale-to-zero idle.
COG, GeoTIFF, S3 URI, or a flight folder. Heronflux tiles with the right overlap, schedules across the fleet, and respects per-tenant quotas automatically.
Bounding boxes and classifications reprojected to WGS84. Export as GeoJSON, CSV, or KML — or receive a signed JSON payload at your webhook URL the moment a job finishes.
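As a rough illustration of what a georeferenced detection looks like, the sketch below turns a pixel-space bounding box into a GeoJSON polygon feature using a GDAL-style six-element affine geotransform. It assumes the source raster is already in WGS84 longitude/latitude; reprojecting from another CRS needs a projection library and is not shown. The function names and property keys are hypothetical and do not describe Heronflux's output schema.

```python
from typing import Dict, Sequence, Tuple


def pixel_to_map(gt: Sequence[float], col: float, row: float) -> Tuple[float, float]:
    """Apply a GDAL-style geotransform: (x0, px_w, row_rot, y0, col_rot, px_h)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def bbox_to_feature(gt: Sequence[float],
                    bbox: Tuple[float, float, float, float],
                    label: str, score: float) -> Dict:
    """Convert a pixel bbox (col_min, row_min, col_max, row_max) to a GeoJSON Feature."""
    c0, r0, c1, r1 = bbox
    # Counterclockwise exterior ring for a north-up raster (pixel height < 0),
    # closed by repeating the first corner.
    corners = [(c0, r0), (c0, r1), (c1, r1), (c1, r0), (c0, r0)]
    ring = [list(pixel_to_map(gt, c, r)) for c, r in corners]
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        # Property names are illustrative only.
        "properties": {"label": label, "score": score},
    }


if __name__ == "__main__":
    # Hypothetical north-up raster: origin at (-90.0, 40.0), 0.001 degrees per pixel.
    geotransform = (-90.0, 0.001, 0.0, 40.0, 0.0, -0.001)
    print(bbox_to_feature(geotransform, (100, 200, 300, 400), "hotspot", 0.91))
```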
From solar O&M scans to autonomous perimeter security — anywhere aerial imagery hits a model and a tenant gets billed.
Heronflux tracks GPU-time, tile-count, and inference latency per tenant — automatically. Bill your end-customers from real consumption data, not estimates.
Stop over-provisioning to cover the long tail. Start running tight margins on a fleet that does only what it needs to.
Upload your weights. Pick a tier. We provision a dedicated RunPod-backed endpoint for that (model, GPU) pair, scaled to zero by default, so you pay only while a job is actually running.
The integration plumbing engineering buyers expect from an inference platform — not an afterthought, not a vertical workflow trap.
HMAC-SHA256-signed POST to your URL the moment a job hits a terminal state. Header: X-Heronflux-Signature: sha256=…. Verify with a four-line snippet.
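A verification helper along these lines is one plausible shape for that snippet. The page states only the algorithm and the header format, so two things here are assumptions: that the signature is the hex-encoded HMAC of the raw request body, and that the key is a shared webhook secret. The function name is hypothetical.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, header_value: str) -> bool:
    """Check an X-Heronflux-Signature header of the form 'sha256=<hex digest>'.

    Assumes the digest is computed over the exact bytes of the request body.
    """
    expected = "sha256=" + hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), header_value.encode())
```

Verify against the raw bytes as received, before any JSON parsing, since re-serializing the payload can change whitespace or key order and invalidate the digest.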
Compare two completed jobs on the same project. We compute IoU-matched detections and split into matched / added / removed — perfect for "did this defect grow?"
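To make the matched / added / removed split concrete, here is a small sketch of IoU-based matching between two sets of boxes. It uses greedy highest-IoU-first pairing with a 0.5 threshold; both the strategy and the threshold are assumptions for illustration, not a description of how Heronflux computes its comparison.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def diff_detections(before: List[Box], after: List[Box], threshold: float = 0.5):
    """Return (matched index pairs, added indices in `after`, removed indices in `before`)."""
    pairs = sorted(
        ((iou(b, a), i, j) for i, b in enumerate(before) for j, a in enumerate(after)),
        reverse=True,
    )
    used_before, used_after, matched = set(), set(), []
    for score, i, j in pairs:
        if score < threshold:
            break  # remaining pairs overlap too little to count as the same object
        if i in used_before or j in used_after:
            continue  # each detection can be matched at most once
        used_before.add(i)
        used_after.add(j)
        matched.append((i, j))
    removed = [i for i in range(len(before)) if i not in used_before]
    added = [j for j in range(len(after)) if j not in used_after]
    return matched, added, removed
```

For a "did this defect grow?" check, the matched pairs are the interesting ones: comparing the areas of each pair shows whether a detection expanded between the two jobs.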
Upload a KML of your towers, panels, or pads. Polygons / points / lines render alongside detections on the map. Tagging detections to assets is on the v1.1 roadmap.
GeoJSON for QGIS / ArcGIS. CSV for spreadsheets. KML for Google Earth. Print → PDF for client deliverables.
Every job carries a tenant tag. Per-tile and per-GPU-second metering rolls up to a real cost line — bill your end-customer from consumption, not estimates.
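The roll-up from metered usage to a billable line can be sketched as below. The record fields (`tenant`, `gpu_seconds`, `tiles`) and both rates are hypothetical placeholders, not Heronflux's usage schema or pricing; the point is only to show per-tenant aggregation with exact decimal arithmetic so totals round cleanly to the cent.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable, Mapping

# Hypothetical rates for illustration only.
RATE_PER_GPU_SECOND = Decimal("0.0006")
RATE_PER_TILE = Decimal("0.00002")


def rollup(jobs: Iterable[Mapping]) -> Dict[str, Dict]:
    """Aggregate per-job usage records into one cost line per tenant."""
    totals = defaultdict(lambda: {"gpu_seconds": Decimal(0), "tiles": 0})
    for job in jobs:
        line = totals[job["tenant"]]
        # Go through str() so float inputs do not carry binary rounding noise.
        line["gpu_seconds"] += Decimal(str(job["gpu_seconds"]))
        line["tiles"] += job["tiles"]

    invoice = {}
    for tenant, line in totals.items():
        cost = line["gpu_seconds"] * RATE_PER_GPU_SECOND + line["tiles"] * RATE_PER_TILE
        invoice[tenant] = {**line, "cost": cost.quantize(Decimal("0.01"))}
    return invoice


if __name__ == "__main__":
    usage = [
        {"tenant": "midwest-power-and-light", "gpu_seconds": 1000, "tiles": 5000},
        {"tenant": "midwest-power-and-light", "gpu_seconds": 500.5, "tiles": 1000},
    ]
    print(rollup(usage))
```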
Mark false positives with one click. Exports and downstream counts respect dismissed rows. Restore anytime.
See how Heronflux handles your imagery, your models, your tenants. We'll walk you through a live workload in 30 minutes.