OpenFly & OpenFly-Agent
What it is
OpenFly is a platform and benchmark for outdoor aerial Vision–Language Navigation (VLN): a UAV follows natural-language instructions and uses egocentric vision to decide flight actions. The work (arXiv:2502.18041, ICLR 2026 on OpenReview) provides:
A data-generation toolchain (point clouds, semantic segmentation, trajectories, instructions) using multiple simulators and renderers (Unreal / AirSim, GTA V, Google Earth, 3D Gaussian Splatting, etc.).
A large-scale aerial VLN dataset (on the order of 100k trajectories, 18 scenes, varied altitude and path length).
OpenFly-Agent: a keyframe-aware VLN model (derived from the OpenVLA line) that emphasizes informative frames and reports higher success rates than the baselines in the published evaluation.
For Konnex, this matches the Drone navigation workload: text mission → visual observations → flight decisions that validators can score against a signed task and a PoPW sensor bundle.
OpenFly-Agent (at a glance)
Inputs: language instruction, current images, and history keyframes (as in the public architecture).
Outputs: action prediction for the VLN head; the published real-robot setup pairs the policy with a separate local planner and MPC for tracking (see the paper for the full stack).
OpenVLA: OpenFly-Agent is described as a full fine-tune from an OpenVLA checkpoint for aerial VLN. Use the upstream README for unnorm_key, tokenizer, and weight paths.
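To make the input side concrete, here is a minimal sketch of how a miner might maintain a history of keyframes and assemble the three inputs listed above. It is not the keyframe selection used by OpenFly-Agent; it swaps in a simple frame-difference heuristic for illustration. The names KeyframeHistory, build_model_inputs, max_keyframes, and threshold are hypothetical, and frames are represented as flat lists of pixel values rather than real images.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Sequence

# Stand-in for an image: flattened grayscale pixels in [0, 1] (assumption).
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


@dataclass
class KeyframeHistory:
    """Keeps recent frames that differ enough from the last stored keyframe.

    Illustrative heuristic only; the published model's selection may differ.
    """
    max_keyframes: int = 4   # hypothetical history length
    threshold: float = 0.1   # hypothetical change threshold

    def __post_init__(self) -> None:
        # A bounded deque drops the oldest keyframe once the limit is reached.
        self._frames: Deque[Frame] = deque(maxlen=self.max_keyframes)

    def observe(self, frame: Frame) -> bool:
        """Store the frame if it is a new keyframe; return whether it was stored."""
        is_new = (
            not self._frames
            or mean_abs_diff(frame, self._frames[-1]) >= self.threshold
        )
        if is_new:
            self._frames.append(frame)
        return is_new

    def keyframes(self) -> List[Frame]:
        return list(self._frames)


def build_model_inputs(
    instruction: str, current: Frame, history: KeyframeHistory
) -> Dict[str, object]:
    """Bundle instruction, current image, and history keyframes (hypothetical layout)."""
    return {
        "instruction": instruction,
        "current_image": current,
        "history_keyframes": history.keyframes(),
    }
```

In a real deployment the dictionary would be replaced by whatever processor and prompt format the upstream README specifies; the sketch only shows the bookkeeping around the history.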
Official resources
Project page
Paper
Weights: listed on Hugging Face via the repository / model card (e.g. community repos with openfly-agent in the name; names can change).
Follow the upstream repo for CUDA, flash-attn, dlimp, and licensing. GPU memory requirements are release-specific.
Integration sketch for miners
Receive a text mission (and any subnet schema for altitude, geofence, or safety).
Run OpenFly-Agent (or a fine-tune) on timestamped camera frames; connect outputs to your planner / FCU bridge as your stack requires.
Submit model outputs and signed telemetry for validator scoring and PoPW per subnet API rules.
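The three steps above can be sketched as a small loop: take a mission, run a policy over timestamped frames, then sign the result. This is a sketch under stated assumptions, not the subnet API. Mission, TimedFrame, run_mission, sign_submission, verify_submission, the "stop" action, and the payload layout are all hypothetical, the policy is any callable standing in for OpenFly-Agent, and HMAC-SHA256 with a shared key stands in for whatever signature scheme the subnet actually requires.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Mission:
    task_id: str        # hypothetical identifier of the signed task
    instruction: str    # the text mission


@dataclass(frozen=True)
class TimedFrame:
    timestamp_s: float
    image_sha256: str   # digest of the camera frame, so telemetry stays small


# A policy maps (instruction, frame) to an action label; a stand-in for the model.
Policy = Callable[[str, TimedFrame], str]


def run_mission(mission: Mission, frames: List[TimedFrame], policy: Policy) -> List[Dict]:
    """Run the policy frame by frame and record one telemetry step per decision."""
    steps: List[Dict] = []
    for frame in frames:
        action = policy(mission.instruction, frame)
        steps.append({
            "timestamp_s": frame.timestamp_s,
            "image_sha256": frame.image_sha256,
            "action": action,
        })
        if action == "stop":  # hypothetical terminal action
            break
    return steps


def _canonical_bytes(payload: Dict) -> bytes:
    # Sorted keys and fixed separators give a stable byte string to sign.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_submission(mission: Mission, steps: List[Dict], key: bytes) -> Dict:
    """Attach an HMAC-SHA256 tag to the payload (stand-in for the real scheme)."""
    payload = {"task_id": mission.task_id, "steps": steps}
    signature = hmac.new(key, _canonical_bytes(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_submission(submission: Dict, key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(
        key, _canonical_bytes(submission["payload"]), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, submission["signature"])
```

Hashing frames instead of embedding them keeps the signed payload compact while still binding each action to a specific observation. A production miner would follow the subnet's own schema and key handling.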

