# OpenFly & OpenFly-Agent

## What it is

**OpenFly** is a **platform and benchmark** for **outdoor aerial Vision–Language Navigation (VLN)**: a UAV follows **natural-language** instructions and uses **egocentric vision** to decide **flight** actions. The work ([arXiv:2502.18041](https://arxiv.org/abs/2502.18041), [ICLR 2026](https://openreview.net/forum?id=OKm3w71ymP) on OpenReview) provides:

* A **data-generation toolchain** (point clouds, semantic segmentation, trajectories, instructions) using multiple simulators and renderers (Unreal / AirSim, GTA V, Google Earth, 3D Gaussian Splatting, etc.).
* A **large-scale** aerial VLN **dataset** (on the order of **100k** trajectories, **18** scenes, varied altitude and path length).
* **OpenFly-Agent**: a **keyframe-aware** VLN model (derived from the **OpenVLA** line) that emphasizes informative frames and reports higher success rates than the baselines in the published evaluation.

For Konnex, this matches the [Drone navigation](/subnets-workload-classes/drone-navigation.md) workload: text mission → visual observations → flight decisions that validators can score against a signed task and a [PoPW](/understand-konnex/contracts-and-popw.md) sensor bundle.

## OpenFly-Agent (at a glance)

* **Inputs:** language instruction, current images, and **history** keyframes (as in the public architecture).
* **Outputs:** action predictions from the VLN head; the published **real-robot** setup pairs the policy with a **separate** local **planner** and **MPC** for tracking (see the paper for the full stack).
* **OpenVLA:** OpenFly-Agent is described as a **full fine-tune** from an **OpenVLA** checkpoint for aerial VLN. Use the [upstream README](https://github.com/SHAILAB-IPEC/OpenFly-Platform) for `unnorm_key`, tokenizer, and weight paths.
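The input bundle above (instruction, current image, history keyframes) can be pictured with a small sketch. This is not the upstream implementation: `KeyframeHistory`, `build_model_inputs`, the pixel-difference test, and both default values are hypothetical stand-ins for whatever keyframe selection the published model uses, and frames are plain lists of floats rather than real images.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Sequence

# Assumption: a frame is a flattened list of grayscale values in [0, 1].
Frame = Sequence[float]


def frame_difference(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two equally sized, non-empty frames."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


@dataclass
class KeyframeHistory:
    """Bounded buffer that keeps only frames that changed enough (hypothetical rule)."""

    max_keyframes: int = 4    # hypothetical history budget
    threshold: float = 0.1    # hypothetical change threshold
    frames: Deque[Frame] = field(default_factory=deque)

    def observe(self, frame: Frame) -> bool:
        """Store the frame if it counts as a keyframe; return True when stored."""
        if self.frames and frame_difference(self.frames[-1], frame) < self.threshold:
            return False
        self.frames.append(frame)
        # Drop the oldest keyframes once the budget is exceeded.
        while len(self.frames) > self.max_keyframes:
            self.frames.popleft()
        return True


def build_model_inputs(instruction: str, current: Frame, history: KeyframeHistory) -> Dict[str, object]:
    """Assemble the three inputs named in the list above into one dictionary."""
    return {
        "instruction": instruction,
        "current_image": current,
        "history_keyframes": list(history.frames),
    }
```

A real deployment would replace the difference test with the model's own keyframe logic and pass image tensors through the processor documented in the upstream README.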

## Official resources

| Resource     | URL                                                                                                                   |
| ------------ | --------------------------------------------------------------------------------------------------------------------- |
| Project page | [shailab-ipec.github.io/openfly](https://shailab-ipec.github.io/openfly/)                                             |
| Paper        | [arXiv:2502.18041](https://arxiv.org/abs/2502.18041)                                                                  |
| Code         | [github.com/SHAILAB-IPEC/OpenFly-Platform](https://github.com/SHAILAB-IPEC/OpenFly-Platform)                          |
| Weights      | Hosted on Hugging Face; follow the links in the repository README or model card (e.g. community repos named `openfly-agent`; names can change) |

> Follow the upstream repo for **CUDA**, **flash-attn**, **dlimp**, and **licensing**. GPU **memory** requirements are **release-specific**.

## Integration sketch for miners

1. Receive a **text mission** (and any subnet **schema** for altitude, geofence, or safety).
2. Run **OpenFly-Agent** (or a fine-tune) on timestamped **camera** frames; connect outputs to your **planner** / **FCU** **bridge** as your stack requires.
3. Submit model outputs and **signed** **telemetry** for **validator** scoring and **PoPW** per **subnet** **API** rules.
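The three steps above can be sketched as a minimal loop. Everything named here is an assumption for illustration: `Mission`, `TimedFrame`, `Step`, the `policy` callable (a stand-in for OpenFly-Agent inference), the `"stop"` action string, and the HMAC-SHA256 signature, which only stands in for whatever signing scheme the subnet API actually specifies.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Mission:
    instruction: str
    max_altitude_m: float  # hypothetical safety field from the subnet schema


@dataclass
class TimedFrame:
    timestamp: float
    image: bytes  # encoded camera frame


@dataclass
class Step:
    timestamp: float
    action: str
    frame_sha256: str  # ties the action to the exact frame it was computed from


# Hypothetical policy interface: (instruction, image bytes) -> action label.
Policy = Callable[[str, bytes], str]


def run_mission(mission: Mission, frames: Iterable[TimedFrame], policy: Policy) -> List[Step]:
    """Query the policy once per frame until it emits 'stop' or frames run out."""
    steps: List[Step] = []
    for frame in frames:
        action = policy(mission.instruction, frame.image)
        steps.append(Step(frame.timestamp, action, hashlib.sha256(frame.image).hexdigest()))
        if action == "stop":
            break
    return steps


def _canonical(payload: Dict) -> bytes:
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def sign_submission(mission: Mission, steps: List[Step], key: bytes) -> Dict:
    """Bundle mission and steps and attach an HMAC-SHA256 tag (placeholder scheme)."""
    payload = {"mission": asdict(mission), "steps": [asdict(s) for s in steps]}
    tag = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": tag}


def verify_submission(submission: Dict, key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(key, _canonical(submission["payload"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, submission["signature"])
```

In practice the policy output would pass through your planner and FCU bridge before any actuator command, and the payload fields and signature type would follow the subnet's own schema rather than this placeholder.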

## See also

* [OpenVLA](/supported-ai-models/openvla.md)
* [AI verifier](/supported-ai-models/verifier.md)
* [Drone navigation](/subnets-workload-classes/drone-navigation.md)

