OpenVLA · OFT
1. What is OpenVLA-OFT?
OpenVLA-OFT is a set of methods and code for parameter-efficient fine-tuning (OFT = Optimized Fine-Tuning) on top of the base OpenVLA 7B model. The goal is to adapt the same model cheaply to new domains/tasks by training small add-on parameterizations (LoRA-style adapters) instead of updating all weights.
Official resources:
Website: https://openvla-oft.github.io
Code: https://github.com/moojink/openvla-oft
Conceptually:
Start with the base 7B OpenVLA model.
Add small OFT components (akin to PEFT/LoRA-family ideas).
Train only those components on new data (e.g., LIBERO variants or new scenes).
At inference, the base model + OFT layers together provide adapted behavior.
2. Architecture and differences vs. “vanilla” OpenVLA
Key differences:
Depends on a custom Transformers fork, transformers-openvla-oft, that knows how to load and use OFT layers.
Checkpoints like moojink/openvla-7b-oft-finetuned-libero-10 bundle the base model plus OFT parameters.
The openvla-oft repository contains:
training scripts,
evaluation scripts on LIBERO,
configs for different modes (e.g., L1 regression vs. diffusion action decoders).
For KNX, this means you can:
Efficiently adapt model behavior to your sim/robots without full retraining.
Use the existing OFT checkpoints as strong baselines on LIBERO and extend from there.
3. Installation: a working pipeline
Below is a setup that ran run_libero_eval.py end-to-end without errors.
3.1. Conda environment
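A minimal sketch of the environment setup; the env name and Python version here are my choices, not requirements stated by the repo:

```shell
# Env name and Python version are a choice, not a requirement of the repo.
conda create -n openvla-oft python=3.10 -y
conda activate openvla-oft
```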
3.2. PyTorch
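A sketch of the PyTorch install; the 2.2.0 pin and cu121 wheel index are assumptions, so match them to your CUDA driver:

```shell
# cu121 wheel index and the 2.2.0 pin are assumptions; match your CUDA driver.
pip install torch==2.2.0 torchvision==0.17.0 --index-url https://download.pytorch.org/whl/cu121
```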
Check:
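A quick check that the install worked (prints the torch version and whether CUDA is visible):

```shell
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```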
3.3. Clone the repo and base install
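Clone and do an editable install, using the repo URL from the resources above:

```shell
git clone https://github.com/moojink/openvla-oft.git
cd openvla-oft
pip install -e .
```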
This will install part of the dependencies, but some will remain missing or have mismatched versions.
4. Installing additional dependencies
In practice, you’ll need the following (versions chosen for compatibility):
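A sketch of the extra installs, based on the dependency stack named later in this page (TensorFlow, TF-graphics, TFDS, diffusers, OpenCV, pinned NumPy); the exact versions here are illustrative assumptions:

```shell
# Versions are illustrative assumptions; match them to the repo's requirements.
pip install \
  "tensorflow==2.15.0" \
  tensorflow-graphics \
  tensorflow-datasets \
  diffusers \
  "opencv-python==4.9.0.80" \
  "numpy==1.26.4"
```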
Then install the custom Transformers fork and dlimp:
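A sketch of those two installs; the fork name comes from this page, but the exact git URLs (especially for dlimp) are assumptions to verify against the repo README:

```shell
# Fork name per this page; the exact git URLs (especially dlimp's) are assumptions.
pip install "transformers @ git+https://github.com/moojink/transformers-openvla-oft.git"
pip install "dlimp @ git+https://github.com/moojink/dlimp_openvla.git"
```

Installing the fork last helps ensure it shadows any stock transformers pulled in earlier.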
Sanity checks:
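For example, confirm the fork is the active transformers and that the other additions import cleanly:

```shell
# The fork should be the transformers that actually imports; dlimp/diffusers should load cleanly.
python -c "import transformers; print(transformers.__version__)"
python -c "import dlimp, diffusers; print('ok')"
```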
5. LIBERO + simulators (robosuite, MuJoCo)
Same as with base OpenVLA:
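As with base OpenVLA, install LIBERO itself from source (the Lifelong-Robot-Learning repo):

```shell
git clone https://github.com/Lifelong-Robot-Learning/LIBERO.git
cd LIBERO
pip install -e .
cd ..
```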
Then:
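Pull in the LIBERO-specific extras; the requirements-file path below is an assumption carried over from base OpenVLA, so verify it exists in your checkout:

```shell
# Path is an assumption (mirrors base OpenVLA's layout); run from the openvla-oft repo root.
pip install -r experiments/robot/libero/libero_requirements.txt
```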
NumPy/TF conflict (same story):
If opencv-python attempts to bump NumPy to 2.x, re-pin NumPy (e.g., numpy==1.26.4).
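The re-pin is a one-liner:

```shell
pip install "numpy==1.26.4"
```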
6. Environment variables
Use the same headless-friendly configuration; adjust PYTHONPATH for openvla-oft:
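A sketch of the variables; `osmesa` is one headless-friendly MuJoCo backend (`egl` is another option on GPU nodes), and the clone path is an assumption:

```shell
# osmesa is one headless-friendly backend; "egl" also works on GPU nodes (assumption).
export MUJOCO_GL=osmesa
export PYOPENGL_PLATFORM=osmesa
# Adjust to wherever you cloned openvla-oft.
export PYTHONPATH="$HOME/openvla-oft:$PYTHONPATH"
```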
7. Test: run_libero_eval.py with an OFT checkpoint
Run the official evaluation script:
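A typical invocation, using the OFT checkpoint named on this page; flag names are as I recall from the repo README, so verify them against the script's `--help`:

```shell
# Flag names are assumptions; check `python experiments/robot/libero/run_libero_eval.py --help`.
python experiments/robot/libero/run_libero_eval.py \
  --pretrained_checkpoint moojink/openvla-7b-oft-finetuned-libero-10 \
  --task_suite_name libero_10
```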
Expected output:
TensorFlow logs about CPU optimizations (fine).
robosuite warnings about missing “private macro file” (fine).
episode progress and videos saved under ./rollouts/....
If you see an import error for a missing package, install the corresponding package with pip.
8. Behavior: hi-res vs. low-res and seeds
We’ve observed small discrepancies between low-res control runs and hi-res replays:
e.g., an object slightly tilted in one video but upright in another, even with the same
actions_log.
Reasons:
OffScreenRenderEnv and robosuite/MuJoCo can produce slightly different numeric trajectories with different resolutions/backends.
Seeds: we set env.seed(0) and a global set_seed_everywhere(cfg.seed), but additional randomness in LIBERO/robosuite can remain if it is not fixed everywhere.
Over long rollouts, tiny integration differences can accumulate.
Practically:
For demos, this is acceptable—intent is preserved.
For strict replication and comparisons:
fix seeds everywhere,
ensure identical init states and physics params,
consider keeping the same resolution for control and replay.
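The "fix seeds everywhere" step can be sketched like this; the function name mirrors the set_seed_everywhere mentioned above, but this body is my assumption of what such a helper does, not the repo's exact implementation:

```python
import random

import numpy as np

try:
    import torch  # seed torch too when it is available
except ImportError:
    torch = None


def set_seed_everywhere(seed: int) -> None:
    """Seed every RNG we control (a sketch; the repo's own helper may differ)."""
    random.seed(seed)
    np.random.seed(seed)
    if torch is not None:
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # no-op without CUDA


# Identical seeds should reproduce identical sampled streams:
set_seed_everywhere(0)
a = np.random.rand(3)
set_seed_everywhere(0)
b = np.random.rand(3)
print(bool(np.allclose(a, b)))  # True
```

Note that this only covers Python-side RNGs; simulator-internal randomness (e.g., inside robosuite/MuJoCo) still needs env.seed(...) and identical init states.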
9. Pros and cons for KNX
Pros:
Parameter-efficient fine-tuning:
no need to train all 7B parameters,
cheaper and faster adaptation to new tasks.
Ready OFT checkpoints for LIBERO (e.g., moojink/openvla-7b-oft-finetuned-libero-10) with strong performance.
Great fit for KNX:
base OpenVLA = “universal brain”,
OFT layers = “stickers” for specific robots/business use-cases.
Cons:
Heavier dependency stack than base OpenVLA:
TensorFlow + TF-graphics + TFDS,
diffusers,
custom Transformers fork,
dlimp, etc.
Requires careful env assembly, especially NumPy/TF/OpenCV versions.
Research-oriented code and docs; for production KNX you’ll still need orchestration, safety layers, and integration with real robots and web infra.