# OpenVLA · OFT

## 1. What is OpenVLA-OFT?

OpenVLA-OFT is a set of methods and code for fine-tuning the base OpenVLA 7B model with the OFT (Optimized Fine-Tuning) recipe: LoRA-based parameter-efficient updates combined with parallel decoding, action chunking, and a continuous action head (L1 regression or diffusion). The goal is to cheaply adapt the same model to new domains/tasks without retraining all weights.

Official resources:

* Website: `https://openvla-oft.github.io`
* Code: `https://github.com/moojink/openvla-oft`

Conceptually:

* Start with the base 7B OpenVLA model.
* Fine-tune it with LoRA adapters (a PEFT technique) plus a new continuous action head, instead of the original autoregressive action tokens.
* Train on new data (e.g., LIBERO variants or new scenes).
* At inference, the adapted model predicts a chunk of actions in one parallel decoding pass, which makes inference much faster than token-by-token decoding.
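The adapter idea can be illustrated with a toy low-rank update (illustrative only, not the actual OpenVLA-OFT implementation):

```python
# Toy illustration of adapter-style fine-tuning (LoRA-like): the frozen base
# weight W is augmented with a trainable low-rank delta B @ A, so only
# d*r + r*d parameters are learned per adapted layer instead of d*d.
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                          # hidden size, adapter rank (r << d)
W = rng.normal(size=(d, d))          # frozen base weight
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = rng.normal(size=(d, r)) * 0.01   # trainable up-projection

W_eff = W + B @ A                    # effective weight used at inference
print(W_eff.shape)                   # (8, 8)
```

Here only 32 adapter parameters are trained versus 64 in the full matrix; at 7B scale the savings are what makes adaptation cheap.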

***

## 2. Architecture and differences vs. “vanilla” OpenVLA

Key differences:

* Depends on a custom Transformers fork (`transformers-openvla-oft`) that knows how to load and run the modified models.
* Checkpoints like `moojink/openvla-7b-oft-finetuned-libero-10` bundle the base model plus OFT parameters.
* The `openvla-oft` repository contains:
  * training scripts,
  * evaluation scripts on LIBERO,
  * configs for different modes (e.g., L1 regression vs. diffusion action decoders).

For KNX, this means you can:

* Efficiently adapt model behavior to your sim/robots without full retraining.
* Use the existing OFT checkpoints as strong baselines on LIBERO and extend from there.

***

## 3. Installation: a working pipeline

Below is a setup that ran `run_libero_eval.py` end-to-end without errors.

### 3.1. Conda environment

```bash
conda create -n openvla_oft python=3.10 -y
conda activate openvla_oft
```

### 3.2. PyTorch

```bash
pip install --index-url https://download.pytorch.org/whl/cu121 \
  "torch==2.2.0" "torchvision==0.17.0" "torchaudio==2.2.0"
```

Check:

```bash
python -c "import torch; print('torch:', torch.__version__)"
# torch: 2.2.0+cu121
```

### 3.3. Clone the repo and base install

```bash
cd /data
git clone https://github.com/moojink/openvla-oft.git
cd openvla-oft

pip install -e .
```

This installs part of the dependencies; others remain missing or version-mismatched and are handled in the next section.

***

## 4. Installing additional dependencies

In practice, you’ll need the following (versions chosen for compatibility):

```bash
pip install accelerate==0.28.0 \
  diffusers==0.30.3 \
  einops \
  fastapi \
  huggingface_hub \
  imageio \
  json-numpy \
  jsonlines \
  matplotlib \
  "peft==0.11.1" \
  protobuf \
  rich \
  "sentencepiece==0.1.99" \
  "tensorflow==2.15.0" \
  "tensorflow_datasets==4.9.3" \
  "tensorflow_graphics==2021.12.3" \
  "timm==0.9.10" \
  "tokenizers==0.19.1" \
  uvicorn \
  wandb \
  "draccus==0.8.0"
```

Then install the custom Transformers fork and `dlimp`:

```bash
pip install "transformers @ git+https://github.com/moojink/transformers-openvla-oft.git"
pip install "dlimp @ git+https://github.com/moojink/dlimp_openvla.git"
```

Sanity checks:

```bash
python -c "import transformers, draccus, diffusers; print('OK')"
python -c "import tensorflow as tf; print('tf:', tf.__version__)"
# tf: 2.15.0
```
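A quick way to confirm the whole stack resolves after these installs (a small helper script, not part of the repo):

```python
# Check that the key dependencies can be found on the current interpreter
# (names are import names, not pip package names).
import importlib.util

REQUIRED = ["transformers", "draccus", "diffusers", "tensorflow",
            "tensorflow_datasets", "peft", "timm", "dlimp"]

def missing_modules(names):
    """Return the subset of modules that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", missing or "none")
```

Anything listed as missing points back to a failed step in the install sequence above.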

***

## 5. LIBERO + simulators (robosuite, MuJoCo)

Same as with base OpenVLA:

```bash
cd /data
git clone https://github.com/Lifelong-Robot-Learning/LIBERO.git
cd LIBERO
pip install -e .
```

Then:

```bash
pip install mujoco==2.3.7
pip install "robosuite<2.0.0"
pip install gym==0.26.2 gym-notices
pip install pyopengl glfw opencv-python pynput easydict bddl
```

NumPy/TF version conflict (same as with base OpenVLA):

```bash
pip install "numpy>=1.23.5,<2.0.0"
```

If `opencv-python` attempts to bump NumPy to 2.x, re-pin NumPy (e.g., `numpy==1.26.4`).
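To catch a bad upgrade early, the pin can be checked programmatically (a hypothetical helper; the bounds mirror the pip constraint above):

```python
def numpy_is_compatible(version: str) -> bool:
    """True iff the version satisfies 1.23.5 <= numpy < 2.0 (what TF 2.15 expects)."""
    parts = [int(p) for p in version.split(".")[:3]]
    major, minor, patch = (parts + [0, 0, 0])[:3]  # pad short versions like "1.26"
    return (1, 23, 5) <= (major, minor, patch) and major < 2

print(numpy_is_compatible("1.26.4"))  # True
print(numpy_is_compatible("2.0.1"))   # False
```

Running it against `numpy.__version__` right after any `pip install` makes silent NumPy bumps visible immediately.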

***

## 6. Environment variables

Use the same headless-friendly configuration; adjust `PYTHONPATH` for `openvla-oft`:

```bash
export HF_HOME=/data/hf_home
export HF_HUB_CACHE=/data/hf_home/hub
export TRANSFORMERS_CACHE=/data/hf_home/hub
export HF_MODULES_CACHE=/data/hf_home/modules

export PYTHONPATH=/data/LIBERO:/data/openvla-oft:$PYTHONPATH

export MUJOCO_GL=egl
export MUJOCO_EGL_DEVICE_ID=0
export TOKENIZERS_PARALLELISM=false
```
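A small sanity check (a hypothetical helper, not shipped with the repo) that the exports above actually took effect in the interpreter you launch:

```python
import os

def check_headless_env(env=None):
    """Return a list of problems with the headless-rendering environment."""
    env = os.environ if env is None else env
    problems = [f"{key} is not set"
                for key in ("HF_HOME", "PYTHONPATH", "MUJOCO_GL")
                if not env.get(key)]
    if env.get("MUJOCO_GL") not in (None, "", "egl"):
        problems.append("MUJOCO_GL should be 'egl' for headless EGL rendering")
    return problems

print(check_headless_env())  # [] once everything above is exported
```

An empty list means the shell exports reached the Python process; a common failure mode is setting the variables in one terminal and launching the eval from another.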

***

## 7. Test: `run_libero_eval.py` with an OFT checkpoint

Run the official evaluation script:

```bash
cd /data/openvla-oft
conda activate openvla_oft

python experiments/robot/libero/run_libero_eval.py \
  --model_family openvla \
  --pretrained_checkpoint moojink/openvla-7b-oft-finetuned-libero-10 \
  --task_suite_name libero_10 \
  --use_l1_regression True \
  --use_diffusion False \
  --use_film False \
  --num_images_in_input 2 \
  --use_proprio True \
  --center_crop True \
  --num_open_loop_steps 8 \
  --num_trials_per_task 1 \
  --env_img_res 256 \
  --local_log_dir ./experiments/logs_oft_libero10
```
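`run_libero_eval.py` parses these flags into a config object (the repo uses draccus over a dataclass). A simplified stdlib sketch of the mapping, with field names mirroring the flags above (illustrative only, not the repo's actual config class):

```python
from dataclasses import dataclass

@dataclass
class EvalConfig:
    model_family: str = "openvla"
    pretrained_checkpoint: str = "moojink/openvla-7b-oft-finetuned-libero-10"
    task_suite_name: str = "libero_10"
    use_l1_regression: bool = True   # continuous action head trained with L1 loss
    use_diffusion: bool = False      # alternative diffusion action head
    use_film: bool = False           # FiLM language conditioning
    num_images_in_input: int = 2     # e.g., third-person + wrist camera
    use_proprio: bool = True         # feed proprioceptive state to the model
    center_crop: bool = True
    num_open_loop_steps: int = 8     # actions executed per model call (chunking)
    num_trials_per_task: int = 1
    env_img_res: int = 256
    local_log_dir: str = "./experiments/logs_oft_libero10"

cfg = EvalConfig()
print(cfg.task_suite_name)  # libero_10
```

Note that `use_l1_regression` and `use_diffusion` select between the two action-decoder modes mentioned in section 2, so at most one of them should be `True`.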

Expected output:

* TensorFlow logs about CPU optimizations (harmless).
* robosuite warnings about a missing “private macro file” (harmless).
* per-episode progress logs, with rollout videos saved under `./rollouts/...`.

If you see:

```
ValueError: Could not find a backend to open `...mp4` with iomode `w?`.
...
FFMPEG: pip install imageio[ffmpeg]
```

Install:

```bash
pip install "imageio[ffmpeg]"
```

***

## 8. Behavior: hi-res vs. low-res and seeds

We’ve observed small discrepancies between low-res control runs and hi-res replays:

* e.g., an object slightly tilted in one video but upright in another, even with the same `actions_log`.

Likely reasons:

* `OffScreenRenderEnv` and robosuite/MuJoCo can produce slightly different numeric trajectories with different resolutions/backends.
* Seeds:
  * we set `env.seed(0)` and global `set_seed_everywhere(cfg.seed)`,
  * but additional randomness in LIBERO/robosuite can exist if not fixed everywhere.
* Over long rollouts, tiny integration differences can accumulate.

Practically:

* For demos, this is acceptable; the intent of the behavior is preserved.
* For strict replication and comparisons:
  * fix seeds everywhere,
  * ensure identical init states and physics params,
  * consider keeping the same resolution for control and replay.
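For the “fix seeds everywhere” point, here is a sketch of a `set_seed_everywhere`-style helper (an assumed shape; the repo's own utility may differ), which also seeds NumPy/PyTorch when they are installed:

```python
import os
import random

def set_seed_everywhere(seed: int) -> None:
    """Seed every RNG we can reach; optional libraries are seeded only if present."""
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # no-op without a GPU
    except ImportError:
        pass

set_seed_everywhere(0)
a = random.random()
set_seed_everywhere(0)
assert random.random() == a  # same seed -> same draw
```

Even with this, robosuite/MuJoCo internals may draw from their own RNG streams, which is why identical init states and physics parameters also need to be pinned for strict replication.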

***

## 9. Pros and cons for KNX

Pros:

* Parameter-efficient fine-tuning:
  * no need to train all 7B parameters,
  * cheaper and faster adaptation to new tasks.
* Ready OFT checkpoints for LIBERO (e.g., `moojink/openvla-7b-oft-finetuned-libero-10`) with strong performance.
* Great fit for KNX:
  * base OpenVLA = a “universal brain”,
  * OFT fine-tunes = lightweight specializations for specific robots and business use-cases.

Cons:

* Heavier dependency stack than base OpenVLA:
  * TensorFlow + TF-graphics + TFDS,
  * diffusers,
  * custom Transformers fork,
  * `dlimp`, etc.
* Requires careful env assembly, especially NumPy/TF/OpenCV versions.
* Research-oriented code and docs; for production KNX you’ll still need orchestration, safety layers, and integration with real robots and web infra.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.konnex.world/supported-ai-models/openvla-oft.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
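For programmatic use, the query can be issued with Python's standard library (the question string is just an example):

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://docs.konnex.world/supported-ai-models/openvla-oft.md"

def build_ask_url(question: str) -> str:
    """URL-encode a natural-language question into the ?ask= parameter."""
    return f"{BASE}?ask={quote(question)}"

def ask_docs(question: str) -> str:
    """Fetch the answer from the docs endpoint (requires network access)."""
    with urlopen(build_ask_url(question)) as resp:
        return resp.read().decode("utf-8")

print(build_ask_url("Which Transformers fork does OpenVLA-OFT require?"))
```

`quote` handles spaces and punctuation in the question, so multi-word questions are safe to pass as-is.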
