Skip to content

Initial Release#6

Merged
slecleach merged 19 commits into
rai-opensource:mainfrom
alberthli:alberthli/initial-release
Jun 10, 2025
Merged

Initial Release#6
slecleach merged 19 commits into
rai-opensource:mainfrom
alberthli:alberthli/initial-release

Conversation

@alberthli
Copy link
Copy Markdown
Collaborator

@alberthli alberthli commented Jun 4, 2025

This PR adds the private development changes to judo.

As per slack discussion with @pculbertson-bdai, we have an internal checklist of action items prior to release on pypi.

Copy link
Copy Markdown
Contributor

@pculbertson-bdai pculbertson-bdai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, lgtm (lol)

In general: I read through everything + think the code makes sense at a high level. Flagged some style stuff/things that aren't working (benchmark in particular seems bricked on my Mac) but overall support merging to use this version as the foundation for the repo.

When we get Simon's buy-in, let's merge and do an initial release on pypi. Then we can stand up an issue board/roadmap and start whacking bugs.

Comment thread README.md
The `vis_name` field specifies the name of the visualization object in the GUI backend, so it doesn't matter that much. `xyz_vis_indices` indicates which indices of the array to use for the x, y, and z coordinates of the visualization. If you set it to `None`, it will use the default values specified in `xyz_vis_defaults`.

<p align="center">
<img src="/judo/_static/images/cylinder_push.gif" alt="cylinder_push_config" width="640">
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This image doesn't show up for me - is it just my machine?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe this will show up on the actual docs website, but it does NOT show up locally. If there's a way you guys know how to resolve the image paths thing, then let me know, I fiddled with it for a few hours but couldn't figure it out - I could get it to show up locally but not on the published site or vice versa.

Comment thread judo/app/controller.py
@on_event("INPUT", "controller_config")
def update_controller_config(self, event: dict) -> None:
"""Callback to update controller config on receiving a new config message."""
self.controller_config = from_event(event, self.controller_config_cls)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why store things in two places? Shouldn't there be one "source of truth" for these config classes?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - this is a holdover from some of the legacy judo code, where there were some configs bookkept in the dora node. In reality, the clean way to handle this is to have self.controller_config point to self.controller.controller_config using a getter, and have a setter set the config in the controller attribute.

I think cleaning up this internal logic is a good candidate for v0.0.2 release.

Comment thread judo/app/controller.py
def update_states(self, event: dict) -> None:
"""Callback to update states on receiving a new state measurement."""
state_msg = from_event(event, MujocoState)
self.states = np.concatenate([state_msg.qpos, state_msg.qvel])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to consider generalizing this to include other state data - e.g., in the official rollouts they dynamically compute what's in the full physics state.

We may also want to make a note that we only support basically Lagrangian dynamics with rigid bodies / stateless actuators atm

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the sentiment, but also there are some concerns which we might want to discuss.

I think we should have a more future-proof and robust way of handling the state data. One thing that is minimal and always shared (for rigid body Lagrangian dynamics) is passing times, generalized positions, and generalized velocities. If we allowed swapping the MuJoCo rollout backend out later, then adding other fields like the mocap fields could break things.

For example, in our lab, we were considering allowing custom function definitions like ODE callables for the rollouts (this is useful for things like doing rollouts with ROMs). If there were additional fields besides the ones I described above, it might break this proposed addition.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated docs to include a note about simulation support.

Comment thread judo/app/simulation.py
self.step()
self.write_states()

# Force controller to run at fixed rate specified by model dt.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to introspect here and print warnings if the inner loop time is running slower than realtime. I'm seeing behavior that looks suspiciously slower than realtime when my CPU is working hard

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on this.

Many of the default tasks also use a pretty high number of default threads, so we may want to converge on some reasonable numbers and/or compute the min of os.cpu_count() and the specified threads.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a warning.

Comment thread judo/app/benchmark.py
@@ -0,0 +1,107 @@
# Copyright (c) 2025 Robotics and AI Institute LLC. All rights reserved.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running benchmark from the terminal errors with:

File "/Users/preston/judo/judo/cli.py", line 51, in main_benchmark
    run(cfg)
  File "/Users/preston/miniconda3/envs/judo/lib/python3.12/site-packages/dora_utils/launch/run.py", line 32, in run
    node_name = f"tmp_{node_cfg.node_id}"
                       ^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'node_id'
started dora coordinator

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm this doesn't error for me, also doesn't error for Zach at Caltech. @slecleach does this work for you?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preston, If you're using conda, I think you may want to consider nuking and recreating your environment. I made some small changes to dora_utils to prep for release, so if you're using an old env, you might have some weird issues like this.

self.times = self.task.data.time + self.spline_timesteps
self.update_spline(self.times, self.nominal_knots)

def update_traces(self) -> None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will probably need a refactor for readability at some point - the shapes end up being pretty arcane. curious if we could try bringing back the CatmullRomSpline type + if the performance is any different these days

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to the issue tracker.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's reevaluate the CatmullRomSpline from Viser, if'that's too slow we could stick to plotting segments.

self.traces = np.reshape(elites, (total_traces_rollouts, 2, 3))


def make_spline(times: np.ndarray, controls: np.ndarray, spline_order: str) -> interp1d:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having this util and the constructor in SplineData seems redundant/hard to maintain. I think we should refactor to use the class method from SplineData or vice versa.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added an issue for this.

Comment thread judo/optimizers/cem.py
def pre_optimization(self, old_times: np.ndarray, new_times: np.ndarray) -> None:
"""Update sigma if the number of nodes has changed."""
if len(self.sigma) != self.num_nodes:
self.sigma = interp1d(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should be refactored to a unified spline call

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added an issue for this.

@slecleach
Copy link
Copy Markdown
Collaborator

slecleach commented Jun 9, 2025

Hi Albert! thanks so much for putting this PR together. This is a huge undertaking and it's addressing some critical pain points and limitations with the current version of Judo.

Some initial comments after I went through the documentation, the readme, the installation and docs building scripts.
I still need to go through the code and github actions. I'll add more comments when I am done with this.
Tested:

  • tested conda install
  • tested docs build: only issue are the instructions see below
  • tested judo app
  • tested benchmark
  • proof-read the documentation

Suggested fixes:

  • for the github actions, we should make it consistent for the python and Ubuntu versions. We were using ubuntu 22.04 and python 3.10. I am happy to move to a matrix-based test where we test both 3.10 + 22.04 and 3.13 + 24.04. We need to investigate the consequences on the dependencies though.
  • add badges at the top of the README, build status, docs status, codecov, documentation link, supported python versions
  • Docs building instructions are confusing because cd .. is implicit. One proposal could be:
cd docs
python -m venv .docs_venv [not sure we want to mention venv though]
source .docs_venv/bin/activate
pip install -r requirements.txt
cd ..
sphinx-build docs/source docs/build -b dirhtml
python -m http.server --directory docs/build 8000

Minor fixes: [we can put this on the roadmap for later development if we agree on these points]

  • w_pusher_velocity for the cylinder push task, the slider is stuck between 0 and 0, revamp the logic for the min_value and max value of the slider depending on the value of the default (case1 default < 0, case2 default=0, case3 default > 0) I think our logic is only good for case3 atm.
  • caltech_leap_cube visualization of fingers
  • README “For developers, to build docs locally, run the following in your environment. Note that asset paths will be broken locally that work correctly on Github Pages.” some typo maybe
  • Meshes not showing up. If a mesh isn’t showing up in the GUI, make sure that the geoms are named. If there are multiple mesh geoms that are not named, they will not show up in the GUI. I fixed this issue on the RAI side, I can PR this soon (just add give a unique name to each on the viser side). I don’t think we should expect people to modify their XMLs as the ones on mujoco-menagerie are often missing geometry names.
  • docs warning treated as errors (same as Nerfstudio); this is not a priority atm
  • dice icon at the bottom right of the docs pages: this is a remain from Jacta’s docs

@alberthli
Copy link
Copy Markdown
Collaborator Author

@slecleach

Question: how do I fix the dice in the bottom right of the docs? I can't find where the symbol is originating from. I'll happily remove it before merging this PR.

Addressed the following:

  • Updated code_checks.yml to use a strategy matrix testing 22.04/24.04 and python 3.10-3.13.
  • Added requested badges. The link to the docs already appears in the README. I don't think I have permissions to add links on the repo to things like the docs, but you should do this.
  • Fixed the doc building instruction typo - thanks for the catch to both Preston and Simon
  • caltech_leap_cube fingertips not showing up. In particular, this is unrelated to the meshes not showing up issue that you pointed out above (in fact, I independently ran into this and already have a fix for it). I have an undocumented feature that lets you filter geoms from being visualized based on whether substrings are in their names. I accidentally left the substring "collision" in the fingertip geom names in this XML. Removing them fixes this issue.

Relegated the following to the v0.0.2 release:

  • Make docs warnings errors
  • Revamping slider logic. In the meantime, I patched the logic somewhat so that you can input max/min values greater than what is specified and the limits will update to accommodate that choice, since it's a lot better than needing to restart the entire app.

@slecleach slecleach merged commit 2189498 into rai-opensource:main Jun 10, 2025
17 checks passed
@alberthli alberthli deleted the alberthli/initial-release branch June 10, 2025 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants