Trinity contrib - Fix notebook for upstream clone and add LNC documentation #91

Open
jimburtoft wants to merge 1 commit into aws-neuron:main from jimburtoft:contrib/trinity-model

@jimburtoft
Contributor

Fixes for the Trinity contrib demonstration notebook; no model code changes.

  • Fix nki-library install: use clone+copy workaround (pip install fails due to setuptools_scm issue)
  • Fix HF_TOKEN: replace Python syntax in %%bash cell with proper bash
  • Fix sys.path: point to /home/ubuntu/nxdi/ (upstream clone) not nxdi-fork
  • Fix Nano TP: change from TP=2 to TP=1 (TP=2 invalid with LNC=2)
  • Add kernel restart warnings between model sections (NeuronCore OOM)
  • Add LNC configuration section to README with valid TP degrees
  • Update validated results to match upstream testing (TP=1 Nano, +28% Mini)
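
For reviewers, the sys.path fix can be pictured roughly as the sketch below. It is not the notebook's actual cell: the helper name `add_to_sys_path` is hypothetical, and the only detail taken from this PR is the `/home/ubuntu/nxdi/` path of the upstream clone.

```python
import os
import sys

# Path taken from the PR description; adjust if the upstream clone lives elsewhere.
NXDI_ROOT = "/home/ubuntu/nxdi/"


def add_to_sys_path(path: str) -> bool:
    """Prepend `path` to sys.path once; return True if it was added.

    Hypothetical helper for illustration, not part of the notebook.
    """
    normalized = os.path.abspath(path)
    if normalized in sys.path:
        return False
    sys.path.insert(0, normalized)
    return True


# Warn early instead of failing later with an opaque ImportError.
if not os.path.isdir(NXDI_ROOT):
    print(f"warning: {NXDI_ROOT} not found; clone the upstream repo first")

add_to_sys_path(NXDI_ROOT)
```

Prepending rather than appending means the upstream clone takes precedence over any older copy (such as nxdi-fork) that may still be importable in the same environment.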

Testing

I ran the notebook on a trn2.3xlarge with SDK 2.28.

By submitting this PR, I confirm that:

  • [x] I have read and followed the contributing guidelines
  • [x] This is a community contribution and may have limited testing compared to officially-supported models
  • [x] The code follows best practices and is well-documented
  • [x] All required components listed above are included
