docs: Add model performance comparison and selection guide #121
Open
hnshah wants to merge 1 commit into KittenML:main from
Adds performance benchmarks and selection guidance to help users choose the right model for their use case.

- Added performance comparison table with RTF, memory usage, and use cases
- Added 'Which Model Should I Use?' section with clear recommendations
- Included performance notes with testing methodology

Tested all 3 models (nano, micro, mini) on Apple M2 Ultra with comprehensive benchmarks measuring load time, generation speed (RTF), and memory usage.

Hardware: Mac Studio M2 Ultra, 24 cores, macOS
Collaborator
yo @hnshah, thanks for making the PR. can you share what text samples you tested this on? also, we just shipped an example for streaming, which should change things, so let me include this example with streaming as well.
Author
@therealron Thanks for the quick response!

Test Samples

Tested with two text types:

- Short/simple (pronunciation test):
- Long-form (realistic use case):

Both samples tested across all 3 models (nano, micro, mini) on Mac Studio M2 Ultra to measure RTF and memory usage.

Streaming Example

Great! Looking forward to seeing the streaming example. Should I wait for that before updating the PR, or would you like me to add any additional benchmarks in the meantime? Happy to help test or document the streaming approach once it's ready!
Summary
Adds performance benchmarks and selection guidance to help users choose the right model for their use case.
Changes
Testing
Comprehensively tested all 3 models (nano, micro, mini) on Apple M2 Ultra:
Hardware: Mac Studio M2 Ultra (24 cores), macOS
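For readers unfamiliar with the RTF metric, here is a minimal sketch of how a real-time factor is commonly computed: wall-clock synthesis time divided by the duration of the audio produced, so values below 1 mean faster than real time. The PR does not state its exact formula or benchmark script, so this is an illustration only. The names `real_time_factor` and `fake_synthesize` and the 24 kHz sample rate are assumptions, not part of this PR or of any library's API; in practice, substitute the actual model call and the model's real output sample rate.

```python
import time
from typing import Callable, Sequence

# Assumed output sample rate for illustration; use the model's actual rate.
SAMPLE_RATE_HZ = 24_000


def real_time_factor(
    synthesize: Callable[[str], Sequence[float]],
    text: str,
    sample_rate_hz: int = SAMPLE_RATE_HZ,
) -> float:
    """Return synthesis time divided by generated audio duration."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed_s = time.perf_counter() - start

    audio_s = len(samples) / sample_rate_hz
    if audio_s == 0:
        raise ValueError("synthesize returned no audio")
    return elapsed_s / audio_s


def fake_synthesize(text: str) -> Sequence[float]:
    """Hypothetical stand-in for a TTS call: 0.05 s of silence per character."""
    return [0.0] * (len(text) * SAMPLE_RATE_HZ // 20)


if __name__ == "__main__":
    rtf = real_time_factor(fake_synthesize, "Hello from the benchmark sketch.")
    print(f"RTF: {rtf:.6f}")
```

Timing a single call is noisy; averaging over several runs per text sample, and excluding model load time from the measured interval, would likely give more stable numbers.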
Rationale
The README shows model sizes and parameters but doesn't help users understand the performance trade-offs or which model to choose. This is the #1 question users have when getting started.
This addition provides:
Results
All measurements are reproducible and based on real-world testing.