New Wasm workloads: Transformers.js ML sentiment analysis and Speech-to-Text #148
✅ Deploy Preview for webkit-jetstream-preview ready!
|
I am getting the following error when running the deploy preview in Firefox: |
My bad, the dynamic import of the ONNX runtime didn't work with the blob/preloading in the browser. Fixed. |
|
Seems like both Bert and Whisper spend a majority of the time in whatever function index The fact that Whisper does other things (and IIUC, is more popular) makes it somewhat more interesting. On the other hand, it seems like the dominant function in either case is the same, and running faster is a significant benefit. |
|
I didn't look at the Wasm in detail yet, but I think your assumption about function I'll take a more in-depth look tomorrow, i.e., overlap / difference between the Bert and Whisper tasks, and what Whisper does besides the computational kernel. |
|
Regarding the profiles of the Whisper vs. Bert task, it does seem to me that Whisper is substantially "flatter" or more diverse, and subsumes Bert. See: So in that light, I'd rather search for ways to make Whisper quicker (although the audio snippet is already pretty short and the number of iterations is low) and keep that. In either case, how about merging this, since we are generally happy with having this style of workload? We can still disable and tune in follow-up PRs. |
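One way to make the "flatter" and "subsumes" comparison concrete is to compare the share of samples in the hottest function, and to check whether one workload's set of hot functions is contained in the other's. A minimal Python sketch, assuming per-function sample counts exported from a profiler; the function names, the counts, and the 5% threshold are hypothetical placeholders, not measurements from this PR:

```python
def top_share(samples: dict[str, int]) -> float:
    """Fraction of all samples attributed to the single hottest function."""
    total = sum(samples.values())
    return max(samples.values()) / total if total else 0.0


def hot_functions(samples: dict[str, int], threshold: float = 0.05) -> set[str]:
    """Names of functions that account for at least `threshold` of all samples."""
    total = sum(samples.values())
    if total == 0:
        return set()
    return {name for name, count in samples.items() if count / total >= threshold}


# Hypothetical sample counts for illustration only (NOT taken from the real profiles).
bert = {"kernel": 90, "tokenize": 10}
whisper = {"kernel": 50, "tokenize": 20, "decode": 20, "resample": 10}

# A lower top share means a flatter profile.
print(f"Bert top share:    {top_share(bert):.2f}")
print(f"Whisper top share: {top_share(whisper):.2f}")

# "Subsumes": every hot function in Bert is also hot in Whisper.
print("Whisper subsumes Bert:", hot_functions(bert) <= hot_functions(whisper))
```

With real profiles, the same two numbers (top share and the subset check) would give a quick argument for keeping only one of the two workloads.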
|
Interestingly, on my M2 Pro Mac I see way more time in 7460 than you're reporting (at least in V8 cli, FF browser, Safari/jsc cli).
Sounds good to me |
|
Thanks, merging! Also, interesting difference in the profiles: maybe our x64 backend got more love than arm64, so we should probably look into this in the coming weeks/months. |

These could be replacements for the tfjs workloads (even though the model file size issue remains). Run via transformersjs-bert-wasm and transformersjs-whisper-wasm.
TODOs: Evaluate startup/model loading performance, take CPU profile, decide whether Whisper task is too long-running, compress model files on-disk for repo size, modify NPM server to handle those?