Some models accept tensors whose items are `bytes`. To add these to the `tensor-type` enum, we need to figure out how to represent these `bytes` items as tensor data, which is currently a `u8` array:
```wit
type tensor-data = list<u8>
```

(`wasi-nn/wit/wasi-nn.wit`, line 44 at 747d8df)
Imagine the situation where a model's input is a `1x10` tensor of `bytes`; this means 10 byte arrays need to be stored in the tensor data section. Unfortunately, these byte arrays could all be of different sizes; how should the specification handle this? Some options:
- perhaps this kind of input should be rejected, since it is unlikely to be needed; i.e., only `1x1` tensors (or something of that nature) are possible with `bytes`
- perhaps this kind of input is limited: the byte arrays must all be the same size `N` and the tensor expressed as, e.g., `1x10xN`
- perhaps we could encode each byte array's size directly into the tensor data
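To make the third option concrete, here is a sketch (in Rust; the length-prefix encoding and the function names are hypothetical, not part of the spec) of how variable-sized byte arrays could share one flat `list<u8>` buffer:

```rust
// Hypothetical encoding for a tensor of `bytes` items: each item is
// preceded by its length as a little-endian u32, so items of different
// sizes can be packed into a single flat `list<u8>` buffer.

fn encode_bytes_items(items: &[&[u8]]) -> Vec<u8> {
    let mut data = Vec::new();
    for item in items {
        data.extend_from_slice(&(item.len() as u32).to_le_bytes());
        data.extend_from_slice(item);
    }
    data
}

fn decode_bytes_items(mut data: &[u8]) -> Vec<Vec<u8>> {
    let mut items = Vec::new();
    while data.len() >= 4 {
        let len = u32::from_le_bytes(data[..4].try_into().unwrap()) as usize;
        items.push(data[4..4 + len].to_vec());
        data = &data[4 + len..];
    }
    items
}

fn main() {
    // A 1x3 tensor of `bytes` items with different sizes.
    let items: Vec<&[u8]> = vec![b"hi", b"hello", b""];
    let data = encode_bytes_items(&items);
    assert_eq!(
        decode_bytes_items(&data),
        vec![b"hi".to_vec(), b"hello".to_vec(), vec![]]
    );
    println!("round-trip ok");
}
```

The trade-off is that the lengths live inside the data rather than in the type system, so hosts and guests must agree on this convention out of band.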
There might be other options; let's discuss them in this issue. @geekbeast has floated the idea that tensor data should be represented as a `list<list<u8>>`: this way we can use the WIT/WITX type system to encode the length of each `bytes` array. This has some problems: (1) what about tensors with more dimensions? We don't know how many `list<...>` wrappers we need. (2) This representation doesn't fit the other tensor types well: e.g., we don't need to know that an `f32` is a 4-byte `list<u8>`. (3) Coercing tensors into a specific WIT/WITX type could involve some copying; ideally we just want to be able to pass some pre-existing bytes (e.g., from a decoded image) as tensor data without additional overhead.
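For contrast, point (3) is cheap with the current flat representation: an existing buffer of fixed-size items can be viewed as tensor bytes without copying or building nested lists. A Rust sketch (the function name is illustrative, not part of any binding):

```rust
// View a slice of f32 values as raw tensor bytes without copying.
// Safe to read because f32 has no padding and every byte pattern of
// the source is a valid u8.
fn f32s_as_bytes(values: &[f32]) -> &[u8] {
    unsafe {
        std::slice::from_raw_parts(
            values.as_ptr() as *const u8,
            values.len() * std::mem::size_of::<f32>(),
        )
    }
}

fn main() {
    let pixels: [f32; 2] = [1.0, 0.5];
    let data = f32s_as_bytes(&pixels);
    assert_eq!(data.len(), 8);
    // The first item's bytes match the f32's native-endian encoding.
    assert_eq!(&data[..4], &1.0f32.to_ne_bytes());
    println!("viewed {} f32 values as {} bytes", pixels.len(), data.len());
}
```

A `list<list<u8>>` representation would force this kind of data through an extra per-item allocation and copy step.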
Your feedback on figuring this out is appreciated!