Kim.readMetadata(Source) seems to return null always #131

@hakanai

Whether I get my raw source from a Ktor HttpResponse or from a kotlinx-io FileSystem, buffering that source and passing it to Kim.readMetadata always returns null.

Demo program (you'll have to add your own URLs, as the ones I had were local to my environment):

import com.ashampoo.kim.Kim
import com.ashampoo.kim.ktor.readMetadata
import io.ktor.client.HttpClient
import io.ktor.client.engine.cio.CIO
import io.ktor.client.request.get
import io.ktor.client.statement.bodyAsChannel
import io.ktor.utils.io.asSource
import kotlinx.coroutines.runBlocking
import kotlinx.io.RawSource
import kotlinx.io.buffered
import kotlinx.io.files.SystemFileSystem

private val httpClient = HttpClient(CIO) {
    expectSuccess = true
}

private suspend fun sourceForHttpUrl(url: String): RawSource {
    val httpResponse = httpClient.get(url)
    val content = httpResponse.bodyAsChannel()
    return content.asSource()
}

private fun sourceForFileUrl(url: String): RawSource {
    // XXX: Technically doesn't handle file URLs with a host inside. Handle this rare case?
    val filePath = kotlinx.io.files.Path(url.removePrefix("file:///"))
    return SystemFileSystem.source(path = filePath)
}

private suspend fun sourceForUrl(url: String): RawSource {
    val colon = url.indexOf(':')
    require(colon >= 0) { "Invalid URL: $url" }

    val scheme = url.substring(0, colon)
    return when (scheme) {
        "http", "https" -> sourceForHttpUrl(url)
        "file" -> sourceForFileUrl(url)
        else -> throw IllegalArgumentException("Unsupported URL scheme: $scheme for URL: $url")
    }
}

fun main() {
    val urls = listOf(
        "https://example.com/.../file.jpg",
        "file:///C:/Users/.../file.jpg",
    )

    runBlocking {
        urls.forEach { url ->
            println("---- url: $url ----")
            sourceForUrl(url).use { source ->
                println("source is a: ${source::class}")

                val metadata = Kim.readMetadata(source.buffered())
                println("metadata = $metadata")
            }
        }
    }
}

The culprit appears to be KtorInputByteReader.contentLength: source.remaining always returns 0 at the time it is called. Ktor implements remaining by returning buffer.size, which is the number of bytes already pulled into the buffer rather than the total length of the stream, so it doesn't seem like it would ever return the value you'd want here. Instead, you might just have to pass the size in from the outside. Or try to avoid needing it?
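
The buffering behaviour can be shown without Kim or an HTTP request. This is a minimal sketch using only kotlinx-io. bufferedSizes is a hypothetical helper name, and it reads the internal buffer property directly (standing in for what remaining is described as doing), which requires opting in to InternalIoApi.

```kotlin
import kotlinx.io.Buffer
import kotlinx.io.InternalIoApi
import kotlinx.io.RawSource
import kotlinx.io.buffered

// Hypothetical helper: reports how many bytes sit in the buffer of a freshly
// buffered source, before and after the first read request.
@OptIn(InternalIoApi::class)
fun bufferedSizes(payloadSize: Int): Pair<Long, Long> {
    // An in-memory Buffer stands in for the file or HTTP body.
    // Typing it as RawSource selects the RawSource.buffered() overload.
    val upstream: RawSource = Buffer().apply { write(ByteArray(payloadSize)) }
    val source = upstream.buffered()

    val before = source.buffer.size   // nothing has been pulled from upstream yet
    source.request(1)                 // forces at least one byte to be buffered
    val after = source.buffer.size    // now reflects what has been read so far

    return before to after
}

fun main() {
    val (before, after) = bufferedSizes(10)
    println("buffered before any read: $before")
    println("buffered after request(1): $after")
}
```

With a 10-byte payload, the first value should be 0 and the second should be 10, which matches the observation that a size derived from the buffer is 0 when it is queried before anything has been read.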

Workarounds:

  1. Both Kim.readMetadata(ByteReadChannel, Long) and Kim.readMetadata(Path) work fine. The only reason I was looking at the overload taking Source was that I was trying to decouple my data source code from my metadata reading code.
  2. Implementing ByteReader directly, or even using KtorByteReadChannelByteReader and the other implementations directly, probably works fine, since the readMetadata overloads built on them do. This looks like it might be the clean approach I'm looking for.
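
A further option that keeps the data source code decoupled from the metadata reading code is to drain the source into memory first, so that the length is known before parsing starts. This is only a sketch and has not been tested against this bug: readFully is a hypothetical helper, the parser is passed in as a lambda so the block depends only on kotlinx-io, and the whole file ends up in memory, which may be unacceptable for large images.

```kotlin
import kotlinx.io.Buffer
import kotlinx.io.RawSource
import kotlinx.io.buffered
import kotlinx.io.readByteArray

// Hypothetical helper: reads the entire source into a ByteArray, closes the
// source, and hands the bytes to the supplied parser.
fun <T> readFully(rawSource: RawSource, parse: (ByteArray) -> T): T =
    rawSource.buffered().use { source ->
        parse(source.readByteArray())
    }

fun main() {
    // In-memory stand-in for a file or HTTP body.
    val fake: RawSource = Buffer().apply { write(ByteArray(5)) }

    // With Kim on the classpath, the lambda could instead be
    // { bytes -> Kim.readMetadata(bytes) }, assuming the ByteArray overload.
    val size = readFully(fake) { bytes -> bytes.size }
    println("read $size bytes")
}
```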
