Describe the bug
Certain videos seem much more likely to fail on download. The root cause seems to be that the "metadata" endpoint returns a different download URL for these videos. Requests to these /kmoat/mps/logo/ URLs end with 403 errors. They seem to require tokens which we are not receiving via the metadata endpoint.
I have tried changing how we use sessions, as well as passing all cookies forward from the metadata request to the video request (it seems we only receive the ttwid cookie, while the video endpoint expects a tt_chain_token). I will note that if I make the requests manually, I do not run into this issue, so it seems likely that our automated requests are being flagged as unusual behavior. It is interesting that the failure seems to be more common for certain users/videos, as if there are additional DRM settings for certain content (interesting as an object of study if we can consistently find it).
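To check whether the failures really cluster on this URL variant, each download attempt could be tagged before it is logged. The following is only a minimal sketch: the helper names (`is_logo_url`, `missing_cookies`, `summarize_attempt`) are hypothetical, and treating tt_chain_token as the expected cookie is an observation from this report, not a documented requirement.

```python
from urllib.parse import urlsplit

# Path fragment seen on the failing download URLs in this report.
LOGO_PATH_FRAGMENT = "/kmoat/mps/logo/"

# Cookie the video endpoint appears to expect (assumption based on observation).
EXPECTED_COOKIES = frozenset({"tt_chain_token"})


def is_logo_url(download_url: str) -> bool:
    """Return True when the URL path contains the fragment that fails for us."""
    return LOGO_PATH_FRAGMENT in urlsplit(download_url).path


def missing_cookies(received_names, expected=EXPECTED_COOKIES) -> set:
    """Return the expected cookie names that were not received."""
    return set(expected) - set(received_names)


def summarize_attempt(download_url: str, cookie_names, status_code: int) -> dict:
    """Build a small record that can be logged alongside each download attempt."""
    return {
        "logo_url": is_logo_url(download_url),
        "missing_cookies": sorted(missing_cookies(cookie_names)),
        "blocked": status_code == 403,
    }
```

Logging a record like this per attempt would make it possible to count how often 403 responses coincide with the /kmoat/mps/logo/ variant and with the missing cookie, per user or per video.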
Comparison to ytdlp
ytdlp is able to download the videos. They appear to be doing a few things differently. Primarily, they use curl-cffi to better impersonate a browser. They also have a _solve_challenge_and_set_cookies function, though I am unsure whether that is related to our issue. They also use different "metadata" endpoints. One looks like an actual API, but there is also a webpage endpoint, and the code seems to note that some content is web-only (since we are collecting via the web, this seems important).
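A quick way to see whether browser impersonation matters would be to request the same failing URL with our current client and with curl_cffi, then compare status codes. This is a hedged sketch: `fetch_status_impersonated` and `compare_clients` are hypothetical helper names, the `impersonate="chrome"` alias assumes a reasonably recent curl_cffi release, and nothing here reproduces what ytdlp does internally.

```python
from typing import Callable, Dict, Optional


def fetch_status_impersonated(
    url: str,
    cookies: Optional[dict] = None,
    impersonate: str = "chrome",
) -> int:
    """Fetch a URL with curl_cffi's browser impersonation and return the status code."""
    # Imported lazily so the comparison helper below works without curl_cffi installed.
    from curl_cffi import requests as curl_requests  # third-party: pip install curl_cffi

    response = curl_requests.get(
        url, cookies=cookies, impersonate=impersonate, timeout=30
    )
    return response.status_code


def compare_clients(url: str, fetchers: Dict[str, Callable[[str], int]]) -> Dict[str, int]:
    """Call each named fetcher on the same URL and collect the status codes."""
    return {name: fetch(url) for name, fetch in fetchers.items()}
```

For example, `compare_clients(url, {"current": our_fetch, "curl_cffi": fetch_status_impersonated})`, where `our_fetch` stands in for whatever function wraps our existing client, would show side by side whether only the non-impersonated request gets a 403.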
Possible solutions
- Use ytdlp ourselves. We do collect metadata, but we could presumably still collect the metadata ourselves and then hand the download to ytdlp. The downside is that ytdlp is not innately async (I tested the async version some time ago but ran into a problem I no longer remember, possibly dependency issues).
- We could also test curl_cffi and see if that helps. That could be a useful tool in general if successful.
- Test out other TikTok endpoints. Maybe they are more consistent or otherwise do not
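For the first option, one way around ytdlp not being async would be to run its blocking download in worker threads from our existing event loop. This is a rough sketch under assumptions: it uses the yt_dlp Python package, the function names (`build_ydl_opts`, `download_blocking`, `download_many`) and the concurrency limit are hypothetical, and it has not been tested against the failing videos.

```python
import asyncio
from pathlib import Path


def build_ydl_opts(out_dir: str) -> dict:
    """Options passed to yt_dlp.YoutubeDL; the output template is an assumption."""
    return {
        "outtmpl": str(Path(out_dir) / "%(id)s.%(ext)s"),
        "quiet": True,
        "noplaylist": True,
    }


def download_blocking(video_url: str, out_dir: str) -> int:
    """Blocking download of one video; returns yt-dlp's return code."""
    # Imported lazily so the rest of the module loads without yt_dlp installed.
    from yt_dlp import YoutubeDL  # third-party: pip install yt-dlp

    with YoutubeDL(build_ydl_opts(out_dir)) as ydl:
        return ydl.download([video_url])


async def download_many(video_urls, out_dir, max_concurrent=4, downloader=download_blocking):
    """Run blocking downloads in threads, with a cap on concurrent downloads."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def download_one(url):
        async with semaphore:
            # asyncio.to_thread (Python 3.9+) keeps the event loop free while yt-dlp blocks.
            return await asyncio.to_thread(downloader, url, out_dir)

    # return_exceptions=True so one failed video does not cancel the others.
    return await asyncio.gather(
        *(download_one(url) for url in video_urls), return_exceptions=True
    )
```

Metadata collection would stay as it is today; only the final download step would be handed to `download_many`. The `downloader` parameter is there so the threading logic can be exercised with a stub before involving ytdlp at all.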