Conversation
Let's take stock of the quality of search here.

Path Search Quality

adam has a file here called tech-debt, and what's most intuitive for him is to use a space to quickly delimit his search. Old search cannot fathom this behavior, but new search uses nucleo, the fuzzy finder that powers helix (a very telescope-like experience, the sort of experience multiple users have asked us to take inspiration from). And nucleo has lots of rules, coming from a path-centric search environment, that work really nicely.
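To illustrate the kind of behavior this style of matching enables, here's a toy subsequence matcher. This is not nucleo's actual algorithm or API; the function names and bonus weights are made up for this sketch. The two ideas it shows: each space-separated query term matches independently, and characters matched at the start of a path segment score higher.

```rust
// Toy sketch only -- not nucleo. Illustrates space-delimited terms and a
// bonus for matches that land on path-segment boundaries.

fn fuzzy_score(term: &str, path: &str) -> Option<u32> {
    // subsequence match: every char of `term` must appear in order in `path`
    let mut score = 0;
    let mut chars = path.char_indices();
    for tc in term.chars() {
        let (i, _) = chars.by_ref().find(|(_, c)| c.eq_ignore_ascii_case(&tc))?;
        // bonus when the match lands at the start or after a separator
        // (weights are arbitrary here)
        let boundary =
            i == 0 || matches!(path.as_bytes()[i - 1], b'/' | b'-' | b'_' | b' ' | b'.');
        score += if boundary { 10 } else { 1 };
    }
    Some(score)
}

// every space-separated term must match somewhere; total score is the sum
fn score_query(query: &str, path: &str) -> Option<u32> {
    query.split_whitespace().map(|t| fuzzy_score(t, path)).sum()
}
```

Under this scheme a query like "tech debt" matches a hypothetical notes/tech-debt.md (each term matches, with boundary bonuses), while a path containing neither term is filtered out entirely.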
Old path search would also take a lot of characters to match documents, even though I was typing the name of the file (a very common workflow). I added bonuses for this sort of situation, but the code is just too naively implemented to work well in practice. Without the help of suggested docs I have to type to get

But new search doesn't even need the help of suggested docs to behave the way I want:
I'll leave one more example of a flow I do very often that never behaves exactly the way I expect:

Content Search Quality

There are two fundamental issues with our use of tantivy; arguably one of these is a skill issue. Let's start with the one that's not a skill issue: tantivy doesn't support substring search, which means you don't get search results until you've finished typing the word you're looking for. So luca gets practically no search results until he's done forming a word:
New content search is a handrolled implementation that behaves closer to other popular note-taking apps (deprecates tantivy):

It performs a substring search across all text documents. The query is split by space into search terms. Each term must be matched at least once in the document for it to appear in the list, and any documents that have exact matches are grouped above documents that have only partial matches.
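The scheme above can be sketched roughly like this (hypothetical function names; the real ranking surely has more nuance than a single exact/partial flag):

```rust
// Sketch of the described content search: split the query on whitespace,
// require every term to appear as a substring (case-insensitive), and rank
// documents with whole-word ("exact") matches above partial-only ones.

// returns Some(all_terms_exact) if every term matches, None otherwise
fn check(query: &str, doc: &str) -> Option<bool> {
    let hay = doc.to_lowercase();
    let mut all_exact = true;
    for term in query.split_whitespace() {
        let term = term.to_lowercase();
        if !hay.contains(&term) {
            return None; // a term with no match disqualifies the doc
        }
        if !hay.split_whitespace().any(|w| w == term) {
            all_exact = false; // matched, but only inside a longer word
        }
    }
    Some(all_exact)
}

fn search<'a>(query: &str, docs: &[&'a str]) -> Vec<&'a str> {
    let mut hits: Vec<(bool, &'a str)> = docs
        .iter()
        .filter_map(|d| check(query, d).map(|exact| (exact, *d)))
        .collect();
    hits.sort_by_key(|&(exact, _)| !exact); // exact-match docs first
    hits.into_iter().map(|(_, d)| d).collect()
}
```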
There are a lot of tiny nuances caused by this. The main tradeoff is speed: tantivy executes searches very quickly. But my handrolled, unoptimized, more flexible implementation is executing these queries in less than 5ms. I could probably make it much faster if needed: the workload is embarrassingly parallel, I could process the inputs incrementally, and if that's still not enough there are some interesting algorithms in this space. But I'm not too concerned about that so far. The other category of badness is sort of a double-edged sword:
I'm getting no results here because of tantivy's support for advanced query constructs.
We could wrap each search term in 

Another request is for people who have long documents: they don't just want the search result to find "the doc". Raayan has a wine.md, and search returns this single doc as the result; he wants it to show all the results (like telescope does) and be able to jump to them. Presumably #4426 gets him part of the way there, but imagine a user with a few large docs: they still want content results that are more usable for search terms that occur frequently inside a document. So as you can see in these screenshots we're aware of all the matches, and we'll show them in a sort of tree hierarchy. Stay tuned for the UI beautification I allude to below.

Lastly, this search is structured in a way to have other searchable modules alongside it, potentially sharing infrastructure like the in-memory cache of documents. There's room here specifically for in-memory semantic search, a command palette, and other ideas.

You may notice these things are a little ugly right now; I spent most of my effort so far on data quality, and I'm personally satisfied. Feel free to comment on this PR with other situations you'd like me to check. Now I'm going to move on to a beautification stage. The UI is ready for contribution from @tvanderstad to embed the editor, and I'm going to make the query and results panel much prettier at the same time.

On mobile there are two paths we could take. We could allow the input to be native and present the search results in egui, or I could firm up the data contract and do it all natively. The search performance is quite good and I'm tempted to backport this to lb-rs now, but I think fixing the rest of the UI will give me more information regardless.
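Showing every match inside a large doc (rather than just "the doc") means the searcher has to collect all occurrence positions, not stop at the first. A hedged sketch of what that collection could look like (hypothetical helper; columns are byte-based, so non-ASCII text would need more care):

```rust
// Collect every (line, column) where `term` occurs in `content`,
// case-insensitively, so a UI can group hits in a tree under their
// document and jump to each one. Positions are 1-based; columns are
// byte offsets, which only line up with characters for ASCII text.
fn occurrences(term: &str, content: &str) -> Vec<(usize, usize)> {
    let needle = term.to_lowercase();
    let mut hits = Vec::new();
    if needle.is_empty() {
        return hits;
    }
    for (line_no, line) in content.lines().enumerate() {
        let hay = line.to_lowercase();
        let mut from = 0;
        // walk the line, recording every non-overlapping occurrence
        while let Some(pos) = hay[from..].find(&needle) {
            hits.push((line_no + 1, from + pos + 1));
            from += pos + needle.len();
        }
    }
    hits
}
```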
@ad-tra has convinced me that the search implementation should in fact be native, so I'm going to move this over into lb-rs and work on a generalizable contract. I think macOS will have the same egui-like experience, but iOS will get some special attention (from me).
note to self: handle the folder activation situation properly
problem:

solution:
make search not mid