ws: new search #4312

Draft
Parth wants to merge 35 commits into master from new-search

Parth commented Mar 27, 2026

problem:
image

solution:
make search not mid

Parth mentioned this pull request Apr 3, 2026
12 tasks

Parth commented Apr 11, 2026

Let's take stock of the quality of search here:

Path Search Quality

adam has a file here called tech-debt

image

and what's most intuitive for him is to use a space to quickly delimit his search; old search cannot fathom this behavior:

image

but new search uses nucleo, the fuzzy finder that powers helix (a very telescope-like experience, the sort of experience multiple users have asked us to take inspiration from). And nucleo has lots of rules coming from a path-centric search environment that work really nicely:

image
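
To make the space-delimited behavior concrete, here is a minimal std-only sketch of the idea. It is not nucleo's API or its scoring; `is_subsequence` and `path_matches` are hypothetical names, and the assumption is simply that each whitespace-separated piece of the query must fuzzy-match the path on its own:

```rust
/// Returns true if every character of `needle` appears in `haystack`
/// in the same order, ignoring case. This is the simplest form of
/// fuzzy matching; real matchers also score and rank candidates.
fn is_subsequence(needle: &str, haystack: &str) -> bool {
    let mut hay = haystack.chars().flat_map(char::to_lowercase);
    needle
        .chars()
        .flat_map(char::to_lowercase)
        // `any` advances `hay` past the matched character, so later
        // characters of `needle` can only match further to the right.
        .all(|n| hay.any(|h| h == n))
}

/// Splits the query on whitespace and requires every piece to
/// fuzzy-match the path independently (hypothetical helper).
fn path_matches(query: &str, path: &str) -> bool {
    query
        .split_whitespace()
        .all(|atom| is_subsequence(atom, path))
}
```

With this sketch, `path_matches("tech debt", "adam/tech-debt.md")` is true because `tech` and `debt` are each found in the path, while a query containing a piece that cannot be found (say `tech xyz`) is rejected.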

old path search would also take a lot of characters to match documents, even when I was typing the name of the file (a very common workflow). I added bonuses for this sort of situation, but the code is just too naively implemented to work well in practice. Without the help of suggested docs, this is what I have to type to get huddle.md:

image

but new search doesn't even need the help of suggested docs to behave the way I want:

image

I'll leave one more example of a flow I do very often that never behaves exactly the way I expect:

before:
image
after:
image

Content Search Quality

There are two fundamental issues with our use of tantivy; arguably one of these is a skill issue. Let's start with the one that's not a skill issue:

Tantivy doesn't support substring search, which means you don't get search results until you've finished typing the word you're looking for. So luca gets practically no search results until you're done forming a word:

image

new content search is a handrolled implementation that behaves more like other popular note-taking apps (and deprecates tantivy):

image

it performs a substring search across all text documents. The query is split by space into search terms. Each term must be matched at least once in the document for it to appear in the list, and any documents that have exact matches are grouped above documents that have partial matches.

image
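
A minimal sketch of those rules, not the code in this PR. The names `MatchKind`, `classify` and `search` are hypothetical, matching is assumed to be case-insensitive, and "exact" is assumed to mean that every term appears as a whole word while "partial" means at least one term only appears inside a longer word:

```rust
/// Assumption: `Exact` means every term occurs as a whole word,
/// `Partial` means at least one term only occurs inside a longer word.
/// Declared in this order so that `Exact` sorts first.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum MatchKind {
    Exact,
    Partial,
}

/// Returns `None` unless every whitespace-separated term of `query`
/// occurs at least once in `content` (case-insensitive substring match).
fn classify(query: &str, content: &str) -> Option<MatchKind> {
    let terms: Vec<String> = query.split_whitespace().map(str::to_lowercase).collect();
    if terms.is_empty() {
        return None;
    }
    let content = content.to_lowercase();
    let words: Vec<&str> = content
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .collect();

    let mut kind = MatchKind::Exact;
    for term in &terms {
        if !content.contains(term.as_str()) {
            return None; // one missing term excludes the document
        }
        if !words.iter().any(|w| *w == term.as_str()) {
            kind = MatchKind::Partial;
        }
    }
    Some(kind)
}

/// Searches `(name, content)` pairs and groups exact matches above
/// partial ones. The sort is stable, so input order is kept per group.
fn search<'a>(query: &str, docs: &[(&'a str, &'a str)]) -> Vec<(&'a str, MatchKind)> {
    let mut hits: Vec<(&'a str, MatchKind)> = docs
        .iter()
        .filter_map(|(name, content)| classify(query, content).map(|kind| (*name, kind)))
        .collect();
    hits.sort_by_key(|(_, kind)| *kind);
    hits
}
```

Because the match is a plain substring test, a half-typed query such as `sear` already returns documents containing `search`, which is the behavior the word-based index could not provide.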

There are a lot of tiny nuisances caused by this; the main tradeoff is that the search executes very quickly. But my handrolled, unoptimized, more flexible implementation is executing these queries in less than 5ms. I could probably make this much faster as well if needed: the workload is embarrassingly parallel, and I could also process the inputs incrementally. Finally, if that's not enough, there are some interesting algorithms in this space. But I'm not too concerned about that so far.
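
As an illustration of why the workload parallelizes so easily, here is a hedged sketch using only the standard library. `parallel_matches` and its `workers` parameter are hypothetical and not part of this PR; each document is checked independently, so the document list can be split into chunks and scanned on separate threads:

```rust
use std::thread;

/// Scans `(name, content)` pairs for a case-insensitive substring,
/// splitting the documents across up to `workers` scoped threads.
/// Results come back in the original document order.
fn parallel_matches<'a>(
    term: &str,
    docs: &[(&'a str, &'a str)],
    workers: usize,
) -> Vec<&'a str> {
    let term = term.to_lowercase();
    let workers = workers.max(1);
    // Ceiling division; never zero, because `chunks(0)` would panic.
    let chunk_len = ((docs.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = docs
            .chunks(chunk_len)
            .map(|chunk| {
                let term = &term;
                s.spawn(move || {
                    chunk
                        .iter()
                        .filter(|(_, content)| content.to_lowercase().contains(term.as_str()))
                        .map(|(name, _)| *name)
                        .collect::<Vec<&'a str>>()
                })
            })
            .collect();

        handles
            .into_iter()
            .flat_map(|h| h.join().expect("search worker panicked"))
            .collect()
    })
}
```

No locking is needed because the workers only read shared data and each one returns its own result vector.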

The other category of badness is sorta a double edged sword:

image

I'm getting no results here because of tantivy's support of advanced query constructs.

image image

We could wrap each search term in "", but then we'd be giving up access to tantivy's advanced search (#3548). Overall it feels like there's a vibe mismatch between our goals and tantivy's. If I think about what I personally want out of an advanced search experience, it's the ability to use grep expressions, and these could be handled directly by the current content search implementation as a flag, similar to what travis is showing with find in doc (#4426).

Another request comes from people who have long documents: they don't just want the search result to find "the doc". Raayan has a wine.md and search returns this single doc as the result; he wants it to show all the results (like telescope does) and be able to jump to them. Presumably #4426 gets him part of the way there, but imagine a user with a few large docs: they still want content results that are more usable for search terms that occur frequently inside a document. So as you can see in these screenshots, we're aware of all the matches, and we'll show them in a sorta tree hierarchy. Stay tuned for the UI beautification I allude to below.
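
A minimal sketch of collecting every occurrence inside one document, so results can be listed under the doc and jumped to. `LineMatch` and `find_all` are hypothetical names, and the line and column convention is an assumption rather than what this PR does:

```rust
/// One occurrence of the search term inside a document.
/// `line` is 1-based; `column` is a 0-based byte offset into the
/// lowercased line (which can differ from the original line for
/// characters whose lowercase form has a different byte length).
#[derive(Debug, PartialEq, Eq)]
struct LineMatch {
    line: usize,
    column: usize,
}

/// Finds every non-overlapping, case-insensitive occurrence of `term`
/// in `content`, in document order.
fn find_all(term: &str, content: &str) -> Vec<LineMatch> {
    if term.is_empty() {
        return Vec::new();
    }
    let term = term.to_lowercase();
    let mut matches = Vec::new();
    for (index, line) in content.lines().enumerate() {
        let lower = line.to_lowercase();
        for (column, _) in lower.match_indices(term.as_str()) {
            matches.push(LineMatch { line: index + 1, column });
        }
    }
    matches
}
```

Grouping the returned matches under their document gives the tree shape described above: one parent row per doc, one child row per occurrence.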

Lastly, this search is structured so that other searchable modules can sit alongside it, potentially sharing infrastructure like the in-memory cache of documents. There's room here specifically for in-memory semantic search, a command palette, and other ideas.

You may notice these things are a little ugly right now. I spent most of my effort so far on data quality, and I'm personally satisfied. Feel free to comment on this PR with other situations you'd like me to check.

Now I'm going to move onto a beautification stage. The UI is ready for contribution from @tvanderstad to embed the editor, I'm going to make the query and results panel much prettier at the same time.

On mobile there are two paths we could take: we could allow the input to be native and present the search results in egui, or I could firm up the data contract and do it all natively. The search performance is quite good and I'm tempted to backport this to lb-rs now, but I think fixing the rest of the UI will give me more information regardless.

Parth commented Apr 14, 2026

image

featuring bolded sections, now going to have a "selection"

Parth commented Apr 14, 2026

better icons
image

Parth commented Apr 14, 2026

image

I like how this is shaping up

Parth commented Apr 16, 2026

@ad-tra has convinced me that the search implementation should in fact be native, so I'm going to move this over into lb-rs and work on a generalizable contract. I think macOS will have the same egui-like experience, but iOS will get some special attention (from me).

Parth commented Apr 18, 2026

note to self: handle the folder activation situation properly

Parth commented Apr 20, 2026

prob: image
