Dave Rayment 9e4bf1e3e0 [Run] Replace WindowWalker's brute-force fuzzy matching algorithm with optimal DP solution (#44551)

## Summary of the Pull Request
Window Walker's fuzzy string matching algorithm exhibits exponential
memory usage and execution time when given inputs containing repeated
characters or phrases. When a user has several windows open with long
titles (such as browser windows), it is straightforward to trigger a
pathological case which uses up gigabytes of memory and freezes the UI.
This is exacerbated by Run's lack of thread pruning, meaning work
triggered by older keystrokes consumes CPU and memory until completion.

<!-- Please review the items on the PR checklist before submitting-->
## PR Checklist

- [x] Closes: #44546
- [x] Closes: #44184
- [ ] **Communication:** I've discussed this with core contributors
already. If the work hasn't been agreed, this work might be rejected
- [ ] **Tests:** Added/updated and all pass
- [ ] **Localization:** All end-user-facing strings can be localized
- [ ] **Dev docs:** Added/updated
- [ ] **New binaries:** Added on the required places
- [ ] [JSON for
signing](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ESRPSigning_core.json)
for new binaries
- [ ] [WXS for
installer](https://github.com/microsoft/PowerToys/blob/main/installer/PowerToysSetup/Product.wxs)
for new binaries and localization folder
- [ ] [YML for CI
pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ci/templates/build-powertoys-steps.yml)
for new test projects
- [ ] [YML for signed
pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/release.yml)
- [ ] **Documentation updated:** If checked, please file a pull request
on [our docs
repo](https://github.com/MicrosoftDocs/windows-uwp/tree/docs/hub/powertoys)
and link it here: #xxx

## Detailed Description of the Pull Request / Additional comments
The existing algorithm in `FuzzyMatching.cs` is exhaustive: its
`GetAllMatchIndexes()` method builds every possible matching combination
of the search string within the candidate, then selects the best match
and discards the rest. This may be reasonable for small search strings
with few possible matches, but it causes a combinatorial explosion when
characters or substrings repeat and there are many possible matches,
even when the search string is short.

The current brute-force algorithm has a worst-case time complexity of
**O(n * m * C(n,m))**, where **n** is the candidate length, **m** is the
search string length and **C(n,m)** = **n!/(m!(n-m)!)**, and a space
complexity of **O(C(n,m) * m)** because it stores all possible match
combinations before choosing the best.

For example, matching `"eeee"` in `"eeeeeeee"` creates **C(8,4)** =
**70** match combinations, which means storing 70 lists of 4 integers
each, plus overhead from the LINQ-based list copying and appending:

```csharp
var tempList = results
    .Where(x => x.Count == secondIndex && x[x.Count - 1] < firstIndex)
    .Select(x => x.ToList())   // Creates a full copy of each matching path
    .ToList();                 // Materializes all copies

results.AddRange(tempList);    // Adds lists to results
```

Each potential sub-match may be recalculated many times.
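
To get a feel for how quickly the number of combinations grows, the
count can be computed without enumerating them. The following is a
minimal sketch, not code from this PR: the class and method names are
hypothetical, and it assumes exact, case-sensitive character comparison.

```csharp
using System;

public static class MatchCountSketch
{
    // Hypothetical helper: counts how many distinct index combinations
    // spell out `search` as an in-order subsequence of `text`.
    // Assumes exact, case-sensitive comparison. The count is held in a
    // long and can overflow for very long, highly repetitive inputs.
    public static long CountMatchCombinations(string text, string search)
    {
        // ways[j] = number of ways to match the first j characters of
        // `search` using the part of `text` scanned so far.
        var ways = new long[search.Length + 1];
        ways[0] = 1; // the empty prefix matches in exactly one way

        foreach (char c in text)
        {
            // Walk backwards so the current text character extends each
            // partial match at most once.
            for (int j = search.Length; j >= 1; j--)
            {
                if (search[j - 1] == c)
                {
                    ways[j] += ways[j - 1];
                }
            }
        }

        return ways[search.Length];
    }
}
```

`MatchCountSketch.CountMatchCombinations("eeeeeeee", "eeee")` returns
70, matching the **C(8,4)** figure above.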

Window Walker queries across all window titles, so this problem will be
magnified if the search text happens to match multiple titles and/or if
a search string containing a single repeated character is used. For
browser windows, where titles may be long, this is especially
problematic, and similarly for Explorer windows with longer paths.

## Proposed solution
The solution presented here is to use a dynamic programming algorithm
which finds the optimal match directly without generating all
possibilities.

In terms of complexity, the new algorithm makes a single pass through
its DP table and only has to store two integer arrays whose size is
proportional to the product of the search and candidate string lengths,
giving **O(n * m)** for both time and space, i.e. polynomial instead of
exponential.

Scoring is equivalent between the old and new algorithms, based strictly
on the minimum match span within the candidate string.

## Implementation notes
The new algorithm tracks the best start index for matches ending at each
position, eliminating the need to store all possible paths. By storing
the "latest best match so far" as you scan through the search text, you
are guaranteed to minimise the span length. To recreate the best match,
a separate table of parent indexes is kept and iterated backwards once
the DP step is complete. Reversing this provides you with the same
result (or equivalent if there are multiple best matches) as the
original algorithm.

For this "minimum-span" fuzzy matching method, this should be optimal as
it only scans once and storage is proportional to the search and
candidate strings only.
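
The approach described above can be sketched roughly as follows. This is
an illustrative reconstruction rather than the code in this PR: the
class, method and variable names are hypothetical, character comparison
is assumed to be exact and case-sensitive, and ties between equally
short matches are broken in favour of the earliest end position.

```csharp
using System;
using System.Collections.Generic;

public static class MinSpanMatchSketch
{
    // Returns the indexes in `text` of one minimum-span match of
    // `search`, or an empty list when there is no match.
    public static List<int> FindBestMatch(string text, string search)
    {
        int n = text.Length;
        int m = search.Length;
        var result = new List<int>();
        if (m == 0 || m > n)
        {
            return result;
        }

        // start[j, i]: latest start index of a match of search[0..j]
        // whose final character sits at text[i]; -1 if none exists.
        // parent[j, i]: index in text where search[j - 1] was matched.
        var start = new int[m, n];
        var parent = new int[m, n];

        for (int j = 0; j < m; j++)
        {
            int bestStart = -1; // latest start in row j - 1, left of i
            int bestPrev = -1;  // text index that produced bestStart

            for (int i = 0; i < n; i++)
            {
                start[j, i] = -1;
                parent[j, i] = -1;

                if (text[i] == search[j])
                {
                    if (j == 0)
                    {
                        start[j, i] = i;
                    }
                    else if (bestStart >= 0)
                    {
                        start[j, i] = bestStart;
                        parent[j, i] = bestPrev;
                    }
                }

                // Fold position i of the previous row into the running
                // best only now, so predecessors stay strictly left of i.
                if (j > 0 && start[j - 1, i] >= 0 && start[j - 1, i] >= bestStart)
                {
                    bestStart = start[j - 1, i];
                    bestPrev = i;
                }
            }
        }

        // Choose the end position that yields the smallest span.
        int bestEnd = -1;
        int bestSpan = int.MaxValue;
        for (int i = 0; i < n; i++)
        {
            if (start[m - 1, i] >= 0 && i - start[m - 1, i] < bestSpan)
            {
                bestSpan = i - start[m - 1, i];
                bestEnd = i;
            }
        }

        if (bestEnd < 0)
        {
            return result;
        }

        // Follow the parent indexes backwards, then reverse.
        for (int j = m - 1, i = bestEnd; j >= 0; j--)
        {
            result.Add(i);
            i = parent[j, i];
        }

        result.Reverse();
        return result;
    }
}
```

For instance, `MinSpanMatchSketch.FindBestMatch("a_b_ab", "ab")` returns
the indexes 4 and 5 rather than 0 and 2, because the later pair has the
smaller span.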

## Benchmarks
A verification and benchmarking suite is here:
https://github.com/daverayment/WindowWalkerBench

Results from comparing the old and new algorithms are here:
https://docs.google.com/spreadsheets/d/1eXmmnN2eI3774QxXXyx1Dv4SKu78U96q28GYnpHT0_8/edit?usp=sharing

| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
|:----------------|-----------------:|-----------------:|-----------------:|-----------:|----------:|-----------:|-----------:|----------:|-------------:|------------:|
| Old_Normal | 4,034.4 ns | 220.94 ns | 647.98 ns | 1.02 | 0.23 | 1.9760 | - | - | 8.09 KB | 1.00 |
| New_Normal | 804.5 ns | 24.29 ns | 70.47 ns | 0.20 | 0.04 | 0.4339 | - | - | 1.77 KB | 0.22 |
| Old_Repetitive | 7,624.7 ns | 318.06 ns | 912.57 ns | 1.94 | 0.38 | 3.7079 | - | - | 15.16 KB | 1.87 |
| New_Repetitive | 2,714.6 ns | 109.03 ns | 318.03 ns | 0.69 | 0.13 | 1.6403 | - | - | 6.72 KB | 0.83 |
| Old_Explosion | 881,443,209.3 ns | 26,273,980.96 ns | 76,225,588.43 ns | 223,872.87 | 39,357.31 | 50000.0000 | 27000.0000 | 5000.0000 | 351885.11 KB | 43,518.16 |
| New_Explosion | 3,225.4 ns | 111.98 ns | 315.84 ns | 0.82 | 0.15 | 1.7738 | - | - | 7.26 KB | 0.90 |
| Old_Explosion_8 | 460,153,862.6 ns | 18,744,417.95 ns | 54,974,137.06 ns | 116,871.93 | 22,719.87 | 25000.0000 | 14000.0000 | 3000.0000 | 173117.13 KB | 21,409.65 |
| New_Explosion_8 | 2,958.3 ns | 78.16 ns | 230.45 ns | 0.75 | 0.13 | 1.5793 | - | - | 6.46 KB | 0.80 |
| Old_Explosion_7 | 189,069,384.8 ns | 3,774,916.46 ns | 6,202,296.49 ns | 48,020.68 | 7,501.98 | 11000.0000 | 6333.3333 | 2000.0000 | 71603.96 KB | 8,855.37 |
| New_Explosion_7 | 2,667.5 ns | 117.69 ns | 337.68 ns | 0.68 | 0.13 | 1.3924 | - | - | 5.7 KB | 0.70 |
| Old_Explosion_6 | 71,960,114.8 ns | 1,757,017.15 ns | 5,125,301.87 ns | 18,276.75 | 3,083.86 | 4500.0000 | 2666.6667 | 1333.3333 | 25515.96 KB | 3,155.60 |
| New_Explosion_6 | 2,232.5 ns | 72.65 ns | 202.52 ns | 0.57 | 0.10 | 1.1978 | - | - | 4.91 KB | 0.61 |
| Old_Explosion_5 | 9,121,126.4 ns | 180,744.42 ns | 228,583.84 ns | 2,316.62 | 358.55 | 1000.0000 | 968.7500 | 484.3750 | 7630.49 KB | 943.67 |
| New_Explosion_5 | 1,917.3 ns | 48.63 ns | 133.95 ns | 0.49 | 0.08 | 1.0109 | - | - | 4.13 KB | 0.51 |
| Old_Explosion_4 | 2,489,593.2 ns | 82,937.33 ns | 236,624.90 ns | 632.32 | 113.96 | 281.2500 | 148.4375 | 74.2188 | 1729.71 KB | 213.92 |
| New_Explosion_4 | 1,598.3 ns | 51.92 ns | 152.28 ns | 0.41 | 0.07 | 0.8163 | - | - | 3.34 KB | 0.41 |
| Old_Explosion_3 | 202,814.0 ns | 7,684.44 ns | 22,293.96 ns | 51.51 | 9.72 | 72.7539 | 0.2441 | - | 298.13 KB | 36.87 |
| New_Explosion_3 | 1,222.5 ns | 26.07 ns | 76.45 ns | 0.31 | 0.05 | 0.6275 | - | - | 2.57 KB | 0.32 |
| Old_Subsequence | 419,417.7 ns | 8,308.97 ns | 22,178.33 ns | 106.53 | 17.23 | 266.6016 | 0.9766 | - | 1090.05 KB | 134.81 |
| New_Subsequence | 2,501.9 ns | 80.91 ns | 233.43 ns | 0.64 | 0.11 | 1.3542 | - | - | 5.55 KB | 0.69 |

(Where "Old_Explosion" is "e" repeated 9 times. Times are in
nanoseconds; 1 ns is one millionth of a millisecond.)

It is worth noting that each result is for a **single string match**. So
matching "eeeeee" against a 99-character string took about 25 MB of
memory and 72 milliseconds to compute. With the new algorithm, this is
reduced to <5 KB and 0.002 milliseconds. Even for a three-character
repetition, the new algorithm is >150x faster with <1% of the
allocations.
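
As a worked check, the three-character comparison follows from the Mean
and Allocated columns of the `Old_Explosion_3` and `New_Explosion_3`
rows:

```math
\frac{202{,}814.0\ \text{ns}}{1{,}222.5\ \text{ns}} \approx 166
\qquad
\frac{2.57\ \text{KB}}{298.13\ \text{KB}} \approx 0.0086 < 1\%
```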

## Real world example
**Before (results still pending after more than a minute):**
<img width="837" height="336" alt="Image"
src="https://github.com/user-attachments/assets/c4c3ae04-6a47-40b9-a2a4-7a4da169f7d5"
/>

**After (instantaneous results):**
<img width="829" height="444" alt="image"
src="https://github.com/user-attachments/assets/055fc4a6-f34f-4bed-a12c-408b52274de2"
/>

## Validation Steps Performed
The verification tests in the benchmark project pass, with results
identical to the original across a number of test cases, including the
pathological cases identified earlier and edge cases such as
single-character searches.

All unit tests under `Wox.Test`, including all 38 `FuzzyMatcherTest`
entries, still pass.