PowerToys/src/modules/launcher/Plugins/Microsoft.Plugin.WindowWalker/Components/FuzzyMatching.cs
2020-08-17 10:00:56 -07:00
// Copyright (c) Microsoft Corporation
// The Microsoft Corporation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.
// Code forked from Betsegaw Tadele's https://github.com/betsegaw/windowwalker/
Adding FxCop to Microsoft.Plugin.WindowWalker (#6260)

* Adding FxCop to Microsoft.Plugin.WindowWalker
* Delete WindowResult.cs -- Fix for CA1812 WindowResult is an internal class that is apparently never instantiated. If so, remove the code from the assembly. If this class is intended to contain only static members, make it static (Shared in Visual Basic).
* Fix for CA1806 UpdateOpenWindowsList calls EnumWindows but does not use the HRESULT or error code that the method returns. This could lead to unexpected behavior in error conditions or low-resource situations. Use the result in a conditional statement, assign the result to a variable, or pass it as an argument to another method.
* Fix for: CA1066 Type Microsoft.Plugin.WindowWalker.Components.InteropAndHelpers.RECT should implement IEquatable<T> because it overrides Equals
* Fix for: CA1052 Type 'FuzzyMatching' is a static holder type but is neither static nor NotInheritable
* Suppress for CA1069 - These values are defined in https://docs.microsoft.com/en-us/windows/win32/winmsg/extended-window-styles.
  CA1069 The enum member 'WS_EX_LTRREADING' has the same constant value '0' as member 'WS_EX_LEFT'
  CA1069 The enum member 'WS_EX_RIGHTSCROLLBAR' has the same constant value '0' as member 'WS_EX_LEFT'
* Suppress CA1069
  CA1069 The enum member 'SWP_NOREPOSITION' has the same constant value '512' as member 'SWP_NOOWNERZORDER'
  CA1069 The enum member 'SWP_FRAMECHANGED' has the same constant value '32' as member 'SWP_DRAWFRAME'
* Suppress CA1069 for ShowWindow values. See https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-showwindow
  CA1069 The enum member 'ShowMaximized' has the same constant value '3' as member 'Maximize'
* Fix code formatting error
* Fix for CA2235: Making POINT serializable
  CA2235 Field MinPosition is a member of type WINDOWPLACEMENT which is serializable but is of type Microsoft.Plugin.WindowWalker.Components.InteropAndHelpers.POINT which is not serializable
  CA2235 Field MaxPosition is a member of type WINDOWPLACEMENT which is serializable but is of type Microsoft.Plugin.WindowWalker.Components.InteropAndHelpers.POINT which is not serializable
* Fix CA2235 Making RECT serializable
  CA2235 Field NormalPosition is a member of type WINDOWPLACEMENT which is serializable but is of type Microsoft.Plugin.WindowWalker.Components.InteropAndHelpers.RECT which is not serializable
* Fixes for CA2101 Specify marshaling for P/Invoke string arguments.
* Fixes for CA2007 Consider calling ConfigureAwait on the awaited task
* Fixes for the following (CA1822 / CA1801):
  CA1822 Member 'OnOpenWindowsUpdate' does not access instance data and can be marked as static
  CA1801 Parameter value of method add_OnOpenWindowsUpdate is never used. Remove the parameter or use it in the method body.
  CA1801 Parameter value of method remove_OnOpenWindowsUpdate is never used. Remove the parameter or use it in the method body.
* Fix: CA1710 Rename OpenWindowsUpdateHandler to end in 'EventHandler'
* Fix CA1822 Member 'GetProcessIDFromWindowHandle' does not access instance data and can be marked as static
* Fix CA1062 In externally visible method 'List<int> FuzzyMatching.FindBestFuzzyMatch(string text, string searchText)', validate parameter 'searchText' is non-null before using it. If appropriate, throw an ArgumentNullException when the argument is null or add a Code Contract precondition asserting non-null argument.
* Fixes for CA1304 The behavior of 'string.ToLower()' could vary based on the current user's locale settings.
  CA1304 The behavior of 'string.ToLower()' could vary based on the current user's locale settings. Replace this call in 'FuzzyMatching.FindBestFuzzyMatch(string, string)' with a call to 'string.ToLower(CultureInfo)'.
  CA1304 The behavior of 'string.ToLower()' could vary based on the current user's locale settings. Replace this call in 'FuzzyMatching.FindBestFuzzyMatch(string, string)' with a call to 'string.ToLower(CultureInfo)'.
* Suppressing warning for CA1814: Prefer jagged arrays over multidimensional. However, this might be something to consider if needing to optimize the window walker search.
* Fix: CA1062 In externally visible method 'List<List<int>> FuzzyMatching.GetAllMatchIndexes(bool[,] matches)', validate parameter 'matches' is non-null before using it. If appropriate, throw an ArgumentNullException when the argument is null or add a Code Contract precondition asserting non-null argument.
* Fix for CA1062 In externally visible method 'int FuzzyMatching.CalculateScoreForMatches(List<int> matches)', validate parameter 'matches' is non-null before using it. If appropriate, throw an ArgumentNullException when the argument is null or add a Code Contract precondition asserting non-null argument.
* Fixes for CA1806 Calls x... but does not use the HRESULT or error code that the method returns. This could lead to unexpected behavior in error conditions or low-resource situations. Use the result in a conditional statement, assign the result to a variable, or pass it as an argument to another method. Using discard for methods that return void, and checking the hresult before returning parameters.
* Fix for CA1820 Test for empty strings using 'string.Length' property or 'string.IsNullOrEmpty' method instead of an Equality check
* Suppress CA1031 Modify 'get_WindowIcon' to catch a more specific allowed exception type, or rethrow the exception
* CA1062 In externally visible method 'List<Result> Main.Query(Query query)', validate parameter 'query' is non-null before using it. If appropriate, throw an ArgumentNullException when the argument is null or add a Code Contract precondition asserting non-null argument.
* Fixes for CA1304 The behavior of 'string.ToUpper()' could vary based on the current user's locale settings.
  CA1304 The behavior of 'string.ToLower()' could vary based on the current user's locale settings. Replace this call in 'SearchController.SearchText.set' with a call to 'string.ToLower(CultureInfo)'.
  CA1304 The behavior of 'string.ToLower()' could vary based on the current user's locale settings. Replace this call in 'Window.ProcessName.get' with a call to 'string.ToLower(CultureInfo)'.
  CA1304 The behavior of 'string.ToLower()' could vary based on the current user's locale settings. Replace this call in 'Window.SwitchToWindow()' with a call to 'string.ToLower(CultureInfo)'.
  CA1304 The behavior of 'string.ToUpper()' could vary based on the current user's locale settings. Replace this call in 'Window.ToString()' with a call to 'string.ToUpper(CultureInfo)'.
  CA1307 The behavior of 'string.Equals(string?)' could vary based on the current user's locale settings. Replace this call in 'Microsoft.Plugin.WindowWalker.Components.Window.SwitchToWindow()' with a call to 'string.Equals(string?, System.StringComparison)'.
* Fix: CA1710 Rename SearchResultUpdateHandler to end in 'EventHandler'
* Fix CA1060 Move pinvokes to native methods class
* Fix: CS0067 The event 'OpenWindows.OnOpenWindowsUpdateEventHandler' is never used
  1) Remove SearchController::OpenWindowsUpdateHandler(object sender, SearchResultUpdateEventArgs e) as it wasn't being called and was redundant with Update Search Text.
  2) In Main.cs calling UpdateOpenWindowsList before UpdateSearchText so that the latest enumerated windows will be called.
  3) Removing unused OnOpenWindowsUpdateEventHandler and related code.
* Revert "Fixes for CA2101 Specify marshaling for P/Invoke string arguments." This reverts commit b3dfe07915dc37618881d348130a7b3c0cd5c59d.
* Fixing CA2101 by turning off best fit mapping for methods that require ANSI marshalling. See: https://docs.microsoft.com/en-us/visualstudio/code-quality/ca2101?view=vs-2019
* Previous fix for CA1806 misunderstood int result as hresult. The actual return value is number of characters written. NativeMethods.GetWindowText(hwnd, titleBuffer, sizeOfTitle);
* Previous fix for CA1806 misunderstood int result as hresult. The actual return value is number of characters written. NativeMethods.GetClassName(Hwnd, windowClassName, windowClassName.MaxCapacity);
* Removing unused window code. This was done instead of validating fxcop changes in WindowIcon.
* Fixing typos in Window.cs (charachter -> character)
2020-09-10 09:44:22 -07:00
using System;
using System.Collections.Generic;
using System.Globalization;
namespace Microsoft.Plugin.WindowWalker.Components
{
/// <summary>
/// Class housing fuzzy matching methods
/// </summary>
internal static class FuzzyMatching
{
/// <summary>
[Run] Replace WindowWalker's brute-force fuzzy matching algorithm with optimal DP solution (#44551)

## Summary of the Pull Request

Window Walker's fuzzy string matching algorithm exhibits exponential memory usage and execution time when given inputs containing repeated characters or phrases. When a user has several windows open with long titles (such as browser windows), it is straightforward to trigger a pathological case which uses up gigabytes of memory and freezes the UI. This is exacerbated by Run's lack of thread pruning, meaning work triggered by older keystrokes consumes CPU and memory until completion.

## PR Checklist

- [x] Closes: #44546
- [x] Closes: #44184
- [ ] **Communication:** I've discussed this with core contributors already. If the work hasn't been agreed, this work might be rejected
- [ ] **Tests:** Added/updated and all pass
- [ ] **Localization:** All end-user-facing strings can be localized
- [ ] **Dev docs:** Added/updated
- [ ] **New binaries:** Added on the required places
  - [ ] [JSON for signing](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ESRPSigning_core.json) for new binaries
  - [ ] [WXS for installer](https://github.com/microsoft/PowerToys/blob/main/installer/PowerToysSetup/Product.wxs) for new binaries and localization folder
  - [ ] [YML for CI pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ci/templates/build-powertoys-steps.yml) for new test projects
  - [ ] [YML for signed pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/release.yml)
- [ ] **Documentation updated:** If checked, please file a pull request on [our docs repo](https://github.com/MicrosoftDocs/windows-uwp/tree/docs/hub/powertoys) and link it here: #xxx

## Detailed Description of the Pull Request / Additional comments

The existing algorithm in `FuzzyMatching.cs` is exhaustive: its `GetAllMatchIndexes()` method creates every possible matching combination of the search string within the candidate. After this, it selects the best match and discards the others. This may be considered reasonable for small search strings, but it causes a combinatorial explosion when there are multiple possible matches where characters or substrings repeat, even when the search string is small.

The current brute-force algorithm has time complexity of **O(n * m * C(n,m))** where **C(n,m)** = **n!/(m!(n-m)!)** and space complexity of **O(C(n,m) * m)** because it stores all possible match combinations before choosing the best.

For example, matching `"eeee"` in `"eeeeeeee"` creates **C(8,4)** = **70** match combinations, which stores 70 lists with 4 integers each, plus overhead from the LINQ-based list copying and appending:

```csharp
var tempList = results
    .Where(x => x.Count == secondIndex && x[x.Count - 1] < firstIndex)
    .Select(x => x.ToList()) // Creates a full copy of each matching path
    .ToList();               // Materializes all copies

results.AddRange(tempList);  // Adds lists to results
```

Each potential sub-match may be recalculated many times.

Window Walker queries across all window titles, so this problem will be magnified if the search text happens to match multiple titles and/or if a search string containing a single repeated character is used. For browser windows, where titles may be long, this is especially problematic, and similarly for Explorer windows with longer paths.

## Proposed solution

The solution presented here is to use a dynamic programming algorithm which finds the optimal match directly without generating all possibilities.

In terms of complexity, the new algorithm benefits from a single pass through its DP table and only has to store two integer arrays which are sized proportionally to the search and candidate text string lengths; so **O(n * m)** for both time and space, i.e. polynomial instead of exponential.

Scoring is equivalent between the old and new algorithms, based strictly on the minimum match span within the candidate string.

## Implementation notes

The new algorithm tracks the best start index for matches ending at each position, eliminating the need to store all possible paths. By storing the "latest best match so far" as you scan through the search text, you are guaranteed to minimise the span length.

To recreate the best match, a separate table of parent indexes is kept and iterated backwards once the DP step is complete. Reversing this provides you with the same result (or equivalent if there are multiple best matches) as the original algorithm.

For this "minimum-span" fuzzy matching method, this should be optimal as it only scans once and storage is proportional to the search and candidate strings only.

## Benchmarks

A verification and benchmarking suite is here: https://github.com/daverayment/WindowWalkerBench

Results from comparing the old and new algorithms are here: https://docs.google.com/spreadsheets/d/1eXmmnN2eI3774QxXXyx1Dv4SKu78U96q28GYnpHT0_8/edit?usp=sharing

| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
|---------------- |-----------------:|-----------------:|-----------------:|-----------:|----------:|-----------:|-----------:|----------:|-------------:|------------:|
| Old_Normal | 4,034.4 ns | 220.94 ns | 647.98 ns | 1.02 | 0.23 | 1.9760 | - | - | 8.09 KB | 1.00 |
| New_Normal | 804.5 ns | 24.29 ns | 70.47 ns | 0.20 | 0.04 | 0.4339 | - | - | 1.77 KB | 0.22 |
| Old_Repetitive | 7,624.7 ns | 318.06 ns | 912.57 ns | 1.94 | 0.38 | 3.7079 | - | - | 15.16 KB | 1.87 |
| New_Repetitive | 2,714.6 ns | 109.03 ns | 318.03 ns | 0.69 | 0.13 | 1.6403 | - | - | 6.72 KB | 0.83 |
| Old_Explosion | 881,443,209.3 ns | 26,273,980.96 ns | 76,225,588.43 ns | 223,872.87 | 39,357.31 | 50000.0000 | 27000.0000 | 5000.0000 | 351885.11 KB | 43,518.16 |
| New_Explosion | 3,225.4 ns | 111.98 ns | 315.84 ns | 0.82 | 0.15 | 1.7738 | - | - | 7.26 KB | 0.90 |
| Old_Explosion_8 | 460,153,862.6 ns | 18,744,417.95 ns | 54,974,137.06 ns | 116,871.93 | 22,719.87 | 25000.0000 | 14000.0000 | 3000.0000 | 173117.13 KB | 21,409.65 |
| New_Explosion_8 | 2,958.3 ns | 78.16 ns | 230.45 ns | 0.75 | 0.13 | 1.5793 | - | - | 6.46 KB | 0.80 |
| Old_Explosion_7 | 189,069,384.8 ns | 3,774,916.46 ns | 6,202,296.49 ns | 48,020.68 | 7,501.98 | 11000.0000 | 6333.3333 | 2000.0000 | 71603.96 KB | 8,855.37 |
| New_Explosion_7 | 2,667.5 ns | 117.69 ns | 337.68 ns | 0.68 | 0.13 | 1.3924 | - | - | 5.7 KB | 0.70 |
| Old_Explosion_6 | 71,960,114.8 ns | 1,757,017.15 ns | 5,125,301.87 ns | 18,276.75 | 3,083.86 | 4500.0000 | 2666.6667 | 1333.3333 | 25515.96 KB | 3,155.60 |
| New_Explosion_6 | 2,232.5 ns | 72.65 ns | 202.52 ns | 0.57 | 0.10 | 1.1978 | - | - | 4.91 KB | 0.61 |
| Old_Explosion_5 | 9,121,126.4 ns | 180,744.42 ns | 228,583.84 ns | 2,316.62 | 358.55 | 1000.0000 | 968.7500 | 484.3750 | 7630.49 KB | 943.67 |
| New_Explosion_5 | 1,917.3 ns | 48.63 ns | 133.95 ns | 0.49 | 0.08 | 1.0109 | - | - | 4.13 KB | 0.51 |
| Old_Explosion_4 | 2,489,593.2 ns | 82,937.33 ns | 236,624.90 ns | 632.32 | 113.96 | 281.2500 | 148.4375 | 74.2188 | 1729.71 KB | 213.92 |
| New_Explosion_4 | 1,598.3 ns | 51.92 ns | 152.28 ns | 0.41 | 0.07 | 0.8163 | - | - | 3.34 KB | 0.41 |
| Old_Explosion_3 | 202,814.0 ns | 7,684.44 ns | 22,293.96 ns | 51.51 | 9.72 | 72.7539 | 0.2441 | - | 298.13 KB | 36.87 |
| New_Explosion_3 | 1,222.5 ns | 26.07 ns | 76.45 ns | 0.31 | 0.05 | 0.6275 | - | - | 2.57 KB | 0.32 |
| Old_Subsequence | 419,417.7 ns | 8,308.97 ns | 22,178.33 ns | 106.53 | 17.23 | 266.6016 | 0.9766 | - | 1090.05 KB | 134.81 |
| New_Subsequence | 2,501.9 ns | 80.91 ns | 233.43 ns | 0.64 | 0.11 | 1.3542 | - | - | 5.55 KB | 0.69 |

(Where "Old_Explosion" is "e" repeated 9 times. Times are in nanoseconds, i.e. millionths of a millisecond.)

It is worth noting that the results show a **single string match**. So matching "eeeeee" against a 99-character string took 25 MB of memory and 71 milliseconds to compute. For the new algorithm, this is reduced down to <5KB and 0.002 milliseconds. Even for a three-character repetition, the new algorithm is >150x faster with <1% of the allocations.

## Real world example

**Before (results still pending after more than a minute):**

<img width="837" height="336" alt="Image" src="https://github.com/user-attachments/assets/c4c3ae04-6a47-40b9-a2a4-7a4da169f7d5" />

**After (instantaneous results):**

<img width="829" height="444" alt="image" src="https://github.com/user-attachments/assets/055fc4a6-f34f-4bed-a12c-408b52274de2" />

## Validation Steps Performed

The verification tests in the benchmark project pass, with results identical to the original across a number of test cases, including the pathological cases identified earlier and edge cases such as single-character searches. All unit tests under `Wox.Test`, including all 38 `FuzzyMatcherTest` entries, still pass.
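The implementation notes in the pull request description can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code that was merged: the class name `MinSpanMatchSketch`, the method name `FindMinSpanMatch`, the table names `start` and `parent`, and the use of `ToLowerInvariant()` for case folding are all hypothetical choices made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of minimum-span fuzzy matching; not the PowerToys source.
public static class MinSpanMatchSketch
{
    // Returns the indexes in text of the tightest in-order match of searchText,
    // or an empty list when searchText is not a subsequence of text.
    public static List<int> FindMinSpanMatch(string text, string searchText)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (searchText == null) throw new ArgumentNullException(nameof(searchText));

        // Assumption: invariant-culture case folding is good enough for a sketch.
        string t = text.ToLowerInvariant();
        string s = searchText.ToLowerInvariant();
        int n = t.Length, m = s.Length;
        var result = new List<int>();
        if (m == 0 || m > n) return result;

        // start[j, i]: latest start index of a match of s[0..j] that ends exactly at t[i], or -1.
        // parent[j, i]: index in t where s[j - 1] was matched on that path, or -1.
        var start = new int[m, n];
        var parent = new int[m, n];

        for (int j = 0; j < m; j++)
        {
            int bestStart = -1; // best start among previous-row cells strictly left of i
            int bestIndex = -1; // column that produced bestStart
            for (int i = 0; i < n; i++)
            {
                start[j, i] = -1;
                parent[j, i] = -1;
                if (t[i] == s[j])
                {
                    if (j == 0)
                    {
                        start[j, i] = i; // a one-character match starts where it ends
                    }
                    else if (bestStart >= 0)
                    {
                        start[j, i] = bestStart; // extend the latest-starting shorter match
                        parent[j, i] = bestIndex;
                    }
                }

                // Fold column i of the previous row into the running best for later columns.
                if (j > 0 && start[j - 1, i] > bestStart)
                {
                    bestStart = start[j - 1, i];
                    bestIndex = i;
                }
            }
        }

        // Pick the end position with the smallest span (first one wins on ties).
        int bestEnd = -1, bestSpan = int.MaxValue;
        for (int i = 0; i < n; i++)
        {
            if (start[m - 1, i] >= 0 && i - start[m - 1, i] < bestSpan)
            {
                bestSpan = i - start[m - 1, i];
                bestEnd = i;
            }
        }

        if (bestEnd < 0) return result;

        // Walk the parent table backwards, then reverse to get ascending indexes.
        for (int j = m - 1, i = bestEnd; j >= 0; j--)
        {
            result.Add(i);
            i = parent[j, i];
        }

        result.Reverse();
        return result;
    }
}
```

With this sketch, `FindMinSpanMatch("eeeeeeee", "eeee")` yields the indexes 0, 1, 2, 3 while touching only two `4 x 8` tables, rather than materializing the 70 combinations mentioned above. Keeping the latest start per end position works because, for a fixed end index, a later start always gives a span that is no larger.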
2026-03-02 12:45:14 +00:00
/// Finds the best match (the one with the smallest span) using a dynamic programming approach
/// that avoids generating every candidate match.
/// </summary>
[Run] Replace WindowWalker's brute-force fuzzy matching algorithm with optimal DP solution (#44551) ## Summary of the Pull Request Window Walker's fuzzy string matching algorithm exhibits exponential memory usage and execution time when given inputs containing repeated characters or phrases. When a user has several windows open with long titles (such as browser windows), it is straightforward to trigger a pathological case which uses up gigabytes of memory and freezes the UI. This is exacerbated by Run's lack of thread pruning, meaning work triggered by older keystrokes consumes CPU and memory until completion. <!-- Please review the items on the PR checklist before submitting--> ## PR Checklist - [x] Closes: #44546 - [x] Closes: #44184 - [ ] **Communication:** I've discussed this with core contributors already. If the work hasn't been agreed, this work might be rejected - [ ] **Tests:** Added/updated and all pass - [ ] **Localization:** All end-user-facing strings can be localized - [ ] **Dev docs:** Added/updated - [ ] **New binaries:** Added on the required places - [ ] [JSON for signing](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ESRPSigning_core.json) for new binaries - [ ] [WXS for installer](https://github.com/microsoft/PowerToys/blob/main/installer/PowerToysSetup/Product.wxs) for new binaries and localization folder - [ ] [YML for CI pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ci/templates/build-powertoys-steps.yml) for new test projects - [ ] [YML for signed pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/release.yml) - [ ] **Documentation updated:** If checked, please file a pull request on [our docs repo](https://github.com/MicrosoftDocs/windows-uwp/tree/docs/hub/powertoys) and link it here: #xxx ## Detailed Description of the Pull Request / Additional comments The existing algorithm in `FuzzyMatching.cs` is greedy, creating all possible matching combinations of the search string within the 
candidate via its `GetAllMatchIndexes()` method. After this, it selects the best match and discards the others. This may be considered reasonable for small search strings, but it causes a combinatorial explosion when there are multiple possible matches where characters or substrings repeat, even when the search string is small.

The current brute-force algorithm has time complexity of **O(n * m * C(n,m))** where **C(n,m)** = **n!/(m!(n-m)!)** and space complexity of **O(C(n,m) * m)** because it stores all possible match combinations before choosing the best.

For example, matching `"eeee"` in `"eeeeeeee"` creates **C(8,4)** = **70** match combinations, which stores 70 lists with 4 integers each, plus overhead from the LINQ-based list copying and appending:

```csharp
var tempList = results
    .Where(x => x.Count == secondIndex && x[x.Count - 1] < firstIndex)
    .Select(x => x.ToList()) // Creates a full copy of each matching path
    .ToList();               // Materializes all copies

results.AddRange(tempList);  // Adds lists to results
```

Each potential sub-match may be recalculated many times.

Window Walker queries across all window titles, so this problem will be magnified if the search text happens to match multiple titles and/or if a search string containing a single repeated character is used. For browser windows, where titles may be long, this is especially problematic, and similarly for Explorer windows with longer paths.

## Proposed solution

The solution presented here is to use a dynamic programming algorithm which finds the optimal match directly without generating all possibilities.

In terms of complexity, the new algorithm benefits from a single pass through its DP table and only has to store two integer arrays which are sized proportionally to the search and candidate text string lengths; so **O(n * m)** for both time and space, i.e. polynomial instead of exponential.

Scoring is equivalent between the old and new algorithms, based strictly on the minimum match span within the candidate string.

## Implementation notes

The new algorithm tracks the best start index for matches ending at each position, eliminating the need to store all possible paths. By storing the "latest best match so far" as you scan through the search text, you are guaranteed to minimise the span length.

To recreate the best match, a separate table of parent indexes is kept and iterated backwards once the DP step is complete. Reversing this provides you with the same result (or equivalent if there are multiple best matches) as the original algorithm.

For this "minimum-span" fuzzy matching method, this should be optimal as it only scans once and storage is proportional to the search and candidate strings only.

## Benchmarks

A verification and benchmarking suite is here: https://github.com/daverayment/WindowWalkerBench

Results from comparing the old and new algorithms are here: https://docs.google.com/spreadsheets/d/1eXmmnN2eI3774QxXXyx1Dv4SKu78U96q28GYnpHT0_8/edit?usp=sharing

| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
|---------------- |-----------------:|-----------------:|-----------------:|-----------:|----------:|-----------:|-----------:|----------:|-------------:|------------:|
| Old_Normal | 4,034.4 ns | 220.94 ns | 647.98 ns | 1.02 | 0.23 | 1.9760 | - | - | 8.09 KB | 1.00 |
| New_Normal | 804.5 ns | 24.29 ns | 70.47 ns | 0.20 | 0.04 | 0.4339 | - | - | 1.77 KB | 0.22 |
| Old_Repetitive | 7,624.7 ns | 318.06 ns | 912.57 ns | 1.94 | 0.38 | 3.7079 | - | - | 15.16 KB | 1.87 |
| New_Repetitive | 2,714.6 ns | 109.03 ns | 318.03 ns | 0.69 | 0.13 | 1.6403 | - | - | 6.72 KB | 0.83 |
| Old_Explosion | 881,443,209.3 ns | 26,273,980.96 ns | 76,225,588.43 ns | 223,872.87 | 39,357.31 | 50000.0000 | 27000.0000 | 5000.0000 | 351885.11 KB | 43,518.16 |
| New_Explosion | 3,225.4 ns | 111.98 ns | 315.84 ns | 0.82 | 0.15 | 1.7738 | - | - | 7.26 KB | 0.90 |
| Old_Explosion_8 | 460,153,862.6 ns | 18,744,417.95 ns | 54,974,137.06 ns | 116,871.93 | 22,719.87 | 25000.0000 | 14000.0000 | 3000.0000 | 173117.13 KB | 21,409.65 |
| New_Explosion_8 | 2,958.3 ns | 78.16 ns | 230.45 ns | 0.75 | 0.13 | 1.5793 | - | - | 6.46 KB | 0.80 |
| Old_Explosion_7 | 189,069,384.8 ns | 3,774,916.46 ns | 6,202,296.49 ns | 48,020.68 | 7,501.98 | 11000.0000 | 6333.3333 | 2000.0000 | 71603.96 KB | 8,855.37 |
| New_Explosion_7 | 2,667.5 ns | 117.69 ns | 337.68 ns | 0.68 | 0.13 | 1.3924 | - | - | 5.7 KB | 0.70 |
| Old_Explosion_6 | 71,960,114.8 ns | 1,757,017.15 ns | 5,125,301.87 ns | 18,276.75 | 3,083.86 | 4500.0000 | 2666.6667 | 1333.3333 | 25515.96 KB | 3,155.60 |
| New_Explosion_6 | 2,232.5 ns | 72.65 ns | 202.52 ns | 0.57 | 0.10 | 1.1978 | - | - | 4.91 KB | 0.61 |
| Old_Explosion_5 | 9,121,126.4 ns | 180,744.42 ns | 228,583.84 ns | 2,316.62 | 358.55 | 1000.0000 | 968.7500 | 484.3750 | 7630.49 KB | 943.67 |
| New_Explosion_5 | 1,917.3 ns | 48.63 ns | 133.95 ns | 0.49 | 0.08 | 1.0109 | - | - | 4.13 KB | 0.51 |
| Old_Explosion_4 | 2,489,593.2 ns | 82,937.33 ns | 236,624.90 ns | 632.32 | 113.96 | 281.2500 | 148.4375 | 74.2188 | 1729.71 KB | 213.92 |
| New_Explosion_4 | 1,598.3 ns | 51.92 ns | 152.28 ns | 0.41 | 0.07 | 0.8163 | - | - | 3.34 KB | 0.41 |
| Old_Explosion_3 | 202,814.0 ns | 7,684.44 ns | 22,293.96 ns | 51.51 | 9.72 | 72.7539 | 0.2441 | - | 298.13 KB | 36.87 |
| New_Explosion_3 | 1,222.5 ns | 26.07 ns | 76.45 ns | 0.31 | 0.05 | 0.6275 | - | - | 2.57 KB | 0.32 |
| Old_Subsequence | 419,417.7 ns | 8,308.97 ns | 22,178.33 ns | 106.53 | 17.23 | 266.6016 | 0.9766 | - | 1090.05 KB | 134.81 |
| New_Subsequence | 2,501.9 ns | 80.91 ns | 233.43 ns | 0.64 | 0.11 | 1.3542 | - | - | 5.55 KB | 0.69 |

(Where "Old_Explosion" is "e" repeated 9 times. Times in nanoseconds or one millionth of a millisecond.)

It is worth noting that the results show a **single string match**. So matching "eeeeee" against a 99-character string took 25 MB of memory and 71 milliseconds to compute. For the new algorithm, this is reduced down to <5KB and 0.002 milliseconds. Even for a three-character repetition, the new algorithm is >150x faster with <1% of the allocations.

## Real world example

**Before (results still pending after more than a minute):**

<img width="837" height="336" alt="Image" src="https://github.com/user-attachments/assets/c4c3ae04-6a47-40b9-a2a4-7a4da169f7d5" />

**After (instantaneous results):**

<img width="829" height="444" alt="image" src="https://github.com/user-attachments/assets/055fc4a6-f34f-4bed-a12c-408b52274de2" />

## Validation Steps Performed

The verification tests in the benchmark project pass, with results identical to the original across a number of test cases, including the pathological cases identified earlier and edge cases such as single-character searches.

All unit tests under `Wox.Test`, including all 38 `FuzzyMatcherTest` entries still pass.
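
To make the "latest best start" idea from the implementation notes concrete, here is a minimal sketch of a minimum-span subsequence match. It is not the code merged in the pull request: the class name `MinSpanSketch`, the method name `FindMinSpanMatch`, the invariant-culture lowercasing and the leftmost-wins tie-break are all assumptions made for illustration. It also differs from the description above in one respect: instead of keeping a parent-index table, it stores a single rolling array and recovers the indexes with a second greedy pass over the winning window.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not the PowerToys implementation.
internal static class MinSpanSketch
{
    // Returns the text indexes of the tightest in-order match of searchText,
    // or an empty list when searchText is not a subsequence of text.
    internal static List<int> FindMinSpanMatch(string text, string searchText)
    {
        ArgumentNullException.ThrowIfNull(text);
        ArgumentNullException.ThrowIfNull(searchText);

        var result = new List<int>();
        int n = text.Length;
        int m = searchText.Length;
        if (m == 0 || m > n)
        {
            return result;
        }

        // latestStart[j]: latest start index of a match of searchText[0..j]
        // ending at or before the current scan position, or -1 if none yet.
        var latestStart = new int[m];
        Array.Fill(latestStart, -1);

        int bestStart = -1;
        int bestEnd = -1;
        for (int i = 0; i < n; i++)
        {
            char c = char.ToLowerInvariant(text[i]); // assumption: invariant casing

            // Descend through j so latestStart[j - 1] still describes positions before i.
            for (int j = m - 1; j >= 0; j--)
            {
                if (c != char.ToLowerInvariant(searchText[j]))
                {
                    continue;
                }

                latestStart[j] = j == 0 ? i : latestStart[j - 1];

                // A full match ends at i; keep it only if it is strictly tighter.
                if (j == m - 1 && latestStart[j] >= 0
                    && (bestStart < 0 || i - latestStart[j] < bestEnd - bestStart))
                {
                    bestStart = latestStart[j];
                    bestEnd = i;
                }
            }
        }

        if (bestStart < 0)
        {
            return result;
        }

        // Greedy left-to-right pass inside the winning window recovers the indexes.
        for (int i = bestStart, j = 0; j < m; i++)
        {
            if (char.ToLowerInvariant(text[i]) == char.ToLowerInvariant(searchText[j]))
            {
                result.Add(i);
                j++;
            }
        }

        return result;
    }
}
```

The work is one pass over the candidate with an inner loop over the search text, so time is O(n * m), and the rolling array keeps extra storage at O(m). For `"eeee"` in `"eeeeeeee"` this visits 8 * 4 = 32 cells rather than materialising 70 candidate lists. When several windows tie for the minimum span, this sketch keeps the leftmost one, which may differ from the original algorithm's choice.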
2026-03-02 12:45:14 +00:00
/// <param name="text">The text to search inside of.</param>
/// <param name="searchText">The text to search for.</param>
/// <returns>The index location of each of the letters in the best match.</returns>
internal static List<int> FindBestFuzzyMatch(string text, string searchText)
{
🚧 [Dev][Build] .NET 8 Upgrade (#28655) * Upgraded projects to target .NET 8 * Updated .NET runtime package targets to use latest .NET 8 build * Updated PowerToys Interop to target .NET 8 * Switch to use ArgumentNullException.ThrowIfNull * ArgumentNullException.ThrowIfNull for CropAndLockViewModel * Switching to ObjectDisposedException.ThrowIf * Upgrade System.ComponentModel.Composition to 8.0 * ArgumentNullException.ThrowIfNull in Helper * Switch to StartsWith using StringComparison.Ordinal * Disabled CA1859, CA1716, SYSLIB1096 analyzers * Update RIDs to reflect breaking changes in .NET 8 * Updated Microsoft NuGet packages to RC1 * Updated Analyzer package to latest .NET 8 preview package * CA1854: Use TryGetValue instead of ContainsKey * [Build] Update TFM to .NET 8 for publish profiles * [Analyzers] Remove CA1309, CA1860-CA1865, CA1869, CA2208 from warning. * [Analyzers] Fix for C26495 * [Analyzers] Disable CS1615, CS9191 * [CI] Target .NET 8 in YAML * [CI] Add .NET preview version flag temporarily. 
* [FileLocksmith] Update TFM to .NET 8 * [CI] Switch to preview agent * [CI] Update NOTICE.md * [CI] Update Release to target .NET 8 and use Preview agent * [Analyzers] Disable CA1854 * Fix typo * Updated Microsoft.CodeAnalysis.NetAnalyzers to latest preview Updated packages to rc2 * [Analyzers][CPP] Turn off warning for 5271 * [Analyzers][CPP] Turn off warning for 26493 * [KeyboardListener] Add mutex include to resolve error * [PT Run][Folder] Use static SearchValues to resolve CA1870 * [PowerLauncher] Fix TryGetValue * [MouseJumpSettings] Use ArgumentNullException.ThrowIfNull * [Build] Disable parallel dotnet tool restore * [Build] No cache of dotnet tool packages * [Build] Temporarily move .NET 8 SDK task before XAML formatting * [Build][Temp] Try using .NET 7 prior to XAML formatting and then switch to .NET 8 after * [Build] Use .NET 6 for XAML Styler * [CI] Updated NOTICE.md * [FancyZones] Update TFM to .NET 8 * [EnvVar] Update TFM to .NET 8 and update RID * [EnvVar] Use ArgumentNullException.ThrowIfNull * [Dev] Updated packages to .NET 8 RTM version * [Dev] Updated Microsoft.CodeAnalysis.NetAnalyzers to latest * [CI] Updated NOTICE.md with latest package versions * Fix new utility target fameworks and runtimeids * Don't use preview images anymore * [CI] Add script to update VCToolsVersion environment variable * [CI] Add Step to Verify VCToolsVersion * [CI] Use latest flag for vswhere to set proper VCToolsVersion * Add VCToolsVersion checking to release.yml * Remove net publishing from local/ PR CI builds * Revert "Remove net publishing from local/ PR CI builds" This reverts commit f469778996c5053e8bf93233e8191858c46f6420. 
* Only publish necessary projects * Add verbosity to release pipelines builds of PowerTOys * Set VCToolsVersion for publish.cmd when called from installer * [Installer] Moved project publish logic to MSBuild Task * [CI] Revert using publish.cmd * [CI] Set VCToolsVersion and unset ClearDevCommandPromptEnvVars property * Installer publishes for x64 too * Revert "Add verbosity to release pipelines builds of PowerTOys" This reverts commit 654d4a7f7852e95e44df315c473c02d38b1f538b. * [Dev] Update CodeAnalysis library to non-preview package * Remove unneeded warning removal * Fix Notice.md * Rename VCToolsVersion file and task name * Remove unneeded mutex header include --------- Co-authored-by: Jaime Bernardo <jaime@janeasystems.com>
2023-11-22 12:46:59 -05:00
ArgumentNullException.ThrowIfNull(searchText);
ArgumentNullException.ThrowIfNull(text);
[Run] Replace WindowWalker's brute-force fuzzy matching algorithm with optimal DP solution (#44551) ## Summary of the Pull Request Window Walker's fuzzy string matching algorithm exhibits exponential memory usage and execution time when given inputs containing repeated characters or phrases. When a user has several windows open with long titles (such as browser windows), it is straightforward to trigger a pathological case which uses up gigabytes of memory and freezes the UI. This is exacerbated by Run's lack of thread pruning, meaning work triggered by older keystrokes consumes CPU and memory until completion. <!-- Please review the items on the PR checklist before submitting--> ## PR Checklist - [x] Closes: #44546 - [x] Closes: #44184 - [ ] **Communication:** I've discussed this with core contributors already. If the work hasn't been agreed, this work might be rejected - [ ] **Tests:** Added/updated and all pass - [ ] **Localization:** All end-user-facing strings can be localized - [ ] **Dev docs:** Added/updated - [ ] **New binaries:** Added on the required places - [ ] [JSON for signing](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ESRPSigning_core.json) for new binaries - [ ] [WXS for installer](https://github.com/microsoft/PowerToys/blob/main/installer/PowerToysSetup/Product.wxs) for new binaries and localization folder - [ ] [YML for CI pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ci/templates/build-powertoys-steps.yml) for new test projects - [ ] [YML for signed pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/release.yml) - [ ] **Documentation updated:** If checked, please file a pull request on [our docs repo](https://github.com/MicrosoftDocs/windows-uwp/tree/docs/hub/powertoys) and link it here: #xxx ## Detailed Description of the Pull Request / Additional comments The existing algorithm in `FuzzyMatching.cs` is greedy, creating all possible matching combinations of the search string within the 
candidate via its `GetAllMatchIndexes()` method. After this, it selects the best match and discards the others. This may be considered reasonable for small search strings, but it causes a combinatorial explosion when there are multiple possible matches where characters or substrings repeat, even when the search string is small.

The current brute-force algorithm has time complexity of **O(n * m * C(n,m))** where **C(n,m)** = **n!/(m!(n-m)!)** and space complexity of **O(C(n,m) * m)** because it stores all possible match combinations before choosing the best.

For example, matching `"eeee"` in `"eeeeeeee"` creates **C(8,4)** = **70** match combinations, which stores 70 lists with 4 integers each, plus overhead from the LINQ-based list copying and appending:

```csharp
var tempList = results
    .Where(x => x.Count == secondIndex && x[x.Count - 1] < firstIndex)
    .Select(x => x.ToList()) // Creates a full copy of each matching path
    .ToList(); // Materializes all copies
results.AddRange(tempList); // Adds lists to results
```

Each potential sub-match may be recalculated many times. Window Walker queries across all window titles, so this problem will be magnified if the search text happens to match multiple titles and/or if a search string containing a single repeated character is used. For browser windows, where titles may be long, this is especially problematic, and similarly for Explorer windows with longer paths.

## Proposed solution

The solution presented here is to use a dynamic programming algorithm which finds the optimal match directly without generating all possibilities. In terms of complexity, the new algorithm benefits from a single pass through its DP table and only has to store two integer tables whose size is the product of the search and candidate text lengths, so **O(n * m)** for both time and space, i.e. polynomial instead of exponential.

Scoring is equivalent between the old and new algorithms, based strictly on the minimum match span within the candidate string.

## Implementation notes

The new algorithm tracks the best start index for matches ending at each position, eliminating the need to store all possible paths. By storing the "latest best match so far" as you scan through the search text, you are guaranteed to minimise the span length.

To recreate the best match, a separate table of parent indexes is kept and iterated backwards once the DP step is complete. Reversing this provides you with the same result (or equivalent if there are multiple best matches) as the original algorithm.

For this "minimum-span" fuzzy matching method, this should be optimal as it only scans once and storage is proportional to the search and candidate strings only.

## Benchmarks

A verification and benchmarking suite is here: https://github.com/daverayment/WindowWalkerBench

Results from comparing the old and new algorithms are here: https://docs.google.com/spreadsheets/d/1eXmmnN2eI3774QxXXyx1Dv4SKu78U96q28GYnpHT0_8/edit?usp=sharing

| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
|---------------- |-----------------:|-----------------:|-----------------:|-----------:|----------:|-----------:|-----------:|----------:|-------------:|------------:|
| Old_Normal | 4,034.4 ns | 220.94 ns | 647.98 ns | 1.02 | 0.23 | 1.9760 | - | - | 8.09 KB | 1.00 |
| New_Normal | 804.5 ns | 24.29 ns | 70.47 ns | 0.20 | 0.04 | 0.4339 | - | - | 1.77 KB | 0.22 |
| Old_Repetitive | 7,624.7 ns | 318.06 ns | 912.57 ns | 1.94 | 0.38 | 3.7079 | - | - | 15.16 KB | 1.87 |
| New_Repetitive | 2,714.6 ns | 109.03 ns | 318.03 ns | 0.69 | 0.13 | 1.6403 | - | - | 6.72 KB | 0.83 |
| Old_Explosion | 881,443,209.3 ns | 26,273,980.96 ns | 76,225,588.43 ns | 223,872.87 | 39,357.31 | 50000.0000 | 27000.0000 | 5000.0000 | 351885.11 KB | 43,518.16 |
| New_Explosion | 3,225.4 ns | 111.98 ns | 315.84 ns | 0.82 | 0.15 | 1.7738 | - | - | 7.26 KB | 0.90 |
| Old_Explosion_8 | 460,153,862.6 ns | 18,744,417.95 ns | 54,974,137.06 ns | 116,871.93 | 22,719.87 | 25000.0000 | 14000.0000 | 3000.0000 | 173117.13 KB | 21,409.65 |
| New_Explosion_8 | 2,958.3 ns | 78.16 ns | 230.45 ns | 0.75 | 0.13 | 1.5793 | - | - | 6.46 KB | 0.80 |
| Old_Explosion_7 | 189,069,384.8 ns | 3,774,916.46 ns | 6,202,296.49 ns | 48,020.68 | 7,501.98 | 11000.0000 | 6333.3333 | 2000.0000 | 71603.96 KB | 8,855.37 |
| New_Explosion_7 | 2,667.5 ns | 117.69 ns | 337.68 ns | 0.68 | 0.13 | 1.3924 | - | - | 5.7 KB | 0.70 |
| Old_Explosion_6 | 71,960,114.8 ns | 1,757,017.15 ns | 5,125,301.87 ns | 18,276.75 | 3,083.86 | 4500.0000 | 2666.6667 | 1333.3333 | 25515.96 KB | 3,155.60 |
| New_Explosion_6 | 2,232.5 ns | 72.65 ns | 202.52 ns | 0.57 | 0.10 | 1.1978 | - | - | 4.91 KB | 0.61 |
| Old_Explosion_5 | 9,121,126.4 ns | 180,744.42 ns | 228,583.84 ns | 2,316.62 | 358.55 | 1000.0000 | 968.7500 | 484.3750 | 7630.49 KB | 943.67 |
| New_Explosion_5 | 1,917.3 ns | 48.63 ns | 133.95 ns | 0.49 | 0.08 | 1.0109 | - | - | 4.13 KB | 0.51 |
| Old_Explosion_4 | 2,489,593.2 ns | 82,937.33 ns | 236,624.90 ns | 632.32 | 113.96 | 281.2500 | 148.4375 | 74.2188 | 1729.71 KB | 213.92 |
| New_Explosion_4 | 1,598.3 ns | 51.92 ns | 152.28 ns | 0.41 | 0.07 | 0.8163 | - | - | 3.34 KB | 0.41 |
| Old_Explosion_3 | 202,814.0 ns | 7,684.44 ns | 22,293.96 ns | 51.51 | 9.72 | 72.7539 | 0.2441 | - | 298.13 KB | 36.87 |
| New_Explosion_3 | 1,222.5 ns | 26.07 ns | 76.45 ns | 0.31 | 0.05 | 0.6275 | - | - | 2.57 KB | 0.32 |
| Old_Subsequence | 419,417.7 ns | 8,308.97 ns | 22,178.33 ns | 106.53 | 17.23 | 266.6016 | 0.9766 | - | 1090.05 KB | 134.81 |
| New_Subsequence | 2,501.9 ns | 80.91 ns | 233.43 ns | 0.64 | 0.11 | 1.3542 | - | - | 5.55 KB | 0.69 |

(Where "Old_Explosion" is "e" repeated 9 times. Times are in nanoseconds; one nanosecond is one millionth of a millisecond.)

It is worth noting that the results show a **single string match**. So matching "eeeeee" against a 99-character string took 25 MB of memory and 71 milliseconds to compute. For the new algorithm, this is reduced to <5 KB and 0.002 milliseconds. Even for a three-character repetition, the new algorithm is >150x faster with <1% of the allocations.

## Real world example

**Before (results still pending after more than a minute):**

<img width="837" height="336" alt="Image" src="https://github.com/user-attachments/assets/c4c3ae04-6a47-40b9-a2a4-7a4da169f7d5" />

**After (instantaneous results):**

<img width="829" height="444" alt="image" src="https://github.com/user-attachments/assets/055fc4a6-f34f-4bed-a12c-408b52274de2" />

## Validation Steps Performed

The verification tests in the benchmark project pass, with results identical to the original across a number of test cases, including the pathological cases identified earlier and edge cases such as single-character searches.

All unit tests under `Wox.Test`, including all 38 `FuzzyMatcherTest` entries, still pass.
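
The minimum-span idea described under "Implementation notes" can be sketched as follows. This is a minimal illustration and not the code from the PR: the class name `MinSpanSketch` and method name `FindMinSpanMatch` are hypothetical, and the sketch assumes both strings have already been lower-cased by the caller. It fills a `bestStart` table (the latest start index of a prefix match ending at each position) and a `parent` table (where the previous character matched), then walks `parent` backwards from the end position with the smallest span.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names; a sketch of minimum-span subsequence matching, not the PR's code.
public static class MinSpanSketch
{
    // Returns the text indexes of the tightest match of `search` within `text`,
    // or an empty list if `search` is not a subsequence of `text`.
    public static List<int> FindMinSpanMatch(string text, string search)
    {
        int m = search.Length;
        int n = text.Length;
        if (m == 0 || m > n)
        {
            return new List<int>();
        }

        var bestStart = new int[m, n]; // latest start of search[0..k] ending at text[i], or -1
        var parent = new int[m, n];    // text index matched by search[k - 1], or -1

        for (int k = 0; k < m; k++)
        {
            int prevStart = -1; // latest start seen so far in row k - 1, strictly before i
            int prevEnd = -1;   // text index where that prefix match ends

            for (int i = 0; i < n; i++)
            {
                bestStart[k, i] = -1;
                parent[k, i] = -1;

                if (k > 0 && i > 0 && bestStart[k - 1, i - 1] > prevStart)
                {
                    prevStart = bestStart[k - 1, i - 1];
                    prevEnd = i - 1;
                }

                if (text[i] != search[k])
                {
                    continue;
                }

                if (k == 0)
                {
                    bestStart[k, i] = i; // a one-character match starts where it ends
                }
                else if (prevStart >= 0)
                {
                    bestStart[k, i] = prevStart;
                    parent[k, i] = prevEnd;
                }
            }
        }

        // Choose the end position whose full match has the smallest span.
        int bestEnd = -1;
        int bestSpan = int.MaxValue;
        for (int i = 0; i < n; i++)
        {
            int start = bestStart[m - 1, i];
            if (start >= 0 && i - start < bestSpan)
            {
                bestSpan = i - start;
                bestEnd = i;
            }
        }

        var result = new List<int>();
        if (bestEnd < 0)
        {
            return result;
        }

        // Follow parent links from the last character back to the first, then reverse.
        for (int k = m - 1, i = bestEnd; k >= 0; k--)
        {
            result.Add(i);
            i = parent[k, i];
        }

        result.Reverse();
        return result;
    }
}
```

In this sketch, searching for `"ab"` in `"a_b_ab"` yields the adjacent pair at indexes 4 and 5 rather than the wider match at 0 and 2, because the later start gives the smaller span. Both tables hold `m * n` integers, which is where the **O(n * m)** space bound comes from.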
2026-03-02 12:45:14 +00:00
var sLower = searchText.ToLower(CultureInfo.CurrentCulture);
var tLower = text.ToLower(CultureInfo.CurrentCulture);
int m = sLower.Length;
int n = tLower.Length;
// A subsequence longer than the candidate text can never match.
if (m > n)
{
    return [];
}
// bestStart[k, i] stores the latest possible start index of a match for s[0..k] that
// ends exactly at t[i], or -1 if no such match exists.
//
// Tracking the latest start ensures that we only retain the smallest span of all matches
// that end at i.
int[,] bestStart = new int[m, n];
// parent[k, i] stores the index where the previous character matched to allow for
// reconstruction of the best path once the DP step completes.
int[,] parent = new int[m, n];
0.15 | 1.7738 | - | - | 7.26 KB | 0.90 | | Old_Explosion_8 | 460,153,862.6 ns | 18,744,417.95 ns | 54,974,137.06 ns | 116,871.93 | 22,719.87 | 25000.0000 | 14000.0000 | 3000.0000 | 173117.13 KB | 21,409.65 | | New_Explosion_8 | 2,958.3 ns | 78.16 ns | 230.45 ns | 0.75 | 0.13 | 1.5793 | - | - | 6.46 KB | 0.80 | | Old_Explosion_7 | 189,069,384.8 ns | 3,774,916.46 ns | 6,202,296.49 ns | 48,020.68 | 7,501.98 | 11000.0000 | 6333.3333 | 2000.0000 | 71603.96 KB | 8,855.37 | | New_Explosion_7 | 2,667.5 ns | 117.69 ns | 337.68 ns | 0.68 | 0.13 | 1.3924 | - | - | 5.7 KB | 0.70 | | Old_Explosion_6 | 71,960,114.8 ns | 1,757,017.15 ns | 5,125,301.87 ns | 18,276.75 | 3,083.86 | 4500.0000 | 2666.6667 | 1333.3333 | 25515.96 KB | 3,155.60 | | New_Explosion_6 | 2,232.5 ns | 72.65 ns | 202.52 ns | 0.57 | 0.10 | 1.1978 | - | - | 4.91 KB | 0.61 | | Old_Explosion_5 | 9,121,126.4 ns | 180,744.42 ns | 228,583.84 ns | 2,316.62 | 358.55 | 1000.0000 | 968.7500 | 484.3750 | 7630.49 KB | 943.67 | | New_Explosion_5 | 1,917.3 ns | 48.63 ns | 133.95 ns | 0.49 | 0.08 | 1.0109 | - | - | 4.13 KB | 0.51 | | Old_Explosion_4 | 2,489,593.2 ns | 82,937.33 ns | 236,624.90 ns | 632.32 | 113.96 | 281.2500 | 148.4375 | 74.2188 | 1729.71 KB | 213.92 | | New_Explosion_4 | 1,598.3 ns | 51.92 ns | 152.28 ns | 0.41 | 0.07 | 0.8163 | - | - | 3.34 KB | 0.41 | | Old_Explosion_3 | 202,814.0 ns | 7,684.44 ns | 22,293.96 ns | 51.51 | 9.72 | 72.7539 | 0.2441 | - | 298.13 KB | 36.87 | | New_Explosion_3 | 1,222.5 ns | 26.07 ns | 76.45 ns | 0.31 | 0.05 | 0.6275 | - | - | 2.57 KB | 0.32 | | Old_Subsequence | 419,417.7 ns | 8,308.97 ns | 22,178.33 ns | 106.53 | 17.23 | 266.6016 | 0.9766 | - | 1090.05 KB | 134.81 | | New_Subsequence | 2,501.9 ns | 80.91 ns | 233.43 ns | 0.64 | 0.11 | 1.3542 | - | - | 5.55 KB | 0.69 | (Where "Old_Explosion" is "e" repeated 9 times. Times in nanoseconds or one millionth of a millisecond.) It is worth noting that the results show a **single string match**. 
So matching "eeeeee" against a 99-character string took 25 MB of memory and 71 milliseconds to compute. For the new algorithm, this is reduced down to <5KB and 0.002 milliseconds. Even for a three-character repetition, the new algorithm is >150x faster with <1% of the allocations. ## Real world example **Before (results still pending after more than a minute):** <img width="837" height="336" alt="Image" src="https://github.com/user-attachments/assets/c4c3ae04-6a47-40b9-a2a4-7a4da169f7d5" /> **After (instantaneous results):** <img width="829" height="444" alt="image" src="https://github.com/user-attachments/assets/055fc4a6-f34f-4bed-a12c-408b52274de2" /> ## Validation Steps Performed The verification tests in the benchmark project pass, with results identical to the original across a number of test cases, including the pathological cases identified earlier and edge cases such as single-character searches. All unit tests under `Wox.Test`, including all 38 `FuzzyMatcherTest` entries still pass.
2026-03-02 12:45:14 +00:00
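The implementation notes in the pull request description can be illustrated with a small, self-contained sketch. This is not the code from `FuzzyMatching.cs`: the class name `MinSpanMatchSketch`, the method name `FindMinSpanMatch`, and the exact tie-breaking are assumptions made for illustration. It only shows one way a "best start" table and a "parent" table can yield a minimum-span subsequence match in O(n * m) time and space.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; names and tie-breaking are assumptions,
// not the PowerToys implementation.
public static class MinSpanMatchSketch
{
    // Returns the indexes in `text` of a minimum-span, case-insensitive
    // subsequence match of `search`, or an empty list if there is none.
    public static List<int> FindMinSpanMatch(string text, string search)
    {
        var result = new List<int>();
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(search) || search.Length > text.Length)
        {
            return result;
        }

        string t = text.ToLowerInvariant();
        string s = search.ToLowerInvariant();
        int n = t.Length;
        int m = s.Length;

        // bestStart[k, i]: latest possible start index of a match of s[0..k]
        // whose last character s[k] is matched at t[i]; -1 means "no match".
        // parent[k, i]: index in t where s[k - 1] was matched on that path.
        var bestStart = new int[m, n];
        var parent = new int[m, n];
        for (int k = 0; k < m; k++)
        {
            for (int i = 0; i < n; i++)
            {
                bestStart[k, i] = -1;
                parent[k, i] = -1;
            }
        }

        // Base case: every occurrence of s[0] starts a match at itself.
        for (int i = 0; i < n; i++)
        {
            if (t[i] == s[0])
            {
                bestStart[0, i] = i;
            }
        }

        // Transition: extend the latest-starting match of s[0..k-1] that
        // ends strictly before position i.
        for (int k = 1; k < m; k++)
        {
            int prevStart = -1;
            int prevIndex = -1;
            for (int i = 0; i < n; i++)
            {
                if (prevStart != -1 && t[i] == s[k])
                {
                    bestStart[k, i] = prevStart;
                    parent[k, i] = prevIndex;
                }

                // Make position i available to later positions only.
                if (bestStart[k - 1, i] > prevStart)
                {
                    prevStart = bestStart[k - 1, i];
                    prevIndex = i;
                }
            }
        }

        // Pick the end position with the smallest span (end - start).
        int bestEnd = -1;
        int bestSpan = int.MaxValue;
        for (int i = 0; i < n; i++)
        {
            if (bestStart[m - 1, i] != -1 && i - bestStart[m - 1, i] < bestSpan)
            {
                bestSpan = i - bestStart[m - 1, i];
                bestEnd = i;
            }
        }

        if (bestEnd == -1)
        {
            return result;
        }

        // Walk the parent table backwards, then reverse into ascending order.
        for (int k = m - 1, i = bestEnd; k >= 0; k--)
        {
            result.Add(i);
            i = parent[k, i];
        }

        result.Reverse();
        return result;
    }
}
```

In this sketch, calling `MinSpanMatchSketch.FindMinSpanMatch("a_b_ab", "ab")` selects the adjacent pair at indexes 4 and 5 rather than the wider match at indexes 0 and 2, because the running "latest start" is carried forward as the candidate text is scanned.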
// Initialize tables.
for (int k = 0; k < m; k++)
{
for (int i = 0; i < n; i++)
{
bestStart[k, i] = -1;
}
}
// Base case: match the first character of the search string s[0].
for (int i = 0; i < n; i++)
{
    if (tLower[i] == sLower[0])
    {
        bestStart[0, i] = i;
        parent[0, i] = -1;
    }
}
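// Illustrative example (hypothetical inputs, not taken from the original source):
// with t = "abcab" and s = "ab", s[0] = 'a' occurs at t[0] and t[3], so the loop
// above sets bestStart[0, 0] = 0 and bestStart[0, 3] = 3, each with parent -1.
// Every other cell in row 0 keeps whatever initial value it was given earlier.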
// Dynamic programming step: extend matches for the remaining characters s[1..m-1].
for (int k = 1; k < m; k++)
{
    int currentMaxStart = -1;
    int currentParentIndex = -1;
    for (int i = 0; i < n; i++)
    {
        // 1. Try to match s[k] at t[i].
        // We must use a valid start from the previous row (k-1) that appeared BEFORE i.
        // 'currentMaxStart' holds the best start value from indices 0 to i-1.
        if (tLower[i] == sLower[k])
        {
So matching "eeeeee" against a 99-character string took 25 MB of memory and 71 milliseconds to compute. For the new algorithm, this is reduced down to <5KB and 0.002 milliseconds. Even for a three-character repetition, the new algorithm is >150x faster with <1% of the allocations. ## Real world example **Before (results still pending after more than a minute):** <img width="837" height="336" alt="Image" src="https://github.com/user-attachments/assets/c4c3ae04-6a47-40b9-a2a4-7a4da169f7d5" /> **After (instantaneous results):** <img width="829" height="444" alt="image" src="https://github.com/user-attachments/assets/055fc4a6-f34f-4bed-a12c-408b52274de2" /> ## Validation Steps Performed The verification tests in the benchmark project pass, with results identical to the original across a number of test cases, including the pathological cases identified earlier and edge cases such as single-character searches. All unit tests under `Wox.Test`, including all 38 `FuzzyMatcherTest` entries still pass.
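The approach described under "Implementation notes" can be sketched as a small standalone method. This is an illustrative sketch only, not the PowerToys implementation: the names `MinSpanSketch` and `FindMinSpanMatch` are hypothetical, the table initialisation and first-row handling are assumptions, and the comparison is case-sensitive for simplicity.

```csharp
using System;
using System.Collections.Generic;

public static class MinSpanSketch
{
    // Hypothetical helper: returns the indexes in `text` of a minimum-span,
    // in-order match of `searchText`, or an empty list when there is no match.
    public static List<int> FindMinSpanMatch(string text, string searchText)
    {
        int n = text.Length;
        int m = searchText.Length;
        var result = new List<int>();
        if (m == 0 || m > n)
        {
            return result;
        }

        // bestStart[k, i]: latest start index of a match of searchText[0..k]
        // whose last character sits at text[i], or -1 if none exists.
        // parent[k, i]: index in text used for searchText[k - 1] in that match.
        var bestStart = new int[m, n];
        var parent = new int[m, n];
        for (int k = 0; k < m; k++)
        {
            for (int i = 0; i < n; i++)
            {
                bestStart[k, i] = -1;
                parent[k, i] = -1;
            }
        }

        // Row 0: a one-character match starts where it ends.
        for (int i = 0; i < n; i++)
        {
            if (text[i] == searchText[0])
            {
                bestStart[0, i] = i;
            }
        }

        for (int k = 1; k < m; k++)
        {
            int currentMaxStart = -1;
            int currentParentIndex = -1;
            for (int i = 0; i < n; i++)
            {
                // Extend the best predecessor that ends strictly before i.
                if (text[i] == searchText[k] && currentMaxStart != -1)
                {
                    bestStart[k, i] = currentMaxStart;
                    parent[k, i] = currentParentIndex;
                }

                // Make position i available as a predecessor for later positions.
                if (bestStart[k - 1, i] > currentMaxStart)
                {
                    currentMaxStart = bestStart[k - 1, i];
                    currentParentIndex = i;
                }
            }
        }

        // Choose the end position with the smallest span (end - start).
        int bestEnd = -1;
        int bestSpan = int.MaxValue;
        for (int i = 0; i < n; i++)
        {
            if (bestStart[m - 1, i] != -1 && i - bestStart[m - 1, i] < bestSpan)
            {
                bestSpan = i - bestStart[m - 1, i];
                bestEnd = i;
            }
        }

        // Walk the parent links backwards, then reverse into ascending order.
        for (int k = m - 1, i = bestEnd; k >= 0 && i != -1; i = parent[k, i], k--)
        {
            result.Add(i);
        }

        result.Reverse();
        return result;
    }
}
```

For the `"eeee"` in `"eeeeeeee"` case mentioned above, `MinSpanSketch.FindMinSpanMatch("eeeeeeee", "eeee")` fills two 4 x 8 integer tables and returns the indexes 0, 1, 2, 3, without ever enumerating the 70 combinations.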
2026-03-02 12:45:14 +00:00
if (currentMaxStart != -1)
{
bestStart[k, i] = currentMaxStart;
parent[k, i] = currentParentIndex;
}
}
// 2. Maintain the dominating predecessor for the next column.
// We only keep the match with the latest start index, as it strictly dominates
// all earlier-starting matches for the purpose of minimizing the match span.
if (bestStart[k - 1, i] > currentMaxStart)
{
currentMaxStart = bestStart[k - 1, i];
currentParentIndex = i;
}
}
}
// Select the ending position that minimizes span.
int bestEndIndex = -1;
int maxScore = int.MinValue;
// Score logic: -(LastIndex - StartIndex).
// We want to Maximize Score => Minimize Span.
for (int i = 0; i < n; i++)
{
    if (bestStart[m - 1, i] != -1)
    {
        int start = bestStart[m - 1, i];
        int score = -(i - start);
        if (score > maxScore)
        {
            maxScore = score;
            bestEndIndex = i;
        }
    }
}
if (bestEndIndex == -1)
{
    return [];
}
// Reconstruct only the winning path.
var result = new List<int>(m);
int curr = bestEndIndex;
for (int k = m - 1; k >= 0; k--)
{
    result.Add(curr);
    curr = parent[k, curr];
}
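// Worked example (illustrative only). Assumptions not confirmed by this excerpt:
// n = text.Length, m = searchText.Length; bestStart[j, i] holds the start index of
// the tightest match of searchText[0..j] ending at text[i], or -1 if there is none;
// parent[j, i] holds the text index matched to the previous search character.
//   text = "abcab", searchText = "ab"
//   bestStart[1, 1] = 0 gives score -1; bestStart[1, 4] = 3 also gives score -1.
//   The comparison above is strict, so the first minimal span wins: bestEndIndex = 1.
//   The reconstruction loop then collects the indexes back to front: result = [1, 0].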
[Run] Replace WindowWalker's brute-force fuzzy matching algorithm with optimal DP solution (#44551) ## Summary of the Pull Request Window Walker's fuzzy string matching algorithm exhibits exponential memory usage and execution time when given inputs containing repeated characters or phrases. When a user has several windows open with long titles (such as browser windows), it is straightforward to trigger a pathological case which uses up gigabytes of memory and freezes the UI. This is exacerbated by Run's lack of thread pruning, meaning work triggered by older keystrokes consumes CPU and memory until completion. <!-- Please review the items on the PR checklist before submitting--> ## PR Checklist - [x] Closes: #44546 - [x] Closes: #44184 - [ ] **Communication:** I've discussed this with core contributors already. If the work hasn't been agreed, this work might be rejected - [ ] **Tests:** Added/updated and all pass - [ ] **Localization:** All end-user-facing strings can be localized - [ ] **Dev docs:** Added/updated - [ ] **New binaries:** Added on the required places - [ ] [JSON for signing](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ESRPSigning_core.json) for new binaries - [ ] [WXS for installer](https://github.com/microsoft/PowerToys/blob/main/installer/PowerToysSetup/Product.wxs) for new binaries and localization folder - [ ] [YML for CI pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ci/templates/build-powertoys-steps.yml) for new test projects - [ ] [YML for signed pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/release.yml) - [ ] **Documentation updated:** If checked, please file a pull request on [our docs repo](https://github.com/MicrosoftDocs/windows-uwp/tree/docs/hub/powertoys) and link it here: #xxx ## Detailed Description of the Pull Request / Additional comments The existing algorithm in `FuzzyMatching.cs` is greedy, creating all possible matching combinations of the search string within the 
candidate via its `GetAllMatchIndexes()` method. After this, it selects the best match and discards the others. This may be considered reasonable for small search strings, but it causes a combinatorial explosion when there are multiple possible matches where characters or substrings repeat, even when the search string is small.

The current brute-force algorithm has time complexity of **O(n * m * C(n,m))** where **C(n,m)** = **n!/(m!(n-m)!)** and space complexity of **O(C(n,m) * m)** because it stores all possible match combinations before choosing the best.

For example, matching `"eeee"` in `"eeeeeeee"` creates **C(8,4)** = **70** match combinations, which stores 70 lists with 4 integers each, plus overhead from the LINQ-based list copying and appending:

```csharp
var tempList = results
    .Where(x => x.Count == secondIndex && x[x.Count - 1] < firstIndex)
    .Select(x => x.ToList()) // Creates a full copy of each matching path
    .ToList(); // Materializes all copies
results.AddRange(tempList); // Adds lists to results
```

Each potential sub-match may be recalculated many times.

Window Walker queries across all window titles, so this problem is magnified if the search text happens to match multiple titles and/or if a search string containing a single repeated character is used. This is especially problematic for browser windows, where titles may be long, and similarly for Explorer windows with longer paths.

## Proposed solution

The solution presented here is to use a dynamic programming algorithm which finds the optimal match directly without generating all possibilities.

In terms of complexity, the new algorithm benefits from a single pass through its DP table and only has to store two integer arrays which are sized proportionally to the search and candidate text string lengths; so **O(n * m)** for both time and space, i.e. polynomial instead of exponential.

Scoring is equivalent between the old and new algorithms, based strictly on the minimum match span within the candidate string.

## Implementation notes

The new algorithm tracks the best start index for matches ending at each position, eliminating the need to store all possible paths. By storing the "latest best match so far" as you scan through the search text, you are guaranteed to minimise the span length.

To recreate the best match, a separate table of parent indexes is kept and iterated backwards once the DP step is complete. Reversing this provides you with the same result (or equivalent if there are multiple best matches) as the original algorithm.

For this "minimum-span" fuzzy matching method, this should be optimal as it only scans once and storage is proportional to the search and candidate strings only.

## Benchmarks

A verification and benchmarking suite is here: https://github.com/daverayment/WindowWalkerBench

Results from comparing the old and new algorithms are here: https://docs.google.com/spreadsheets/d/1eXmmnN2eI3774QxXXyx1Dv4SKu78U96q28GYnpHT0_8/edit?usp=sharing

| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Old_Normal | 4,034.4 ns | 220.94 ns | 647.98 ns | 1.02 | 0.23 | 1.9760 | - | - | 8.09 KB | 1.00 |
| New_Normal | 804.5 ns | 24.29 ns | 70.47 ns | 0.20 | 0.04 | 0.4339 | - | - | 1.77 KB | 0.22 |
| Old_Repetitive | 7,624.7 ns | 318.06 ns | 912.57 ns | 1.94 | 0.38 | 3.7079 | - | - | 15.16 KB | 1.87 |
| New_Repetitive | 2,714.6 ns | 109.03 ns | 318.03 ns | 0.69 | 0.13 | 1.6403 | - | - | 6.72 KB | 0.83 |
| Old_Explosion | 881,443,209.3 ns | 26,273,980.96 ns | 76,225,588.43 ns | 223,872.87 | 39,357.31 | 50000.0000 | 27000.0000 | 5000.0000 | 351885.11 KB | 43,518.16 |
| New_Explosion | 3,225.4 ns | 111.98 ns | 315.84 ns | 0.82 | 0.15 | 1.7738 | - | - | 7.26 KB | 0.90 |
| Old_Explosion_8 | 460,153,862.6 ns | 18,744,417.95 ns | 54,974,137.06 ns | 116,871.93 | 22,719.87 | 25000.0000 | 14000.0000 | 3000.0000 | 173117.13 KB | 21,409.65 |
| New_Explosion_8 | 2,958.3 ns | 78.16 ns | 230.45 ns | 0.75 | 0.13 | 1.5793 | - | - | 6.46 KB | 0.80 |
| Old_Explosion_7 | 189,069,384.8 ns | 3,774,916.46 ns | 6,202,296.49 ns | 48,020.68 | 7,501.98 | 11000.0000 | 6333.3333 | 2000.0000 | 71603.96 KB | 8,855.37 |
| New_Explosion_7 | 2,667.5 ns | 117.69 ns | 337.68 ns | 0.68 | 0.13 | 1.3924 | - | - | 5.7 KB | 0.70 |
| Old_Explosion_6 | 71,960,114.8 ns | 1,757,017.15 ns | 5,125,301.87 ns | 18,276.75 | 3,083.86 | 4500.0000 | 2666.6667 | 1333.3333 | 25515.96 KB | 3,155.60 |
| New_Explosion_6 | 2,232.5 ns | 72.65 ns | 202.52 ns | 0.57 | 0.10 | 1.1978 | - | - | 4.91 KB | 0.61 |
| Old_Explosion_5 | 9,121,126.4 ns | 180,744.42 ns | 228,583.84 ns | 2,316.62 | 358.55 | 1000.0000 | 968.7500 | 484.3750 | 7630.49 KB | 943.67 |
| New_Explosion_5 | 1,917.3 ns | 48.63 ns | 133.95 ns | 0.49 | 0.08 | 1.0109 | - | - | 4.13 KB | 0.51 |
| Old_Explosion_4 | 2,489,593.2 ns | 82,937.33 ns | 236,624.90 ns | 632.32 | 113.96 | 281.2500 | 148.4375 | 74.2188 | 1729.71 KB | 213.92 |
| New_Explosion_4 | 1,598.3 ns | 51.92 ns | 152.28 ns | 0.41 | 0.07 | 0.8163 | - | - | 3.34 KB | 0.41 |
| Old_Explosion_3 | 202,814.0 ns | 7,684.44 ns | 22,293.96 ns | 51.51 | 9.72 | 72.7539 | 0.2441 | - | 298.13 KB | 36.87 |
| New_Explosion_3 | 1,222.5 ns | 26.07 ns | 76.45 ns | 0.31 | 0.05 | 0.6275 | - | - | 2.57 KB | 0.32 |
| Old_Subsequence | 419,417.7 ns | 8,308.97 ns | 22,178.33 ns | 106.53 | 17.23 | 266.6016 | 0.9766 | - | 1090.05 KB | 134.81 |
| New_Subsequence | 2,501.9 ns | 80.91 ns | 233.43 ns | 0.64 | 0.11 | 1.3542 | - | - | 5.55 KB | 0.69 |

(Where "Old_Explosion" is "e" repeated 9 times. Times are in nanoseconds, i.e. millionths of a millisecond.)

It is worth noting that the results show a **single string match**. So matching "eeeeee" against a 99-character string took 25 MB of memory and 72 milliseconds to compute. For the new algorithm, this is reduced to <5 KB and 0.002 milliseconds. Even for a three-character repetition, the new algorithm is >150x faster with <1% of the allocations.

## Real world example

**Before (results still pending after more than a minute):**

<img width="837" height="336" alt="Image" src="https://github.com/user-attachments/assets/c4c3ae04-6a47-40b9-a2a4-7a4da169f7d5" />

**After (instantaneous results):**

<img width="829" height="444" alt="image" src="https://github.com/user-attachments/assets/055fc4a6-f34f-4bed-a12c-408b52274de2" />

## Validation Steps Performed

The verification tests in the benchmark project pass, with results identical to the original across a number of test cases, including the pathological cases identified earlier and edge cases such as single-character searches.

All unit tests under `Wox.Test`, including all 38 `FuzzyMatcherTest` entries, still pass.
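
## Illustrative sketch

The minimum-span idea from the implementation notes can be sketched as follows. This is a simplified illustration, not the code from the change: the class and method names (`MinSpanSketch`, `FindBestMatch`) are hypothetical, case folding via `char.ToLowerInvariant` is an assumption, and instead of the parent-index table described above it keeps one rolling array of start indexes and recovers the matched indexes with a forward scan inside the winning window. It requires .NET 6 or later for `ArgumentNullException.ThrowIfNull`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class MinSpanSketch
{
    // Returns ascending indexes in text that spell searchText with the smallest
    // span, or an empty list when searchText is not a subsequence of text.
    public static List<int> FindBestMatch(string text, string searchText)
    {
        ArgumentNullException.ThrowIfNull(text);
        ArgumentNullException.ThrowIfNull(searchText);

        var result = new List<int>();
        int m = searchText.Length;
        if (m == 0 || m > text.Length)
        {
            return result;
        }

        // start[j]: latest start index of a match of searchText[0..j] that ends
        // at or before the current text position, or -1 if none exists yet.
        var start = new int[m];
        Array.Fill(start, -1);

        int bestStart = -1;
        int bestEnd = -1;
        for (int i = 0; i < text.Length; i++)
        {
            char c = char.ToLowerInvariant(text[i]);

            // Walk j downwards so start[j - 1] still describes positions before i.
            for (int j = m - 1; j >= 0; j--)
            {
                if (c != char.ToLowerInvariant(searchText[j]))
                {
                    continue;
                }

                start[j] = j == 0 ? i : start[j - 1];

                // A full match ends at i; keep it only if its span is strictly smaller.
                if (j == m - 1 && start[j] >= 0
                    && (bestStart < 0 || i - start[j] < bestEnd - bestStart))
                {
                    bestStart = start[j];
                    bestEnd = i;
                }
            }
        }

        if (bestStart < 0)
        {
            return result;
        }

        // Recover the indexes by matching greedily inside the winning window.
        for (int i = bestStart, j = 0; i <= bestEnd && j < m; i++)
        {
            if (char.ToLowerInvariant(text[i]) == char.ToLowerInvariant(searchText[j]))
            {
                result.Add(i);
                j++;
            }
        }

        return result;
    }
}
```

With this sketch, searching for `"abc"` in `"a-b-c abc"` selects the tight window at the end of the string rather than the spread-out match at the start. Passing those indexes to `CalculateScoreForMatches` would score the match by the negated distance between its first and last index, so the tighter window ranks higher. The work done is one inner loop of at most m steps per candidate character, which is where the **O(n * m)** time bound comes from; the rolling array itself needs only O(m) extra space.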
result.Reverse();
return result;
}
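/// <summary>
/// Calculates the score for a set of match indexes. The score is the negated
/// distance between the first and last matched index, so tighter matches score higher.
/// </summary>
/// <param name="matches">the indexes of the matched characters, expected in ascending order</param>
/// <returns>an integer representing the score; -10000 when the computed span is zero (for example, fewer than two matches)</returns>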
internal static int CalculateScoreForMatches(List<int> matches)
{
ArgumentNullException.ThrowIfNull(matches);
var score = 0;
for (int currentIndex = 1; currentIndex < matches.Count; currentIndex++)
{
var previousIndex = currentIndex - 1;
score -= matches[currentIndex] - matches[previousIndex];
}
return score == 0 ? -10000 : score;
}
}
}