fix: Improve Unicode normalization and add regex metachar tests (#44944)

Enhanced SanitizeAndNormalize to handle Unicode normalization more
robustly, ensuring correct buffer sizing and error handling. Added unit
tests for regex metacharacters `$` and `^` to verify correct replacement
behavior at string boundaries. Improves Unicode support and test
coverage for regex edge cases.

<!-- Enter a brief description/summary of your PR here. What does it
fix/what does it change/how was it tested (even manually, if necessary)?
-->
## Summary of the Pull Request

<!-- Please review the items on the PR checklist before submitting-->
## PR Checklist

- [x] Closes: #44942 #44892
<!-- - [ ] Closes: #yyy (add separate lines for additional resolved
issues) -->
- [ ] **Communication:** I've discussed this with core contributors
already. If the work hasn't been agreed, this work might be rejected
- [ ] **Tests:** Added/updated and all pass
- [ ] **Localization:** All end-user-facing strings can be localized
- [ ] **Dev docs:** Added/updated
- [ ] **New binaries:** Added on the required places
- [ ] [JSON for
signing](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ESRPSigning_core.json)
for new binaries
- [ ] [WXS for
installer](https://github.com/microsoft/PowerToys/blob/main/installer/PowerToysSetup/Product.wxs)
for new binaries and localization folder
- [ ] [YML for CI
pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/ci/templates/build-powertoys-steps.yml)
for new test projects
- [ ] [YML for signed
pipeline](https://github.com/microsoft/PowerToys/blob/main/.pipelines/release.yml)
- [ ] **Documentation updated:** If checked, please file a pull request
on [our docs
repo](https://github.com/MicrosoftDocs/windows-uwp/tree/docs/hub/powertoys)
and link it here: #xxx

<!-- Provide a more detailed description of the PR, other things fixed,
or any additional comments/features here -->
## Detailed Description of the Pull Request / Additional comments

<!-- Describe how you validated the behavior. Add automated tests
wherever possible, but list manual validation steps taken as well -->
## Validation Steps Performed

---------

Co-authored-by: Yu Leng <yuleng@microsoft.com>
This commit is contained in:
moooyo
2026-01-23 14:43:21 +08:00
committed by GitHub
parent 60b8419366
commit d192672c74
3 changed files with 44 additions and 7 deletions

View File

@@ -33,23 +33,27 @@ static std::wstring SanitizeAndNormalize(const std::wstring& input)
// Normalize to NFC (Precomposed).
// Get the size needed for the normalized string, including null terminator.
int size = NormalizeString(NormalizationC, sanitized.c_str(), -1, nullptr, 0);
if (size <= 0)
int sizeEstimate = NormalizeString(NormalizationC, sanitized.c_str(), -1, nullptr, 0);
if (sizeEstimate <= 0)
{
return sanitized; // Return unaltered if normalization fails.
}
// Perform the normalization.
std::wstring normalized;
normalized.resize(size);
NormalizeString(NormalizationC, sanitized.c_str(), -1, &normalized[0], size);
normalized.resize(sizeEstimate);
int actualSize = NormalizeString(NormalizationC, sanitized.c_str(), -1, &normalized[0], sizeEstimate);
// Remove the explicit null terminator added by NormalizeString.
if (!normalized.empty() && normalized.back() == L'\0')
if (actualSize <= 0)
{
normalized.pop_back();
// Normalization failed, return sanitized string.
return sanitized;
}
// Resize to actual size minus the null terminator.
// actualSize includes the null terminator when input length is -1.
normalized.resize(static_cast<size_t>(actualSize) - 1);
return normalized;
}

View File

@@ -695,6 +695,38 @@ TEST_METHOD(VerifyUnicodeAndWhitespaceNormalizationRegex)
VerifyNormalizationHelper(UseRegularExpressions);
}
TEST_METHOD(VerifyRegexMetacharacterDollarSign)
{
CComPtr<IPowerRenameRegEx> renameRegEx;
Assert::IsTrue(CPowerRenameRegEx::s_CreateInstance(&renameRegEx) == S_OK);
DWORD flags = UseRegularExpressions;
Assert::IsTrue(renameRegEx->PutFlags(flags) == S_OK);
PWSTR result = nullptr;
Assert::IsTrue(renameRegEx->PutSearchTerm(L"$") == S_OK);
Assert::IsTrue(renameRegEx->PutReplaceTerm(L"_end") == S_OK);
unsigned long index = {};
Assert::IsTrue(renameRegEx->Replace(L"test.txt", &result, index) == S_OK);
Assert::AreEqual(L"test.txt_end", result);
CoTaskMemFree(result);
}
TEST_METHOD(VerifyRegexMetacharacterCaret)
{
CComPtr<IPowerRenameRegEx> renameRegEx;
Assert::IsTrue(CPowerRenameRegEx::s_CreateInstance(&renameRegEx) == S_OK);
DWORD flags = UseRegularExpressions;
Assert::IsTrue(renameRegEx->PutFlags(flags) == S_OK);
PWSTR result = nullptr;
Assert::IsTrue(renameRegEx->PutSearchTerm(L"^") == S_OK);
Assert::IsTrue(renameRegEx->PutReplaceTerm(L"start_") == S_OK);
unsigned long index = {};
Assert::IsTrue(renameRegEx->Replace(L"test.txt", &result, index) == S_OK);
Assert::AreEqual(L"start_test.txt", result);
CoTaskMemFree(result);
}
#ifndef TESTS_PARTIAL
};
}