Compare commits

...

5 Commits

Author | SHA1 | Message | Date
Gordon Lam (SH) | 0371b4134c | Add correct header ps1 comment | 2025-09-30 13:20:04 +08:00
Gordon Lam (SH) | e963125210 | Fix the instruction and dump prs | 2025-09-30 11:51:30 +08:00
Gordon Lam (SH) | 235c3ef3e6 | Change the instruction for step 5 to explicitly using SampleOutput.md as sample | 2025-09-29 12:19:15 +08:00
Gordon Lam (SH) | bb67ae4068 | Instruct better for agent doing | 2025-09-29 12:19:15 +08:00
Gordon Lam (SH) | 39646748a0 | Initial draft | 2025-09-29 12:19:14 +08:00
7 changed files with 694 additions and 0 deletions

@@ -0,0 +1,76 @@
## Background
This document describes how to collect pull requests for a milestone, request a GitHub Copilot code review for each, and produce release-notes summaries grouped by label.
## Agent-mode execution policy (important)
- By default, do NOT run terminal commands or PowerShell scripts other than the ps1 scripts in this folder. Perform all collection, parsing, grouping, and summarization entirely in Agent mode using available files and MCP capabilities.
- Only execute existing scripts if the user explicitly asks you to (opt-in). Otherwise, assume the input artifacts (`milestone_prs.json`, `sorted_prs.csv`, `grouped_csv/*`) are present or will be provided.
- Do NOT create new scripts unless requested and justified.
## Prerequisites
- Windows with PowerShell 7+ (pwsh)
- GitHub CLI installed and authenticated to the target repo
  - gh version that supports Copilot review requests
  - Logged in: `gh auth login` (ensure repo scope)
- Access to the repository configured in the scripts (default: `microsoft/PowerToys`)
- GitHub Copilot code review enabled for the org/repo (required for requesting reviews)
- 'MCP Server: github-remote' installed (see [github-mcp-server](https://github.com/github/github-mcp-server))
## Files in this repo (overview)
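- `dump-prs-information.ps1`: Fetches merged PRs for a milestone and outputs `milestone_prs.json` and `sorted_prs.csv`
  - CSV columns: `Id, Title, Labels, Author, Url, Body, CopilotSummary`
- `dump-prs-since-commit.ps1`: Produces the same outputs, but collects the PRs merged between two commits instead of by milestone
- `group-prs-by-label.ps1`: Splits `sorted_prs.csv` into one CSV per label combination under `grouped_csv/`
- `diff_prs.ps1`: Creates an incremental CSV by diffing two CSVs (for example, when more PRs are cherry-picked to stable)
- `MemberList.md`: Internal contributors list (used to decide when to add external thanks)
- `SampleOutput.md`: Example formatting for summary content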
## Step-by-step
1) Run `dump-prs-information.ps1` to export PRs for the target milestone (on the initial run, `CopilotSummary` is likely empty)
   - Pass the values through the script's parameters:
     - `-Repo` (e.g., `microsoft/PowerToys`, which is the default)
     - `-Milestone` (milestone title exactly as in GitHub, e.g., `PowerToys 0.95`)
   - Run the script in PowerShell; it will generate `milestone_prs.json` and `sorted_prs.csv`.
2) Request Copilot reviews for each PR listed in the CSV in Agent mode (MUST NOT generate or run any ps1)
   - You must use the "MCP Server: github-remote" tools in the current Agent mode to request Copilot reviews for all PR Ids in `sorted_prs.csv`.
3) Run `dump-prs-information.ps1` again
   - This refresh collects the longest Copilot review body into the `CopilotSummary` column in `sorted_prs.csv`.
4) Run `group-prs-by-label.ps1` to generate `grouped_csv/`
5) Summarize PRs into per-label Markdown files in Agent mode (MUST NOT generate or run any script, whether in the terminal or as a ps1)
   - Read the CSV files in the `grouped_csv` folder one by one.
   - For each label group, create a markdown file under a new folder `grouped_md/` (create if missing). File name: sanitized label group name (same pattern as CSV) with `.md` extension. Example: `Area-Build.md`.
   - Each markdown file content must follow the structure below (two sections) and preserve the PR order from the source CSV.
   - Do not embed PR numbers in the bullet list lines; only link them in the table.
   - If re-running, overwrite existing markdown files (idempotent generation).
   - After generation, you should have a 1:1 correspondence between files in `grouped_csv/` and `grouped_md/` (excluding any intentionally skipped groups; document any that are skipped).
   - Generate each summary md file in two parts, as follows:
     1. Markdown list: one concise, user-facing line per PR (no deep technical jargon). Use "Verbed" + "Scenario" + "Impact" as the sentence structure. Use `Title`, `Body`, and `CopilotSummary` as sources.
        - If `Author` is NOT in `**/MemberList.md`, append "Thanks @handle!"; see `**/SampleOutput.md` for an example.
        - Do NOT include PR numbers or IDs in the list line; keep the PR link only in the table described in part 2 below. See `**/SampleOutput.md` for an example.
        - If your confidence that you have enough information to summarize according to the guideline above is < 70%, write: `Human Summary Needed: <PR full link>` on that line.
     2. Three-column table (in the same PR order):
        - Column 1: The concise, user-facing summary (the "cut version")
        - Column 2: PR link
        - Column 3: Confidence (e.g., `High/Medium/Low`), plus the reasoning if confidence is < 70%
6) Based on the generated `grouped_md/*.md` files, update the repo root's `README.md`, following these guidelines:
   a. Replace all versioned references in `README.md`:
      - Bump current release heading (e.g. **Version 0.xx**) by +0.01.
      - Shift link references: previous `[github-current-release-work]` becomes old version; increment `[github-next-release-work]` to point to the following milestone.
      - Update download asset filenames (e.g. `PowerToysSetup-0.94.0-...` → `PowerToysSetup-0.95.0-...`).
   b. Build the What's New content from `grouped_md`:
      - Combine `Area-Build` and `Area-Tests` entries under a single `Development` subsection (keep bullet order from CSV).
      - Each other `Product-*` group gets its own subsection titled by the module name.
      - Order subsections alphabetically by their heading text, with **Highlights** always first and **Development** always last (e.g., Environment Variables, File Locksmith, Find My Mouse, ... , ZoomIt, Development).
      - Copy bullet lines verbatim from the corresponding `grouped_md` files (preserve punctuation and any trailing `Thanks @handle!`). Do NOT add, remove, or reevaluate thanks in the README stage.
   c. Highlights: choose up to 10 bullets focused on user-visible feature additions or impactful fixes (avoid purely internal refactors). Use pattern: `Module/Feature <past-tense verb> <scenario> <impact>`.
   d. Keep wording concise (aim for one line per bullet), no PR numbers, no deep implementation details.
   e. After updating, verify that the total highlight count is ≤ 10 and that no internal contributor is thanked.
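
The ordering rule in step 6b and the final check in step 6e can be made concrete with a short sketch. This is illustrative only: the function names `Get-OrderedHeadings` and `Test-ReleaseNotesBullets` are hypothetical and not part of this folder's scripts, the second sample bullet is made up for the demo, and in Agent mode the same logic is meant to be applied by reading the files rather than by running a script.

```powershell
# Illustrative helpers only; these function names are hypothetical and not part of this folder's scripts.

# Step 6b: Highlights first, Development last, everything else alphabetical.
function Get-OrderedHeadings {
    param([string[]]$Headings)
    $pinned = 'Highlights', 'Development'
    $middle = @($Headings | Where-Object { $_ -notin $pinned } | Sort-Object)
    $first  = @($Headings | Where-Object { $_ -eq 'Highlights' })
    $last   = @($Headings | Where-Object { $_ -eq 'Development' })
    $first + $middle + $last
}

# Step 6e: report a problem when there are more than 10 highlights or when a
# "Thanks @handle!" names someone from the member list.
function Test-ReleaseNotesBullets {
    param(
        [string[]]$HighlightLines,
        [string[]]$AllLines,
        [string[]]$Members
    )
    if ($HighlightLines.Count -gt 10) {
        "Too many highlights: $($HighlightLines.Count) (limit is 10)"
    }
    foreach ($line in $AllLines) {
        # Matches both 'Thanks @handle!' and 'Thanks [@handle](url)!'
        foreach ($m in [regex]::Matches($line, 'Thanks \[?@([A-Za-z0-9-]+)')) {
            $handle = $m.Groups[1].Value
            # -contains compares strings case-insensitively
            if ($Members -contains $handle) {
                "Internal contributor thanked: @$handle"
            }
        }
    }
}

Get-OrderedHeadings -Headings 'ZoomIt', 'Development', 'File Locksmith', 'Highlights', 'Environment Variables'

$members = 'crutkas', 'niels9001'   # in practice, the lines of MemberList.md
$bullets = @(
    '- Fixed Alt+Left Arrow navigation not working when search box contains text. Thanks [@jiripolasek](https://github.com/jiripolasek)!'
    '- Made-up line for the demo. Thanks [@crutkas](https://github.com/crutkas)!'
)
Test-ReleaseNotesBullets -HighlightLines $bullets -AllLines $bullets -Members $members
```

With the sample input, the first call lists Highlights, Environment Variables, File Locksmith, ZoomIt, Development in that order, and the second reports only the made-up `@crutkas` line, because `jiripolasek` is not in the sample member list.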
## Notes and conventions
- Terminal usage: Disabled by default. Do NOT run terminal commands or ps1 scripts unless the user explicitly instructs you to.
- Do NOT generate/add new ps1 until instructed (and explain why a new script is needed).
- Label filtering in `dump-prs-information.ps1` currently keeps labels matching: `Product-*`, `Area-*`, `Github*`, `*Plugin`, `Issue-*`.
- CSV columns are single-line (line breaks removed) for easier processing.
- Keep PRs in the same order as in `sorted_prs.csv` when building summaries.
- Sanitize filenames: replace spaces with `-`, strip or replace characters that are invalid on Windows (`<>:"/\|?*`).
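
As a rough illustration of the filename rule above, the sketch below applies the same replacements as `ConvertTo-SafeFileName` in `group-prs-by-label.ps1`, but leaves out that function's 120-character truncation and `Unnamed` fallback. The name `ConvertTo-SketchFileName` and the second sample label are hypothetical; this is a reading aid, not a new script to add to the folder.

```powershell
# Hypothetical helper name; a simplified illustration of the sanitization rule.
function ConvertTo-SketchFileName {
    param([Parameter(Mandatory = $true)][string]$Label)
    $s = $Label -replace '[<>:"/\\|?*]', '-'   # characters that are invalid on Windows
    $s = $s -replace '\s+', '-'                # spaces to dashes
    $s = $s -replace '-{2,}', '-'              # collapse repeated dashes
    $s.Trim('-')                               # no leading or trailing dash
}

ConvertTo-SketchFileName -Label 'Area-Build'          # Area-Build
ConvertTo-SketchFileName -Label 'Some Label: A/B?'    # Some-Label-A-B
```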

@@ -0,0 +1,26 @@
cinnamon-msft
craigloewen-msft
niels9001
dhowett
yeelam-gordon
jamrobot
lei9444
shuaiyuanxx
moooyo
haoliuu
chenmy77
chemwolf6922
yaqingmi
zhaoqpcn
urnotdfs
zhaopy536
wang563681252
vanzue
zadjii-msft
khmyznikov
chatasweetie
michaeljolley
Jaylyn-Barbee
zateutsch
crutkas
app/copilot-swe-agent

@@ -0,0 +1,9 @@
- Added mouse button actions so you can choose what left, right, or middle click does. Thanks [@PesBandi](https://github.com/PesBandi)!
- Aligned window styling with current Windows theme for a cleaner look. Thanks [@sadirano](https://github.com/sadirano)!
- Ensured screen readers are notified when the selected item in the list changes for better accessibility.
- Implemented configurable UI test pipeline that can use pre-built official releases instead of building everything from scratch, reducing test execution time from 2+ hours.
- Fixed Alt+Left Arrow navigation not working when search box contains text. Thanks [@jiripolasek](https://github.com/jiripolasek)!

@@ -0,0 +1,100 @@
<#
.SYNOPSIS
Produce an incremental PR CSV containing rows present in a newer full export but absent from a baseline export.
.DESCRIPTION
Compares two previously generated sorted PR CSV files (same schema). Any row whose key column value
(defaults to 'Id') does not exist in the baseline file is emitted to a new incremental CSV, preserving
the original column order. If no new rows are found, an empty CSV (with headers when determinable) is written.
.PARAMETER BaseCsv
Path to the baseline (earlier) PR CSV.
.PARAMETER AllCsv
Path to the newer full PR CSV containing a superset (or equal set) of rows.
.PARAMETER OutCsv
Path to write the incremental CSV containing only new rows.
.PARAMETER Key
Column name used as unique identifier (defaults to 'Id', the PR number column written by the dump scripts). Must exist in both CSVs.
.EXAMPLE
pwsh ./diff_prs.ps1 -BaseCsv sorted_prs_prev.csv -AllCsv sorted_prs.csv -OutCsv sorted_prs_incremental.csv
.NOTES
Requires: PowerShell 7+, both CSVs with identical column schemas.
Exit code 0 on success (even if zero incremental rows). Throws on missing files.
#>
[CmdletBinding()] param(
[Parameter(Mandatory=$false)][string]$BaseCsv = "./sorted_prs_93_round1.csv",
[Parameter(Mandatory=$false)][string]$AllCsv = "./sorted_prs.csv",
[Parameter(Mandatory=$false)][string]$OutCsv = "./sorted_prs_93_incremental.csv",
[Parameter(Mandatory=$false)][string]$Key = "Id"
)
Set-StrictMode -Version Latest
$ErrorActionPreference = 'Stop'
function Write-Info($m) { Write-Host "[info] $m" -ForegroundColor Cyan }
function Write-Warn($m) { Write-Host "[warn] $m" -ForegroundColor Yellow }
if (-not (Test-Path -LiteralPath $BaseCsv)) { throw "Base CSV not found: $BaseCsv" }
if (-not (Test-Path -LiteralPath $AllCsv)) { throw "All CSV not found: $AllCsv" }
# Load CSVs
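# Wrap in @() so a CSV with zero or one row still yields an array (.Count and [0] stay safe under StrictMode)
$baseRows = @(Import-Csv -LiteralPath $BaseCsv)
$allRows = @(Import-Csv -LiteralPath $AllCsv)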
if (-not $baseRows) { Write-Warn "Base CSV has no rows." }
if (-not $allRows) { Write-Warn "All CSV has no rows." }
# Validate key presence
if ($baseRows -and -not ($baseRows[0].PSObject.Properties.Name -contains $Key)) { throw "Key column '$Key' not found in base CSV." }
if ($allRows -and -not ($allRows[0].PSObject.Properties.Name -contains $Key)) { throw "Key column '$Key' not found in all CSV." }
# Build a set of existing keys from base
$set = New-Object 'System.Collections.Generic.HashSet[string]'
foreach ($row in $baseRows) {
$val = [string]($row.$Key)
if ($null -ne $val) { [void]$set.Add($val) }
}
# Filter rows in AllCsv whose key is not in base (these are the new / incremental rows)
$incremental = @()
foreach ($row in $allRows) {
$val = [string]($row.$Key)
if (-not $set.Contains($val)) { $incremental += $row }
}
# Preserve column order from the All CSV
$columns = @()
if ($allRows.Count -gt 0) {
$columns = $allRows[0].PSObject.Properties.Name
}
try {
if ($incremental.Count -gt 0) {
if ($columns.Count -gt 0) {
$incremental | Select-Object -Property $columns | Export-Csv -LiteralPath $OutCsv -NoTypeInformation -Encoding UTF8
} else {
$incremental | Export-Csv -LiteralPath $OutCsv -NoTypeInformation -Encoding UTF8
}
} else {
# Write a header-only CSV if we know the columns (facilitates downstream tooling expecting a header row).
# Exporting a placeholder object would add a row of empty values, so emit just the quoted header line.
if ($columns.Count -gt 0) {
($columns | ForEach-Object { '"{0}"' -f ($_ -replace '"', '""') }) -join ',' | Set-Content -LiteralPath $OutCsv -Encoding UTF8
} else {
'' | Out-File -LiteralPath $OutCsv -Encoding UTF8
}
}
Write-Info ("Incremental rows: {0}" -f $incremental.Count)
Write-Info ("Output: {0}" -f (Resolve-Path -LiteralPath $OutCsv))
}
catch {
Write-Host "[error] Failed writing output CSV: $_" -ForegroundColor Red
exit 1
}

@@ -0,0 +1,123 @@
<#
.SYNOPSIS
Export merged pull requests for a milestone into JSON and CSV (sorted) with optional Copilot review summarization.
.DESCRIPTION
Uses the GitHub CLI (gh) to list merged PRs for the specified milestone, captures basic metadata,
attempts to obtain a Copilot review summary (choosing the longest Copilot-authored review body),
filters labels to a predefined allow-list, and outputs:
* Raw JSON list (for traceability)
* Sorted CSV (first label alphabetical) used by downstream grouping scripts.
.PARAMETER Repo
GitHub repository in the form 'owner/name'. Default: 'microsoft/PowerToys'.
.PARAMETER Milestone
Exact milestone title (as it appears on GitHub), e.g. 'PowerToys 0.95'.
.PARAMETER OutputJson
Path for raw JSON output. Default: 'milestone_prs.json'.
.PARAMETER OutputCsv
Path for sorted CSV output. Default: 'sorted_prs.csv'.
.EXAMPLE
pwsh ./dump-prs-information.ps1 -Milestone 'PowerToys 0.95'
.EXAMPLE
pwsh ./dump-prs-information.ps1 -Repo microsoft/PowerToys -Milestone 'PowerToys 0.95' -OutputCsv m1.csv
.NOTES
Requires: gh CLI authenticated with repo read access.
This script intentionally does NOT use Set-StrictMode (per current repository guidance for release tooling).
#>
[CmdletBinding()] param(
[Parameter(Mandatory=$false)][string]$Repo = 'microsoft/PowerToys',
[Parameter(Mandatory=$true)][string]$Milestone,
[Parameter(Mandatory=$false)][string]$OutputJson = 'milestone_prs.json',
[Parameter(Mandatory=$false)][string]$OutputCsv = 'sorted_prs.csv'
)
$ErrorActionPreference = 'Stop'
function Write-Info($m){ Write-Host "[info] $m" -ForegroundColor Cyan }
function Write-Warn($m){ Write-Host "[warn] $m" -ForegroundColor Yellow }
function Write-Err($m){ Write-Host "[error] $m" -ForegroundColor Red }
if (-not (Get-Command gh -ErrorAction SilentlyContinue)) { Write-Err "GitHub CLI 'gh' not found in PATH."; exit 1 }
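# === STEP 1: Query PRs from GitHub ===
Write-Info "Fetching merged PRs for milestone '$Milestone' from $Repo ..."
$searchQuery = "milestone:`"$Milestone`""
try {
# Call gh directly instead of via Invoke-Expression so the milestone text is never re-parsed as PowerShell code.
# The --json field list is quoted so it is always passed as a single argument.
gh pr list --repo $Repo --state merged --search $searchQuery --json 'number,title,labels,author,url,body' --limit 200 | Out-File -Encoding UTF8 -FilePath $OutputJson
# Native commands do not throw on failure, so check the exit code explicitly.
if ($LASTEXITCODE -ne 0) { throw "gh exited with code $LASTEXITCODE" }
}
catch {
Write-Err "Failed querying PRs: $_"; exit 1
}
# === STEP 2: Parse JSON and collect Copilot reviews ===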
if (-not (Test-Path -LiteralPath $OutputJson)) { Write-Err "JSON output not created: $OutputJson"; exit 1 }
Write-Info "Parsing JSON ..."
$prs = Get-Content -LiteralPath $OutputJson -Raw | ConvertFrom-Json
if (-not $prs) { Write-Warn "No PRs returned for milestone '$Milestone'"; exit 0 }
$sorted = $prs | Sort-Object { $_.labels[0]?.name }
Write-Info "Fetching Copilot reviews for each PR (longest Copilot-authored body)."
$csvData = $sorted | ForEach-Object {
$prNumber = $_.number
Write-Info "Processing PR #$prNumber ..."
# Get Copilot review for this PR
$copilotOverview = ""
try {
# Call gh directly; Invoke-Expression is not needed for a fixed argument list
$reviewsJson = gh pr view $prNumber --repo $Repo --json reviews | ConvertFrom-Json
# Collect Copilot reviews (match various author logins). Choose the LONGEST body (more content) vs newest.
$copilotReviews = $reviewsJson.reviews | Where-Object {
($_.author.login -eq "github-copilot[bot]" -or
$_.author.login -eq "copilot" -or
$_.author.login -eq "github-copilot" -or
$_.author.login -like "*copilot*") -and
$_.body -and
$_.body.Trim() -ne ""
}
if ($copilotReviews -and $copilotReviews.Count -gt 0) {
$longest = $copilotReviews | Sort-Object { $_.body.Length } -Descending | Select-Object -First 1
$copilotOverview = $longest.body.Replace("`r", "").Replace("`n", " ") -replace '\s+', ' '
Write-Info " Copilot review selected (author=$($longest.author.login) length=$($longest.body.Length))"
} else {
Write-Warn " No Copilot reviews found for PR #$prNumber"
}
}
catch {
Write-Warn " Could not fetch reviews for PR #$prNumber"
}
# Filter labels to only include specific patterns
$filteredLabels = $_.labels | Where-Object {
($_.name -like "Product-*") -or
($_.name -like "Area-*") -or
($_.name -like "Github*") -or
($_.name -like "*Plugin") -or
($_.name -like "Issue-*")
}
$labelNames = ($filteredLabels | ForEach-Object { $_.name }) -join ", "
[PSCustomObject]@{
Id = $_.number
Title = $_.title
Labels = $labelNames
Author = $_.author.login
Url = $_.url
Body = $_.body.Replace("`r", "").Replace("`n", " ") -replace '\s+', ' ' # Make body single-line
CopilotSummary = $copilotOverview
}
}
# === STEP 3: Output CSV ===
Write-Info "Saving CSV to $OutputCsv ..."
$csvData | Export-Csv $OutputCsv -NoTypeInformation -Encoding UTF8
Write-Info "Done. Rows: $($csvData.Count). CSV: $(Resolve-Path -LiteralPath $OutputCsv)"

@@ -0,0 +1,275 @@
<#
.SYNOPSIS
Export merged PR metadata between two commits (exclusive start, inclusive end) to JSON and CSV.
.DESCRIPTION
Identifies merge/squash commits reachable from EndCommit but not StartCommit, extracts PR numbers,
queries GitHub for metadata plus (optionally) Copilot review/comment summaries, filters labels, then
emits a JSON artifact and a sorted CSV (first label alphabetical) analogous to dump-prs-information.ps1.
.PARAMETER StartCommit
Exclusive starting commit (SHA, tag, or ref). Commits AFTER this one are considered.
.PARAMETER EndCommit
Inclusive ending commit (SHA, tag, or ref). Default: HEAD.
.PARAMETER Repo
GitHub repository (owner/name). Default: microsoft/PowerToys.
.PARAMETER OutputCsv
Destination CSV path. Default: sorted_prs.csv.
.PARAMETER OutputJson
Destination JSON path containing raw PR objects. Default: milestone_prs.json.
.EXAMPLE
pwsh ./dump-prs-since-commit.ps1 -StartCommit 0123abcd
.EXAMPLE
pwsh ./dump-prs-since-commit.ps1 -StartCommit 0123abcd -EndCommit 89ef7654 -OutputCsv delta.csv
.NOTES
Requires: git, gh (authenticated). No Set-StrictMode to keep parity with existing release scripts.
#>
[CmdletBinding()]
param(
[Parameter(Mandatory = $true)][string]$StartCommit, # exclusive start (commits AFTER this one)
[string]$EndCommit = "HEAD",
[string]$Repo = "microsoft/PowerToys",
[string]$OutputCsv = "sorted_prs.csv",
[string]$OutputJson = "milestone_prs.json"
)
<#
.SYNOPSIS
Dump merged PR information whose merge commits are reachable from EndCommit but not from StartCommit.
.DESCRIPTION
Uses git rev-list to compute commits in the (StartCommit, EndCommit] range, extracts PR numbers from merge commit messages,
queries GitHub (gh CLI) for details, then outputs a CSV similar to dump-prs-information.ps1.
PR merge commit messages in PowerToys generally contain patterns like:
Merge pull request #12345 from ...
.EXAMPLE
pwsh ./dump-prs-since-commit.ps1 -StartCommit 0123abcd
.EXAMPLE
pwsh ./dump-prs-since-commit.ps1 -StartCommit 0123abcd -EndCommit 89ef7654 -OutputCsv changes.csv
.NOTES
Requires: gh CLI authenticated; git available in working directory (must be inside PowerToys repo clone).
CopilotSummary behavior:
- Attempts to locate the longest GitHub Copilot authored review body (preferred).
- If no review is found, lazily fetches PR comments to look for a Copilot-authored comment.
- Normalizes whitespace and strips newlines. Empty when no Copilot activity detected.
- Run with -Verbose to see whether the summary came from a 'review' or 'comment' source.
#>
function Write-Info($msg) { Write-Host $msg -ForegroundColor Cyan }
function Write-Warn($msg) { Write-Host $msg -ForegroundColor Yellow }
function Write-Err($msg) { Write-Host $msg -ForegroundColor Red }
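# Inside a function, $PSBoundParameters refers to the function's own parameters, so rely on $VerbosePreference (set when the script runs with -Verbose).
function Write-DebugMsg($msg) { if ($VerbosePreference -eq 'Continue') { Write-Host "[VERBOSE] $msg" -ForegroundColor DarkGray } }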
# Validate we are in a git repo
#if (-not (Test-Path .git)) {
# Write-Err "Current directory does not appear to be the root of a git repository."
# exit 1
#}
# Resolve commits
try {
$startSha = (git rev-parse --verify $StartCommit) 2>$null
if (-not $startSha) { throw "StartCommit '$StartCommit' not found" }
$endSha = (git rev-parse --verify $EndCommit) 2>$null
if (-not $endSha) { throw "EndCommit '$EndCommit' not found" }
}
catch {
Write-Err $_
exit 1
}
Write-Info "Collecting commits between $startSha..$endSha (excluding start, including end)."
# Get list of commits reachable from end but not from start.
# IMPORTANT: In PowerShell, the .. operator creates a numeric/char range. If $startSha and $endSha look like hex strings,
# `$startSha..$endSha` will expand unexpectedly (often to empty/undesired) instead of passing the literal "sha1..sha2".
# Therefore we build the range explicitly as a single string argument.
$rangeArg = "$startSha..$endSha"
$commitList = git rev-list $rangeArg
# Normalize list (filter out empty strings)
$normalizedCommits = $commitList | Where-Object { $_ -and $_.Trim() -ne '' }
$commitCount = ($normalizedCommits | Measure-Object).Count
Write-DebugMsg ("Raw commitList length (including blanks): {0}" -f (($commitList | Measure-Object).Count))
Write-DebugMsg ("Normalized commit count: {0}" -f $commitCount)
if ($commitCount -eq 0) {
Write-Warn "No commits found in specified range ($startSha..$endSha)."; exit 0
}
Write-DebugMsg ("First 5 commits: {0}" -f (($normalizedCommits | Select-Object -First 5) -join ', '))
<#
Extract PR numbers from commits.
Patterns handled:
1. Merge commits: 'Merge pull request #12345 from ...'
2. Squash commits: 'Some feature change (#12345)' (GitHub default squash format)
We collect both. If a commit matches both (unlikely), it's deduped later.
#>
# Extract PR numbers from merge or squash commits
$mergeCommits = @()
foreach ($c in $normalizedCommits) {
$subject = git show -s --format=%s $c
$matched = $false
# Pattern 1: Traditional merge commit
if ($subject -match 'Merge pull request #([0-9]+) ') {
$prNumber = [int]$matches[1]
$mergeCommits += [PSCustomObject]@{ Sha = $c; Pr = $prNumber; Subject = $subject; Pattern = 'merge' }
Write-DebugMsg "Matched merge PR #$prNumber in commit $c"
$matched = $true
}
# Pattern 2: Squash merge subject line with ' (#12345)' at end (allow possible whitespace before paren)
if ($subject -match '\(#([0-9]+)\)$') {
$prNumber2 = [int]$matches[1]
# Avoid duplicate object if pattern 1 already captured same number for same commit
if (-not ($mergeCommits | Where-Object { $_.Sha -eq $c -and $_.Pr -eq $prNumber2 })) {
$mergeCommits += [PSCustomObject]@{ Sha = $c; Pr = $prNumber2; Subject = $subject; Pattern = 'squash' }
Write-DebugMsg "Matched squash PR #$prNumber2 in commit $c"
}
$matched = $true
}
if (-not $matched) {
Write-DebugMsg "No PR pattern in commit $c : $subject"
}
}
if (-not $mergeCommits -or $mergeCommits.Count -eq 0) {
Write-Warn "No merge commits with PR numbers found in range."; exit 0
}
# Deduplicate PR numbers (in case of revert or merges across branches)
$prNumbers = $mergeCommits | Select-Object -ExpandProperty Pr -Unique | Sort-Object
Write-Info ("Found {0} unique PRs: {1}" -f $prNumbers.Count, ($prNumbers -join ', '))
Write-DebugMsg ("Total merge commits examined: {0}" -f $mergeCommits.Count)
# Query GitHub for each PR
$prDetails = @()
function Get-CopilotSummaryFromPrJson {
param(
[Parameter(Mandatory=$true)]$PrJson,
[switch]$VerboseMode
)
# Returns a hashtable with Summary and Source keys.
$result = @{ Summary = ""; Source = "" }
if (-not $PrJson) { return $result }
$candidateAuthors = @(
'github-copilot[bot]', 'github-copilot', 'copilot'
)
# 1. Reviews (preferred) pick the LONGEST valid Copilot body, not the most recent
$reviews = $PrJson.reviews
if ($reviews) {
$copilotReviews = $reviews | Where-Object {
($candidateAuthors -contains $_.author.login -or $_.author.login -like '*copilot*') -and $_.body -and $_.body.Trim() -ne ''
}
if ($copilotReviews) {
$longest = $copilotReviews | Sort-Object { $_.body.Length } -Descending | Select-Object -First 1
if ($longest) {
$body = $longest.body
$norm = ($body -replace "`r", '') -replace "`n", ' '
$norm = $norm -replace '\s+', ' '
$result.Summary = $norm
$result.Source = 'review'
if ($VerboseMode) { Write-DebugMsg "Selected Copilot review length=$($body.Length) (longest)." }
return $result
}
}
}
# 2. Comments fallback (some repos surface Copilot summaries as PR comments rather than review objects)
if ($null -eq $PrJson.comments) {
try {
# Lazy fetch comments only if needed
$commentsJson = gh pr view $PrJson.number --repo $Repo --json comments 2>$null | ConvertFrom-Json
if ($commentsJson -and $commentsJson.comments) {
$PrJson | Add-Member -NotePropertyName comments -NotePropertyValue $commentsJson.comments -Force
}
} catch {
if ($VerboseMode) { Write-DebugMsg "Failed to fetch comments for PR #$($PrJson.number): $_" }
}
}
if ($PrJson.comments) {
$copilotComments = $PrJson.comments | Where-Object {
($candidateAuthors -contains $_.author.login -or $_.author.login -like '*copilot*') -and $_.body -and $_.body.Trim() -ne ''
}
if ($copilotComments) {
$longestC = $copilotComments | Sort-Object { $_.body.Length } -Descending | Select-Object -First 1
if ($longestC) {
$body = $longestC.body
$norm = ($body -replace "`r", '') -replace "`n", ' '
$norm = $norm -replace '\s+', ' '
$result.Summary = $norm
$result.Source = 'comment'
if ($VerboseMode) { Write-DebugMsg "Selected Copilot comment length=$($body.Length) (longest)." }
return $result
}
}
}
return $result
}
foreach ($pr in $prNumbers) {
Write-Info "Fetching PR #$pr ..."
try {
# Include comments only if Verbose asked, otherwise we lazily pull if reviews missing
$fields = 'number,title,labels,author,url,body,reviews'
if ($PSBoundParameters.ContainsKey('Verbose')) { $fields += ',comments' }
$json = gh pr view $pr --repo $Repo --json $fields 2>$null | ConvertFrom-Json
if ($null -eq $json) { throw "Empty response" }
$copilot = Get-CopilotSummaryFromPrJson -PrJson $json -VerboseMode:($PSBoundParameters.ContainsKey('Verbose'))
if ($copilot.Summary -and $copilot.Source -and $PSBoundParameters.ContainsKey('Verbose')) {
Write-DebugMsg "Copilot summary source=$($copilot.Source) chars=$($copilot.Summary.Length)"
} elseif (-not $copilot.Summary) {
Write-DebugMsg "No Copilot summary found for PR #$pr"
}
# Filter labels
$filteredLabels = $json.labels | Where-Object {
($_.name -like "Product-*") -or
($_.name -like "Area-*") -or
($_.name -like "Github*") -or
($_.name -like "*Plugin") -or
($_.name -like "Issue-*")
}
$labelNames = ($filteredLabels | ForEach-Object { $_.name }) -join ", "
$bodyValue = if ($json.body) { ($json.body -replace "`r", '') -replace "`n", ' ' } else { '' }
$bodyValue = $bodyValue -replace '\s+', ' '
$prDetails += [PSCustomObject]@{
Id = $json.number
Title = $json.title
Labels = $labelNames
Author = $json.author.login
Url = $json.url
Body = $bodyValue
CopilotSummary = $copilot.Summary
}
}
catch {
$err = $_
Write-Warn ("Failed to fetch PR #{0}: {1}" -f $pr, $err)
}
}
if (-not $prDetails) { Write-Warn "No PR details fetched."; exit 0 }
# Sort by Labels like original script (first label alphabetical)
$sorted = $prDetails | Sort-Object { ($_.Labels -split ',')[0] }
# Output JSON raw (optional)
$sorted | ConvertTo-Json -Depth 6 | Out-File -Encoding UTF8 $OutputJson
Write-Info "Saving CSV to $OutputCsv ..."
$sorted | Export-Csv $OutputCsv -NoTypeInformation
Write-Host "✅ Done. Generated $($prDetails.Count) PR rows." -ForegroundColor Green

@@ -0,0 +1,85 @@
<#
.SYNOPSIS
Group PR rows by their Labels column and emit per-label CSV files.
.DESCRIPTION
Reads a milestone PR CSV (usually produced by dump-prs-information / dump-prs-since-commit scripts),
splits rows by label list, normalizes/sorts individual labels, and writes one CSV per unique label combination.
Each output preserves the original row ordering within that subset and column order from the source.
.PARAMETER CsvPath
Input CSV containing PR rows with a 'Labels' column (comma-separated list).
.PARAMETER OutDir
Output directory to place grouped CSVs (created if missing). Default: 'grouped_csv'.
.NOTES
Label combinations are joined using ' | ' when multiple labels present. Filenames are sanitized (invalid characters,
whitespace collapsed) and truncated to <= 120 characters.
#>
param(
[string]$CsvPath = "sorted_prs.csv",
[string]$OutDir = "grouped_csv"
)
$ErrorActionPreference = 'Stop'
function Write-Info($msg) { Write-Host "[info] $msg" -ForegroundColor Cyan }
function Write-Warn($msg) { Write-Host "[warn] $msg" -ForegroundColor Yellow }
if (-not (Test-Path -LiteralPath $CsvPath)) { throw "CSV not found: $CsvPath" }
Write-Info "Reading CSV: $CsvPath"
$rows = Import-Csv -LiteralPath $CsvPath
Write-Info ("Loaded {0} rows" -f $rows.Count)
function ConvertTo-SafeFileName {
[CmdletBinding()]
param(
[Parameter(Mandatory=$true)][string]$Name
)
if ([string]::IsNullOrWhiteSpace($Name)) { return 'Unnamed' }
$s = $Name -replace '[<>:"/\\|?*]', '-' # invalid path chars
$s = $s -replace '\s+', '-' # spaces to dashes
$s = $s -replace '-{2,}', '-' # collapse dashes
$s = $s.Trim('-')
if ($s.Length -gt 120) { $s = $s.Substring(0,120).Trim('-') }
if ([string]::IsNullOrWhiteSpace($s)) { return 'Unnamed' }
return $s
}
# Build groups keyed by normalized, sorted label combinations. Preserve original CSV row order.
$groups = @{}
foreach ($row in $rows) {
$labelsRaw = $row.Labels
if ([string]::IsNullOrWhiteSpace($labelsRaw)) {
$labelParts = @('Unlabeled')
} else {
$parts = $labelsRaw -split ',' | ForEach-Object { $_.Trim() } | Where-Object { $_ }
if (-not $parts -or $parts.Count -eq 0) { $labelParts = @('Unlabeled') }
else { $labelParts = $parts | Sort-Object }
}
$key = ($labelParts -join ' | ')
if (-not $groups.ContainsKey($key)) { $groups[$key] = New-Object System.Collections.ArrayList }
[void]$groups[$key].Add($row)
}
if (-not (Test-Path -LiteralPath $OutDir)) {
Write-Info "Creating output directory: $OutDir"
New-Item -ItemType Directory -Path $OutDir | Out-Null
}
Write-Info ("Generating {0} grouped CSV file(s) into: {1}" -f $groups.Count, $OutDir)
foreach ($key in $groups.Keys) {
$labelParts = if ($key -eq 'Unlabeled') { @('Unlabeled') } else { $key -split '\s\|\s' }
$safeName = ($labelParts | ForEach-Object { ConvertTo-SafeFileName -Name $_ }) -join '-'
$filePath = Join-Path $OutDir ("$safeName.csv")
# Keep same columns and order
$groups[$key] | Export-Csv -LiteralPath $filePath -NoTypeInformation -Encoding UTF8
}
Write-Info "Done. Sample output files:"
Get-ChildItem -LiteralPath $OutDir | Select-Object -First 10 Name | Format-Table -HideTableHeaders