(AB-1948154) Add stale content reporting (MicrosoftDocs#9003)
Prior to this change, inspection of stale content was manual or
relied on externally retrieved and hosted data. This change:

- Adds the new `Get-StaleDocument` function to the GitHub Actions module
  to support inspecting the repository for stale content. For more
  information on how it works, see the comment-based help in the
  function's definition file.
- Adds the new `Get-StaleContentReport` script, which can be used to
  inspect a repository for stale documentation, write a report summary
  in the console and to GitHub Actions, export the report as a CSV,
  mark the exported report for upload via GitHub Actions, and return the
  result object directly. For more information on how it works, see the
  markdown help for the script.
- Defines the new `reporting/stale-content` GitHub Action for enabling
  simplified and cross-repository use of the functionality in the newly
  defined `Get-StaleContentReport` script, handling input, output, and
  the upload of the report as a GitHub Action artifact. For more
  information on how it works, see the markdown help for the action.
- Defines the new `stale-content` workflow to inspect this repository's
  conceptual documentation for stale content.
- Resolves AB#1948154
michaeltlombardi authored Jul 11, 2022
1 parent a0bb032 commit bf217eb
Showing 10 changed files with 1,418 additions and 0 deletions.
151 changes: 151 additions & 0 deletions .github/actions/.pwsh/module/functions/utility/Get-StaleDocument.ps1
@@ -0,0 +1,151 @@
function Get-StaleDocument {
    <#
    .SYNOPSIS
        Retrieves stale documents from one or more folders.
    .DESCRIPTION
        This cmdlet searches recursively through one or more folders to find documents with the
        `ms.date` key in their frontmatter where the value of that key is older than a specified
        date, indicating that the document is stale.
    .PARAMETER RelativeFolderPath
        Specify the path to one or more folders to search for stale documents relative to the
        current working directory. Do not use any wildcard characters. The value for this
        parameter is interpreted literally. Additionally, this value is used to determine the
        root path of any stale documents.

        If you specify any values including a wildcard character, this cmdlet warns you about the
        consequences. You can suppress this warning by specifying `SilentlyContinue` for the
        **WarningAction** parameter.
    .PARAMETER ExcludeFolderSegment
        Specify one or more folder path segments to exclude from the search. Any document whose
        path contains one of these segments as a folder, such as `/<segment>/`, is skipped. The
        values are matched literally, not as wildcards or regular expressions.
    .PARAMETER DaysUntilStale
        Specify an integer representing how many days can pass before a document is considered
        stale. If any document's `ms.date` key is more than this many days before the date the
        cmdlet is called, it is returned as a stale document.
    .PARAMETER StaleSinceDate
        Specify a datetime object representing the point at which any older documents are
        considered stale. If any document's `ms.date` key is older than this value, it is
        returned as a stale document. This value defaults to `330` days before the date this
        cmdlet is called.
    .EXAMPLE
        ```powershell
        Get-StaleDocument -RelativeFolderPath ./reference/
        ```

        The cmdlet searches the `reference` folder in the current working directory and returns
        every document whose `ms.date` key has a value older than 330 days from the time the
        command is called. For every document returned, the **RootPath** property is `reference`,
        the **RelativePath** property is the remaining path to the file, and the **MSDate**
        property is the datetime value from the document's frontmatter.
    .EXAMPLE
        ```powershell
        $Folders = @(
            'reference/5.1'
            'reference/7.0'
            'reference/7.2'
            'reference/docs-conceptual'
        )
        Get-StaleDocument -RelativeFolderPath $Folders -DaysUntilStale 1000
        ```

        The first command enumerates the list of folders to recursively search for stale
        documents. The second command searches those folders and returns every document whose
        `ms.date` key has a value older than 1000 days from the time the command is called.
    .EXAMPLE
        ```powershell
        Get-StaleDocument -RelativeFolderPath reference -StaleSinceDate '2022-06-15'
        ```

        The cmdlet searches the `reference` folder and returns every document whose `ms.date` key
        has a value older than June 15, 2022.
    #>
    [CmdletBinding(DefaultParameterSetName='ByDate')]
    [OutputType('PSDocs.DocumentInfo[]')]
    param(
        [Parameter(Mandatory)]
        [string[]]$RelativeFolderPath,

        [string[]]$ExcludeFolderSegment,

        # Named to match the help text and the check in the process block.
        [Parameter(ParameterSetName='ByDays')]
        [int]$DaysUntilStale,

        [Parameter(ParameterSetName='ByDate')]
        [datetime]$StaleSinceDate = (Get-Date).AddDays(-330).Date
    )

    begin {
        $MSDatePattern = '^ms\.date: (?<date>\d+\/\d+\/\d+).*$'
        # Converts a folder path or segment into an escaped regex fragment.
        $GetRelativePathRegex = {
            $WorkingPath = $_ -replace '\\', '/'        # Normalize paths to forward slashes
            $WorkingPath = $WorkingPath.TrimStart('.')  # Remove leading relative path dots
            $WorkingPath = $WorkingPath.Trim('/')       # Remove wrapping path separators
            [regex]::Escape($WorkingPath)
        }
        # Converts a Select-String match into a document info object, skipping excluded paths.
        $ProcessProperties = {
            $MSDate = $_.Matches.Groups
            | Where-Object -FilterScript { $_.Name -eq 'Date' }
            | Select-Object -ExpandProperty Value
            | ForEach-Object { ([datetime]$_).Date }

            $FilePath = $_.Path -replace '\\', '/' # Normalize paths to forward slashes
            $RegexRootPaths = $RelativeFolderPath | ForEach-Object -Process {
                $RootPath = $_ -replace '\\', '/'     # Normalize paths to forward slashes
                $RootPath = $RootPath.TrimStart('.')  # Remove leading relative path dots
                $RootPath = $RootPath.Trim('/')       # Remove wrapping path separators
                [regex]::Escape($RootPath)
            } | Join-String -Separator '|' # Join as options in a regex-or match
            $RelativePathPattern = "(?<RootPath>($RegexRootPaths))\/(?<RelativePath>.+$)"
            if ($ExcludeFolderSegment.Count -gt 0) {
                $RegexExcludeSegments = $ExcludeFolderSegment
                | ForEach-Object -Process $GetRelativePathRegex
                | Join-String -Separator '|'
                $ExcludePattern = "\/($RegexExcludeSegments)\/"
                if ($FilePath -match $ExcludePattern) {
                    # Skip!
                    Write-Debug "EXCLUDING: $FilePath"
                    return
                }
            }
            if ($FilePath -match $RelativePathPattern) {
                [PSCustomObject]@{
                    PSTypeName   = 'PSDocs.DocumentInfo'
                    RootPath     = $Matches.RootPath
                    RelativePath = $Matches.RelativePath
                    MSDate       = $MSDate
                }
            }
        }
    }

    process {
        # When DaysUntilStale is specified, it determines the cutoff date.
        if ($DaysUntilStale -gt 0) {
            $StaleSinceDate = (Get-Date).AddDays((0 - $DaysUntilStale)).Date
        }
        $FoldersWithWildCards = $RelativeFolderPath -match '(\*|\?|\[|\])'
        if ($FoldersWithWildCards.Count -gt 0) {
            $Message = @(
                "RelativeFolderPath included at least one path with wildcard character(s)."
                "Wildcard characters are interpreted literally. Make sure this was intentional."
                "If you do not want to see this message, specify 'SilentlyContinue' for the"
                "WarningAction parameter."
                "`n`tPaths:`n`t`t$($FoldersWithWildCards -join "`n`t`t")"
            ) -join ' '
            Write-Warning -Message $Message
        }

        Get-ChildItem -LiteralPath $RelativeFolderPath -Recurse
        | Select-String -Pattern $MSDatePattern
        | ForEach-Object -Process $ProcessProperties
        | Where-Object -Property MSDate -lt $StaleSinceDate
        | Sort-Object -Property RootPath, RelativePath, MSDate
    }
}
1 change: 1 addition & 0 deletions .github/actions/.pwsh/module/gha.psd1
@@ -52,6 +52,7 @@
'Format-GHAConsoleText'
'Get-ActionScriptParameter'
'Get-GHAConsoleError'
'Get-StaleDocument'
'Get-VersionedContentChangeStatus'
'Get-VersionedContentTableColumnWidth'
'New-CliErrorRecord'
13 changes: 13 additions & 0 deletions .github/actions/.pwsh/module/readme.md
@@ -82,6 +82,18 @@ It can retrieve errors from the `$error` variable or act on an input object.

For more information, review the [source code][utility-Get-GHAConsoleError]

### `Get-StaleDocument`

This cmdlet searches one or more specified folders for stale documents as determined by comparing
the value of their `ms.date` metadata key with a specified date. By default, any document not
updated in the last 330 days is considered stale.
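
As a rough sketch of that comparison (not the cmdlet's actual code path), the snippet below applies
the `ms.date` pattern from the function's definition to a few frontmatter lines and keeps the dates
older than a cutoff. The sample lines and the `$Cutoff` value are hypothetical and exist only for
illustration.

```powershell
# Pattern copied from Get-StaleDocument; everything else here is illustrative.
$MSDatePattern = '^ms\.date: (?<date>\d+\/\d+\/\d+).*$'

# Hypothetical frontmatter lines, as they might appear in three documents.
$SampleLines = @(
    'ms.date: 01/05/2021'
    'ms.date: 07/01/2022'
    'title: About Things'   # No ms.date key, so this line never matches
)

# Hypothetical cutoff; the cmdlet defaults to 330 days before the current date.
$Cutoff = [datetime]'2022-06-15'

$StaleDates = foreach ($Line in $SampleLines) {
    if ($Line -match $MSDatePattern) {
        # Casting the captured string yields a datetime; .Date drops the time of day.
        $MSDate = ([datetime]$Matches.date).Date
        if ($MSDate -lt $Cutoff) { $MSDate }
    }
}

$StaleDates | ForEach-Object { $_.ToString('yyyy-MM-dd') }
```

With these sample values, only the first line yields a date older than the cutoff.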

It returns the list of stale documents with their **RootPath** (the folder searched recursively),
**RelativePath** (the path to the document relative to the **RootPath**), and **MSDate** (the
**System.DateTime** value for their `ms.date` property).
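
The split between **RootPath** and **RelativePath** can be sketched as follows, assuming the same
normalization steps the function applies to its folder paths. The folder list, the file path, and
the `$Info` and `$RootAlternatives` variables are hypothetical.

```powershell
# Hypothetical search roots and file path, for illustration only.
$RelativeFolderPath = './reference/7.2/', 'reference\docs-conceptual'
$FilePath = 'C:\repo\reference\7.2\Microsoft.PowerShell.Core\About\about_If.md'

# Normalize the file path to forward slashes.
$FilePath = $FilePath -replace '\\', '/'

# Normalize each root the same way, then escape it for use in a regex.
$RegexRootPaths = foreach ($Folder in $RelativeFolderPath) {
    $RootPath = ($Folder -replace '\\', '/').TrimStart('.').Trim('/')
    [regex]::Escape($RootPath)
}

# Join the roots as alternatives and capture the remainder of the path.
$RootAlternatives = $RegexRootPaths -join '|'
$RelativePathPattern = "(?<RootPath>($RootAlternatives))\/(?<RelativePath>.+$)"

$Info = if ($FilePath -match $RelativePathPattern) {
    [PSCustomObject]@{
        RootPath     = $Matches.RootPath
        RelativePath = $Matches.RelativePath
    }
}

$Info
```

For this sample path, **RootPath** is `reference/7.2` and **RelativePath** is
`Microsoft.PowerShell.Core/About/about_If.md`.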

For more information, review the [source code][utility-Get-StaleDocument].

### `Get-VersionedContentStatus`

This cmdlet returns the change status of versioned content for a pull request. This information can
@@ -149,6 +161,7 @@ For more information, review the [source code][utility-Write-HostParameter]
[utility-Format-GHAConsoleText]: ./functions/utility/Format-GHAConsoleText.ps1
[utility-Get-GHAConsoleError]: ./functions/utility/Get-GHAConsoleError.ps1
[utility-Get-ActionScriptParameter]: ./functions/utility/Get-ActionScriptParameter.ps1
[utility-Get-StaleDocument]: ./functions/utility/Get-StaleDocument.ps1
[utility-Get-VersionedContentStatus]: ./functions/utility/Get-VersionedContentStatus.ps1
[utility-New-CliErrorRecord]: ./functions/utility/New-CliErrorRecord.ps1
[utility-New-InvalidParameterError]: ./functions/utility/New-InvalidParameterError.ps1