Monitoring multiple log files with PowerShell

Saturday, August 5, 2023

PowerShell time again! The gift that keeps on giving...

The case

We have several load-balancing servers running various services, each producing their own log files. Which server serves a particular request is not always predictable, which makes following a request (in our case: a login attempt) cumbersome. The request is first served by either server 1A or 1B, and then by server 2A or 2B. So, in order to track a request, we have to consult four log files.

Traditionally (i.e. in Linux), the tail command is used to monitor a log file, or any text file for that matter. At heart, it shows the "log tail", i.e. the last n lines of the file, but it can also periodically check for newly added lines and display those, too. PowerShell has this ability as well, using the Get-Content cmdlet with the -Wait switch.
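
For reference, the classic Linux incantation looks like this:

# Show the last 20 lines of some.log, then keep following it as it grows:
tail -n 20 -f some.log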

To monitor our four log files, we can start four PowerShell prompts and run Get-Content on a different log file in each. That works, but it's clumsy: you have to arrange the individual windows so they're all visible, and you still need to scan four windows for the request you're tracking.
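
For the record, that clumsy version boils down to something like this (the log file names are made up), repeated once per log file:

# One window per log file; repeat for server1b.log, server2a.log and server2b.log:
Start-Process pwsh -ArgumentList '-NoExit', '-Command', 'Get-Content server1a.log -Wait -Tail 20'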

I searched for "viewing multiple log files" and there are some solutions but they're mostly for Linux (I need Windows) and none of them seems to quite do what I want. Either that, or they're much too powerful and require a subscription :-/

The hunch

Let's start with Get-Content:

Get-Content [-Path] <string[]> [-ReadCount <long>] [-TotalCount <long>]
[-Tail <int>] [-Filter <string>] [-Include <string[]>] [-Exclude <string[]>]
[-Force] [-Credential <pscredential>] [-Delimiter <string>] [-Wait] [-Raw]
[-Encoding <Encoding>] [-AsByteStream] [-Stream <string>] [<CommonParameters>]

Mimicking Linux's tail, you use it like this:

Get-Content some.log -Tail 20 -Wait

This will display the last 20 lines of the file some.log and then check every second if more lines were added, showing those too. This only ends when you press Ctrl+C.

But wait a minute: the -Path parameter accepts an array of strings! How does that work? Well, not like we want. Without the -Wait switch: fine. It displays the tails of the specified log files and exits. With -Wait? Not so much - it just ignores all but the first log file.
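
To see it for yourself (with two hypothetical log files):

# Without -Wait: prints the tail of both files, then exits:
Get-Content one.log, two.log -Tail 5

# With -Wait: only one.log is followed; two.log is silently ignored:
Get-Content one.log, two.log -Tail 5 -Wait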

So we need a way to run Get-Content several times, each with its own log file... in other words: in parallel! In PowerShell 5.1 this could be done with a Workflow, but that concept is no longer supported in PowerShell 7, which has proper support of its own in:

ForEach-Object -Parallel { <script> }

which is, of course, totally brilliant for this problem.

The solution

So here we go:

Get-ChildItem *.log |
ForEach-Object -Parallel {
  Get-Content $_ -Wait -Tail 5
}

Works exactly as expected. It displays the last five lines of each log file, and then each line in each log file that's added later, until you stop it with Ctrl+C.

But wait a minute - I can't see which log file each line belongs to! Sure, I can track my request now - the relevant log lines are shown, regardless of which server handled it - but when I spot the problem, I still have to figure out which log file the line came from.

Fortunately, this is easily remedied: we pipe the output of Get-Content through another ForEach-Object that writes each line prefixed with the file name:

Get-ChildItem *.log |
ForEach-Object -Parallel {
  $log = $_
  Get-Content $log -Wait -Tail 2 |
  ForEach-Object {
    Write-Output "$($log.Name): $_"
  }
}

To make everything a little more configurable, we turn the log files themselves into an input of our script and make the -Tail value a parameter. And while we're at it, we'll call the whole contraption Get-LogTail.ps1:

[CmdletBinding()]
Param(
  [Parameter(Mandatory)]
  [string[]]$Path,
  [int]$Tail
)

$Path | ForEach-Object -ThrottleLimit $Path.Count -Parallel {
  $log = $_
  Get-Content $log -Wait -Tail $Tail |
  ForEach-Object {
    Write-Output "$($log.Name): $_"
  }
}

We also add an argument -ThrottleLimit to ForEach-Object, which sets the number of threads used; the default is 5. It has to be at least the number of log files to monitor. If it's too low, only the first few log files will be handled; the rest would get their chance when the first few calls terminate - which they never do, unless you press Ctrl+C, and then the whole loop terminates. Long story short: we need this parameter! It also means you should not try to follow hundreds (or maybe even tens) of files simultaneously...
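
For example, with six (hypothetical) log files and the default throttle limit, the sixth never gets a thread:

# Default -ThrottleLimit is 5: f.log waits for a slot that never frees up:
'a.log', 'b.log', 'c.log', 'd.log', 'e.log', 'f.log' |
ForEach-Object -Parallel { Get-Content $_ -Wait -Tail 1 }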

Unfortunately, this solution is somewhat naïve. The $Tail parameter is not available inside the -Parallel script block - we have to prefix it with $using: to access the "outer" $Tail variable. And $_ no longer holds a real file object, only a path name, so $log.Name comes up empty. We'll solve that later, but for now, this fixes it:

[CmdletBinding()]
Param(
  [Parameter(Mandatory)]
  [string[]]$Path,
  [int]$Tail
)

$Path | ForEach-Object -ThrottleLimit $Path.Count -Parallel {
  $p = $_
  Get-Content $p -Wait -Tail $using:Tail |
  ForEach-Object {
    Write-Output "$($p): $_"
  }
}
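
The $using: prefix, by the way, is the standard way to read a variable from the caller's scope inside a -Parallel script block - each parallel runspace starts with a clean slate. A minimal demonstration:

$greeting = 'hello'
1..3 | ForEach-Object -Parallel {
  # Without the $using: prefix, $greeting would be undefined on this thread:
  "$using:greeting $_"
}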

(Aside: This works just fine, but we have to specify $Path as an argument, as in

.\Get-LogTail.ps1 -Path (Get-ChildItem *.log)

when it's much more natural to pipe the input in, like:

(Get-ChildItem *.log) | .\Get-LogTail.ps1

But this requires the $Path parameter to come from the pipeline and that sort of screws up the whole -Parallel thing. There's a solution at the end of this page, but we'll keep it simple for now.)

Also, when you do use the -Path (Get-ChildItem ...) syntax, the FileInfo objects are converted to strings - full path names - which clutters the output to no end. If we drop the requirement that $Path is an array of strings and make it an array of PSObject instead, the FileInfo objects survive the parameter binding unconverted, and we can display just the file name again:

[CmdletBinding()]
Param(
  [Parameter(Mandatory)]
  [PSObject[]]$Path,
  [int]$Tail
)

$Path | ForEach-Object -ThrottleLimit $Path.Count -Parallel {
  $log = $_
  Get-Content $log -Wait -Tail $using:Tail |
  ForEach-Object {
    Write-Output "$($log.Name): $_"
  }
}

Processing log lines

OK, good enough. But now that we're here, we get more ambitious. I want to display the name of the log file in a different color and, in general, add some processing to the output. So we go all in:

[CmdletBinding()]
Param(
  [Parameter(Mandatory)]
  [PSObject[]]$Path,
  [int]$Tail
)

$Path | ForEach-Object -ThrottleLimit $Path.Count -Parallel {
  function ProcessLogLine {
    Param($logfile, $line)
    "$($logfile.Name): $line"
  }
  $log = $_
  Get-Content $log -Wait -Tail $using:Tail |
  ForEach-Object {
    Write-Output (ProcessLogLine $log $_)
  }
}

Now we send each log line through the function ProcessLogLine. The function has to be defined inside the -Parallel scriptblock, presumably because the whole script block runs independently on another thread - which is sort of the whole point. It looks weird, but it works. (Story of my professional life ;-))

So now we can do other things: color the output, for instance, depending on the contents of the line or even the name of the log file. I'll leave that to the imagination of the reader. Just one tip: make sure ProcessLogLine emits its output as a single string. If you split it up into multiple Write-Host -ForegroundColor $color calls, each with a different color, you'll end up with partial lines, because another thread may write its log line to the output at the same moment. Using ANSI escape sequences (in a single string!) prevents this:

function ProcessLogLine {
  Param($logfile, $line)
  "$($PSStyle.Foreground.Yellow)$($logfile.Name)$($PSStyle.Reset): $line"
}
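
Building on that, here's a sketch that also colors the line itself based on its contents - the ERROR pattern is just an example, adjust to taste:

function ProcessLogLine {
  Param($logfile, $line)
  # Pick a color for the line body based on its content (the pattern is made up):
  $body = if ($line -match 'ERROR') {
    "$($PSStyle.Foreground.Red)$line$($PSStyle.Reset)"
  } else {
    $line
  }
  # Still one single output string, so parallel writers can't interleave mid-line:
  "$($PSStyle.Foreground.Yellow)$($logfile.Name)$($PSStyle.Reset): $body"
}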

So, knock yourself out experimenting with nice color schemes - I know I did ;-)

Using the pipeline

Step 1 in updating the script to obtain the log files from the pipeline is the obvious one: add ValueFromPipeline to the parameter.

[CmdletBinding()]
Param(
  [Parameter(Mandatory, ValueFromPipeline)]
  [PSObject[]]$Path,
  [int]$Tail
)
...

As usual, it's not that simple. When we pass a single log file to the script this way, it works, but as soon as we add more, it falls apart. Upon closer inspection it's obvious why: as soon as a log file enters the script through the pipeline, we start monitoring it until the script is terminated, which leaves no room for the second log file, or the next.
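
The way out is PowerShell's classic begin/process/end blocks: begin and end run exactly once, while process runs once per pipeline item. A toy demonstration (with made-up input):

function Test-Blocks {
  Param([Parameter(ValueFromPipeline)]$Item)
  begin   { 'begin: runs once, before any input' }
  process { "process: runs once for $Item" }
  end     { 'end: runs once, after all input' }
}
'one.log', 'two.log' | Test-Blocks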

So we resort to good old begin/process/end. The plan is simple: initialize an empty collection in begin, collect each pipeline item in process, and only start monitoring all of them in end. Not very intuitive (the end block does the actual processing), but once again, it works:

[CmdletBinding()]
Param(
  [Parameter(Mandatory, ValueFromPipeline)]
  [PSObject[]]$Path,
  [int]$Tail
)

begin { $logs = @() }
process { $logs += $Path }
end {
  $logs | ForEach-Object -ThrottleLimit $logs.Count -Parallel {
    function ProcessLogLine {
      Param($logfile, $line)
      "$($logfile.Name): $line"
    }
    $log = $_
    Get-Content $log -Wait -Tail $using:Tail |
    ForEach-Object {
      Write-Output (ProcessLogLine $log $_)
    }
  }
}

So we can finally do:

(Get-ChildItem *.log) | .\Get-LogTail.ps1 -Tail 5

OK, back to work.