burn-console Digression: An MSH Script Profiler Part 2
Mon, Dec 5, 2005
6-minute read
In the last article, we talked about the theory behind a sampling profiler. Let’s go over the code to implement one. We’ll use this to guide our future optimizations. You can also download the script here: profile-transcript.msh.
################################################################################
##
## profile-transcript.msh
##
## Computes the performance characteristics of a script, based on the transcript
## of it running at trace level 1.
##
## To profile a script:
## 1) Turn on MSH script tracing in the window that will run the script:
## set-mshdebug –trace 1
## 2) Turn on the transcript for the window that will run the script:
## start-transcript
## (Note the filename that Monad provides as the logging destination.)
## 3) Type in the script name, but don't actually start it.
## 4) Open another MSH window, and navigate to the directory holding
## this script. Type in '.\profile-transcript.msh <path-to-transcript>',
## replacing <path-to-transcript> with the path given in step 2. Don't
## press <Enter> yet.
## 5) Switch to the profiled script window, and start the script.
## Switch to the window containing profile-transcript, and press <Enter>
## 6) Wait until your profiled script exits, or has run long enough to be
## representative of its work. To be statistically accurate, your script
## should run for at least ten seconds.
## 7) Switch to the window running profile-transcript.msh, and press a key.
## 8) Switch to the window holding your profiled script, and type:
## stop-transcript
## 9) Delete the transcript.
##
## Note: You can profile regions of code (ie: functions) rather than just lines
## by placing the following call at the start of the region:
## write-debug "ENTER <region_name>"
## and the following call and the end of the region:
## write-debug "EXIT"
## This is implemented to account exclusively for the time spent in that
## region, and does not include time spent in regions contained within the
## region. For example, if FunctionA calls FunctionB, and you've surrounded
## each by region markers, the statistics for FunctionA will not include the
## statistics for FunctionB.
##
################################################################################
param($logFilePath = $(throw "Please specify a path to the transcript log file."))
function Main
{
## Run the actual profiling of the script. $uniqueLines gets
## the mapping of line number to actual script content.
## $samples gets a hashtable mapping line number to the number of times
## we observed the script running that line.
$uniqueLines = @{}
$samples = GetSamples $uniqueLines
"Breakdown by line:"
"----------------------------"
## Create a new hash table that flips the $samples hashtable --
## one that maps the number of times sampled to the line sampled.
## Also, figure out how many samples we got altogether.
$counts = @{}
$totalSamples = 0;
foreach($item in $samples.Keys)
{
$counts[$samples[$item]] = $item
$totalSamples += $samples[$item]
}
## Go through the flipped hashtable, in descending order of number of
## samples. As we do so, output the number of samples as a percentage of
## the total samples. This gives us the percentage of the time our script
## spent executing that line.
foreach($count in ($counts.Keys | sort -desc))
{
$line = $counts[$count]
$percentage = "{0:#0}" -f ($count * 100 / $totalSamples)
"{0,3}%: Line {1,4} -{2}" -f $percentage,$line,
$uniqueLines[$line]
}
## Go through the transcript log to figure out which lines are part of any
## marked regions. This returns a hastable that maps region names to
## the lines they contain.
""
"Breakdown by marked regions:"
"----------------------------"
$functionMembers = GenerateFunctionMembers
## For each region name, cycle through the lines in the region. As we
## cycle through the lines, sum up the time spent on those lines and output
## the total.
foreach($key in $functionMembers.Keys)
{
$totalTime = 0
foreach($line in $functionMembers[$key])
{
$totalTime += ($samples[$line] * 100 / $totalSamples)
}
$percentage = "{0:#0}" -f $totalTime
"{0,3}%: {1}" -f $percentage,$key
}
}
## Run the actual profiling of the script. $uniqueLines gets
## the mapping of line number to actual script content.
## Return a hashtable mapping line number to the number of times
## we observed the script running that line.
function GetSamples($uniqueLines)
{
## Open the log file. We use the .Net file I/O, so that we keep monitoring
## just the end of the file. Otherwise, we would make our timing innacurate
## as we scan the entire length of the file every time.
$logStream = [System.IO.File]::Open($logFilePath, "Open", "Read", "ReadWrite")
$logReader = new-object System.IO.StreamReader $logStream
$random = new-object Random
$samples = @{}
$lastCounted = $null
## Gather statistics until the user presses a key.
while(-not $host.UI.RawUI.KeyAvailable)
{
## We sleep a slightly random amount of time. If we sleep a constant
## amount of time, we run the very real risk of improperly sampling
## scripts that exhibit periodic behaviour.
$sleepTime = [int] ($random.NextDouble() * 100.0)
start-sleep -m $sleepTime
## Get any content produced by the transcript since our last poll.
## From that poll, extract the last DEBUG statement (which is the last
## line executed.)
$rest = $logReader.ReadToEnd()
$lastEntryIndex = $rest.LastIndexOf("DEBUG: ")
## If we didn't get a new line, then the script is still working on the
## last line that we captured.
if($lastEntryIndex -lt 0)
{
if($lastCounted) { $samples[$lastCounted]++ }
continue;
}
## Extract the debug line.
$lastEntryFinish = $rest.IndexOf("\n", $lastEntryIndex)
if($lastEntryFinish -eq -1) { $lastEntryFinish = $rest.length }
$scriptLine = $rest.Substring($lastEntryIndex, ($lastEntryFinish - $lastEntryIndex)).Trim()
if($scriptLine -match 'DEBUG:[ \t]*([0-9]*)\+(.*)')
{
## Pull out the line number from the line
$last = $matches[1]
$lastCounted = $last
$samples[$last]++
## Pull out the actual script line that matches the line number
$uniqueLines[$last] = $matches[2]
}
## Discard anything that's buffered during this poll, and start waiting
## again
$logReader.DiscardBufferedData()
}
## Clean up
$logStream.Close()
$logReader.Close()
return $samples
}
## Go through the transcript log to figure out which lines are part of any
## marked regions. This returns a hastable that maps region names to
## the lines they contain.
function GenerateFunctionMembers
{
## Create a stack that represents the callstack. That way, if a marked
## region contains another marked region, we attribute the statistics
## appropriately.
$callstack = new-object System.Collections.Stack
$currentFunction = "Unmarked"
$callstack.Push($currentFunction)
$functionMembers = @{}
## Go through each line in the transcript file, from the beginning
foreach($line in (get-content $logFilePath))
{
## Check if we're entering a monitor block
## If so, store that we're in that function, and push it onto
## the callstack.
if($line -match 'write-debug "ENTER (.*)"')
{
$currentFunction = $matches[1]
$callstack.Push($currentFunction)
}
## Check if we're exiting a monitor block
## If so, clear the "current function" from the callstack,
## and store the new "current function" onto the callstack.
elseif($line -match 'write-debug "EXIT"')
{
[void] $callstack.Pop()
$currentFunction = $callstack.Peek()
}
## Otherwise, this is just a line with some code.
## Add the line number as a member of the "current function"
else
{
if($line -match 'DEBUG:[ \t]*([0-9]*)\+')
{
## Create the arraylist if it's not initialized
if(-not $functionMembers[$currentFunction])
{
$functionMembers[$currentFunction] = new-object `
System.Collections.ArrayList
}
## Add the current line to the ArrayList
if(-not $functionMembers[$currentFunction].Contains($matches[1]))
{
[void] $functionMembers[$currentFunction].Add($matches[1])
}
}
}
}
return $functionMembers
}
. Main
[Edit: Monad has now been renamed to Windows PowerShell. This script or discussion may require slight adjustments before it applies directly to newer builds.]