Hex Dumper in PowerShell
Marcel has been posting some interesting articles on using PowerShell to generate the MD5 hashes of files. Now, an MD5 hash of a file is just an array of bytes. Typical hashing programs display this in a more friendly manner:
PS:15 C:\Temp >md5sum 71-59-B7.bmp
a05805e638741bb767f97c0e88962952 *71-59-B7.bmp
Although Marcel’s scripts could certainly be adapted to produce this format, they currently output the string representation of a byte array:
PS:19 C:\Temp >get-md5 (get-childitem 71-59-B7.bmp)
160 88 5 230 56 116 27 183 103 249 124 14 136 150 41 82
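As a minimal sketch of how that byte array could be turned into the md5sum-style string, the snippet below formats each byte as two lowercase hex digits and joins the results. The variable names ($hashBytes, $hexString) are only illustrative, and the byte values are copied from the output above rather than computed from the file.

```powershell
# The 16 hash bytes, exactly as PowerShell printed them above
[byte[]] $hashBytes = 160,88,5,230,56,116,27,183,103,249,124,14,136,150,41,82

# Format each byte as two lowercase hex digits, then join them with no separator
$hexString = -join ($hashBytes | ForEach-Object { "{0:x2}" -f $_ })

$hexString   # a05805e638741bb767f97c0e88962952
```

The result matches the digest that md5sum printed for the same file.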
One of the comments in response to Marcel’s post was that PowerShell should, by default, output byte arrays as hex. This is a good suggestion, and we can go even further with it. Let’s write a script to give us a full hex editor-like view of a byte array:
PS:20 C:\Temp >get-md5 (get-childitem 71-59-B7.bmp) | format-hex
0 1 2 3 4 5 6 7 8 9 A B C D E F
00000000 A0 58 05 E6 38 74 1B B7 67 F9 7C 0E 88 96 29 52 X.æ8t.•gù|.??)R
Or, even better, let’s use it to dump out a very small bitmap, 10 pixels of the colour (R=0x71 G=0x59 B=0xB7):
PS:21 C:\Temp >Format-Hex 71-59-B7.bmp
0 1 2 3 4 5 6 7 8 9 A B C D E F
00000000 42 4D 5E 00 00 00 00 00 00 00 36 00 00 00 28 00 BM^.......6...(.
00000010 00 00 0A 00 00 00 01 00 00 00 01 00 20 00 00 00 ............ ...
00000020 00 00 00 00 00 00 C4 0E 00 00 C4 0E 00 00 00 00 ......Ä...Ä.....
00000030 00 00 00 00 00 00 B7 59 71 FF B7 59 71 FF B7 59 ......•Yq.•Yq.•Y
00000040 71 FF B7 59 71 FF B7 59 71 FF B7 59 71 FF B7 59 q.•Yq.•Yq.•Yq.•Y
00000050 71 FF B7 59 71 FF B7 59 71 FF B7 59 71 FF q.•Yq.•Yq.•Yq.
To make byte offsets easier to determine, hex dumps usually break a file into 16-byte rows. The left-hand section gives the offset of each 16-byte chunk. The middle section gives the hex representation of the data at that location, aligned in columns that correspond to each byte’s position within the chunk. So column “E” in row 0x40 means a file offset of (0x40 + 0x0E) = 0x4E. The last section gives an ASCII representation of the data.
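The offset arithmetic is easy to check at the prompt. This is just an illustrative sketch with made-up variable names: it adds a row label to a column label, then goes the other way and splits an offset back into its row and column.

```powershell
# Row label plus column label gives the file offset
$rowOffset = 0x40
$column = 0x0E
$fileOffset = $rowOffset + $column
"0x{0:X2}" -f $fileOffset                                      # 0x4E

# Going the other way: split an offset into its row and column
$offset = 0x4E
$columnOfOffset = $offset % 16                # position within the 16-byte row
$rowOfOffset = $offset - $columnOfOffset      # start of the 16-byte row
"row 0x{0:X2}, column {1:X}" -f $rowOfOffset, $columnOfOffset  # row 0x40, column E
```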
In this representation, it becomes possible to see some of the underlying structure of the bitmap format:
Offset | Length | Comment |
---|---|---|
0x00 | 2 | “BM,” the magic bitmap header |
0x02 | 4 | “0x5E,” the length of the file. Notice that our last data byte is at 0x5D. Since we started counting from zero, this means that we have 0x5E bytes of data. |
(…) | (…) | (…) |
0x0A | 4 | “0x36,” the absolute start of the bitmap data. Notice that the data begins at offset (0x30 + 0x06). |
0x36 | 40 | Ten 4-byte pixel representations. In bitmaps, each pixel is laid out as (B=0xB7 G=0x59 R=0x71 <reserved>). |
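The fields in the table can also be read programmatically. The sketch below is only an illustration: it hard-codes the first 16 bytes from the dump above instead of reading the file, and it assumes a little-endian machine, since [BitConverter] uses the platform's byte order and the bitmap stores these values least-significant byte first.

```powershell
# First row of the dump above (offsets 0x00 through 0x0F)
[byte[]] $bmpHeader = 0x42,0x4D,0x5E,0x00,0x00,0x00,0x00,0x00,
                      0x00,0x00,0x36,0x00,0x00,0x00,0x28,0x00

# Offset 0x00, length 2: the "BM" magic header
$magic = [Text.Encoding]::ASCII.GetString($bmpHeader, 0, 2)

# Offset 0x02, length 4: the length of the file
$fileLength = [BitConverter]::ToUInt32($bmpHeader, 0x02)

# Offset 0x0A, length 4: the absolute start of the bitmap data
$dataStart = [BitConverter]::ToUInt32($bmpHeader, 0x0A)

# Everything after the header is pixel data, 4 bytes per pixel
$pixelCount = ($fileLength - $dataStart) / 4

"{0}: {1} bytes, data at 0x{2:X2}, {3} pixels" -f $magic, $fileLength, $dataStart, $pixelCount
```

With the bytes shown, this reports a 94-byte (0x5E) file whose data starts at 0x36, leaving room for the 10 pixels described above.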
Now, for the script:
Note: A Format-Hex cmdlet is now included in PowerShell by default (version 5.0 and later).
##############################################################################
##
## Format-Hex
##
## From Windows PowerShell Cookbook (O'Reilly)
## by Lee Holmes (http://www.leeholmes.com/guide)
##
##############################################################################
<#
.SYNOPSIS
Outputs a file or pipelined input as a hexadecimal display. To determine the
offset of a character in the input, add the number at the far left of the row
to the number at the top of the column for that character.
.EXAMPLE
PS >"Hello World" | Format-Hex
0 1 2 3 4 5 6 7 8 9 A B C D E F
00000000 48 00 65 00 6C 00 6C 00 6F 00 20 00 57 00 6F 00 H.e.l.l.o. .W.o.
00000010 72 00 6C 00 64 00 r.l.d.
.EXAMPLE
PS >Format-Hex c:\temp\example.bmp
#>
[CmdletBinding(DefaultParameterSetName = "ByPath")]
param(
## The file to read the content from
[Parameter(ParameterSetName = "ByPath", Position = 0)]
[string] $Path,
## The input (bytes or strings) to format as hexadecimal
[Parameter(
ParameterSetName = "ByInput", Position = 0,
ValueFromPipeline = $true)]
[Object] $InputObject
)
begin
{
## Create the array to hold the content. If the user specified the
## -Path parameter, read the bytes from the path.
[byte[]] $inputBytes = $null
if($Path) { $inputBytes = [IO.File]::ReadAllBytes( (Resolve-Path $Path) ) }
## Store our header, and formatting information
$counter = 0
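## Column labels are spaced three characters apart so that each one sits
## over its two-digit hex byte, after the 8-digit offset and a space.
$header = "          0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F"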
$nextLine = "{0} " -f [Convert]::ToString(
$counter, 16).ToUpper().PadLeft(8, '0')
$asciiEnd = ""
## Output the header
"`r`n$header`r`n"
}
process
{
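## If they specified the -InputObject parameter, retrieve the bytes
## from that input. Check whether the parameter was bound rather than
## whether its value is "true", so that a zero byte or an empty string
## is not silently skipped.
if($PSBoundParameters.ContainsKey("InputObject"))
{
## If it's an actual byte, use it as the input for this pipeline item.
if($InputObject -is [Byte])
{
$inputBytes = $InputObject
}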
else
{
## Otherwise, convert it to a string and extract the bytes
## from that.
$inputString = [string] $InputObject
$inputBytes = [Text.Encoding]::Unicode.GetBytes($inputString)
}
}
## Now go through the input bytes
foreach($byte in $inputBytes)
{
## Display each byte, in 2-digit hexadecimal, and add that to the
## left-hand side.
$nextLine += "{0:X2} " -f $byte
## If the character is printable, add its ascii representation to
## the right-hand side. Otherwise, add a dot to the right hand side.
if(($byte -ge 0x20) -and ($byte -le 0xFE))
{
$asciiEnd += [char] $byte
}
else
{
$asciiEnd += "."
}
$counter++;
## If we've hit the end of a line, combine the right half with the
## left half, and start a new line.
if(($counter % 16) -eq 0)
{
"$nextLine $asciiEnd"
$nextLine = "{0} " -f [Convert]::ToString(
$counter, 16).ToUpper().PadLeft(8, '0')
$asciiEnd = "";
}
}
}
end
{
## At the end of the file, we might not have had the chance to output
## the end of the line yet. Only do this if we didn't exit on the 16-byte
## boundary, though.
if(($counter % 16) -ne 0)
{
while(($counter % 16) -ne 0)
{
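## Pad with three spaces, the width of one "XX " hex column, so the
## ASCII section stays aligned with the rows above.
$nextLine += "   "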
$asciiEnd += " "
$counter++;
}
"$nextLine $asciiEnd"
}
""
}
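For readers who want to experiment with the row layout on its own, here is a compact sketch of the per-row logic. Format-HexRow is a hypothetical helper, not part of the script above or of PowerShell. It uses the same printable range as the script (0x20 through 0xFE) and pads the hex section to 47 characters, the width of sixteen two-digit bytes separated by single spaces.

```powershell
## Hypothetical helper: formats a single row of up to 16 bytes
function Format-HexRow
{
    param([int] $Offset, [byte[]] $Bytes)

    ## Two-digit uppercase hex for each byte, separated by spaces
    $hex = ($Bytes | ForEach-Object { "{0:X2}" -f $_ }) -join " "

    ## Printable bytes become characters, everything else becomes a dot
    $ascii = -join ($Bytes | ForEach-Object {
        if(($_ -ge 0x20) -and ($_ -le 0xFE)) { [char] $_ } else { "." }
    })

    ## 8-digit offset, hex section padded to full width, then the text
    "{0:X8} {1} {2}" -f $Offset, $hex.PadRight(47), $ascii
}

$row = Format-HexRow 0 ([Text.Encoding]::ASCII.GetBytes("BM^"))
$row
```

Because the hex section is padded to a fixed width, a short final row lines up with the full rows above it, which is the same job the padding loop in the end block performs.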
[Edit: Monad has now been renamed to Windows PowerShell. This script or discussion may require slight adjustments before it applies directly to newer builds.]