Comparing hashtables in PowerShell

Wednesday, May 18, 2022

I'm a huge fan of dictionaries. I know: when the only tool is a hammer, every problem looks like a nail, but dictionaries really do come in handy often. In many lookup and selection tasks, a dictionary is the fastest solution.

Also, I'm a huge fan of PowerShell and it does not seem like a coincidence that dictionaries (albeit under the name Hashtable) are first-class citizens there.

So, I'm forever building up dictionaries and then finding the differences between them. You know the drill:

Set up hashtable a
Set up hashtable b
for each key in a: if [not] also in b then...
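
Spelled out in PowerShell, that drill might look something like the following. It is only a minimal sketch: the sample data and the variable names $onlyInA and $inBoth are made up for illustration.

```powershell
$a = @{ 'one' = 1; 'two' = 2 }
$b = @{ 'three' = 3; 'two' = 22 }

# Keys that are in $a but not in $b
$onlyInA = foreach ($key in $a.Keys) {
    if (-not $b.ContainsKey($key)) { $key }
}

# Keys that are in both hashtables
$inBoth = foreach ($key in $a.Keys) {
    if ($b.ContainsKey($key)) { $key }
}

$onlyInA   # one
$inBoth    # two
```

Finding the keys that are only in $b takes yet another loop in the other direction, which is exactly the kind of repetition that gets tedious.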

So, a while ago a colleague of mine mentioned Compare-Object.

Wait - WHAT? Where has Compare-Object been all my life? [Spoiler: it's been around since at least PowerShell version 3.0 and I missed it all that time!]

Basically, what it does is process two sets of objects and classify them as "only in the left set", "only in the right set" or "in both sets". It does this by sending all objects to output as InputObject and adding a property SideIndicator with three possible values: "<=", "=>" and "==". Normally, only differences are shown (meaning SideIndicator "<=" and "=>") but using the switches -IncludeEqual and -ExcludeDifferent you can select which objects are returned. Very straightforward but also very powerful.
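
To see the three SideIndicator values side by side, here is a small sketch using two plain string lists. The fruit names and the variables $left, $right, $result and $same are sample data I picked for illustration only.

```powershell
$left  = 'apple', 'banana', 'cherry'
$right = 'banana', 'cherry', 'date'

# -IncludeEqual adds the '==' rows to the default '<=' and '=>' rows:
# apple gets '<=', date gets '=>', banana and cherry get '=='
$result = Compare-Object -ReferenceObject $left -DifferenceObject $right -IncludeEqual
$result

# Combining -IncludeEqual with -ExcludeDifferent keeps only the matches
$same = Compare-Object $left $right -IncludeEqual -ExcludeDifferent
$same.InputObject
```

The first positional argument is the reference (left) set and the second is the difference (right) set, which is what the direction of the arrows refers to.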

Most of the examples deal with lists of strings, but considering my predilection for dictionaries, I of course immediately think of reducing my dictionary code to one-liners. The keys in hashtables are often (but not always) strings, so comparing the keys between hashtables should yield the differences straight away:

$a = @{ "one" = 1; "two" = 2 }
$b = @{ "three" = 3; "two" = 22 }
Compare-Object $a.Keys $b.Keys

InputObject SideIndicator
----------- -------------
{three, two} =>
{two, one} <=

Both hashtables contain a key 'two', but the first ($a) also contains 'one' and the second ($b) 'three'. The values of the 'two' items are different but that shouldn't matter. I would expect:

InputObject SideIndicator
----------- -------------
three =>
one <=

to show that key 'one' is present only in $a and key 'three' is only present in $b. But instead, it shows that the key collection {two, one} is unique to $a, and {three, two} is unique to $b! We already knew that... It demonstrates that Compare-Object treats each collection of keys as a single object, and those two objects just happen to be different.

The problem seems to be that the Keys (and Values too, by the way) of a hashtable are of type ICollection, as revealed by Get-Member:

IsSynchronized Property bool IsSynchronized {get;}
Keys Property System.Collections.ICollection Keys {get;}
SyncRoot Property System.Object SyncRoot {get;}
Values Property System.Collections.ICollection Values {get;}

It seems Compare-Object does not know how to handle those. (The fact that they're untyped collections of objects reflects that a hashtable can hold any type of value, and also that a key can be any type of object.) The solution ought to be to expand the keys into an array, so I first tried using Select-Object on the keys:

Compare-Object ($a.Keys | Select-Object) ($b.Keys | Select-Object)

That's a one-liner and it works all right but it's super ugly. Fortunately there's another way to turn an ICollection into an array: explicitly make it an array using @():

Compare-Object @($a.Keys) @($b.Keys)

Hooray, that's a one-liner I can remember! It shows:

InputObject SideIndicator
----------- -------------
three =>
one <=

just like it should. Finding only corresponding keys is easy, too:

Compare-Object @($a.Keys) @($b.Keys) -IncludeEqual -ExcludeDifferent

InputObject SideIndicator
----------- -------------
two ==

So, management summary: to quickly compare the keys in two hashtables, use

Compare-Object @($a.Keys) @($b.Keys)

This also works for keys that are objects, by the way. Compare-Object handles sets of objects just fine: it just doesn't know how to deal with ICollections. And adding @() around the keys is just the nudge it needs.
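
A quick way to see what @() changes is to inspect the types involved. This is just a sketch of my own to illustrate the point: the type names in the comments are what .NET reports for a Hashtable, and the integer-keyed hashtables $x and $y are made-up sample data.

```powershell
$a = @{ 'one' = 1; 'two' = 2 }

# The Keys property is a special collection type, not an array or a list
$a.Keys.GetType().FullName                # System.Collections.Hashtable+KeyCollection
$a.Keys -is [System.Collections.IList]    # False

# Wrapping it in @() produces a plain object array
@($a.Keys).GetType().FullName             # System.Object[]

# The same trick with keys that are not strings
$x = @{ 1 = 'a'; 2 = 'b' }
$y = @{ 2 = 'c'; 3 = 'd' }
$diff = Compare-Object @($x.Keys) @($y.Keys)
$diff   # 1 is only in $x ('<='), 3 is only in $y ('=>')
```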

To get back the actual keys, you need to look at the InputObject property of the output of Compare-Object, but that's another step you can avoid by passing -PassThru. This will cause Compare-Object to output the requested objects from both sets directly, without wrapping them in InputObject. (SideIndicator is still attached to each object as a note property, but it doesn't show up when the keys are displayed.)

Compare-Object @($a.Keys) @($b.Keys) -PassThru -IncludeEqual
two
three
one

You often don't need the side indicators, for instance when you specify -IncludeEqual and -ExcludeDifferent together and only the matches come back. Only if you need to distinguish between 'only in left set' and 'only in right set' would you use something like:

Compare-Object @($a.Keys) @($b.Keys) |
    Where-Object SideIndicator -eq '<=' | # or '=>'
    Select-Object -ExpandProperty InputObject

But that's not a nice one-liner at all!
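
One way to keep the call site short is to tuck the pipeline away in a small function. The sketch below is only a suggestion: the function name Get-KeyDifference and its parameters are hypothetical, not an existing cmdlet, and I have not handled empty hashtables, which Compare-Object may reject depending on the PowerShell version.

```powershell
# Hypothetical helper: returns the keys on one side of the comparison.
# -Side takes the same values Compare-Object uses for SideIndicator.
function Get-KeyDifference {
    param(
        [hashtable] $Left,
        [hashtable] $Right,
        [ValidateSet('<=', '=>', '==')]
        [string] $Side = '<='
    )

    Compare-Object @($Left.Keys) @($Right.Keys) -IncludeEqual |
        Where-Object SideIndicator -eq $Side |
        Select-Object -ExpandProperty InputObject
}

$a = @{ 'one' = 1; 'two' = 2 }
$b = @{ 'three' = 3; 'two' = 22 }

Get-KeyDifference $a $b               # one
Get-KeyDifference $a $b -Side '=>'    # three
Get-KeyDifference $a $b -Side '=='    # two
```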

Now to clean up all that old code...