phash: canonicalize order, fix handling of ignored duplicates

Canonicalize the order of the prehash entries, so we don't have to
worry about looking up both orderings of the same edge.

When we find a collision that we decide to ignore, there is no point
in adding the same edge into the array again; instead, just skip the
current edge.
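
A minimal standalone sketch of the two ideas above, not the actual
phash.pl code. The helper names edge_key and add_edges and the sample
data are hypothetical. Unlike gen_hash_n, this sketch records a real
collision and carries on instead of giving up on the current seed.

```perl
use strict;
use warnings;

# Hypothetical helper (not in phash.pl): build one canonical key for an
# undirected edge, so (a,b) and (b,a) map to the same hash key.
sub edge_key {
    my ($v1, $v2) = @_;
    ($v1, $v2) = ($v2, $v1) if ($v1 > $v2);   # Canonicalize order
    return "$v1,$v2";
}

# Hypothetical driver: $items is a list of [name, v1, v2, value].
# Returns (\%edges, \@collisions).
sub add_edges {
    my ($items) = @_;
    my (%edges, %value, @collisions);
    foreach my $item (@$items) {
        my ($name, $v1, $v2, $val) = @$item;
        $value{$name} = $val;
        my $key = edge_key($v1, $v2);
        if (defined(my $other = $edges{$key})) {
            # Same edge, same value: harmless duplicate, skip this edge
            next if ($value{$other} == $val);
            # Same edge, different value: a real collision
            push @collisions, "$key: $name with $other";
            next;
        }
        $edges{$key} = $name;
    }
    return (\%edges, \@collisions);
}

my ($edges, $collisions) = add_edges([
    ['mov', 3, 7, 10],
    ['MOV', 7, 3, 10],   # reversed endpoints, same value: skipped
    ['add', 7, 3, 11],   # same edge, different value: collision
]);
print "edges: ", join(' ', sort keys %$edges), "\n";
print "collision: $_\n" foreach @$collisions;
```

With a single canonical key per edge, one hash lookup is enough to
detect both the harmless duplicate and the real collision.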
H. Peter Anvin 2008-05-25 18:44:44 -07:00
parent 14f9ea2925
commit 1df0b9ee2d


@@ -79,11 +79,14 @@ sub gen_hash_n($$$$) {
 %edges = ();
 foreach $k (@keys) {
 my ($pf1, $pf2) = prehash($k, $n, $sv);
+($pf1,$pf2) = ($pf2,$pf1) if ($pf1 > $pf2); # Canonicalize order
 my $pf = "$pf1,$pf2";
 my $e = ${$href}{$k};
 my $xkey;
-if (defined($xkey = $edges{$pf}) && ${$href}{$xkey} != $e) {
+if (defined($xkey = $edges{$pf})) {
+next if ($e == ${$href}{$xkey}); # Duplicate hash, safe to ignore
 if (defined($run)) {
 print STDERR "$run: Collision: $pf: $k with $xkey\n";
 }