13th February 2018

Unix sort Issue

I wondered why Unix sort behaved strangely.

printf "A0 1\nA  1\n" | sort

delivered

A0 1
A  1

Of course, I expected A to come before A0. This was strange, as printf "A1 1\nA  1\n" | sort produced

A  1
A1 1

just as expected. Also, printf "A0\nA\n" | sort orders A before A0, as expected.

Solution: Set LC_ALL=C for the sort call, which makes sort compare plain byte values instead of using the locale's collation rules. So

printf "A0 1\nA  1\n" | LC_ALL=C sort

delivered

A  1
A0 1
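
If this is needed more often, the override can be wrapped in a small shell function. This is only a minimal sketch: the name sort_bytewise is made up for illustration and is not a standard command. It assumes a POSIX shell, where a variable assignment placed in front of a command applies to that command only.

```shell
#!/bin/sh
# sort_bytewise is a hypothetical helper name, not a standard command.
# LC_ALL=C applies only to this one sort call, so the rest of the
# shell session keeps its usual locale.
sort_bytewise() {
    LC_ALL=C sort "$@"
}

# Space (0x20) sorts before '0' (0x30) in the C locale.
printf 'A0 1\nA  1\n' | sort_bytewise
```

Because the assignment is scoped to the single sort invocation, other programs in the same session are not affected.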

I realized this when I called sort with the --debug flag,

printf "A0 1\nA  1\n" | sort --debug

which shows the employed locale:

sort: using ‘en_US.UTF-8’ sorting rules
A0 1
____
A  1
____

To check that my expected sort order was indeed the "right" order, I wrote the following simple Perl script, which confirmed my understanding of ASCII sorting:

#!/bin/perl -W
use strict;
my @F = <>;     # slurp all input lines
# Without "use locale", Perl's default sort compares strings by code point.
for my $i (sort @F) { print $i; }
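
The ASCII order confirmed by the Perl script comes down to the byte values of the characters involved. As a rough sketch, these can be inspected with od. The helper name byte_values is invented for this example, and the exact spacing of od's output may differ between implementations.

```shell
#!/bin/sh
# byte_values is a hypothetical helper name used only for this example.
# It prints each byte of its argument as a hex value using od.
byte_values() {
    printf '%s' "$1" | od -An -tx1
}

# A space is byte 20 (hex) and the digit 0 is byte 30 (hex),
# so "A  1" precedes "A0 1" in a bytewise comparison.
byte_values ' 0'
```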