## NaN values break sorting in Python

Some time ago, I gave a talk at the Boston Python meetup. The bottom line: NaN values in a list silently break sorting. Details below.

NaN stands for "not a number". It is a special floating-point value (not a separate data type) used for undefined or unrepresentable results. For example:

In :
a = float('inf')
b = float('inf')

In :
a / b

Out:
nan

In :
a - b

Out:
nan

It’s very easy for NaN to enter your data:

In :
from scipy.stats import pearsonr

a = [0, 0, 0, 0, 0]
b = [0, 0, 0, 0, 0]

# pearsonr returns (correlation, p-value); the correlation of constant input is undefined
pearsonr(a, b)[0]

Out:
nan

It’s also very easy for NaN to quietly propagate:

In :
c = float('nan')
c + 4022

Out:
nan
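
As a small illustration of that propagation, the sketch below aggregates a list that contains a single NaN. The `readings` list is hypothetical data made up for the example; the point is only that one bad value is enough to turn every downstream aggregate into NaN.

```python
import math

nan = float('nan')
readings = [2.5, nan, 4.0, 1.5]  # hypothetical data with one bad value

# Any arithmetic that touches NaN yields NaN, so the sum is NaN
total = sum(readings)
# ...and so is everything computed from the sum
mean = total / len(readings)

print(math.isnan(total))  # True
print(math.isnan(mean))   # True
```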

### NaN silently breaks sorting

In :
d = [4, float('nan'), 2, 1]


NaN values break sorting because they are neither smaller than, larger than, nor equal to any number, so the comparisons a sort relies on give inconsistent answers. They don’t even compare as equal to themselves.
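
One way to see the effect is to sort the same values from two different starting orders. This is a rough sketch: `non_nan_in_order` is a hypothetical helper written for this example, and because the exact position of each element can differ between CPython versions, the code only checks whether the real numbers came out in ascending order rather than printing a specific list.

```python
import math

def non_nan_in_order(values):
    """Return True if the non-NaN elements appear in ascending order."""
    kept = [x for x in values if not math.isnan(x)]
    return all(a <= b for a, b in zip(kept, kept[1:]))

nan = float('nan')

# Same four values, two different input orders
scrambled = sorted([4, nan, 2, 1])
presorted = sorted([1, 2, nan, 4])

# No exception is raised in either case; only the first result is wrong
print(non_nan_in_order(scrambled))  # False
print(non_nan_in_order(presorted))  # True
```

Whether the output looks right depends on where the NaN happened to sit in the input, which is why the problem is easy to miss in testing.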

In :
e = float('nan')

In :
e < 15234

Out:
False

In :
e > 15234

Out:
False

In :
e == 15234

Out:
False

In :
e == e

Out:
False
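
Since NaN is the only float value that is not equal to itself, that same property can be used to detect it and fail loudly instead of returning a wrong order. The sketch below is one possible guard, not a standard library feature; `has_nan` and `checked_sorted` are hypothetical names chosen for this example.

```python
def has_nan(values):
    """Return True if any element is NaN, using the fact that NaN != NaN."""
    return any(x != x for x in values)

def checked_sorted(values):
    """Sort values, but raise instead of returning a silently wrong order."""
    values = list(values)
    if has_nan(values):
        raise ValueError("cannot sort: input contains NaN")
    return sorted(values)

print(checked_sorted([4, 2, 1]))  # [1, 2, 4]

try:
    checked_sorted([4, float('nan'), 2, 1])
except ValueError as exc:
    print(exc)  # cannot sort: input contains NaN
```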

### Ways to get around sorting with NaN values

Either drop the NaN values before sorting, or use a sort key that gives them a fixed position.

In :
import math
f = [4, float('nan'), 2, 1]

In :
sorted([x for x in f if not math.isnan(x)])

Out:
[1, 2, 4]

In :
sorted([x for x in f if x == x])

Out:
[1, 2, 4]

In :
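# Map NaN to -inf so it sorts last in descending order, even if the data contains zeros or negative numbers
sorted(f, key=lambda x: -math.inf if math.isnan(x) else x, reverse=True)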

Out:
[4, 2, 1, nan]
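
To keep the NaN values and place them at the end of an ascending sort, a tuple key is one option. This is a minimal sketch, assuming you want NaN last; `nan_last_key` is a hypothetical helper, and the list below adds a zero and a negative number to show that the approach does not depend on the data being positive.

```python
import math

def nan_last_key(x):
    # Hypothetical helper: the first tuple field sends NaN to the end,
    # and NaN itself is replaced by 0.0 so it is never compared.
    if math.isnan(x):
        return (True, 0.0)
    return (False, x)

f = [4, float('nan'), -2, 0, 1]
result = sorted(f, key=nan_last_key)
print(result)  # [-2, 0, 1, 4, nan]
```

Because `False` sorts before `True`, every real number comes before every NaN, and the sort never has to compare a NaN against a number.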