Sort a list of real numbers mixed with letters

I have a list of data that I need to sort, and unfortunately the naming scheme for these objects is not very consistent. The data is a list of strings that are usually real numbers but sometimes have a letter at the end. Some examples of valid values in this list:

# this is how it should be sorted
['1', '1.1', '1.2', '2', '2.1A', '2.1B', '2.2A', '101.1', '101.2']

Since the values are in the database, my first thought was to use the following Django query to return sorted results, but this is what it returns:

# unneeded code removed
choices = [l.number for l in Locker.objects.extra(
    select={'asnumber': 'CAST(number as BYTEA)'}).order_by('asnumber')]
print(choices)
==> ['1', '1.1', '101.1', '101.2', '2', '2.1A', '2.1B', '2.2A']

Unfortunately, that does not sort the list the way it should. So my new plan is to write a key function to use with Python's built-in sorted(), but I'm still not sure how to do this. I need a way to sort by the numeric part of the string first and then, as a secondary sort, by the letter attached to the end.

Any tips on where to go with this?

+3
5 answers

Let the DBMS do the sorting; it is very good at that. You can hardly compete with its performance in your application code.

If all you have are decimal numbers with an optional trailing A or B, you can simply:

SELECT *
FROM  (
   SELECT unnest(
    ARRAY['1', '1.1', '1.2', '2', '2.1A', '2.1B', '2.2A', '101.1', '101.2']) AS s
   ) x
ORDER  BY rtrim(s, 'AB')::numeric, s;

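The subquery with ARRAY and unnest() only builds a quick test case. The ORDER BY clause is what matters: rtrim() strips the trailing letters so the remainder can be cast to numeric, and the original string s breaks ties.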
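For comparison, the same ordering rule can be mirrored on the Python side. This is only a sketch that assumes, like the query above, that the only possible suffixes are A and B. The name order_key and the shuffled sample list are illustrative, and Decimal stands in for PostgreSQL's numeric type.

```python
from decimal import Decimal

# Sample values from the question, deliberately shuffled.
values = ['2.1B', '101.2', '1', '2.2A', '1.2', '2', '101.1', '2.1A', '1.1']

def order_key(s):
    # Same idea as: ORDER BY rtrim(s, 'AB')::numeric, s
    # Strip trailing A/B, compare numerically, then break ties on the raw string.
    return (Decimal(s.rstrip('AB')), s)

result = sorted(values, key=order_key)
print(result)
```

Doing this in the database is still preferable when the rows come from a query anyway; the Python version is mainly useful for data that is already in memory.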

+4
import string

x = ['1', '1.1', '1.2', '2', '2.1A', '2.1B', '2.2A', '101.1', '101.2']

letters = tuple(string.ascii_letters)

def change(value):
    # Sort by the numeric portion first, then by the trailing letter (if any).
    if value.endswith(letters):
        return (float(value[:-1]), value[-1])
    return (float(value), '')

my_list = sorted(x, key=change)

Result:

>>> my_list
['1', '1.1', '1.2', '2', '2.1A', '2.1B', '2.2A', '101.1', '101.2']
+1

A key function that splits off the trailing letters:

from itertools import takewhile

def sort_key(value):
    cut_point = len(value) - len(list(takewhile(str.isalpha, reversed(value))))
    return (float(value[:cut_point]), value[cut_point:])

sorted((
    l.number
    for l in Locker.objects.extra(select={'asnumber': 'CAST(number as BYTEA)'})
), key = sort_key)
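To try the key function without a database, it can be run against a plain list. The sketch below repeats sort_key so that it runs on its own; the shuffled sample list is made up from the values in the question.

```python
from itertools import takewhile

def sort_key(value):
    # Count the trailing alphabetic characters, then split the string there.
    suffix_len = len(list(takewhile(str.isalpha, reversed(value))))
    cut_point = len(value) - suffix_len
    return (float(value[:cut_point]), value[cut_point:])

sample = ['101.2', '2.1B', '1', '2.2A', '101.1', '2.1A', '1.2', '2', '1.1']
ordered = sorted(sample, key=sort_key)
print(ordered)
```

Because the key is a tuple, Python compares the numeric part first and only falls back to the letter suffix when two numbers are equal.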
0

Convert the numeric part to a number (float or decimal) and use it as the sort key. Python's built-in sort (Timsort) does the rest.


Note that cmp is no longer available in Python 3.x.
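Python 3.x dropped the cmp argument of sorted() and list.sort(). If an old-style comparison function is already at hand, functools.cmp_to_key can adapt it. A minimal sketch follows; split_value and compare are hypothetical helpers written for illustration, not taken from the answer, and they assume at most one trailing letter.

```python
from functools import cmp_to_key

def split_value(s):
    # Hypothetical helper: split off a single trailing letter, if present.
    if s and s[-1].isalpha():
        return float(s[:-1]), s[-1]
    return float(s), ''

def compare(a, b):
    # Old-style comparison: negative, zero, or positive result.
    ka, kb = split_value(a), split_value(b)
    return (ka > kb) - (ka < kb)

data = ['101.1', '2.1B', '1', '2.1A', '1.1']
result = sorted(data, key=cmp_to_key(compare))
print(result)
```

A plain key function is usually simpler and faster; cmp_to_key is mostly useful when porting existing comparison code.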

0

I would highly recommend storing the value as two integers and a text field. Sorting by major_number, minor_number, revision will then work exactly as expected. You can either define asnumber as a view at the database level, or as a class built on the three underlying fields with an associated __cmp__().
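On the Python side, the three-field idea could look roughly like the sketch below. The class name LockerNumber, the parse helper, and the assumed input format (digits, an optional ".digits" part, optional trailing letters) are illustrative assumptions. Since Python 3 ignores __cmp__, the sketch relies on dataclass(order=True), which compares instances field by field.

```python
import re
from dataclasses import dataclass

@dataclass(order=True, frozen=True)
class LockerNumber:
    # Field order defines the sort order: major, then minor, then revision.
    major_number: int
    minor_number: int = 0
    revision: str = ''

    @classmethod
    def parse(cls, text):
        # Assumed format: digits, optional ".digits", optional trailing letters.
        m = re.fullmatch(r'(\d+)(?:\.(\d+))?([A-Za-z]*)', text)
        if m is None:
            raise ValueError('unexpected locker number: %r' % text)
        return cls(int(m.group(1)), int(m.group(2) or 0), m.group(3))

names = ['101.2', '2.1B', '1', '2.2A', '101.1', '2.1A', '1.2', '2', '1.1']
ordered = sorted(names, key=LockerNumber.parse)
print(ordered)
```

One design consequence: with the minor part stored as an integer, '1.10' sorts after '1.2' (version-style ordering) rather than before it as a decimal would, and '2' and '2.0' compare as equal. Whether that is desirable depends on how the locker numbers are actually assigned.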

0