How can I consistently convert strings like "3.71B" and "4M" to numbers in Python?

I have some rather mangled code that almost produces the tangible price/book from Yahoo Finance for companies (a nice module called ystockquote already gets the intangible price/book value).

My problem is this:

For one of the variables in the calculation, I get strings like 10.89B and 4.9M, where B and M stand for billions and millions, respectively. I'm having trouble converting them to numbers. Here's where I am:

shares=[''.join(node.findAll(text=True)).strip().replace('M','000000').replace('B','000000000').replace('.','') for node in soup2.findAll('td')[110:112]]

It's pretty messy, but I think it would work if, instead of

.replace('M','000000').replace('B','000000000').replace('.','') 

I used a regex with variables. I guess the question is simply which regex and variables. Other suggestions are also welcome.

EDIT:

To be specific, I hope to get something that works for numbers with zero, one, or two decimal places, but these answers all look useful.

+5
5 answers
>>> from decimal import Decimal
>>> d = {
        'M': 6,
        'B': 9
}
>>> def text_to_num(text):
        if text[-1] in d:
            num, magnitude = text[:-1], text[-1]
            return Decimal(num) * 10 ** d[magnitude]
        else:
            return Decimal(text)

>>> text_to_num('3.17B')
Decimal('3170000000.00')
>>> text_to_num('4M')
Decimal('4000000')
>>> text_to_num('4.1234567891234B')
Decimal('4123456789.1234000000000')

You can also call int() on the result if you want an integer.
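
As a rough illustration of the int() step, assuming the text_to_num function above (repeated here so the snippet runs on its own): int() truncates any fractional part, so the conversion is lossless only when the scaled value is already a whole number.

```python
from decimal import Decimal

# Suffix -> power of ten, as in the answer above.
d = {'M': 6, 'B': 9}

def text_to_num(text):
    if text[-1] in d:
        num, magnitude = text[:-1], text[-1]
        return Decimal(num) * 10 ** d[magnitude]
    return Decimal(text)

# Whole number after scaling, so nothing is lost.
shares = int(text_to_num('10.89B'))

# More than nine decimal places: the leftover fraction is dropped.
truncated = int(text_to_num('4.1234567891234B'))
```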

+14

Parse the numbers as a float and use a multiplier mapping:

multipliers = dict(M=10**6, B=10**9)
def sharesNumber(nodeText):
    nodeText = nodeText.strip()
    mult = 1
    if nodeText[-1] in multipliers:
        mult = multipliers[nodeText[-1]]
        nodeText = nodeText[:-1]
    return float(nodeText) * mult
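
For instance, with the sample strings from the question. The function is repeated so the snippet runs on its own, and the padded input is only a made-up stand-in for scraped cell text:

```python
multipliers = dict(M=10**6, B=10**9)

def sharesNumber(nodeText):
    nodeText = nodeText.strip()
    mult = 1
    if nodeText[-1] in multipliers:
        mult = multipliers[nodeText[-1]]
        nodeText = nodeText[:-1]
    return float(nodeText) * mult

# Suffixed values are scaled; a plain number passes through as a float.
values = [sharesNumber(text) for text in (' 10.89B ', '4.9M', '1234')]
```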
+4
num_replace = {
    'B' : 1000000000,
    'M' : 1000000,
}

a = "4.9M" 
b = "10.89B" 

def pure_number(s):
    mult = 1.0
    while s[-1] in num_replace:
        mult *= num_replace[s[-1]]
        s = s[:-1]
    return float(s) * mult 

pure_number(a) # 4900000.0
pure_number(b) # 10890000000.0

It also handles stacked suffixes:

pure_number("5.2MB") # 5200000000000000.0

To accept lowercase suffixes as well, normalize the string with .lower() or .upper() before the lookup.
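
One way to make the lookup case-insensitive with .upper(), as a minimal sketch. It assumes lowercase suffixes such as "4.9m" should be accepted, and the name pure_number_any_case is made up for illustration:

```python
num_replace = {
    'B': 1000000000,
    'M': 1000000,
}

def pure_number_any_case(s):
    # Upper-case first so 'm' and 'b' hit the same table entries.
    s = s.strip().upper()
    mult = 1.0
    # Peel suffixes off the end, multiplying as we go.
    while s and s[-1] in num_replace:
        mult *= num_replace[s[-1]]
        s = s[:-1]
    return float(s) * mult
```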

+2
num_replace = {
    'B' : 'e9',
    'M' : 'e6',
}

def str_to_num(s):
    if s[-1] in num_replace:
        s = s[:-1]+num_replace[s[-1]]
    return int(float(s))

>>> str_to_num('3.71B')
3710000000L
>>> str_to_num('4M')
4000000

So '3.71B' -> '3.71e9' -> 3710000000L, etc.
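
Since the question asks about a regex, the same suffix-to-exponent substitution can be sketched with re.sub and a replacement function. This is only an illustration: the name str_to_num_re and the pattern are assumptions, not part of the answer above.

```python
import re

num_replace = {
    'B': 'e9',
    'M': 'e6',
}

# Match a single trailing M or B.
suffix_re = re.compile(r'[MB]$')

def str_to_num_re(s):
    # Swap the suffix for its exponent, e.g. '3.71B' -> '3.71e9'.
    s = suffix_re.sub(lambda m: num_replace[m.group()], s.strip())
    return int(float(s))
```

Strings without a suffix are left untouched by the substitution and go straight to float().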

+2

It may be possible to use eval safely here!! :-)

Consider the following snippet:

>>> d = { "B" :' * 1e9', "M" : '* 1e6'}
>>> s = "1.493B"
>>> ll = [d.get(c, c) for c in s]
>>> eval(''.join(ll), {}, {})
1493000000.0

Now put it all together in a neat one-liner:

d = { "B" :' * 1e9', "M" : '* 1e6'}

def human_to_int(s):
    return eval(''.join([d.get(c, c) for c in s]), {}, {})

print human_to_int('1.439B')
print human_to_int('1.23456789M')

Returns:

1439000000.0
1234567.89
+1