It took me hours to understand how regular expressions work. Here is a brief explanation of the symbols used in the code.
. any single character (except a newline, by default)
? zero or one of the previous character
+ one or more of the previous character
* zero or more of the previous character
[0-9] any single digit from 0 to 9
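The symbols above can be tried out one at a time. The following is a minimal sketch, not code from the original project: the sample patterns and strings are made up for illustration, and re.fullmatch is used so that a pattern only counts as matching when it covers the whole string.

```python
import re

# Each entry: (pattern, text, whether the whole text should match the pattern).
# The patterns and texts are illustrative examples only.
checks = [
    (r"a.c", "abc", True),      # . matches any single character
    (r"a.c", "ac", False),      # . needs exactly one character
    (r"ab?c", "ac", True),      # ? allows zero "b"
    (r"ab?c", "abbc", False),   # ? allows at most one "b"
    (r"ab+c", "abbc", True),    # + allows one or more "b"
    (r"ab+c", "ac", False),     # + requires at least one "b"
    (r"ab*c", "ac", True),      # * allows zero "b"
    (r"[0-9]+", "2024", True),  # [0-9] matches a digit, + repeats it
]

results = [re.fullmatch(pattern, text) is not None for pattern, text, _ in checks]

for (pattern, text, expected), matched in zip(checks, results):
    print(f"{pattern!r} vs {text!r}: matched={matched}, expected={expected}")
```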
Three floating point numbers
I have a data file that contains points in 3D space.
V = [
[ 0.000000, 0.000000, 30.000000], // 0 0-0
[10.298345, 0.000000, 28.177013], // 1 1-0
[ 3.182365, 9.794311, 28.177012], // 2 1-1
[-8.331539, 6.053200, 28.177017], // 3 1-2
[-8.331539, -6.053200, 28.177017], // 4 1-3
# …
];
I need to extract three floating point numbers for each point.
'[-]?[0-9]+\.[0-9]+' is the pattern that represents the floating point numbers I'm interested in.
[-]? an optional negative sign
[0-9]+ at least one digit
\. a literal decimal point
[0-9]+ at least one digit
The pattern skips integers such as the 0, 1, or 1-0 in the trailing comment, because they don't contain the dot character.
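To check the breakdown above, the pattern can be tested against individual tokens. This is a small sketch under my own assumptions: the token list is hand-picked from the sample data, and FLOAT_PATTERN is just a name chosen here for the pattern.

```python
import re

# The pattern from the text, written as a raw string so that "\." reaches
# the regex engine unchanged. FLOAT_PATTERN is an illustrative name.
FLOAT_PATTERN = r"[-]?[0-9]+\.[0-9]+"

# Hand-picked tokens: three floats from the data and three comment values.
tokens = ["0.000000", "-6.053200", "30.000000", "0", "1", "1-0"]

# Keep only the tokens that the pattern matches from start to end.
matches = [t for t in tokens if re.fullmatch(FLOAT_PATTERN, t)]
print(matches)  # ['0.000000', '-6.053200', '30.000000']
```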
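re.findall(pattern, string) scans the string for every non-overlapping match of the pattern and returns the matching substrings as a list. The re_search helper used here (defined in the full listing at the end of this section) calls it when is_single=False.
s = '[ 0.000000, 0.000000, 30.000000], // 0 0-0'
x, y, z = re_search(s, r'[-]?[0-9]+\.[0-9]+', is_single=False)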
The result of the example code is ['0.000000', '0.000000', '30.000000']. When the number of variables on the left equals the number of elements in the list, Python assigns the elements to the variables one by one.
x, y, z = ['0.000000', '0.000000', '30.000000'] is equivalent to these four lines of code.
t = ['0.000000', '0.000000', '30.000000']
x = t[0]
y = t[1]
z = t[2]
The returned values are strings, not numbers. Applying round(float(n), 6) to each one turns x, y, and z into the floats 0.0, 0.0, and 30.0.
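The conversion step can be sketched as follows. The list of strings is copied from the result above; formatting with six decimal places is my own addition, to show how the original 30.000000 notation can be recovered when printing.

```python
# Strings as returned by re.findall in the example above.
matches = ['0.000000', '0.000000', '30.000000']

# Convert each string to a float rounded to six decimal places,
# then unpack the three results into x, y, and z.
x, y, z = (round(float(n), 6) for n in matches)

print(x, y, z)     # 0.0 0.0 30.0
print(f"{z:.6f}")  # 30.000000
```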
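import re

def re_search(src, pattern, is_single=True):
    # Return the first match as a string, or every match as a list of strings.
    if is_single:
        m = re.search(pattern, src)
        # re.search returns None when nothing matches, so guard before .group()
        return m.group() if m else None
    else:
        m = re.findall(pattern, src)
        return m

s = '[ 0.000000, 0.000000, 30.000000], // 0 0-0'
# Raw string, so that "\." is passed to the regex engine unchanged.
x, y, z = re_search(s, r'[-]?[0-9]+\.[0-9]+', is_single=False)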
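The same pattern can be applied to the whole data file, line by line. The sketch below is an assumption about how that might look rather than the code I actually use: parse_points and DATA are hypothetical names, and DATA contains only the five sample rows shown earlier.

```python
import re

FLOAT_PATTERN = r"[-]?[0-9]+\.[0-9]+"

# Only the five sample rows shown earlier; the real file is longer.
DATA = """\
V = [
[ 0.000000, 0.000000, 30.000000], // 0 0-0
[10.298345, 0.000000, 28.177013], // 1 1-0
[ 3.182365, 9.794311, 28.177012], // 2 1-1
[-8.331539, 6.053200, 28.177017], // 3 1-2
[-8.331539, -6.053200, 28.177017], // 4 1-3
];
"""

def parse_points(text):
    """Return one (x, y, z) tuple of floats per line that holds three floats."""
    points = []
    for line in text.splitlines():
        numbers = re.findall(FLOAT_PATTERN, line)
        # Lines such as "V = [" and "];" contain no floats and are skipped.
        if len(numbers) == 3:
            points.append(tuple(float(n) for n in numbers))
    return points

points = parse_points(DATA)
print(len(points))  # 5
print(points[4])    # (-8.331539, -6.0532, 28.177017)
```

Checking for exactly three matches keeps the unpacking safe: a line with a different number of floats is ignored instead of raising an error.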
Two integers and one floating point number
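In the pattern '[0-9]+\.?[0-9]*', the decimal point and the digits after it are optional, so it matches integers as well as floating point numbers.
a, b, and c will be '0', '1', and '0'. The last value comes from the index in the trailing comment.
length will be '16.400000'.
s = '[ 0, 1, 16.400000, "-"], // 0'
a, b, length, c = re_search(s, r'[0-9]+\.?[0-9]*', is_single=False)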
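Since the matches come back as strings, a mixed line like this one needs a conversion that keeps integers and floats apart. The following is a rough sketch: to_number is a hypothetical helper, and it simply treats any token containing a dot as a float. Note that this pattern has no [-]? part, so it would drop the sign of a negative number.

```python
import re

NUMBER_PATTERN = r"[0-9]+\.?[0-9]*"

s = '[ 0, 1, 16.400000, "-"], // 0'
tokens = re.findall(NUMBER_PATTERN, s)
print(tokens)  # ['0', '1', '16.400000', '0']

def to_number(token):
    """Hypothetical helper: float if the token has a dot, otherwise int."""
    return float(token) if "." in token else int(token)

a, b, length, c = (to_number(t) for t in tokens)
print(a, b, length, c)  # 0 1 16.4 0
```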
Three integers
a, b, c, and d will be '0', '1', '2', and '0'. The last value, d, is the index in the trailing comment.
s = '[ 0, 1, 2], // 0'
a, b, c, d = re_search(s, r'[0-9]+\.?[0-9]*', is_single=False)
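If the comment index should not be mixed with the bracketed values, one option is to split the line at the // marker before matching. This is only a sketch of that idea, assuming every line has exactly one // comment holding a single integer, as in the sample above.

```python
import re

NUMBER_PATTERN = r"[0-9]+\.?[0-9]*"

s = '[ 0, 1, 2], // 0'

# Split into the part before "//" and the comment after it.
body, _, comment = s.partition("//")

# Match numbers only in the bracketed part, then convert them to integers.
a, b, c = (int(t) for t in re.findall(NUMBER_PATTERN, body))

# The comment holds the index on its own.
index = int(comment.strip())

print(a, b, c, index)  # 0 1 2 0
```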