Below is an example of a test case:
inpoot = "A.p.p.l.e (45) Orange (5.11) Kiwi" # WE HAVE
outpoot = "A.p.p.l.e () Orange () Kiwi" # WE WANT
The only reason I spelled inpoot incorrectly is that input is the name of a built-in Python function.
One might think that the following would work:
import string
def kill_numbers(text: str) -> str:
    text = str(text)
    return "".join(filter(lambda ch: ch not in string.digits, text))
However, the decimal point (.) in a decimal number will be preserved.
inpoot = "A.p.p.l.e (45) Orange T5.11T Kiwi 99 Apricot"
outpoot = kill_numbers(inpoot)
print(repr(outpoot))
# prints 'A.p.p.l.e () Orange T.T Kiwi  Apricot'
# We want `TT` not `T.T`
# the output contains a stray decimal point.
outpoot = kill_numbers("Strawberry 3.145 Plum")
print(repr(outpoot))
# fails to delete the `.` in `3.145`
| INPUT | BAD OUTPUT | DESIRED OUTPUT |
|---|---|---|
| "3.14" | "." | "" (empty string) |
So, how can we delete all numbers, including decimal numbers?
A substitution using regular expressions is theoretically possible.
import re
test_case = "(.4) A.p.p.l.e (44) Orange .... (4.44) Kiwi . . . . ."
result = re.sub(r"[0-9]+\.?[0-9]*\.[0-9]+", "", test_case)
print(result) # (.4) A.p.p.l.e (44) Orange .... () Kiwi . . . . .
The regular expression shown above removes 4.44 from that test case, but it leaves .4 and 44 in place, and it fails on other test cases too.
The table below shows how various regular expressions perform on various test inputs.
KEY FOR TABLE

– means that the regex does NOT match the string
+ means that the regex matches the entire string
meh means that the regex matches a small part of the string, but not the whole thing.
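| REGEX | `' 1 '` | `'2'` | `'3'` | `'365'` | `'9.43'` | `'5000'` | `'+10'` | `'3.10.4'` | `'0001'` | `'.5'` | `'.'` | `'591.'` | `''` | `'0x77F'` | `'3.456e11'` |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| `[0-9]+\.?[0-9]*\.[0-9]+` | – | – | – | – | – | meh | meh | meh | – | – | + | – | + | meh | meh |
| `[+]?[0-9]+\.?[0-9]*\.[0-9]+` | – | – | – | – | – | – | – | meh | – | – | + | – | + | meh | meh |
| `[+]?([0-9]+\.?[0-9]*\.[0-9]+)` | – | – | – | – | – | – | – | meh | – | – | + | – | + | meh | meh |
| `[0-9]*\.?[0-9]*` | meh | – | – | – | – | meh | meh | meh | – | – | – | – | – | meh | meh |
| `[0-9]+\.?[0-9]+` | + | + | + | – | – | meh | meh | meh | – | + | + | meh | + | meh | meh |
| `[0-9]+\.?[0-9]*` | – | – | – | – | – | meh | meh | meh | – | meh | + | – | + | meh | meh |
| `[0-9]*\.?[0-9]+` | – | – | – | – | – | meh | meh | meh | – | – | + | meh | + | meh | meh |
| `\d+` | – | – | – | – | meh | meh | meh | meh | – | meh | + | meh | + | meh | meh |
| `[0-9]` | – | – | – | meh | meh | meh | meh | meh | meh | meh | + | meh | + | meh | meh |
| `\d` | – | – | – | meh | meh | meh | meh | meh | meh | meh | + | meh | + | meh | meh |
| `\d*` | meh | – | – | – | meh | meh | meh | meh | – | meh | meh | meh | – | meh | meh |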
The same table in ASCII form might be easier to read and understand (here - means that the regex does NOT match the string):
REGEX                          ' 1 '  '2'  '3'  '365'  '9.43'  '5000'  '+10'  '3.10.4'  '0001'  '.5'  '.'  '591.'  ''   '0x77F'  '3.456e11'
[0-9]+\.?[0-9]*\.[0-9]+        -      -    -    -      -       meh     meh    meh       -       -     +    -       +    meh      meh
[+]?[0-9]+\.?[0-9]*\.[0-9]+    -      -    -    -      -       -       -      meh       -       -     +    -       +    meh      meh
[+]?([0-9]+\.?[0-9]*\.[0-9]+)  -      -    -    -      -       -       -      meh       -       -     +    -       +    meh      meh
[0-9]*\.?[0-9]*                meh    -    -    -      -       meh     meh    meh       -       -     -    -       -    meh      meh
[0-9]+\.?[0-9]+                +      +    +    -      -       meh     meh    meh       -       +     +    meh     +    meh      meh
[0-9]+\.?[0-9]*                -      -    -    -      -       meh     meh    meh       -       meh   +    -       +    meh      meh
[0-9]*\.?[0-9]+                -      -    -    -      -       meh     meh    meh       -       -     +    meh     +    meh      meh
\d+                            -      -    -    -      meh     meh     meh    meh       -       meh   +    meh     +    meh      meh
[0-9]                          -      -    -    meh    meh     meh     meh    meh       meh     meh   +    meh     +    meh      meh
\d                             -      -    -    meh    meh     meh     meh    meh       meh     meh   +    meh     +    meh      meh
\d*                            meh    -    -    -      meh     meh     meh    meh       -       meh   meh  meh     -    meh      meh
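A table like this can be regenerated mechanically instead of by hand. The sketch below is one way to do that, assuming that `re.fullmatch` decides "+", that any non-empty hit from `re.finditer` decides "meh", and that everything else is "-". The helper name `classify` and the shortened pattern and input lists are illustrative only, and the output is not guaranteed to agree cell for cell with the table above, which may have been produced under different matching rules.

```python
import re

def classify(pattern: str, text: str) -> str:
    """Return '+', 'meh' or '-' for one (regex, input) pair."""
    if re.fullmatch(pattern, text):
        return "+"      # the regex matches the entire string
    if any(m.group(0) for m in re.finditer(pattern, text)):
        return "meh"    # some non-empty part of the string matches
    return "-"          # nothing (or only the empty string) matches

# Shortened, illustrative lists; extend them with the rows and columns you need.
PATTERNS = [r"[0-9]+\.?[0-9]*\.[0-9]+", r"[0-9]*\.?[0-9]+", r"\d+"]
INPUTS = ["365", "9.43", "3.10.4", ".5", ".", ""]

for pattern in PATTERNS:
    row = "  ".join(f"{classify(pattern, text):<4}" for text in INPUTS)
    print(f"{pattern:<26}{row}")
```

Treating empty matches as "no match" is a design choice of this sketch: without it, patterns such as `\d*` would count as a partial match on every input.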
In my humble opinion, regular expressions are a nightmare.
To digress, it took me a long time to realize that:
IMHO = In my humble opinion. I don't speak acronym very well.
Back to business…
I cannot find a regex which satisfies the following requirements:
- the regex must not match the empty string ("")
- the regex must not match any substring of a version number, such as "3.10.4". At most one decimal point is allowed to appear in what we call a “number”
- the regex must not match free-floating decimal points (".").
Desired behavior is as follows:
| PSEUDO-NUMBER | IS_A_NUMBER() | NOTES |
|---|---|---|
| "1" | Yes | int |
| "2" | Yes | int |
| "365" | Yes | int |
| "365." | No | 365. is a float equivalent to 365.0. However, I do not want to delete the (.) at the end of the string "The number of houses was 44." |
| "9.43" | Yes | one decimal point |
| "5000" | Yes | |
| "+10" | Yes | |
| "0001" | Yes | |
| ".5" | Yes | .5 is equivalent to 0.5 |
| "1" | Yes | |
| "0x77F" | Yes | |
| "3.456e11" | Yes | pseudo-scientific notation |
| "3.10.4" | Not a number | two decimal points |
| "." | Not a number | |
| "" | Not a number | do not match the empty string |
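The rows of this table can double as a test suite for any candidate pattern. As a minimal sketch, here is one hypothetical candidate that uses lookarounds so that a match cannot start or stop in the middle of something like 3.10.4. The names `NUMBER` and `delete_numbers` are invented for illustration, the optional minus sign and the exponent sign are assumptions, and passing the rows of the table is much weaker than being correct for every input string.

```python
import re

# Hypothetical candidate pattern; an illustration, not a verified solution.
NUMBER = re.compile(
    r"""
    (?<![0-9.])                 # not glued to a preceding digit or dot
    [+-]?                       # optional sign (the minus sign is an assumption)
    (?:
        0[xX][0-9A-Fa-f]+       # hexadecimal such as 0x77F
      |
        (?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+)   # 365, 9.43 or .5
        (?:[eE][+-]?[0-9]+)?    # optional exponent such as e11
    )
    (?![0-9]|\.[0-9])           # not followed by a digit or by a dot plus digit
    """,
    re.VERBOSE,
)

def delete_numbers(text: str) -> str:
    """Remove every substring matched by the candidate pattern."""
    return NUMBER.sub("", text)

YES = ["1", "2", "365", "9.43", "5000", "+10", "0001", ".5", "0x77F", "3.456e11"]
NO = ["365.", "3.10.4", ".", ""]

for text in YES + NO:
    verdict = "Yes" if NUMBER.fullmatch(text) else "No"
    print(f"{text!r:12} {verdict}")

print(repr(delete_numbers("A.p.p.l.e (45) Orange (5.11) Kiwi")))
print(repr(delete_numbers("The number of houses was 44.")))
```

The lookbehind and lookahead do the work here: they reject a match that would begin right after a digit or dot, or end right before a digit or a dot followed by a digit, which is why the trailing period in "44." survives while the 44 is removed.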
The following are defined to be seed numbers …
(1, 365, 9.43, 5000, +10, 0001, .5, .5, 0x77F, 3.456e11)
A valid number is defined to be any seed number, or a string formed from a seed number by doing one of the following:
- Iteratively replacing any digit in a seed number with 99
- Replacing any digit in a valid number with a different digit.
- Replacing F in 0xF with 2F or F2 or A, B, C, D or E.
For example, you could replace the 5 in 5000 with 9 to get 9000.
Also, you could replace the 5 in .5 with 99 to get .99.
The above defines language L.
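The two digit-replacement rules can be sketched as a one-step generator, which makes the examples above easy to check. This is only an illustration: `one_step_variants` is a made-up helper name, it applies a single replacement rather than iterating, and it leaves out the hexadecimal rule for F entirely.

```python
DIGITS = "0123456789"

def one_step_variants(number: str) -> set:
    """Strings reachable from `number` by one digit replacement."""
    variants = set()
    for i, ch in enumerate(number):
        if ch not in DIGITS:
            continue  # signs, dots and letters are left alone
        # Rule 1: replace the digit with 99.
        variants.add(number[:i] + "99" + number[i + 1:])
        # Rule 2: replace the digit with a different digit.
        for d in DIGITS:
            if d != ch:
                variants.add(number[:i] + d + number[i + 1:])
    return variants

print("9000" in one_step_variants("5000"))  # True
print(".99" in one_step_variants(".5"))     # True
```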
My question could be reworded as follows:
What algorithm A will return s′ from input string s such that:
- s is any finite-length string of ASCII characters.
- string s′ is like string s except that all maximal substrings of s which are in language L have been replaced by empty strings.
A substring t of string s is maximal and t is in language L if it is not possible to tack on one more character to the left or to the right of t to form t′ such that t′ is a string in language L and t′ is a substring of s.
In layman’s terms, if you see “apple 12.345” you should go after “12.345” not “2.34”.
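A brute-force sketch of such an algorithm is shown below. It makes two assumptions that are not in the text above: it reads "maximal" as "not contained in any longer accepted substring", which is slightly stronger than the one-character wording, and it uses a toy membership test (`toy_language`, plain decimals only) as a stand-in for language L. All function names are hypothetical.

```python
import re
from typing import Callable, List, Tuple

def maximal_spans(s: str, in_language: Callable[[str], bool]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs of accepted substrings not contained in a longer one."""
    n = len(s)
    accepted = [(i, j) for i in range(n) for j in range(i + 1, n + 1)
                if in_language(s[i:j])]
    return [
        (i, j) for (i, j) in accepted
        if not any(a <= i and j <= b and (a, b) != (i, j) for (a, b) in accepted)
    ]

def delete_spans(s: str, spans: List[Tuple[int, int]]) -> str:
    """Return s with every character covered by a span removed."""
    pieces, pos = [], 0
    for i, j in sorted(spans):
        if i > pos:
            pieces.append(s[pos:i])
        pos = max(pos, j)
    pieces.append(s[pos:])
    return "".join(pieces)

def toy_language(t: str) -> bool:
    # Stand-in for language L: digits with at most one interior decimal point.
    return re.fullmatch(r"[0-9]+(?:\.[0-9]+)?", t) is not None

spans = maximal_spans("apple 12.345", toy_language)
print(spans)                                      # [(6, 12)]
print(repr(delete_spans("apple 12.345", spans)))  # 'apple '
```

Working with index pairs rather than substring values is what lets the sketch go after "12.345" as a whole instead of "2.34". The nested loops make it far too slow for long inputs, so it is only meant to pin down the definition.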
Indices matter. Sometimes, it makes no sense to say that the letter "a" is a substring of "abracadabra". Which letter “a” is it? Is it the letter “a” third-from-the-left, or second-from-the-left?
We define a string to be a mathematical mapping M from a finite subset of the natural numbers to the ASCII character set such that the absolute difference between the maximum of the domain of mapping M and the minimum of the domain of mapping M is one less than the cardinality of the domain of mapping M. In other words, the domain is a run of consecutive indices.
For any string SML and any string LRG, we say that SML is a substring of LRG if and only if SML[k] = LRG[k] for all k in the domain of string SML.
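This index-based definition can be sketched with dictionaries that map positions to characters. The helper names `as_mapping` and `is_substring` are invented for illustration, and the sketch additionally assumes that every index of SML must exist in LRG.

```python
def as_mapping(text: str, start: int = 0) -> dict:
    """Represent a string as a mapping from consecutive indices to characters."""
    return {start + k: ch for k, ch in enumerate(text)}

def is_substring(sml: dict, lrg: dict) -> bool:
    """SML is a substring of LRG if SML[k] == LRG[k] for every k in SML's domain."""
    return all(k in lrg and lrg[k] == ch for k, ch in sml.items())

word = as_mapping("abracadabra")
print(is_substring(as_mapping("a", start=3), word))    # True: index 3 holds "a"
print(is_substring(as_mapping("a", start=1), word))    # False: index 1 holds "b"
print(is_substring(as_mapping("cad", start=4), word))  # True: indices 4..6 spell "cad"
```

Under this representation the question "which letter a?" always has an answer, because the index is part of the substring itself.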