茫茫網海中的冷日
         
Java Coffee House: [Repost] Regular expression for floating point numbers

Poster    Discussion
冷日
(冷日)
Webmaster
  • Registered: 2008/2/19
  • From:
  • Posts: 15771
[Repost] Regular expression for floating point numbers
Regular expression for floating point numbers

I have a task to match floating point numbers. I have written the following regular expression for it :
[-+]?[0-9]*\.?[0-9]*

But it shows an error saying :
Invalid escape sequence (valid ones are  \b  \t  \n  \f  \r  \"  \'  \\ )

But as per my knowledge we need to use an escape character for the . also. Please correct me where I am wrong.



TL;DR
Use [.] instead of \. and [0-9] instead of \d to avoid escaping issues in some languages (like Java).
Thanks to the nameless one for originally recognizing this.
One relatively simple pattern for matching a floating point number is
[+-]?([0-9]*[.])?[0-9]+

This will match:
    123
    123.456
    .456

See a working example
If you also want to match 123. (a period with no decimal part), then you'll need a slightly longer expression:
[+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)

See pkeller's answer for a fuller explanation of this pattern
If you want to include non-decimal numbers, such as hex and octal, see my answer to How do I identify if a string is a number?.
If you want to validate that an input is a number (rather than finding a number within the input), then you should surround the pattern with ^ and $, like so:
^[+-]?([0-9]*[.])?[0-9]+$
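As a quick check of the patterns above, here is a minimal Java sketch. The class and method names (FloatPatterns, isSimpleFloat, isFloatAllowingTrailingDot) are made up for illustration. Matcher.matches() only succeeds when the whole input matches, so the ^ and $ anchors are implied here.

```java
import java.util.regex.Pattern;

class FloatPatterns {
    // Simple pattern: optional sign, optional "digits + dot", then required digits.
    static final Pattern SIMPLE = Pattern.compile("[+-]?([0-9]*[.])?[0-9]+");
    // Longer pattern: also accepts a trailing dot such as "123.".
    static final Pattern TRAILING_DOT =
        Pattern.compile("[+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)");

    static boolean isSimpleFloat(String s) {
        return SIMPLE.matcher(s).matches();
    }

    static boolean isFloatAllowingTrailingDot(String s) {
        return TRAILING_DOT.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123", "123.456", ".456", "123.", ".", "+"}) {
            System.out.println(s + " simple=" + isSimpleFloat(s)
                + " trailingDot=" + isFloatAllowingTrailingDot(s));
        }
    }
}
```

The two patterns should only disagree on inputs like 123., which the simple pattern rejects and the longer one accepts.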

Irregular Regular Expressions
"Regular expressions", as implemented in most modern languages, APIs, frameworks, libraries, etc., are based on a concept developed in formal language theory. However, software engineers have added many extensions that take these implementations far beyond the formal definition. So, while most regular expression engines resemble one another, there is actually no standard. For this reason, a lot depends on what language, API, framework or library you are using.
(Incidentally, to help reduce confusion, many have taken to using "regex" or "regexp" to describe these enhanced matching languages. See Is a Regex the Same as a Regular Expression? at RexEgg.com for more information.)
That said, most regex engines (actually, all of them, as far as I know) would accept \.. Most likely, there's an issue with escaping.
The Trouble with Escaping
Some languages have built-in support for regexes, such as JavaScript. For those languages that don't, escaping can be a problem.
This is because you are basically coding in a language within a language. Java, for example, uses \ as an escape character within its strings, so if you want to place a literal backslash character within a string, you must escape it:
// creates a single character string: "\"
String x = "\\";

However, regexes also use the \ character for escaping, so if you want to match a literal \ character, you must escape it for the regex engine, and then escape it again for Java:
// Creates a two-character string: "\\"
// When used as a regex pattern, will match a single character: "\"
String regexPattern = "\\\\";

In your case, you have probably not escaped the backslash character in the language you are programming in:
// will most likely result in an "Illegal escape character" error
String wrongPattern = "\.";
// will result in the string "\."
String correctPattern = "\\.";

All this escaping can get very confusing. If the language you are working with supports raw strings, then you should use those to cut down on the number of backslashes, but not all languages do (most notably: Java). Fortunately, there's an alternative that will work some of the time:
String correctPattern = "[.]";

For a regex engine, \. and [.] mean exactly the same thing. Note that the bracket trick doesn't work in every case: written as Java string literals, newline (\\n), open square bracket (\\[) and backslash (\\\\ or [\\\\]) still need backslashes.
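To see the two layers of escaping side by side, here is a small Java sketch. The class and field names are invented for this example. It compiles the backslash form and the bracket form of a literal dot, plus the four-backslash pattern for a literal backslash.

```java
import java.util.regex.Pattern;

class DotEscaping {
    // Java source "\\." is the two-character string \. which the regex engine
    // reads as a literal dot.
    static final Pattern BACKSLASH_DOT = Pattern.compile("\\.");
    // "[.]" needs no backslash: inside a character class the dot is literal.
    static final Pattern BRACKET_DOT = Pattern.compile("[.]");
    // Java source "\\\\" is the two-character string \\ which the regex engine
    // reads as one literal backslash.
    static final Pattern BACKSLASH = Pattern.compile("\\\\");

    public static void main(String[] args) {
        System.out.println(BACKSLASH_DOT.matcher(".").matches());
        System.out.println(BRACKET_DOT.matcher(".").matches());
        System.out.println(BRACKET_DOT.matcher("X").matches());
        System.out.println(BACKSLASH.matcher("\\").matches());
    }
}
```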
A Note about Matching Numbers
(Hint: It's harder than you think)
Matching a number is one of those things you'd think is quite easy with regex, but it's actually pretty tricky. Let's take a look at your approach, piece by piece:
[-+]?

Match an optional - or +
[0-9]*

Match 0 or more sequential digits
\.?

Match an optional .
[0-9]*

Match 0 or more sequential digits
First, we can clean up this expression a bit by using a character class shorthand for the digits (note that this is also susceptible to the escaping issue mentioned above):
[0-9] = \d

I'm going to use \d below, but keep in mind that it means the same thing as [0-9]. (Well, actually, in some engines \d will match digits from all scripts, so it'll match more than [0-9] will, but that's probably not significant in your case.)
Now, if you look at this carefully, you'll realize that every single part of your pattern is optional. This pattern can match a 0-length string; a string composed only of + or -; or, a string composed only of a .. This is probably not what you've intended.
To fix this, it's helpful to start by "anchoring" your regex with the bare-minimum required string, probably a single digit:
\d+

Now we want to add the decimal part, but it doesn't go where you think it might:
\d+\.?\d* /* This isn't quite correct. */

This will still match values like 123.. Worse, it's got a tinge of evil about it. The period is optional, meaning that you've got two repeated classes side-by-side (\d+ and \d*). This can actually be dangerous if used in just the wrong way, opening your system up to DoS attacks.
To fix this, rather than treating the period as optional, we need to treat it as required (to separate the repeated character classes) and instead make the entire decimal portion optional:
\d+(\.\d+)? /* Better. But... */

This is looking better now. We require a period between the first sequence of digits and the second, but there's a fatal flaw: we can't match .123 because a leading digit is now required.
This is actually pretty easy to fix. Instead of making the "decimal" portion of the number optional, we need to look at it as a sequence of characters: 1 or more digits that may be prefixed by a . that may be prefixed by 0 or more digits:
(\d*\.)?\d+

Now we just add the sign:
[+-]?(\d*\.)?\d+

Of course, those slashes are pretty annoying in Java, so we can substitute in our long-form character classes:
[+-]?([0-9]*[.])?[0-9]+
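The progression above can be checked with a short Java sketch. The class name, constant names and the fullMatch helper are hypothetical. Each constant holds one of the intermediate patterns, written as a Java string literal, and Pattern.matches tests it against the whole input.

```java
import java.util.regex.Pattern;

class FloatRegexSteps {
    static final String ORIGINAL = "[-+]?[0-9]*\\.?[0-9]*";   // every part optional
    static final String STEP_2 = "\\d+\\.?\\d*";               // still accepts "123."
    static final String STEP_3 = "\\d+(\\.\\d+)?";             // cannot match ".123"
    static final String FINAL = "[+-]?([0-9]*[.])?[0-9]+";     // final pattern

    // True only if the regex matches the entire input.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String[] inputs = {"", "+", ".", "123.", ".123", "-1.5"};
        for (String in : inputs) {
            System.out.println("[" + in + "]"
                + " original=" + fullMatch(ORIGINAL, in)
                + " step2=" + fullMatch(STEP_2, in)
                + " step3=" + fullMatch(STEP_3, in)
                + " final=" + fullMatch(FINAL, in));
        }
    }
}
```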

Matching versus Validating
This has come up in the comments a couple times, so I'm adding an addendum on matching versus validating.
The goal of matching is to find some content within the input (the "needle in a haystack"). The goal of validating is to ensure that the input is in an expected format.
Regexes, by their nature, only match text. Given some input, they will either find some matching text or they will not. However, by "snapping" an expression to the beginning and ending of the input with anchors (^ and $), we can ensure that no match is found unless the entire input matches the expression, effectively using regexes to validate.
The regex described above ([+-]?([0-9]*[.])?[0-9]+) will match one or more numbers within a target string. So given the input:
apple 1.34 pear 7.98 version 1.2.3.4

The regex will match 1.34, 7.98, 1.2, .3 and .4.
To validate that a given input is a number and nothing but a number, "snap" the expression to the start and end of the input by wrapping it in anchors:
^[+-]?([0-9]*[.])?[0-9]+$

This will only find a match if the entire input is a floating point number, and will not find a match if the input contains additional characters. So, given the input 1.2, a match will be found, but given apple 1.2 pear no matches will be found.
Note that some regex engines have a validate, isMatch or similar function, which essentially does what I've described automatically, returning true if a match is found and false if no match is found. Also keep in mind that some engines allow you to set flags which change the definition of ^ and $, matching the beginning/end of a line rather than the beginning/end of the entire input. This is typically not the default, but be on the lookout for these flags.
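In Java, the difference between matching and validating roughly corresponds to Matcher.find() versus Matcher.matches(). The sketch below uses made-up names (MatchVersusValidate, findAll, isValid) and the sample input from above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchVersusValidate {
    static final Pattern FLOAT = Pattern.compile("[+-]?([0-9]*[.])?[0-9]+");

    // Matching: collect every number found anywhere in the input.
    static List<String> findAll(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = FLOAT.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Validating: true only if the entire input is a number.
    static boolean isValid(String input) {
        return FLOAT.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(findAll("apple 1.34 pear 7.98 version 1.2.3.4"));
        System.out.println(isValid("1.2"));
        System.out.println(isValid("apple 1.2 pear"));
    }
}
```

Because matches() always tests the whole input, the pattern needs no explicit ^ and $ in this setup.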



I don't think that any of the answers on this page at the time of writing are correct (many other suggestions elsewhere on SO are wrong too). The complication is that you have to match all of the following possibilities:
No decimal point (i.e. an integer value)
Digits both before and after the decimal point (e.g. 0.35 , 22.165)
Digits before the decimal point only (e.g. 0. , 1234.)
Digits after the decimal point only (e.g. .0 , .5678)
At the same time, you must ensure that there is at least one digit somewhere, i.e. the following are not allowed:
a decimal point on its own
a signed decimal point with no digits (i.e. +. or -.)
+ or - on their own
an empty string
This seems tricky at first, but one way of finding inspiration is to look at the OpenJDK source for the java.lang.Double.valueOf(String) method (start at http://hg.openjdk.java.net/jdk8/jdk8/jdk, click "browse", navigate down /src/share/classes/java/lang/ and find the Double class). The long regex that this class contains caters for various possibilities that the OP probably didn't have in mind, but ignoring for simplicity the parts of it that deal with NaN, infinity, Hexadecimal notation and exponents, and using \d rather than the POSIX notation for a single digit, I can reduce the important parts of the regex for a signed floating point number with no exponent to:
[+-]?((\d+\.?\d*)|(\.\d+))

I don't think that there is a way of avoiding the (...)|(...) construction without allowing something that contains no digits, or forbidding one of the possibilities that has no digits before the decimal point or no digits after it.
Obviously in practice you will need to cater for trailing or preceding whitespace, either in the regex itself or in the code that uses it.
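The allowed and disallowed cases listed in this answer can be exercised with a small Java sketch. The class and method names are hypothetical, and trimming whitespace in code (rather than in the regex) is just one of the two options mentioned above.

```java
import java.util.regex.Pattern;

class SignedFloatCheck {
    // The reduced pattern from this answer, written as a Java string literal.
    static final Pattern SIGNED_FLOAT =
        Pattern.compile("[+-]?((\\d+\\.?\\d*)|(\\.\\d+))");

    // Strips leading and trailing whitespace, then requires a full match.
    static boolean isSignedFloat(String s) {
        return SIGNED_FLOAT.matcher(s.trim()).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"42", "0.35", "1234.", ".5678", " -22.165 ",
                            ".", "+.", "-", ""};
        for (String s : samples) {
            System.out.println("[" + s + "] -> " + isSignedFloat(s));
        }
    }
}
```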



What you need is:
[\-\+]?[0-9]*(\.[0-9]+)?

I escaped the "+" and "-" signs and also grouped the decimal point with its following digits, since something like "1." is not a valid number.
The changes will allow you to match integers and floats, for example:
0
+1
-2.0
2.23442




This is simple: you have used Java and you ought to use \\. instead of \. (search for character escaping in Java).



This one worked for me:
(?P<value>[-+]*\d+\.\d+|[-+]*\d+)

You can also use this one (without named parameter):
([-+]*\d+\.\d+|[-+]*\d+)

Use an online regex tester to test it (e.g. regex101).



[+-]?(([1-9][0-9]*)|(0))([.,][0-9]+)?

[+-]? - optional leading sign

(([1-9][0-9]*)|(0)) - integer without leading zero, including single zero

([.,][0-9]+)? - optional fractional part




^[+]?([0-9]{1,2})*[.,]([0-9]{1,1})?$

This will match:
    1.2
    12.3
    1,2
    12,3




I want to match what most languages consider valid numbers (integer and floats):
'5' / '-5'
'1.0' / '1.' / '.1' / '-1.' / '-.1'
'0.45326e+04', '666999e-05', '0.2e-3', '-33.e-1'
Notes:
preceding sign of number ('-' or '+') is optional
'-1.' and '-.1' are valid but '.' and '-.' are invalid
'.1e3' is valid, but '.e3' and 'e3' are invalid
In order to support both '1.' and '.1' we need an OR operator ('|') in order to make sure we exclude '.' from matching.
[+-]? the +/- sign is optional, since ? means 0 or 1 matches
( since we have 2 sub expressions we need to put them in parentheses
\d+([.]\d*)?(e[+-]?\d+)? This is for numbers starting with a digit
| separates sub expressions
[.]\d+(e[+-]?\d+)? this is for numbers starting with '.'
) end of expressions
For numbers starting with '.'
[.] first character is dot (inside brackets or else it is a wildcard character)
\d+ one or more digits
(e[+-]?\d+)? this is an optional (0 or 1 matches due to ending '?') scientific notation
For numbers starting with a digit
\d+ one or more digits
([.]\d*)? optionally we can have a dot character and zero or more digits after it
(e[+-]?\d+)? this is an optional scientific notation
Scientific notation
e literal that specifies exponent
[+-]? optional exponent sign
\d+ one or more digits
All of those combined:
[+-]?(\d+([.]\d*)?(e[+-]?\d+)?|[.]\d+(e[+-]?\d+)?)
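A minimal Java sketch for trying the combined pattern against the valid and invalid examples listed in this answer. The class and method names are invented for illustration.

```java
import java.util.regex.Pattern;

class ScientificFloat {
    // The combined pattern from above, written as a Java string literal.
    static final Pattern NUMBER = Pattern.compile(
        "[+-]?(\\d+([.]\\d*)?(e[+-]?\\d+)?|[.]\\d+(e[+-]?\\d+)?)");

    // True only if the entire input is a number.
    static boolean isNumber(String s) {
        return NUMBER.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"5", "-5", "1.0", "1.", ".1", "-1.", "-.1",
                            "0.45326e+04", "666999e-05", "0.2e-3", "-33.e-1",
                            ".1e3", ".", "-.", ".e3", "e3"};
        for (String s : samples) {
            System.out.println(s + " -> " + isNumber(s));
        }
    }
}
```

As written, the pattern only accepts a lowercase e. Compiling it with Pattern.CASE_INSENSITIVE, or using [eE], would be one way to also accept 1E5.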




[+/-] [0-9]*.[0-9]+

Try this solution.



For JavaScript:
const test = new RegExp('^[+]?([0-9]{0,})*[.]?([0-9]{0,2})?$','g');

This works for 1.23, 1234.22, 0, 0.12 and 12.
You can change the values in the {} to allow a different number of digits before and after the decimal point. This is used on input fields, checking the value on every keystroke and only allowing what passes.

Source: regex - Regular expression for floating point numbers - Stack Overflow
冷日
[Repost] Matching Floating Point Numbers with a Regular Expression

Matching Floating Point Numbers with a Regular Expression

This example shows how you can avoid a common mistake often made by people inexperienced with regular expressions. As an example, we will try to build a regular expression that can match any floating point number. Our regex should also match integers and floating point numbers where the integer part is not given. We will not try to match numbers with an exponent, such as 1.5e8 (150 million in scientific notation).

At first thought, the following regex seems to do the trick: [-+]?[0-9]*\.?[0-9]*. This defines a floating point number as an optional sign, followed by an optional series of digits (integer part), followed by an optional dot, followed by another optional series of digits (fraction part).

Spelling out the regex in words makes it obvious: everything in this regular expression is optional. This regular expression considers a sign by itself or a dot by itself as a valid floating point number. In fact, it even considers an empty string as a valid floating point number. If you tried to use this regex to find floating point numbers in a file, you'd get a zero-length match at every position in the string where no floating point number occurs.
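The zero-length matches are easy to observe in Java. This is a minimal sketch with hypothetical names (NaiveFloatRegex, findAll): it collects everything that find() reports for the all-optional regex, including empty matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NaiveFloatRegex {
    // The "everything optional" regex discussed above.
    static final Pattern NAIVE = Pattern.compile("[-+]?[0-9]*\\.?[0-9]*");

    // Collect every match that find() reports, including zero-length ones.
    static List<String> findAll(String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = NAIVE.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        for (String match : findAll("a1.5b")) {
            System.out.println("[" + match + "] length=" + match.length());
        }
    }
}
```

Running it on a1.5b should report the real number 1.5 alongside empty matches at the positions where no number starts.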


Not escaping the dot is also a common mistake. A dot that is not escaped matches any character, including a dot. If we had not escaped the dot, both 4.4 and 4X4 would be considered floating point numbers.

When creating a regular expression, it is more important to consider what it should not match, than what it should. The above regex indeed matches a proper floating point number, because the regex engine is greedy. But it also matches many things we do not want, which we have to exclude.

Here is a better attempt: [-+]?([0-9]*\.[0-9]+|[0-9]+). This regular expression matches an optional sign, that is either followed by zero or more digits followed by a dot and one or more digits (a floating point number with optional integer part), or that is followed by one or more digits (an integer).

This is a far better definition. Any match must include at least one digit. There is no way around the [0-9]+ part. We have successfully excluded the matches we do not want: those without digits.

We can optimize this regular expression as: [-+]?[0-9]*\.?[0-9]+.

If you also want to match numbers with exponents, you can use: [-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?. Notice how I made the entire exponent part optional by grouping it together, rather than making each element in the exponent optional.

Finally, if you want to validate if a particular string holds a floating point number, rather than finding a floating point number within longer text, you'll have to anchor your regex: ^[-+]?[0-9]*\.?[0-9]+$ or ^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$. You can find additional variations of these regexes in RegexBuddy's library.
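The effect of the anchors can be sketched in Java as follows. The class and method names are assumptions made for this example. Both patterns are searched with find(), so the only difference between them is the ^ and $.

```java
import java.util.regex.Pattern;

class AnchoredFloat {
    // Anchored variant with optional exponent, as given above.
    static final Pattern ANCHORED =
        Pattern.compile("^[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?$");
    // Same regex without the anchors, for comparison.
    static final Pattern UNANCHORED =
        Pattern.compile("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?");

    // Validation: the anchors force the match to span the whole input.
    static boolean isFloat(String s) {
        return ANCHORED.matcher(s).find();
    }

    // Search: true if a number occurs anywhere in the input.
    static boolean containsFloat(String s) {
        return UNANCHORED.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1.5e8", "-.5", "42", "1.", "4X4", "abc"}) {
            System.out.println(s + " isFloat=" + isFloat(s)
                + " containsFloat=" + containsFloat(s));
        }
    }
}
```

Note that this regex requires at least one digit after an optional dot, so an input like 1. is not accepted as a whole.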


Source: Example: Matching Floating Point Numbers with a Regular Expression
冷日
[Repost] 冷日's Regular Expression decimal samples
冷日's samples:

In the end, 冷日 decided to write several methods to check and handle the various decimal cases, as follows:

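Check whether the string is a positive decimal:
	public static boolean isPositiveDecimal(String orginalStr2Check) {
		// Optional "+", then either 0.xxx with at least one non-zero fraction digit (so 0.05 passes, 0.0 does not),
		// or a non-zero integer part followed by a dot and optional digits.
		return isMatch("\\+?0\\.\\d*[1-9]\\d*|\\+?[1-9]\\d*\\.\\d*", orginalStr2Check);
	}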


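Check whether the string is a negative decimal:
	public static boolean isNegativeDecimal(String orginalStr2Check) {
		// Required "-", then either 0.xxx with at least one non-zero fraction digit (so -0.05 passes, -0.0 does not),
		// or a non-zero integer part followed by a dot and optional digits.
		return isMatch("^-0\\.\\d*[1-9]\\d*|^-[1-9]\\d*\\.\\d*", orginalStr2Check);
	}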


Check whether the string is a decimal:
	public static boolean isDecimal(String orginalStr2Check) {
		return isMatch("[-+]{0,1}\\d+\\.\\d*|[-+]{0,1}\\d*\\.\\d+", orginalStr2Check);
	}


Check whether the string is a decimal in scientific notation:
	public static boolean isScientificDecimal(String orginalStr2Check) {
		return isMatch("^[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?$", orginalStr2Check);
	}


Check whether the string is a real number:
	public static boolean isRealNumber(String orginalStr2Check) {
		return isWholeNumber(orginalStr2Check) || isDecimal(orginalStr2Check) || isScientificDecimal(orginalStr2Check);
	}
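
The methods above call isMatch and isWholeNumber, which are not shown in the post. The sketch below is only a guess at what they might look like: it assumes isMatch means "the whole string matches the regex" (via Pattern.matches) and that a whole number is an optional sign followed by digits. The class name DecimalCheckHelpers is made up, and isDecimal and isScientificDecimal are copied in so the sketch compiles on its own.

```java
import java.util.regex.Pattern;

class DecimalCheckHelpers {
    // Assumption: the whole string must match the regex; null never matches.
    static boolean isMatch(String regex, String orginalStr2Check) {
        if (orginalStr2Check == null) {
            return false;
        }
        return Pattern.matches(regex, orginalStr2Check);
    }

    // Assumption: a whole number is an optional sign followed by digits only.
    static boolean isWholeNumber(String orginalStr2Check) {
        return isMatch("[-+]?\\d+", orginalStr2Check);
    }

    // Copied from the samples above.
    static boolean isDecimal(String orginalStr2Check) {
        return isMatch("[-+]{0,1}\\d+\\.\\d*|[-+]{0,1}\\d*\\.\\d+", orginalStr2Check);
    }

    // Copied from the samples above.
    static boolean isScientificDecimal(String orginalStr2Check) {
        return isMatch("^[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?$", orginalStr2Check);
    }

    // Copied from the samples above.
    static boolean isRealNumber(String orginalStr2Check) {
        return isWholeNumber(orginalStr2Check) || isDecimal(orginalStr2Check)
            || isScientificDecimal(orginalStr2Check);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "1.5", ".5", "1e5", ".", "abc"}) {
            System.out.println(s + " -> " + isRealNumber(s));
        }
    }
}
```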
