Re: Untaint IP address/hostname question
by Juerd (Abbot) on Mar 08, 2004 at 17:15 UTC
|
Use Regexp::Common's $RE{net}{IPv4} and $RE{net}{domain}{-nospace}.
Note that 2130706433 is in fact a valid IP address (equal to 127.0.0.1) and that you might just want to try inet_ntoa inet_aton $ip instead. (These can be found in the standard module Socket).
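A minimal sketch of the inet_ntoa/inet_aton round trip suggested above, using only the core Socket module. The helper name normalize_ipv4 and the digits-and-dots pre-check are assumptions added for illustration. The pre-check is there so that inet_aton is never asked to resolve an arbitrary hostname.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Hypothetical helper: return the dotted-quad form of an address,
# or undef if the input is not accepted.
sub normalize_ipv4 {
    my ($ip) = @_;
    return unless defined $ip;

    # Assumption: allow only digits and dots, so inet_aton never
    # falls back to a DNS lookup on a hostname.
    return unless $ip =~ /\A[0-9.]+\z/;

    my $packed = inet_aton($ip);      # undef when the string cannot be parsed
    return unless defined $packed;
    return inet_ntoa($packed);        # always dotted-quad
}

for my $input ('127.0.0.1', '2130706433') {
    my $out = normalize_ipv4($input);
    print "$input => ", (defined $out ? $out : '(rejected)'), "\n";
}
```

Whatever form the caller supplies, the rest of the program then only ever sees the dotted form.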
|
Remember that umlauts (äöüÄÖÜ) have been valid in German (.de) domain names since 1 March.
I don't think Regexp::Common covers that.
|
Regexp::Common covers what's in RFC 2396 and RFC 2616 when it
comes to HTTP URIs. If those RFCs have been superseded, I'd be interested in hearing about it.
Abigail
|
2130706433 is not a valid IP address as most people think of them. It is the decimal integer corresponding to the binary IP address for 127.0.0.1. The Unix inet_ntoa accepts all kinds of non-standard forms for IP addresses. Everyone else thinks that IP addresses are represented as four decimal numbers separated by periods. Using anything else will confuse people and programs that expect the standard form.
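For anyone who wants to check the arithmetic: the four octets are simply the bytes of one 32-bit big-endian integer, so 127.0.0.1 is 127 * 2**24 + 0 * 2**16 + 0 * 2**8 + 1. A small sketch with pack/unpack follows; the helper names quad_to_int and int_to_quad are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers: convert between dotted-quad and integer forms.
sub quad_to_int {
    my ($quad) = @_;
    # Four unsigned bytes, reread as one 32-bit big-endian integer.
    return unpack('N', pack('C4', split(/\./, $quad)));
}

sub int_to_quad {
    my ($int) = @_;
    # One 32-bit big-endian integer, reread as four unsigned bytes.
    return join('.', unpack('C4', pack('N', $int)));
}

print quad_to_int('127.0.0.1'), "\n";   # 2130706433
print int_to_quad(2130706433),  "\n";   # 127.0.0.1
```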
|
> 2130706433 is not a valid IP address as most people think of them.
Likewise, "login=juerd" is not a valid cookie as most people think of them. They expect them to be edible. What most people think and what is technically correct aren't always the same.
> The Unix inet_ntoa accepts all kinds of non-standard forms for IP addresses.
Yes, like the ones formed like "127.0.0.1". This is only a de facto standard, not an official one. It happens to be accepted by almost everything that takes an IP address. Decimal numbers like "2130706433" are also a de facto standard; they are just not used as much. The libraries found in Unix, Linux, Windows and Mac OS all think "2130706433" and "127.0.0.1" are the same address.
> Everyone else thinks that IP addresses are represented as four decimal numbers separated by periods. Using anything else will confuse people and programs that expect the standard form.
We could argue about the meaning of "everyone else" or about "anything else", or even about who you think "people" are. Or we could just stick to your point and discuss the "standard" status of dotted decimal IP addresses. That some applications and even some protocols require IP addresses to be stringified like that does not mean that it is the only standard, or even that it is a standard at all.
Should you have an STD, RFC or another official document that says more on this subject, I'll be happy to hear about it.
Re: Untaint IP address/hostname question
by UnderMine (Friar) on Mar 08, 2004 at 17:39 UTC
|
(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|(?:\w+\.)+\w{2,3})
Solves the above hostname issue but misses a good few invalid values.
(\d+|(?:\d{1,3}\.){3}\d{1,3}|(?:\w+\.)+\w{2,3})
is better, but you are far better off using Regexp::Common.
Hope it helps
UnderMine
Re: Untaint IP address/hostname question
by fokat (Deacon) on Mar 09, 2004 at 05:04 UTC
|
<PLUG CLASS="shameless">
Consider using NetAddr::IP, as it recognizes most IP address formats in common (and not so common) use.
</PLUG>
Best regards
-lem, but some call me fokat
Re: Untaint IP address/hostname question
by imcsk8 (Pilgrim) on Mar 08, 2004 at 19:53 UTC
|
If you really want to use a regexp, you could rewrite your current one to look like this:
(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|(\w+\.)+\w{2,3}$)
ignorance, the plague is everywhere
--guttermouth
|
Not sure the regex above works all that well as an untainter :) ... it allows:
999.000.999.000
as an IP address, and look what it does to the legal domain name
neonutt.firstpart-secondpart.co.uk
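Running the quoted pattern against both inputs shows the problem. This is only a quick sketch; the helper name captured is made up for illustration, and the pattern is copied verbatim from the post above.

```perl
use strict;
use warnings;

# The pattern from the post above, unchanged.
my $re = qr/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|(\w+\.)+\w{2,3}$)/;

# Hypothetical helper: return what an untainting capture would keep.
sub captured {
    my ($s) = @_;
    return $s =~ $re ? $1 : undef;
}

for my $input ('999.000.999.000', 'neonutt.firstpart-secondpart.co.uk') {
    my $got = captured($input);
    printf "%-36s => %s\n", $input, defined $got ? $got : '(no match)';
}
```

The out-of-range address is captured unchanged. The domain name comes back as "secondpart.co.uk", because \w does not match the hyphen and the pattern is not anchored at the start, so the match silently begins after the hyphen.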
Just for your IP addresses (not for your domain names), maybe something like this regex gets closer to what you need?
/^((\d | [01]?\d\d | 2[0-4]\d | 25[0-5] )\.){3}(\d | [01]?\d\d | 2[0-4]\d | 25[0-5] )$/x
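A possible way to use an octet pattern like this for untainting, sketched under a few assumptions of my own: the /x flag so the whitespace in the pattern is ignored, \A and \z anchors so nothing can trail the address, [0-9] instead of \d to stay ASCII-only, and a made-up helper name untaint_ipv4.

```perl
use strict;
use warnings;

# One octet, 0 to 255 (leading zeros tolerated, as in the pattern above).
my $octet = qr/(?: 25[0-5] | 2[0-4][0-9] | [01]?[0-9][0-9] | [0-9] )/x;

# Exactly four octets separated by dots, and nothing else.
my $ipv4 = qr/\A ( $octet (?: \. $octet ){3} ) \z/x;

# Hypothetical helper: return the captured (untainted) address or undef.
sub untaint_ipv4 {
    my ($s) = @_;
    return $s =~ $ipv4 ? $1 : undef;
}

for my $input ('192.168.0.1', '999.000.999.000', '256.1.1.1') {
    my $ok = untaint_ipv4($input);
    print "$input => ", (defined $ok ? "ok ($ok)" : 'rejected'), "\n";
}
```

Returning the capture rather than the original variable is what clears the taint flag under -T.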
Do people really test for the binary representation of the address too? I haven't seen it that often... but, then again, I don't get out often.
-hsinclai
Re: Untaint IP address/hostname question
by ambrus (Abbot) on Mar 09, 2004 at 16:31 UTC
|
It depends on how you will use the host name.
If you convert it directly with gethostbyname (which is the
safest solution), you can probably accept any hostname.
If you pass it to some external program or shell,
you'll have to check what characters that program accepts.
The important point here is not to check that the
hostname is a valid hostname, but rather that
using it won't do something bad. That is, even if a hostname
is valid, it can break your program if whatever you pass it to
misinterprets it. If the hostname, for example, starts with a
hyphen (I don't know if that can be valid or not), and
you call a program with it and the program interprets it as a
switch, that's bad, even though the user
gave you a valid hostname.
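A rough sketch of that idea for the external-program case. The whitelist, the 255-character limit and the helper name safe_host_arg are my own assumptions, not anything a particular program requires; check what the program you call actually accepts.

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted copy of $host if it looks
# harmless as a command-line argument, or undef otherwise.
sub safe_host_arg {
    my ($host) = @_;
    return undef unless defined $host;
    return undef if length($host) > 255;

    # Assumed whitelist: ASCII letters, digits, dots and hyphens only.
    # The first character must be a letter or digit, so the value can
    # never be mistaken for a switch.
    return $host =~ /\A([A-Za-z0-9][A-Za-z0-9.-]*)\z/ ? $1 : undef;
}

for my $input ('www.example.com', '-rf', 'host; rm -rf /') {
    my $safe = safe_host_arg($input);
    print "$input => ", (defined $safe ? 'accepted' : 'rejected'), "\n";
}

# When calling the program, the list form of system() avoids the shell
# entirely, e.g. system('some-program', $safe), where 'some-program'
# is a placeholder.
```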