Bug #5062
closedis_subnet() is stilll useless for validation
0%
Description
This was redone in a5e2a35f03c8e176dd005faa9aad81c29a16a704 however it still doesn't verify that the input given is a valid network ID at all, it basically verifies that the input is a valid IP address. Things like ::1/128 or 192.168.0.234/24 all "validate".
Updated by Stilez y over 10 years ago
Unless there's a good reason, I think - so they should.
It's not necessarily the router's job to make sure the subnets (or any "real-world" IP data entered) is going to always make real world sense. It can't know that. Both of those have unambiguous meaning and can be routed or interpreted - the former is "all IPs with these bits this way and those bits "who cares" (which need not be a real-world valid network) and the latter has a similar definition but noting 192.168.0.234/24 is the same subnet as 192.168.0.235/24 and so on. But in both cases, the question "is a given IP in the subnet or not" is unambiguous. They just aren't very sensible definitions, and the first of them isn't very useful either. But they are entirely meaningful ones and the router will know what to make of them. For me, that's the test. Beyond that, we limit the user without providing benefits to them. So for me this isn't a bug, it's inherent in the way IPs and subnets are left flexible. It's just not very ordinary or helpful flexibility - but it's RFC flexibility, not bug.
(As an edge case to an edge case, examples like the 2nd case may well have meaning to that user if they're being used for real - for example they may prefer personally to define a subnet by the static IP they allocated to its gateway/router and the bit mask, so they define the /8 with router at 10.11.12.13 as 10.11.12.13/8. I'm not aware of a rule that says they mustn't do that if it pleases them. It's just ... rather creative ;') )
Updated by Jim Pingle over 10 years ago
- Status changed from New to Not a Bug
I agree it doesn't seem like a bug. Areas that are known to be sensitive to wanting an actual network address can run the parts through gen_subnet($ipaddr, $bits), gen_subnetv4(), or gen_subnetv6() before putting the subnet id/cidr to use. gen_subnet() will calculate the network address/subnet id when passed an IP address and cidr mask. Seems a bit harsh to force the user to calculate that on their own when the code can do that for them. Some areas like aliases already fix the addresses up when given.
If someone wants to force the user to enter a valid network address only, the validation could test is_subnet() and if true, compare the original input given to what gen_subnet() outputs.
Updated by Stilez y over 10 years ago
Two other reasons why RFC-compliant but unhelpful cases shouldn't be blocked:
One scenario is when the router might be deployed in a situation where its security aspects are key (threat and malformed traffic detection, unusual packets or packets with some bogus fields attempting to find exploits or deny service). If the router stops the user putting such things into rules, the user is unable to write rules for such an environment even if they want to. As a security-conscious router with a lot of capacity for security packages/IDS to be used, blocking valid data that may actually have real-world use in cases where bad data meeting criteria must be matched, feels a minus. But providing messages like "you entered 192.168.0.234/24, this really ought to be written 192.168.0.0/24 but I know what you mean, just change it will you" isn't that big a plus ;-)
(A real-world example of this - some years ago, and still not completely irrelevant, there was a phase where bad packets would be sent that attempted to probing for mishandling of "port 0". Rules matching "port=0" can't be defined much less used, if the router is going to "helpfully" think that "okay it's technically valid but no real-world application uses port=0 so let's tell the user they made an error".)
Second example - suppose a user is playing round with their own offline network, and does want to be able to route to some odd or unexpected subnet that "doesn't make sense normally". Perhaps they want to do it specifically because it doesn't normally make sense, perhaps just to see how routers handle it, or because it serves them in some weird way. There's no good reason to forbid it, if that's their wish. Allow unless it would break something :)
Updated by Kill Bill over 10 years ago
Shrug. I assumed it's a generally usable function for input validation, e.g. in packages. Apparently, it isn't.
Updated by Jim Pingle over 10 years ago
There are comparatively few things that expect an exact network ID/CIDR compared to things that are more forgiving. It's easy to code around the things that need the more strict values using gen_subnet() and friends. The is_subnet() function just needs to tell if something is in the proper format of address/bits where address is a valid address and bits are in the right range for the address type.
Updated by Jim Pingle over 10 years ago
That said, if someone wanted to add an optional parameter to force strict validation perhaps both needs could be fulfilled.
e.g. is_subnet($subnet, $strict=false)
And then if $strict == true then it would pass the component parts to gen_subnet() and compare to what was passed in.
That's still a feature request and not a bug, but it doesn't look too difficult to add if it is desirable.
Updated by Kill Bill over 10 years ago
Sounds good. I'll figure out something that doesn't change the existing behaviour unless strict validation is explicitly requested and submit a PR... Thanks.
Updated by Stilez y over 10 years ago
If it's done, then make it IPv4/v6 agnostic (will check using whichever family it verified against).
Probably shouldn't need saying, but everything in util.inc (and other IP handling functions used generally) should be v4/v6 agnostic, or have v4/v6 versions, or ... well, whatever the ones in util.inc do. But I'm not sure if it's not said, that this won't happen :)
I can add the PR it you like, Phil, save you time. Let me know
Updated by Stilez y over 10 years ago
Sorry - one thing Phil, could be important.
KB just sent me an email. As I understand it, he says that the issue is really, that pf itself can misprocess some things, if subnets are "technically ok but not sorted out as above". He says that blocklists of the kind pfBlocker handles, has these issues routinely. Is that so?
If it is, then the issue is much more about hardening the function that takes data and puts it into tables or builds pf.conf, and that's where the correction is needed - not in telling the user that they entered "bad data", because it isn't. the functionality that rebuilds pf.conf or imports URL/IP tables, may need to check their data and make sure it's converted to "standardised" subnets so that pf won't have problems in digesting them. Because if this is an issue, then an imported list of IPs (of any purpose) could be entirely rejected or be constantly "non standard" in this way, and the user just shouldn't be constantly interrupted to fix it.
If that's the "real" issue leading to this issue report, then I don't think a "strict" arg is really needed or indeed the answer. As it is, the user can currently enter only "valid" subnets. The issue is that pf itself has indigestion over some valid, unambiguous, meaningful subnets, so a bit of detection/conversion is needed in the background on subnets that pf might be "fed". Different problem.
Updated by Chris Buechler over 10 years ago
pf doesn't care whether or not the subnet has the correct network address, it'll figure it out. For instance with an alias with contents 1.1.1.3/29 you end up with:
$ pfctl -t test -T show 1.1.1.0/29
People seem to get the wrong network address almost as much as the right one and I've never seen that hurt anything in any part of pf. Some other things can be a problem, though most don't care. One that comes to mind is if you push a route to Windows OpenVPN clients that doesn't have a proper network address, Windows fails to add the route.
Updated by Stilez y over 10 years ago
Following another email and examples this is the issue:
The issue intended isn’t in pf, and the issue title needs changing. It’s ultimately an issue created by ip2long() in PHP and has been closed in their bug tracker twice.
https://bugs.php.net/bug.php?id=26170
https://bugs.php.net/bug.php?id=44892
Apparently a leading zero in an IP means “octal”. (ip2long() doesn’t do octal quads either so that would still be a bug but it doesn’t help us)
Instead ip2long() treats leading zero padded IPs as “not valid”.
Both times closed as “not a bug”.
Not sure what this means for pfSense data – should leading zero IPs be treated as
1) octal (technically correct but very rare if ever and little known); or
2) treated as decimal needing stripping before processing (likely intended but technically not correct per RFC)?
Both are easy to do – but what should the platform overall do with when faced with a zero padded IP in some list or input?
This is more a "pfSense platform decision" than a bug. But a decision is needed.
Personally I'd go for the second of these - perhaps backed up by a setting "strip zero padding from IPs and treat as decimal" as the default and "treat zero padded IPs as octal" as an alternative for purists, or just the first and ignore the second. But this is really Clive's call or something as it's an issue about "how should the platform handle an external issue in its current language".
Updated by Stilez y over 10 years ago
"... or just the second [decimal] and ignore the first [octal]..." (typo)
Updated by Stilez y over 10 years ago
Clive /Phil - left to itself, how does pf (and other FreeBSD packages used in the router such as dhcpd, tcpdump, etc etc) handle zero padding in IPs?
Whatever pf will do, I think ought to be used as the guide to correct behaviour here, since in the end all the PHP code is doing is sanitising data to be used within pf and the various other *nix packages of the router.
Updated by Phillip Davis over 10 years ago
I think the poster's name lookup table is a bit out of sync here. I haven't posted on this yet. Maybe you mean:
Clive -> Chris
Phil -> Kill Bill
:)
I would keep the "standard" behavior of octal/hex interpretation:
1.2.0xA.0xB 1.2.10.11
1.2.011.012 1.2.9.10
If people put their IP addresses in like that then they get what they ask for.
For routes pushed to Windows OpenVPN clients and other like things where the recipient really wants a "normalised" subnet spec, the code can use gen_subnet() anyway to make it right on-the-fly. That is probably just as efficient as having an is_subnet() that can check if something really is in "normalised" form and then have to call gen_subnet() when it is not.