Bug #1607: MBUF usage grows geometrically
Status: Closed
Description
MBUF usage reported in the dashboard grows by roughly 800 per day. When the maximum value reaches nmbclusters, pfSense locks up or panics (not sure which, as I've managed to avoid it for almost two months now). For me this problem dates back to approximately November 2010.
2.0-RC2 (amd64)
built on Fri Jun 17 21:21:19 EDT 2011
Platform nanobsd (1g)
[2.0-RC2][root@pfsense.tfcg.co]/root(10): pciconf -lvb | grep -A11 em0
em0@pci0:2:0:0: class=0x020000 card=0x060a15d9 chip=0x10d38086 rev=0x00 hdr=0x00
class = network
subclass = ethernet
bar [10] = type Memory, range 32, base 0xfeae0000, size 131072, enabled
bar [18] = type I/O Port, range 32, base 0xdc00, size 32, enabled
bar [1c] = type Memory, range 32, base 0xfeadc000, size 16384, enabled
em1@pci0:3:0:0: class=0x020000 card=0x060a15d9 chip=0x10d38086 rev=0x00 hdr=0x00
class = network
subclass = ethernet
bar [10] = type Memory, range 32, base 0xfebe0000, size 131072, enabled
bar [18] = type I/O Port, range 32, base 0xec00, size 32, enabled
bar [1c] = type Memory, range 32, base 0xfebdc000, size 16384, enabled
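For anyone watching the same counters, the numbers behind the dashboard's MBUF graph can also be read directly from the shell with stock FreeBSD tools; a minimal check, assuming a standard pfSense/FreeBSD shell, looks something like:

netstat -m                      # mbuf and mbuf cluster usage (current/cache/total)
sysctl kern.ipc.nmbclusters     # the cluster limit the growing max eventually hits
vmstat -z | grep -i mbuf        # the same counters as seen by the zone allocator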
Updated by David Burgess over 13 years ago
Found my first mention of it in the forums:
http://forum.pfsense.org/index.php?topic=28169.0
Arch should include i386 then too.
Updated by Jim Pingle over 13 years ago
- Status changed from New to Feedback
- Affected Architecture All added
- Affected Architecture deleted (amd64)
And it's still happening on that snapshot? (It's from yesterday, so it hasn't been 24hrs yet.) The em driver was changed out about a week or so ago, so it may have made a difference.
It would also help to have a general overview of all of the features you have in use (traffic shaping, captive portal, vpns, etc, etc) and an idea of how loaded the system is (some general traffic stats, x amount per day/hour/etc, y pps). A sanitized version of the output from /status.php would also help. If you do not want to post that here, you can e-mail it to one of us.
We'll need a lot more information to go on in order to determine what is going on.
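For collecting the kind of overview asked for here, a couple of stock commands are usually enough; this is only a suggested starting point, not an official checklist:

pfctl -s info       # state table size plus search/insert/remove counters
netstat -ibn        # per-interface packet and byte totals since boot
uptime              # rough idea of load averages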
Updated by David Burgess over 13 years ago
- File status.php.bz2 added
Summary:
7 lines mlppp
2 em NICs
14 or so VLANs on both NICs, total
openvpn client
no packages currently installed, although I think some package info is still in the config file.
daily transfer is usually 9-50 GB
The last snapshot that ran for more than three days was the May 1 full install, up for 40 days with mbufs >22K. This snapshot shows some initial growth in mbuf count; time will tell if that levels off.
I'll email my status.php to jimp since it's too
For future reference, these commands came in handy:
sed -i 's/.*<prv>\(.*\)<\/prv>.*/<prv>xxxxxx<\/prv>/g' status.php.txt
sed -i 's/.*<crt>\(.*\)<\/crt>.*/<crt>xxxxxx<\/crt>/g' status.php.txt
Updated by Jim Pingle over 13 years ago
Of those, I would probably be most inclined to point a finger at mlppp, since it's the least commonly used feature among them. We have plenty of people using tons of VLANs in the wild without issues.
Also, you seem to have not only a large number of states, but the majority of them are in the CLOSED:SYN_SENT or SYN_SENT:CLOSED state. How many states do you normally have? What is the max? (Check the states graph.) Perhaps switching the firewall optimization to aggressive might help keep those cleared out. Do you do a lot of port scans, or something else that might initiate a large number of connections that never fully negotiate?
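A rough count of those half-open entries can be pulled straight from pf, if anyone wants to check the same thing on their own box (plain pfctl invocations, nothing pfSense-specific is assumed):

pfctl -s info | grep -i current      # current number of state table entries
pfctl -s states | grep -c SYN_SENT   # how many of those are stuck half-open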
Updated by David Burgess over 13 years ago
- File states.png added
The RRD graphs mostly didn't survive the config restore, so the screenshot is the best I can do. I had the firewall optimization set to Conservative. I don't know why I would have had a bunch of non-negotiated connections, but I do have a lot of clients on the network that I don't control.
I set the firewall optimization to normal and cleared all my custom system tunables. We'll see how that fares on RC3.
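One way to confirm the optimization change actually took effect on the running ruleset is to list pf's state timeouts, since the optimization level maps directly onto them (Conservative keeps states around far longer than Normal); for example:

pfctl -s timeouts    # tcp.established, tcp.closing, etc. under the active optimization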
Updated by David Burgess over 13 years ago
After 10 days of uptime my MBUF usage has almost completely levelled off at 6062/9856. The second number was at 9730 for three days or so, so I think I can say that the change to firewall optimization and/or the deletion of some custom sysctl values has fixed this problem for me.
Updated by Ermal Luçi over 13 years ago
- Status changed from Feedback to Resolved