Bug #12095
openMemory leak in pcscd
0%
Description
The PCSC daemon looks to have a memory leak even when it's not in use. Or even when there are no IPSec tunnels defined which might be tied to it.
last pid: 99559; load averages: 0.25, 0.18, 0.11 up 12+01:45:56 18:38:11 69 processes: 1 running, 68 sleeping CPU: 0.2% user, 0.0% nice, 1.2% system, 0.0% interrupt, 98.6% idle Mem: 31M Active, 1047M Inact, 502M Wired, 116M Buf, 347M Free Swap: 1908M Total, 1908M Free PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 7760 root 3 20 0 990M 976M select 1 24:23 0.14% pcscd 9887 root 1 52 0 105M 39M accept 0 1:20 0.00% php-fpm 87665 root 1 52 0 102M 36M accept 0 0:43 0.00% php-fpm 4918 root 1 52 0 102M 36M accept 1 0:40 0.00% php-fpm 77558 root 1 52 0 102M 36M accept 1 0:38 0.00% php-fpm
Stopping and then startingthe services resets the memory use to ~4MB. If you have IPSec tunnels defined which doing that the IPSec service will also need to be stopped and then started.
There looks to be a limit at ~1GB so it could be only lower memory systems that are affected.
Tested: CE 2.5.1, Plus 21.05 and 21.09.
Files
Related issues
Updated by Jim Pingle over 3 years ago
Could be partially mitigated by #11933 -- That daemon should be made optional and off by default except for the few people who need its capabilities for PKCS#11.
Updated by Jim Pingle over 3 years ago
- Related to Todo #11933: PC/SC Smart Card Daemon ``pcscd`` running on all devices at all times, should be optional added
Updated by Marcos M over 3 years ago
Here are some stats on various 21.05 VMs:
- Uptime: 6d2h
- RAM: 1G
- pcscd usage: 326M
- 2 VTI IPsec tunnels
last pid: 11696; load averages: 1.32, 0.95, 0.89 up 6+02:41:39 13:14:03 66 processes: 1 running, 65 sleeping CPU: 0.0% user, 0.0% nice, 0.0% system, 0.8% interrupt, 99.2% idle Mem: 58M Active, 398M Inact, 223M Wired, 100M Buf, 257M Free Swap: 410M Total, 410M Free PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND 7702 root 5 20 0 326M 320M select 3:51 0.02% pcscd 70890 root 1 20 0 103M 38M getblk 0:00 0.00% php-cgi 85503 root 1 52 0 103M 36M accept 0:00 0.00% php-fpm 15450 root 1 52 0 103M 38M accept 0:01 0.00% php-fpm
- Uptime: 11d16h
- RAM: 1G
- pcscd usage: 620M
- No IPsec tunnels
last pid: 51378; load averages: 0.38, 0.55, 0.57 up 11+16:07:38 13:13:21 54 processes: 1 running, 53 sleeping CPU: 2.7% user, 3.8% nice, 13.7% system, 0.0% interrupt, 79.8% idle Mem: 30M Active, 246M Inact, 355M Laundry, 210M Wired, 136K Buf, 89M Free ARC: 71M Total, 40M MFU, 26M MRU, 1573K Anon, 564K Header, 3446K Other 38M Compressed, 95M Uncompressed, 2.50:1 Ratio Swap: 2048M Total, 192M Used, 1856M Free, 9% Inuse PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND 7648 root 3 20 0 620M 496M select 7:19 0.05% pcscd 352 root 1 52 0 131M 28M accept 1:06 0.00% php-fpm 1716 root 1 52 0 103M 28M accept 0:21 0.00% php-fpm 46386 root 1 52 0 103M 27M accept 0:15 0.00% php-fpm
- Uptime: 5d17h
- RAM: 8G
- pcscd usage: 312M
- No IPsec tunnels
last pid: 29474; load averages: 0.50, 0.39, 0.30 up 5+17:21:00 13:23:51 69 processes: 2 running, 67 sleeping CPU: 2.9% user, 0.0% nice, 2.9% system, 0.0% interrupt, 94.1% idle Mem: 61M Active, 547M Inact, 897M Wired, 240K Buf, 6391M Free ARC: 340M Total, 106M MFU, 224M MRU, 1344K Anon, 1224K Header, 7573K Other 93M Compressed, 268M Uncompressed, 2.87:1 Ratio Swap: 2048M Total, 2048M Free PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 13305 root 3 20 0 312M 301M select 3 3:26 0.02% pcscd 6196 unbound 4 20 0 149M 79M kqread 2 2:08 0.31% unbound 355 root 1 52 0 135M 48M accept 3 0:29 1.19% php-fpm 356 root 1 20 0 135M 48M piperd 0 0:24 0.33% php-fpm
Updated by Marcos M over 3 years ago
Additional note:
Stopping the service while IPsec is in use leads to the following log spam:
Jul 4 14:21:28 charon 11632 02[CFG] C_GetSlotInfo failed: SLOT_ID_INVALID Jul 4 14:21:28 charon 11632 02[CFG] error in C_WaitForSlotEvent: GENERAL_ERROR
This ends up causing continuous log rotation in very short periods which leads to quickly growing disk usage and 100% CPU usage.
Updated by Steve Y over 3 years ago
There looks to be a limit at ~1GB
If you meant a limit for pcscd's RAM usage, I pulled up a few...a 3100 (21.05) is at 960 MB and a couple of 4860s just over 1000, but our lone 2.5.2 has almost 2 GB:
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
7457 root 3 20 0 1946M 1929M select 1 22:00 0.00% pcscd
That's with uptime 18 days. On all four, the dashboard shows memory usage % that is lower than even just the pcscd process.
Updated by Sean M over 3 years ago
I haven't run into this issue before but just today I noticed swap usage at 100% and memory was very high, turns out pcscd had leaked and consumed the memory.
Currently on:
21.05-RELEASE (amd64)
FreeBSD 12.2-STABLE
Updated by Alexander Arques over 3 years ago
Just registered to report the same issue. I have never used smart cards or IPSec tunnels and today I noticed all swap space was used up and RAM usage was at 80%, turns out the pcscd daemon was using more than 1GB of memory.
Running:
2.5.2-RELEASE (amd64)
built on Fri Jul 02 15:33:00 EDT 2021
FreeBSD 12.2-STABLE
Uptime: 34 days
Updated by Michael Smith about 3 years ago
Can confirm the bug on my system. Was a clean upgrade from the last version.
2.5.2-RELEASE (amd64)
built on Fri Jul 02 15:33:00 EDT 2021
FreeBSD 12.2-STABLE
Updated by Uwe Dippel about 3 years ago
Same-same. 7 days of uptime, over night it ramped up and killed DNS ('no space left'). 2.5.2-RELEASE (amd64) clean install. Though, the culprit is well-known.
To me, priority of 'normal' is questionable since my users reported 'the internet is down'. No dig @pfsense. Just restarted the service and noticed the horrendous memory usage. Disabled pcscd and immediately everything was back to normal.
Updated by Jim Pingle about 3 years ago
Uwe Dippel wrote in #note-9:
Same-same. 7 days of uptime, over night it ramped up and killed DNS ('no space left'). 2.5.2-RELEASE (amd64) clean install. Though, the culprit is well-known.
To me, priority of 'normal' is questionable since my users reported 'the internet is down'. No dig @pfsense. Just restarted the service and noticed the horrendous memory usage. Disabled pcscd and immediately everything was back to normal.
We agree that it's a sizable problem, but the main reason it's set to "normal" is because the service is disabled by default now (See #11933 ) and very few people need to enable it. It's only necessary for those who need to use PKCS#11 Smart Cards for certificates. We have not seen any reports that the memory leak happens when pcscd is latched onto actual hardware, so it's possible that this isn't an issue for those who actively use the daemon. It's also possible that there are no reports because not enough people use it yet to notice.
Either way, the problem resides in the daemon which we do not control, so it would need to be fixed upstream, hence our focus on a more practical workaround/mitigation.
Until the next release which includes the changes from #11933 you can add the main commit on that issue as a system patch to disable the pcscd service (See #11933#note-7 )
Updated by Charles Ng about 3 years ago
I see the same log spam as described in https://redmine.pfsense.org/issues/12095#note-4 if pcscd is stopped.
The log rotation bzip2 processes consume all the CPU cycles.
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND root 3328 16.1 0.9 18244 9512 - R 21:59 4:28.92 bzip2 -f /var/log/ipsec.log.0 root 487 15.8 0.9 18244 9500 - R 22:06 3:10.48 bzip2 -f /var/log/ipsec.log.6 root 28366 15.8 0.9 18244 9504 - R 22:01 3:59.19 bzip2 -f /var/log/ipsec.log.6 root 88479 15.6 0.9 18244 9504 - R 21:58 4:43.51 bzip2 -f /var/log/ipsec.log.4 root 80103 15.5 0.9 18244 9504 - R 21:57 4:52.81 bzip2 -f /var/log/ipsec.log.3 root 99250 15.1 0.9 18208 9460 - R 22:33 0:06.21 bzip2 -f /var/log/ipsec.log.0 root 55746 14.9 0.9 18244 9504 - R 22:03 3:39.07 bzip2 -f /var/log/ipsec.log.3 root 86455 14.9 0.9 18244 9500 - R 22:05 3:12.95 bzip2 -f /var/log/ipsec.log.5 root 50899 14.6 0.9 18244 9504 - R 22:02 3:47.29 bzip2 -f /var/log/ipsec.log.2 root 75157 14.6 0.9 18244 9504 - R 22:04 3:29.48 bzip2 -f /var/log/ipsec.log.4 root 27619 14.1 0.9 18244 9504 - R 22:00 4:09.67 bzip2 -f /var/log/ipsec.log.1 root 20617 13.6 0.9 18244 9504 - R 22:00 4:15.34 bzip2 -f /var/log/ipsec.log.5
Updated by Jim Pingle about 3 years ago
Charles Ng wrote in #note-11:
I see the same log spam as described in https://redmine.pfsense.org/issues/12095#note-4 if pcscd is stopped.
That is known and expected. Do not just stop the service if you have IPsec enabled. Read my previous comment for a link to a viable workaround.
Updated by Bug Reporter about 3 years ago
Just found out where 1.3GiB of my free memory went. Returned to normal as soon as I killed the pcscd.
Updated by Philip Cook about 3 years ago
This memory leak is rather problematic.
2.5.2-RELEASE (amd64)
Uptime 64 Days 22 Hours 02 Minutes 06 Seconds
last pid: 891; load averages: 0.90, 0.77, 0.78 up 64+22:00:17 02:18:37 76 processes: 1 running, 75 sleeping CPU: 2.3% user, 0.0% nice, 1.2% system, 0.5% interrupt, 96.0% idle Mem: 2264M Active, 2039M Inact, 2227M Laundry, 1019M Wired, 554M Buf, 273M Free Swap: 4096M Total, 2159M Used, 1937M Free, 52% Inuse PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 9896 root 3 20 0 4557M 2389M select 4 54:08 0.04% pcscd
Updated by Marcos M about 3 years ago
Philip Cook wrote in #note-14:
This memory leak is rather problematic.
2.5.2-RELEASE (amd64)
Uptime 64 Days 22 Hours 02 Minutes 06 Seconds[...]
See JimP's comments https://redmine.pfsense.org/issues/12095#note-10
You can apply the patch that disables it when not needed to avoid the issue.
Updated by Phil K about 3 years ago
The same happened to me today. I realized it when I started receiving e-mails with lines like
bzip2: I/O or other error, bailing out. Possible reason follows. bzip2: No such file or directory Input file = /var/log/ipsec.log.23, output file = /var/log/ipsec.log.23.bz2 newsyslog: `/usr/bin/bzip2 -f /var/log/ipsec.log.23' terminated with a non-zero status (1)
Apparently, this is caused by the high CPU load and bzip2 processes taking a long time to compress logfiles (I found https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200741 about this. Seems to be another (unresolved?) issue).
Anyway, all of a sudden I was in a situation in which pcscd had been killed, not by me but by the kernel due to it consuming too much memory (Nov 10 18:15:01 neptun kernel: pid 7452 (pcscd), jid 0, uid 0, was killed: out of swap space
) and then, the log spam kicked in.
I tried both manually restarting the pcscd and applying the patch mentioned in #11933, but couldn't stop it. I ended up restarting the pfSense, hoping that it would work. As this specific instance's only connection to the Internet is through an IPSec tunnel it felt a bit risky restarting it if there is an IPSec related issue.
Would there have been any other way to stop the log spam? I suppose this could be helpful for others that experience the same issue.
Updated by Pedro Ribeiro about 3 years ago
I politely disagree with the assigned priority for this bug, particularly given that a CE release is likely months away. Initially I hit the problem with pcscd leaking memory on two pfSense boxes yesterday, with the first sign being no DNS due to unbound having been killed, probably due to out-of-memory conditions, so then I disabled pcscd on both boxes. But the next day I stumbled upon high CPU usage on both boxes, with difficulty to get both the web interface/ssh responding. In case it helps anyone, here are the steps I used to regain control.
1. I applied the patch afcc0e9c97c1993ae6b95f886665fcb4375d26c7
as suggested.
2. Stopped and started the IPSec service (charon) as clicking restart didn't seem to make a difference at all.
3. Meanwhile syslog continued misbehaving. I had to kill bzip2 from the command line several times while doing the next steps.
4. Cleared all logs.
5. Importantly, I changed the compression for log retention on purpose, as given the text WARNING: Changing this value will remove previously rotated compressed log files! I figured this would be exactly what I needed, kill any log rotation given the issue with bzip2.
The issue with syslog/bzip2 on its own seems intriguing. One would expect that disabling a service - whatever it may be - wouldn't have catastrophic consequences in terms of repeated, unbounded logging, eventually maxing out the CPU, which makes regaining control of a box challenging.
Updated by Jim Pingle about 3 years ago
The problems you're hitting are a mix of somewhat but not really related things.
This issue being the memory leak itself, which is beyond our control, hence the other issues we have made to work around it. The workarounds are on #11933 and have already been implemented in the code base, but have not yet been included in a release. The next release will be coming sometime in the few months ahead. The snapshots are quite stable if people want to run them (in a lab first to test).
Pedro Ribeiro wrote in #note-17:
1. I applied the patch
afcc0e9c97c1993ae6b95f886665fcb4375d26c7
as suggested.
2. Stopped and started the IPSec service (charon) as clicking restart didn't seem to make a difference at all.
The best thing to do there is reboot after applying the patch. Though some have seen that pressing Save on the advanced settings tab in IPsec is sufficient, it's safest to reboot. The discussion around that should stay on #11933 and not here as it isn't relevant to the memory leak, but the changes we have made to control pcscd.
3. Meanwhile syslog continued misbehaving. I had to kill bzip2 from the command line several times while doing the next steps.
This is because IPsec is logging like crazy because it can't communicate with the (stopped) pcscd daemon which its configuration still references at that point, since applying the patch doesn't do anything but adjust the code. The daemon configuration needs to be refreshed. We have another separate issue for this in #12468 so going forward the daemons are stopped and started together properly if/when pcscd is enabled.
5. Importantly, I changed the compression for log retention on purpose, as given the text WARNING: Changing this value will remove previously rotated compressed log files! I figured this would be exactly what I needed, kill any log rotation given the issue with bzip2.
Sort of, but it depends on what is causing it. The problem again is the high rate of logging at the time. Disabling compression can help there since compression will bog down the log rotation process but syslogd itself will still be taking quite a bit of load. It also doesn't kill the backlog of existing bzip2 processes. See #12011 for more there.
Updated by Viktor Gurov almost 3 years ago
pcscd bugreport:
https://github.com/LudovicRousseau/PCSC/issues/55
Updated by Simon Quigley almost 3 years ago
Not sure there's a lot of value in my post, if this service has now been set to disabled by default..
I just wanted to report that my pfsense instance was throwing a lot of errors related to swap in the kernel log, and when I looked, this daemon was eating 15G of the 16G of ram in my box.
ast pid: 72748; load averages: 0.56, 0.41, 0.50 up 143+06:16:53 18:57:54
216 processes: 1 running, 215 sleeping
CPU: 4.5% user, 0.3% nice, 1.7% system, 0.2% interrupt, 93.4% idle
Mem: 455M Active, 1906M Inact, 11G Laundry, 2010M Wired, 1540M Buf, 468M Free
Swap: 4096M Total, 4088M Used, 8260K Free, 99% Inuse
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
39410 root 1 52 0 133M 32M accept 3 0:39 0.59% php-fpm
85739 root 1 52 0 133M 32M accept 3 0:46 0.10% php-fpm
66234 root 1 52 0 133M 32M accept 1 0:38 0.10% php-fpm
61691 root 3 20 0 15G 11G select 2 217:40 0.00% pcscd
Which is pretty horrendous. I stopped the service, and I will disable it, until I'm able to upgrade and reboot.
Updated by Mr Sparkles almost 3 years ago
- File PFsence memory leak.png PFsence memory leak.png added
- File PFsence memory leak2.png PFsence memory leak2.png added
This may be redundant information, just mentioning that this mem leak is not only an issue on lower memory systems.
System:
2.5.2-RELEASE (amd64)
built on Fri Jul 02 15:33:00 EDT 2021
FreeBSD 12.2-STABLE
Uptime: 16 days
Updated by → luckman212 over 2 years ago
Update: looks like the PCSC maintainer has fixed the mem leak: https://github.com/LudovicRousseau/PCSC/issues/55#issuecomment-1143815658
Updated by Jim Pingle almost 2 years ago
- Related to Feature #13826: Update pcsc-lite added