Bug #3069

closed

traceroute6 fails to timeout and hangs the webconfigurator GUI

Added by Doktor Notor almost 11 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Normal
Category: Operating System
Target version:
Start date: 07/03/2013
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Release Notes:
Affected Version: All
Affected Architecture:

Description

As simple as trying to run IPv6 traceroute to www.google.com from the GUI:

2 gige-g2-20.core1.prg1.he.net 3.317 ms 11.719 ms 2.712 ms
3 nixcz-v6.net.google.com 45.940 ms 10.852 ms 10.970 ms
4 2001:4860::1:0:4ca2 45.398 ms 11.265 ms 11.076 ms
5 2001:4860::8:0:5039 11.545 ms 11.095 ms 10.917 ms
6 2001:4860::8:0:3097 25.850 ms
2001:4860::8:0:3098 26.843 ms 26.517 ms
7 2001:4860::2:0:612 24.934 ms
2001:4860::2:0:6e0 25.945 ms 30.676 ms

Same thing from shell:

2 gige-g2-20.core1.prg1.he.net 2.725 ms 2.681 ms 11.847 ms
3 nixcz-v6.net.google.com 10.874 ms 21.850 ms 11.070 ms
4 2001:4860::1:0:4ca2 11.037 ms 10.983 ms 10.820 ms
5 2001:4860::8:0:5039 10.994 ms 18.642 ms 28.842 ms
6 2001:4860::8:0:3098 24.807 ms 24.396 ms 24.448 ms
7 2001:4860::2:0:612 26.017 ms
2001:4860::2:0:6e0 31.798 ms 25.013 ms

Hangs indefinitely.

Compare to traceroute run from Windows box:

  3     4 ms     2 ms     2 ms  gige-g2-20.core1.prg1.he.net [2001:470:0:221::1]
  4    11 ms    10 ms    11 ms  nixcz-v6.net.google.com [2001:7f8:14::1d:1]
  5    11 ms    11 ms    10 ms  2001:4860::1:0:4ca2
  6    11 ms    11 ms    11 ms  2001:4860::8:0:5038
  7    26 ms    26 ms    26 ms  2001:4860::8:0:3098
  8    30 ms    29 ms    28 ms  2001:4860::2:0:6e0
  9     *        *        *     Request timed out.
 10    26 ms    26 ms    26 ms  bk-in-x93.1e100.net [2a00:1450:4008:c01::93]

This renders the web GUI completely unresponsive and unusable until you kill the traceroute6 process via console.
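
The manual recovery described above (killing the stuck process) can in principle be automated by putting a hard wall-clock limit around the child process. The following is only a minimal Python sketch of that general idea. It is not how the pfSense GUI invokes traceroute6, and the helper name `run_with_deadline` and the 120 second limit in the comment are hypothetical.

```python
import subprocess


def run_with_deadline(cmd, deadline_s):
    """Run cmd and return (stdout_text, timed_out).

    Hypothetical helper: the child is killed if it is still running
    after deadline_s seconds, regardless of its own timeout options.
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=deadline_s
        )
        return done.stdout, False
    except subprocess.TimeoutExpired as exc:
        # subprocess.run() kills and reaps the child before re-raising.
        partial = exc.stdout or ""
        if isinstance(partial, bytes):
            # Partial output may arrive as bytes even with text=True.
            partial = partial.decode(errors="replace")
        return partial, True


# Example call (not executed here; the 120 s limit is an arbitrary assumption):
# output, timed_out = run_with_deadline(
#     ["/usr/sbin/traceroute6", "-w", "2", "-m", "18", "www.google.com"], 120
# )
```

A wrapper like this only bounds how long the caller waits. It does not address why the process fails to time out on its own.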

Actions #1

Updated by Jim Pingle almost 11 years ago

  • Status changed from New to Feedback

I can't reproduce this on current 2.1 code.

In the GUI we pass "-w 2" which waits a max of two seconds for a reply from the target server for each trace attempt. For me, from the GUI to www.google.com, it hits hop 8 and times out three times then hits the last hop, so there is a ~6 second pause but it does proceed.

From the shell if you don't pass it -w X, then it does hang indefinitely waiting for a reply.

Edit the source of /usr/local/www/diag_traceroute.php, uncomment the line that echoes the traceroute command when executed, then try it again and paste the output here. Also try it from the shell with -w 1 or -w 2.

Actions #2

Updated by Doktor Notor almost 11 years ago

There's no real need to uncomment anything there. The command stays visible in the process listing from the console until you kill it manually:


62161 ?? S 0:00.03 /usr/sbin/traceroute6 -w 2 -m 18 www.google.com

And it's the same story with -w X from the command line, of course: the timeout never happens.
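
For scale, if -w were honoured, the time the command in the process listing above could spend waiting would be bounded. A rough worked example in Python, assuming traceroute6's default of three probes per hop (consistent with the three timings per hop in the output) and ignoring DNS lookups and other overhead; the function name is hypothetical:

```python
def worst_case_wait_seconds(max_hops, probes_per_hop, wait_s):
    """Upper bound on time spent waiting if every single probe times out."""
    return max_hops * probes_per_hop * wait_s


# One silent hop with -w 2: the ~6 second pause described earlier in the thread.
print(worst_case_wait_seconds(1, 3, 2))   # 6

# The full command: -m 18 -w 2, three probes per hop.
print(worst_case_wait_seconds(18, 3, 2))  # 108
```

So a run that is still alive many minutes later is not merely slow; the per-probe timeout is not taking effect.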

Actions #3

Updated by Doktor Notor almost 11 years ago

BTW, installed mtr-nox11, no such issue:

HOST: gw.example.com Loss% Snt Last Avg Best Wrst StDev
...
2.|-- gige-g2-20.core1.prg1.he. 0.0% 10 3.1 6.3 2.8 11.3 3.8
3.|-- nixcz-v6.net.google.com 0.0% 10 11.7 12.0 11.1 17.8 2.1
4.|-- 2001:4860::1:0:4ca2 0.0% 10 10.9 12.8 10.9 20.6 3.7
5.|-- 2001:4860::8:0:5039 0.0% 10 11.0 12.4 11.0 21.8 3.3
6.|-- 2001:4860::8:0:3097 0.0% 10 26.0 26.4 25.9 28.3 0.8
7.|-- 2001:4860::2:0:6e0 0.0% 10 25.9 26.5 25.9 28.6 0.9
8.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
9.|-- bk-in-x69.1e100.net 0.0% 10 25.9 26.0 25.7 26.4 0.2

Actions #4

Updated by Jim Pingle almost 11 years ago

  • Status changed from Feedback to New

I was finally able to reproduce it. I tried it on a few different pfSense boxes and FreeBSD systems, and I could only reproduce it on i386, on both FreeBSD and pfSense.

Amd64 didn't seem to have any problems in the same situations. My other test systems either never had a timeout/skipped entry, or they timed out as expected (* * *) and kept moving.

Not sure what we can effectively do to counter that.

Actions #5

Updated by Doktor Notor almost 11 years ago

Hmmm well, not sure either, beyond either a shiny red warning (think about remotely managed boxes, cutting yourself off the GUI kinda sucks :-P) or maybe the mtr utility would be a good replacement (it'd need some polishing, the current mtr GUI does not offer IPv6 dropdown).

Actions #6

Updated by Jim Pingle almost 11 years ago

MTR is an entirely different type of test. Useful, but probably not one we'd include by default. And yes its GUI does need a bit of polish.

It is likely a bug in FreeBSD's traceroute6. I didn't have any i386 systems on FreeBSD 9 or 10 that also had a path which included a timeout. If the bug is gone in a current traceroute6, we may be able to patch in a fix.

Actions #7

Updated by Doktor Notor almost 11 years ago

Looks like the code was last touched (beyond irrelevant cosmetics) almost 4 years ago. Unlikely to have any fix.

Actions #8

Updated by Doktor Notor almost 11 years ago

FWIW, I ran it under truss (truss /usr/sbin/traceroute6 -w 2 -m 18 www.google.com). It looks like it does actually make it through to the last hop, but there it gets stuck in an endless poll/recvmsg loop:

sendto(4,"\^V\b\0\0Q\M-W\^V\M-_\0\r9\M-j",12,0x0,{ AF_INET6 [2a00:1450:4008:c01::69]:33456 },0x1c) = 12 (0xc)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050592.741271 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 136 (0x88)
gettimeofday({1373050592.858897 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050593.752360 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050594.761495 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050595.771728 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050596.781864 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050597.791833 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050598.802189 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050599.812327 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050600.822777 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050601.832704 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050602.843511 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 136 (0x88)
gettimeofday({1373050603.140628 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050603.853568 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050604.863216 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050605.873673 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050606.883456 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050607.843754 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050607.893793 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050608.904070 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050609.914040 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 136 (0x88)
gettimeofday({1373050610.481887 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050610.924084 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 136 (0x88)
gettimeofday({1373050611.912224 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050611.934322 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050612.944385 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050613.955584 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050614.965010 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
gettimeofday({1373050615.975144 },0x0)           = 0 (0x0)
poll({3/POLLIN},1,2000)                          = 1 (0x1)
recvmsg(0x3,0x804ebfc,0x0,0xdf16d751,0xea390d00,0x51d716df) = 24 (0x18)
...

This continues until aborted via CTRL+C. I'm afraid I can't help any further with this.
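
One possible reading of the trace above, offered as an assumption rather than a confirmed diagnosis: every poll() returns within its 2000 ms budget because some message (mostly 24 bytes, roughly one per second) keeps arriving on the socket, so a wait that grants each poll() a fresh 2000 ms would never expire. The sketch below is a toy Python simulation of that idea. It is not traceroute6 source code and not the patch referenced later in this ticket; the function name, the arrival times and the `use_deadline` switch are all hypothetical.

```python
def time_until_timeout(arrivals, waittime, use_deadline):
    """Simulate waiting for a reply while non-matching packets keep arriving.

    arrivals: ascending arrival times (seconds) of packets that do NOT match
              the probe and are therefore ignored by the waiting loop.
    Returns the simulated time at which the wait finally gives up.
    """
    now = 0.0
    deadline = waittime  # fixed point in time, only used if use_deadline
    for t in arrivals:
        # Budget handed to this poll(): the remaining time until the fixed
        # deadline, or a fresh full waittime on every iteration.
        budget = (deadline - now) if use_deadline else waittime
        if t - now >= budget:
            return now + budget  # poll() expires before the next packet
        now = t  # packet arrived in time, is ignored, and we poll again
    budget = (deadline - now) if use_deadline else waittime
    return now + budget


# Stray packets one second apart for a minute, with -w 2.
stray = [float(i) for i in range(1, 61)]
print(time_until_timeout(stray, 2.0, use_deadline=False))  # 62.0
print(time_until_timeout(stray, 2.0, use_deadline=True))   # 2.0
```

In this toy run the resetting variant only gives up once the stray packets stop, while the deadline variant gives up after the configured two seconds no matter how many unrelated packets arrive.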

Actions #9

Updated by Chris Buechler over 9 years ago

  • Category set to Operating System
  • Status changed from New to Confirmed

It's pf that makes this hang somehow. Disable pf and traceroute6 finishes with no problem. No blocked traffic is being logged.

Actions #10

Updated by Chris Buechler almost 9 years ago

  • Status changed from Confirmed to Feedback

This doesn't seem to be an issue in 2.2.x.

Actions #11

Updated by Kill Bill almost 9 years ago

Well, it still hangs here exactly the same as ever. I tried pfctl -d before running this and it did not help in any way either.

Actions #12

Updated by Chris Buechler almost 8 years ago

  • Status changed from Feedback to Confirmed
  • Target version set to 2.3.2
  • Affected Version changed from 2.1-IPv6 to All

Denny Page tracked down the source of this issue and opened this FreeBSD PR with a patch.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210286

Actions #13

Updated by Renato Botelho almost 8 years ago

  • Assignee set to Renato Botelho

I'll run some tests and import the patch into our tree.

Actions #14

Updated by Renato Botelho almost 8 years ago

  • Status changed from Confirmed to Feedback

Imported the traceroute6 patch into the FreeBSD-src repo. It'll be available in the next round of snapshots.

Actions #15

Updated by Chris Buechler almost 8 years ago

  • Status changed from Feedback to Resolved

works
