Skip to main content

Asterisk High CPU between 200% and 450% with only 15 active calls

Posted by matthewmalk248 on Tue, 03/20/2018

Hey gang, I'm in a bit of a pickle and hoping someone could assist (free or paid). Started getting choppy calls and latency with extensions showing as LAGGED. Noticed Asterisk is currently between 200% and 450% on our MT server. Currently have another asterisk based deployment with less vCPUs on the same VMware host with the same number of active calls hanging out at around 9%. Haven't installed any new software on the TLMT server. I can temporarily relieve the call problems with a core restart, but asterisk shoots right back up to 200%+ and eventually calls will start dropping again after a few hours.

Here's my 'top'...

Tasks: 407 total, 1 running, 405 sleeping, 0 stopped, 1 zombie
Cpu(s): 2.8%us, 1.4%sy, 1.2%ni, 73.3%id, 0.0%wa, 0.0%hi, 21.3%si, 0.0%st
Mem: 33012812k total, 14638936k used, 18373876k free, 386404k buffers
Swap: 2097148k total, 0k used, 2097148k free, 8085912k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16737 asterisk 20 0 6853m 144m 15m S 283.8 0.4 24:01.48 asterisk
4676 root 20 0 18.8g 3.1g 24m S 90.7 9.9 131:20.22 java
23399 root 30 10 410m 97m 6436 S 42.4 0.3 41:47.78 plackup
68 root 20 0 0 0 0 S 19.5 0.0 20:49.06 events/1
69 root 20 0 0 0 0 S 16.6 0.0 7:38.44 events/2
23397 root 30 10 279m 51m 3768 S 12.6 0.2 19:03.77 perl
3760 root 20 0 104m 32m 2252 S 10.6 0.1 1162:32 perl
77 root 20 0 0 0 0 S 3.3 0.0 45:11.88 events/10
3 root RT 0 0 0 0 S 2.6 0.0 1:01.26 migration/0
3189 root 20 0 10.9g 254m 16m S 0.7 0.8 74:15.20 java
22711 root 20 0 15296 1516 948 R 0.7 0.0 0:00.29 top
17 root 20 0 0 0 0 S 0.3 0.0 2:03.29 ksoftirqd/3
41 root 20 0 0 0 0 S 0.3 0.0 3:04.97 ksoftirqd/9
2538 root 20 0 2047m 65m 1428 S 0.3 0.2 52:49.65 turnserver
13331 root 20 0 4968m 253m 13m S 0.3 0.8 38:22.88 java
16784 asterisk 20 0 15472 2080 1856 S 0.3 0.0 0:00.67 mpg123
22247 asterisk 20 0 15472 2352 2128 S 0.3 0.0 0:00.21 mpg123
1 root 20 0 19340 1532 1232 S 0.0 0.0 0:24.95 init

Here's the ps -LlFm output showing multiple threads using oddly high cpu..

1 S asterisk - - 16841 19 - 80 0 - - poll_s - 15 10:24 - 00:02:32 -
1 R asterisk - - 16846 24 - 80 0 - - - - 10 10:24 - 00:03:10 -
1 S asterisk - - 16889 0 - 80 0 - - poll_s - 9 10:24 - 00:00:00 -
1 S asterisk - - 16891 0 - 80 0 - - hrtime - 11 10:24 - 00:00:00 -
1 S asterisk - - 16892 0 - 80 0 - - hrtime - 11 10:24 - 00:00:00 -
1 S asterisk - - 16893 0 - 80 0 - - futex_ - 9 10:24 - 00:00:00 -
1 S asterisk - - 16894 0 - 80 0 - - futex_ - 6 10:24 - 00:00:00 -
1 S asterisk - - 16895 0 - 80 0 - - futex_ - 9 10:24 - 00:00:00 -
1 S asterisk - - 16896 0 - 80 0 - - poll_s - 11 10:24 - 00:00:00 -
1 S asterisk - - 16905 0 - 80 0 - - futex_ - 8 10:24 - 00:00:00 -
1 S asterisk - - 18385 0 - 80 0 - - inotif - 2 10:26 - 00:00:00 -
1 S asterisk - - 21158 5 - 80 0 - - poll_s - 9 10:31 - 00:00:21 -
1 S asterisk - - 21612 7 - 80 0 - - poll_s - 8 10:31 - 00:00:26 -
1 S asterisk - - 22103 9 - 80 0 - - poll_s - 0 10:32 - 00:00:29 -
1 S asterisk - - 22622 8 - 80 0 - - poll_s - 0 10:33 - 00:00:20 -
1 S asterisk - - 22712 8 - 80 0 - - poll_s - 0 10:34 - 00:00:18 -
1 S asterisk - - 23365 11 - 80 0 - - poll_s - 1 10:35 - 00:00:15 -
1 S asterisk - - 23388 3 - 80 0 - - poll_s - 9 10:35 - 00:00:04 -
1 S asterisk - - 23834 4 - 80 0 - - poll_s - 7 10:36 - 00:00:03 -
1 S asterisk - - 23930 7 - 80 0 - - poll_s - 0 10:36 - 00:00:04 -
1 S asterisk - - 24160 7 - 80 0 - - poll_s - 3 10:37 - 00:00:00 -
1 S asterisk - - 24210 3 - 80 0 - - futex_ - 9 10:37 - 00:00:00 -

and here's the pstack of one of those high threads..

Thread 1 (process 16846):
#0 0x0000003d366db7ed in write () from /lib64/libc.so.6
#1 0x000000000056e5df in tcptls_stream_write ()
#2 0x0000003d36666e49 in _IO_cookie_write () from /lib64/libc.so.6
#3 0x0000003d3667180a in _IO_new_file_xsputn () from /lib64/libc.so.6
#4 0x0000003d366678bd in fwrite () from /lib64/libc.so.6
#5 0x000000000057ec0f in ast_careful_fwrite ()
#6 0x00000000004fe954 in process_events ()
#7 0x0000000000511d9f in session_do ()
#8 0x000000000056f573 in handle_tcptls_connection ()
#9 0x000000000057c3f8 in dummy_start ()
#10 0x0000003d36a07aa1 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003d366e8bcd in clone () from /lib64/libc.so.6


Submitted by matthewmalk248 on Tue, 03/20/2018 Permalink

Update: After stopping and restarting iptables, asterisk calmed back down to 9%. Now to figure out why :/

Submitted by matthewmalk248 on Tue, 03/20/2018 Permalink

I didn't try a reboot before restarting iptables. But ever since I restarted iptables, the CPU% of Asterisk seems to be crawling back up around 5% every few hours. It got up to 40% so I restarted iptables again to bring it back down to normal (9% during 15 active calls). I plan on doing a reboot in the middle of the night tonight to see if it matters.

Submitted by volodya on Tue, 03/20/2018 Permalink

Hi!

Can you see something unusual in Asterisk CLI?

Submitted by matthewmalk248 on Tue, 03/20/2018 Permalink

Hi Volodya, nothing out of the ordinary. Just the usual error spam from users using thirdlane connect on their phones.

"chan_sip.c:17208 register_verify: 'UDP' is not a valid transport for '100-TENANTNAME'. we only use 'WS'! ending call."

In reply to by volodya