VoIP Monitoring Tools

Monitoring tools come in these categories:

  • Reactive Passive Monitoring
  • Comprehensive Passive Monitoring
  • Active Probing

Reactive Passive Monitoring is when you run tcpdump or Wireshark to troubleshoot individual call problems. This is the most critical capability to have, but capturing and analyzing real-time data streams is not an easy skill.

Comprehensive Passive Monitoring (CPM) systems monitor all calls, all the time. They watch both signaling and media. Examples:

  • Acme Packet QoS Module
  • Empirix Hammer Call Analyzer
  • RadCom passive probes
  • Psytechnics Experience Manager
  • RTCP Analyzers

Active Probing is where a system actively places end-to-end calls, possibly including an analog or POTS portion, and analyzes call setup delay and audio quality. Examples:

  • Minacom (now sold by Tektronix as PowerProbe Active Assurance)
  • Brix
  • RadCom active probes
  • NetAlly

The most valuable is Reactive Passive Monitoring. This can just be a box where you can run Wireshark on live traffic. As a VoIP service provider, you're in bad shape without this.

The second-most-valuable is Active Probing, because it can tell you about the actual experience your subscribers are having (excluding effects of the CPE SIP phone itself). These systems can detect failing PSTN gateways, slow call setup, end-to-end delay, packet loss, jitter, problems checking voicemail, or problematic DSP code in your gateway. This type of system can also compare call quality through one SIP peer or PSTN gateway versus another. For example, do calls sound better going via Verizon, or via Level(3)? Active Probing systems with analog ports can tell you.

Comprehensive Passive Monitoring systems are the Big Sellers right now. They're nice to have, but can also lie. Because of *fundamental* limitations, one of these may say calls are perfect while in reality the speech is unintelligible. They can only see traffic as it passes one point in the network. They can never tell what it's like at a customer location.

       ----1. RTP from CPE--->         ----2. RTP from CPE---->
[CPE]                           [CPM]                            [PSTN Gateway]
       <---3. RTP from PSTN---         <---4. RTP from PSTN----

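A CPM can only detect network degradation along link 1 and link 4, because those are the only links the RTP has crossed by the time the probe sees it. Whatever happens on links 2 and 3, after the RTP leaves the probe, is invisible. CPM vendors try to solve this by selling you lots of probes to sprinkle through your network, but there are always blind spots.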
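To make the blind spot concrete, here is a minimal sketch that compares the loss a single probe would observe with the loss each endpoint actually experiences. The link numbers follow the diagram above. The function names and the assumption that loss on each link is independent are mine, for illustration only.

```python
def combined_loss(*link_losses):
    """Loss fraction after crossing several links in series.

    Assumes loss on each link is independent (an illustrative simplification).
    """
    delivered = 1.0
    for p in link_losses:
        delivered *= (1.0 - p)
    return 1.0 - delivered


def probe_vs_endpoint(loss):
    """Compare what the CPM sees with what the far end receives.

    `loss` maps link number (1-4, as in the diagram) to a loss fraction.
    """
    return {
        # CPE -> gateway stream: probe sees it after link 1,
        # the gateway receives it after links 1 and 2.
        "cpe_to_gw": {
            "probe_sees": combined_loss(loss[1]),
            "gateway_receives": combined_loss(loss[1], loss[2]),
        },
        # Gateway -> CPE stream: probe sees it after link 4,
        # the CPE receives it after links 4 and 3.
        "gw_to_cpe": {
            "probe_sees": combined_loss(loss[4]),
            "cpe_receives": combined_loss(loss[4], loss[3]),
        },
    }


# Hypothetical scenario: 20% loss on the customer access link (link 3) only.
report = probe_vs_endpoint({1: 0.0, 2: 0.0, 3: 0.20, 4: 0.0})
for stream, values in report.items():
    for name, value in values.items():
        print(f"{stream:10s} {name:17s} {value:.0%}")
```

In this scenario the probe reports 0% loss in both directions while the subscriber loses one packet in five of the audio coming from the PSTN.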

MOS values calculated from packet performance (e.g., Empirix) are nearly useless. I've done research on this with a Ph.D. engineer, and I've read the vendor technical white papers. These MOS values are, at best, SWAGs at the actual call quality.
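To show why a packet-based MOS is only an estimate, here is a rough sketch in the style of the ITU-T G.107 E-model. This is not any vendor's actual algorithm. The R-to-MOS mapping is the standard G.107 one; the delay term is a widely used simplification (often attributed to Cole and Rosenbluth); and the defaults `ie=0.0` and `bpl=4.3` are commonly quoted values for G.711 without packet-loss concealment, which I'm treating as assumptions here.

```python
def r_to_mos(r):
    """Map an E-model R-factor to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def estimate_mos(loss_pct, one_way_delay_ms, ie=0.0, bpl=4.3):
    """Estimate MOS from packet loss (percent) and one-way delay (ms).

    Simplified model: R = 93.2 - Id - Ie_eff, assuming random (non-bursty)
    loss. `ie` and `bpl` are codec-dependent constants (assumed values).
    """
    d = one_way_delay_ms
    # Delay impairment: linear term plus a penalty above ~177 ms.
    delay_impairment = 0.024 * d
    if d > 177.3:
        delay_impairment += 0.11 * (d - 177.3)
    # Effective equipment impairment grows with packet loss.
    ie_eff = ie + (95.0 - ie) * loss_pct / (loss_pct + bpl)
    return r_to_mos(93.2 - delay_impairment - ie_eff)


# Note what is NOT an input: the audio itself. Echo, clipping, or bad DSP
# encoding cannot change the result.
for loss, delay in [(0.0, 20.0), (1.0, 50.0), (5.0, 150.0)]:
    print(f"loss={loss}% delay={delay}ms -> MOS ~ {estimate_mos(loss, delay):.2f}")
```

The only inputs are loss and delay, so a call with severe echo and clean packet delivery scores as well as a genuinely good call.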

The Acme Packet QoS CDRs do give some useful data. They report packet loss and jitter for the RTP as it arrives at the Acme Packet Session Director (SD). They suffer from the same fundamental problem -- the SD cannot tell what happens to the RTP after it leaves the SD itself. But seeing even "half" of the RTP path is better than nothing.
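For reference, here is a minimal sketch of how a receiver-side device can derive loss and jitter from the RTP it sees, using the interarrival jitter estimator from RFC 3550. This is not Acme Packet's implementation. The function name and input format are hypothetical, and sequence-number and timestamp wraparound are ignored to keep it short.

```python
def rtp_stats(packets, clock_rate=8000):
    """Compute loss and RFC 3550 interarrival jitter for one RTP stream.

    `packets` is an iterable of (arrival_time_seconds, rtp_seq, rtp_timestamp)
    in arrival order. Wraparound of the 16-bit sequence number and 32-bit
    timestamp is NOT handled in this sketch.
    """
    jitter = 0.0          # running estimate, in RTP timestamp units
    prev_transit = None
    seen = set()

    for arrival, seq, ts in packets:
        # Relative transit time: arrival (in timestamp units) minus RTP timestamp.
        transit = arrival * clock_rate - ts
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0   # RFC 3550 smoothing, gain 1/16
        prev_transit = transit
        seen.add(seq)

    if not seen:
        return {"expected": 0, "received": 0, "lost": 0,
                "loss_fraction": 0.0, "jitter_ms": 0.0}

    expected = max(seen) - min(seen) + 1
    received = len(seen)
    lost = expected - received
    return {
        "expected": expected,
        "received": received,
        "lost": lost,
        "loss_fraction": lost / expected,
        "jitter_ms": jitter / clock_rate * 1000.0,
    }


# Hypothetical G.711 stream: 20 ms packets (160 timestamp units), packet 5
# never arrives and packet 7 arrives 10 ms late.
stream = [(i * 0.020 + (0.010 if i == 7 else 0.0), i, i * 160)
          for i in range(10) if i != 5]
print(rtp_stats(stream))
```

These numbers describe the stream only up to the point of measurement, which is exactly the "half of the path" limitation described above.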

Psytechnics has the same fundamental limitations as other CPM systems, but it is the only system I've studied that can actually detect audio problems like echo, cell phone effects, bad speakerphones, or DSP encoding problems. Empirix or RadCom (passive probes) would look at a call with echo and say the call is perfect. Psytechnics and Empirix would both fail to detect problems if the RTP is degraded along the path *after* the probe analyzes it.