EXPLANATION
Ports 5060 and 5061, on both TCP and UDP, are assigned by IANA to the
Session Initiation Protocol (SIP).
In particular, port 5060 is assigned to clear-text SIP, and port 5061
is assigned to encrypted SIP, also known as SIP-TLS (SIP over a TLS,
Transport Layer Security, encrypted channel). Unfortunately, standard
TLS (the successor of SSL) can only be established over TCP. Does
this mean UDP-based SIP communications have to travel unencrypted?
Fortunately not. An IETF draft covering SIP-DTLS is in the queue (DTLS
is Datagram TLS, that is, TLS adapted to datagram transports such as
UDP)... but what is SIP?
The Session Initiation Protocol (SIP) is a signaling protocol for
Internet multimedia conferencing and communications, such as Voice over
IP (VoIP). It is extensively used on VoIP architectures, and is becoming
the IETF alternative and "natural" replacement for the complex H.323
signaling protocol. As a signaling protocol, SIP is mainly focused on
managing and negotiating all the features of VoIP sessions (or other
multimedia-based protocols), such as setting up and tearing down VoIP
calls.
SIP is probably one of the most complex protocols, judging by the
length of its core specification, RFC 3261, and by its flexibility and
extensibility, which have led to the publication of multiple standards
that update it and add new functionality to it. Today, a search for
SIP on RFC Editor returns 157 references.
However, it is a text-based request and response protocol, very
similar to HTTP (web): SIP messages are made of headers and a body, it
allows multiple request methods (REGISTER, INVITE, ACK, BYE, etc.), it
identifies entities by URIs (Uniform Resource Identifiers), and its
responses are grouped into classes (success (2xx), redirection
(3xx), etc.); all of these features are shared with HTTP. SIP messages
are plain text (RFC 3261 specifies the UTF-8 charset) and can make use
of the standard Multipurpose Internet Mail Extensions (MIME)
capabilities. This protocol is tightly
integrated with the most commonly used media protocol, RTP (Real-time
Transport Protocol), used to carry the streaming and real-time
multimedia sessions. The ports used for RTP communications are
dynamically negotiated by SIP when a new VoIP or multimedia session is
established. This behavior is very similar to FTP, where the control
channel (TCP/21) manages the communication and negotiates the ports for
the data channel, the one that transports the information (TCP/20 or
dynamically negotiated). Readers interested in inspecting a
SIP packet capture can check
this sample from the Wireshark repository, where SIP Digest authentication is used.
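To make the HTTP-like layout and the RTP port negotiation more concrete,
here is a minimal Python sketch. It splits a SIP message into start line,
headers and body, and then reads the media port announced in the SDP body.
The function names and the example INVITE (addresses, Call-ID, port 49170)
are made up for illustration and are not taken from the capture mentioned
above. Header folding, repeated headers, compact header names and
Content-Length handling are deliberately ignored.

```python
def parse_sip_message(raw: str):
    """Split a SIP message into (start_line, headers, body).

    Simplified: assumes CRLF line endings, one value per header name,
    and no folded (multi-line) headers.
    """
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


def rtp_ports_from_sdp(sdp: str):
    """Return {media_type: port} taken from the SDP 'm=' lines."""
    ports = {}
    for line in sdp.splitlines():
        if line.startswith("m="):
            # Format: m=<media> <port> <proto> <fmt> ...
            media, port, *_ = line[2:].split()
            # The port field may be written as <port>/<number of ports>.
            ports[media] = int(port.split("/")[0])
    return ports


# Hypothetical INVITE, invented for this example only.
EXAMPLE_INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP client.example.org:5060\r\n"
    "From: <sip:alice@example.org>\r\n"
    "To: <sip:bob@example.com>\r\n"
    "Call-ID: 1234@client.example.org\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
    "o=alice 1 1 IN IP4 192.0.2.10\r\n"
    "s=-\r\n"
    "c=IN IP4 192.0.2.10\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
)

if __name__ == "__main__":
    start_line, headers, body = parse_sip_message(EXAMPLE_INVITE)
    print(start_line)
    print(headers["content-type"])
    print(rtp_ports_from_sdp(body))
```

In this example the signaling travels on port 5060, while the audio stream
is offered on whatever port the "m=" line announces, which is why RTP
traffic cannot be located by port number alone.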
Why is SIP (or why can it be) an important protocol? As VoIP and other
multimedia capabilities, such as video on demand (VoD) and triple play
communications, are being widely deployed by service providers and
companies, SIP is silently entering our networks. However, some
providers do not use SIP over the standard ports, and select their own
set of ports; this obviously breaks interoperability between
providers, as it is like setting up your web server on a port other
than TCP/80 (clients need to know which port it is listening on).
Additionally, if SIP P2P (peer to peer) communications gain adoption,
where intermediate proxies and servers are not required for two (or
more) users to establish direct multimedia communications, we might end
up opening these ports (TCP & UDP / 5060 & 5061) in most
firewalls, especially in SOHO environments.
You can start looking for SIP traffic on your networks today. Capture
some traffic and analyze it. Using Wireshark you can apply different
display filters to check whether SIP is present:
udp.port == 5060 or udp.port == 5061
tcp.port == 5060 or tcp.port == 5061
sip or sdp
frame contains "SIP"
ssl or dtls
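Because some providers run SIP on non-standard ports, a content check can
complement the port-based filters above. The following Python sketch is a
rough heuristic in the spirit of the frame contains "SIP" filter: it looks
only at the first line of a payload and checks whether it has the shape of
a SIP request line or status line. The function name is hypothetical, and
the check only works on clear-text SIP; traffic protected by TLS or DTLS
will not match.

```python
def looks_like_sip(payload: bytes) -> bool:
    """Heuristically decide whether a payload starts like a SIP message.

    Requests start with "<METHOD> <Request-URI> SIP/2.0".
    Responses start with "SIP/2.0 <status code> <reason phrase>".
    """
    first_line = payload.split(b"\r\n", 1)[0].decode("utf-8", errors="replace")
    if first_line.startswith("SIP/2.0 "):
        return True  # status line of a response
    parts = first_line.split(" ")
    # Request line: upper-case method, a URI, then the protocol version.
    return len(parts) == 3 and parts[0].isupper() and parts[2] == "SIP/2.0"


if __name__ == "__main__":
    print(looks_like_sip(b"REGISTER sip:example.com SIP/2.0\r\n\r\n"))
    print(looks_like_sip(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"))
```

A check like this could be run over UDP or TCP payloads extracted from a
capture, regardless of the port they were seen on.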