SIP FAQ |
![]() SIP Basics |
![]() ![]() What is SIP? |
SIP stands for Session Initiation Protocol. SIP is an Internet proposed standard documented in RFC 2543 for setting up, controlling and tearing down sessions in the Internet. Sessions include, but are not limited to, Internet telephone calls and multimedia conferences. |
SIP can initiate calls between regular telephones, if
the switch is suitably equipped (so-called PINT mechanism). |
![]() ![]() Who is implementing SIP? |
Lots of people. Take a look at the partial lists at http://www.cs.columbia.edu/sip/implementations.html, http://www.pulver.com/sip/products.html, and http://www.sipforum.org/. At the last SIP bake-off, there were approximately 60 implementations. |
![]() ![]() Where can I find more information about SIP? |
The
SIP home page contains
additional information about SIP.
Until September 1999, discussion about SIP took place on the MMUSIC mailing list. The working group and mailing list are still the appropriate place for discussions related to SDP, SAP and RTSP, among other topics. SIP standardization has now moved to the SIP working group, with its own mailing list. The SIP Forum at http://www.sipforum.org also provides information about SIP. |
![]() ![]() What is RFC2543bis? |
RFC2543bis is an updated version of the SIP protocol that
incorporates bug fixes and enhancements that have surfaced since RFC2543
was issued. Most of the changes in RFC2543bis were discussed extensively
on the SIP mailing list; hence, most of them are very likely to be included
in the next official version of SIP.
The latest version of this document can be found at http://www.cs.columbia.edu/~hgs/sip |
![]() ![]() What is not described in the SIP spec? |
The SIP specification describes what is necessary to
create interoperable implementations. It does not describe or limit the
functionality offered by implementations or impose minimum functionality
beyond request handling. Thus, SIP does not specify any of the following:
- the color of the phones that use SIP;
- what operating system is used to implement a SIP phone;
- what language is used to display error messages;
- what sound, if any, is used to indicate ringback or ringing;
- whether a registrar, proxy or redirect server is implemented in one process, on one machine or distributed across a network;
- which media types or codecs the SIP implementation supports;
- which type of network SIP is used in;
- what type of security measures are required to be used;
- the GUI for SIP calls;
- how to build a network of proxy servers;
- whether a phone can make calls while it's in a call;
- qualifications for human beings to make use of SIP softphones;
- how generous an implementation should be in dealing with requests and responses that have mistakes and how much error detail information it should provide;
- how many simultaneous calls the device can be in;
- what parser generator tool, if any, should be used by implementations;
- what threading model to use. |
![]() ![]() There are lots of SIP extensions. Will the protocol ever be stable? |
Innovation will not stop, since this isn't the phone
system. However, the basic protocol has been stable for quite a while and
is sufficient to implement all basic phone services. Enhancements tend to
be for specialized services, such as ISUP interworking, QoS negotiation,
liveness detection, caller preferences or presence/instant messaging. All
of these are backward-compatible with the basic protocol, with extensions
negotiated if both sides support them. Basic calls will succeed without
the extensions. Note also that just because something is proposed doesn't mean it will ever become a standard. (Only about 10% of Internet drafts ever get published as RFCs.) |
![]() SIP Functionality |
Describes how to provide standard services with SIP.
|
![]() ![]() Does SIP support the standard telephone features? |
Yes. SIP supports, among others:
- call forwarding unconditional, busy, ...
- call transfer (call control spec)
- caller ID
- call hold
- 3-way conferences and multiparty conferencing (call control spec)
- call return ("*69")
- call park (with NOTIFY)
- follow-me
- find-me
- call waiting
- IVR systems
- multiple line presences
- camp on
- call queueing
- automatic call distribution
- do not disturb

Some services, like repetitive dialing, station speed dialing, last number redial, and distinctive ringing, are implemented purely in the end system and require no support from the signaling protocol. The Telecommunications Industry Association (TIA) is working on a recommendation http://www.tiaonline.org/standards/ip/ for business PBX-style services and other Internet phone requirements. |
![]() ![]() How does SIP provide message waiting services? |
First, there may well be other protocols that are more
appropriate to indicating that voice or email messages are available, for
example POP or IMAP. In particular, once one moves beyond a simple message
waiting light to indication of message counts, urgency, senders, etc., it
is likely that any SIP-based solution starts replicating POP or IMAP
functionality.
A SIP-based solution has not been standardized yet. Using SUBSCRIBE and
NOTIFY appears to be the most appropriate approach. |
![]() ![]() Why does CSeq not have a compact form? |
This was an oversight, as all other SIP mandatory
headers have a compact form. However, it's too late to change this since
older implementations would not know what to do with the single-character
form. Fortunately, the penalty is only three bytes. |
![]() ![]() How does the location server communicate with a proxy? |
Location server and proxy server are logical entities,
not physical ones. They can reside in one host or application or be
distributed across several hosts or applications. The API or protocol used
between them is not specified, as it is an implementation decision.
(Generally, most implementations combine a location server with a proxy.)
|
![]() ![]() How does SIP support caller ID? |
Caller-ID is provided by the From SIP header
containing the caller's name and "number". The number would most likely be
placed in the user field of a SIP URL or appear in a tel: URL.
Since the callee generally does not know or trust the caller's server, only cryptographic signatures can be used to ensure that the information is valid. For example, the outgoing proxy might be operated by an ISP, enterprise or phone company and sign for the identity of the caller, using the signed-by parameter, with the identity of the company verified by a public key certificate similar to those used by web sites. See section 13.2 of RFC2543bis for encryption details. |
![]() ![]() How do I charge/bill for Internet telephony using SIP? |
Dean Willis wrote with regards to billing for SIP services: Why can't service providers make a living providing (at a fixed cost) access to "free services"? Do carriers do per HTTP-transfer billing now? How much should they charge for an email? For a call, what parameters might be used? Bandwidth, duration, distance -- the Big Factors of the POTS bill -- are not issues that SIP is concerned with. |
![]() ![]() How do prepaid calling cards work in a SIP network? |
Note that, in general, prepaid calling cards only make sense in an IP network if there is a special-purpose VoIP internet, calls traverse an IP-to-PSTN gateway or VoIP packets receive special treatment. The SIP requests are forced to traverse a stateful proxy or a back-to-back user agent (B2BUA), which controls the Internet telephony gateway, router QoS function or firewall, depending on the architecture. When the time is used up, the B2BUA or gateway issues a BYE request to both parties, using the existing call ID. It also disables the gateway connection, turns off any special QoS treatment for the RTP packets or closes the firewall for that stream. This requires no additions to either caller or callee. Sending a BYE suffices only if at least one of the end systems can be trusted to actually terminate the call when a BYE is received. |
![]() ![]() Does SIP do conference control? |
SIP leaves conference control, such as the election of a chair or floor control, to other protocols. SIP can be used for non-conferencing applications and floor control may be used outside the scope of SIP-initiated calls, so it seemed best to separate the functionality. However, SDP may be used to indicate which media are subject to floor control and what tools and protocols are to be used. Unfortunately, there is no IETF-standardized floor control protocol. |
![]() ![]() How do I put a call on hold? |
The party wishing to put the other party on hold sends a (re)INVITE, with a session description containing a null (0.0.0.0) address. When used with SDP, the "c" address field of one or more media types is set to zero. |
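As a rough illustration of the hold mechanism described above, the following Python sketch rewrites the "c=" lines of an SDP body so that the connection address becomes 0.0.0.0. The function name `hold_sdp` and the sample SDP are made up for this example; it only handles IPv4 "c=" lines and leaves everything else in the session description untouched.

```python
def hold_sdp(sdp: str) -> str:
    """Return a copy of an SDP body with every IPv4 "c=" address set to 0.0.0.0."""
    out = []
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            # Keep the network and address type; null out the address itself.
            line = "c=IN IP4 0.0.0.0"
        out.append(line)
    return "\r\n".join(out) + "\r\n"


# Hypothetical session description from an earlier INVITE.
offer = "v=0\r\nc=IN IP4 1.2.3.4\r\nm=audio 54678 RTP/AVP 0\r\n"
held = hold_sdp(offer)
```

The resulting body would be carried in the (re)INVITE that places the other party on hold.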
![]() ![]() How does SIP do "call progress tones" or "ring back"? |
100 Message received
100 Looking up number
100 Found number, looking up carrier according to profile
100 Finding cheapest carrier which doesn't do animal testing
100 Found carrier "AT&T"
100 Dialing number
180 Ringing
182 Queued, 3 people in front of you
182 Queued, 2 people in front of you

The language of the status message should be determined based on the Accept-Language request header in the call. A 183 (Session Progress) status response will appear in RFC2543bis. It can be used for both progress tones and error messages. One would use the 183 only if you:
One can also use 183 if the gateway is able to determine that an error has occurred, but that there is a tone or announcement accompanying it (e.g., an ACM with a cause code present). In that case, the gateway can send a 183 to set up the media for the announcement (ideally with the announcement text as the text string), wait for a timer (on the order of 30 seconds), and then send an appropriate SIP error message. However, this should only be done if the caller is likely a human being, as sending 183 would otherwise only delay failure handling. Take a look at (now expired) draft-ietf-sip-183-00.txt for some details on using 183 responses for early media announcements. |
![]() ![]() Does SIP do admission control? |
Since this offers no real security (calls could always bypass a server), admission control is not supported by SIP. If an "outbound proxy" is used for outgoing calls, that proxy may control the firewall and thus restrict outgoing calls and resources used. |
![]() ![]() Does SIP administer bandwidth? |
No, that is the role of a resource reservation
protocol. There is no reason to assume that any Internet telephony
signaling server (such as a proxy) would know the available bandwidth in
real networks. Having such a central server would not scale. Administering
bandwidth separately for each application is also likely to be difficult
and inefficient.
There is a proposal for an SDP extension that allows SIP INVITE requests and responses to indicate that resource reservation must succeed before the callee is alerted. Several proposals can also be found at http://www.cs.columbia.edu/~hgs/sip/drafts_qos.html. |
![]() ![]() How does SIP support multipoint conferences? |
SIP has four ways to do multiparty conferencing:
1. Dialup conference bridge: call the bridge just like you call a person. The conference is identified by the request URI. Works with RFC 2543; no extensions needed.
2. Distributed multiparty conferencing: no server, fully distributed. This is what is described in the now-expired draft, and is work in progress (see http://www.softarmor.com/sipwg/teams/callcontrol/index.html for some recent drafts).
3. Multicast conferences: you can run your conference on multicast. Simply INVITE the person to join the multicast session. Works with baseline SIP. In fact, this was the initial purpose of SIP. In this case, there is not a full mesh of SIP signaling.
4. 3-way with local MC/MP function: A and B are talking. A wishes to add C to the call. A can simply call C, but also act as a mixer, so that the media it sends to B contains the A+C media, and the media to C is the A+B media. RTCP CSRC indicate who is in the conference. This also works with baseline SIP. It imposes additional burden on the UA, but otherwise provides this standard feature in a simple way.

There is no preferred way. It depends on the particular application. (Jonathan Rosenberg) |
![]() ![]() How does SIP support text chat and instant messaging? |
Text chat can be supported as part of SIP sessions
using the RTP text conversation payload format (RFC 2793). Typically, text
is sent one character at a time, although almost any number of characters
can be packed into a single RTP packet.
Instant messaging (IM) differs from text chat in that:
- messages are sent as a whole and can be in any format, such as HTML;
- messages are independent of a session setup;
- store-and-forward or translation to email are feasible.

SIP can support IM through the use of the proposed MESSAGE method.
|
![]() ![]() Does SIP support video or multimedia conferences? |
SIP was designed to support multimedia conferences,
including video conferences, from the very beginning. It can use any media
type, including one or more audio, video, shared application or text chat
streams. For multiple participants, SIP can be used for:
- dial-in-style conferences with a central "bridge" (MCU) or multicast;
- dial-out conferences with a central bridge or multicast;
- multicast conferences;
- full or partial mesh conferences, where end systems replicate media streams. |
![]() ![]() When do I use a proxy server vs. a redirect server? |
I believe the difference is fundamentally one of
control. You use a proxy when you want to control processing of the call from the point you receive it onward. A proxy can see the provisional and final responses to the request. It can record-route so that it sees what's going on during the call. A redirect server, however, hands off control to the device that sent it the INVITE. It will never see the final response to the request, and will not be contacted again for the remainder of the call. It's a one-shot deal. As such, redirect servers are really good for high-volume, lookup-style transactions. They can almost be considered a form of database query, albeit a SIP-specific one. Proxies are generally needed for services and more complex routing problems. 2000-Nov-14 10:50am mailto:jdrosen@dynamicsoft.com?subject=SIP FAQ |
![]() SIP Protocol Operation |
Contains FAQ and clarifications on protocol operation.
|
mailto:islepchin@dynamicsoft.com?subject=SIP FAQ, mailto:jdrosen@dynamicsoft.com?subject=SIP FAQ, mailto:hgs@cs.columbia.edu?subject=SIP FAQ |
![]() ![]() What does the [H14.17] in RFC 2543 stand for? |
This is explained in Section 3 of RFC 2543. It refers
to the section number in the HTTP/1.1 specification. |
![]() ![]() Do callers need to know the location of the Location Server? |
Also, callers don't register with the location server.
|
![]() ![]() What is the difference between a call leg and a call id? |
A call leg refers to the one-to-one signaling
relationship between two user agents (UAs). The Call-ID is an identifier,
carried in the SIP messages, that refers to the call. A call is a
collection of call legs. A UAC starts by sending an INVITE; because of
forking, it may receive multiple 200 OKs from different UAs. Each
corresponds to a different call leg within the same call. Call is thus a
grouping of call legs. In the call control spec, additional call legs are
created through special mechanisms.
Call legs refer to end-to-end connections between user agents, rather than any relationship with proxies. Within a call leg, there are numerous transactions in both directions. The request URI is not used in call leg identification. The To and From fields relate to local and remote in the following way. When Alice sends a request on a call leg to Bob, the From field contains the local address (Alice), and the To field the remote address (Bob). When a request is received by Bob, the To field is matched to Bob's local address, and the From field to the remote address (Alice). The CSeq spaces in the two directions of a call leg are independent.
Within a single direction, the sequence number is incremented for each
transaction. |
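The local/remote mapping described above can be sketched in a few lines of Python. This is only an illustration: the names `CallLeg` and `call_leg_for` are invented here, tags are ignored for simplicity, and addresses are compared as plain strings.

```python
from typing import NamedTuple


class CallLeg(NamedTuple):
    call_id: str
    local: str
    remote: str


def call_leg_for(call_id: str, from_addr: str, to_addr: str, outgoing: bool) -> CallLeg:
    """Map a request's Call-ID, From and To onto a call-leg key.

    For a request we send, From is our (local) address and To is the remote one.
    For a request we receive, To is our (local) address and From is the remote one.
    """
    if outgoing:
        return CallLeg(call_id, local=from_addr, remote=to_addr)
    return CallLeg(call_id, local=to_addr, remote=from_addr)
```

With this key, a request Alice sends to Bob and a later request Alice receives from Bob within the same call map onto the same call leg on Alice's side.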
![]() ![]() What is the difference between tag and branch-id? |
Branch IDs allow proxies to match responses to forked
requests. Without them, a proxy wouldn't be able to tell which branch a
response corresponds to. Tags, in To headers, are of no help here since
they are not known until responses arrive. Tags are used by the UAC to
distinguish multiple final responses from different UASs.
A UAS has no reliable way of determining if the request has been forked or not. Thus, to be safe it needs to add a tag. Proxies only insert tags into the final responses they generate themselves; they never insert tags into requests or responses they forward. Since a request can be forked several times on its way to UAS, a single
"tag" (or whatever you like to call it) added to the request by one of the
proxies is not sufficient for the next forking proxy along the chain to
match responses on its own branches; every proxy that forked the request
would need to add its own unique IDs to the branches it created. This is
precisely what's being achieved by the branch parameter in the Via header.
|
![]() ![]() How can one recognize a retransmitted, duplicate or looped request? |
| header | retransmitted | duplicate | matching response |
|---|---|---|---|
| From | same | same | same |
| To | same | same | same, but tag may have been added |
| Call-ID | same | same | same |
| request URI | same | same | n/a |
| CSeq | same | same | same |
| Via | - | - | must be local host; check for branch parameter to identify which branch |

Looped requests are recognized by one of the following:
|
![]() ![]() What is the relationship between the From, Contact, Via and Record-Route/Route headers? |
All these headers determine how requests and responses
are routed in a network of SIP proxy servers. Roughly, the distinction is:
- From: From: Alice sip:alice@example.org to Bob, an INVITE request from Bob to Alice would use sip:alice@example.org as the To header and Request-URI.
- Contact:
- Record-Route/Route:
- Via:

In short, requests should generally be sent to Route if present, to
Contact if there is no Route, and to From if there is no Contact. |
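The rule of thumb above (Route, then Contact, then From) could be written down as a small helper. This is a sketch only; the function name `next_hop` and its arguments are assumptions made for illustration, with header values passed in as already-extracted strings.

```python
from typing import Optional, Sequence


def next_hop(route: Sequence[str], contact: Optional[str], from_addr: str) -> str:
    """Pick where to send a request: Route first, then Contact, then From."""
    if route:
        return route[0]          # first entry of the Route list
    if contact:
        return contact
    return from_addr
```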
![]() ![]() How are URLs compared? |
Two SIP URLs are compared for equality according to
the following rules:
- the display name is ignored;
- tags must match;
- user, password, host, port and parameters of the URI must match. If a component is omitted, it matches based on its default value;
- string comparisons are case-sensitive except for the domain part;
- characters other than those in the "reserved" and "unsafe" sets (see RFC 2396) are equivalent to their "%" HEX HEX encoding;
- an IP address that is the result of a DNS lookup on a hostname does not match that hostname. |
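A minimal Python sketch of these comparison rules follows, operating on already-parsed URL components. The `SipUrl` class, the `same_url` function, and the default values used for omitted components (port 5060, transport "udp") are assumptions made for this example; no parsing is attempted, and only escapes for RFC 2396 "unreserved" characters are treated as equivalent to the literal character.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional

# RFC 2396 "unreserved" characters: alphanumerics plus the "mark" set.
_UNRESERVED = set(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.!~*'()"
)


def _normalize(text: str) -> str:
    """Decode %HH escapes only where they stand for an unreserved character."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        return char if char in _UNRESERVED else match.group(0)
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)


@dataclass
class SipUrl:
    host: str
    user: Optional[str] = None
    password: Optional[str] = None
    port: int = 5060                       # assumed default for an omitted port
    params: Dict[str, str] = field(default_factory=dict)
    tag: Optional[str] = None              # tag of the enclosing To/From header


def same_url(a: SipUrl, b: SipUrl) -> bool:
    """Compare two parsed URLs; the display name is never part of SipUrl."""
    def text(value):
        return None if value is None else _normalize(value)

    def params(url):
        merged = {"transport": "udp"}      # assumed default for an omitted parameter
        merged.update(url.params)
        return {k: _normalize(v) for k, v in merged.items()}

    return (
        a.tag == b.tag
        and text(a.user) == text(b.user)           # case-sensitive
        and text(a.password) == text(b.password)
        and a.host.lower() == b.host.lower()       # only the domain ignores case
        and a.port == b.port
        and params(a) == params(b)
    )
```

Because hosts are compared as strings, an IP address never matches a host name here, in line with the last rule.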
This is TBD. |
![]() ![]() What's the difference between the request URIs tel:+12125551212 and sip:12125551212@gw.com? |
Non-SIP URLs, such as tel:+12125551212 for a telephone
number, may be used as request URIs in SIP INVITE requests. This only
makes sense if all outbound calls are handled by a proxy server. In the
case of a tel: URL, the proxy server would then translate the request URL
to a SIP URL of a gateway server, if it is not handling the gateway duty
itself. The proxy server might use the Telephony Routing over IP protocol
(TRIP) to find the appropriate next-hop SIP server. The To header may
always be a tel: URL even if the Request-URI is a SIP URL, although that
breaks with the common practice that Request-URI and To start out the
same. |
![]() ![]() Do I always need a proxy or redirect server? |
No, two SIP user agents can contact each other
directly. |
![]() ![]() How does a caller find its local registrar? |
The local registrar is either manually configured or,
more likely, the SIP client issues a multicast registration request to the
sip.mcast.net standard multicast address, which all registrars listen to.
Another possibility is to use SLP through an extension defined in http://www.cs.columbia.edu/~hgs/sip/drafts/draft-kempf-sip-findsrv-00.txt.
|
![]() ![]() How do I ensure registrar reliability? |
There are several techniques that can be used to
minimize the impact of registrar/proxy server failures for a server in a
local area network:
- Run several servers that all respond to the same multicast registration address ("warm standby"). As long as multicast requests are mostly reliable, this ensures a consistent registration picture.
- If a registration server is rebooted and does not have complete knowledge of the local UA population, it could multicast any incoming INVITE requests.
- Use some of the available hardware redundancy solutions.

For servers separated from their clients by a wide-area network, use of
multicast is not appropriate, so that these servers have to rely on
traditional backup techniques to achieve reliability. For example, the
designated registrar could multicast registration updates within its local
network to keep standby servers synchronized. |
![]() ![]() Are ACK requests retransmitted? |
No. An ACK is sent when a response retransmission is
received. Reliability is achieved because the response is retransmitted
until an ACK arrives, and the ACK is retransmitted on response
retransmissions. ACK is only used for INVITE. |
![]() ![]() How are BYE requests routed? |
Since a Contact header MUST be present in INVITE and
200, the BYE will go directly to the user agent if there is no
Record-Route header. If there is a Record-Route, it will traverse the list
of proxies indicated there.
If the caller decides to send a BYE before receiving a 200 from the
callee, the BYE is handled by the proxies just as the corresponding
INVITE was handled, i.e., it may be forked. |
![]() ![]() Can I CANCEL requests other than the first INVITE? |
Yes, any request can be cancelled before it has been
executed by the UAS. However, it is likely that this will only make sense
in practice for the initial INVITE and subsequent re-INVITEs. In the
latter case, the call remains; only the requested changes are cancelled.
|
![]() ![]() How does a caller find its proxy server? |
Calls typically proceed directly to the callee's
domain. For example, when calling alice@example.com, the INVITE request
would be sent to the SIP server for the domain example.com, found via DNS.
If a "local" (outbound) proxy is needed for outgoing calls, it
currently needs to be manually configured, similar to the configuration of
web proxies in browsers. Another possibility is to use DHCP or SLP through
one of the extensions listed in http://www.cs.columbia.edu/~hgs/sip/drafts_dhcp.html.
|
![]() ![]() What's the difference between a stateless and a stateful proxy server? |
Stateless proxies forget about the SIP request once it
has been forwarded. Stateful proxies remember the request after it has
been forwarded, so they can associate the response with some internal
state. In other words, stateful proxies maintain transaction state.
Stateful implies transaction state, not call state.
Stateless proxies scale very well, and can be very fast. They are good for network cores. Stateful proxies can do more (they can fork, for example; see the next question) and can provide services stateless ones can't (call forward busy, for example). They don't scale as well as stateless ones. An administrator gets to decide which to use. These are also logical entities; a physical proxy is likely to act as a stateless proxy for some calls, as a stateful proxy for others, and as a redirect server for still others. Neither stateful nor stateless proxies need to maintain call state, although they can; if they do, they will need to make sure that they are part of subsequent transactions via the Record-Route header. A proxy must be stateful if one of the following conditions holds:
1. It uses TCP.
2. It uses multicast.
3. It forks. |
![]() ![]() Why can a forking SIP proxy not be stateless? |
A forking SIP proxy cannot be stateless because it
needs to perform a filtering operation, returning (in general) one
response out of the many it receives. For example, a forking proxy with
three branches, that receives a 200-class, 400-class, and 500-class
response on each branch respectively, should return only the 200-class
response upstream. If the proxy were stateless, it would end up returning
all three of the responses upstream (since it won't remember that it had
received prior responses when it gets another one). The result of this is
(1) response implosion at the client, and (2) inconsistent responses at
the client. (In this example, depending on the order the responses
received, the client might think that the call failed, just to get a
success indication some time later.) Thus, a forking proxy must be
stateful.
Also note that a proxy that uses TCP must be stateful as well, whether it forks or not. This has to do with reliability issues. Why do you want state in a proxy? Certain services (like forking)
simply require it. A sequential search proxy requires state; sequential
search is the heart of services like follow-me and personal mobility. It's
at the discretion of the implementor whether to use a stateful or
stateless proxy. You can even be "super stateful", and use the
Record-Route header to allow a proxy to be on the signaling path of all
subsequent exchanges. This allows a stateful proxy to maintain call state
in addition to transaction state. |
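To make the filtering step concrete, here is a simplified Python sketch of the transaction state a forking proxy might keep. The class name `ForkState` is invented for this example, and the tie-breaking rule used when no branch succeeds (forward the lowest status code) is an assumption of the sketch, not something stated above; the sketch also ignores provisional responses, CANCEL, and multiple success responses.

```python
from typing import Dict, Optional


class ForkState:
    """Transaction state a forking proxy keeps for one forwarded request."""

    def __init__(self, branches):
        # None means "no final response yet" on that branch.
        self.final: Dict[str, Optional[int]] = {b: None for b in branches}
        self.forwarded = False

    def on_final_response(self, branch: str, status: int) -> Optional[int]:
        """Record a final response; return the status to send upstream, if any."""
        self.final[branch] = status
        if self.forwarded:
            return None                      # the client already has its answer
        if 200 <= status < 300:
            self.forwarded = True
            return status                    # success is passed on at once
        if all(s is not None for s in self.final.values()):
            self.forwarded = True
            # Assumption for this sketch: with no success, pick the lowest code.
            return min(self.final.values())
        return None
```

A stateless proxy has no equivalent of `self.final` or `self.forwarded`, which is why it would pass every branch's response upstream.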
![]() ![]() How does a caller find the remote SIP client of the callee? |
The process is similar to the delivery of email: The
caller uses the SIP host name to look up the destination host, first
trying a SRV record and then "regular" DNS, just like an email client
(MTA) looks up the MX record. (SRV records are generalized MX records
applicable to any network service, including, but not limited to, SIP and
RTSP.) For example, when contacting sip:bell@cs.columbia.edu, the client finds a SRV record pointing to erlang.cs.columbia.edu as the SIP server for the domain cs.columbia.edu. As for email, a single domain name can resolve to multiple servers, allowing load sharing and redundancy. The server located in this manner can then proxy or forward the call to
another server. |
![]() ![]() How does SIP get through a firewall? |
There are several possible approaches to SIP-capable
firewalls. One of the difficulties is that, unlike for, say, HTTP,
connections are originated both by hosts inside and outside the firewall.
A likely arrangement is that a SIP proxy sits "on" the firewall and relays
SIP requests between the Internet and the intranet. This proxy would also
open up the necessary ports in the firewall to let audio and video flow
through, for example using Socks V5.
As an alternative, if a firewall or NAT allows outgoing TCP connections, the inside client can open up a TCP connection to an outside proxy. All outgoing and incoming calls would then be handled by that TCP connection. (The client would still have to use SOCKS or a similar mechanism to convince the firewall to let RTP packets through.) Take a look at the two drafts at http://www.cs.columbia.edu/~hgs/sip/drafts_firewall.html
for a more detailed discussion of getting SIP through firewalls and
NATs. |
![]() ![]() Does SIP do keep-alive? |
SIP itself does not have a keep-alive mechanism during
the call. It was felt that loss of connectivity would be detected rapidly
by the absence of media packets, typically sent at a much higher rate than
any signaling keep-alive messages could be sent. In addition, the
signaling path is not needed during the conversation and may well be
completely different (due to proxy and redirect servers) than the media
path, so that keep-alives have a limited functionality. If it is desired
to test the liveness of a signaling server, it is always possible to send
either OPTIONS or (re)INVITE messages.
However, knowing the call state might be useful for certain
applications (e.g., when billing is involved, when firewall permissions
need to be set etc.). A session timer extension has been defined to solve
this. The draft can be found at http://www.cs.columbia.edu/~hgs/sip/drafts/draft-ietf-sip-session-timer-01.txt
and it basically allows servers to indicate a desired refresh interval.
The call is considered terminated if a re-INVITE is not received within
that interval. |
![]() ![]() Why does SIP not have a Content-Transfer-Encoding header? |
The Content-Transfer-Encoding header was primarily
meant to allow message bodies to be transformed into formats that could be
transferred on channels that were not 8 bit clean. HTTP, which makes use
of many of the MIME headers, is 8 bit clean, and thus did not need
Content-Transfer-Encoding. SIP followed suit, and so does not use it
either. Content-Encoding is used for things like compression, which is
different. (J. Rosenberg)
See also RFC 2616 (HTTP/1.1), Section 19.4.5. |
![]() ![]() I want SIP to be more compact. What can I do? |
First, one should realize that in general, SIP
exchanges are only going to be a tiny fraction of the overall session
bandwidth. A typical SIP call setup takes less than 1000 bytes, or the
equivalent of one second of highly compressed (G.729) audio. Some
additional space savings can be realized by using short headers. (A
realistic example for an audio call setup takes a total of about 640
bytes, of which about 69 bytes are SIP headers.)
In general, more substantive savings are possible by using either
payload compression (RFC 2393) or link-layer compression, e.g., at the PPP
layer. For the example above, the total size is reduced to about 520 bytes
with gzip compression. |
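As an illustration of the short-header saving mentioned above, the Python sketch below rewrites header names to their single-letter compact forms. The mapping reflects the compact forms defined in RFC 2543 as best recalled here and should be checked against the spec; the function name `compact_headers` and the sample request are made up. CSeq is left alone, since it has no compact form.

```python
# Compact header names (assumed from RFC 2543; CSeq has none).
COMPACT = {
    "call-id": "i",
    "contact": "m",
    "content-encoding": "e",
    "content-length": "l",
    "content-type": "c",
    "from": "f",
    "subject": "s",
    "to": "t",
    "via": "v",
}


def compact_headers(head: str) -> str:
    """Rewrite the header lines of a SIP message head using compact names."""
    lines = head.split("\r\n")
    out = [lines[0]]                         # request or status line, unchanged
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        short = COMPACT.get(name.strip().lower())
        # Lines starting with whitespace are header continuations; leave them.
        if sep and short and not line[:1].isspace():
            out.append(f"{short}:{value}")
        else:
            out.append(line)
    return "\r\n".join(out)


# Hypothetical request head, for illustration only.
head = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP host.example.org\r\n"
    "From: sip:alice@example.org\r\n"
    "To: sip:bob@example.com\r\n"
    "Call-ID: 1@host.example.org\r\n"
    "CSeq: 1 INVITE"
)
short = compact_headers(head)
```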
![]() ![]() What are the different addresses in SIP? |
SIP INVITE requests involve three addresses:
1. The host address where the request came from. Responses are sent back to the same host address, regardless of what the From header indicates. Note that different requests for the same call can come from different hosts.
2. The From address contains the logical source of the request. It remains unmodified as a SIP request traverses proxies, for example. The From address may not be the same as the host address that generated the SIP request, although that's the typical case.
3. The session description (e.g., SDP) contains one or more addresses
where the caller expects media data (audio, video) to be sent. For some
services, this address may not be the same as the From
address. |
![]() ![]() The BNF for header <put your favorite header here> allows a parameter to appear more than once. What does this mean? |
ABNF is not sufficient to express all the rules about
what constitutes a legal or illegal message. Rules about repetition, for
example, are difficult to express in ABNF. You must also look at the text
that defines a header to obtain a complete picture about proper message
syntax and semantics. |
![]() ![]() Can the request URI include a port number and/or transport parameter? |
It can have a port number. But let me explain when
this is needed and when it's not. Let's say I send a request to joe@example.com, and the server for example.com is listening on 5061. The request URI might look like:
INVITE sip:joe@example.com:5061 SIP/2.0
This arrives at example.com. Since the request is for that server, it looks up joe in some database and translates the request URI (for example, to sip:joe@engineering.example.com). It looks up engineering.example.com in DNS, and finds an A record for that machine, forwarding the request to the given IP address. The outgoing request URI looks like:
INVITE sip:joe@engineering.example.com SIP/2.0
Note that in this case, the presence of port 5061 in the request URI sent to example.com didn't make a difference. That's because example.com just translated the request URI. Whether it had contained the port number or not would have had no effect on processing. However, had the request been sent to a local outbound proxy instead of example.com, the port number would NEED to be there. That's because the local outbound proxy won't translate the request URI; it will examine it, determine it's not for itself, look up the domain in the request URI in DNS, and forward the request there. So, the request URI needs to contain this port so that the local outbound proxy knows to forward it to 5061 as opposed to 5060 at example.com. So, the rule of thumb is this: if you send a request to the server listed in the domain of the request URI, URI parameters like port, transport, ttl etc. MAY be present but are not needed. If you send a request to a server which is NOT the one identified in the request URI, you MUST include these parameters if they are not the defaults. Always including them, when not default, means you don't need to determine which is the case, and is always the safest bet. |
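The "always include non-default parameters" advice could look like the following Python sketch. The helper name `request_uri` is hypothetical, and the defaults it assumes (port 5060, UDP transport) are stated here as assumptions of the example; only port and transport are handled.

```python
def request_uri(user: str, host: str, port: int = 5060, transport: str = "udp") -> str:
    """Build a SIP request URI, spelling out only non-default port and transport."""
    uri = f"sip:{user}@{host}"
    if port != 5060:                      # assumed default port
        uri += f":{port}"
    if transport.lower() != "udp":        # assumed default transport
        uri += f";transport={transport.lower()}"
    return uri
```

A client that builds its request URIs this way does not need to know whether the next hop is the target server or a local outbound proxy.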
![]() ![]() Transport in Via |
If a proxy sends a request by UDP (TCP), the spec
currently does not disallow placing TCP (UDP) in the transport parameter
of the Via field, which it should.
If such a request is received, it should be responded to with a 400 Bad
Request. The protocol used for the request should also be used for the
response. |
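A receiver-side check for this mismatch might look like the following Python sketch. The function name is hypothetical, and the parsing assumes the top Via value is written without whitespace inside the protocol part (for example "SIP/2.0/UDP host:5060").

```python
def check_via_transport(top_via, received_over):
    """Compare the transport named in the top Via with the transport the
    request actually arrived on.

    Returns (400, "Bad Request") on a mismatch, or None if they agree.
    Assumes top_via looks like "SIP/2.0/UDP host:port".
    """
    sent_protocol = top_via.split()[0]             # e.g. "SIP/2.0/UDP"
    via_transport = sent_protocol.split("/")[-1]   # e.g. "UDP"
    if via_transport.upper() != received_over.upper():
        return 400, "Bad Request"
    return None


# A request that arrived over UDP but claims TCP in its Via:
print(check_via_transport("SIP/2.0/TCP proxy.example.com:5060", "UDP"))
```

The 400 response itself would then be sent back over the protocol on which the request was received.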
![]() ![]() What should I do if my re-INVITE fails? |
Here is what you should _not_ do: you should _not_
terminate the call. re-INVITE failure means that the request to change the
media was declined and you should keep using the old media codecs you
negotiated previously. |
![]() ![]() How long can SIP host names be? |
DNS (RFC 1035, Section 3.1) limits labels (each
component of a host name) to 63 characters. The total length of a domain
name (i.e., label octets and label length octets) is restricted to 255
octets or less. http://www.networksolutions.com/help/long-domains.html,
however, claims that host names can be up to 80 characters long.
|
Note, however, that SIP implementations MUST be
prepared to handle host names of any length, subject to any maximum
message size restrictions that are part of local policy. |
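As an illustration of the DNS limits quoted above, here is a small Python sketch that checks whether a host name could be looked up in DNS at all. The function name is hypothetical. It is not meant as a reason for a SIP parser to reject a message, since implementations must still be prepared to handle longer names.

```python
def within_dns_limits(host):
    """Return True if host fits the RFC 1035 limits: each label at most
    63 octets, and the whole name at most 255 octets when counted as
    label octets plus label length octets."""
    labels = host.rstrip(".").split(".")
    if any(len(label) == 0 or len(label) > 63 for label in labels):
        return False
    # One length octet per label, plus one octet for the empty root label.
    wire_length = sum(len(label) + 1 for label in labels) + 1
    return wire_length <= 255


print(within_dns_limits("engineering.example.com"))
print(within_dns_limits("a" * 64 + ".example.com"))
```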
![]() ![]() Can a User Agent also act as a Registrar? |
Being a UA is a logical role, as is a registrar. If an
entity accepts REGISTERs and stores location information, it's a registrar.
So, you can write a piece of code that is both a UA and a registrar if you
want. (Jonathan Rosenberg) |
![]() ![]() Can I remove an m= line from SDP in response or re-INVITE? |
No. Once an "m=" line has made it into the SDP of a request or
response, it cannot be removed until the call is terminated. The only way
to decline a media session is by setting its port number to 0. The only
way to offer a new media session is by adding it to the end of the list.

The reason for this is that we need to ensure that it is always possible to match media sessions (i.e., "m=" lines) in requests and responses. Consider an INVITE with the following SDP:

    ...
    c=IN IP4 1.2.3.4
    m=audio 54678 RTP/AVP 0 1 3
    m=video 7346 RTP/AVP 28 31  (face)
    m=video 7880 RTP/AVP 26 28  (presentation)

If the response contained something like

    ...
    c=IN IP4 3.4.5.6
    m=audio 6540 RTP/AVP 0 1
    m=video 6578 RTP/AVP 28

the caller would not be able to tell which of the two offered video
streams was accepted. |
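A minimal Python sketch of the matching rule: the answer keeps one "m=" line per offered line, in the same position, and declines a stream by setting its port to 0. The function name and the shape of its arguments are assumptions made for illustration.

```python
def answer_media_lines(offered, accepted):
    """Build the m= lines of an answer.

    offered:  list of m= lines from the offer, in order.
    accepted: dict mapping an offer index to the m= line to send back
              for that stream. Any index not present is declined.
    """
    answer = []
    for index, line in enumerate(offered):
        if index in accepted:
            answer.append(accepted[index])
        else:
            # "m=<media> <port> <proto> <formats...>": set the port to 0.
            fields = line.split()
            fields[1] = "0"
            answer.append(" ".join(fields))
    return answer


offer = [
    "m=audio 54678 RTP/AVP 0 1 3",
    "m=video 7346 RTP/AVP 28 31",
    "m=video 7880 RTP/AVP 26 28",
]
# Accept the audio stream and the first video stream only.
for line in answer_media_lines(offer, {0: "m=audio 6540 RTP/AVP 0 1",
                                       1: "m=video 6578 RTP/AVP 28"}):
    print(line)
```

Because the declined stream still appears as the third line with port 0, the caller can tell which of the two video streams was accepted.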
![]() ![]() I'm a UAC. I sent an INVITE, and then decide I want to hang up before getting a final response. Do I send BYE or CANCEL? |
If the caller wants to hang up a call, but hasn't yet
received a final response, it can send a CANCEL or a BYE. Sending a BYE
would seem easiest, but there are issues. First off, you won't have gotten
a tag yet from the UAS, nor will you have received Record-Route or Contact
headers (obtained in the 200 OK response). This means the BYE will be
routed "afresh" by proxies. It's possible that routing logic may have
changed (perhaps there was some time of day routing or randomized
routing), in which case the BYE may reach a different set of participants
than reached by the original INVITE. So, if the original INVITE forked,
and reached A and B, and the BYE reaches B and C, B will send a 200 OK,
and C a 481. The forking proxy forwards the 200 OK upstream, and the
caller gets the 200 OK. However, A is still ringing, and might later send
a 200 OK. This yields inconsistent call state, which will persist until the
UAS times out, as it will never get an ACK.

Sending CANCEL helps solve many of these problems. CANCEL will reach the same set of recipients as the original INVITE, and it doesn't need a Record-Route or tag in it. The drawback, however, is that the CANCEL and a 200 OK from one of the UASs might pass on the wire. Thus, the UAC may still need to ACK the 200 OK, and then send BYE. The other drawback is that you wouldn't send CANCEL if the call was already established; you'd send BYE. Folks complained they didn't want to have state-dependent mechanisms for hanging up. Given the unlikelihood of the problems with sending BYE, it seems reasonable to allow it. |
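The state-dependent variant (CANCEL before a final response, BYE afterwards) can be sketched as below. This is only one of the two approaches discussed above, and the function and parameter names are hypothetical.

```python
def hangup_requests(call_established, ok_crossed_cancel=False):
    """Return the requests a UAC sends, in order, to hang up.

    call_established:  a 200 OK was already received and ACKed.
    ok_crossed_cancel: a 200 OK arrived after the CANCEL was sent, because
                       the two passed each other on the wire.
    """
    if call_established:
        return ["BYE"]
    if ok_crossed_cancel:
        # The call was answered anyway: complete it, then tear it down.
        return ["CANCEL", "ACK", "BYE"]
    return ["CANCEL"]


print(hangup_requests(call_established=False))
print(hangup_requests(call_established=False, ok_crossed_cancel=True))
print(hangup_requests(call_established=True))
```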
![]() ![]() I'm a proxy, and I forked a request, and forwarded multiple 200 OK upstream. Now, I get an ACK. What do I do with it? |
Normally, you route it using the Route headers, which should be present
in the ACK. In the bis draft, the final 200 OK response MUST contain a
Contact header. This means that either (1) the proxy record-routes, in
which case the ACKs will each contain (different) route headers which tell
the proxy where to send the request, or (2) the proxy doesn't
record-route, in which case it gets sent directly to the UAS, since there
was a contact. That aside, should it arrive anyway, the ACK should be routed just as any other new request. Apply routing logic, which presumably causes it to be forked to both locations. The tags will help identify for which UAS the ACK is meant. |
![]() ![]() If I get a new SDP body in the ACK, and I don't like the media type, how can I indicate it's unacceptable to me? |
It doesn't work that way. You should not get a "new"
SDP in the ACK. SDP goes in the ACK only if there was no SDP in the INVITE and
SDP was included in the 200 OK instead. This SDP should represent a subset of the
media "offered" in the 200 OK. In other words, a normal SIP transaction
has one SDP in the INVITE, and another in 200 OK. Now, we have the same
two phase process, but the first SDP is in the 200 OK, and the second in
the ACK. |
![]() ![]() Can a SIP UA register with multiple registrars? |
Yes. A SIP UA will register with registrars of any
domain where it maintains an identity. For example, a UA belonging to
Alice, with identities alice@big-company.com, alice@myportal.com and
alice@home-isp.net will register with three different registrars.
|
![]() ![]() Is it possible for a UA to make a call to itself, and have the result be two separate calls on the same machine? |
If you have a table of transactions by transaction
keys (To, From, Call-ID, CSeq), you should be sure to have the direction
of the request (sent or received) as part of the key, or use separate
transaction tables for incoming and outgoing requests.

If you have a table of calls by Call-ID only, you will run into problems, as you may think this is a re-INVITE for the same call. You really want the incoming and outgoing messages to be associated with totally different calls. It can be distinguished as a separate call if, again, you allow for direction in your lookups. There are many ways to do this; the easiest is to include your local address as part of the call key. For incoming requests, the key is callID + from; for outgoing, callID + to. |
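One way to allow for direction in lookups is to make it part of the table key. The Python sketch below uses a (Call-ID, direction) pair as the key; the names and the record layout are hypothetical, and this is just one of the "many ways" mentioned above.

```python
calls = {}  # call table keyed by (call_id, direction)


def find_or_create_call(call_id, direction):
    """Look up a call record, treating the sent and received sides of the
    same Call-ID as two separate calls."""
    if direction not in ("incoming", "outgoing"):
        raise ValueError("direction must be 'incoming' or 'outgoing'")
    key = (call_id, direction)
    return calls.setdefault(key, {"call_id": call_id,
                                  "direction": direction,
                                  "messages": []})


# A UA calling itself sees the same Call-ID twice, once per direction.
sent = find_or_create_call("12345@host.example.com", "outgoing")
received = find_or_create_call("12345@host.example.com", "incoming")
print(sent is received)   # False: two separate call records
print(len(calls))         # 2
```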
![]() ![]() Does a UAS use the request-URI or To field to determine if a call is for it? |
It uses the request URI. A UAS should be prepared to
receive calls with the request URI set to values that it has registered
(and placed in the Contact header of REGISTER). It should also be prepared
to receive calls with the request URI set to the value it placed in the To
field of the REGISTER. It's not likely to see such a request URI, unless
it's receiving a direct client to client call. |
![]() ![]() How are SIP parsers implemented? |
Parsers can be implemented either directly from the
ABNF or via parser-generator tools. Some tools that have been used
include:
- flex (http://www.gnu.org/software/flex/flex.html) for lexical analysis
- bison (http://www.gnu.org/software/bison/bison.html) or lemon (http://www.hwaci.com/sw/lemon/) for parsing

A somewhat outdated grammar summary can be found at http://www.cs.columbia.edu/~hgs/sip/SIPgrammar.html |
![]() ![]() Is it possible to use Hide with Record-Route? |
No, only Via can be hidden. Hiding a Record-Route
header in the same manner is impossible because it would need to be
decrypted by the upstream proxy for subsequent requests from the callee to
the caller; however, the secret key would only be known to the server that
encrypted the header. |
![]() ![]() How does a proxy handle a method other than the standard INVITE, ACK, BYE, etc.? |
See SIP RFC, section 4.2. Proxies always forward
requests, regardless of method. A UAS returns 501 (Not Implemented) if it
does not support a particular request method. |
![]() ![]() Why does a proxy server doing TCP need to be stateful? |
The situation comes up in a proxy which is TCP on the
upstream side and UDP on the downstream side:

           TCP            UDP
    UA1 -------- Proxy --------- UA2

Let's assume that this proxy is stateless, meaning that it holds the TCP
connection state but otherwise does not store transaction state. According
to the specification, requests are not retransmitted over TCP. So, UA1
sends its request, say an INVITE, just once over the TCP connection. The
proxy receives this, and forwards it to UA2. Suppose it is lost on the UDP
hop. UA1 will not retransmit (since it's TCP), and neither will the proxy,
since it's stateless; thus, the message is lost. |
![]() ![]() In computing the Content-Length, does the newline in a body count as one byte or two (CR vs. CR-LF)? |
The Content-Length is always the number of BYTES in
the body. If the body is text and has newlines, a CR-LF would be counted
as two bytes, while just a CR is one byte. |
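In Python terms, the point is to count bytes after encoding, not characters or lines. The helper name below is hypothetical, and UTF-8 is an assumed encoding for text bodies in this sketch.

```python
def content_length(body):
    """Return the Content-Length for a message body: the number of bytes,
    counting every CR and LF individually."""
    if isinstance(body, str):
        body = body.encode("utf-8")  # assumed encoding for this sketch
    return len(body)


print(content_length("v=0\r\n"))  # 5: three characters plus CR and LF
print(content_length("v=0\r"))    # 4: three characters plus CR
```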
![]() ![]() Can a proxy fork a non-INVITE request? If yes, what happens if it gets multiple responses? |
Yes, a proxy can fork a non-INVITE request. However,
it must forward only a single response upstream, 200 or otherwise. Thus,
only a single 200 is ever forwarded upstream. This is in contrast to
INVITE, where all 200's received are forwarded upstream. Why is that? The
reliability mechanism of non-INVITE requests dictates that. Response
retransmissions are triggered on request retransmissions. Thus, the client
retransmits its request until it gets *a* response. So, upon receiving the
first final response, the client would cease retransmitting the request,
and then there would be no way to reliably send the other final
responses. As a result of this, forking of non-INVITE requests is only useful when the method has semantics that meet certain criteria. Specifically, (1) the client doesn't care which server gets the request, (2) the client doesn't care which server sent the response, or even if multiple servers sent a response, (3) the service provided by each server is identical. In essence, forking of non-INVITE requests is useful only for an anycast type of service. |
![]() ![]() Should responses be sent to the host specified in Via? Is From ever used for sending responses? |
The rules for forwarding responses are explained in
section 6.47.5 of 2543bis.
Note that if you are a UAS, you are the entity responsible for adding
the "received" parameter to the top Via, so you may want to interpret the last
two rules in the section as requiring you to send the response to the source
address of the request. |
![]() ![]() Once a SIP registrar gets a REGISTER request, how does it update the Location Server with the contact information? |
This depends on whether the location server is
co-located with the SIP registrar or not. If it is, a functional interface
(API) suffices to save the location information and any registration
payload. If it is not, any protocol or middleware like CORBA, LDAP, RPC,
or a database access protocol (e.g., SQL over TCP), can be used. The
details are implementation-dependent and outside the scope of the
protocol. (Vijay Gurbani) |
![]() ![]() Is a SIP URI without a user name valid? |
SIP request URIs such as sip:phone123.example.com are
valid if the device being addressed does not have a notion of a 'user'.
Generally, it would not be sent to a proxy, but a proxy may translate, for
example, sip:joe@example.com into the request URI
sip:phone123.example.com, if that phone has registered with that Contact
address. |
![]() ![]() Is there a specific order for header fields? |
Header fields can appear in any order, except within a
header field type (a list of headers separated by a comma or several
fields with the same name). For example, Route, Record-Route and Via need
to be kept in order. Responses can re-order header fields found in the
request.

HTTP/1.1 says: "The order in which header fields with differing field names are received is not significant." "The order in which header fields with the same field-name are received is therefore significant to the interpretation of the combined field value, and thus a proxy MUST NOT change the order of these field values when a message is forwarded."

If a set of fields is authenticated, proxies must not re-order or otherwise modify these fields, as this would break the authentication. |
![]() ![]() When is a CANCEL used? |
- A proxy has forked an INVITE request and receives a 200 or 600 response on one of the branches; the proxy CANCELs the unanswered branches;
- The time described in the Expires header of the request has elapsed;
- No response, including provisional, was ever received from downstream nodes;
- Internal logic determines it's time to end the transaction (a CPL or sip-cgi script, for example). |
![]() ![]() What do I need to do to use SRV records? |
Any BIND implementation after 4.9.5 supports SRV
records, as does Windows 2000.

    _sip._tcp  SRV 0 0 5060 sip-server.cs.columbia.edu.
               SRV 1 0 5060 backup.ip-provider.net.
    _sip._udp  SRV 0 0 5060 sip-server.cs.columbia.edu.
               SRV 1 0 5060 backup.ip-provider.net.

DNS SRV records are supported by BIND 4.9.6 and newer, generally installed as named. Configuring named for Linux is discussed in a HOWTO at http://www.linuxdoc.org/HOWTO/DNS-HOWTO.html

Currently registered SRV records:

    sip.tcp.cs.columbia.edu SRV 0 0 5060 erlang.cs.columbia.edu
    sip.udp.sip-happens.com SRV 0 0 5060 sip.sip-happens.com

Use, for example,

    host -v -t srv sip.tcp.cs.columbia.edu
    host -v -t srv sip.udp.cs.columbia.edu
    host -v -t srv _sip._udp.cs.columbia.edu |
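On the client side, the records returned for a name like the ones above have to be put in order before use: the first number in each record is the priority, and lower values are tried first. The Python sketch below assumes the records have already been fetched and parsed into tuples by some DNS library; it deliberately ignores the weight field, which is used to choose among records of equal priority.

```python
def order_srv_targets(records):
    """Sort SRV records so that the lowest priority value comes first.

    records: iterable of (priority, weight, port, target) tuples.
    Weight-based selection among equal priorities is not implemented here.
    """
    return sorted(records, key=lambda record: record[0])


records = [
    (1, 0, 5060, "backup.ip-provider.net."),
    (0, 0, 5060, "sip-server.cs.columbia.edu."),
]
for priority, weight, port, target in order_srv_targets(records):
    print(priority, target, port)
```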
![]() Relationship to Other Protocols |
Answers commonly asked questions about relationship
between SIP and other protocols. |
Subcategories:
Answers in this category: |
mailto:islepchin@dynamicsoft.com?subject=SIP
FAQ, mailto:jdrosen@dynamicsoft.com?subject=SIP
FAQ, mailto:hgs@cs.columbia.edu?subject=SIP
FAQ |
![]() ![]() What is the relationship between MGCP and SIP? |
The details of combining the two in a system are still
being fleshed out. MGCP is a device control protocol, where a slave
(gateway (MG)) is controlled by a master (media gateway controller (MGC),
call agent). SIP may be used between controllers, in a peer-to-peer
relationship. Note that to the SIP side, the MGC looks like a node with a
large number of connections, but otherwise the same as a "native" SIP
device. Similarly, the MG is completely unaware that the call between MGCs
is established via SIP. Only the MGC needs to understand both protocols.
Additional details provided by Tom Taylor: The basic architecture assumed by the Megaco Working Group postulates two functional entities: a Media Gateway Controller (MGC), which owns the call model and is responsible for call signalling, and a Media Gateway (MG), responsible for manipulating (directing, transforming) media flows under the control of the MGC. MGCP and Megaco/H.GCP are both protocols used between the MGC and MG when they are realized in separate physical elements.

MGCP (Media Gateway Control Protocol) was a major source of the ideas in the current Megaco/H.GCP protocol draft, and is being deployed in a number of products being announced over the next few months. It is best suited for IP telephony gateway applications. The Megaco protocol is also called H.GCP because it is being developed cooperatively between the Megaco WG and ITU-T Study Group 16.

H.323 is a complete system specification, including call signalling
protocols which would run between an MGC and another MGC or other H.323
entities (Gatekeepers, endpoints). SIP can also be used as a call
signalling protocol, and can therefore be viewed as a competitor to H.323.
Both protocols are capable of supporting multipoint multimedia
conferences. H.323 was first standardized in 1996 and has been improved
since then; current standardization is focusing on networking aspects such
as translations data exchange and interworking with legacy telephony
signalling. SIP just reached Proposed Standard status, but has attracted
wide interest which may speed its maturing stages. The Megaco/H.GCP
protocol will complement both protocols by also providing support for
multipoint, multimedia calls at the media level. |
![]() ![]() What is SIP+ and how does it relate to SIP? |
SIP+ was a proposal by Level3 on how to extend SIP to
interconnect two MGCs. This functionality is now being provided by various
orthogonal SIP extensions, including the carriage of multipart MIME types,
the INFO method and others. These are being documented in a BCP draft. The
name SIP+ is obsolete and should not be used, to avoid confusion.
|
![]() ![]() How does SIP compare to H.323? |
See SIP vs. H.323 comparison at http://www.cs.columbia.edu/~hgs/sip/h323-comparison.html. |
The brief answer: The main advantage of SIP is its
full integration with other Internet protocols and functions, such as
email, web and instant messaging. For example, it is very easy to
"forward" calls to web pages or email; web pages can be included in
responses to call attempts. SIP is also codec-neutral and has been used to
set up anything from audio and video to Doom distributed games and screen
sharing. There are a number of open-standard programming interfaces, SIP
servlets, sip-cgi and CPL, that are particularly suited to SIP-based
devices. These programming interfaces make creating SIP-based services
very similar to writing web scripts or web pages. It also supports a number of services, such as ACD and follow-me, that are much harder to implement in H.323. |
![]() ![]() Can H.323 and SIP be used together? |
Yes. SIP can locate the called party and determine its
capabilities, including H.323. H.323 is then used to connect the two
parties.
Unfortunately, there is currently no standard on translating between the two. Conversion is made more difficult by the multiple versions of H.323 (v1, v2, v3). However, there is at least one product (Lucent PacketStar IP) that allows SIP and H.323 terminals to call each other.

There is an ongoing effort within the SIP Working Group to define a SIP-H.323
interoperation standard. Some details on the effort are available at http://www.softarmor.com/sipwg/teams/sip-h323/index.html.
The group has produced an Internet draft that can be found at http://www.cs.columbia.edu/~hgs/sip/drafts/draft-singh-sip-h323-00.txt.
|
![]() ![]() How do I interconnect Q.931 (ISDN signaling) and SIP? |
A gateway that initiates an ISDN call based on a SIP
call or vice versa is reasonably straightforward, as sketched here: ![]() Some additional information and links to related Internet Drafts can be
found at http://www.softarmor.com/sipwg/teams/sipt/index.html. |
![]() ![]() How do I interconnect ISUP (SS7 signaling) and SIP? |
SIP can be used either between SS7 nodes or to trigger
a phone call in an SS7 network. While all the details have not been worked
out, the basic call flow is similar to the ISDN case: ![]() Some additional information and links to related Internet Drafts can be
found at http://www.softarmor.com/sipwg/teams/sipt/index.html. |
![]() ![]() Where do I find description of SDP? |
The SDP (Session Description Protocol) specification can
be found in RFC2327. SIP uses SDP to describe media capabilities of call
participants and to negotiate the common set of media for a call.
Appendix B of RFC2543 describes the usage of SDP in SIP
messages. |
![]() ![]() Can SIP be used for Internet telephony gateways (ITGs)? |
Yes, in two ways. First, it can indicate to the
Internet-based caller that the callee is reachable via an ITG, via the
Contact header. Secondly, two ITGs connecting parties on the PSTN can
signal new calls to each other, with the destination phone number
contained in the request URL. |
![]() ![]() What is sip-cgi and how does it relate to CPL? |
Both are viewed as different approaches for creating
VoIP services. Both are written offline, and both are executed when
messages arrive in order to execute features.
CPL is an XML-based language, while sip-cgi is a mechanism for invoking scripts or programs written in any language. sip-cgi is very similar to web cgi scripts. In its current version, CPL is only invoked when INVITE requests and responses arrive, while sip-cgi can intercept any request. sip-cgi is designed to be used by SIP, while CPL can probably be used by a number of signaling protocols such as Q.931 or H.323.

CPL and sip-cgi differ in their applicability. CPL is designed for end user service creation. It is intentionally limited in capabilities and is not a general purpose programming language. Its execution on a server is generally very fast. CGI is more powerful: you can do nearly anything. It is programming language independent. It incurs a process-spawning overhead, so it's less efficient than CPL. (CPL is usually executed in the same process as the server.)

As a service provider, I would not want to execute CGI scripts sent to me by end users. However, I would prefer to use CGI to develop my own services. Note that CGI may be used as the execution environment for a CPL
script. (Jonathan Rosenberg) |
![]() ![]() Is there a SIP interoperability certification? How can I test interoperability with others? |
There
currently is no certification that attests to the functionality and
compatibility of a SIP implementation. However, there are regular SIP bake-offs where
implementors can test their work. Also, some sites have set up public SIP servers.
|
![]() ![]() Why use SIP-T as opposed to tunneling SS7 using SCTP? |
Using SCTP (aka SIGTRAN) to send SS7 between
softswitches works fine assuming that you know the terminating device is
an SS7 enabled softswitch, and that you are not interested in services
provided by SIP. By using SIP instead, a softswitch can talk the same call
control protocol to other softswitches, PC clients, gateways, IP phones,
and so on. Furthermore, the softswitch does not need to know the identity
of the terminating device ahead of time. In real networks, it will be
unlikely that the originating softswitch knows. Calls will terminate in
networks owned by other providers, in which case the type of terminating
device cannot be known ahead of time. SIP-T is ideal in that the extra
ISUP information carried is ignored by non-SS7 devices, so it works for
all devices. |
![]() ![]() How does SIP carry DTMF (touch tones)? |
First, in most cases it is not clear that SIP is the
right mechanism for this, since DTMF detection is being done in devices
that generate RTP, not SIP.

RTP can be used to carry DTMF, as described in RFC 2833. RFC 2833 uses "forward error correction", retransmitting DTMF digits periodically. Thus, unless there are extremely long bursts of packet errors, digits are transmitted reliably. Retransmission by SIP, either at the application layer or via TCP, is based on exponential back-off, with delays of a few seconds after several consecutive losses. If a human generates the touch tone commands, it is possible that such long retransmission delays will cause the user to press the button again, resulting in duplicate digits.

DTMF over RTP is also required to synchronize audio and touch tones at VoIP-to-PSTN gateways. Gateways that are only interested in detecting tones do not need to buffer audio and can simply forward the audio packets while doing playout buffering and DTMF detection locally.

A number of proposals exist for carrying DTMF in SIP INFO messages, but the working group has not decided which of the approaches, if any, to pursue. |
![]() ![]() What is the relationship of a "softswitch" to SIP? |
The term "softswitch" is primarily a marketing term,
with no well-defined technical meaning. It is often used to designate a
collection of software providing telephony interworking services,
including a signaling gateway (SG), media gateway controller (MGC) and a
SIP user agent (UA). It can use any number of protocols, depending on the
particular application and network configuration, including ISUP, CAS,
MGCP, Megaco, H.323 and SIP. Many "softswitches" use SIP for communicating
between softswitches. |
![]() ![]() How does SIP/SDP relate to T.38 fax calls? |
An SDP addition to allow SIP/SDP to set up T.38 fax
calls is specified in Annex D of ITU Rec. T.38 ("SIP/SDP Call
Establishment Procedures"). It can also be found at ftp://standards.nortelnetworks.com/itu_to_ietf/SG8/February00/
|
![]() ![]() Where is the use of SIP defined in 3GPP? |
The exact signalling and call control protocols are
defined in 3GPP Technical Specification 3G TS 24.228: "Signalling flows
for the IP multimedia call control based on SIP and SDP" and 3GPP
Technical Specification 3G TS 24.229: "IP Multimedia Call Control Protocol
based on SIP and SDP". (Jack Yu)

3GPP documents can be found at http://www.3gpp.org/. The draft version of TS 24.228 is available at http://www.3gpp.org/ftp/Specs/2000-12/Drafts/24_series/ |