Assessing and improving user experience in Voice and Video over IP communication
What is a Codec?
A codec is hardware or a computer program that encodes and decodes digital data streams or signals. Codecs shrink large media files and make them playable on your computer; your media player needs the right codec to play your downloaded music and movies.
Codec = Coder + Decoder.
A codec encodes a data stream or signal for transmission, storage or encryption and decodes it for playback or editing.
Do we need Codecs?
Video and music files are large (uncompressed 1080i high-definition video recorded at 60 frames per second eats up 410 gigabytes per hour of video), so they are difficult to transfer across the Internet quickly. To help speed up downloads, mathematical “codecs” were built to encode (“shrink”) a signal for transmission and then decode it for viewing or editing. Without codecs, downloads would take three to five times longer than they do now.
A codec is a translator for compressing and decompressing raw media data.
Raw data is huge. Codecs compress it and make it practical to store.
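To get a feel for how large raw video is, here is a rough back-of-the-envelope sketch. The parameters are assumptions, not values stated above: a 1920x1080 picture, 16 bits per pixel (8-bit 4:2:2 sampling) and 30 full frames per second (60 interlaced fields). Under those assumptions the result is roughly 448 GB per hour, the same order of magnitude as the figure quoted above.

```python
def raw_video_bytes_per_hour(width, height, bits_per_pixel, frames_per_second):
    """Size in bytes of one hour of uncompressed video."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * frames_per_second * 3600


# Assumed parameters (illustrative only): 1080-line HD picture,
# 16 bits per pixel, 30 full frames (60 interlaced fields) per second.
size = raw_video_bytes_per_hour(1920, 1080, 16, 30)
print(f"{size / 1e9:.0f} GB per hour")
```

Changing the bit depth or frame rate shifts the number, but it stays in the hundreds of gigabytes per hour, which is why compression is needed.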
Lossy and Lossless Codecs
Lossy Codecs reduce quality by some amount in order to achieve compression. Often, this type of compression is virtually indistinguishable from the original uncompressed sound or images, depending on the codec and the settings used. Lower data rates also reduce cost and improve performance when the data is transmitted.
Lossless codecs are typically used for archiving data in a compressed form while retaining all of the information present in the original stream. If preserving the original quality of the stream is more important than eliminating the correspondingly larger data sizes, lossless codecs are preferred. This is especially true if the data is to undergo further processing (for example editing) in which case the repeated application of processing (encoding and decoding) on lossy codecs will degrade the quality of the resulting data such that it is no longer identifiable (visually, audibly or both).
Using more than one codec or encoding scheme successively can also degrade quality significantly. The decreasing cost of storage capacity and network bandwidth has a tendency to reduce the need for lossy codecs for some media.
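The difference between the two families can be sketched with a toy example. The lossless half uses Python's standard zlib module; the lossy half is a made-up quantizer (not a real audio or video codec) that keeps only the top bits of each 8-bit sample. The names `lossless_roundtrip`, `lossy_roundtrip` and `keep_bits` are hypothetical.

```python
import zlib


def lossless_roundtrip(data: bytes) -> bytes:
    """Compress and decompress; every byte of the input is recovered."""
    return zlib.decompress(zlib.compress(data))


def lossy_roundtrip(samples: bytes, keep_bits: int = 4) -> bytes:
    """Toy lossy codec: keep only the top keep_bits of each 8-bit sample."""
    shift = 8 - keep_bits
    return bytes((s >> shift) << shift for s in samples)


samples = bytes(range(256))           # stand-in for raw 8-bit media samples
restored = lossless_roundtrip(samples)
degraded = lossy_roundtrip(samples)

# Largest difference between an original sample and its lossy version.
max_error = max(abs(a - b) for a, b in zip(samples, degraded))
print(restored == samples, degraded == samples, max_error)
```

The lossless path returns the input unchanged, while the lossy path discards the low bits for good. Real lossy codecs are far smarter about what they discard, but the trade-off is the same.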
Media Codecs and their variations
Codecs are often designed to emphasize certain aspects of the media, or their use, to be encoded. For example, a digital video (using a DV codec) of a sports event needs to encode motion well but not necessarily exact colors, while a video of an art exhibit needs to encode color and surface texture well.
Audio codecs for cell phones need to have very low latency between source encoding and playback. In contrast, audio codecs for recording or broadcast can use high-latency audio compression techniques to achieve higher fidelity at a lower bit-rate.
As a result, there are thousands of codecs available. There are codecs for audio and video compression, for streaming media over the Internet, videoconferencing, playing MP3s, speech, or screen capture. If you are a regular downloader, you will probably need ten to twelve codecs to play your music and movies.
Many multimedia data streams contain both audio and video, and often some metadata that permit synchronization of audio and video. Each of these three streams may be handled by different programs, processes, or hardware; but for the multimedia data streams to be useful in stored or transmitted form, they must be encapsulated together in a container format.
Lower bitrate codecs allow more users, but they also have more distortion. Beyond the initial increase in distortion, lower bit rate codecs also achieve their lower bit rates by using more complex algorithms that make certain assumptions, such as those about the media and the packet loss rate. Other codecs may not make those same assumptions. When a user with a low bitrate codec talks to a user with another codec, additional distortion is introduced by each transcoding.
Difference between Codec, Compression Format and Container Format
Compression format or standard: a format is a document (the standard) that describes a way of storing data, while a codec is a program (an implementation) that can read or write such files. In practice, however, “codec” is sometimes used loosely to refer to formats.
A container specifies how different data elements and metadata coexist in a computer file or stream.
Once the media data is compressed into suitable formats and reasonable sizes, it needs to be packaged, transported, and presented. That is the purpose of container formats: to be discrete “black boxes” for holding a variety of media formats. Good container formats can handle files compressed with a variety of different codecs.
Theoretically, a container format could wrap any kind of data, but most container formats are specialized for specific data requirements.
A container does not describe how the data it wraps is encoded.
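The idea that a container only packages streams, without caring how they are encoded, can be sketched with a made-up container layout. This is not any real container format: each chunk is simply a 1-byte stream id, a 4-byte big-endian length and an opaque payload. The names `mux` and `demux` and the stream ids are hypothetical.

```python
import struct

HEADER = ">BI"                      # stream id (1 byte) + payload length (4 bytes)
HEADER_SIZE = struct.calcsize(HEADER)


def mux(chunks):
    """Pack (stream_id, payload) pairs into one byte string."""
    out = bytearray()
    for stream_id, payload in chunks:
        out += struct.pack(HEADER, stream_id, len(payload)) + payload
    return bytes(out)


def demux(blob):
    """Recover the (stream_id, payload) pairs; payloads stay opaque."""
    chunks, pos = [], 0
    while pos < len(blob):
        stream_id, length = struct.unpack_from(HEADER, blob, pos)
        pos += HEADER_SIZE
        chunks.append((stream_id, blob[pos:pos + length]))
        pos += length
    return chunks


AUDIO, VIDEO = 1, 2                 # hypothetical stream ids
original = [(VIDEO, b"encoded-frame-0"), (AUDIO, b"encoded-audio-0")]
blob = mux(original)
print(demux(blob) == original)
```

Note that `mux` and `demux` never look inside the payloads: whichever codec produced them is irrelevant to the packaging.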
Popular Codecs and Containers
A typical example of SIP:
Let us consider an example in which user Alice wants to communicate with user Bob. Proxy 1 and Proxy 2 help to set up the session on behalf of the users. This common arrangement of the proxies and the end users is called the “SIP Trapezoid”.
The messages are shown vertically in the order in which they occur, i.e. the message on top comes first, followed by the others. The direction of the arrows shows the sender and recipient of each message.
The transaction starts with Alice making an INVITE request for Bob. But Alice doesn’t know the exact location of Bob in the IP network, so she passes the request to Proxy1. Proxy1, on behalf of Alice, forwards an INVITE request for Bob to Proxy2. It also sends a TRYING response to Alice, informing her that it is trying to reach Bob. (Other responses are possible, depending on the situation.)
On receiving INVITE X2 from Proxy1, Proxy2 works in a similar fashion to Proxy1. It forwards an INVITE request to Bob. (Note: here Proxy2 knows the location of Bob. If it didn’t know the location, it would have forwarded the request to another proxy server, so an INVITE request may travel through several proxies before reaching the recipient.) After forwarding INVITE X3, Proxy2 issues a TRYING response to Proxy1.
Bob’s SIP phone or softphone, on receiving the INVITE request, starts ringing to inform Bob that a call request has arrived. It sends a RINGING response back to Proxy2, which reaches Alice through Proxy1. Alice thus gets feedback that Bob has received the INVITE request.
Bob at this point has a choice to accept or decline the call. Let’s assume that he decides to accept it. As soon as he accepts the call, a 200 OK response is sent by the phone to Proxy2. Retracing the route of the INVITE, it reaches Alice. Alice’s softphone sends an ACK message to confirm the setup of the call. This three-way handshake (INVITE + 200 OK + ACK) is used for reliable call setup. Note that the ACK message does not use the proxies to reach Bob, because by now Alice knows the exact location of Bob.
Once the connection has been set up, media flows between the two endpoints. Media flow is controlled using protocols other than SIP, e.g. SDP, RTP etc.
When one party in the session decides to disconnect, it (Bob in this case) sends a BYE message to the other party. The other party sends a 200 OK message to confirm the termination of the session.
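The call flow described above can be written down as a simple ordered list. This is only a sketch of the sequence as narrated in the text (the numeric codes 100 and 180 are the standard SIP codes for Trying and Ringing); the variable names are hypothetical.

```python
# (sender, receiver, message) in the order described in the text.
CALL_FLOW = [
    ("Alice",  "Proxy1", "INVITE"),
    ("Proxy1", "Proxy2", "INVITE"),
    ("Proxy1", "Alice",  "100 Trying"),
    ("Proxy2", "Bob",    "INVITE"),
    ("Proxy2", "Proxy1", "100 Trying"),
    ("Bob",    "Proxy2", "180 Ringing"),
    ("Proxy2", "Proxy1", "180 Ringing"),
    ("Proxy1", "Alice",  "180 Ringing"),
    ("Bob",    "Proxy2", "200 OK"),
    ("Proxy2", "Proxy1", "200 OK"),
    ("Proxy1", "Alice",  "200 OK"),
    ("Alice",  "Bob",    "ACK"),      # direct: Alice now knows Bob's location
    # ... media flows directly between Alice and Bob here ...
    ("Bob",    "Alice",  "BYE"),
    ("Alice",  "Bob",    "200 OK"),
]

ENDPOINTS = {"Alice", "Bob"}

# Messages that bypass the proxies travel endpoint to endpoint.
direct = [msg for sender, receiver, msg in CALL_FLOW
          if sender in ENDPOINTS and receiver in ENDPOINTS]

for sender, receiver, msg in CALL_FLOW:
    print(f"{sender:>6} -> {receiver:<6} {msg}")
```

Filtering for endpoint-to-endpoint messages shows that only the ACK, the BYE and its 200 OK skip the proxies in this scenario.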
Relation among Call, Dialog, Transaction & Message
If you are confused about the relation among Call, Dialog, Transaction and Message, you are not alone. Many people find it confusing at the beginning.
Messages are the individual textual bodies exchanged between a server and a client. There can be two types of messages. They are – Requests and Responses.
Transaction occurs between a client and a server and comprises all messages from the first request sent from the client to the server up to a final (non-1xx) response sent from the server to the client. If the request is INVITE and the final response is a non-2xx, the transaction also includes an ACK to the response. The ACK for a 2xx response to an INVITE request is a separate transaction.
Dialog is a peer-to-peer SIP relationship between two UAs that persists for some time. A dialog is identified by a Call-ID, a local tag and a remote tag. A dialog used to be referred to as a ‘call leg’.
A call, from the point of view of a callee, comprises all the dialogs it is involved in. I think a Call is the same as a Session.
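The dialog identifier described above can be sketched as a small tuple type. The Call-ID and tag values are borrowed from the message samples in this series purely for illustration; `DialogId` and `peer_view` are hypothetical names, not part of any SIP library.

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def peer_view(dialog: DialogId) -> DialogId:
    """The same dialog as seen from the other UA: the tags swap roles."""
    return DialogId(dialog.call_id, dialog.remote_tag, dialog.local_tag)


# Illustrative values only.
caller_view = DialogId("2388990012@alice_ws.mssys.com", "1928301774", "a6c85cf")
callee_view = peer_view(caller_view)
print(callee_view)
```

Both UAs share the same Call-ID, but what one side calls its local tag is the remote tag for the other side.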
My article ends here. It was meant just for beginners. I recommend further reading on SIP.
SIP Message Samples:
The following samples show the message exchange between two User Agents for the purpose of setting up a voice call. SIP user email@example.com invites SIP user firstname.lastname@example.org to a call for the purpose of discussing lunch. Alice sends an INVITE request containing an SDP body. Bob replies with a 200 OK response also containing an SDP body.
Request Message Line
|INVITE sip:email@example.com SIP/2.0||Request line: method type, request URI (SIP address of called party), SIP version.|
|Via: SIP/2.0/UDP||Address of the previous hop. It contains the local address of Alice (the initiator), i.e. mssys.com, where she expects the responses to arrive.|
|Max-Forwards: 70||Limits the number of hops that this request may take before reaching the recipient (here 70). It is decreased by one at each hop. This prevents the request from traveling forever if it is trapped in a loop.|
|From: Alice <sip:firstname.lastname@example.org>;tag=1928301774||User originating this request. It also contains a tag, a pseudo-random sequence inserted by the SIP application, which works as an identifier of the caller in the dialog.|
|To: Bob <sip:email@example.com>||User being invited, as specified originally.|
|Call-ID: 2388990012@alice_ws.mssys.com||Globally unique ID of this call. It is generated as the combination of a pseudo-random string and the softphone’s host name or IP address.|
|CSeq: 1243 INVITE||Command sequence. Identifies the transaction. It contains an integer and a method name. The first request is given an arbitrary CSeq number; after that it is incremented by one with each new request. It is used to detect non-delivery or out-of-order delivery of messages.|
|Contact: <sip:firstname.lastname@example.org>||Contains a SIP or SIPS URI that is a direct route to Alice. It consists of a username and a fully qualified domain name (FQDN); an IP address may be used instead. (The Via field is used to send the response to the request; the Contact field is used to send future requests. That is why the 200 OK response from user2 goes to user1 through the proxies, but when user2 generates a BYE request (a new request and not a response to INVITE), it goes directly to user1, bypassing the proxies.)|
|Subject: Lunch today.||Call subject and/or nature.|
|Content-Type: application/sdp||Type of body, in this case SDP. (The body is beyond the scope of SIP and is described by SDP, which will be discussed later.)|
|Content-Length: 182||Number of bytes in the body. If the transport type is UDP, then the Content-Length header is not mandatory. It is, however, mandatory for TCP.|
|(blank line)||Blank line marks end of SIP headers and beginning of body.|
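The loop protection provided by the Max-Forwards field in the table above can be sketched as follows. This is a simplified model, not a proxy implementation: `forward_through_proxy` is a hypothetical helper, and returning None stands in for the proxy refusing to forward the request (a real proxy answers 483 Too Many Hops in that case).

```python
def forward_through_proxy(max_forwards: int):
    """Return the Max-Forwards value to forward with, or None if the
    request must not be forwarded any further."""
    if max_forwards <= 0:
        return None          # stand-in for a "483 Too Many Hops" reply
    return max_forwards - 1


# A request trapped in a routing loop: count how many times it is forwarded.
value, forwards = 70, 0
while True:
    next_value = forward_through_proxy(value)
    if next_value is None:
        break
    value = next_value
    forwards += 1

print(forwards)
```

Starting from 70, the request is forwarded at most 70 times before it is dropped, so a loop cannot keep it alive forever.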
SIP/2.0 200 OK
Via: SIP/2.0/UDP site4.server2.com;branch=z9hG4bKnashds8;received=192.0.2.3
Via: SIP/2.0/UDP site3.server1.com;branch=z9hG4bK77ef4c2312983.1;received=192.0.2.2
Via: SIP/2.0/UDP pc33.server1.com;branch=z9hG4bK776asdhds;received=192.0.2.1
To: user2 <sip:email@example.com>;tag=a6c85cf
From: user1 <sip:firstname.lastname@example.org>;tag=1928301774
CSeq: 314159 INVITE
—- User2 Message Body Not Shown —-
The header fields that follow the status line are similar to those in a request. I will just mention the differences:
There is more than one Via field. This is because each element through which the INVITE request has passed has added its identity in a Via field. The three Via fields were added by the softphone of user1, by server1 (the first proxy) and by server2 (the second proxy). The response retraces the path of the INVITE using the Via fields. On its way back, each element removes its own Via field before forwarding the response towards the caller.
Note that the To field now contains a tag. This tag is used to represent the callee in a dialog.
The Contact field of the response contains the exact address of user2, so user1 doesn’t need to use the proxy servers to find user2 in the future.
It is a 2xx response. However, responses can be different depending on the particular situation.
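The way the Via fields record the request path and steer the response back can be sketched with a plain list used as a stack. The host names are taken from the sample response above; the code is only a model of the bookkeeping, not of real SIP routing.

```python
# Elements the INVITE passes through, in order: user1's softphone,
# the first proxy and the second proxy.
request_hops = ["pc33.server1.com", "site3.server1.com", "site4.server2.com"]

# Each element puts its own Via on top of the existing ones.
vias = []
for host in request_hops:
    vias.insert(0, host)

# The callee copies the Via list into the response. The response is sent
# to the topmost Via; that element removes its own entry and forwards the
# response to the next one.
response_path = []
remaining = list(vias)
while remaining:
    response_path.append(remaining[0])
    remaining = remaining[1:]

print(vias)
print(response_path)
```

The topmost Via is the last proxy the request passed through, so the response visits the same elements as the request, in reverse order.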
SIP Message Parts:
A SIP message usually has three parts:
Start Line: conveys the message type (request or response).
Start Line = Method (for a request) or Response Code (for a response) + Protocol version.
A request’s Start Line (Request Line) uses the following format:
<Request method> <URI> <Protocol version>
URI indicates the user/service to which the request is addressed. This address/URI can be re-written by the Proxy Servers.
An example of a Request Line is:
REGISTER sip:arstechnica.com SIP/2.0
A response’s Start Line (Status Line) uses the following format:
<Protocol version> <Response Code> <Reason phrase>
The reason phrase could be any text describing the nature of the response.
An example of a Status Line is:
SIP/2.0 200 OK
Header: conveys the message attributes and modifies the meaning of the message. SIP headers are very similar to HTTP headers.
All headers follow the format:
<Header name>: <Value>
Headers can span multiple lines. Some SIP headers such as Via, Contact, Route and Record-Route can appear multiple times in a message or, alternatively, can take multiple comma-separated values in a single header occurrence.
An example of a header is:
Contact: <sip:email@example.com>;expires=2000
Contact is the header name, sip:email@example.com is the value, and expires=2000 is a parameter. Other parameters may appear separated by semicolons.
Body (Content): describes the session to be initiated (for example, in a multimedia session this may include audio and video codec types, sampling rates etc.), or alternatively it may contain opaque textual or binary data of any type that relates in some way to the session. Message bodies can appear both in request and in response messages. SIP makes a clear distinction between signaling information, conveyed in the SIP Start Line and headers, and the session description information, which is outside the scope of SIP.
Possible body types include:
- SDP: Session Description Protocol.
- MIME: Multipurpose Internet Mail Extensions.
- Others: to be defined in the IETF and in specific
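The three-part structure described above (start line, headers, body) can be sketched with a very small parser. This is a simplified illustration, not a compliant SIP parser: it assumes CRLF line endings and does not handle headers folded across several lines. `parse_sip_message` is a hypothetical name, and the headers in the sample message are chosen only for illustration.

```python
def parse_sip_message(raw: str) -> dict:
    """Split a SIP message into start line, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")   # blank line ends the headers
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    # A status line starts with the protocol version, a request line
    # starts with the method.
    kind = "response" if start_line.startswith("SIP/") else "request"
    return {"kind": kind, "start_line": start_line,
            "headers": headers, "body": body}


raw = (
    "REGISTER sip:arstechnica.com SIP/2.0\r\n"
    "Max-Forwards: 70\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
message = parse_sip_message(raw)
method, uri, version = message["start_line"].split(" ")
print(message["kind"], method, uri, version)
```

Headers are kept as a list of pairs rather than a dictionary because fields such as Via can legitimately appear more than once.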
Components of SIP
Entities interacting in a SIP scenario are called User Agents (UA).
Each entity has specific functions and participates in SIP communication as a client (initiates requests), as a server (responds to requests), or as both. One “physical device” can have the functionality of more than one logical SIP entity. For example, a network server working as a Proxy server can also function as a Registrar at the same time.
User Agents may operate in two fashions –
- User Agent Client (UAC): It generates requests and sends those requests to servers.
- User Agent Server (UAS): It receives requests, processes those requests and generates responses.
Note: A single UA may function as both.
A UA may be a softphone application running on your PC or a messaging device in your IP phone. It generates a request when you try to call another person over the network and sends the request to a server (generally a proxy server).
Servers are in general part of the network. They possess a predefined set of rules to handle the requests sent by clients.
Servers can be of several types –
Proxy Server: When a request is generated, the exact address of the recipient is not known in advance. So the client sends the request to a proxy server. The server on behalf of the client (as if giving a proxy for it) forwards the request to another proxy server or the recipient itself.
Redirect Server: Redirects the request back to the client indicating that the client needs to try a different route to get to the recipient. It generally happens when a recipient has moved from its original position either temporarily or permanently. Unlike Proxy servers the redirect server does not forward requests to other servers.
Registrar Server: Users have to register their locations with a Registrar server. From time to time, users refresh their locations by registering again (sending a special type of message) with the Registrar server.
Location Server: The addresses registered to a Registrar are stored in a Location Server.
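How the registrar, location server and proxy fit together can be sketched with three tiny classes. All class names, method names and addresses here are hypothetical and chosen for illustration; a real deployment would also deal with expiry, authentication and multiple contacts per user.

```python
class LocationServer:
    """Stores the current contact address for each user address."""

    def __init__(self):
        self._bindings = {}

    def store(self, user_address, contact):
        self._bindings[user_address] = contact

    def lookup(self, user_address):
        return self._bindings.get(user_address)


class Registrar:
    """Accepts registrations and records them in the location server."""

    def __init__(self, location_server):
        self.location_server = location_server

    def register(self, user_address, contact):
        self.location_server.store(user_address, contact)


class Proxy:
    """Resolves where a request for a user should be forwarded."""

    def __init__(self, location_server):
        self.location_server = location_server

    def route(self, user_address):
        return self.location_server.lookup(user_address)


location = LocationServer()
registrar = Registrar(location)
proxy = Proxy(location)

# Hypothetical addresses, for illustration only.
registrar.register("sip:bob@example.org", "sip:bob@192.0.2.4")
print(proxy.route("sip:bob@example.org"))
print(proxy.route("sip:carol@example.org"))
```

Registering again with a new contact simply overwrites the old binding, which is how a user who moves keeps receiving calls.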
Message Types and Commands of SIP
There are two types of SIP messages:
Requests—sent from the client to the server.
Responses—sent from the server to the client.
Request Commands:
|INVITE||Invites a user to a call|
|ACK||Confirms a final response to an INVITE|
|BYE||Terminates a call|
|CANCEL||Cancels searching and ringing|
|OPTIONS||Queries the capabilities of a server or of the other side|
|REGISTER||Registers with the location service|
|INFO||Sends mid-session information that does not modify the session state|
Response Commands: a response message contains a numeric response code. There are 2 types of responses and 6 classes:
2 Types:
Provisional (1xx class): provisional responses are used by the server to indicate progress, but they do not terminate SIP transactions.
Final (2xx, 3xx, 4xx, 5xx, 6xx classes): final responses terminate SIP transactions.
6 Classes:
1xx: Provisional (searching, querying, ringing).
2xx: Success.
3xx: Redirection, forwarding.
4xx: Request failure (client mistakes).
5xx: Server failure.
6xx: Global failure (busy, refusal, not available anywhere).
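The mapping from a response code to its class and type can be sketched in a few lines. The class names follow the list above; `classify` is a hypothetical helper.

```python
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Request failure",
    5: "Server failure",
    6: "Global failure",
}


def classify(status_code: int):
    """Return (class name, is_final) for a SIP response code."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP response code: {status_code}")
    response_class = status_code // 100
    is_final = response_class != 1      # only 1xx responses are provisional
    return RESPONSE_CLASSES[response_class], is_final


for code in (100, 180, 200, 486):
    print(code, classify(code))
```

Only the first digit decides the class, so a client can handle a code it does not recognize by treating it like the generic code of its class.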
It is a big topic, so I decided to publish it as a series. It is intended for beginners who want to get their feet wet with SIP, VoIP and IMS (IP Multimedia Subsystem) based application development. So, let's go.
What is SIP and where does it sit?
SIP (Session Initiation Protocol) is a signaling protocol used to create, manage and terminate sessions in an IP based network. A session could be a simple two-way telephone call or it could be a collaborative multi-media conference session.
It is an application-layer control protocol.
SIP sessions involve one or more participants and can use unicast or multicast communication. Borrowing from ubiquitous Internet protocols, such as HTTP and SMTP, SIP is text-encoded and highly extensible. SIP may be extended to accommodate features and services such as call control services, mobility, interoperability with existing telephony systems, and more.
What SIP can and can’t do
The job of SIP is limited to the setup and control of sessions. The details of the data exchange within a session, e.g. the encoding or codec used for audio or video media, are not controlled by SIP and are taken care of by other protocols.
There are mainly 4 types of functions that SIP performs:
- Establishment of user location (i.e. translating from a user’s name to their current network address).
- Negotiating the provided features among participants in a session.
- Call management – for example adding, dropping, or transferring participants.
- Changing the features of a session while in progress.
SIP is not a resource reservation protocol and it has nothing to do with quality of service (QoS).
SIP can work in a framework with other protocols to make sure these roles are played out – but SIP does not do them.
SIP can function with SOAP, HTTP, XML, VXML , WSDL, UDDI, SDP and others. Everyone has a role to play!