H.323 Videoconferencing Basics

Introduction
H.323 is an international standard for videoconferencing. It uses the Internet for connectivity between endpoints. Endpoints can be client videoconferencing terminals, Multipoint Control Units (MCUs), or gateways. Client endpoints allow users to interact with users at remote sites using the audio and video capabilities of the device (e.g. camera, display, microphone, and speaker).

Point-to-Point Videoconferencing
Consider two client terminals, or endpoints (LifeSize or Polycom videoconferencing units, for example), that are connected to the Internet. Each videoconferencing unit allows its users to make calls to other clients, send their local audio/video streams to remote clients, and hear and view the received audio/video streams on the speakers and monitors connected to the unit.

Assume one user (the local user) uses a videoconferencing unit to call a user at a remote videoconferencing unit by entering the IP address of the remote endpoint. The clients set up a call between the stations following the specifications of the H.323 protocol. Once the call is set up, the clients exchange audio/video streams over the Internet. The point-to-point videoconference continues until one of the users disconnects the call.

One of the problems with this type of video call is that IP addresses are used to place the call. IP addresses are difficult to remember. Some users have dynamically assigned (DHCP) IP addresses that can change every time they boot their system. We have also noted problems with IP addressing when systems from different vendors are used. We therefore do not recommend IP dialing, although it is occasionally used. Refer to the write-up on "Understanding Addressing Issues" for a better understanding of addressing standards used at Northwestern University.

The Gatekeeper
To alleviate the problems of IP dialing, the H.323 standard defines a system called a “gatekeeper”. The gatekeeper connects to the Internet just like the client terminals. The IP address of the gatekeeper is configured into the client terminals. When the clients "power up", they contact the gatekeeper and send it certain information that describes the client. This process is known as “registration”; the client registers with the gatekeeper. Two identifiers are assigned to and configured in each client terminal for this purpose.

One of the identifiers is an H.323 alias. It is usually descriptive of the particular client terminal and usually contains alphanumeric characters. The user’s email address is often used as the H.323 alias. The other identifier is the H.323 extension. The H.323 extension usually consists of several digits and can be thought of as the "video telephone number" of the client. While it is possible to use either the H.323 alias or the H.323 extension for dialing, it is difficult to dial alphanumeric characters on most clients. Thus it is the H.323 extension that is normally used for dialing. Refer to the write-up on "Understanding Addressing Issues" for a better understanding of addressing standards used at Northwestern University.

When the clients register with the gatekeeper, they relay their IP addresses, H.323 aliases, and H.323 extensions to the gatekeeper, where they are stored. This allows a local user to dial a remote user by entering the remote user’s H.323 extension (video telephone number) rather than an IP address. The local client terminal communicates the H.323 extension to the gatekeeper. The gatekeeper then checks whether the remote client is registered with it. If it is, the gatekeeper sets up the call between the two clients; if it is not registered, the call is rejected. Once the call has been set up, the audio/video streams flow directly between the clients over the Internet.
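
The registration and lookup steps described above can be sketched in a few lines of Python. This is only an illustrative model, not how a real gatekeeper is implemented: actual gatekeepers exchange H.323 signaling messages with the clients, and the class names, IP addresses, aliases, and extensions below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Registration:
    """What a client relays to the gatekeeper when it registers."""
    ip_address: str
    alias: str       # H.323 alias, e.g. an email address
    extension: str   # H.323 extension, the "video telephone number"


class Gatekeeper:
    """Toy in-memory gatekeeper (hypothetical; no real signaling is performed)."""

    def __init__(self) -> None:
        self._by_extension: Dict[str, Registration] = {}

    def register(self, reg: Registration) -> None:
        # Store the client's details, keyed by its extension.
        self._by_extension[reg.extension] = reg

    def resolve(self, extension: str) -> Optional[str]:
        # Return the callee's IP address, or None if the call must be rejected.
        reg = self._by_extension.get(extension)
        return reg.ip_address if reg else None


gk = Gatekeeper()
gk.register(Registration("192.0.2.10", "alice@example.edu", "5551001"))
gk.register(Registration("192.0.2.20", "bob@example.edu", "5551002"))

print(gk.resolve("5551002"))  # registered, so the call can be set up
print(gk.resolve("5559999"))  # not registered, so the call is rejected (None)
```

The caller only ever supplies an extension; the mapping to an IP address lives in the gatekeeper, which is why a change of IP address on the remote side does not affect how the call is dialed.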

Multipoint Videoconferences
Multipoint videoconferences connect three or more clients. For this scenario, the H.323 standard introduces the concept of a Multipoint Control Unit (MCU). The MCU is an endpoint that can be thought of as a "video bridge". It connects to the Internet and registers with the gatekeeper just as any other endpoint does. Depending on its design capacity, it can handle a certain number of simultaneous videoconferences. Each videoconference is independent of the others, and each has connections from three or more sites. Each audio/video stream is logically separate from the others. To place a multipoint videoconference call, you must use the particular dial string that was set up for that conference when the conference was scheduled (created).

When users want to join a particular videoconferencing session, they dial the pre-assigned conference number. The gatekeeper checks whether that number has been registered by the MCU. If it has, the gatekeeper completes the call by connecting the client to the specified videoconference on the MCU; if the number has not been registered, the call is rejected.

Once the call has been connected, the client's audio/video stream is sent over the Internet from the client to the MCU. Similarly, other clients connect to the same session and send their audio/video streams to the MCU. The MCU selects one of the audio/video streams in the videoconference and returns that stream to all of the other clients (that is, to every client except the one whose stream was selected). There are several methods for selecting an audio/video stream; audio switching and chairman control are two alternatives. Typically, audio switching is chosen: the MCU selects the stream that currently has active audio (the site where someone is talking, or talking the loudest). We frequently refer to this selection process by saying that this particular stream (client) has "captured" the MCU. As the users at one site stop talking and the users at another site start to talk, the new site captures the MCU. The process is then repeated, with the video from the newly selected site being sent to all the other sites and the newly selected site receiving the video from the previously selected site.
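
As a rough illustration of audio switching, the sketch below tracks which site has "captured" the MCU and works out which stream each site receives. It is a simplified model under stated assumptions: the audio levels are made-up numbers, the loudest site always wins immediately, and the class and method names are hypothetical. A real MCU would typically also apply thresholds and hold times so that brief noises do not cause a switch.

```python
from typing import Dict, Optional


class AudioSwitcher:
    """Toy model of audio switching on an MCU (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.selected: Optional[str] = None   # site that has captured the MCU
        self.previous: Optional[str] = None   # site that held it before that

    def update(self, levels: Dict[str, float]) -> Dict[str, Optional[str]]:
        """Given each site's audio level, return site -> site whose stream it sees."""
        loudest = max(levels, key=levels.get)
        if loudest != self.selected:
            # A new site has captured the MCU; remember who had it before.
            self.previous, self.selected = self.selected, loudest
        routing: Dict[str, Optional[str]] = {}
        for site in levels:
            if site == self.selected:
                # The selected site sees the previously selected site, if any.
                routing[site] = self.previous
            else:
                routing[site] = self.selected
        return routing


mcu = AudioSwitcher()
print(mcu.update({"A": 0.9, "B": 0.1, "C": 0.2}))  # A talks: B and C see A
print(mcu.update({"A": 0.1, "B": 0.8, "C": 0.2}))  # B talks: A and C see B, B sees A
```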

Streaming

To participate in an H.323 videoconference, users must have appropriate H.323 videoconferencing client terminals and Internet connectivity with sufficient bandwidth to support the videoconference. Some users may not have these capabilities but would still like to participate, even if that means they can only see and hear the conference participants and cannot interact with them. This can be accomplished if the videoconference session is captured, encoded in an appropriate format, and streamed over the Internet.

Fortunately, the Codian MCU that we are using at Northwestern University has the capability to stream a videoconference that is being held on the MCU. Users can receive the stream using a browser on a computer. They enter the URL of the MCU and fill in a form that specifies the session to watch, and the server then starts sending the encoded audio/video stream over the Internet to the computer. Browser plug-ins exist that are capable of decoding both RealVideo and Windows Media streams. The user can thus see and hear the participants in the streamed videoconference in near real-time.

Archiving
The videoconferencing equipment installed at Northwestern University also has the capability to archive a videoconference. In this manner, a user can connect to a server at a later date and view the archived version of the videoconference. However, at this time, this process has not been automated. Users can check with the Academic Technologies videoconferencing staff to see if the archiving capability can be made available for a particular videoconference.

Gateways
Many sites have videoconferencing rooms that implement the H.320 standard, which uses telecommunication lines (e.g. dial-up or dedicated ISDN lines), rather than the H.323 standard, which specifies the Internet for connectivity. The H.323 standard was developed after the H.320 standard and uses many of the encoding/decoding protocols originally developed for H.320. H.320 systems can be considered legacy systems, but since many of them still exist, it is important that we continue to support H.320.

The Northwestern University infrastructure provides a “gateway” that allows both types of videoconferencing units to take part in the same conference. The gateway provides a path between H.320 and H.323 systems. It translates H.320 commands and audio/video streams to their H.323 equivalents and vice versa. Users with H.320 client terminals dial the gateway over ISDN lines.

Using the H.320 equipment, the user dials the ISDN telephone number of the Northwestern gateway followed by a pound (#) sign (847 467 0001#). The user then enters the dial string for the selected session (this can be either an MCU session number or the H.323 extension of an H.323 endpoint), and the gateway connects the H.320 terminal to the selected session. All H.323-based users can see and hear the H.320-based users as if they were on H.323 terminals, and similarly the H.320-based users can see and hear the H.323-based users as if they were on H.320 terminals. Multiple H.320 connections can be made to the gateway, up to the capacity of the gateway.

One other benefit of the gateway is that it can accept calls from standard telephones. A user with a standard telephone dials the ISDN telephone number (847 467 0001#) and is connected to the gateway. The telephone user then enters the dial number of the desired videoconferencing session. The user can then hear all of the audio from the videoconference and can also speak to the others in the conference. The gateway is able to connect multiple telephone calls simultaneously and can even connect to a telephone bridge, which could allow participation by a large number of audio-only users.

Bandwidth Considerations
The H.323 client terminals encode the selected audio (usually from a microphone) and video (usually from a camera) inputs. The encoded audio and video are then compressed into a single audio/video stream and sent to the remote endpoint (another client terminal or the MCU). Different rates can be selected for the encoding process. As an example, an encoding rate of 384 Kbps is common; of this, 64 Kbps is reserved for the audio and 320 Kbps for the video. The 384 Kbps stream is compressed (redundancy is removed) and sent to the remote endpoint. Similarly, a 384 Kbps stream is received from the remote endpoint. Thus approximately twice 384 Kbps of bandwidth (less any bandwidth saved by compression) is required to support the videoconference for this endpoint. If there is a lot of motion in the video, very little compression is achieved. If there is almost no motion in the video, the savings approach about 50%. Since we must design for the worst case, assume a full-duplex bandwidth requirement of twice 384 Kbps, or 768 Kbps.
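
The arithmetic above can be written out as a small Python helper. This is a sketch only: the function name is hypothetical, and applying the compression saving to the whole audio/video stream in both directions is a simplifying assumption based on the description above.

```python
AUDIO_KBPS = 64    # audio share of a 384 Kbps call, from the text
VIDEO_KBPS = 320   # video share of a 384 Kbps call, from the text


def call_bandwidth_kbps(audio_kbps: float, video_kbps: float,
                        compression_saving: float = 0.0) -> float:
    """Full-duplex bandwidth for one endpoint: one stream sent, one received.

    compression_saving is the fraction of the stream removed by compression
    (0.0 for lots of motion, about 0.5 when there is almost no motion).
    """
    if not 0.0 <= compression_saving <= 1.0:
        raise ValueError("compression_saving must be between 0 and 1")
    one_way = (audio_kbps + video_kbps) * (1.0 - compression_saving)
    return 2 * one_way


worst_case = call_bandwidth_kbps(AUDIO_KBPS, VIDEO_KBPS)        # no savings
best_case = call_bandwidth_kbps(AUDIO_KBPS, VIDEO_KBPS, 0.5)    # about 50% savings

print(worst_case)  # 768.0 Kbps, the figure to design for
print(best_case)   # 384.0 Kbps
```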

Higher encoding rates can be selected. Most client terminals support rates up to 768 Kbps, and some videoconferencing implementations can encode at rates up to 2 Mbps. Higher encoding rates provide better quality video, but they also mean higher bandwidth requirements and greater impact on the network. Lower encoding rates, down to about 128 Kbps, can also be selected; this of course means lower video quality. A rate of 384 Kbps is a good compromise between quality on the one hand and resource impact on the other, and it will support video at 30 frames per second. Lower encoding rates yield lower frame rates and choppy video. There is a discernible but small improvement in quality between 384 Kbps and 768 Kbps. Having said this, some applications require very high resolution, and new videoconferencing units that support High Definition video are now coming onto the market.



© 2006 Northwestern University