基於群組模式下之大型網路語音會談系統

A Cluster-Based Transmission Scheme for Large Scale VoIP Conferencing

邵育晟

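Conducting conferences over VoIP (VoIP conferencing) is becoming increasingly popular, particularly for long-distance voice meetings: it saves cost and lets many people join at once. As the number of participants grows, however, the bandwidth demand and the number of connections multiply, call delay becomes hard to control because of channel congestion, conference quality (QoS) often degrades sharply, and the load on the network rises substantially. The prevailing solution is to select the user devices with greater processing capability in the conference, mix the voices of the individual speakers there, forward the mixed stream to the other users, and broadcast the voice packets to all members over multicasting trees of various structures. This reduces the bandwidth and connection load and so increases the number of people who can speak at the same time. When users are too widely dispersed or too far apart, however, these approaches still leave many nodes unable to reach the required call quality.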

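This research addresses the poor call quality caused by an excessive number of participants in large-scale VoIP conferences. We assume that only one participant is speaking for most of the conference. First, a silence suppression mechanism is added for every user, so that only the speaker's packets are sent, directly, to all other participants. This not only lowers the network load but also removes the need for mixing, which greatly shortens the transmission time. In addition, users located in the same region are grouped into a cluster, and the members of a cluster are connected in a Full Mesh. This makes it possible to build a lower-complexity multicasting tree, an MLDST (Minimum Loss Diameter Spanning Tree), for broadcasting voice packets to all members, which further controls the transmission time and the packet loss rate. Clusters are formed according to the connection speed and the actual distance between users. In each cluster, a node with better connection capability is chosen as the Cluster Head, which receives the voice and distributes it within the cluster. A multicasting tree then broadcasts the voice packets to all other Cluster Heads under strict control of delay and packet loss rate, ensuring the quality of the VoIP conference.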
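The clustering and Cluster Head selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm used in this research: it takes measured round-trip time (RTT) as a stand-in for both connection speed and distance, forms clusters with a greedy threshold rule, and interprets "better connection capability" as the lowest mean RTT to all other participants. The names `form_clusters` and `pick_head`, the 50 ms threshold, and the sample RTT table are all hypothetical.

```python
from typing import Dict, List, Tuple

# Symmetric RTT table in milliseconds, keyed by an unordered node pair.
RttTable = Dict[Tuple[str, str], float]


def rtt(table: RttTable, a: str, b: str) -> float:
    """Look up the RTT between two nodes regardless of key order."""
    if a == b:
        return 0.0
    return table[(a, b)] if (a, b) in table else table[(b, a)]


def form_clusters(nodes: List[str], table: RttTable,
                  threshold_ms: float) -> List[List[str]]:
    """Greedy grouping (an assumption): a node joins the first cluster
    whose every member is within threshold_ms, else it starts a new one."""
    clusters: List[List[str]] = []
    for node in nodes:
        for cluster in clusters:
            if all(rtt(table, node, m) <= threshold_ms for m in cluster):
                cluster.append(node)
                break
        else:
            clusters.append([node])
    return clusters


def pick_head(cluster: List[str], all_nodes: List[str],
              table: RttTable) -> str:
    """Choose the member with the lowest mean RTT to every other
    participant (a hypothetical measure of connection capability)."""
    def mean_rtt(candidate: str) -> float:
        others = [n for n in all_nodes if n != candidate]
        if not others:
            return 0.0
        return sum(rtt(table, candidate, n) for n in others) / len(others)
    return min(cluster, key=mean_rtt)


# Hypothetical sample data: two nodes in each of two regions.
sample_nodes = ["tw1", "tw2", "us1", "us2"]
sample_rtt: RttTable = {
    ("tw1", "tw2"): 10.0, ("us1", "us2"): 12.0,
    ("tw1", "us1"): 150.0, ("tw1", "us2"): 160.0,
    ("tw2", "us1"): 155.0, ("tw2", "us2"): 170.0,
}

sample_clusters = form_clusters(sample_nodes, sample_rtt, threshold_ms=50.0)
sample_heads = [pick_head(c, sample_nodes, sample_rtt) for c in sample_clusters]
print(sample_clusters, sample_heads)
```

Because every member of a cluster is within the threshold of every other member, each cluster formed this way is small enough in delay terms to be connected as a Full Mesh, and only the Cluster Heads need to take part in the inter-cluster multicasting tree.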

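We conducted experiments on PlanetLab, an experimental platform that runs over a real network, to evaluate the performance of our method. The results show that the method can still keep the delay at around 350 ms when the number of participants reaches 20, and that with the assistance of Google's cloud computing system the average delay can be reduced by 50 ms to 70 ms. Because the computers on PlanetLab generally perform poorly, we expect that the technique can support more participants when it is applied to a real network.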

VoIP conferencing is becoming increasingly popular, especially for long-distance conferences, because it saves a great deal of cost and travel time. However, a growing number of online users not only causes explosive growth in bandwidth demand but also degrades voice quality. Traditional multi-party conferencing systems select the computers with larger capacity to combine the voice streams from all participants and to distribute the aggregated voice stream back to all participants through various multicasting schemes. Although these solutions alleviate the bandwidth burden on the network, the prolonged service time remains the biggest obstacle to improving voice quality.

This research proposes a Cluster Multicasting Tree (CMT) that aims to increase the size of a VoIP conference while maintaining acceptable voice quality. Assuming that there is only one speaker in the conference most of the time, CMT applies a silence suppression mechanism that blocks non-speech voice streams, reducing the number of voice streams injected into the network. CMT employs a special multicasting tree that allows each participant to broadcast his or her voice directly to all other participants. By eliminating the need for voice aggregation (mixing), the service delay can be greatly reduced. Furthermore, CMT reduces the zigzag effect in the transmission paths by dividing the participants into clusters according to their physical distances. Each member of a cluster can broadcast its voice stream directly to all other members of the same cluster. CMT then employs a multicasting tree called MLDST (Minimum Loss-Diameter Spanning Tree), which allows the head of each cluster to forward and multicast the voice streams generated by its cluster members to all other clusters. MLDST is able to control the transmission time as well as the loss rate. We evaluate the performance of CMT on the PlanetLab testbed against several existing VoIP conferencing schemes.
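To illustrate the silence suppression step, the sketch below drops audio frames whose energy falls below a threshold, so that only frames that appear to contain speech would be sent. It is a minimal example under stated assumptions and not the mechanism used in CMT: the function names `frame_rms` and `suppress_silence`, the RMS energy test, the 16-bit sample range, and the threshold of 500 are all hypothetical.

```python
import math
from typing import List, Sequence


def frame_rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def suppress_silence(frames: Sequence[Sequence[int]],
                     threshold: float = 500.0) -> List[Sequence[int]]:
    """Keep only frames whose RMS reaches the threshold.

    The threshold is an assumed value for 16-bit samples; a practical
    system would tune it and usually add a hangover period so that the
    ends of words are not clipped.
    """
    return [f for f in frames if frame_rms(f) >= threshold]


# Hypothetical frames: silence, loud speech, faint background noise.
sample_frames = [
    [0, 0, 0, 0],
    [1000, -1000, 1000, -1000],
    [10, -10, 5, -5],
]
to_send = suppress_silence(sample_frames)
print(len(to_send), "of", len(sample_frames), "frames would be transmitted")
```

With one speaker active at a time, a filter of this kind means that only one participant injects voice packets into the network at any moment, which is what removes the need for mixing.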

The experimental results show that the service delay can be effectively kept under 350 ms when there are no more than 20 participants in a VoIP conference. With the assistance of Google's cloud service system, the average delay can be reduced by 50 ms to 70 ms. In practice, the computing devices on the PlanetLab testbed generally perform far worse than their counterparts in the real world. We therefore anticipate that our transmission scheme will support more participants in a real-world VoIP conference.