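Using the new branch that Ben published on the vocal forum, we can add our own audio or video codecs and develop a SIP-based video phone UA. Development was rushed and the intermediate documentation is incomplete, so this post only covers the broad outline. If you have questions, contact me directly at xiaoguizi@gmail.com
The main areas involved in building a video UA on this system are: session setup and control; changes to the SDP handling; and video (capture through V4L interface programming, encode, decode, and understanding and parsing RFCs such as "RTP payload for ***"). If a GUI is needed, it can be built with GTK or Qt. First, let's look at the receive and send data flow in the original vocal code.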
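1. Receive and send data flow
a) MRtpSession->Thread() (a while loop that checks whether the current session's socket has readable data)
If it does, MRtpSession::processIncomingRTP() is called to handle it, and the socket is checked for readability again
Note: to add video, this should be changed to check whether two sockets have readable data, and then hand each one to its own lower-layer handler
Once confirmed, MRtpSession::processRecv(RtpPacket* packet, const NetworkRes& sentBy)
>> in between, (RtpSession*)rtpStack->receive();
(RtpReceiver*)recv->receive() fetches a number of packets and adds them to the local buffer (this includes reordering, inserting silence data, computing jitter, etc.) until no data remains, then returns the packet at the front of the buffer
>> then MRtpSession::processRecv(RtpPacket* packet, const NetworkRes& sentBy)
Unpack
Call (MediaSession*)mySession->processRaw(×××)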
If the data is being sent:
Call (Adaptor*)myRtpSession->sinkData(data, len, cType), which is still handled in MRtpSession, i.e.
Pack
Call MRtpSession::sinkData (char* data, int length, VCodecType type)
Convert the data between codecs, e.g. convert audio to PCM data; there are other considerations as well
Finally call (RtpSession*)rtpStack->transmitRaw(data, length)
Call (RtpTransmitter)tran->transmitRaw(inbuffer, len); // takes raw data, buffers it, and sends network packets
Call RtpTransmitter::transmit(RtpPacket* packet, bool eventFlag) // takes an API RTP packet and sends it to the network
After a series of parameter checks, it goes on to call (UdpStack*)myStack->transmitTo( (char*)p->getHeader(), p->getTotalUsage(), &remoteAddr );
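If the data was received, it is delivered to the sound card (or the display):
Call (MediaDevice*)myMediaDevice->sinkData(data, len, cType); (a virtual function), which actually invokes
LinAudioDevice::sinkData(char* data, int length, VCodecType type)
Decode and write to the sound card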
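The note in the receive flow about checking two sockets could look roughly like the sketch below. It is not vocal code: pollMediaSockets and MediaCounters are made-up names, and the counters only stand in for handing the packet to the audio or video handler. It only shows one select() call watching both descriptors.

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>

// Hypothetical stand-in for "hand the packet to the audio/video handler".
struct MediaCounters {
    int audioPackets = 0;
    int videoPackets = 0;
};

// Waits up to timeoutMs for either socket to become readable and reads at
// most one datagram from each ready socket. Returns the number of ready
// sockets, or 0 on timeout or error.
int pollMediaSockets(int audioFd, int videoFd, int timeoutMs,
                     MediaCounters& counters)
{
    fd_set readFds;
    FD_ZERO(&readFds);
    FD_SET(audioFd, &readFds);
    FD_SET(videoFd, &readFds);
    int maxFd = audioFd > videoFd ? audioFd : videoFd;

    timeval tv;
    tv.tv_sec = timeoutMs / 1000;
    tv.tv_usec = (timeoutMs % 1000) * 1000;

    int ready = select(maxFd + 1, &readFds, nullptr, nullptr, &tv);
    if (ready <= 0) {
        return 0;
    }

    char buf[2048];
    if (FD_ISSET(audioFd, &readFds) && recv(audioFd, buf, sizeof(buf), 0) > 0) {
        ++counters.audioPackets;   // real code: audio receive path
    }
    if (FD_ISSET(videoFd, &readFds) && recv(videoFd, buf, sizeof(buf), 0) > 0) {
        ++counters.videoPackets;   // real code: video receive path
    }
    return ready;
}
```

In the real stack the two descriptors would come from the audio and video UdpStack instances, and each branch would call into its own receive path instead of bumping a counter.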
Below we describe in detail how to make the modifications
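2. How to modify the code for a multi-session (A/V/Message) environment
a) MRtpSession controls MediaSession, RtpSession, etc.
b) MediaSession is responsible for
c) RtpSession is responsible for
d)
To create a multi-session environment, we need to create MRtpSubSession inside MRtpSession to take over the tasks of the existing MRtpSession, while the reworked MRtpSession coordinates the individual MRtpSubSession instances and their relationship with the protocol. We therefore need to:
a) Create the MRtpSubSession class, which is responsible for the connection to MediaSession and RtpSession
b) Create the MRtpSubsessionIterator class, which handles the link between MRtpSession and MRtpSubSession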
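A rough sketch of this split is shown below. The class names SubSessionSketch and MultiSessionSketch, and all of their members, are invented for illustration; they are not the real MRtpSubSession or MRtpSession interfaces. The idea is simply one sub-session per medium, with the parent owning the collection and exposing iteration over it.

```cpp
#include <cstddef>
#include <vector>

enum class MediaKind { Audio, Video, Message };

// Stands in for MRtpSubSession: one medium, one local port.
class SubSessionSketch {
public:
    SubSessionSketch(MediaKind kind, int localPort)
        : kind_(kind), localPort_(localPort) {}
    MediaKind kind() const { return kind_; }
    int localPort() const { return localPort_; }
private:
    MediaKind kind_;
    int localPort_;
};

// Stands in for the reworked MRtpSession: owns the sub-sessions and lets
// callers walk them (the role the notes give to MRtpSubsessionIterator).
class MultiSessionSketch {
public:
    typedef std::vector<SubSessionSketch>::iterator iterator;

    void add(MediaKind kind, int localPort) {
        subSessions_.push_back(SubSessionSketch(kind, localPort));
    }

    // Returns the first sub-session of the given kind, or nullptr.
    // The pointer is only valid until the next add().
    SubSessionSketch* find(MediaKind kind) {
        for (std::size_t i = 0; i < subSessions_.size(); ++i) {
            if (subSessions_[i].kind() == kind) {
                return &subSessions_[i];
            }
        }
        return nullptr;
    }

    iterator begin() { return subSessions_.begin(); }
    iterator end() { return subSessions_.end(); }

private:
    std::vector<SubSessionSketch> subSessions_;
};
```

A real sub-session would also hold its MediaSession and RtpSession links; they are left out here to keep the ownership structure visible.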
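3. MediaController class notes
Constructor
Initialization creates UdpStack instances based on the port range [should be changed to create as many UdpStack instances as there are "m=" media lines]
Create the supported codec list [modify the default list to add the video codec], then register the list
createSession
Creates the MediaSession and sets its default attributes??
a) CreateSessionImpl obtains an available local address and port and instantiates MediaSession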
b) NegotiateSdp <Based on the remote SDP and capability, create a compatible Local SDP and reserve associated Media Resources>
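For the change of creating one UdpStack per "m=" line, the first step is knowing how many media lines an SDP body has. The helper below is a minimal sketch, not vocal's SDP parser: mediaTypes is a made-up name and it only does plain string scanning.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the media type ("audio", "video", ...) of every "m=" line in an
// SDP body, in order of appearance. Illustrative only.
std::vector<std::string> mediaTypes(const std::string& sdp)
{
    std::vector<std::string> types;
    std::istringstream in(sdp);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line[line.size() - 1] == '\r') {
            line.erase(line.size() - 1);   // SDP lines end with CRLF
        }
        if (line.compare(0, 2, "m=") != 0) {
            continue;
        }
        std::string::size_type end = line.find(' ', 2);
        if (end == std::string::npos) {
            types.push_back(line.substr(2));
        } else {
            types.push_back(line.substr(2, end - 2));
        }
    }
    return types;
}
```

The size of the returned vector would then decide how many UdpStack instances to create, one per medium.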
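4. RtpReceiver class notes
Constructors (three variants, differing in whether the socket port is specified and whether the UdpStack is copied)
Creates a UdpStack instance
constructRtpReceiver initializes variables such as the receive count, jitter buffer properties, format, baseSampleRate, etc.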
RtpReceiver::receive(RtpPacket& pkt, fd_set* fds)
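Reads socket data according to the socket descriptors in the fd_set. The original code loops to read every packet currently on the socket, inserts them into the jitterBuffer, and returns the frontmost packet
Does this need to change?
Flow
a) Check whether the local socket descriptor is in fds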
b) getPacket(pkt);
int RtpReceiver::getPacket(RtpPacket& pkt)
a. (UdpStack*)myStack->receiveFrom(*)
b. Check that the packet is valid
c. Check if rtp event [rtpPayloadDTMF_RFC2833 || rtpPayloadCiscoRtp]
Yes
[payloadType is ok, then getDTMFInterface() ok, then recvEvent(pkt)]
No
d. updateSource(pkt)
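Add the SSRC if it does not exist.
Check whether the packet has wrapped; if yes, drop it
Check whether it is a CN (comfort noise) packet and drop it if so [this item and the timestamp update below should work differently for video]
Validate the packet, e.g. its SN and timestamp [tied to sampleSize]; also, if the packet is from long ago, discard it
BTW: [both items above need to change, because the packet is only inserted into the jitterBuffer after this step; audio packets are of equal length but video packets are not, so the change has to be made after step c)]
c) Get the local time and determine whether a byte-order conversion is needed
d) Insert the packet into the jitterBuffer [size adjustable] according to its SN, and insert the corresponding silenceData according to the timestamp [video must be handled differently; this needs a big revision]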
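To show why video needs different handling here, the sketch below reassembles variable-length packets into frames instead of padding gaps with silence. It is not RtpReceiver code: VideoPacket and FrameAssembler are made-up names. It assumes that all packets of a frame share one RTP timestamp, that the last packet of a frame carries the marker bit, and that the very first packet seen starts a frame.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// True if 16-bit sequence number a comes before b, allowing for wrap-around.
inline bool seqLess(uint16_t a, uint16_t b)
{
    uint16_t d = static_cast<uint16_t>(b - a);
    return d != 0 && d < 0x8000;
}

struct VideoPacket {
    uint16_t seq;
    uint32_t timestamp;
    bool marker;                    // set on the last packet of a frame
    std::vector<uint8_t> payload;
};

class FrameAssembler {
public:
    FrameAssembler() : haveNext_(false), nextSeq_(0) {}

    // Adds one packet. Returns true and fills 'frame' once every packet of
    // that packet's frame has arrived; no silence or filler is inserted.
    bool push(const VideoPacket& pkt, std::vector<uint8_t>& frame)
    {
        if (!haveNext_) {           // assumption: first packet starts a frame
            nextSeq_ = pkt.seq;
            haveNext_ = true;
        }
        std::vector<VideoPacket>& parts = pending_[pkt.timestamp];
        std::vector<VideoPacket>::iterator it = parts.begin();
        while (it != parts.end() && seqLess(it->seq, pkt.seq)) {
            ++it;
        }
        if (it != parts.end() && it->seq == pkt.seq) {
            return false;           // duplicate
        }
        parts.insert(it, pkt);      // keep packets ordered by sequence number

        if (parts.front().seq != nextSeq_ || !parts.back().marker) {
            return false;           // start or end of the frame still missing
        }
        uint16_t span = static_cast<uint16_t>(parts.back().seq - parts.front().seq);
        if (parts.size() != static_cast<std::size_t>(span) + 1) {
            return false;           // gap in the middle
        }
        frame.clear();
        for (std::size_t i = 0; i < parts.size(); ++i) {
            frame.insert(frame.end(), parts[i].payload.begin(), parts[i].payload.end());
        }
        nextSeq_ = static_cast<uint16_t>(parts.back().seq + 1);
        pending_.erase(pkt.timestamp);
        return true;
    }

private:
    std::map<uint32_t, std::vector<VideoPacket> > pending_;
    bool haveNext_;
    uint16_t nextSeq_;
};
```

The sketch never gives up on an incomplete frame; a real receiver would also need a timeout or loss policy so that one lost packet does not stall every later frame.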
5. Add H263codec.cxx (under rtp/codec)
call sip:010@10.10.5.59
Add a routine to handle the special header
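These notes do not say which payload format the special header belongs to. As one possibility, the sketch below assumes RFC 2190 Mode A, whose H.263 payload header is 4 bytes placed in front of the bitstream data. The struct and function names are made up and this is not the code in H263codec.cxx.

```cpp
#include <cstddef>
#include <cstdint>

// Fields of the RFC 2190 Mode A payload header (assumed format).
struct H263ModeAHeader {
    bool pbFrames;       // P: PB-frames mode
    unsigned sbit;       // bits to ignore at the start of the payload
    unsigned ebit;       // bits to ignore at the end of the payload
    unsigned src;        // source format, e.g. 2 = QCIF, 3 = CIF
    bool interCoded;     // I: false = intra-coded, true = inter-coded
    unsigned dbq;        // DBQUANT
    unsigned trb;        // temporal reference for the B frame
    unsigned tr;         // temporal reference for the P frame
};

// Parses the 4-byte Mode A header. Returns false if the buffer is too short
// or the F bit says the packet uses Mode B or C instead.
bool parseH263ModeA(const uint8_t* data, std::size_t len, H263ModeAHeader& h)
{
    if (data == nullptr || len < 4) {
        return false;
    }
    if (data[0] & 0x80) {            // F = 1: Mode B or C, not handled here
        return false;
    }
    h.pbFrames   = (data[0] & 0x40) != 0;
    h.sbit       = (data[0] >> 3) & 0x07;
    h.ebit       = data[0] & 0x07;
    h.src        = (data[1] >> 5) & 0x07;
    h.interCoded = (data[1] & 0x10) != 0;
    // The U, S, A flags and the reserved bits are skipped in this sketch.
    h.dbq        = (data[2] >> 3) & 0x03;
    h.trb        = data[2] & 0x07;
    h.tr         = data[3];
    return true;
}
```

If the stack uses the H.263+ format of RFC 4629 instead, the header is laid out differently and this parser does not apply.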
Part I: Call Flow Setup
CallAgent::startSession
>> Based on the localSdp? Maybe based on nothing?? Communication has not begun yet at this point, so the media device is added later
mDevice = facade->getMediaDevice(); --------------------------------------- (1)
MediaController::instance().addDeviceToSession(sessionId, mDevice); ----------(2)
----------------------------------------------------------------------------------------------------------------------
(1) UaFacade::getMediaDevice()
myMediaDevice = new MediaDeviceSet()
(2) MediaController::addDeviceToSession(…)
(MediaSession) mSession->addToSession(mDevice);
(2.1) MediaSession::addToSession( Sptr<MediaDevice> mDevice)
myMediaDevice = mDevice; (the MediaSession then uses this variable from its thread to control the device)
----------------------------------------------------------------------------------------------------------------------
VV_MediaDeviceSet::addMediaDevice
>> This class was added later. Modeled on MediaDevice, it uses a single structure,
typedef struct
{
DeviceThread* pThread;
MediaDevice* pDevice;
}MediaDeviceInstance;
to keep each device together with its thread. In sip\gua\UaFacade.cxx, a MediaDevice variable is created and assigned new MediaDeviceSet(); at run time, polymorphism routes each call to the target device
Figure 1.1 Create the Device
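The polymorphism mentioned above can be pictured with the sketch below. All names (SketchDevice, SketchDeviceSet, CountingDevice, the accepts() method) are invented for illustration and are not the real MediaDevice or VV_MediaDeviceSet interfaces. The caller holds a base-class reference, and the set forwards each call to whichever member device handles that kind of data.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

enum class SketchCodec { Audio, Video };

// Stands in for the MediaDevice base class.
class SketchDevice {
public:
    virtual ~SketchDevice() {}
    virtual bool accepts(SketchCodec type) const = 0;
    virtual void sinkData(const char* data, int length, SketchCodec type) = 0;
};

// A trivial device that only counts the bytes it receives.
class CountingDevice : public SketchDevice {
public:
    explicit CountingDevice(SketchCodec type) : type_(type), bytesSeen_(0) {}
    bool accepts(SketchCodec type) const override { return type == type_; }
    void sinkData(const char*, int length, SketchCodec) override { bytesSeen_ += length; }
    int bytesSeen() const { return bytesSeen_; }
private:
    SketchCodec type_;
    int bytesSeen_;
};

// Stands in for MediaDeviceSet: it is itself a device, so callers that only
// know the base class reach the audio or video device through it.
class SketchDeviceSet : public SketchDevice {
public:
    void addDevice(std::shared_ptr<SketchDevice> device) { devices_.push_back(device); }

    bool accepts(SketchCodec type) const override {
        for (std::size_t i = 0; i < devices_.size(); ++i) {
            if (devices_[i]->accepts(type)) return true;
        }
        return false;
    }

    void sinkData(const char* data, int length, SketchCodec type) override {
        for (std::size_t i = 0; i < devices_.size(); ++i) {
            if (devices_[i]->accepts(type)) {
                devices_[i]->sinkData(data, length, type);
            }
        }
    }

private:
    std::vector<std::shared_ptr<SketchDevice> > devices_;
};
```

The per-device thread from MediaDeviceInstance is left out; in the real class each entry would pair the device with the DeviceThread that drives it.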
Part II: Device in detail
This part focuses on the devices themselves, which fall into two groups: audio devices and video devices
I. Audio Device
This was built by Vocal itself. We only revised some parts, such as silence data insertion and timestamp recalculation, but the audio device must follow the rules when we create a new session and cooperate with the video device, which may require A/V sync…
See vocal\sip\gua\LinAudioDevice.cxx for the details
::start()
::stop()
::suspend()
::resume()
::sinkData(…)
::process()
II. Video Device
This is the class we added. It currently uses the V4L interface to communicate with the omi-vision driver (open source) and, as noted above, it also cooperates with the audio device.
VV_VideoDevice.cxx (sip\gua)
Its constructor decides several settings, such as whether to use software decoding and the frameBuf size and count.
::start()
devCamera.open();
//
MediaDevice::start();
if (!usingSWDec) //Hard decode
{
my_H263HDThread = new H263HardDecodeThread;
my_H263HDThread->thread();
}
::stop()
MediaDevice::stop(); //Stop the base class
if(!usingSWDec)
{
StopDecode(); //Release hardware decoder
}
::suspend()
not implemented yet…
::resume()
not implemented yet…
::sinkData(…)
Decodes the data with the hardware or software codec and draws the decoded frame on the screen (using GTK)
::process()
This step uses the camera class to get the raw YUV data, encodes it into an H.263 frame, and delivers it to the sending thread… (calls MediaDevice->processRaw, then MediaSession->processRaw, then …)
pData = devCamera.getFrameData((unsigned int *)&dataLen); //the driver interface
pData = H263EncoderGetFrameIntraEx(pData, &dataLen); // 263 encoder
if(frameBuf.setData(pData, dataLen) != true) … // make sure
processOutgoing(); // Put the 263 frame out
Figure 2.1 Send out frame flow
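Once a frame is encoded, the send path has to cut it into RTP-sized pieces, since one video frame is usually larger than a single packet. The sketch below shows only that generic step. splitFrame and OutgoingChunk are made-up names, not what processOutgoing() does internally, and a real H.263 payload format additionally prepends its own payload header and may restrict where a frame can be split.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct OutgoingChunk {
    std::vector<uint8_t> payload;
    bool marker;   // true on the last chunk of the frame (RTP marker bit)
};

// Splits one encoded frame into chunks of at most maxPayload bytes.
std::vector<OutgoingChunk> splitFrame(const uint8_t* frame, std::size_t len,
                                      std::size_t maxPayload)
{
    std::vector<OutgoingChunk> chunks;
    if (frame == nullptr || len == 0 || maxPayload == 0) {
        return chunks;
    }
    for (std::size_t off = 0; off < len; off += maxPayload) {
        std::size_t n = std::min(maxPayload, len - off);
        OutgoingChunk c;
        c.payload.assign(frame + off, frame + off + n);
        c.marker = (off + n == len);
        chunks.push_back(c);
    }
    return chunks;
}
```

All chunks of one frame would be sent with the same RTP timestamp and consecutive sequence numbers, which is what lets the receiver put the frame back together.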
VV_VideoCamera.cxx (sip\gua)
::open();
Performs the initialization required by the video device driver and the V4L interface
Most of the work is done in VV_VideoCapture
::close();
Release the video device
//
char *getFrameData(unsigned int *frameSize);
Gets one frame in an infinite loop
VV_VideoCapture.cxx (sip\gua)
This is the V4L interface code; the open-source MPEG4IP project is a good reference example…
int initdevice();
void ReleaseDevice();
int SetPictureControls();
int setPicturePro();
int AcquireFrameEx(char *pFrameBuf, unsigned int *frameSize);
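For readers trying this on a current system: the original V4L API used here has since been removed from Linux kernels, so the sketch below uses its successor, V4L2, instead. It is not the VV_VideoCapture code and probeCaptureDevice is a made-up name. It only shows the first step an initdevice()-style function would take: open the device node and ask the driver what it can do.

```cpp
#include <fcntl.h>
#include <linux/videodev2.h>
#include <sys/ioctl.h>
#include <unistd.h>

#include <cstring>
#include <string>

struct CaptureInfo {
    bool ok;            // device opened and answered VIDIOC_QUERYCAP
    std::string card;   // device name reported by the driver
    bool canCapture;    // driver advertises video capture
};

// Opens a V4L2 device node (e.g. "/dev/video0") and queries its
// capabilities. On any failure, 'ok' stays false.
CaptureInfo probeCaptureDevice(const char* path)
{
    CaptureInfo info;
    info.ok = false;
    info.canCapture = false;

    int fd = open(path, O_RDWR);
    if (fd < 0) {
        return info;
    }

    v4l2_capability cap;
    std::memset(&cap, 0, sizeof(cap));
    if (ioctl(fd, VIDIOC_QUERYCAP, &cap) == 0) {
        info.ok = true;
        info.card = reinterpret_cast<const char*>(cap.card);
        info.canCapture = (cap.capabilities & V4L2_CAP_VIDEO_CAPTURE) != 0;
    }
    close(fd);
    return info;
}
```

Setting the picture format and acquiring frames would follow from here with further ioctl calls; those steps are not shown.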