Screenshots and screen recording have become baseline features of every operating system. The rise of live streaming and the pandemic-driven growth of conferencing and remote work all rely on system-level screen recording, and how fast the screen can be captured directly determines how smooth a live stream or meeting feels.
Recently, the major vendors have also shipped a number of screen-capture optimizations.
While working on a remote screen-sharing system built on WebRTC, we ran into a noticeable problem: on macOS, the frame rate during screen sharing would not rise above roughly 20 fps, and sometimes dropped even lower. While tracking the issue down, we read the underlying source code.
When WebRTC takes the system screenshot path on macOS, it runs the following code:
// static
std::unique_ptr<DesktopFrameCGImage> DesktopFrameCGImage::CreateForDisplay(
    CGDirectDisplayID display_id) {
  // Create an image containing a snapshot of the display.
  rtc::ScopedCFTypeRef<CGImageRef> cg_image(CGDisplayCreateImage(display_id));
  if (!cg_image) {
    return nullptr;
  }
  return DesktopFrameCGImage::CreateFromCGImage(cg_image);
}
So WebRTC captures the screen with the CGDisplayCreateImage system API. macOS, however, offers several other capture and recording APIs, so let us compare them one by one.


CGDisplayCreateImage takes a display ID and returns a snapshot of that entire screen. It can be driven in a loop like this:
- (void)CGDisplayCreateImageDesktop {
    uint32_t displayID = [self getDirectDisplayID];
    [self displayInfo:displayID];
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        while (!self.isCancel) {
            CGImageRef image = CGDisplayCreateImage(displayID);
            if (!image) {
                continue;
            }
            // Copy the pixel data out so the lazily-evaluated CGImage is actually realized.
            CGDataProviderRef provider = CGImageGetDataProvider(image);
            CFDataRef data = CGDataProviderCopyData(provider);
            CFRelease(data);  // the copy follows the Create/Copy rule and must be released
            CFRelease(image);
            [self calculateFps];
        }
    });
}
Testing this on a Catalina machine, it reaches roughly 60 fps. CGDisplayCreateImage offers no control over the frame rate, its CPU usage is fairly high, and the mouse cursor cannot be removed from the captured frames.
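The getDirectDisplayID and calculateFps helpers are not part of the listing above; a minimal sketch, assuming hypothetical frameCount and lastFpsTime properties on the same class, could look like this:

// Hypothetical helpers for the loop above: capture the main display,
// and log a frame count once per wall-clock second.
- (uint32_t)getDirectDisplayID {
    return CGMainDisplayID();
}

- (void)calculateFps {
    self.frameCount += 1;
    CFAbsoluteTime now = CFAbsoluteTimeGetCurrent();
    if (now - self.lastFpsTime >= 1.0) {
        NSLog(@"fps: %ld", (long)self.frameCount);
        self.frameCount = 0;
        self.lastFpsTime = now;
    }
}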
If you have done camera capture before, recording the screen with AVCaptureScreenInput will feel very familiar: an AVCaptureSession drives the capture, and the screen frames arrive in a delegate callback. It is used like this:
- (void)captureDesktopWithCaptureScreenInput {
    _captureSession = [[AVCaptureSession alloc] init];
    // Ask for NV12 output so the frames are cheap to hand to an encoder.
    AVCaptureVideoDataOutput* captureOutput = [[AVCaptureVideoDataOutput alloc] init];
    NSString* key = (NSString*)kCVPixelBufferPixelFormatTypeKey;
    NSNumber* val = [NSNumber numberWithUnsignedInt:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange];
    NSDictionary* videoSettings = [NSDictionary dictionaryWithObject:val forKey:key];
    captureOutput.videoSettings = videoSettings;
    if ([_captureSession canAddOutput:captureOutput]) {
        [_captureSession addOutput:captureOutput];
    }
    [captureOutput setSampleBufferDelegate:self
                                     queue:dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)];

    // Remove any existing input before attaching the screen input.
    NSArray* currentInputs = [_captureSession inputs];
    if ([currentInputs count] > 0) {
        AVCaptureInput* currentInput = (AVCaptureInput*)[currentInputs objectAtIndex:0];
        [_captureSession removeInput:currentInput];
    }

    // Now create the capture session input from the first online display.
    uint32_t count = 0;
    CGDirectDisplayID displayIDs[3] = {0};
    CGGetOnlineDisplayList(3, displayIDs, &count);
    AVCaptureScreenInput* newCaptureInput =
        [[AVCaptureScreenInput alloc] initWithDisplayID:displayIDs[0]];
    newCaptureInput.minFrameDuration = CMTimeMake(1, 120);

    // Try to add our new screen input to the capture session.
    [_captureSession beginConfiguration];
    BOOL addedCaptureInput = NO;
    if ([_captureSession canAddInput:newCaptureInput]) {
        [_captureSession addInput:newCaptureInput];
        addedCaptureInput = YES;
    }
    [_captureSession commitConfiguration];

    // The session must be started before any frames are delivered.
    [_captureSession startRunning];
}
- (void)captureOutput:(AVCaptureOutput*)output
    didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
           fromConnection:(AVCaptureConnection*)connection {
    const int kFlags = 0;
    CVImageBufferRef videoFrame = CMSampleBufferGetImageBuffer(sampleBuffer);
    // Lock the pixel buffer while reading it; a real consumer would process
    // the planes between the lock and the unlock.
    if (CVPixelBufferLockBaseAddress(videoFrame, kFlags) != kCVReturnSuccess) {
        return;
    }
    CVPixelBufferUnlockBaseAddress(videoFrame, kFlags);
    count_pic++;
    if (count_pic > 1000) {
        count_pic = 0;
    }
    [self calculateFps];
}
With newCaptureInput.minFrameDuration = CMTimeMake(1, 120); the frame rate is capped at 120 fps. The real rate does not reach that high, but on the same Catalina machine it stays around 100 fps. AVCaptureScreenInput can also control whether the mouse cursor appears in the capture, and its CPU usage is modest.
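For example, to drop the cursor from the captured frames (via the documented capturesCursor property) and cap the rate at 60 fps instead:

// Hide the cursor in the captured frames and cap the rate at 60 fps.
newCaptureInput.capturesCursor = NO;
newCaptureInput.minFrameDuration = CMTimeMake(1, 60);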
The captureOutput callback hands us each frame as a CVImageBufferRef, which is convenient to work with.
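Since we requested kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange above, the buffer is biplanar NV12; a sketch of pulling the planes out inside the callback, between the lock and unlock calls:

// Read the NV12 planes from the frame delivered to captureOutput.
CVImageBufferRef frame = CMSampleBufferGetImageBuffer(sampleBuffer);
CVPixelBufferLockBaseAddress(frame, kCVPixelBufferLock_ReadOnly);
size_t width = CVPixelBufferGetWidth(frame);
size_t height = CVPixelBufferGetHeight(frame);
uint8_t* yPlane = (uint8_t*)CVPixelBufferGetBaseAddressOfPlane(frame, 0);
size_t yStride = CVPixelBufferGetBytesPerRowOfPlane(frame, 0);
uint8_t* uvPlane = (uint8_t*)CVPixelBufferGetBaseAddressOfPlane(frame, 1);
size_t uvStride = CVPixelBufferGetBytesPerRowOfPlane(frame, 1);
// ... feed yPlane/uvPlane to an encoder or converter here ...
CVPixelBufferUnlockBaseAddress(frame, kCVPixelBufferLock_ReadOnly);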
WebRTC itself, however, has no AVCaptureScreenInput-based capturer, so how do we feed these frames in?
It is actually quite simple: if you know how WebRTC pushes camera frames on macOS, you can push the screen frames from captureOutput in exactly the same way. In Swift (here shown with ReplayKit's processSampleBuffer callback), the wiring looks like this:
var peerConnectionFactory: RTCPeerConnectionFactory?
var localVideoSource: RTCVideoSource?
var videoCapturer: RTCVideoCapturer?

func setupVideoCapturer() {
    // localVideoSource and videoCapturer are what the capture callback
    // will push frames into later.
    localVideoSource = self.peerConnectionFactory!.videoSource()
    videoCapturer = RTCVideoCapturer()
    let videoTrack: RTCVideoTrack =
        self.peerConnectionFactory!.videoTrack(with: localVideoSource!, trackId: "100")
    let mediaStream: RTCMediaStream =
        (self.peerConnectionFactory?.mediaStream(withStreamId: "1"))!
    mediaStream.addVideoTrack(videoTrack)
    self.newPeerConnection!.add(mediaStream)
}
override func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, with sampleBufferType: RPSampleBufferType) {
    switch sampleBufferType {
    case RPSampleBufferType.video:
        // Pull the CVPixelBuffer out of the sample buffer.
        let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
        // timeStampNs expects nanoseconds, so scale the epoch seconds by 1e9.
        let timestampNs = Int64(NSDate().timeIntervalSince1970 * 1_000_000_000)
        // Wrap it in an RTCVideoFrame.
        let videoFrame = RTCVideoFrame(pixelBuffer: pixelBuffer,
                                       rotation: RTCVideoRotation._0,
                                       timeStampNs: timestampNs)
        // Connect the video frames to WebRTC.
        localVideoSource!.capturer(videoCapturer!, didCapture: videoFrame)
    default:
        break
    }
}
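Frames from the AVCaptureScreenInput delegate shown earlier can be pushed the same way. Below is a sketch in Objective-C; _videoSource and _videoCapturer are assumed ivars created from an RTCPeerConnectionFactory, mirroring localVideoSource and videoCapturer in the Swift setup:

// Push each screen frame from the AVCaptureSession delegate into WebRTC.
- (void)captureOutput:(AVCaptureOutput*)output
    didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
           fromConnection:(AVCaptureConnection*)connection {
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    RTCCVPixelBuffer* rtcBuffer = [[RTCCVPixelBuffer alloc] initWithPixelBuffer:pixelBuffer];
    // Convert the buffer's presentation time to the nanoseconds WebRTC expects.
    int64_t timeStampNs = (int64_t)(CMTimeGetSeconds(
        CMSampleBufferGetPresentationTimeStamp(sampleBuffer)) * 1000000000);
    RTCVideoFrame* frame = [[RTCVideoFrame alloc] initWithBuffer:rtcBuffer
                                                        rotation:RTCVideoRotation_0
                                                     timeStampNs:timeStampNs];
    [_videoSource capturer:_videoCapturer didCaptureVideoFrame:frame];
}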