Problem Description
I have written the following web app to perform pose detection on two videos. The idea is to, say, give a benchmark video in the first and a user video (either a pre-recorded one or their webcam feed) in the second, and compare the movements of the two.
import dash, cv2
import dash_core_components as dcc
import dash_html_components as html
import mediapipe as mp
from flask import Flask, Response

mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose


class VideoCamera(object):
    def __init__(self, video_path):
        self.video = cv2.VideoCapture(video_path)

    def __del__(self):
        self.video.release()

    def get_frame(self):
        with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
            success, image = self.video.read()
            # Recolor image to RGB
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            image.flags.writeable = False
            # Make detection
            results = pose.process(image)
            # Recolor back to BGR
            image.flags.writeable = True
            image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
            # Render detections
            mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS,
                                      mp_drawing.DrawingSpec(color=(245, 117, 66), thickness=2, circle_radius=2),
                                      mp_drawing.DrawingSpec(color=(245, 66, 230), thickness=2, circle_radius=2)
                                      )
            _, jpeg = cv2.imencode('.jpg', image)
            return jpeg.tobytes()


def gen(camera):
    while True:
        frame = camera.get_frame()
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n')


server = Flask(__name__)
app = dash.Dash(__name__, server=server)


@server.route('/video_feed_1')
def video_feed_1():
    return Response(gen(VideoCamera(0)),
                    mimetype='multipart/x-mixed-replace; boundary=frame')


@server.route('/video_feed_2')
def video_feed_2():
    return Response(gen(VideoCamera(0)),
                    mimetype='multipart/x-mixed-replace; boundary=frame')


app.layout = html.Div([
    html.Img(src="/video_feed_1", style={'width': '40%', 'padding': 10}),
    html.Img(src="/video_feed_2", style={'width': '40%', 'padding': 10})
])

if __name__ == '__main__':
    app.run_server(debug=True)
However, when I run this code, the fans on my laptop kick in and nothing renders in the browser. It works fine with any single video, but it seems unable to handle two at once. You can remove either of the two functions, video_feed_1() or video_feed_2(), and you can also replace the video path 0 (i.e., the webcam) with the path to any other video (e.g., /path/to/video.mp4), and it works fine.
Also, when I simply display the two videos in the browser without pose detection, that too works fine. You can try this out yourself by replacing the get_frame() function in the class above with the following:
def get_frame(self):
    success, image = self.video.read()
    ret, jpeg = cv2.imencode('.jpg', image)
    return jpeg.tobytes()
So, how do I reduce the load on the browser when rendering the pose estimation of two videos simultaneously? And why is the load so high in the browser at all, when it works perfectly fine if the pose estimations are rendered in two pop-up windows (i.e., with cv2.imshow())?
Recommended Answer
For a task that requires real-time updates, like your pose estimation, I would recommend using websockets for communication. Here is a small example in which a Quart server streams the data via websockets to a Dash frontend:
import asyncio
import base64
import dash, cv2
import dash_html_components as html
import mediapipe as mp
import threading
from dash.dependencies import Output, Input
from quart import Quart, websocket
from dash_extensions import WebSocket

mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose


class VideoCamera(object):
    def __init__(self, video_path):
        self.video = cv2.VideoCapture(video_path)

    def __del__(self):
        self.video.release()

    def get_frame(self):
        with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
            success, image = self.video.read()
            # Recolor image to RGB
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            image.flags.writeable = False
            # Make detection
            results = pose.process(image)
            # Recolor back to BGR
            image.flags.writeable = True
            image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
            # Render detections
            mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS,
                                      mp_drawing.DrawingSpec(color=(245, 117, 66), thickness=2, circle_radius=2),
                                      mp_drawing.DrawingSpec(color=(245, 66, 230), thickness=2, circle_radius=2)
                                      )
            _, jpeg = cv2.imencode('.jpg', image)
            return jpeg.tobytes()


# Setup small Quart server for streaming via websocket, one for each stream.
server = Quart(__name__)
n_streams = 2


async def stream(camera, delay=None):
    while True:
        if delay is not None:
            await asyncio.sleep(delay)  # add delay if CPU usage is too high
        frame = camera.get_frame()
        await websocket.send(f"data:image/jpeg;base64, {base64.b64encode(frame).decode()}")


@server.websocket("/stream0")
async def stream0():
    camera = VideoCamera("./kangaroo.mp4")
    await stream(camera)


@server.websocket("/stream1")
async def stream1():
    camera = VideoCamera("./yoga.mp4")
    await stream(camera)


# Create small Dash application for UI.
app = dash.Dash(__name__)
app.layout = html.Div(
    [html.Img(style={'width': '40%', 'padding': 10}, id=f"v{i}") for i in range(n_streams)] +
    [WebSocket(url=f"ws://127.0.0.1:5000/stream{i}", id=f"ws{i}") for i in range(n_streams)]
)

# Copy data from websockets to Img elements.
for i in range(n_streams):
    app.clientside_callback("function(m){return m? m.data : '';}",
                            Output(f"v{i}", "src"), Input(f"ws{i}", "message"))

if __name__ == '__main__':
    threading.Thread(target=app.run_server).start()
    server.run()
While this solution performs significantly better (at least on my laptop), resource usage is still high, so I added a delay parameter that makes it possible to trade frame rate for lower resource usage.
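The effect of the delay parameter can be seen in isolation with plain asyncio. This is only a sketch: throttled_frames is a stand-in for the stream() loop that produces integers instead of camera frames, so the timing can be measured without OpenCV or MediaPipe.

```python
import asyncio
import time


async def throttled_frames(n_frames, delay=None):
    # Same control flow as stream(): optionally sleep before each frame.
    frames = []
    for i in range(n_frames):
        if delay is not None:
            await asyncio.sleep(delay)
        frames.append(i)
    return frames


start = time.monotonic()
frames = asyncio.run(throttled_frames(10, delay=0.05))
elapsed = time.monotonic() - start
# With delay=0.05 the loop is capped at roughly 20 frames per second,
# so producing 10 frames takes at least half a second.
```

With delay=None the loop instead runs as fast as get_frame() can return, which is what drives CPU usage up in the unthrottled version.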