NAOロボットゴルフにおけるビジョンシステム設計

2022-03-15 06:11:41

昨年（2017年）、江蘇省と全国でそれぞれ開催されたNAOロボットゴルフ大会に参加し、ビジョン部分のプログラミングを担当しました。以下、前作の概要です。内容は主に、赤いボールと黄色いショットの認識と位置決めです（競技中に遭遇した問題と解決策も含みます）。完全なコード（C++版とPython版の両方）は、以下のサイトで公開されています。 https://github.com/ZhouJiaHuan/nao-golf-visual-task この記事では、例としてPythonのコードについてだけ説明します。

基本構成(ベースクラス)

コードの実装では、拡張しやすいように、まず、最も基本的な設定情報の一部を定義する基底クラスを定義し、次に、赤玉黄玉検出と黄棒検出のためのビジョンクラス（モーションコードを書くと、モーション用のクラスも導出できます）を導出しました。ベースクラスは以下のように定義されています。

# date: 1/15/2017
# description: basic class for all Nao tasks.
## ---------------------------------------------------------------------

import sys
# sys.path.append("/home/meringue/Softwares/pynaoqi-sdk/") # naoqi directory
from naoqi import ALProxy

class ConfigureNao(object):
    """
    a basic class for all nao tasks, including motion, bisualization etc.
    """
    def __init__(self, IP):
        self._IP = IP
        self._PORT = 9559
        self._cameraProxy = ALProxy("ALVideoDevice", self._IP, self._PORT)
        self._motionProxy = ALProxy("ALMotion", self._IP, self._PORT)
        self._postureProxy = ALProxy("ALRobotPosture", self._IP, self._PORT)
        self._tts = ALProxy("ALTextToSpeech",self._IP, self._PORT)
        self._memoryProxy = ALProxy("ALMemory", self._IP, self._PORT)

コードからわかるように、ベースクラスConfigureNaoは主にIPやポート番号などの基本的な設定情報を含み、NAOライブラリに付属するクラス（ビジュアル、音声、モーションなど）で使用する必要のあるいくつかのオブジェクトを作成します。後でビジョンクラスを定義するとき、ConfigureNao クラスを直接継承することができます。

Visionモジュール

Vision base class - ビジョンタスクのベースとなるクラスです。

ビジョンタスク（赤いボールの検出、黄色いバーの検出）はすべて基本的な機能（カメラからのデータ取得など）を共有する必要があるため、もう一つのビジョンベースクラスであるVisualBasisを定義し、ConfigureNaoから継承して使用します。その定義は以下の通りです。

class VisualBasis(ConfigureNao):
    """
    a basic class for visual tasks.
    """

    def __init__(self, IP, cameraId, resolution=vd.kVGA):
        """
        initilization. 
        Args:
            IP: NAO's IP
            cameraId: bottom camera (1,default) or top camera (0).
            resolution: kVGA, default: 640*480)
        Return: none
        """     
        super(VisualBasis, self). __init__(IP)
        self._cameraId = cameraId
        self._resolution = resolution

        self._colorSpace = vd.kBGRColorSpace
        self._fps = 20

        self._frameHeight = 0
        self._frameWidth = 0
        self._frameChannels = 0
        self._frameArray = None

        self._cameraPitchRange = 47.64/180*np.pi
        self._cameraYawRange = 60.97/180*np.pi
        self._cameraProxy.setActiveCamera(self._cameraId)


    def updateFrame(self, client="python_client"):
        """
        get a new image from the specified camera and save it in self._frame.
        Args:
            client: client name.
        Return: none.
        """

        """
        if self._cameraProxy.getActiveCamera() == self._cameraId:
            print("current camera has been activated.")
        else:
            self._cameraProxy.setActiveCamera(self._cameraId)
        """
        self._videoClient = self._cameraProxy.subscribe(client, self._resolution, self._colorSpace, self._fps)
        frame = self._cameraProxy.getImageRemote(self._videoClient)
        self._cameraProxy.unsubscribe(self._videoClient)

        try:
            self._frameWidth = frame[0]
            self._frameHeight = frame[1]
            self._frameChannels = frame[2]
            self._frameArray = np.frombuffer(frame[6], dtype=np.uint8).reshape([frame[1],frame[0],frame[2]])
        except IndexError:
            raise

    def getFrameArray(self):
        """
        get current frame.
        Return: 
            current frame array (numpy array).
        """
        if self._frameArray is None:
            return np.array([])
        return self._frameArray

    def showFrame(self):
        """
        show current frame image.
        """

        if self._frameArray is None:
            print("please get an image from Nao with the method updateFrame()")
        else:
            cv2.imshow("current frame", self._frameArray)

    def printFrameData(self):
        """
        print current frame data.
        """
        print("frame height = ", self._frameHeight)
        print("frame width = ", self._frameWidth)
        print("frame channels = ", self._frameChannels)
        print("frame shape = ", self._frameArray.shape)

    def saveFrame(self, framePath):
        """
        Save the current frame to the specified direction.      
        Arguments:
            framePath: image path.
        """

        cv2.imwrite(framePath, self._frameArray)
        print("current frame image has been saved in", framePath)

    def setParam(self, paramName=None, paramValue = None):
        raise NotImplementedError

    def setAllParamsToDefault(self):
        raise NotImplementedError

ビジュアルベースクラスでは、いくつかのデフォルトカメラパラメータを定義するほか、指定したカメラからのフレーム取得、現在保存されている画像データの返却、現在の画像の表示、現在の画像のローカル保存、および後で使用するいくつかの関数などの基本的なメンバー関数も定義されていますので、ここでは言い訳を残しておきます。このクラスはいくつかの簡単な関数なので、ここで紹介することはあまりありません。

上記のビジョンクラスが定義されたことで、以下では別途説明する赤玉検出クラスと黄棒検出クラスの導出に進むことができます。

赤玉検出クラス

実際、NAOの公式ライブラリは赤いボールを認識するためのAPIを提供していますが、私たちがテストしたところ、それは非常に非効率的で、いくつかのディストラクターに対して非常に脆弱であることが判明しました。そこで私たちは、OpenCVをベースにした赤玉認識用の独自のコードを書くことを計画しています。最もシンプルなアイデアは、色閾値セグメンテーション＋ホフサークル検出ですが、テスト中にこの2ステップだけでは検出結果が安定しないことが分かったので、これをベースにゲーム環境用に改良しました。

赤玉検出クラスはBallDetectと呼ばれ、VisualBasisクラスを継承しているため、VisualBasisクラスの基本プロパティとメンバ関数がすでに含まれているので、それをベースに赤玉検出に必要なメンバ関数を書き続けるだけです。

赤玉の基本プロパティ

赤いボールに関連する情報で、保存しておく必要があるものは2つあります。(1）画像内の赤玉の位置に関する情報、（2）ロボット座標系に対する赤玉の位置に関する情報です。そこで、まずこの2つの属性を以下のように定義します。

def __init__(self, IP, cameraId=vd.kBottomCamera, resolution=vd.kVGA):
    """
    initialization.
    """
    super(BallDetect, self). __init__(IP, cameraId, resolution)
    self._ballData = {
"centerX":0, "centerY":0, "radius":0}
    self._ballPosition= {
"disX":0, "disY":0, "angle":0}
    self._ballRadius = 0.05

画像内の赤いボールの位置と実際の位置がそれぞれ"_ballData"と"_ballPosition"に格納され、初期値はどちらも0に設定されています。

画像の前処理

この機能は、画像を特定のチャンネルに分離してフィルタリングするもので、RGB空間とHSV空間に別々の前処理関数を書きました。

RGB space preprocessing functions

def _getChannelAndBlur(self, color):
    """
    get the specified channel and blur the result.

    Arguments:
        color: the color channel to split, only supports the color of red, geen and blue.   
    Return: 
        the specified color channel or None (when the color is not supported).
    """
    try:
        channelB = self._frameArray[:,:,0]
        channelG = self._frameArray[:,:,1]
        channelR = self._frameArray[:,:, 2]
    except:
        raise Exception("no image detected!")

    Hm = 6
    if color == "red":
        channelB = channelB*0.1*Hm
        channelG = channelG*0.1*Hm
        channelR = channelR - channelB - channelG
        channelR = 3*channelR
        channelR = cv2.GaussianBlur(channelR, (9,9), 1.5)
        channelR[channelR<0] = 0
        channelR[channelR>255] = 255
        return np.uint8(np.round(channelR))

    elif color == "blue":
        channelR = channelR*0.1*Hm
        channelG = channelG*0.1*Hm
        channelB = channelB - channelG - channelR
        channelB = 3*channelB            
        channelB = cv2.GaussianBlur(channelB, (9,9), 1.5)
        channelB[channelB<0] = 0
        channelB[channelB>255] = 255
        return np.uint8(np.round(channelB))

    elif color == "green":
        channelB = channelB*0.1*Hm
        channelR= channelR*0.1*Hm
        channelG = channelG - channelB - channelR
        channelG = 3*channelG
        channelG = cv2.GaussianBlur(channelG, (9,9), 1.5)
        channelG[channelG<0] = 0
        channelG[channelG>255] = 255
        return np.uint8(np.round(channelG))

    else:
        print("can not recognize the color!")
        print("supported color: red, green and blue.")
        return None

Although it says red ball detection, I also added green and blue to account for code generality. Enhancing the value of the red channel and cutting the results in the other spaces when separating makes the separation better. Also adding Gaussian filtering makes the local information blurred in favor of the Hough circle detection.

HSV spatial preprocessing function

def _binImageHSV(self, color):
    """
    get binary image from the HSV image (transformed from BGR image)
    Args:
        color: the color for binarization.
    Return:
        binImage: binary image.
    """
    try:
        frameArray = self._frameArray.copy()
        imgHSV = cv2.cvtColor(frameArray, cv2.COLOR_BGR2HSV)
    except:
        raise Exception("no image detected!")

    if color == "red":
        minHSV1=np.array([0,43,46])
        maxHSV1=np.array([10,255,255])
        minHSV2=np.array([156,43,46])
        maxHSV2=np.array([180,255,255])
        frameBin1 = cv2.inRange(imgHSV, minHSV1, maxHSV1)
        frameBin2 = cv2.inRange(imgHSV, minHSV2, maxHSV2)
        frameBin = np.maximum(frameBin1, frameBin2)
        return frameBin
    else:
        raise Exception("not recognize the color!")

The input to this function is still an RGB-space image, which is internally converted to HSV-space for processing. The inRange() function in the OpenCV library is used here for binarization, and since there are two HSV ranges for the red color, the code is implemented as a "merge" operation. The function implementation only provides the extraction of red color, but it can be extended to other color preprocessing, and an experimental range of values for each color HSV space is given here.

在实验测试的时候，发现在一般的情况下，两个空间都能准确地把红球分割出来，但在光线条件较差的时候（较强或较弱），发现HSV空间更加稳定。

红球识别

图像预处理后，图像上会分割出红球所在区域和其他的一些噪声。理想情况下，红球所在的区域分割结果应该是一个圆（椭圆），这里我是直接通过OpenCV库中的霍夫圆检测函数实现的：

def _findCircles(self, img, minDist, minRadius, maxRadius):
    """
    detect circles from an image.
    Arguments:
        img: image to be detected.
        minDist: minimum distance between the centers of the detected circles.
        minRadius: minimum circle radius.
        maxRadius: maximum circle radius.

    Return: an uint16 numpy array shaped circleNum*3 if circleNum>0, ([[circleX, circleY,radius]])
            else return None.
    """
    circles = cv2.HoughCircles(np.uint8(img), cv2.HOUGH_GRADIENT, 1, minDist, 
                                param1=150, param2=15, minRadius=minRadius, maxRadius=maxRadius)

    if circles is None:
        return np.uint16([])
    else:
        return np.uint16(np.around(circles[0, ]))

根据比赛用球的大小要求可以大概限制一下红球在图像中的半径范围（和分辨率有关），代码中的参数是基于640×480的分辨率设置的。需要注意的是，经过上述霍夫圆检测到的球可能有多个（可能包含了一些噪声），因此还应该对结果进一步的判断。

红球筛选

经过红球识别的结果可能有如下2种情况：

图像中没有检测到球。
图像中检测到一个或者多个球。

第一种情况不需要讨论，只需要返回没有球的信息即可。对于第二种情况，我们需要对每一个检测出的红球进行二次判断。因为在比赛现场，NAO机器人最多只应该检测到一个球。因此，针对第二种情况，我给出的筛选方法如下：

对于每一个检测出的红球，以红球圆心为中心，以红球的4倍半径为边长画一个外围正方形，计算外接正方形区域内红色和绿色像素点所占的比值。一个简单的示意图如下：

最理想的情况下，红色像素的比例为 π r 2 / 16 r 2 = 0.196 π r 2 / 16 r 2 = 0.196 ，绿色像素所占的比例为 0.804 0.804 ，但在实际检测的时候，存在各种不确定因素（圆检测误差、光线不均匀导致颜色信息发生变化、其他干扰物的影响等），几乎不可能达到理想情况。因此，在具体的实现上，我们需要把条件设定得宽松点。实现代码如下：

def _selectCircle(self, circles):
    """
    select one circle in list type from all circles detected. 
    Args:
        circles: numpy array shaped (N, 3),　N is the number of circles.
    Return:
        selected circle or None (no circle is selected).
    """

    if len(circles) == 0 :
        return circles

    if circles.shape[0] == 1:
        centerX = circles[0][0]
        centerY = circles[0][1]
        radius = circles[0][2]
        initX = centerX - 2*radius
        initY = centerY - 2*radius
        if initX<0 or initY<0 or (initX+4*radius)>self._frameWidth or (initY+4*radius)>self._frameHeight or radius<1:
            return circles

    channelB = self._frameArray[:,:,0]
    channelG = self._frameArray[:,:,1]
    channelR = self._frameArray[:,:,2]

    rRatioMin = 1.0; circleSelected = np.uint16([])
    for circle in circles:
        centerX = circle[0]
        centerY = circle[1]
        radius = circle[2]
        initX = centerX - 2*radius
        initY = centerY - 2*radius

        if initX<0 or initY<0 or (initX+4*radius)>self._frameWidth or (initY+4*radius)>self._frameHeight or radius<1:
            continue

        rectBallArea = self._frameArray[initY:initY+4*radius+1, initX:initX+4*radius+1,:]
        bFlat = np.float16(rectBallArea[:,:,0].flatten())
        gFlat = np.float16(rectBallArea[:,:,1].flatten())
        rFlat = np.float16(rectBallArea[:,:,2].flatten())

        rScore1 = np.uint8(rFlat>1.0*gFlat)
        rScore2 = np.uint8(rFlat>1.0*bFlat)
        rScore = float(np.sum(rScore1*rScore2))

        gScore = float(np.sum(np.uint8(gFlat>1.0*rFlat)))

        rRatio = rScore/len(rFlat)
        gRatio = gScore/len(gFlat) 

        print("red ratio = ", rRatio)
        print("green ratio = ", gRatio)

        if rRatio>=0.12 and gRatio>=0.1 and abs(rRatio-0.19)<abs(rRatioMin-0.19):
            circleSelected = circle

    return circleSelected

该函数的输入是一个2维数组，每一个代表一个检测出来的红球信息 ( x , y , r ) ( x , y , r ) ，为了保证筛选后最多只剩一个红球，代码最后再所有满足比例条件中的球中选择了红色比例最接近理想值0.19的红球。

还有一点需要说明的是，上面的代码中使用的是RGB空间进行颜色统计的，在实现上其实也可以使用HSV空间进行统计，方法还是一样。一个完整的红球检测过程如下：

红球定位

上面只是把红球的位置在图像中定位出来了，而我们在比赛中需要红球相对于机器人（机器人坐标系）的位置信息。比较简单的一种定位方法就是三角函数定位，也就是利用已知的一些参数（红球半径、机器人摄像头离地面高度、摄像头位置、广角等）构造几个直角三角形，最后即可得出红球相当于机器人的位置信息。一个简单的计算示意图如下(具体计算公式见代码)：

对应的计算位置的代码如下：

def _updateBallPosition(self, standState):
    """
    compute and update the ball position with the ball data in frame.
    standState: "standInit" or "standUp".
    """

    bottomCameraDirection = {
"standInit":49.2/180*np.pi, "standUp":39.7/180*np.pi} 
    try:
        cameraDirection = bottomCameraDirection[standState]
    except KeyError:
        print("Error! Unknown standState, please check the value of stand state!")
        raise
    else:
        if self._ballData["radius"] == 0:
            self._ballPosition= {
"disX":0, "disY":0, "angle":0}
        else:
            centerX = self._ballData["centerX"]
            centerY = self._ballData["centerY"]
            radius = self._ballData["radius"]
            cameraPos = self._motionProxy.getPosition(self._cameraName, motion.FRAME_WORLD, True)
            cameraX, cameraY, cameraHeight = cameraPos[:3]
            head_yaw, head_pitch = self._motionProxy.getAngles("Head", True)
            camera_pitch = head_pitch + cameraDirection
            img_center_x = self._frameWidth/2
            img_center_y = self._frameHeight/2
            center_x = self._ballData["centerX"]
            center_y = self._ballData["centerY"]
            img_pitch = (center_y-img_center_y)/(self._frameHeight)*self._cameraPitchRange
            img_yaw = (img_center_x-center_x)/(self._frameWidth)*self._cameraYawRange

            ball_pitch = camera_pitch + img_pitch
            ball_yaw = img_yaw + head_yaw
            print("ball yaw = ", ball_yaw/np.pi*180)
            dis_x = (cameraHeight-self._ballRadius)/np.tan(ball_pitch) + np.sqrt(cameraX**2+cameraY**2)
            dis_y = dis_x*np.sin(ball_yaw)
            dis_x = dis_x*np.cos(ball_yaw)
            self._ballPosition["disX"] = dis_x
            self._ballPosition["disY"] = dis_y
            self._ballPosition["angle"] = ball_yaw

The first part of the code is mainly to get the information needed for the calculation (sensor values and constant terms), and the second part is the formula for calculating the red ball position. Finally the result of the calculation is saved directly in the class. During the experiment, the average error of each position is counted inside 1-2, with smaller errors in the center of the field of view and larger errors near the field of view boundary.

また、以前は、弊社コードの計算式に誤りがあったため（上下のコードの違いを比較してみてください）、X方向の距離は基本的に正確でY方向の距離情報が小さく、Xが小さいほど相対的に偏りが顕著であることからわかるように、算出される位置情報に大きな偏りがあったことをここで補足しておきます。このため、視野内の異なる位置情報を収集（数回の平均）し、離散的な位置ごとに誤差情報をカウントし、最終的に多項式で誤差を補正することにした。テストでは、補正の結果、誤差を1cm程度に抑えることができることがわかりました。コードは以下の通りです。

def _updateBallPositionFitting(self, standState):
    """
    compute and update the ball position with compensation.
    Args:
        "standState: "standInit" or "standUp".
    """
    bottomCameraDirection = {
"standInit":49.2, "standUp":39.7} 
    ballRadius = self._ballRadius
    try:
        cameraDirection = bottomCameraDirection[standState]
    except KeyError:
        print("Error! Unknown standState, please check the value of stand state!")
        raise
    else:
        if self._ballData["radius"] == 0:
            self._ballPosition= {
"disX":0, "disY":0, "angle":0}
        else:
            centerX = self._ballData["centerX"]
            centerY = self._ballData["centerY"]
            radius = self._ballData["radius"]
            cameraPosition = self._motionProxy.getPosition("CameraBottom", 2, True)
            cameraX = cameraPosition[0]
            cameraY = cameraPosition[1]
            cameraHeight = cameraPosition[2]
            headPitches = self._motionProxy.getAngles("HeadPitch", True)
            headPitch = headPitches[0]
            headYaws = self._motionProxy.getAngles("HeadYaw&qu

最後に上記のすべてを実装するために、以下のように、以前に実装した様々なモジュールをまとめてラップする別の関数をクラス内に定義します。

def updateBallData(self, standState="standInit", color="red", color_space="BGR", fitting=False):
    """
    Update the ball data with the frame get from the bottom camera.
    Arguments:
        standState: ("standInit", default), "standInit" or "standUp".
        color: ("red", default) the color of the ball to be detected.
        color_space: "BGR", "HSV".
        fittting: the method of localization.
    Return: 
        a dict with ball data. for example: {"centerX":0, "centerY":0, "radius":0}.
    """

    self.updateFrame()
    #cv2.imwrite("src_image.jpg", self._frameArray)
    minDist = int(self._frameHeight/30.0)
    minRadius = 1
    maxRadius = int(self._frameHeight/10.0)
    if color_space == "BGR":
        grayFrame = self._getChannelAndBlur(color)
    else:
        grayFrame = self._binImageHSV(color)
    #cv2.imshow("bin frame", grayFrame)
    #cv2.imwrite("bin_frame.jpg", grayFrame)
    #cv2.waitKey(20)
    circles = self._findCircles(grayFrame, minDist, minRadius, maxRadius)
    circles = self._selectCircle(circles)

    if len(circle) == 0:
        self._ballData = {
"centerX":0, "centerY":0, "radius":0}
        self._ballPosition= {
"disX":0, "disY":0, "angle":0}
    else:    
        self._ballData = {
"centerX":circle[0], "centerY":circle[1], "radius":circle[2]}
        if fitting == True:
            self._updateBallPositionFitting(standState=standState)
        else:
            self._updateBallPosition(standState=standState)

イエローバー検出クラス

NAOロボットゴルフでは、NAOロボットが遠隔操作でホールを方向付けるために、黄色のクラブがホールに立てられます。NAOロボットの頭部カメラはより遠くを見るため、プレー中にNAOロボットの頭部カメラで黄色いクラブを検出します。しかし、これには、遠くにある黄色いものを黄色い棒と間違えてしまうなどの問題もあります。

ここでも、視覚の基底クラスVisualBasisを継承した黄色い棒の検出クラスStickDetectを定義する必要があります。検出された黄色いバーの位置に関する情報を保持するリスト、ロボットに対するバーの角度を保持する値、画像を切り取るかどうかを決定する定数の3つのプロパティをこのクラスで定義しています。最初の2つはよく理解されています。クロップを定義する一般的な理由は、競技会場において、NAOロボットがイエローバーを見つけるために通常の姿勢で歩いているとき、イエローバーは通常画像の下部に位置しており、画像の上半分を切り取ることでいくつかの邪魔なものを排除できるだけでなく、計算量を削減することができるためです。具体的な定義は以下の通りです。

def __init__(self, IP, cameraId=vd.kTopCamera, resolution=vd.kVGA):
    super(StickDetect, self). __init__(IP, cameraId, resolution)
    self._boundRect = []
    self._cropKeep = 1
    self._stickAngle = None # rad

本検出方法では、HSVカラーセグメンテーション＋形状判別を行っている。HSVカラーセグメンテーションは、上記のHSV空間を用いた赤玉のセグメンテーションと同じで、以下の関数で、閾値の範囲を変更するだけです。

def _preprocess(self, minHSV, maxHSV, cropKeep, morphology):
    """
    preprocess the current frame for stick detection.(binalization, crop etc.)
    Arguments:
        minHSV: the lower limit for binalization.
        maxHSV: the upper limit for binalization.
        cropKeep: crop ratio (>=0.5).
        morphology: erosion and dilation.
    Return: preprocessed image for stick detection:
        preprocessed image for stick detection.
    """
    self._cropKeep = cropKeep
    frameArray = self._frameArray
    height = self._frameHeight
    width = self._frameWidth
    try:
        frameArray = frameArray[int((1-cropKeep)*height):,:]
    except IndexError:
        raise       
    frameHSV = cv2.cvtColor(frameArray, cv2.COLOR_BGR2HSV)
    frameBin = cv2.inRange(frameHSV, minHSV, maxHSV)

    kernelErosion = np.ones((5,5), np.uint8)
    kernelDilation = np.ones((5,5), np.uint8) 
    frameBin = cv2.erode(frameBin, kernelErosion, iterations=1)
    frameBin = cv2.dilate(frameBin, kernelDilation, iterations=1)
    frameBin = cv2.GaussianBlur(frameBin, (9,9), 0)
    return frameBin

また、コードには腐食拡大とガウシアンフィルタリングを追加しています。これは主に実験でわかったことですが、黄色い棒がロボットから遠くにある場合、いくつかの干渉物が検出に大きな影響を与えるので、腐食膨張を追加すると、多くの不要な干渉をフィルタリングするのに非常に良い方法です。もちろん、フィルタのサイズも把握する必要があり、そうでない場合は、黄色の棒もフィルタリングすることができます。

上記のような前処理を施した後、2値化されたマップが得られます。理想的には、このマップにはほとんどイエローバーの情報（と、場合によっては色に似たノイズも）が含まれているはずです。そのためには、2値化された画像のどこにイエローバーがあるのかを探す必要があります。私たちの競技のイエローバーは長いので、さらに形状情報を使って決定することができます。コード内で使用した判別方法は以下の通りです。

まず二値化画像に対して凸包検出を行い、面積や周囲長が小さい凸包を除去し、残った凸包の中で最も小さい外接矩形を求め、それぞれの外接矩形の形状を判別して、必要な縦横比の矩形を残し、最後に残った矩形の中で最も縦横比の大きい矩形を選択します。そのコードは以下の通りである。

def _findStick(self, frameBin, minPerimeter, minArea):
    """
    find the yellow stick in the preprocessed frame.
    Args:
        frameBin: preprocessed frame.
        minPerimeter: minimum perimeter of detected stick.
        minArea: minimum area of detected stick.
    Return: detected stick marked with rectangle or [].
    """

    rects = []
    _, contours, _ = cv2.findContours(frameBin, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

    if len(contours) == 0:
        return rects

    for contour in contours:
        perimeter = cv2.arcLength(contour, True)
        area = cv2.contourArea(contour)
        if perimeter>minPerimeter and area>minArea:
            x,y,w,h = cv2.boundingRect(contour)
            rects.append([x,y,w,h])

    if len(rects) == 0:
        return rects

    rects = [rect for rect in rects if (1.0*rect[3]/rect[2])>0.8]

    if len(rects) == 0:
        return rects

    rects = np.array(rects)
    print(rects)
    rect = rects[np.argmax(1.0*(rects[:,-1])/rects[:,-2]),]
    rect[1] += int(self._frameHeight *(1-self._cropKeep))
    return rect

上記のコードでは、さらに、多くの判定を加えています。これは、画像に必要な黄色いバーが含まれていないときに、空のリストを返すためである。もう一つ注意すべき点は、最終的にトリミングされた画像になるため、最終的な座標情報を元の画像に変換する必要があることです。以下は、イエローバーの検出結果の全体像です。

この時点で、NAOロボットゴルフトーナメントにおける赤いボールと黄色いショットの認識とローカライズを実装しました。実際には、競技場には白線、障害物、さらには穴など、検出する必要のある他のターゲットが存在しますが、ここでは触れません。

結論を出すための言葉

ここまで読んでいただきありがとうございました！このブログが多少なりともお役に立てたのなら幸いです。このブログについて問題点があれば遠慮なくご指摘ください！偉い人に教えてもらうのも大歓迎です。

NAOロボットゴルフにおけるビジョンシステム設計

基本構成(ベースクラス)

Visionモジュール

Vision base class - ビジョンタスクのベースとなるクラスです。

赤玉検出クラス

赤玉の基本プロパティ

画像の前処理

红球识别

红球筛选

红球定位

イエローバー検出クラス

最新

nginxです。[emerg] 0.0.0.0:80 への bind() に失敗しました (98: アドレスは既に使用中です)

htmlページでギリシャ文字を使うには

ピュアhtml+cssでの要素読み込み効果

純粋なhtml + cssで五輪を実現するサンプルコード

ナビゲーションバー・ドロップダウンメニューのHTML+CSSサンプルコード

タイピング効果を実現するピュアhtml+css

htmlの選択ボックスのプレースホルダー作成に関する質問

html css3 伸縮しない画像表示効果

トップナビゲーションバーメニュー作成用HTML+CSS

html+css 実装サイバーパンク風ボタン

おすすめ

ハートビート・エフェクトのためのHTML+CSS

HTML ホテルフォームによるフィルタリング

HTML+cssのボックスモデル例（円、半円など）「border-radius」使いやすい

HTMLテーブルのテーブル分割とマージ（colspan, rowspan）

ランダム・ネームドロッパーを実装するためのhtmlサンプルコード

Html階層型ボックスシャドウ効果サンプルコード

QQの一時的なダイアログボックスをポップアップし、友人を追加せずにオンラインで話す効果を達成する方法

sublime / vscodeショートカットHTMLコード生成の実装

HTMLページを縮小した後にスクロールバーを表示するサンプルコード

html のリストボックス、テキストフィールド、ファイルフィールドのコード例