Airtest Source Code Analysis

Introduction to Airtest

Airtest is a UI testing tool from NetEase, based on image recognition and aimed primarily at mobile-game UI testing; it also supports element-based UI automation for native Android apps (Android, iOS and Windows are currently supported). It consists of three main parts: AirtestIDE, Airtest (write scripts with screenshots) and Poco (write scripts against UI elements). Google's verdict: Airtest is one of the most powerful and comprehensive automated testing solutions for Android game development.

Source Code

Airtest
After downloading, if you only intend to read the code, there is no need to set up the environment. To actually run it, follow the README; thoughtfully, there is a Chinese version 😃
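
Before reading the internals, a minimal script helps fix the mental model. This is a hedged sketch: the device URI and image name are placeholders, not taken from this article.

# Minimal Airtest script sketch (device URI and image name are placeholders).
from airtest.core.api import auto_setup, touch, Template

# Connect to a local Android device over ADB.
auto_setup(__file__, devices=["Android:///"])

# Click wherever this screenshot template is found on the current screen.
touch(Template("tpl_button.png", target_pos=5, threshold=0.8))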

The touch method

Let's start with the touch method, i.e. clicking on an image passed in; the source is in api.py:

@logwrap
def touch(v, times=1, **kwargs):
    """
    Perform the touch action on the device screen

    :param v: target to touch, either a ``Template`` instance or absolute coordinates (x, y)
    :param times: how many touches to be performed
    :param kwargs: platform specific `kwargs`, please refer to corresponding docs
    :return: final position to be clicked
    :platforms: Android, Windows, iOS
    :Example:
        Click absolute coordinates::

        >>> touch((100, 100))

        Click the center of the picture(Template object)::

        >>> touch(Template(r"tpl1606730579419.png", target_pos=5))

        Click 2 times::

        >>> touch((100, 100), times=2)

        Under Android and Windows platforms, you can set the click duration::

        >>> touch((100, 100), duration=2)

        Right click(Windows)::

        >>> touch((100, 100), right_click=True)

    """
    if isinstance(v, Template):
        pos = loop_find(v, timeout=ST.FIND_TIMEOUT)
    else:
        try_log_screen()
        pos = v
    for _ in range(times):
        G.DEVICE.touch(pos, **kwargs)
        time.sleep(0.05)
    delay_after_operation()
    return pos

这个函数执行点击操作的是 G.DEVICE.touch(pos, **kwargs),而pos就是图片匹配返回的坐标位置,重点看loop_find这个函数是怎样识别并返回坐标数据的:

@logwrap
def loop_find(query, timeout=ST.FIND_TIMEOUT, threshold=None, interval=0.5, intervalfunc=None):
    """
    Search for image template in the screen until timeout

    Args:
        query: image template to be found in screenshot
        timeout: time interval how long to look for the image template
        threshold: default is None
        interval: sleep interval before next attempt to find the image template
        intervalfunc: function that is executed after unsuccessful attempt to find the image template

    Raises:
        TargetNotFoundError: when image template is not found in screenshot

    Returns:
        the position where the image template has been found in the screenshot

    """
    G.LOGGING.info("Try finding: %s", query)
    start_time = time.time()
    while True:
        screen = G.DEVICE.snapshot(filename=None, quality=ST.SNAPSHOT_QUALITY)

        if screen is None:
            G.LOGGING.warning("Screen is None, may be locked")
        else:
            if threshold:
                query.threshold = threshold
            match_pos = query.match_in(screen)
            if match_pos:
                try_log_screen(screen)
                return match_pos

        if intervalfunc is not None:
            intervalfunc()

        # raise if timed out; otherwise sleep and run the loop again:
        if (time.time() - start_time) > timeout:
            try_log_screen(screen)
            raise TargetNotFoundError('Picture %s not found in screen' % query)
        else:
            time.sleep(interval)

A quick read of the loop_find method:


1. First, take a screenshot of the device screen;

2. Then match the image passed in by the script against that screenshot to get the matched position;

3. Repeat the two steps above until a match is found or the timeout is reached. The timeout and the screenshot quality both come from global settings, which can be tuned as shown below.
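
A hedged sketch of adjusting those defaults through Airtest's global Settings object (the values here are arbitrary examples):

# Tune loop_find's defaults through the global Settings object.
from airtest.core.settings import Settings as ST

ST.FIND_TIMEOUT = 30      # seconds to keep retrying before TargetNotFoundError
ST.SNAPSHOT_QUALITY = 50  # quality of the screenshots taken during the search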

The match against the image passed in by the script happens on this line: match_pos = query.match_in(screen)

In cv.py, find the Template class's match_in method:

    def match_in(self, screen):
        match_result = self._cv_match(screen)
        G.LOGGING.debug("match result: %s", match_result)
        if not match_result:
            return None
        focus_pos = TargetPos().getXY(match_result, self.target_pos)
        return focus_pos

Interpreting match_in:

1. It calls its own _cv_match method to find the matched coordinates;

2. From the match rectangle it derives the point to click, by default the center of the image. target_pos follows a numeric-keypad layout (1 = top-left, 5 = center, 9 = bottom-right), as sketched below.
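
An illustrative re-implementation (not Airtest's exact TargetPos code) of how a keypad-style target_pos maps to a point inside the matched rectangle:

# Illustrative sketch (not Airtest's exact TargetPos code): map a keypad-style
# target_pos (1-9) to a point inside the matched rectangle.
def get_target_point(rectangle, target_pos=5):
    (x_min, y_min), (x_max, y_max) = rectangle[0], rectangle[2]  # opposite corners
    col = (target_pos - 1) % 3   # 0 = left, 1 = middle, 2 = right
    row = (target_pos - 1) // 3  # 0 = top, 1 = middle, 2 = bottom
    x = x_min + (x_max - x_min) * col / 2
    y = y_min + (y_max - y_min) * row / 2
    return int(x), int(y)

print(get_target_point([(0, 0), (100, 0), (100, 60), (0, 60)], target_pos=5))  # -> (50, 30)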

Next, self._cv_match(screen), also in cv.py:

    @logwrap
    def _cv_match(self, screen):
        # in case image file not exist in current directory:
        image = self._imread()
        image = self._resize_image(image, screen, ST.RESIZE_METHOD)
        ret = None
        for method in ST.CVSTRATEGY:
            # get function definition and execute:
            func = MATCHING_METHODS.get(method, None)
            if func is None:
                raise InvalidMatchingMethodError("Undefined method in CVSTRATEGY: '%s', try 'kaze'/'brisk'/'akaze'/'orb'/'surf'/'sift'/'brief' instead." % method)
            else:
                ret = self._try_match(func, image, screen, threshold=self.threshold, rgb=self.rgb)
            if ret:
                break
        return ret

Interpreting _cv_match:

1. Resize the screenshot passed in by the test case (the device that recorded the case may not match the device running it);

2. Iterate over the methods configured in ST.CVSTRATEGY and try each one; the strategy list itself is configurable, as shown below.
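
A hedged sketch of overriding that strategy (the method names come from the error message in the code above; order matters because the first method that matches wins):

# Restrict and order the matching algorithms Airtest will try (first hit wins).
from airtest.core.settings import Settings as ST

ST.CVSTRATEGY = ["tpl", "kaze"]  # template matching first, then KAZE keypoints

With a strategy in place, each method is attempted via _try_match():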

    @staticmethod
    def _try_match(func, *args, **kwargs):
        G.LOGGING.debug("try match with %s" % func.__name__)
        try:
            ret = func(*args, **kwargs).find_best_result()
        except aircv.NoModuleError as err:
            G.LOGGING.warning("'surf'/'sift'/'brief' is in opencv-contrib module. You can use 'tpl'/'kaze'/'brisk'/'akaze'/'orb' in CVSTRATEGY, or reinstall opencv with the contrib module.")
            return None
        except aircv.BaseError as err:
            G.LOGGING.debug(repr(err))
            return None
        else:
            return ret

The result comes from the find_best_result() method:

    @print_run_time
    def find_best_result(self):
        """Template matching: find the single best-matching region."""
        # Step 1: validate the input images
        check_source_larger_than_search(self.im_source, self.im_search)
        # Step 2: compute the template-matching result matrix res
        res = self._get_template_result_matrix()
        # Step 3: extract the best match from the result matrix
        min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
        h, w = self.im_search.shape[:2]
        # compute the confidence:
        confidence = self._get_confidence_from_matrix(max_loc, max_val, w, h)
        # compute the recognized position: target center + target rectangle:
        middle_point, rectangle = self._get_target_rectangle(max_loc, w, h)
        best_match = generate_result(middle_point, rectangle, confidence)
        LOGGING.debug("[%s] threshold=%s, result=%s" % (self.METHOD_NAME, self.threshold, best_match))

        return best_match if confidence >= self.threshold else None

To summarize, find_best_result does the following:

1. Validate the input images;

2. Compute the template-matching result matrix res;

3. Extract the best match from that matrix;

4. Compute the confidence;

5. Compute the match position. (The same steps are condensed into the standalone sketch below.)
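
The same pipeline, condensed into a standalone OpenCV sketch (file names are placeholders; the resize and RGB-confidence details are left out):

# Standalone sketch of the template-matching path (file names are placeholders).
import cv2

screen = cv2.imread("screen.png", cv2.IMREAD_GRAYSCALE)
tmpl = cv2.imread("button.png", cv2.IMREAD_GRAYSCALE)

res = cv2.matchTemplate(screen, tmpl, cv2.TM_CCOEFF_NORMED)  # result matrix
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)      # best score and location
h, w = tmpl.shape[:2]

if max_val >= 0.8:  # the threshold discussed later
    center = (max_loc[0] + w // 2, max_loc[1] + h // 2)
    print("match at", center, "confidence", max_val)
else:
    print("no match, best confidence was", max_val)
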
The keypoint-based strategies are built on KAZE, so a short detour. KAZE is a feature detection algorithm published at ECCV 2012. Compared with SIFT and SURF, the pyramid KAZE builds is a nonlinear scale space, constructed with the Additive Operator Splitting (AOS) scheme to perform nonlinear diffusion filtering. Its most notable property is that it blurs the image while still preserving edge detail.
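
For reference, the diffusion formulation behind this (summarized from the KAZE paper; treat the exact notation as a hedged paraphrase): the image luminance L evolves according to

∂L/∂t = div( c(x, y, t) · ∇L )

where the conductivity function c decides how much smoothing is applied locally. The g2 conductivity mentioned in experiment (3) below is

c = g2(|∇L_σ|) = 1 / (1 + |∇L_σ|² / λ²)

with ∇L_σ the gradient of a Gaussian-smoothed version of L and λ a contrast parameter. Diffusion is suppressed wherever the gradient is large, which is exactly why edges survive the blurring.
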
The KAZE paper reports a number of experiments and charts. Compared with SURF, SIFT and STAR, KAZE shows better scale and rotation invariance, plus stable, repeatable detection. The main experiments are:

(1) Repeatability test
Tested against rotation and scaling, viewpoint change, noise, blur and compression; KAZE's repeatability is clearly better than the other features.

(2) Feature detection and matching test
The same dimensions are tested, with precision-recall curves for feature matching using a nearest-neighbor matcher. Under the information loss caused by blur, noise and compression, KAZE features are clearly more robust than the others.

(3) Feature matching on deformable surfaces
Here the KAZE variant based on the g2 conductivity function performs best.

(4) Detection efficiency test
KAZE's feature detection time is higher than SURF and STAR but close to SIFT; most of the time goes into building the nonlinear scale space.

Test code and results:

#include <opencv2/opencv.hpp>
#include <iostream>
#include <math.h>


using namespace cv;
using namespace std;

int main(int argc, char** argv) {
    Mat img1 = imread("C:/Users/wenhaofu/Desktop/picture/gril1.png");
    Mat img2 = imread("C:/Users/wenhaofu/Desktop/picture/gril2.png");
    if (img1.empty() || img2.empty()) {
        printf("could not load images...\n");
        return -1;
    }
    imshow("box image", img1);
    imshow("scene image", img2);


    // extract kaze features
    Ptr<KAZE> detector = KAZE::create();
    vector<KeyPoint> keypoints_obj;
    vector<KeyPoint> keypoints_scene;
    Mat descriptor_obj, descriptor_scene;
    double t1 = cv::getTickCount();
    detector->detectAndCompute(img1, Mat(), keypoints_obj, descriptor_obj);
    detector->detectAndCompute(img2, Mat(), keypoints_scene, descriptor_scene);
    double t2 = cv::getTickCount();
    double tkaze = 1000 * (t2 - t1) / cv::getTickFrequency();
    cout<<" KAZE Time consume(ms): " << tkaze << endl;
    // matching
    //FlannBasedMatcher matcher(new flann::LshIndexParams(20, 10, 2));
    //FlannBasedMatcher matcher;
    BFMatcher matcher;
    vector<DMatch> matches;
    matcher.match(descriptor_obj, descriptor_scene, matches);

    // draw all raw matches (key points)
    Mat kazeMatchesImg;

    drawMatches(img1, keypoints_obj, img2, keypoints_scene, matches, kazeMatchesImg);
    imshow("kaze match result", kazeMatchesImg);

    vector<DMatch> goodMatches;
    double minDist = 100000, maxDist = 0;
    for (int i = 0; i < descriptor_obj.rows; i++) {
        double dist = matches[i].distance;
        if (dist < minDist) {
            minDist = dist;
        }
        if (dist > maxDist) {
            maxDist = dist;
        }
    }
    printf("min distance : %f", minDist);

    for (int i = 0; i < descriptor_obj.rows; i++) {
        double dist = matches[i].distance;
        //max(1.5 * minDist, 0.02)
        if (dist < 1.3 * minDist) {
            goodMatches.push_back(matches[i]);
        }
    }
    cout << endl;
    cout << "goodmatche size : " << goodMatches.size() << endl;
    Mat kazematchimg;
    drawMatches(img1, keypoints_obj, img2, keypoints_scene, goodMatches, kazematchimg, Scalar::all(-1),
        Scalar::all(-1), vector<char>(), DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS);
    imshow("good match result", kazematchimg);

    waitKey(0);
    return 0;
}

(Result screenshots omitted: the full KAZE match set and the filtered good matches.)

Introduction to AKAZE Local Matching

- Nonlinear scale space construction (AOS in KAZE; AKAZE accelerates this step with Fast Explicit Diffusion)
- Keypoint detection via the Hessian matrix
- Orientation assignment based on first-order image derivatives
- Descriptor generation
- The difference between AKAZE and KAZE: AKAZE is the accelerated variant

Compared with SIFT/SURF:

- More stable
- Nonlinear scale space
- AKAZE is faster
- A fairly new algorithm, available only in recent OpenCV versions

KAZE is a transliteration from Japanese (wind). The biggest difference between KAZE and SIFT/SURF lies in how the scale space is constructed: KAZE builds it nonlinearly, so the resulting keypoints are more accurate (scale invariance).

Code comparison
#include <opencv2/opencv.hpp>
//#include <opencv2/xfeatures2d.hpp>
#include <iostream>
#include <math.h>

using namespace std;
using namespace cv;


#define PIC_PATH "C:/Users/wenhaofu/Desktop/picture/"
#define PIC_NAME "tubiao3.png"

#define PIC_PATH1 "C:/Users/wenhaofu/Desktop/picture/"
#define PIC_NAME1 "tubiao1.png"

int main(void)
{
    Mat subimg, mainimg;

    // build the full image paths and names
    string pic1 = string(PIC_PATH) + string(PIC_NAME);
    string pic2 = string(PIC_PATH1) + string(PIC_NAME1);

    // print the image paths
    cout << "subimg  path is :" << pic1 << endl;
    cout << "mainimg path is :" << pic2 << endl;
    // load the images
    subimg = imread(pic1, IMREAD_GRAYSCALE);
    mainimg = imread(pic2, IMREAD_GRAYSCALE);
    // check that the images were loaded
    if (subimg.empty() || mainimg.empty())
    {
        cout << "pic is not exist!!!!" << endl;
        return -1;
    }

    // optionally display the input images (disabled by #if 0)
#if 0
    namedWindow("subimg pic", WINDOW_AUTOSIZE);
    imshow("subimg pic", subimg);

    namedWindow("mainimg pic", WINDOW_AUTOSIZE);
    imshow("mainimg pic", mainimg);
#endif
    Mat kpimg;
    Ptr<AKAZE> detector = AKAZE::create();
    vector<KeyPoint> keypoints_sub;
    vector<KeyPoint> keypoints_main;

    Mat descript_sub, descript_main;
    double t1 = cv::getTickCount();
    detector->detectAndCompute(subimg, Mat(), keypoints_sub, descript_sub);
    detector->detectAndCompute(mainimg, Mat(), keypoints_main, descript_main);
    double t2 = cv::getTickCount();
    double tkaze = 1000 * (t2 - t1) / cv::getTickFrequency();
    cout << "AKAZE Time consume(ms): " << tkaze << endl;

    // Brute-force or FLANN matching both work here; when using FLANN, remember to adjust the parameters below (e.g. LshIndexParams) or it will error out
    //FlannBasedMatcher matcher(new flann::LshIndexParams(20,10,2));
    BFMatcher matcher;
    vector<DMatch> matches;
    matcher.match(descript_sub, descript_main, matches);

    Mat akazeimg;
    drawMatches(subimg, keypoints_sub, mainimg, keypoints_main, matches, akazeimg);

    namedWindow("akazeimg display", WINDOW_AUTOSIZE);
    imshow("akazeimg display", akazeimg);  //匹配点过多


    // refine the matches: look for the best ones
    float maxdist = 0;
    float mindist = 1000;

    // scan all matches to find the maximum and minimum distances
    for (int i = 0; i < descript_sub.rows; i++)
    {
        float distance = matches[i].distance;
        //cout << "distance: " << distance << endl;
        if (distance > maxdist)
            maxdist = distance;
        if (distance < mindist)
            mindist = distance;
    }

    // print the max and min distances
    cout << "maxdist: " << maxdist << endl;
    cout << "mindist: " << mindist << endl;

    vector<DMatch> goodmatches;   // the filtered set of best matches
    for (int i = 0; i < descript_sub.rows; i++)
    {
        float distance = matches[i].distance;
        if (distance <= max(1.5 * mindist, 0.02))
        {
            // keep matches whose distance is close to the minimum
            goodmatches.push_back(matches[i]);
        }
    }
    cout << "goodmatche size : " << goodmatches.size() << endl;


    Mat goodmatchimages;
    // draw the filtered matches; far fewer points now
    drawMatches(subimg, keypoints_sub, mainimg, keypoints_main, goodmatches, goodmatchimages);
    imshow("goodmatchimages", goodmatchimages);

    waitKey(0);
    destroyAllWindows();
    return 0;
}

(Result screenshots omitted: raw AKAZE matches and the filtered good matches.)
Compared with KAZE, AKAZE is indeed faster, but its match filtering turns out to be much stricter.

confidence can be loosely understood as similarity. The default threshold here is threshold=0.8: if the best match scores above 0.8, its coordinates are returned; otherwise it is treated as no match and None is returned. When writing scripts you can pass the threshold parameter to raise or lower the matching strictness, as sketched below. best_match holds the best match's coordinates.
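
A hedged usage sketch (the image name is a placeholder):

# Demand a stricter match for this particular template.
touch(Template("confirm_button.png", threshold=0.95))

Next, how the confidence itself is computed, in _get_confidence_from_matrix and cal_rgb_confidence: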

    def _get_confidence_from_matrix(self, max_loc, max_val, w, h):
        """根据结果矩阵求出confidence."""
        # 求取可信度:
        if self.rgb:
            # 如果有颜色校验,对目标区域进行BGR三通道校验:
            img_crop = self.im_source[max_loc[1]:max_loc[1] + h, max_loc[0]: max_loc[0] + w]
            confidence = cal_rgb_confidence(img_crop, self.im_search)
        else:
            confidence = max_val

        return confidence

def cal_rgb_confidence(img_src_rgb, img_sch_rgb):
    """Compute the similarity of two same-sized color images."""
    # expand the region used for the confidence calculation
    img_sch_rgb = cv2.copyMakeBorder(img_sch_rgb, 10, 10, 10, 10, cv2.BORDER_REPLICATE)
    # convert to HSV to strengthen the influence of color
    img_src_rgb = cv2.cvtColor(img_src_rgb, cv2.COLOR_BGR2HSV)
    img_sch_rgb = cv2.cvtColor(img_sch_rgb, cv2.COLOR_BGR2HSV)
    src_bgr, sch_bgr = cv2.split(img_src_rgb), cv2.split(img_sch_rgb)

    # compute the confidence for each of the three channels, stored in bgr_confidence:
    bgr_confidence = [0, 0, 0]
    for i in range(3):
        res_temp = cv2.matchTemplate(src_bgr[i], sch_bgr[i], cv2.TM_CCOEFF_NORMED)
        min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res_temp)
        bgr_confidence[i] = max_val

    return min(bgr_confidence)

As you can see, this uses OpenCV's template matching directly.

The second find_best_result() method, the keypoint-based one:

    @print_run_time
    def find_best_result(self):
        """KAZE-based image recognition; keep only the best region."""
        # Step 1: check that the input images are valid:
        if not check_image_valid(self.im_source, self.im_search):
            return None

        # Step 2: get the keypoint sets and match out the keypoint pairs; returns good, pypts, kp_sch, kp_src
        self.kp_sch, self.kp_src, self.good = self._get_key_points()

        # Step 3: extract the recognized region from the matched pairs (good):
        if len(self.good) in [0, 1]:
            # 0 pairs: no region can be extracted; 1 pair: the target region cannot be determined; return None:
            return None
        elif len(self.good) in [2, 3]:
            # 2 or 3 pairs: derive the target region from the pairs and compute the confidence from it:
            if len(self.good) == 2:
                origin_result = self._handle_two_good_points(self.kp_sch, self.kp_src, self.good)
            else:
                origin_result = self._handle_three_good_points(self.kp_sch, self.kp_src, self.good)
            # in some special cases, return None directly as the match result:
            if origin_result is None:
                return origin_result
            else:
                middle_point, pypts, w_h_range = origin_result
        else:
            # >= 4 pairs: map out the target region with a homography matrix and compute the confidence from it:
            middle_point, pypts, w_h_range = self._many_good_pts(self.kp_sch, self.kp_src, self.good)

        # Step 4: compute the result confidence from the recognized region and return the result:
        # sanity-check the result: anything under 5 pixels, or scaled by more than 5x, is treated as invalid and raises.
        self._target_error_check(w_h_range)
        # resize the recognized region to the template's size before computing the confidence
        x_min, x_max, y_min, y_max, w, h = w_h_range
        target_img = self.im_source[y_min:y_max, x_min:x_max]
        resize_img = cv2.resize(target_img, (w, h))
        confidence = self._cal_confidence(resize_img)

        best_match = generate_result(middle_point, pypts, confidence)
        LOGGING.debug("[%s] threshold=%s, result=%s" % (self.METHOD_NAME, self.threshold, best_match))
        return best_match if confidence >= self.threshold else None

To summarize, this find_best_result() does the following:

1. Check that the images are valid;

2. Get the keypoint sets and match out the keypoint pairs;

3. Extract the recognized region from the matched pairs (good); with four or more pairs this uses a homography, as sketched below;

4. Compute the result confidence from the recognized region.
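
For the >= 4-pair branch, a hedged sketch of the underlying idea using OpenCV directly (the helper name and variables are mine, not Airtest's): estimate a homography from the good matches and project the template's corners into the screenshot.

# Sketch of the >= 4-good-matches branch: estimate a homography from the
# matched keypoints and project the template's corners into the screenshot.
import numpy as np
import cv2

def locate_by_homography(kp_sch, kp_src, good, sch_w, sch_h):
    # matched keypoint coordinates, template -> screenshot
    src_pts = np.float32([kp_sch[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst_pts = np.float32([kp_src[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
    if M is None:
        return None
    # project the template's four corners into the screenshot
    corners = np.float32([[0, 0], [0, sch_h], [sch_w, sch_h], [sch_w, 0]]).reshape(-1, 1, 2)
    projected = cv2.perspectiveTransform(corners, M)
    middle_point = projected.reshape(-1, 2).mean(axis=0)  # center of the region
    return tuple(middle_point), projected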

Next, how the keypoint sets are obtained:

    def _get_key_points(self):
        """根据传入图像,计算图像所有的特征点,并得到匹配特征点对."""
        # 准备工作: 初始化算子
        self.init_detector()
        # 第一步:获取特征点集,并匹配出特征点对: 返回值 good, pypts, kp_sch, kp_src
        kp_sch, des_sch = self.get_keypoints_and_descriptors(self.im_search)
        kp_src, des_src = self.get_keypoints_and_descriptors(self.im_source)
        # When apply knnmatch , make sure that number of features in both test and
        #       query image is greater than or equal to number of nearest neighbors in knn match.
        if len(kp_sch) < 2 or len(kp_src) < 2:
            raise NoMatchPointError("Not enough feature points in input images !")
        # match descriptors (feature matching)
        matches = self.match_keypoints(des_sch, des_src)

        # good is the preliminary selection: drop keypoints whose two best matches are too close in distance; keypoints that are not distinctly the best are filtered out (which makes this unsuitable for multi-target recognition)
        good = []
        for m, n in matches:
            if m.distance < self.FILTER_RATIO * n.distance:
                good.append(m)
        # the good points must be deduplicated (assuming the source image has no duplicate points); dedup by finding the repeated points in the src image
        # dedup policy: one-to-many mapping from the search image to the source image is allowed, but many-to-one is not (one point in the source image must not correspond to several points in the search image)
        good_diff, diff_good_point = [], [[]]
        for m in good:
            diff_point = [int(kp_src[m.trainIdx].pt[0]), int(kp_src[m.trainIdx].pt[1])]
            if diff_point not in diff_good_point:
                good_diff.append(m)
                diff_good_point.append(diff_point)
        good = good_diff

        return kp_sch, kp_src, good

The actual matching of the image keypoints happens in match_keypoints, which internally calls self.matcher.knnMatch(des_sch, des_src, k=2): for each template descriptor it returns the two nearest neighbors in the screenshot, and the ratio test above then keeps only the distinctive ones. As for the "detector" initialized at the start, it is the object through which the image keypoints are obtained:

    def init_detector(self):
        """Init keypoint detector object."""
        self.detector = cv2.KAZE_create()
        # create BFMatcher object:
        self.matcher = cv2.BFMatcher(cv2.NORM_L1)  # cv2.NORM_L1 cv2.NORM_L2 cv2.NORM_HAMMING
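
Putting the keypoint path together, here is a condensed standalone sketch (file names are placeholders, and the 0.75 ratio is the conventional Lowe value rather than Airtest's FILTER_RATIO):

# Condensed standalone sketch of the keypoint path (file names are placeholders).
import cv2

screen = cv2.imread("screen.png", cv2.IMREAD_GRAYSCALE)
tmpl = cv2.imread("button.png", cv2.IMREAD_GRAYSCALE)

detector = cv2.KAZE_create()
matcher = cv2.BFMatcher(cv2.NORM_L1)

kp_sch, des_sch = detector.detectAndCompute(tmpl, None)
kp_src, des_src = detector.detectAndCompute(screen, None)

# two nearest neighbors per template descriptor, then the ratio test
matches = matcher.knnMatch(des_sch, des_src, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print("good matches:", len(good))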

In the end it comes down to two OpenCV mechanisms: template matching and feature matching.

1. Template matching:

cv2.matchTemplate(i_gray, s_gray, cv2.TM_CCOEFF_NORMED)

2. Feature matching:

cv2.FlannBasedMatcher(index_params, search_params).knnMatch(des1, des2, k=2)

Whichever strategy matches first returns its result immediately. As you can see, it is OpenCV's image recognition algorithms all the way down.
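
For completeness, a hedged sketch of constructing those FLANN parameters (typical KD-tree settings for float descriptors such as KAZE/SIFT; the values are conventional defaults, not taken from Airtest's source):

# Typical FLANN setup for float descriptors (KD-tree index); the values are
# conventional defaults, not necessarily what Airtest uses.
import cv2

FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)
matcher = cv2.FlannBasedMatcher(index_params, search_params)
# matches = matcher.knnMatch(des1, des2, k=2)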