EasyOCR: Optical Character Recognition

OCR (optical character recognition) converts the characters in various kinds of documents and images into editable, searchable data.

EasyOCR

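EasyOCR is a Python package that uses PyTorch as its backend. Like any other OCR engine (Google's Tesseract, for example), it detects text in images, but in my experience it is the most straightforward way to do so, and with PyTorch as the backend I found its accuracy more dependable.
EasyOCR supports more than 42 languages for detection. It was created by Jaided AI.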

Installation

After the preceding setup, EasyOCR is easy to install. Run the following command:

pip3 install easyocr

Output:
Collecting easyocr
Using cached easyocr-1.4.1-py3-none-any.whl (63.6 MB)
Collecting python-bidi
Using cached python_bidi-0.4.2-py2.py3-none-any.whl (30 kB)
Collecting scikit-image
Downloading scikit_image-0.17.2-cp36-cp36m-win_amd64.whl (11.5 MB)
|████████████████████████████████| 11.5 MB 211 kB/s
Requirement already satisfied: PyYAML in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from easyocr) (5.4.1)
Requirement already satisfied: numpy in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from easyocr) (1.19.5)
Requirement already satisfied: torch in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from easyocr) (1.9.0)
Requirement already satisfied: scipy in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from easyocr) (1.5.4)
Collecting opencv-python-headless
Downloading opencv_python_headless-4.5.4.58-cp36-cp36m-win_amd64.whl (35.0 MB)
|████████████████████████████████| 35.0 MB 338 kB/s
Requirement already satisfied: torchvision>=0.5 in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from easyocr) (0.10.0)
Requirement already satisfied: Pillow<8.3.0 in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from easyocr) (8.2.0)
Requirement already satisfied: typing_extensions in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from torch->easyocr) (3.7.4.3)
Requirement already satisfied: dataclasses in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from torch->easyocr) (0.8)
Requirement already satisfied: six in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from python-bidi->easyocr) (1.16.0)
Requirement already satisfied: imageio>=2.3.0 in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from scikit-image->easyocr) (2.9.0)
Requirement already satisfied: matplotlib!=3.0.0,>=2.0.0 in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from scikit-image->easyocr) (3.3.4)
Collecting tifffile>=2019.7.26
Downloading tifffile-2020.9.3-py3-none-any.whl (148 kB)
|████████████████████████████████| 148 kB ...
Collecting networkx>=2.0
Downloading networkx-2.5.1-py3-none-any.whl (1.6 MB)
|████████████████████████████████| 1.6 MB 6.4 MB/s
Requirement already satisfied: PyWavelets>=1.1.1 in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from scikit-image->easyocr) (1.1.1)
Requirement already satisfied: kiwisolver>=1.0.1 in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image->easyocr) (1.3.1)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image->easyocr) (2.4.7)
Requirement already satisfied: cycler>=0.10 in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image->easyocr) (0.10.0)
Requirement already satisfied: python-dateutil>=2.1 in d:\users\vadmin\anaconda3\envs\pytorch\lib\site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image->easyocr) (2.8.1)
Collecting decorator<5,>=4.3
Downloading decorator-4.4.2-py2.py3-none-any.whl (9.2 kB)
Installing collected packages: decorator, tifffile, networkx, scikit-image, python-bidi, opencv-python-headless, easyocr
Successfully installed decorator-4.4.2 easyocr-1.4.1 networkx-2.5.1 opencv-python-headless-4.5.4.58 python-bidi-0.4.2 scikit-image-0.17.2 tifffile-2020.9.3

Using EasyOCR

A simple example:

import easyocr

reader = easyocr.Reader(['ch_sim', 'en'], gpu=True)
result = reader.readtext(r"labelimg.jpg", detail=0)
print(result)
# detail=0 returns only the text, without position information
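
With the default detail=1, readtext returns each detection as a bounding box, the recognized text, and a confidence score. As a minimal sketch, the hypothetical helper filter_by_confidence below (not part of EasyOCR) keeps only detections above a threshold. The sample data is made up to mimic that output shape, so the block runs without a model.

```python
def filter_by_confidence(detections, min_confidence=0.5):
    """Keep (bbox, text, confidence) detections with confidence >= min_confidence."""
    return [d for d in detections if d[2] >= min_confidence]


# Made-up sample in the shape that readtext(..., detail=1) returns:
# four corner points, the text, and a confidence between 0 and 1.
sample = [
    ([[10, 10], [110, 10], [110, 40], [10, 40]], "GeForce", 0.93),
    ([[10, 50], [90, 50], [90, 80], [10, 80]], "Laptoo", 0.31),
]

kept = filter_by_confidence(sample, min_confidence=0.5)
print([text for _, text, _ in kept])

# With a real image (assumes the reader created above):
# kept = filter_by_confidence(reader.readtext("labelimg.jpg"))
```

The 0.5 threshold is an arbitrary starting point; a suitable value depends on the images being processed.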

Importing the libraries

import os
import easyocr
import cv2
from matplotlib import pyplot as plt
import numpy as np

Reading the image

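  • Image from a URL
    Image='https://www.codekp.cn/download/img/ubuntu/NVIDIA1.png'
  • Local image
    Image='labelimg.jpg'
    result = reader.readtext(Image, detail=0, paragraph=False)

    If the image URL uses HTTPS and the download fails certificate verification, add the following before the call: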

    # Disable certificate verification globally
    import ssl
    ssl._create_default_https_context = ssl._create_unverified_context
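
An alternative to switching verification off for the whole process is to download the image yourself and pass the raw bytes to readtext, which accepts bytes as well as paths and URLs. The sketch below is one way to do that with the standard library; fetch_image_bytes is a hypothetical helper name, and the 10-second timeout is an arbitrary assumption.

```python
import ssl
import urllib.request


def fetch_image_bytes(url, verify=True, timeout=10):
    """Download an image and return its raw bytes.

    verify=False skips certificate checks for this one request only.
    """
    context = ssl.create_default_context()
    if not verify:
        context.check_hostname = False  # must be cleared before verify_mode
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        return response.read()


# Usage (assumes the reader created earlier):
# data = fetch_image_bytes(
#     "https://www.codekp.cn/download/img/ubuntu/NVIDIA1.png", verify=False)
# result = reader.readtext(data, detail=0)
```

This keeps the relaxed certificate handling scoped to a single download rather than every HTTPS request the program makes.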

Recognition result

['手动搜索驱动程序 提供您的系统信息以搜索全部 GeForce 驱动程序。', '产品类型: GeFOrE', '产品系列: GeFOrcE RTX 30 Series [Notebooks]', '产品: GeForce RTX 3060 Laptoo GPU', '操怍系统: LinUx 04-Oit', '语言', 'Chinese (Simolified]', '下载类型', '全部', 'Q开始搜索']

Accuracy is very high, although a few characters are misread (for example GeFOrE and Laptoo).

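  • Passing ['ch_sim', 'en'] when creating the reader lets it recognize both Simplified Chinese and English.
  • paragraph is set to False here, which means EasyOCR does not merge results: when it finds several pieces of text, it returns them separately instead of combining them.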
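
The shape of each result item depends on these options: detail=0 yields plain strings, paragraph=True with detail=1 yields [bbox, text] pairs, and the default yields [bbox, text, confidence]. The following sketch, using a hypothetical extract_texts helper and made-up sample data, shows one way to pull out just the strings regardless of which shape comes back.

```python
def extract_texts(results):
    """Return only the recognized strings, whatever the result shape."""
    texts = []
    for item in results:
        if isinstance(item, str):
            texts.append(item)      # detail=0: plain strings
        else:
            texts.append(item[1])   # detail=1: text is always at index 1
    return texts


box = [[0, 0], [50, 0], [50, 20], [0, 20]]   # made-up bounding box
plain = ["语言", "Chinese"]                   # detail=0
merged = [[box, "语言 Chinese"]]              # detail=1, paragraph=True
full = [(box, "语言", 0.98), (box, "Chinese", 0.87)]  # detail=1, paragraph=False

print(extract_texts(plain), extract_texts(merged), extract_texts(full))
```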

Drawing the result for a single line of text

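# Requires the full output (detail=1, the default), e.g. result = reader.readtext(Image)
top_left = tuple(map(int, result[0][0][0]))      # OpenCV needs integer pixel coordinates
bottom_right = tuple(map(int, result[0][0][2]))
text = result[0][1]
  1. We fetch the coordinates needed to draw a bounding box and the text on the image in which detection was performed.
  2. top_left takes the top-left corner of the first detection as a tuple; the bottom-right corner is obtained the same way.
  3. The recognized text comes from the same nested result, at index 1 of the detection.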
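    # Image2 is assumed to hold the path of the local image that was recognized
    img = cv2.imread(Image2)
    img = cv2.rectangle(img, top_left, bottom_right, (0, 255, 0), 3)
    img = cv2ImgAddText(img, text, bottom_right[0], bottom_right[1], (0, 255, 0), 10)
    plt.figure(figsize=(10, 10))
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # OpenCV uses BGR, matplotlib expects RGB
    plt.show()
  4. Read the image.
  5. Draw the rectangle.
  6. Write the Chinese or English text onto the image.
  7. Display the image.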
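
The steps above take corners 0 and 2 of the bounding box directly, which works for upright text. For slanted text the four corners do not form an axis-aligned rectangle, so a slightly more defensive option is to take the minimum and maximum over all four points. The helper name bbox_to_rect and the sample quadrilateral below are illustrative assumptions, not part of EasyOCR.

```python
def bbox_to_rect(bbox):
    """Convert four (x, y) corner points into integer (top_left, bottom_right)."""
    xs = [int(round(float(x))) for x, _ in bbox]
    ys = [int(round(float(y))) for _, y in bbox]
    return (min(xs), min(ys)), (max(xs), max(ys))


# Made-up, slightly rotated quadrilateral in the order EasyOCR uses:
# top-left, top-right, bottom-right, bottom-left.
quad = [[12.4, 20.0], [118.6, 18.2], [120.0, 47.9], [13.1, 50.3]]
top_left, bottom_right = bbox_to_rect(quad)
print(top_left, bottom_right)  # (12, 18) (120, 50)
```

The resulting tuples contain plain Python integers, so they can be passed straight to cv2.rectangle.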

Result

Handling results with multiple lines of text

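# Image2 is assumed to hold the path of the local image that was recognized
img = cv2.imread(Image2)
for detection in result:
    top_left = tuple(map(int, detection[0][0]))      # OpenCV needs integer pixel coordinates
    bottom_right = tuple(map(int, detection[0][2]))
    text = detection[1]
    img = cv2.rectangle(img, top_left, bottom_right, (0, 255, 0), 3)
    img = cv2ImgAddText(img, text, bottom_right[0], bottom_right[1], (0, 255, 0), 18)

plt.figure(figsize=(10, 10))
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # OpenCV uses BGR, matplotlib expects RGB
plt.show()
  • Iterate over all detections and draw each line of text.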

Result:

Function for drawing Chinese text

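# Aliased so it cannot clash with the Image path variable used earlier
from PIL import Image as PILImage, ImageDraw, ImageFont

def cv2ImgAddText(img, text, left, top, textColor=(0, 255, 0), textSize=20):
    if isinstance(img, np.ndarray):  # convert an OpenCV (BGR) array to a PIL image
        img = PILImage.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    draw = ImageDraw.Draw(img)
    # simsun.ttc ships with Windows; use another CJK font path on other systems
    fontStyle = ImageFont.truetype("simsun.ttc", textSize, encoding="utf-8")
    draw.text((left, top), text, textColor, font=fontStyle)
    return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)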
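
Because simsun.ttc is a Windows font, the function above may fail to load it on Linux or macOS. One possible workaround, sketched here with a hypothetical find_font helper, is to probe a few candidate paths and use the first one that exists. The paths listed are assumptions about typical installations and may not match your system.

```python
import os


def find_font(candidates):
    """Return the first path in candidates that exists on disk, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Hypothetical candidate locations; adjust to the fonts actually installed.
CJK_FONT_CANDIDATES = [
    r"C:\Windows\Fonts\simsun.ttc",
    "/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc",
    "/System/Library/Fonts/PingFang.ttc",
]

font_path = find_font(CJK_FONT_CANDIDATES)
print(font_path or "no CJK font found; pass an explicit font path instead")
```

The returned path could then be given to ImageFont.truetype in place of the hard-coded file name.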

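Summary

Recognition accuracy is fairly high and latency is very low, around 10-20 ms, which makes EasyOCR suitable for practical applications.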
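
Latency figures like these depend on the hardware, on whether a GPU is used, and on the image size, so it is worth measuring on your own setup. The sketch below is one simple way to do that; measure_latency_ms is a hypothetical helper, and a stand-in workload replaces the OCR call so the block runs without a model.

```python
import statistics
import time


def measure_latency_ms(fn, repeats=10, warmup=1):
    """Call fn() repeatedly and return the per-call latencies in milliseconds."""
    for _ in range(warmup):  # the first call often includes one-off setup cost
        fn()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


# Stand-in workload; replace it with the OCR call to time real recognition.
timings = measure_latency_ms(lambda: sum(range(10_000)), repeats=5)
print(f"median {statistics.median(timings):.3f} ms over {len(timings)} runs")

# With EasyOCR (assumes the reader created earlier):
# timings = measure_latency_ms(lambda: reader.readtext("labelimg.jpg", detail=0))
```

Reporting the median rather than the mean reduces the influence of occasional slow runs.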