A tool for generating character samples for OCR training.

smartcai/OCR_CharGen

 
 

# Character sample generation tool for OCR training

### Generated character samples:

### Dependencies:

  • opencv-python
  • pillow
  • numpy
  • PIL (provided by the pillow package)

### Usage:

  • genSamples: constructor

    • chineseFontpath: path to the Chinese font used for license plates
    • EnglishFontPath: path to the English font used for license plates
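  • genImage: renders a single sample

    • id: index of the character to generate. [0,30] are Chinese characters, [31,40] are digits, and [41,64] are uppercase English letters (I and O are not included)
    • tranformFactor: degree of deformation. To make the classifier robust to distortion, the generated characters should be deformed to some degree. The suggested value is 5-10; larger values produce stronger deformation
    • shadeSize: size of the shade (mask), which can make the character …
    • shadeFilter: characters excluded from shading
    • smuFilter: characters excluded from smudging
    • blur: size of the blur convolution kernel
    • rotFilter: characters excluded from deformation
    • blurFilter_level1: characters that receive weaker blur
    • blurFilter_level2: characters excluded from blur
    • size: output size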
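  • genBatch: batch generation

    • batchSize: size of each batch. A value of 1000 means 1000 images per character
    • charRange: range of character indices, e.g. range(65) or range(0,31)
    • outputPath: output path
    • ... the remaining parameters are the same as for genImage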
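
The way batchSize and charRange combine can be sketched as follows. This is only an illustration that takes the description above at face value (batchSize images for every index in charRange). `plan_batch` is a hypothetical helper and not part of the tool, and it says nothing about how genBatch names or stores its files.

```python
def plan_batch(batch_size, char_range):
    """Hypothetical helper: list one (char_id, sample_no) pair per image.

    Assumes, as the parameter description states, that a batch contains
    batch_size images for each character index in char_range.
    """
    return [(char_id, n) for char_id in char_range for n in range(batch_size)]


# Chinese characters only (indices 0..30), 1000 samples each.
jobs = plan_batch(1000, range(0, 31))
print(len(jobs))  # 31 characters x 1000 samples = 31000
```

With range(65) instead, the same call would cover all 65 characters in the index table.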

### Character index table:

index = {"京": 0, "沪": 1, "津": 2, "渝": 3, "冀": 4, "晋": 5, "蒙": 6, "辽": 7, "吉": 8, "黑": 9, "苏": 10, "浙": 11, "皖": 12, "闽": 13, "赣": 14, "鲁": 15, "豫": 16, "鄂": 17, "湘": 18, "粤": 19, "桂": 20, "琼": 21, "川": 22, "贵": 23, "云": 24, "藏": 25, "陕": 26, "甘": 27, "青": 28, "宁": 29, "新": 30,
"0": 31, "1": 32, "2": 33, "3": 34, "4": 35, "5": 36, "6": 37, "7": 38, "8": 39, "9": 40,
"A": 41, "B": 42, "C": 43, "D": 44, "E": 45, "F": 46, "G": 47, "H": 48, "J": 49, "K": 50, "L": 51, "M": 52, "N": 53, "P": 54, "Q": 55, "R": 56, "S": 57, "T": 58, "U": 59, "V": 60, "W": 61, "X": 62, "Y": 63, "Z": 64};
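
Since the table is just the three character groups numbered in order, it can be rebuilt and inverted in a few lines. The sketch below is a convenience for working with generated labels, not code from the repository; the names `PROVINCES`, `DIGITS`, `LETTERS`, `chars`, and `char_kind` are introduced here for illustration.

```python
# The three groups in the order used by the index table above.
PROVINCES = "京沪津渝冀晋蒙辽吉黑苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新"
DIGITS = "0123456789"
LETTERS = "ABCDEFGHJKLMNPQRSTUVWXYZ"  # the table contains no I or O

# Forward map (character -> index) and reverse map (index -> character).
index = {ch: i for i, ch in enumerate(PROVINCES + DIGITS + LETTERS)}
chars = {i: ch for ch, i in index.items()}


def char_kind(idx):
    """Classify an index as 'chinese', 'digit', or 'letter' (illustrative helper)."""
    if not 0 <= idx < len(index):
        raise ValueError(f"index out of range: {idx}")
    if idx < len(PROVINCES):
        return "chinese"
    if idx < len(PROVINCES) + len(DIGITS):
        return "digit"
    return "letter"


print(index["A"], chars[64], char_kind(31))
```

The reverse map is useful for turning a classifier's predicted index back into a character.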

### Author:

  • Jack Yu

Languages

  • Python 100.0%