Skip to content

基于行块分布函数的通用网页正文抽取算法的Python版本实现,添加了英文支持/ Web page content extraction algorithm, support both Chinese and English

License

Notifications You must be signed in to change notification settings

ccssu/cx-extractor-python.github.io

 
 

Repository files navigation

About

基于行块分布函数的通用网页正文抽取算法的Python版本实现,添加了英文支持/ Web page content extraction algorithm, support both Chinese and English

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • HTML 99.9%
  • Other 0.1%