Merge remote-tracking branch 'origin/develop'
mozillazg committed Aug 16, 2020
2 parents b220506 + 26b274c commit 113865e
Showing 10 changed files with 148 additions and 16 deletions.
27 changes: 27 additions & 0 deletions CHANGELOG.rst
@@ -1,6 +1,32 @@
Changelog
---------


`0.39.0`_ (2020-08-16)
++++++++++++++++++++++++

* **[New]** The ``pinyin`` and ``lazy_pinyin`` functions gain two parameters, ``v_to_u`` and ``neutral_tone_with_five``:

  * With ``v_to_u=True``, the toneless pinyin styles use ``ü`` instead of the original ``v``

    .. code-block:: python

        >>> lazy_pinyin('战略')
        ['zhan', 'lve']
        >>> lazy_pinyin('战略', v_to_u=True)
        ['zhan', 'lüe']

  * With ``neutral_tone_with_five=True``, the styles that mark tones with digits use ``5`` to mark the neutral tone

    .. code-block:: python

        >>> lazy_pinyin('衣裳', style=Style.TONE3)
        ['yi1', 'shang']
        >>> lazy_pinyin('衣裳', style=Style.TONE3, neutral_tone_with_five=True)
        ['yi1', 'shang5']

`0.38.1`_ (2020-07-05)
++++++++++++++++++++++++

@@ -848,3 +874,4 @@ __ https://github.com/mozillazg/python-pinyin/issues/8
.. _0.37.0: https://github.com/mozillazg/python-pinyin/compare/v0.36.0...v0.37.0
.. _0.38.0: https://github.com/mozillazg/python-pinyin/compare/v0.37.0...v0.38.0
.. _0.38.1: https://github.com/mozillazg/python-pinyin/compare/v0.38.0...v0.38.1
.. _0.39.0: https://github.com/mozillazg/python-pinyin/compare/v0.38.1...v0.39.0
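
As a rough illustration of what the two new options in the 0.39.0 entry do to each syllable, the sketch below reproduces the changelog's sample outputs with two stand-alone helpers. `replace_v_with_u` and `mark_neutral_tone` are hypothetical names invented for this example; they are not part of pypinyin and make no claim about how the library implements the options.

```python
import re


def replace_v_with_u(syllable):
    """Hypothetical helper: swap the ASCII placeholder 'v' for 'ü'."""
    return syllable.replace('v', 'ü')


def mark_neutral_tone(syllable):
    """Hypothetical helper: append '5' if the syllable has no tone digit.

    Assumes the syllable is already in a numeric-tone style such as
    TONE2 ('ha3o') or TONE3 ('yi1').
    """
    if re.search(r'[1-5]', syllable):
        return syllable
    return syllable + '5'


# Inputs are the default outputs quoted in the changelog entry.
print([replace_v_with_u(s) for s in ['zhan', 'lve']])
print([mark_neutral_tone(s) for s in ['yi1', 'shang']])
```

Both helpers work on one syllable at a time and ignore the style argument, which the real functions take into account.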
9 changes: 7 additions & 2 deletions README.rst
@@ -56,12 +56,17 @@ Python 3 (on Python 2, replace ``'中心'`` with ``u'中心'``):
     [['ㄓㄨㄥ'], ['ㄒㄧㄣ']]
     >>> lazy_pinyin('中心')  # ignore heteronyms
     ['zhong', 'xin']
+    >>> lazy_pinyin('战略', v_to_u=True)  # do not use v to represent ü
+    ['zhan', 'lüe']
+    # mark the neutral tone with 5
+    >>> lazy_pinyin('衣裳', style=Style.TONE3, neutral_tone_with_five=True)
+    ['yi1', 'shang5']

 **Notes**:

-* The pinyin result does not indicate which finals are neutral-tone; a neutral-tone final carries no tone mark or digit (to mark the neutral tone with ``5``, see the `documentation <https://pypinyin.readthedocs.io/zh_CN/master/contrib.html#neutraltonewith5mixin>`__).
-* Results in the toneless pinyin styles use ``v`` to represent ``ü`` (to use ``ü`` instead of ``v``, see the `documentation <https://pypinyin.readthedocs.io/zh_CN/master/contrib.html#v2umixin>`__).
+* By default, the pinyin result does not indicate which finals are neutral-tone; a neutral-tone final carries no tone mark or digit (pass ``neutral_tone_with_five=True`` to mark the neutral tone with ``5``).
+* By default, results in the toneless pinyin styles use ``v`` to represent ``ü`` (pass ``v_to_u=True`` to use ``ü`` instead of ``v``).
 * By default, characters that have no pinyin are output unchanged (to customize how such characters are handled, see the `documentation <https://pypinyin.readthedocs.io/zh_CN/master/usage.html#handle-no-pinyin>`__).

 Command-line tool:
2 changes: 1 addition & 1 deletion docs/develop.rst
@@ -115,7 +115,7 @@ TODO: draw a flowchart
 6. Commit the code
 7. Check the CI results for the develop branch
 8. Switch to the master branch
-9. Merge the develop branch: ``git merge_dev``
+9. Merge the develop branch: ``make merge_dev``
 10. Update the version number:

    * Large change (1.1.x -> 1.2.x): ``make bump_minor``
6 changes: 3 additions & 3 deletions docs/usage.rst
@@ -24,8 +24,8 @@
 **Notes**:

-* The pinyin result does not indicate which finals are neutral-tone; a neutral-tone final carries no tone mark or digit (to mark the neutral tone with ``5``, see the `documentation <https://pypinyin.readthedocs.io/zh_CN/master/contrib.html#neutraltonewith5mixin>`__).
-* Results in the toneless pinyin styles use ``v`` to represent ``ü`` (to use ``ü`` instead of ``v``, see the `documentation <https://pypinyin.readthedocs.io/zh_CN/master/contrib.html#v2umixin>`__).
+* By default, the pinyin result does not indicate which finals are neutral-tone; a neutral-tone final carries no tone mark or digit (pass ``neutral_tone_with_five=True`` to mark the neutral tone with ``5``).
+* By default, results in the toneless pinyin styles use ``v`` to represent ``ü`` (pass ``v_to_u=True`` to use ``ü`` instead of ``v``).
 * By default, characters that have no pinyin are output unchanged (to customize how such characters are handled, see the `documentation <https://pypinyin.readthedocs.io/zh_CN/master/usage.html#handle-no-pinyin>`__).


@@ -243,4 +243,4 @@ CYRILLIC_FIRST :py:attr:`~pypinyin.Style.CYRILLIC_FIRST`
 ================== =========================================


-.. _《汉语拼音方案》: http://www.moe.edu.cn/s78/A19/yxs_left/moe_810/s230/195802/t19580201_186000.html
+.. _《汉语拼音方案》: http://www.moe.gov.cn/s78/A19/yxs_left/moe_810/s230/195802/t19580201_186000.html
48 changes: 48 additions & 0 deletions pypinyin/converter.py
@@ -9,6 +9,8 @@
    PHRASES_DICT, PINYIN_DICT,
    RE_HANS
)
from pypinyin.contrib.uv import V2UMixin
from pypinyin.contrib.neutral_tone import NeutralToneWith5Mixin
from pypinyin.utils import _remove_dup_items
from pypinyin.style import auto_discover
from pypinyin.style import convert as convert_style
@@ -321,3 +323,49 @@ def _convert_nopinyin_chars(self, chars, style, heteronym, errors, strict):
                return ''.join(text_type('%x' % ord(x)) for x in chars)
            else:
                return text_type('%x' % ord(chars))


class _v2UConverter(V2UMixin, DefaultConverter):
    pass


class _neutralToneWith5Converter(NeutralToneWith5Mixin, DefaultConverter):
    pass


class _neutralToneWith5AndV2UConverter(
        NeutralToneWith5Mixin, V2UMixin, DefaultConverter):
    pass


class _mixConverter(DefaultConverter):
    def __init__(self, v_to_u=False, neutral_tone_with_five=False, **kwargs):
        super(_mixConverter, self).__init__(**kwargs)
        self._v_to_u = v_to_u
        self._neutral_tone_with_five = neutral_tone_with_five

        self._v2uconverter = _v2UConverter()
        self._neutraltonewith5converter = _neutralToneWith5Converter()
        self._neutraltonewith5andv2uconverter = \
            _neutralToneWith5AndV2UConverter()

    def post_convert_style(self, han, orig_pinyin, converted_pinyin,
                           style, strict, **kwargs):
        if self._v_to_u and not self._neutral_tone_with_five:
            return self._v2uconverter.post_convert_style(
                han, orig_pinyin, converted_pinyin, style, strict,
                **kwargs)

        if self._neutral_tone_with_five and not self._v_to_u:
            return self._neutraltonewith5converter.post_convert_style(
                han, orig_pinyin, converted_pinyin, style, strict,
                **kwargs)

        if self._neutral_tone_with_five and self._v_to_u:
            return self._neutraltonewith5andv2uconverter.post_convert_style(
                han, orig_pinyin, converted_pinyin, style, strict,
                **kwargs)

        return super(_mixConverter, self).post_convert_style(
            han, orig_pinyin, converted_pinyin, style, strict,
            **kwargs)
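
The flag-to-converter dispatch added in `_mixConverter` can be tried out in isolation with a toy model. Everything below is a stand-in written for this illustration: the `Toy*` classes are hypothetical, take a single syllable instead of the real `post_convert_style` arguments, ignore the style entirely, and are not the library's `V2UMixin` or `NeutralToneWith5Mixin`.

```python
class ToyBase(object):
    """Stand-in for a default converter: returns the syllable unchanged."""

    def post_convert_style(self, syllable):
        return syllable


class ToyVToU(object):
    """Stand-in mixin: rewrites 'v' as 'ü' after the next class in the MRO."""

    def post_convert_style(self, syllable):
        result = super().post_convert_style(syllable)
        return result.replace('v', 'ü')


class ToyNeutralFive(object):
    """Stand-in mixin: appends '5' when no tone digit is present."""

    def post_convert_style(self, syllable):
        result = super().post_convert_style(syllable)
        if any(ch.isdigit() for ch in result):
            return result
        return result + '5'


class ToyVToUConverter(ToyVToU, ToyBase):
    pass


class ToyNeutralFiveConverter(ToyNeutralFive, ToyBase):
    pass


class ToyBothConverter(ToyNeutralFive, ToyVToU, ToyBase):
    pass


class ToyMixConverter(ToyBase):
    def __init__(self, v_to_u=False, neutral_tone_with_five=False):
        self._key = (bool(v_to_u), bool(neutral_tone_with_five))
        # As in the diff, the three variants are built once up front.
        self._by_flags = {
            (True, False): ToyVToUConverter(),
            (False, True): ToyNeutralFiveConverter(),
            (True, True): ToyBothConverter(),
        }

    def post_convert_style(self, syllable):
        delegate = self._by_flags.get(self._key)
        if delegate is None:  # both flags off: default behaviour
            return super().post_convert_style(syllable)
        return delegate.post_convert_style(syllable)


print(ToyMixConverter(v_to_u=True).post_convert_style('lv4'))
print(ToyMixConverter(neutral_tone_with_five=True).post_convert_style('le'))
```

The lookup table here is a design choice of the sketch; the diff expresses the same four cases as a chain of `if` statements.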
34 changes: 29 additions & 5 deletions pypinyin/core.py
@@ -9,7 +9,7 @@
 from pypinyin.constants import (
     PHRASES_DICT, PINYIN_DICT, Style
 )
-from pypinyin.converter import DefaultConverter
+from pypinyin.converter import DefaultConverter, _mixConverter
 from pypinyin.seg import mmseg
 from pypinyin.seg.simpleseg import seg
 from pypinyin.utils import (
@@ -209,7 +209,8 @@ def phrase_pinyin(phrase, style, heteronym, errors='default', strict=True):


 def pinyin(hans, style=Style.TONE, heteronym=False,
-           errors='default', strict=True):
+           errors='default', strict=True,
+           v_to_u=False, neutral_tone_with_five=False):
     """Convert Chinese characters to pinyin and return the list of pinyin for the characters.

     :param hans: a string of Chinese characters (``'你好吗'``) or a list (``['你好', '吗']``).
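@@ -228,6 +229,11 @@ def pinyin(hans, style=Style.TONE, heteronym=False,
     :param heteronym: whether to enable heteronyms
     :param strict: whether to strictly follow 《汉语拼音方案》 when handling initials and finals; see :ref:`strict`
+    :param v_to_u: whether results in the toneless pinyin styles use ``ü`` instead of the original ``v``
+    :type v_to_u: bool
+    :param neutral_tone_with_five: whether results in the pinyin styles that mark tones with digits
+                                   use 5 to mark the neutral tone
+    :type neutral_tone_with_five: bool
     :return: list of pinyin
     :rtype: list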
@@ -247,8 +253,14 @@ def pinyin(hans, style=Style.TONE, heteronym=False,
       [['zho1ng'], ['xi1n']]
       >>> pinyin('中心', style=Style.CYRILLIC)
       [['чжун1'], ['синь1']]
+      >>> pinyin('战略', v_to_u=True, style=Style.NORMAL)
+      [['zhan'], ['lüe']]
+      >>> pinyin('衣裳', style=Style.TONE3, neutral_tone_with_five=True)
+      [['yi1'], ['shang5']]
     """
-    return _default_pinyin.pinyin(
+    _pinyin = Pinyin(_mixConverter(
+        v_to_u=v_to_u, neutral_tone_with_five=neutral_tone_with_five))
+    return _pinyin.pinyin(
         hans, style=style, heteronym=heteronym, errors=errors, strict=strict)


@@ -292,7 +304,8 @@ def slug(hans, style=Style.NORMAL, heteronym=False, separator='-',
     )


-def lazy_pinyin(hans, style=Style.NORMAL, errors='default', strict=True):
+def lazy_pinyin(hans, style=Style.NORMAL, errors='default', strict=True,
+                v_to_u=False, neutral_tone_with_five=False):
     """Convert Chinese characters to pinyin and return a pinyin list without heteronym results.

     Unlike :py:func:`~pypinyin.pinyin`, each returned pinyin is a string,
@@ -305,6 +318,11 @@ def lazy_pinyin(hans, style=Style.NORMAL, errors='default', strict=True):
     :param errors: how to handle characters that have no pinyin; for details see
                    :py:func:`~pypinyin.pinyin`
     :param strict: whether to strictly follow 《汉语拼音方案》 when handling initials and finals; see :ref:`strict`
+    :param v_to_u: whether results in the toneless pinyin styles use ``ü`` instead of the original ``v``
+    :type v_to_u: bool
+    :param neutral_tone_with_five: whether results in the pinyin styles that mark tones with digits
+                                   use 5 to mark the neutral tone
+    :type neutral_tone_with_five: bool
     :return: list of pinyin (e.g. ``['zhong', 'guo', 'ren']``)
     :rtype: list
@@ -324,6 +342,12 @@ def lazy_pinyin(hans, style=Style.NORMAL, errors='default', strict=True):
       ['zho1ng', 'xi1n']
       >>> lazy_pinyin('中心', style=Style.CYRILLIC)
       ['чжун1', 'синь1']
+      >>> lazy_pinyin('战略', v_to_u=True)
+      ['zhan', 'lüe']
+      >>> lazy_pinyin('衣裳', style=Style.TONE3, neutral_tone_with_five=True)
+      ['yi1', 'shang5']
     """
-    return _default_pinyin.lazy_pinyin(
+    _pinyin = Pinyin(_mixConverter(
+        v_to_u=v_to_u, neutral_tone_with_five=neutral_tone_with_five))
+    return _pinyin.lazy_pinyin(
         hans, style=style, errors=errors, strict=strict)
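
The docstring examples in this file show two return shapes: `pinyin` yields one inner list per character, while `lazy_pinyin` yields a flat list. Assuming heteronyms are off, so that each inner list holds exactly one reading, the relationship between the two shapes can be sketched as a plain flatten. `flatten` is a hypothetical helper for illustration only and says nothing about how `lazy_pinyin` is implemented.

```python
from itertools import chain


def flatten(nested):
    """Flatten a list of per-character lists into one flat list."""
    return list(chain.from_iterable(nested))


# Sample value copied from the pinyin() docstring example in this diff.
nested = [['yi1'], ['shang5']]
print(flatten(nested))
```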
8 changes: 6 additions & 2 deletions pypinyin/core.pyi
@@ -62,7 +62,9 @@ def pinyin(hans: Union[List[Text], Text],
            style: TStyle = ...,
            heteronym: bool = ...,
            errors: TErrors = ...,
-           strict: bool = ...
+           strict: bool = ...,
+           v_to_u: bool = ...,
+           neutral_tone_with_five: bool = ...
            ) -> List[List[Text]]: ...


@@ -78,7 +80,9 @@ def slug(hans: Union[List[Text], Text],
 def lazy_pinyin(hans: Union[List[Text], Text],
                 style: TStyle = ...,
                 errors: TErrors = ...,
-                strict: bool = ...
+                strict: bool = ...,
+                v_to_u: bool = ...,
+                neutral_tone_with_five: bool = ...
                 ) -> List[Text]: ...

3 changes: 1 addition & 2 deletions setup.py
@@ -51,7 +51,7 @@ def long_description():

 meta_d = get_meta()
 setup(
-    name=meta_d['__title__'],
+    name='pypinyin',
     version=meta_d['__version__'],
     description='汉字拼音转换模块/工具.',
     long_description=long_description(),
@@ -62,7 +62,6 @@ def long_description():
     license=meta_d['__license__'],
     project_urls={
         'Documentation': 'https://pypinyin.readthedocs.io/',
-        'Say Thanks!': 'https://saythanks.io/to/mozillazg',
         'Source': 'https://github.com/mozillazg/python-pinyin',
         'Tracker': 'https://github.com/mozillazg/python-pinyin/issues',
     },
26 changes: 25 additions & 1 deletion tests/contrib/test_neutral_tone.py
@@ -30,13 +30,32 @@ class TheyConverter(V2UMixin, NeutralToneWith5Mixin, DefaultConverter):
def test_neutral_tone_with_5():
    assert lazy_pinyin('好了', style=Style.TONE2) == ['ha3o', 'le']
    assert my_pinyin.lazy_pinyin('好了', style=Style.TONE2) == ['ha3o', 'le5']
    assert lazy_pinyin(
        '好了', style=Style.TONE2, neutral_tone_with_five=True
    ) == ['ha3o', 'le5']
    assert her_pinyin.lazy_pinyin('好了', style=Style.TONE2) == ['ha3o', 'le5']
    assert lazy_pinyin(
        '好了', style=Style.TONE2, neutral_tone_with_five=True,
        v_to_u=True) == ['ha3o', 'le5']
    assert they_pinyin.lazy_pinyin('好了', style=Style.TONE2) == ['ha3o', 'le5']
    assert lazy_pinyin(
        '好了', style=Style.TONE2, v_to_u=True,
        neutral_tone_with_five=True) == ['ha3o', 'le5']

    assert lazy_pinyin('好了绿', style=Style.TONE2) == ['ha3o', 'le', 'lv4']
    assert lazy_pinyin(
        '好了绿', style=Style.TONE2, v_to_u=True,
        neutral_tone_with_five=True) == ['ha3o', 'le5', 'lü4']

    assert lazy_pinyin('好了') == ['hao', 'le']
    assert my_pinyin.lazy_pinyin('好了') == ['hao', 'le']
    assert lazy_pinyin('好了', neutral_tone_with_five=True) == ['hao', 'le']
    assert her_pinyin.lazy_pinyin('好了') == ['hao', 'le']
    assert they_pinyin.lazy_pinyin('好了') == ['hao', 'le']
    assert lazy_pinyin(
        '好了', neutral_tone_with_five=True, v_to_u=True) == ['hao', 'le']
    assert lazy_pinyin(
        '好了绿', v_to_u=True, neutral_tone_with_five=True) == [
        'hao', 'le', 'lü']


@mark.parametrize('input,style,expected_old, expected_new', [
@@ -69,3 +88,8 @@ def test_neutral_tone_with_5_many_cases(
        input, style, expected_old, expected_new):
    assert lazy_pinyin(input, style=style) == expected_old
    assert my_pinyin.lazy_pinyin(input, style=style) == expected_new
    assert lazy_pinyin(
        input, style=style, neutral_tone_with_five=True) == expected_new
    assert lazy_pinyin(
        input, style=style, neutral_tone_with_five=True,
        v_to_u=True) == expected_new
1 change: 1 addition & 0 deletions tests/contrib/test_uv.py
@@ -17,3 +17,4 @@ class MyConverter(V2UMixin, DefaultConverter):
def test_v2u():
    assert lazy_pinyin('战略') == ['zhan', 'lve']
    assert my_pinyin.lazy_pinyin('战略') == ['zhan', 'lüe']
    assert lazy_pinyin('战略', v_to_u=True) == ['zhan', 'lüe']
