merged conflicts
jiangfeng1124 committed Jun 13, 2014
2 parents a999f8a + b4bc801 commit cdf5044
Showing 246 changed files with 46,411 additions and 28,251 deletions.
31 changes: 19 additions & 12 deletions .gitignore
@@ -24,21 +24,28 @@ config.h
include/
lib/
bin/
lgdpj
lgsrl
otcws
otpos
otner
maxent
cws
cws_cmdline
multi_cws_cmdline
pos
par
ner
tools/train/lgdpj
tools/train/lgsrl
tools/train/otcws
tools/train/otpos
tools/train/otner
tools/train/maxent
examples/cws
examples/cws_cmdline
examples/multi_cws_cmdline
examples/pos
examples/pos_cmdline
examples/multi_pos_cmdline
examples/par
examples/ner

###############
# data file #
###############
new_ltp_data/
ltp_data/

##################
# running folder #
##################
dummy/
13 changes: 13 additions & 0 deletions .travis.yml
@@ -7,6 +7,19 @@ compiler:
before_install:
- sudo apt-get install gfortran
- sudo apt-get install cmake
env:
  global:
    - secure: "VdY9DQK8PdZ5cBpn9qG+8KqyC7BFuYPNwU4f5n19nto62V6ifU5XOLBkxCF36bSF8C4Nf0y0uDdj4gqMnL7OqgwFjucBRQLOuk/10uuy3azEjGzTxePgXlYw15XmMEpWy3hvxEfFqvonJ0g9+fZjKeEmbASVixIbWpYN/pIy2jQ="

addons:
  coverity_scan:
    project:
      name: "HIT-SCIR/ltp"
      description: "Language Technology Platform"
    notification_email: [email protected]
    build_command_prepend: "./configure"
    build_command: "make -j4"
    branch_pattern: master

script:
- ./configure
40 changes: 24 additions & 16 deletions ChangeLog.md
@@ -1,3 +1,13 @@
2014-01-20
----------
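* Added model pruning to the word segmentation, part-of-speech tagging, and dependency parsing modules, which reduces model size. Users can set the pruning strength with the `rare-feature-threshold` parameter in the configuration file: if `rare-feature-threshold` is 0, only features whose value is 0 are removed; if it is greater than 0, features that were updated fewer times than the threshold are removed as well. This optimization mainly follows [Learning Sparser Perceptron Models](http://www.cs.bgu.ac.il/~yoavg/publications/acl2011sparse.pdf)
* `ltp_server` now returns an error code on abnormal input: if the input data is incorrectly encoded or the input XML is malformed, it returns 400
* Fixed memory leaks in the training suites for part-of-speech tagging, named entity recognition, and dependency parsing
* Fixed a memory leak in semantic role labeling
* Fixed incorrect identifiers in the part-of-speech tagging and named entity recognition model files. This change makes version 3.1.0 and later incompatible with 3.0.x models, so please take note
* Fixed a problem caused by boost.multi_array.views that prevented Debug builds under MSVC
* Fixed a bug where an empty string passed when opening a file prevented LTP from running properly on Windows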
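
The pruning rule described in the first entry can be sketched roughly as follows. This is a minimal illustration and not LTP's implementation: the `Feature` struct, its fields, and `prune_features` are hypothetical names, and the sketch assumes that each feature stores a weight together with the number of times training updated it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical feature record; LTP's real model layout is not shown in this diff.
struct Feature {
    std::string name;
    double weight;      // learned parameter value
    int update_count;   // number of times training updated this feature
};

// Keep a feature only if its weight is non-zero and, when the threshold is
// positive, it was updated at least `rare_feature_threshold` times.
std::vector<Feature> prune_features(const std::vector<Feature>& features,
                                    int rare_feature_threshold) {
    assert(rare_feature_threshold >= 0);
    std::vector<Feature> kept;
    for (std::size_t i = 0; i < features.size(); ++i) {
        const Feature& f = features[i];
        if (f.weight == 0.0) continue;  // threshold 0: drop zero-valued features
        if (rare_feature_threshold > 0 &&
            f.update_count < rare_feature_threshold) continue;  // drop rare features
        kept.push_back(f);
    }
    return kept;
}
```

With a threshold of 0 the sketch only discards zero-valued features; any positive threshold additionally discards features that were rarely updated, which is where most of the size reduction would be expected to come from.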

2013-09-29
----------
* Fixed compilation problems on Windows
@@ -199,11 +209,9 @@ float CParser::Smoothen(float ftd, float ftt, float ftttt, float ftw, float fwt,
* 胡禹轩 fixed the overlapped bug in srl
* While modifying IRLAS_DLL_x.cpp this morning, I used MyLib.cpp. However, ltp contains many copies of MyLib.cpp, such as:

```
_irlas/MyLib.cpp
__util/MyLib.cpp
_parser/MyLib.cpp
```
* _irlas/MyLib.cpp
* __util/MyLib.cpp
* _parser/MyLib.cpp

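and others, which have not been unified yet.
* I simply replaced __util/MyLib.cpp with _irlas/MyLib.cpp, which introduced a new bug, because the two files implement convert_to_pair differently. The intended behavior is `、/wp => [、][wp]`; the implementation in _irlas/MyLib.cpp is wrong.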
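
The intended `、/wp => [、][wp]` behavior can be illustrated with a small sketch. This is not the code from either MyLib.cpp: `split_word_tag` is a hypothetical name, and splitting at the last `/` is an assumption made here so that a word containing `/` keeps its slash.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper, not the MyLib.cpp code: split "word/tag" at the LAST
// '/', so that a word which itself contains '/' stays intact.
std::pair<std::string, std::string> split_word_tag(const std::string& item) {
    const std::string::size_type pos = item.rfind('/');
    if (pos == std::string::npos) {
        return std::make_pair(item, std::string());  // no tag present
    }
    return std::make_pair(item.substr(0, pos), item.substr(pos + 1));
}
```

Keeping a single shared helper like this in one place would avoid the divergence between copies that the entry describes.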
@@ -268,7 +276,9 @@ string itos(int i);
```
char* presult = new char[5000];
```

When the sentence is too long, this causes an out-of-bounds memory access. Changed to:

```
int nChar = 0;
for (int i=0; i<(int)vecWord.size(); ++i) {
@@ -292,17 +302,17 @@ string itos(int i);

2007-11-22
----------
In SDS: `void SDS_TS::SelectSnt()`
declares:
In SDS, `void SDS_TS::SelectSnt()` declares:

```
unsigned sntNum;
```
but later uses:
```
sntNum = m_vctSntPairs_Score[summarySntNum].m_nSntNum - 1;
if(sntNum >= 0) {
...
}
sntNum = m_vctSntPairs_Score[summarySntNum].m_nSntNum - 1;
if(sntNum >= 0) {
...
}
```
In this case, when `m_vctSntPairs_Score[summarySntNum].m_nSntNum == 0`, the result is `sntNum = 0xFFFF;`
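
The pitfall can be reproduced in isolation. The sketch below uses hypothetical helper names and is not the SDS code. It shows that subtracting 1 from 0 in an `unsigned` wraps to the largest representable value (0xFFFFFFFF where `unsigned` is 32 bits wide), so a `>= 0` check on it can never fail. Switching to a signed type is one possible fix; the fix actually applied is not visible in this excerpt.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for `m_nSntNum - 1` stored in an unsigned variable.
unsigned previous_index_unsigned(unsigned snt_num) {
    return snt_num - 1;  // wraps to the maximum unsigned value when snt_num == 0
}

// One possible fix: use a signed type so that -1 is representable
// and the `>= 0` check is meaningful.
int previous_index_signed(int snt_num) {
    return snt_num - 1;
}

// True only when there is a valid previous index.
bool has_previous(int snt_num) {
    return previous_index_signed(snt_num) >= 0;
}
```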

@@ -313,8 +323,7 @@ In SDS: `void SDS_TS::SelectSnt()`

2007-11-22
----------
* In Parser: in parser_dll_x.cpp, `void Parse(vector < string >& vecWord, vector < string >& vecPOS, vector < pair<int,string> >& vecParse)`
was originally:
* In parser_dll_x.cpp, `void Parse(vector < string >& vecWord, vector < string >& vecPOS, vector < pair<int,string> >& vecParse)` was originally:
```
char * csOutput = new char[vecWord.size() * 50];
```
@@ -326,9 +335,8 @@ In SDS: `void SDS_TS::SelectSnt()`
}
char * csOutput = new char[nChar * 2 + vecWord.size() * 32];
```
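because the input is sometimes:
```"------------------------------------"```
or a very long string of digits, which causes an out-of-bounds memory access.

because the input is sometimes `"------------------------------------"` or a very long string of digits, which causes an out-of-bounds memory access.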
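
The sizing rule in the fix can be written as a small helper. This is a sketch under assumptions: the loop body is elided in the diff, so it is assumed here that `nChar` accumulates the byte length of every word, and the function names are hypothetical. The second function shows an alternative that avoids manual sizing altogether by letting `std::string` grow as needed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper mirroring the sizing rule quoted above:
// twice the total character count plus 32 bytes of slack per word.
// Assumption: nChar is the sum of the word lengths.
std::size_t output_buffer_size(const std::vector<std::string>& words) {
    std::size_t n_char = 0;
    for (std::size_t i = 0; i < words.size(); ++i) {
        n_char += words[i].size();
    }
    return n_char * 2 + words.size() * 32;
}

// Alternative that needs no manual sizing: let std::string grow on demand.
std::string join_words(const std::vector<std::string>& words, char sep) {
    std::string out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i > 0) out += sep;
        out += words[i];
    }
    return out;
}
```

For a single word of 36 dashes, the old rule of 50 bytes per word reserves 50 bytes, while the rule above reserves 104, which scales with the actual input length rather than the word count alone.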

2007-11-21
----------
2 changes: 1 addition & 1 deletion README.md
@@ -24,7 +24,7 @@
---

* [Baidu Cloud (百度云)](http://pan.baidu.com/share/link?shareid=1988562907&uk=2738088569)
* Current model version: 3.0.3
* Current model version: 3.1.0

Open-source license
-------