Source code and dataset for the ACL 2022 Findings paper "LEVEN: A Large-Scale Chinese Legal Event Detection Dataset"

thunlp/LEVEN

LEVEN

Dataset and source code for the ACL 2022 Findings paper "LEVEN: A Large-Scale Chinese Legal Event Detection Dataset".

Background

Events are the essence of the facts in legal cases. Therefore, Legal Event Detection (LED) is fundamentally important and naturally beneficial to case understanding and other Legal AI tasks.

[Image: bg]

Overview

The dataset can be obtained from Tsinghua Cloud or Google Drive. The annotation guidelines are provided in Annotation Guidelines. You can also check out our poster at the ACL 2022 main conference.

We have deliberately removed the annotations from the test set. To obtain results on the LEVEN test set, please refer to the Leaderboard section.

Large Scale

LEVEN is the largest Legal Event Detection dataset and the largest Chinese Event Detection dataset. Here is a comparison between the scale of LEVEN and other datasets.

[Image: tab1]

Datasets denoted with * are not publicly available, and – means the value is not accessible.

High Coverage

LEVEN contains 108 event types in total, including 64 charge-oriented events and 44 general events. Their distribution is shown below.

[Image: tab2]
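
As a rough illustration of how such a distribution could be tallied, the sketch below counts event mentions per type over a few in-memory documents. The record layout (an `events` list whose items carry a `type` string and a `mention` list) and the helper name `count_event_types` are assumptions made for this example, not a description of the released files, so check the downloaded data for the actual field names. The sample type names are placeholders as well.

```python
from collections import Counter


def count_event_types(docs):
    """Count annotated mentions per event type.

    Assumes (hypothetically) that each document is a dict with an
    "events" list, and that each event has a "type" string and a
    "mention" list.
    """
    counts = Counter()
    for doc in docs:
        for event in doc.get("events", []):
            counts[event["type"]] += len(event.get("mention", []))
    return counts


# Toy documents with placeholder type names; this is not real LEVEN data.
sample_docs = [
    {"events": [
        {"type": "TypeA", "mention": [{"trigger_word": "x"}, {"trigger_word": "y"}]},
        {"type": "TypeB", "mention": [{"trigger_word": "z"}]},
    ]},
    {"events": [
        {"type": "TypeA", "mention": [{"trigger_word": "w"}]},
    ]},
]

type_counts = count_event_types(sample_docs)
for event_type, n in type_counts.most_common():
    print(event_type, n)
```

Using `Counter` keeps the tally sorted by frequency on demand via `most_common()`, which is convenient when looking for rare event types.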

The LEVEN event schema has a sophisticated hierarchical structure, which is shown here.

Leaderboard

LEVEN is adopted for CAIL 2022, the most influential Legal AI contest in China.

You can submit your predictions to the CAIL Event Detection Track to win a prize of up to CNY 15,000!

Please follow the submission instructions here.

Experiments

The source code for the experiments is included in the Baselines and Downstreams folders.

Baselines

We implement six competitive baselines; their performance is shown below.

[Image: tab3]

Downstream Tasks

We also explore the use of LEVEN on two downstream tasks. We simply use events as side information to improve performance on Legal Judgment Prediction and Similar Case Retrieval.
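
One simple way to picture "events as side information" is to attach detected event types to the case text before it reaches a downstream model. The sketch below is only an illustration under that assumption. It is not necessarily how the code in the Downstreams folder works, and the function name `add_event_markers`, the marker format, and the sample inputs are all hypothetical.

```python
def add_event_markers(tokens, triggers):
    """Insert an event-type marker after each trigger token.

    tokens:   list of token strings for one case description.
    triggers: dict mapping token index -> event type (hypothetical format).
    """
    augmented = []
    for i, token in enumerate(tokens):
        augmented.append(token)
        if i in triggers:
            # The "[EVT:<type>]" marker is an arbitrary choice for this sketch.
            augmented.append(f"[EVT:{triggers[i]}]")
    return augmented


tokens = ["A", "hit", "B", "and", "fled"]
triggers = {1: "Attack", 4: "Escape"}  # placeholder event types
augmented = add_event_markers(tokens, triggers)
print(" ".join(augmented))
```

The augmented token sequence could then be fed to any text classifier or retriever in place of the raw tokens.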

The experiment results for Legal Judgment Prediction are shown below.

[Image: tab4]

The experiment results for Similar Case Retrieval are shown below.

[Image: tab5]

Schema

The Chinese event schema is shown below. Please check our paper for the English version.

The detailed explanation and annotation guidelines are provided in Annotation Guidelines.

[Image: schema]

Citation

If the data and code help you, please cite this paper.

@inproceedings{yao-etal-2022-leven,
    title = "{LEVEN}: A Large-Scale {C}hinese Legal Event Detection Dataset",
    author = "Yao, Feng and Xiao, Chaojun and Wang, Xiaozhi and Liu, Zhiyuan and Hou, Lei and Tu, Cunchao and Li, Juanzi and Liu, Yun and Shen, Weixing and Sun, Maosong",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2022",
    year = "2022",
    url = "https://aclanthology.org/2022.findings-acl.17",
    doi = "10.18653/v1/2022.findings-acl.17",
    pages = "183--201",
}
