Argument retrieval model for Touché @ CLEF 2020 - 1st Shared Task on Argument Retrieval.
- Touché notebook (English, short paper)
- Qrel evaluation - Evaluation results for official qrels (on original dataset)
- Leaderboard results - Official results as team 'Oscar François de Jarjayes' on the leaderboard
In this work we explore the yet untested inclusion of sentiment analysis in the argument ranking process. By utilizing a word embedding model we create document embeddings for all queries and arguments. These are compared with each other to calculate top-N argument context scores for each query. We also calculate top-N DPH scores with the Terrier Framework. This way, each query receives two lists of top-N arguments. Afterwards we form an intersection of both argument lists and sort the result by the DPH scores. To further increase the ranking quality, we sort the final arguments of each query by sentiment values. Our findings ultimately imply that rewarding neutral sentiments can decrease the quality of the retrieval outcome.
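The core combination step can be sketched in a few lines of Python (function and variable names are illustrative, not the repository's actual code): keep only the arguments that appear in both top-N lists and order them by their DPH score; the sentiment re-ranking shown further below is applied afterwards.

def combine_rankings(desm_top_n, dph_top_n):
    """Keep only arguments found in both top-N lists, ordered by their DPH score."""
    shared = set(desm_top_n) & set(dph_top_n)   # intersection of both argument lists
    return sorted(shared, key=lambda arg_id: dph_top_n[arg_id], reverse=True)

# Toy example with made-up scores:
desm_scores = {"arg-1": 0.91, "arg-2": 0.88, "arg-3": 0.80}   # DESM top-N for one query
dph_scores = {"arg-2": 17.4, "arg-3": 18.9, "arg-4": 12.1}    # DPH top-N for the same query
print(combine_rankings(desm_scores, dph_scores))              # ['arg-3', 'arg-2']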
- Download and extract the Args.me Corpus
- Put args-me.json and topics.xml in your input directory
$ docker pull mongo:4.4
$ docker build -t argu .
$ chmod +x ./tira_run.sh
$ ./tira_run.sh -s <run-type> -i $inputDataset -o $outputDir
- Run Types
  - no: No sentiments
  - emotional: Emotional is better
  - neutral: Neutral is better
$ docker build -t argu .
- Build the image
$ docker run --name argu-mongo -p 27017:27017 -d --rm mongo
- Starts a MongoDB container
$ docker run -e RUN_TYPE=<run-type> -v <input-dir-path>:/input -v <output-dir-path>:/output --name argu --rm -it --network="host" argu
- Runs the ArgU container
- Run Types
  - no: No sentiments
  - emotional: Emotional is better
  - neutral: Neutral is better
- Input directory with args-me.json and topics.xml
- Output directory will receive the results as run.txt
$ docker stop argu-mongo
- Stops (and, because of --rm, removes) the MongoDB container
$ pip install -r requirements.txt
$ python argU/preprocessing/mongodb.py -i <input-dir-path>
- Create a mapping (MongoDB ID <--> argument.id) and store it in MongoDB
- Store arguments with the new ID in MongoDB
- Clean arguments and store them as train-arguments in MongoDB
- Read sentiments and store them in MongoDB
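A minimal sketch of the ID-mapping idea with pymongo (database, collection, and field names are assumptions for illustration, not necessarily the ones argU uses):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")    # the MongoDB container started above
db = client["argU"]                                  # assumed database name

# Insert an argument and remember the pair (MongoDB _id, original argument id).
argument = {"id": "10113b57-2019-04-18T17:05:08Z-00001-000", "text": "..."}
mongo_id = db["arguments"].insert_one(argument).inserted_id
db["id_mapping"].insert_one({"mongo_id": mongo_id, "arg_id": argument["id"]})

# Later, resolve a MongoDB _id back to the official args-me id.
mapping = db["id_mapping"].find_one({"mongo_id": mongo_id})
print(mapping["arg_id"])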
$ python argU/preprocessing/trec.py -i <input-dir-path>
- Create .trec files for Terrier (train arguments and queries)
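The .trec files follow the classic TREC SGML layout that Terrier can index. A hedged sketch of writing one document (the exact fields argU emits may differ):

def write_trec_doc(out_file, doc_id, text):
    """Append one document in TREC SGML format."""
    out_file.write("<DOC>\n")
    out_file.write(f"<DOCNO>{doc_id}</DOCNO>\n")
    out_file.write(f"<TEXT>{text}</TEXT>\n")
    out_file.write("</DOC>\n")

with open("args.trec", "w", encoding="utf-8") as f:
    write_trec_doc(f, "10113b57-2019-04-18T17:05:08Z-00001-000", "Climate change is ...")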
$ python argU/indexing/a2v.py -f
- Generate a CBOW model
- Generate argument embeddings and store them into MongoDB
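A minimal sketch of this step with gensim (gensim >= 4 API; parameter values and the toy data are illustrative): train a CBOW model on the cleaned argument texts, then average the word vectors of each argument to obtain its document embedding.

import numpy as np
from gensim.models import Word2Vec

# Tokenized, cleaned argument texts (toy data).
sentences = [["climate", "change", "is", "real"],
             ["school", "uniforms", "should", "be", "mandatory"]]

# sg=0 selects CBOW (Continuous Bag of Words).
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=0)

def doc_embedding(tokens, model):
    """Average the word vectors of all in-vocabulary tokens."""
    vectors = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vectors, axis=0) if vectors else np.zeros(model.vector_size)

print(doc_embedding(sentences[0], model).shape)  # (100,)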
Install and run Terrier (see Dockerfile)
- Calculate DPH scores for the queries
- Copy the result file into resources
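Inside the Docker image, Terrier itself is used; purely as an illustration (not part of this repository), the same DPH retrieval can be sketched with PyTerrier, assuming the .trec file from the preprocessing step and placeholder paths:

import pandas as pd
import pyterrier as pt

pt.init()

# Index the .trec file produced in the preprocessing step.
indexer = pt.TRECCollectionIndexer("./terrier_index")
index_ref = indexer.index(["./resources/args.trec"])

# Retrieve with the DPH weighting model.
dph = pt.BatchRetrieve(index_ref, wmodel="DPH")
queries = pd.DataFrame([{"qid": "1", "query": "is climate change real"}])
results = dph.transform(queries)    # columns include qid, docno, rank, score
results.to_csv("./resources/dph_scores.csv", index=False)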
$ python -m argU -d
- Compare given queries with argument embeddings; store Top-N DESM scores into MongoDB
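The DESM comparison boils down to averaging cosine similarities between each query word vector and an argument's document embedding. A small numpy sketch with toy 3-dimensional vectors (not the repository's exact scoring code):

import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def desm_score(query_word_vectors, argument_embedding):
    """Average cosine similarity between each query word vector and the argument embedding."""
    sims = [cosine(q, argument_embedding) for q in query_word_vectors]
    return sum(sims) / len(sims)

query_vecs = [np.array([0.2, 0.1, 0.7]), np.array([0.5, 0.4, 0.1])]
arg_embedding = np.array([0.3, 0.2, 0.5])
print(round(desm_score(query_vecs, arg_embedding), 3))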
Merge DESM, Terrier and Sentiments to create final scores
$ python -m argU -m -s no -o <output-dir-path>
- R1: No sentiments
$ python -m argU -m -s emotional -o <output-dir-path>
- R2: Emotional is better
$ python -m argU -m -s neutral -o <output-dir-path>
- R3: Neutral is better
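How the sentiment values influence the final order depends on the run type. A hedged sketch of the idea, assuming sentiment scores in [-1, 1] as returned by the Google Cloud Natural Language API (the exact weighting in argU may differ):

def sentiment_key(sentiment, run_type):
    """Sort key for an argument's sentiment value in [-1, 1]; larger = ranked higher."""
    if run_type == "no":
        return 0.0                      # R1: sentiment does not influence the order
    if run_type == "emotional":
        return abs(sentiment)           # R2: strongly positive/negative arguments first
    if run_type == "neutral":
        return 1.0 - abs(sentiment)     # R3: neutral arguments first
    raise ValueError(f"unknown run type: {run_type}")

# (argument id, DPH score, sentiment) of the already intersected top-N list
ranked = [("arg-3", 18.9, -0.8), ("arg-2", 17.4, 0.1), ("arg-7", 15.2, 0.6)]
for run_type in ("no", "emotional", "neutral"):
    reordered = sorted(ranked, key=lambda x: (sentiment_key(x[2], run_type), x[1]), reverse=True)
    print(run_type, [arg_id for arg_id, _, _ in reordered])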
- For individual modules, cd into the directory and run
$ python -m [module-name]
- indexing - Index for DESM scores
- preprocessing - Prework and cleaning of input data
- sentiment - Sentiment analysis
- utils - Helper functionalities
- visualization - Visualization of scores
- Docker - Used to build and run
- gensim - Used to generate a Continuous Bag of Words Model
- NumPy - Used as a mathematical base to compute vectors and matrices
- Matplotlib - Used to visualize scores
- Google Cloud Natural Language API - Used for sentiment analysis
- Natural Language Toolkit - (Deprecated) Used to train sentiment analysis model
- MongoDB - Used to store arguments and scores
- Terrier - Used to calculate DPH scores
Decision making processes, be it at the societal or at the personal level, eventually come to a point where one side will challenge the other with a why-question, which is a prompt to justify one's stance. Thus, technologies for argument mining and argumentation processing are maturing at a rapid pace, giving rise for the first time to argument retrieval. We invite you to participate in the first lab on Argument Retrieval at CLEF 2020, featuring two subtasks:
(1) retrieval in a focused argument collection to support argumentative conversations, and
(2) retrieval in a generic web crawl to answer comparative questions.
Subtask (1) is motivated by supporting users who search for arguments directly, e.g., to back up their stance, and targets argumentative conversations. The task is to retrieve arguments for the 50 given topics, which cover a wide range of controversial issues, from the provided dataset, a focused crawl of content from online debate portals.
Argument topics for subtask (1) and comparative questions for subtask (2) will be sent to each team via email upon completed registration. The topics will be provided as XML files.
Example topic for subtask (1):
<topic>
  <number>1</number>
  <title>Is climate change real?</title>
  <description>You read an opinion piece on how climate change is a hoax and disagree. Now you are looking for arguments supporting the claim that climate change is in fact real.</description>
  <narrative>Relevant arguments will support the given stance that climate change is real or attack a hoax side's argument.</narrative>
</topic>
Document collections: To search for relevant arguments, you can use your own index based on the args-me dataset or, for simplicity, use the API of the args.me search engine.
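The topics file can be read with the Python standard library, for example as follows (the file name and the <topics> root element are assumptions; the element names follow the example above):

import xml.etree.ElementTree as ET

root = ET.parse("topics.xml").getroot()          # assumed file name and <topics> root element
for topic in root.iter("topic"):
    number = topic.findtext("number").strip()
    title = topic.findtext("title").strip()
    print(number, title)                         # e.g. "1 Is climate change real?"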
We encourage participants to use TIRA for their submissions to increase the replicability of the experiments. We provide a dedicated TIRA tutorial for Touché and are available to walk you through it. You can also submit runs via email. In both cases, we will review your submission promptly and provide feedback.
Runs may be either automatic or manual. An automatic run is produced without any manual manipulation of the given topic titles; your run is automatic if you do not use the topic description and narrative to develop your approach. A manual run is anything that is not an automatic run. Please let us know upon submission which of your runs are manual.
The submission format for both tasks will follow the standard TREC format:
qid Q0 doc rank score tag
With:
- qid: The topic number.
- Q0: Unused, should always be Q0.
- doc: The document id returned by your system for the topic qid:
- For subtask (1): Use the official args-me id.
- rank: The rank the document is retrieved at.
- score: The score (integer or floating point) that generated the ranking. The score must be in descending (non-increasing) order. It is important to handle tied scores. (trec_eval sorts documents by the score values and not your rank values.)
- tag: A tag that identifies your group and the method you used to produce the run.
The fields should be separated by whitespace. The width of the columns is not important, but it is important to include all columns and to have some amount of whitespace between them.
An example run for task 1 is:
1 Q0 10113b57-2019-04-18T17:05:08Z-00001-000 1 17.89 myGroupMyMethod
1 Q0 100531be-2019-04-18T19:18:31Z-00000-000 2 16.43 myGroupMyMethod
1 Q0 10006689-2019-04-18T18:27:51Z-00000-000 3 16.42 myGroupMyMethod
...
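A short sketch of writing such a run file so that the rank column agrees with descending scores (the group/method tag and the scores are placeholders):

results = {  # qid -> list of (doc id, score)
    "1": [("10113b57-2019-04-18T17:05:08Z-00001-000", 17.89),
          ("100531be-2019-04-18T19:18:31Z-00000-000", 16.43)],
}

with open("run.txt", "w") as run_file:
    for qid, docs in results.items():
        # Sort by score first; trec_eval orders by score, not by the rank column.
        for rank, (doc, score) in enumerate(sorted(docs, key=lambda d: d[1], reverse=True), start=1):
            run_file.write(f"{qid} Q0 {doc} {rank} {score} myGroupMyMethod\n")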
- Literature: Ajjour et al. 2019, Wachsmuth et al. 2017, Potthast et al. 2019
- Dataset: Args.me Corpus
- Evaluation: Topics / Queries (XML)