[SKEP P1] pass labels to forward function in examples/applications (PaddlePaddle#5067)

* initial commit

* refine readme

* refine codestyle

* refine readme

* refine readme

* fix model saving bug

* initial commit

* initial commit

* initial commit

* use common metric instead of eval_metrics.py and remove unuseful code

* mv stage project to ASO_analysis

* add unified sentiment analysis

* refine readme

* refine readme

* refine readme

* add unified sentiment analysis

* refine readme

* initial commit

* initial commit

* refine readme

* add taskflow for sentiment analysis with UIE

* refine Readme

* refine readme.md

* support sentiment analysis (UIE) with inputing by file format

* refine readme

* delete predict scripts

* refine readme

* delete unuseful files

* add pipeline for sentiment_analysis

* merging with the newest code

* fix to convert data without synonyms

* add senta pipeline

* refine readme

* drop functions: inputting file and saving results

* add UIE-senta-[base, medium, mini, micro, nano]

* modify .gitignore to trace deploy code

* add deploy with SimpleServer

* add debug mode

* fix debug mode

* update the loading  method of UIE

* refine readme

* fix bug caused by version updating

* fix hard coding for model name.

* refine codestyle

* modify readme according the way of 'step by step'

* refine codestyle

* change saving txt to json files

* download font automatically when not input font_path

* change readme in the way 'step by step'

* add model prediction by batch

* add uie-senta-x to support_schema_list

* update sentiment analysis in taskflow

* add prediction with saved offline model

* change the exception exposure way

* add description for visual schema

* delete comments

* remove comments

* remove unused code and comments

* convert uie-senta-x model params to fit ernie/uie

* refine readme for sentiment analysis

* add running time

* refine readme for senta pipeline

* change uie-base to uie-senta-base

* load uie-senta-x with auto module

* add deploy with SimpleServer

* refine codestyle

* refine readme

* add uie-senta-x to support_schema_list

* fix hard coding for model name

* refine codestyle

* refine codestyle

* refine codestyle

* refine codestyle

* refine codestyle

* refine codestyle

* refine codestyle

* refine codestyle

* refine codestyle

* fix senta response

* refine codestyle

* remove lambda expressions

* add link of senta pipeline

* refine codestyle

* remove local path

* fix typos

* refine readme

* load uie-senta-x with automodel

* remove commented code

* restore auto

* add link of hotel dataset to readme.

* add link for downloading test_hotel.txt

* fix url problem for server and client

* refine readme

* fix for senta_examples.py

* update visualization function

* update visualization function

* refine readme and update visualization description

* update visualization function

* refine readme and update visualization function

* change logger in PaddleNLP to log information

* fix running time for skep and uie

* fix bug to solve tokenizer updating problem

* refine label-studio readme

* refine label-studio readme

* refine label-studio readme

* optimize example construction for a, o, as, ao extraction task

* add the labeling method for ext task: a,  as, ao and so on.

* add note for visual_analysis.py

* change link for downloading data and refine log output

* refine log output

* refine readme

* expose options interface

* refine readme

* modify typos

* expose options for customing sentiment analysis

* README.md

* fix bug for param is_shuffle in label_studio.py

* [BugFix] Fix the param is_shuffle problem

* [BugFix] Fix the bool param is_shuffle problem

* [BugFix] Fix the bool param is_shuffle problem

* [Model Update] add configuration for skep model.

* [Transformer Update] update skep and add related unittest

* [Skep Update] fix examples and taskflow

* [Transformer Update] update skep examples for sentiment analysis

* [Transformer Update] update skep examples for sentiment analysis

* [Transformer Update] CodeStyle for examples/skep/*

* [Transformer Update] examples/skep done

* [Transformer Update] refine readme in examples/skep

* [Transformer Update] initial skep tests done.

* [Skep Upgrade] add skep in taskflow tests

* [Skep Upgrade] add tests for skep in taskflow

* [Skep Upgrade] remove yapf in examples/skep

* [Skep P0] remove print

* [SKEP P0] fix the param ckpt_dir in predict scripts

* [SKEP P0] fix the param ckpt_dir for skep examples.

* [SKEP P0] fix ci_case

* [SKEP P0] remove tiny_random_bert taskflow/test_sentiment_analysis

* [SKEP P0] add tiny-random-skep as code comment

* [SKEP P0] using tiny_random_skep for tests in case of OOM at test machine.

* [SKEP P0] add taskflow prediction for examples/skep/sentence

* [SKEP P0] add __internal_testing__/tiny-random-skep as taskflow model

* [SKEP P1] fix return parameters for SkepCrfForTokenClassification

* [SKEP P1] pass labels to forward for examples

* [SKEP P1] pass labels to forward for applications

---------

Co-authored-by: tianxin <[email protected]>
1649759610 and tianxin authored Mar 2, 2023
1 parent 3b6db41 commit 1363448
Showing 4 changed files with 7 additions and 17 deletions.
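The change repeated across these files is a call-pattern swap: instead of computing the loss outside the model, the training scripts pass `labels` to the forward call and unpack `(loss, logits)`. The following is a minimal pure-Python sketch of that pattern. `ToyClassifier` and `cross_entropy` are hypothetical stand-ins written for illustration; they are not PaddleNLP's SKEP implementation, and the assumption that the built-in loss is a mean softmax cross-entropy comes only from the lines this commit removes.

```python
import math


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over (logit row, class index) pairs."""
    total = 0.0
    for row, label in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[label]
    return total / len(labels)


class ToyClassifier:
    """Hypothetical model whose forward call accepts optional labels."""

    def __init__(self, weights):
        self.weights = weights  # one weight row per class

    def __call__(self, features, labels=None):
        logits = [
            [sum(w * x for w, x in zip(row, feat)) for row in self.weights]
            for feat in features
        ]
        if labels is None:
            return logits  # no labels: logits only
        return cross_entropy(logits, labels), logits  # labels: (loss, logits)


model = ToyClassifier(weights=[[1.0, 0.0], [0.0, 1.0]])
features = [[2.0, 0.0], [0.0, 1.0]]
labels = [0, 1]

# Old call pattern: the script computes the loss itself.
old_logits = model(features)
old_loss = cross_entropy(old_logits, labels)

# New call pattern: the model returns the loss together with the logits.
new_loss, new_logits = model(features, labels=labels)
```

In this sketch both patterns give the same loss and logits, which is why the external `criterion` object and the `F.cross_entropy` call can be dropped from the scripts.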
Original file line number Diff line number Diff line change
@@ -20,7 +20,6 @@

 import numpy as np
 import paddle
-import paddle.nn.functional as F
 from data import convert_example_to_feature, load_dict
 from datasets import load_dataset
 from evaluate import evaluate
@@ -100,8 +99,7 @@ def train():
                 batch_data["token_type_ids"],
                 batch_data["labels"],
             )
-            logits = model(input_ids, token_type_ids=token_type_ids)
-            loss = F.cross_entropy(logits, labels)
+            loss, logits = model(input_ids, token_type_ids=token_type_ids, labels=labels)

             loss.backward()
             lr_scheduler.step()
Original file line number Diff line number Diff line change
@@ -21,7 +21,6 @@

 import numpy as np
 import paddle
-import paddle.nn.functional as F
 from data import convert_example_to_feature, load_dict
 from datasets import load_dataset
 from evaluate import evaluate
@@ -100,8 +99,7 @@ def train():
                 batch_data["token_type_ids"],
                 batch_data["labels"],
             )
-            logits = model(input_ids, token_type_ids=token_type_ids)
-            loss = F.cross_entropy(logits.reshape([-1, len(label2id)]), labels.reshape([-1]), ignore_index=-1)
+            loss, logits = model(input_ids, token_type_ids=token_type_ids, labels=labels)

             loss.backward()
             lr_scheduler.step()
4 changes: 1 addition & 3 deletions examples/sentiment_analysis/skep/train_aspect.py
@@ -147,7 +147,6 @@ def create_dataloader(dataset, mode="train", batch_size=1, batchify_fn=None, tra
         weight_decay=args.weight_decay,
         apply_decay_param_fun=lambda x: x in decay_params,
     )
-    criterion = paddle.nn.loss.CrossEntropyLoss()
     metric = paddle.metric.Accuracy()

     global_step = 0
@@ -156,8 +155,7 @@ def create_dataloader(dataset, mode="train", batch_size=1, batchify_fn=None, tra
     for epoch in range(1, args.epochs + 1):
         for step, batch in enumerate(train_data_loader, start=1):
             input_ids, token_type_ids, labels = batch["input_ids"], batch["token_type_ids"], batch["labels"]
-            logits = model(input_ids, token_type_ids)
-            loss = criterion(logits, labels)
+            loss, logits = model(input_ids, token_type_ids, labels=labels)
             probs = F.softmax(logits, axis=1)
             correct = metric.compute(probs, labels)
             metric.update(correct)
12 changes: 4 additions & 8 deletions examples/sentiment_analysis/skep/train_sentence.py
@@ -68,13 +68,12 @@ def set_seed(seed):


 @paddle.no_grad()
-def evaluate(model, criterion, metric, data_loader):
+def evaluate(model, metric, data_loader):
     """
     Given a dataset, it evals model and computes the metric.
     Args:
         model(obj:`paddle.nn.Layer`): A model to classify texts.
-        criterion(obj:`paddle.nn.Layer`): It can compute the loss.
         metric(obj:`paddle.metric.Metric`): The evaluation metric.
         data_loader(obj:`paddle.io.DataLoader`): The dataset loader which generates batches.
     """
@@ -83,8 +82,7 @@ def evaluate(model, criterion, metric, data_loader):
     losses = []
     for batch in data_loader:
         input_ids, token_type_ids, labels = batch["input_ids"], batch["token_type_ids"], batch["labels"]
-        logits = model(input_ids, token_type_ids)
-        loss = criterion(logits, labels)
+        loss, logits = model(input_ids, token_type_ids, labels=labels)
         losses.append(loss.numpy())
         correct = metric.compute(logits, labels)
         metric.update(correct)
@@ -196,7 +194,6 @@ def create_dataloader(dataset, mode="train", batch_size=1, batchify_fn=None, tra
         weight_decay=args.weight_decay,
         apply_decay_param_fun=lambda x: x in decay_params,
     )
-    criterion = paddle.nn.loss.CrossEntropyLoss()
     metric = paddle.metric.Accuracy()

     # start to train model
@@ -206,8 +203,7 @@ def create_dataloader(dataset, mode="train", batch_size=1, batchify_fn=None, tra
     for epoch in range(1, args.epochs + 1):
         for step, batch in enumerate(train_data_loader, start=1):
             input_ids, token_type_ids, labels = batch["input_ids"], batch["token_type_ids"], batch["labels"]
-            logits = model(input_ids, token_type_ids)
-            loss = criterion(logits, labels)
+            loss, logits = model(input_ids, token_type_ids, labels=labels)
             probs = F.softmax(logits, axis=1)
             correct = metric.compute(probs, labels)
             metric.update(correct)
@@ -227,7 +223,7 @@ def create_dataloader(dataset, mode="train", batch_size=1, batchify_fn=None, tra
                 save_dir = os.path.join(args.save_dir, "model_%d" % global_step)
                 if not os.path.exists(save_dir):
                     os.makedirs(save_dir)
-                evaluate(model, criterion, metric, dev_data_loader)
+                evaluate(model, metric, dev_data_loader)
                 # Need better way to get inner model of DataParallel
                 model._layers.save_pretrained(save_dir)
                 tokenizer.save_pretrained(save_dir)
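
For reviewers comparing the old and new token-level loss, the removed `F.cross_entropy(logits.reshape([-1, len(label2id)]), labels.reshape([-1]), ignore_index=-1)` call can be sketched in plain Python as below. This is an illustration only: `token_cross_entropy` is a hypothetical helper, and it assumes a mean reduction over the positions whose label is not the ignore index. The diff does not show which ignore index, if any, the model's built-in loss applies, so that may be worth checking against the model source.

```python
import math


def token_cross_entropy(logits, labels, ignore_index=-1):
    """Flatten [batch][seq][num_labels] logits and average over kept tokens."""
    flat_logits = [tok for seq in logits for tok in seq]
    flat_labels = [lab for seq in labels for lab in seq]
    total, kept = 0.0, 0
    for row, label in zip(flat_logits, flat_labels):
        if label == ignore_index:
            continue  # padded positions do not contribute to the loss
        m = max(row)  # subtract the max for numerical stability
        total += m + math.log(sum(math.exp(v - m) for v in row)) - row[label]
        kept += 1
    return total / kept if kept else 0.0


# Two sequences of two tokens each; the second token of each is padding (-1).
logits = [[[0.0, 0.0], [3.0, 1.0]], [[1.0, 1.0], [0.0, 5.0]]]
labels = [[0, -1], [1, -1]]
loss = token_cross_entropy(logits, labels)
```

Only the two non-padded tokens enter the average here, so changing the logits at the padded positions leaves `loss` unchanged.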
