Monday 27 March 2023

How to prevent the transformers generate function from producing certain words?

I have the following code:

from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_ids = tokenizer("The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt").input_ids

sequence_ids = model.generate(input_ids)
sequences = tokenizer.batch_decode(sequence_ids)
sequences

Currently it produces this:

['<pad><extra_id_0> park offers<extra_id_1> the<extra_id_2> park.</s>']

Is there a way to prevent the generator from producing certain words (e.g. park, offer), so that they do not appear in the output list?
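One option that may fit here is the bad_words_ids argument of generate, which takes a list of token-id sequences that must not be generated. Below is a minimal sketch, assuming the same t5-small setup as above. The helper names build_bad_words_ids and generate_without are hypothetical and introduced only for illustration.

```python
def build_bad_words_ids(tokenizer, words):
    """Turn each banned word into a token-id sequence (hypothetical helper)."""
    banned = []
    for word in words:
        # Cover the lowercase and capitalized spelling; other variants
        # (plurals, inflections) would need to be listed explicitly.
        for variant in (word, word.capitalize()):
            ids = tokenizer(variant, add_special_tokens=False).input_ids
            if ids and ids not in banned:
                banned.append(ids)
    return banned


def generate_without(words, text="The <extra_id_0> walks in <extra_id_1> park"):
    """Generate with t5-small while banning the given words (hypothetical helper)."""
    # Imported here so build_bad_words_ids stays usable on its own.
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    input_ids = tokenizer(text, return_tensors="pt").input_ids
    bad_words_ids = build_bad_words_ids(tokenizer, words)

    # bad_words_ids blocks these exact token sequences during decoding.
    sequence_ids = model.generate(input_ids, bad_words_ids=bad_words_ids)
    return tokenizer.batch_decode(sequence_ids)
```

Calling generate_without(["park", "offers"]) should then return decoded sequences that avoid those tokens. The ban applies to exact token sequences rather than to words as such, so a related form such as "offer" versus "offers" may still appear unless it is listed too.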



