Sunday, 22 November 2020

How to load a finetuned SciBERT model in AllenNLP?

I have finetuned the SciBERT model on the SciIE dataset. The repository uses AllenNLP to finetune the model. The training is executed as follows:

python -m allennlp.run train $CONFIG_FILE  --include-package scibert -s "$@" 

After training completes successfully, I get a model.tar.gz file that contains weights.th, config.json, and a vocabulary folder. I tried to load it with the AllenNLP predictor:

from allennlp.predictors.predictor import Predictor
predictor = Predictor.from_path("model.tar.gz")

But I get the following error:

ConfigurationError: bert-pretrained not in acceptable choices for dataset_reader.token_indexers.bert.type: ['single_id', 'characters', 'elmo_characters', 'spacy', 'pretrained_transformer', 'pretrained_transformer_mismatched']. You should either use the --include-package flag to make sure the correct module is loaded, or use a fully qualified class name in your config file like {"model": "my_module.models.MyModel"} to have it imported automatically.

I have never worked with AllenNLP, so I am quite lost about what to do.

For reference, this is the part of the config that describes the token indexers:

"token_indexers": {
            "bert": {
                "type": "bert-pretrained",
                "do_lowercase": "false",
                "pretrained_model": "/home/tomaz/neo4j/scibert/model/vocab.txt",
                "use_starting_offsets": true
            }
        }
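The error above names the config path dataset_reader.token_indexers.bert.type, so it can help to list every registered type name the archive depends on before trying to load it. The following is a minimal sketch using only the standard library, assuming config.json sits at the top level of model.tar.gz as described earlier. Both load_archive_config and collect_types are hypothetical helpers written for illustration, not part of AllenNLP.

```python
import json
import posixpath
import tarfile


def load_archive_config(archive_path):
    """Return the parsed config.json stored inside a model.tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # normpath turns "./config.json" into "config.json".
            if posixpath.normpath(member.name) == "config.json":
                with archive.extractfile(member) as handle:
                    return json.load(handle)
    raise FileNotFoundError("config.json not found in " + archive_path)


def collect_types(node, path=""):
    """Yield (dotted_path, type_name) for every string-valued "type" key."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key == "type" and isinstance(value, str):
                yield child, value
            else:
                yield from collect_types(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from collect_types(value, f"{path}.{index}")


# The token indexer fragment quoted above, as a Python dict.
fragment = {
    "token_indexers": {
        "bert": {
            "type": "bert-pretrained",
            "do_lowercase": "false",
            "use_starting_offsets": True,
        }
    }
}
print(list(collect_types(fragment)))
# prints [('token_indexers.bert.type', 'bert-pretrained')]
```

For the real archive, list(collect_types(load_archive_config("model.tar.gz"))) would give the full set of names that must be registered in the environment doing the loading.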

I am using this AllenNLP version:

Name: allennlp
Version: 1.2.1
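To confirm which version the running interpreter actually sees (which may differ from what pip reports in another environment), something like the following can be used. It is a small sketch that assumes Python 3.8 or newer, where importlib.metadata is available; installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("allennlp:", installed_version("allennlp"))
```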

Edit:

I think I have made a lot of progress. I have to use the same AllenNLP version that was used to train the model, and I can import the modules like so:

from allennlp.predictors.predictor import Predictor
from scibert.models.bert_crf_tagger import *
from scibert.models.bert_text_classifier import *
from scibert.models.dummy_seq2seq import *
from scibert.dataset_readers.classification_dataset_reader import *

predictor = Predictor.from_path("scibert_ner/model.tar.gz")
predictor.predict(
  sentence="Did Uriah honestly think he could beat The Legend of Zelda in under three hours?"
)

Now I get an error:

No default predictor for model type bert_crf_tagger.\nPlease specify a predictor explicitly

I know that I can use predictor_name to specify a predictor explicitly, but I have no idea which name would work.
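One way to see the candidate names is to ask AllenNLP which predictors it has registered. The sketch below relies on list_available(), which Predictor inherits from AllenNLP's Registrable base class. available_predictor_names is a hypothetical wrapper, and the import is guarded so the snippet also runs where AllenNLP is not installed.

```python
def available_predictor_names():
    """Return the sorted predictor names AllenNLP has registered,
    or None if AllenNLP cannot be imported in this environment."""
    try:
        from allennlp.predictors.predictor import Predictor
    except ImportError:
        return None
    # list_available() is provided by AllenNLP's Registrable base class.
    return sorted(Predictor.list_available())


names = available_predictor_names()
if names is None:
    print("allennlp is not importable here")
else:
    print("\n".join(names))
```

Any name printed there can be passed as predictor_name to Predictor.from_path. Since bert_crf_tagger is a tagging model, a sequence-tagging predictor such as "sentence-tagger" looks like a plausible first candidate, but whether it is compatible with this model's dataset reader is an assumption that has not been verified here.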


