Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Are the pre-trained layers of the Huggingface BERT models frozen?

I use the following classification model from Huggingface:

model = AutoModelForSequenceClassification.from_pretrained("dbmdz/bert-base-german-cased", num_labels=2).to(device)

As I understand, this adds a dense layer at the end of the pre-trained model which has 2 output nodes. But are all the pre-trained layers before that frozen? Or are they also updated when fine-tuning? I can’t find information about that in the docs…

So do I still have to do something like this?:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

for param in model.bert.parameters():
    param.requires_grad = False

>Solution :

They are not frozen. All parameters are trainable by default. You can also check that with:

for name, param in model.named_parameters():
    print(name, param.requires_grad)

Output:

bert.embeddings.word_embeddings.weight True
bert.embeddings.position_embeddings.weight True
bert.embeddings.token_type_embeddings.weight True
bert.embeddings.LayerNorm.weight True
bert.embeddings.LayerNorm.bias True
bert.encoder.layer.0.attention.self.query.weight True
bert.encoder.layer.0.attention.self.query.bias True
bert.encoder.layer.0.attention.self.key.weight True
bert.encoder.layer.0.attention.self.key.bias True
bert.encoder.layer.0.attention.self.value.weight True
bert.encoder.layer.0.attention.self.value.bias True
bert.encoder.layer.0.attention.output.dense.weight True
bert.encoder.layer.0.attention.output.dense.bias True
bert.encoder.layer.0.attention.output.LayerNorm.weight True
bert.encoder.layer.0.attention.output.LayerNorm.bias True
bert.encoder.layer.0.intermediate.dense.weight True
bert.encoder.layer.0.intermediate.dense.bias True
bert.encoder.layer.0.output.dense.weight True
bert.encoder.layer.0.output.dense.bias True
bert.encoder.layer.0.output.LayerNorm.weight True
bert.encoder.layer.0.output.LayerNorm.bias True
bert.encoder.layer.1.attention.self.query.weight True
bert.encoder.layer.1.attention.self.query.bias True
bert.encoder.layer.1.attention.self.key.weight True
bert.encoder.layer.1.attention.self.key.bias True
bert.encoder.layer.1.attention.self.value.weight True
bert.encoder.layer.1.attention.self.value.bias True
bert.encoder.layer.1.attention.output.dense.weight True
bert.encoder.layer.1.attention.output.dense.bias True
bert.encoder.layer.1.attention.output.LayerNorm.weight True
bert.encoder.layer.1.attention.output.LayerNorm.bias True
bert.encoder.layer.1.intermediate.dense.weight True
bert.encoder.layer.1.intermediate.dense.bias True
bert.encoder.layer.1.output.dense.weight True
bert.encoder.layer.1.output.dense.bias True
bert.encoder.layer.1.output.LayerNorm.weight True
bert.encoder.layer.1.output.LayerNorm.bias True
bert.encoder.layer.2.attention.self.query.weight True
bert.encoder.layer.2.attention.self.query.bias True
bert.encoder.layer.2.attention.self.key.weight True
bert.encoder.layer.2.attention.self.key.bias True
bert.encoder.layer.2.attention.self.value.weight True
bert.encoder.layer.2.attention.self.value.bias True
bert.encoder.layer.2.attention.output.dense.weight True
bert.encoder.layer.2.attention.output.dense.bias True
bert.encoder.layer.2.attention.output.LayerNorm.weight True
bert.encoder.layer.2.attention.output.LayerNorm.bias True
bert.encoder.layer.2.intermediate.dense.weight True
bert.encoder.layer.2.intermediate.dense.bias True
bert.encoder.layer.2.output.dense.weight True
bert.encoder.layer.2.output.dense.bias True
bert.encoder.layer.2.output.LayerNorm.weight True
bert.encoder.layer.2.output.LayerNorm.bias True
bert.encoder.layer.3.attention.self.query.weight True
bert.encoder.layer.3.attention.self.query.bias True
bert.encoder.layer.3.attention.self.key.weight True
bert.encoder.layer.3.attention.self.key.bias True
bert.encoder.layer.3.attention.self.value.weight True
bert.encoder.layer.3.attention.self.value.bias True
bert.encoder.layer.3.attention.output.dense.weight True
bert.encoder.layer.3.attention.output.dense.bias True
bert.encoder.layer.3.attention.output.LayerNorm.weight True
bert.encoder.layer.3.attention.output.LayerNorm.bias True
bert.encoder.layer.3.intermediate.dense.weight True
bert.encoder.layer.3.intermediate.dense.bias True
bert.encoder.layer.3.output.dense.weight True
bert.encoder.layer.3.output.dense.bias True
bert.encoder.layer.3.output.LayerNorm.weight True
bert.encoder.layer.3.output.LayerNorm.bias True
bert.encoder.layer.4.attention.self.query.weight True
bert.encoder.layer.4.attention.self.query.bias True
bert.encoder.layer.4.attention.self.key.weight True
bert.encoder.layer.4.attention.self.key.bias True
bert.encoder.layer.4.attention.self.value.weight True
bert.encoder.layer.4.attention.self.value.bias True
bert.encoder.layer.4.attention.output.dense.weight True
bert.encoder.layer.4.attention.output.dense.bias True
bert.encoder.layer.4.attention.output.LayerNorm.weight True
bert.encoder.layer.4.attention.output.LayerNorm.bias True
bert.encoder.layer.4.intermediate.dense.weight True
bert.encoder.layer.4.intermediate.dense.bias True
bert.encoder.layer.4.output.dense.weight True
bert.encoder.layer.4.output.dense.bias True
bert.encoder.layer.4.output.LayerNorm.weight True
bert.encoder.layer.4.output.LayerNorm.bias True
bert.encoder.layer.5.attention.self.query.weight True
bert.encoder.layer.5.attention.self.query.bias True
bert.encoder.layer.5.attention.self.key.weight True
bert.encoder.layer.5.attention.self.key.bias True
bert.encoder.layer.5.attention.self.value.weight True
bert.encoder.layer.5.attention.self.value.bias True
bert.encoder.layer.5.attention.output.dense.weight True
bert.encoder.layer.5.attention.output.dense.bias True
bert.encoder.layer.5.attention.output.LayerNorm.weight True
bert.encoder.layer.5.attention.output.LayerNorm.bias True
bert.encoder.layer.5.intermediate.dense.weight True
bert.encoder.layer.5.intermediate.dense.bias True
bert.encoder.layer.5.output.dense.weight True
bert.encoder.layer.5.output.dense.bias True
bert.encoder.layer.5.output.LayerNorm.weight True
bert.encoder.layer.5.output.LayerNorm.bias True
bert.encoder.layer.6.attention.self.query.weight True
bert.encoder.layer.6.attention.self.query.bias True
bert.encoder.layer.6.attention.self.key.weight True
bert.encoder.layer.6.attention.self.key.bias True
bert.encoder.layer.6.attention.self.value.weight True
bert.encoder.layer.6.attention.self.value.bias True
bert.encoder.layer.6.attention.output.dense.weight True
bert.encoder.layer.6.attention.output.dense.bias True
bert.encoder.layer.6.attention.output.LayerNorm.weight True
bert.encoder.layer.6.attention.output.LayerNorm.bias True
bert.encoder.layer.6.intermediate.dense.weight True
bert.encoder.layer.6.intermediate.dense.bias True
bert.encoder.layer.6.output.dense.weight True
bert.encoder.layer.6.output.dense.bias True
bert.encoder.layer.6.output.LayerNorm.weight True
bert.encoder.layer.6.output.LayerNorm.bias True
bert.encoder.layer.7.attention.self.query.weight True
bert.encoder.layer.7.attention.self.query.bias True
bert.encoder.layer.7.attention.self.key.weight True
bert.encoder.layer.7.attention.self.key.bias True
bert.encoder.layer.7.attention.self.value.weight True
bert.encoder.layer.7.attention.self.value.bias True
bert.encoder.layer.7.attention.output.dense.weight True
bert.encoder.layer.7.attention.output.dense.bias True
bert.encoder.layer.7.attention.output.LayerNorm.weight True
bert.encoder.layer.7.attention.output.LayerNorm.bias True
bert.encoder.layer.7.intermediate.dense.weight True
bert.encoder.layer.7.intermediate.dense.bias True
bert.encoder.layer.7.output.dense.weight True
bert.encoder.layer.7.output.dense.bias True
bert.encoder.layer.7.output.LayerNorm.weight True
bert.encoder.layer.7.output.LayerNorm.bias True
bert.encoder.layer.8.attention.self.query.weight True
bert.encoder.layer.8.attention.self.query.bias True
bert.encoder.layer.8.attention.self.key.weight True
bert.encoder.layer.8.attention.self.key.bias True
bert.encoder.layer.8.attention.self.value.weight True
bert.encoder.layer.8.attention.self.value.bias True
bert.encoder.layer.8.attention.output.dense.weight True
bert.encoder.layer.8.attention.output.dense.bias True
bert.encoder.layer.8.attention.output.LayerNorm.weight True
bert.encoder.layer.8.attention.output.LayerNorm.bias True
bert.encoder.layer.8.intermediate.dense.weight True
bert.encoder.layer.8.intermediate.dense.bias True
bert.encoder.layer.8.output.dense.weight True
bert.encoder.layer.8.output.dense.bias True
bert.encoder.layer.8.output.LayerNorm.weight True
bert.encoder.layer.8.output.LayerNorm.bias True
bert.encoder.layer.9.attention.self.query.weight True
bert.encoder.layer.9.attention.self.query.bias True
bert.encoder.layer.9.attention.self.key.weight True
bert.encoder.layer.9.attention.self.key.bias True
bert.encoder.layer.9.attention.self.value.weight True
bert.encoder.layer.9.attention.self.value.bias True
bert.encoder.layer.9.attention.output.dense.weight True
bert.encoder.layer.9.attention.output.dense.bias True
bert.encoder.layer.9.attention.output.LayerNorm.weight True
bert.encoder.layer.9.attention.output.LayerNorm.bias True
bert.encoder.layer.9.intermediate.dense.weight True
bert.encoder.layer.9.intermediate.dense.bias True
bert.encoder.layer.9.output.dense.weight True
bert.encoder.layer.9.output.dense.bias True
bert.encoder.layer.9.output.LayerNorm.weight True
bert.encoder.layer.9.output.LayerNorm.bias True
bert.encoder.layer.10.attention.self.query.weight True
bert.encoder.layer.10.attention.self.query.bias True
bert.encoder.layer.10.attention.self.key.weight True
bert.encoder.layer.10.attention.self.key.bias True
bert.encoder.layer.10.attention.self.value.weight True
bert.encoder.layer.10.attention.self.value.bias True
bert.encoder.layer.10.attention.output.dense.weight True
bert.encoder.layer.10.attention.output.dense.bias True
bert.encoder.layer.10.attention.output.LayerNorm.weight True
bert.encoder.layer.10.attention.output.LayerNorm.bias True
bert.encoder.layer.10.intermediate.dense.weight True
bert.encoder.layer.10.intermediate.dense.bias True
bert.encoder.layer.10.output.dense.weight True
bert.encoder.layer.10.output.dense.bias True
bert.encoder.layer.10.output.LayerNorm.weight True
bert.encoder.layer.10.output.LayerNorm.bias True
bert.encoder.layer.11.attention.self.query.weight True
bert.encoder.layer.11.attention.self.query.bias True
bert.encoder.layer.11.attention.self.key.weight True
bert.encoder.layer.11.attention.self.key.bias True
bert.encoder.layer.11.attention.self.value.weight True
bert.encoder.layer.11.attention.self.value.bias True
bert.encoder.layer.11.attention.output.dense.weight True
bert.encoder.layer.11.attention.output.dense.bias True
bert.encoder.layer.11.attention.output.LayerNorm.weight True
bert.encoder.layer.11.attention.output.LayerNorm.bias True
bert.encoder.layer.11.intermediate.dense.weight True
bert.encoder.layer.11.intermediate.dense.bias True
bert.encoder.layer.11.output.dense.weight True
bert.encoder.layer.11.output.dense.bias True
bert.encoder.layer.11.output.LayerNorm.weight True
bert.encoder.layer.11.output.LayerNorm.bias True
bert.pooler.dense.weight True
bert.pooler.dense.bias True
classifier.weight True
classifier.bias True
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading