In the talk, Anna Rogers presented the paper "When BERT Plays the Lottery, All Tickets Are Winning". The lottery ticket hypothesis was originally developed for randomly initialized models, but might it also apply to pretrained Transformers? If the good subnetworks exist, can they tell us anything about how BERT achieves its performance? Full article: