Hi, I tried to use the quick start code to train AdaMix, but I found that if you don't pass "adapter_path", only the weight and bias of the classifier layer are trained. Btw, when I use the AdaMix (BERT-based) checkpoint on the CoLA dataset, I can only reach an "eval_matthews_correlation" of up to 0.25. Is there something wrong in your code?
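
For anyone who wants to reproduce the check, one way to see which parameters are actually trainable is to list those with `requires_grad=True`. This is only a minimal sketch in plain PyTorch: `TinyModel` and `trainable_parameter_names` are hypothetical stand-ins for illustration, not part of the AdaMix code, and the encoder is frozen by hand here just to mimic the behaviour described above.

```python
from typing import List

import torch.nn as nn


def trainable_parameter_names(model: nn.Module) -> List[str]:
    """Return the names of all parameters that will receive gradients."""
    return [name for name, p in model.named_parameters() if p.requires_grad]


class TinyModel(nn.Module):
    """Hypothetical stand-in for an encoder plus classifier head."""

    def __init__(self) -> None:
        super().__init__()
        self.encoder = nn.Linear(8, 8)
        self.classifier = nn.Linear(8, 2)


model = TinyModel()

# Freeze the encoder by hand to imitate the situation in the question,
# where only the classifier head stays trainable.
for p in model.encoder.parameters():
    p.requires_grad = False

names = trainable_parameter_names(model)
print(names)  # ['classifier.weight', 'classifier.bias']
```

Running the same helper on the real model right before training should show whether any adapter parameters appear in the list, or only the classifier weight and bias.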