Push Models to Hugging Face With Saagie
- Install the Transformers and Datasets libraries:

  ```
  !pip install transformers[torch] -U
  !pip install datasets
  ```
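  If you want to confirm that both libraries were installed correctly, here is a small optional check (a sketch added for illustration, not part of the original procedure):

  ```python
  ## Optional: verify that both libraries import and print their versions.
  import transformers
  import datasets

  print(transformers.__version__)
  print(datasets.__version__)
  ```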
- Load a pre-trained model from Hugging Face.

  - Find the model you need on the Model Hub. In our example, we use the bert-tiny model.
  - Load the model with the following code:

    ```python
    from transformers import AutoModelForSequenceClassification

    ## Example NLP model for sentiment analysis
    model_name = "prajjwal1/bert-tiny:main"  # (1)

    ## Split an optional ":<revision>" suffix off the model name.
    if ':' in model_name:
        model_ver = model_name.split(':')[1]
        model_name = model_name.split(':')[0]
    else:
        model_ver = "main"

    model = AutoModelForSequenceClassification.from_pretrained(model_name, revision=model_ver)
    ```

    Where:
    (1) "prajjwal1/bert-tiny:main" can be replaced by another model name.
 
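  - OPTIONAL: As a quick sanity check, you can run the freshly loaded model on one sentence. This is a minimal sketch added for illustration; the classification head of bert-tiny is newly initialized at this point, so the scores are arbitrary until the model is fine-tuned.

    ```python
    ## Optional sanity check on one sentence; the classification head is
    ## still untrained here, so the logits are arbitrary.
    import torch
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, revision=model_ver)
    inputs = tokenizer("This movie was great!", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    print(logits)  ## Raw scores for the two classes
    ```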
- Fine-tune your model.

  - Find the dataset you need on the Hugging Face Hub. In our example, we use the sst2 dataset for textual sentiment classification.
  - Load and pre-process your datasets with the following code:

    ```python
    from transformers import AutoTokenizer, DataCollatorWithPadding, Trainer, TrainingArguments
    from datasets import load_dataset

    ## Loading Datasets
    dataset = load_dataset("sst2")  # (1)
    train_dataset = dataset['train']
    valid_dataset = dataset['validation']
    train_subset = 100
    eval_subset = 20
    seed = 42
    repo_name = "MyRepo"  # (2)

    ## Pre-processing Datasets
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    def tokenize_function(examples):
        return tokenizer(examples["sentence"], padding="max_length", truncation=True)

    tokenized_train = train_dataset.map(tokenize_function, batched=True)
    tokenized_valid = valid_dataset.map(tokenize_function, batched=True)
    small_train_dataset = tokenized_train.shuffle(seed=seed).select(range(train_subset))
    small_eval_dataset = tokenized_valid.shuffle(seed=seed).select(range(eval_subset))
    data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
    ```

    Where:
    (1) "sst2" can be replaced by another dataset name.
    (2) "MyRepo" must be replaced by the name of your repository.
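  - OPTIONAL: To verify the pre-processing, you can inspect one tokenized example. This is a minimal sketch added for illustration; the field names follow the sst2 schema used above.

    ```python
    ## Optional: inspect one pre-processed training example.
    example = small_train_dataset[0]
    print(example["sentence"])        ## Original text
    print(example["label"])           ## 0 = negative, 1 = positive
    print(len(example["input_ids"]))  ## Token count after padding/truncation
    ```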
  - Add the following code to configure the hyperparameters and train your model:

    ```python
    ## Defining hyperparameters for fine-tuning
    training_args = TrainingArguments(
        output_dir=repo_name,
        num_train_epochs=2,
        per_device_train_batch_size=16,
        per_device_eval_batch_size=16,
        logging_dir='./logs',
        logging_steps=10,
    )

    ## Fine-tune the model with the Trainer class
    trainer = Trainer(  # (1)
        model=model,
        args=training_args,
        train_dataset=small_train_dataset,
        eval_dataset=small_eval_dataset,
        tokenizer=tokenizer,
        data_collator=data_collator,
    )
    trainer.train()  # (2)
    ```

    Where:
    (1) This block builds your Trainer object from your model, training arguments, training and evaluation datasets, tokenizer, and data collator.
    (2) This line runs the training. Relevant training results are displayed in the log of this step and should look like the following:

    ```
    TrainOutput(global_step=14, training_loss=0.664779714175633,
                metrics={'train_runtime': 2.4304, 'train_samples_per_second': 82.29,
                         'train_steps_per_second': 5.76, 'total_flos': 17489048160.0,
                         'train_loss': 0.664779714175633, 'epoch': 2.0})
    ```
 
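  - OPTIONAL: Once training finishes, you can score the fine-tuned model on the evaluation subset. A minimal sketch; because no compute_metrics function was passed to the Trainer above, evaluate() reports the evaluation loss and runtime statistics rather than accuracy.

    ```python
    ## Optional: evaluate the fine-tuned model on the evaluation subset.
    metrics = trainer.evaluate()
    print(metrics)  ## Includes eval_loss, eval_runtime, epoch, ...
    ```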
- Push your model to Hugging Face.

  - Log in directly to the Hub via the huggingface_hub library using your access token:

    ```python
    from huggingface_hub import notebook_login

    notebook_login()
    ```
  - Push your model:

    ```python
    trainer.push_to_hub("MyModelName")  # (1)
    ```

    Where:
    (1) "MyModelName" must be replaced by a name of your choice. Note that the Trainer uses this string as the commit message and pushes to a Hub repository named after the output_dir of your TrainingArguments (here, repo_name = "MyRepo"), unless hub_model_id is set.
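  - OPTIONAL: You can confirm that the repository now exists on the Hub without downloading any weights. A minimal sketch; "MY_ORGANIZATION/MyRepo" is a placeholder for your own namespace and repository name.

    ```python
    ## Optional: check that the pushed repository is visible on the Hub.
    from huggingface_hub import HfApi

    info = HfApi().model_info("MY_ORGANIZATION/MyRepo")
    print(info)  ## Repository metadata (files, tags, last commit, ...)
    ```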
  - OPTIONAL: Download the model to test its availability:

    ```python
    from transformers import AutoModelForSequenceClassification

    model_name = "MY_ORGANIZATION/MyRepo"  # (1)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    ```

    Where:
    (1) "MY_ORGANIZATION" and "MyRepo" must be replaced by your own values.
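  - OPTIONAL: You can also test the pushed model end to end with the pipeline API. A minimal sketch; it assumes the repository is public or that you are logged in, and reuses the placeholder repository name from above.

    ```python
    ## Optional: run the pushed model on a sample sentence via pipeline.
    from transformers import pipeline

    classifier = pipeline("text-classification", model="MY_ORGANIZATION/MyRepo")
    print(classifier("This movie was great!"))  ## [{'label': ..., 'score': ...}]
    ```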
When running a Python script, for example via a job on the Saagie platform, the interactive notebook_login() prompt is not available; log in programmatically before pushing your model, as shown in the sketch below. Alternatively, as described in the Hugging Face documentation, trained model files can be packaged and uploaded to Hugging Face manually.
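A minimal sketch of that programmatic login, assuming your access token is stored in an environment variable named HF_TOKEN (the variable name is an assumption; configure it in your job's environment):

```python
## Non-interactive login for a scripted job (e.g. a Saagie job).
import os
from huggingface_hub import login

## HF_TOKEN is an assumed environment variable holding your access token.
login(token=os.environ["HF_TOKEN"])
trainer.push_to_hub("MyModelName")
```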