Read and Write Files From Amazon S3 Bucket With R
Declare your AWS credentials as environment variables in your Saagie project. This makes them easy to modify and keeps them out of Git when your project is under version control.
You can also declare your environment variables directly from your R code, but we do not recommend this solution.
    key <- 'BLKIUG450KFBB'
    secret <- 'oihKJFuhfuh/953oiof'
    region <- 'eu-west-3'
    Sys.setenv(AWS_ACCESS_KEY_ID = key,
               AWS_SECRET_ACCESS_KEY = secret,
               AWS_DEFAULT_REGION = region)
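When the variables are declared in the Saagie project instead, it can be useful to check that the R session actually sees them before calling Amazon S3. The following is a minimal sketch, not part of either package: the helper name missing_aws_vars is hypothetical, and it assumes the three variable names used above.

```r
# Hypothetical helper: returns the names of the expected AWS variables
# that are unset or empty in the current R session.
missing_aws_vars <- function(required = c("AWS_ACCESS_KEY_ID",
                                          "AWS_SECRET_ACCESS_KEY",
                                          "AWS_DEFAULT_REGION")) {
  values <- Sys.getenv(required, unset = "")
  required[!nzchar(values)]
}

missing_vars <- missing_aws_vars()
if (length(missing_vars) > 0) {
  # Report the problem without printing any credential value.
  message("Missing environment variables: ", paste(missing_vars, collapse = ", "))
}
```

Replacing message() with stop() would make a job fail early instead of failing later on the first S3 call.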
You can now read and write files from an Amazon S3 bucket using the arrow or aws.s3 package, as shown in the following examples.
The arrow package can read and write CSV and Parquet files, either locally or directly in an Amazon S3 bucket.
    library(arrow)

    # Get a handle on the bucket.
    bucket <- s3_bucket(bucket_name)
    # Create a path to the file inside the bucket.
    path <- bucket$path(object_name)
    # Write a CSV file to the created path.
    write_csv_arrow(iris, path)
    # Read the file back from the path.
    iris2 <- read_csv_arrow(path)
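The same package handles Parquet files. The sketch below writes to a local temporary file so that it runs without any bucket; the variable names parquet_file and iris_parquet are illustrative. A path created with bucket$path() should be accepted in place of the local file name, but that is an assumption to check against your arrow version.

```r
library(arrow)

# Illustrative local target. A path created with bucket$path() could be
# passed instead to write directly to Amazon S3 (assumption, see above).
parquet_file <- tempfile(fileext = ".parquet")

# Write the data frame as a Parquet file.
write_parquet(iris, parquet_file)

# Read the Parquet file back into a data frame.
iris_parquet <- read_parquet(parquet_file)
```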
The aws.s3 package is a library that can interact with Amazon S3 Bucket in different ways. It is slower than the arrow package, but has more features.
    library(aws.s3)
    library(data.table) # Required to read from and write to the RAM

    # Upload the data from the RAM.
    s3write_using(iris, FUN = fwrite, object = object_name, bucket = bucket_name)
    # Read the file from the Amazon S3 bucket into the RAM.
    iris3 <- s3read_using(FUN = fread, object = object_name, bucket = bucket_name)
In the examples on this page, bucket_name must be declared beforehand, using bucket_name <- 'saagie-service'.
Likewise, object_name must be declared beforehand, using object_name <- 'documentation-s3/doc-r/iris.csv'.
    # List the available buckets.
    bucketlist()
    # List the files in the bucket.
    get_bucket(bucket_name)

    ##### Uploading a file from the disk #####
    # Write the file to the disk, then upload it.
    write.csv(iris, 'iris.csv', row.names = FALSE)
    put_object(file = 'iris.csv', object = object_name, bucket = bucket_name)

    # Another way to read from the RAM.
    iris4 <- data.table::fread(rawToChar(get_object(object = object_name, bucket = bucket_name)))

    # To read the file from the disk: write the binary content to the disk and read it.
    # No additional library is needed.
    writeBin(get_object(object = object_name, bucket = bucket_name, as = 'raw'), con = 'iris5.csv')
    iris5 <- read.csv('iris5.csv')