Manually Upload Large Files to HDFS
To manually upload files larger than a few gigabytes to HDFS via HUE, you can:

- Split your file into several Zip archive parts smaller than 1 GB. To do so, use 7-Zip under Windows, or create a Zip archive and cut it into parts with the split command under Linux.
- Upload the parts to HDFS via HUE.
- Create and run a Sqoop job as follows:
mypath="/hdfspath/to/data/"     # HDFS directory containing the uploaded parts
myzip="name of my file"         # the file name must be without the .zip extension

hadoop fs -chmod 777 "$mypath"                   # make the directory writable
hadoop fs -ls "$mypath$myzip.zip".*              # list the uploaded parts
hadoop fs -cat "$mypath$myzip.zip".* > file.zip  # reassemble the parts locally
ls -la
unzip file.zip -d "$myzip"                       # extract into a directory named after the archive
ls -la "$myzip/"
hadoop fs -put -f $( echo "$myzip/" | sed s/\ /\%20/g ) "$mypath"  # push the extracted files back, encoding spaces as %20
Replace the mypath and myzip variables with your own values.
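The Linux splitting step relies on the fact that parts produced by split can later be concatenated back into the original archive byte for byte, which is exactly what the hadoop fs -cat line in the job does on the HDFS side. A minimal local sketch of that round trip (the file name data.zip and the part size are illustrative; use a size under 1 GB such as -b 900m for real archives):

```shell
# Stand-in for the real Zip archive (illustrative content only).
printf 'placeholder archive contents\n' > data.zip

# Cut the archive into fixed-size parts: data.zip.aa, data.zip.ab, ...
# For real uploads, use e.g. "-b 900m" to stay under 1 GB per part.
split -b 10 data.zip data.zip.

# The glob lists parts in order (aa, ab, ...), so concatenation
# restores the original archive exactly.
cat data.zip.* > restored.zip
cmp data.zip restored.zip && echo "reassembled OK"
```

The same idea carries over to HDFS: after uploading data.zip.aa, data.zip.ab, ... via HUE, hadoop fs -cat "$mypath$myzip.zip".* streams the parts in glob order and rebuilds the original Zip file.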