Read and Write Files From HDFS, WebHDFS, and HTTPFS With HDFS
To execute the following examples, make sure you have created the following environment variables:
Protocol | Key | Value
---|---|---
HDFS | IP_HDFS | IP or full name of the
HDFS | PORT_HDFS |
WebHDFS | IP_WEBHDFS | IP or full name of the
WebHDFS | PORT_WEBHDFS |
HTTPFS | IP_HTTPFS | IP or full name of the
HTTPFS | PORT_HTTPFS |

These are default values.
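Before running the examples, it can help to confirm that the variables are actually set. The following is a minimal sketch, not part of the original procedure: `check_hdfs_env` is a hypothetical helper name, and the variable list is taken from the commands used on this page.

```shell
# Hypothetical helper: report which of the variables used on this page
# are unset or empty. Written for POSIX sh, so it avoids bash-only syntax.
check_hdfs_env() {
    missing=""
    for name in IP_HDFS PORT_HDFS IP_WEBHDFS PORT_WEBHDFS IP_HTTPFS PORT_HTTPFS; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing environment variables:$missing" >&2
        return 1
    fi
    return 0
}
```

Calling `check_hdfs_env` returns a non-zero status and lists the missing names on standard error if any variable is absent.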
Read and Write Files With the HDFS Protocol
For more information, see the official HDFS protocol documentation.

    # To authenticate.
    export HADOOP_USER_NAME="my_user"
    # To get file.
    hdfs dfs -get hdfs://$IP_HDFS:$PORT_HDFS/distant/path/my_distant_file my_local_file

    # To authenticate.
    export HADOOP_USER_NAME="my_user"
    # To place file.
    hdfs dfs -put my_local_file hdfs://$IP_HDFS:$PORT_HDFS/distant/path/
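If exporting `HADOOP_USER_NAME` for the whole session is not desirable, the user name can be scoped to a single command. The sketch below is one possible way to do this, assuming the `hdfs` client is on the `PATH`; `hdfs_get_as` is a hypothetical helper name, not an HDFS command.

```shell
# Hypothetical helper: download one file as a given user without
# leaving HADOOP_USER_NAME set in the calling shell.
# Arguments: user name, absolute HDFS path, local destination.
hdfs_get_as() {
    user=$1
    remote_path=$2
    local_file=$3
    (
        # The export only lives inside this subshell.
        export HADOOP_USER_NAME="$user"
        hdfs dfs -get "hdfs://$IP_HDFS:$PORT_HDFS$remote_path" "$local_file"
    )
}
```

For example, `hdfs_get_as my_user /distant/path/my_distant_file my_local_file` runs the same `-get` command as above.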
Read and Write Files With the WebHDFS Protocol
For more information, see the official WebHDFS protocol documentation.

    # To get file.
    curl -L -X GET "http://$IP_WEBHDFS:$PORT_WEBHDFS/webhdfs/v1/distant/path/my_distant_file?user.name=my_user&op=OPEN"
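Every WebHDFS call on this page follows the same URL pattern: host, port, the `/webhdfs/v1` prefix, the file path, then `user.name` and `op` as query parameters. The sketch below makes that pattern explicit. `webhdfs_url` is a hypothetical helper name, and it assumes the path and user name need no URL encoding.

```shell
# Hypothetical helper: build a WebHDFS URL from the environment variables.
# Arguments: absolute HDFS path, operation (for example OPEN), user name.
# Assumption: no argument contains characters that need URL encoding.
webhdfs_url() {
    path=$1
    op=$2
    user=$3
    printf 'http://%s:%s/webhdfs/v1%s?user.name=%s&op=%s\n' \
        "$IP_WEBHDFS" "$PORT_WEBHDFS" "$path" "$user" "$op"
}
```

It could then be used as `curl -L "$(webhdfs_url /distant/path/my_distant_file OPEN my_user)" -o my_local_file`.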
- Request the data node location from the name node by running the following lines of code:

      # To get location.
      RET=$(curl -XPUT --silent --include "http://$IP_WEBHDFS:$PORT_WEBHDFS/webhdfs/v1/distant/path/my_distant_file?user.name=my_user&op=CREATE" | grep 'Location' | cut -d" " -f2 | tr -d '\r')
      echo "$RET"

  Where:
  - curl sends the HTTP PUT request.
  - grep keeps only the Location header line.
  - cut retrieves only the second element, which is the URL.
  - tr removes the carriage return that ends each HTTP header line, so that the URL can be reused as-is.
  - echo displays the return.
- Put the file in the data node location by running the following lines of code:

      # To place file.
      curl -XPUT --include -T my_local_file "$RET"

  Where the variable $RET is the return of the first step.
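The header parsing in the first step can be tried without a cluster by feeding it a canned response. The sketch below is illustrative only: `extract_location` is a hypothetical helper name, and the sample response in the usage note is made up rather than captured from a real name node.

```shell
# Hypothetical helper: read an HTTP response (headers included) on stdin
# and print the URL carried by the first Location header.
extract_location() {
    # -i: header names are case-insensitive in HTTP.
    # tr strips the carriage return that terminates each header line.
    grep -i '^Location:' | head -n 1 | cut -d' ' -f2 | tr -d '\r'
}
```

For example, `printf 'HTTP/1.1 307 TEMPORARY_REDIRECT\r\nLocation: http://dn1:50075/webhdfs/v1/f?op=CREATE\r\n\r\n' | extract_location` prints only the URL. Against a real cluster, the same function can replace the `grep`, `cut`, and `tr` stages of the first step.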
Read and Write Files With the WebHDFS Protocol With a Kerberized Cluster
When using the WebHDFS protocol with a Kerberized cluster, make sure you are using the correct port (50470).
Then, run the following curl command after getting a valid ticket from a kinit command.

    # With Kerberos, provided you have a valid Kerberos ticket obtained with kinit.
    curl -k --negotiate -u : "https://nn1:50470/webhdfs/v1/?op=LISTSTATUS"
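Because the request fails without a valid ticket, a script may want to check for one first. The sketch below assumes the MIT Kerberos `klist` client, whose `-s` option prints nothing and reports through its exit status whether the credentials cache is readable and unexpired; `require_kerberos_ticket` is a hypothetical helper name.

```shell
# Hypothetical helper: fail early when no valid Kerberos ticket is cached.
# Assumption: MIT Kerberos klist, where "klist -s" is silent and exits
# with a non-zero status if the cache is missing or expired.
require_kerberos_ticket() {
    if ! klist -s; then
        echo "No valid Kerberos ticket found; run kinit first." >&2
        return 1
    fi
}
```

It can be chained with the request, for example `require_kerberos_ticket && curl -k --negotiate -u : "https://nn1:50470/webhdfs/v1/?op=LISTSTATUS"`.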
Read and Write Files With the HTTPFS Protocol
    # To get file.
    curl -X GET "http://$IP_HTTPFS:$PORT_HTTPFS/webhdfs/v1/distant/path/my_distant_file?user.name=my_user&op=OPEN" --header "Content-Type:application/octet-stream" -o "my_local_file"

    # To place file.
    curl -X PUT "http://$IP_HTTPFS:$PORT_HTTPFS/webhdfs/v1/distant/path/my_distant_file?user.name=my_user&op=CREATE&data=true" --header "Content-Type:application/octet-stream" -T "my_local_file"
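The upload command above does not report whether the file was actually created. One way to check is to have curl print the HTTP status code and compare it with 201 (Created), which is the status the WebHDFS REST API documents for a successful CREATE. The sketch below assumes HTTPFS returns the same code; `httpfs_put` is a hypothetical helper name.

```shell
# Hypothetical helper: upload a file through HTTPFS and check the result.
# Arguments: local file, absolute HDFS path, user name.
# Assumption: a successful CREATE answers with HTTP status 201.
httpfs_put() {
    local_file=$1
    remote_path=$2
    user=$3
    url="http://$IP_HTTPFS:$PORT_HTTPFS/webhdfs/v1$remote_path?user.name=$user&op=CREATE&data=true"
    # -s: no progress meter; -o /dev/null: discard the body;
    # -w '%{http_code}': print only the final HTTP status code.
    status=$(curl -s -o /dev/null -w '%{http_code}' -X PUT "$url" \
        --header "Content-Type:application/octet-stream" -T "$local_file")
    if [ "$status" = "201" ]; then
        return 0
    fi
    echo "Upload failed with HTTP status $status" >&2
    return 1
}
```

For example, `httpfs_put my_local_file /distant/path/my_distant_file my_user` returns a non-zero status if the server answers with anything other than 201.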