Hadoop HDFS cheat sheet

These are the most helpful commands for the Hadoop HDFS command line shell.
I ran these commands on Hadoop 2.7.1 (OS: CentOS 6.7).

ls

list HDFS directory contents
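A minimal sketch of typical invocations, assuming the `hadoop fs` form of the shell (`hdfs dfs` behaves the same way) and a hypothetical directory `/user/hadoop`:

```shell
# HDFS_DIR is a hypothetical path used only for illustration
HDFS_DIR=/user/hadoop

# List the directory contents
hadoop fs -ls "$HDFS_DIR"

# List recursively, with human-readable file sizes
hadoop fs -ls -R -h "$HDFS_DIR"
```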

mkdir

create a folder in the HDFS
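For example, assuming a hypothetical target path (the directory names are placeholders, not from the original post):

```shell
# NEW_DIR is a hypothetical HDFS path
NEW_DIR=/user/hadoop/demo/input

# -p also creates any missing parent directories along the path
hadoop fs -mkdir -p "$NEW_DIR"
```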

put

copy a file from the given local path to the remote HDFS location
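A short sketch of an upload; the local file name and the HDFS destination are both assumptions chosen for illustration:

```shell
# LOCAL_FILE and HDFS_DEST are hypothetical names
LOCAL_FILE=./data.csv
HDFS_DEST=/user/hadoop/demo/input

# Upload the local file into the HDFS directory
hadoop fs -put "$LOCAL_FILE" "$HDFS_DEST"
```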

cat

view the contents of a file stored in HDFS
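Something like the following should work, with a placeholder file path; piping through `head` is just one way to avoid dumping a large file to the terminal:

```shell
# HDFS_FILE is a hypothetical path
HDFS_FILE=/user/hadoop/demo/input/data.csv

# Print the whole file to standard output
hadoop fs -cat "$HDFS_FILE"

# Show only the first 20 lines of a large file
hadoop fs -cat "$HDFS_FILE" | head -n 20
```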

get

copy files from HDFS to the local file system
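As an illustration, with an assumed HDFS file and an assumed local target directory:

```shell
# HDFS_FILE and LOCAL_DIR are hypothetical
HDFS_FILE=/user/hadoop/demo/input/data.csv
LOCAL_DIR=/tmp

# Download the file from HDFS into the local directory
hadoop fs -get "$HDFS_FILE" "$LOCAL_DIR"
```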

cp

copy files from source to destination
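A sketch of a copy where both source and destination live in HDFS; the paths are invented for the example:

```shell
# SRC and DEST are hypothetical HDFS paths
SRC=/user/hadoop/demo/input/data.csv
DEST=/user/hadoop/demo/backup

# Copy the file; the source stays in place
hadoop fs -cp "$SRC" "$DEST"
```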

mv

move files from source to destination
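The same idea for a move, again with placeholder paths. Note that the official documentation states that moving files across file systems is not permitted:

```shell
# SRC and DEST are hypothetical HDFS paths
SRC=/user/hadoop/demo/input/data.csv
DEST=/user/hadoop/demo/archive

# Move the file; it no longer exists at SRC afterwards
hadoop fs -mv "$SRC" "$DEST"
```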

chmod

change the permissions of files. The -R option makes the change recursively through the directory structure. More information about permissions is available in the HDFS Permissions Guide.
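Two possible invocations, assuming octal modes and a hypothetical directory tree:

```shell
# TARGET is a hypothetical HDFS directory
TARGET=/user/hadoop/demo

# Set rw-r--r-- on a single file
hadoop fs -chmod 644 "$TARGET/input/data.csv"

# Set rwxr-x--- on the directory and everything below it
hadoop fs -chmod -R 750 "$TARGET"
```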

rm

delete the files specified as arguments. To delete an empty directory, use rmdir. The -R option deletes a directory and any content under it recursively.
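The three cases could look like this; the directory and file names are assumptions made for the example:

```shell
# DEMO_DIR and the names under it are hypothetical
DEMO_DIR=/user/hadoop/demo

# Delete a single file
hadoop fs -rm "$DEMO_DIR/input/data.csv"

# Delete a directory only if it is empty
hadoop fs -rmdir "$DEMO_DIR/input"

# Delete a directory and everything under it
hadoop fs -rm -R "$DEMO_DIR"
```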

checksum

returns the checksum information of a file
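For instance, with a placeholder file path:

```shell
# HDFS_FILE is a hypothetical path
HDFS_FILE=/user/hadoop/demo/input/data.csv

# Print the checksum information for the file
hadoop fs -checksum "$HDFS_FILE"
```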

count

count the number of directories, files and bytes under the paths that match the specified file pattern.
The -h parameter shows the sizes in a human-readable format.
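A sketch with a hypothetical directory. According to the official documentation, the output columns are DIR_COUNT, FILE_COUNT, CONTENT_SIZE and PATHNAME:

```shell
# HDFS_DIR is a hypothetical path
HDFS_DIR=/user/hadoop

# Directory count, file count and content size in bytes
hadoop fs -count "$HDFS_DIR"

# Same information, with the size in human-readable units
hadoop fs -count -h "$HDFS_DIR"
```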

du

displays the sizes of the files and directories contained in the given directory, or the length of the file if the path is just a file.
The -h parameter formats file sizes in a “human-readable” fashion; the -s parameter produces an aggregate summary of file lengths.
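The options can be combined, roughly as follows (the directory is a placeholder):

```shell
# HDFS_DIR is a hypothetical path
HDFS_DIR=/user/hadoop

# Size of each entry directly under the directory, in bytes
hadoop fs -du "$HDFS_DIR"

# Same listing with human-readable sizes
hadoop fs -du -h "$HDFS_DIR"

# One aggregated total for the whole directory
hadoop fs -du -s -h "$HDFS_DIR"
```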

[Figure: aggregated size visualization]

df

displays free space. The -h option will format file sizes in a “human-readable” fashion
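A one-line example, assuming you want the figures for the file system that contains the root path:

```shell
# HDFS_ROOT is chosen for illustration; any HDFS path can be given
HDFS_ROOT=/

# Show capacity, used and available space in human-readable units
hadoop fs -df -h "$HDFS_ROOT"
```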

expunge

empty the trash. More information about the trash feature is available in the HDFS Architecture Guide.
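In Hadoop 2.7.1 the command takes no arguments, so the invocation is simply the following (shown in the `hadoop fs` form, which is an assumption about how the shell is being called):

```shell
# Empty the trash
hadoop fs -expunge
```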

setrep

changes the replication factor of a file. If the path is a directory, the command recursively changes the replication factor of all files under the directory tree rooted at that path.
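A sketch assuming a target replication factor of 2 and hypothetical paths; the -w flag asks the command to wait until replication has completed, which can take a long time on large trees:

```shell
# REPLICAS and DEMO_DIR are hypothetical values
REPLICAS=2
DEMO_DIR=/user/hadoop/demo

# Change the replication factor of a single file
hadoop fs -setrep "$REPLICAS" "$DEMO_DIR/input/data.csv"

# Change it for every file under a directory and wait for completion
hadoop fs -setrep -w "$REPLICAS" "$DEMO_DIR"
```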

fsck

shows the number of blocks, their locations among the cluster nodes, and the status of HDFS for the given path
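Unlike the commands above, fsck is run through the `hdfs` script. One possible invocation, with a placeholder path:

```shell
# HDFS_DIR is a hypothetical path
HDFS_DIR=/user/hadoop

# Check the path and print its files, their blocks and the block locations
hdfs fsck "$HDFS_DIR" -files -blocks -locations
```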

report

find out the total usage of the cluster
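The report is an option of the dfsadmin tool, so, assuming the `hdfs` script is on the PATH, the call would be:

```shell
# Print basic file system information and statistics for the cluster
hdfs dfsadmin -report
```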

version

shows the installed Hadoop version
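Note that this one is a subcommand of `hadoop` itself rather than an option of `hadoop fs`:

```shell
# Print the Hadoop version and build details
hadoop version
```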

For the list of all the commands, check the official documentation: Hadoop FileSystem Shell

Matteo