HDFS (Hadoop Distributed File System) is Hadoop’s primary file storage system.
It works well with large volumes of data, reduces I/O, scales well, and offers high availability and fault tolerance thanks to data replication.
HDFS is typically used together with a column-oriented database management system called HBase, which stores its data on HDFS.
- NameNode: A basic cluster has only one. It is responsible for:
  - Regulating client access to the files.
  - Keeping the file system metadata in memory.
  - Tracking which file blocks each DataNode holds.
- DataNode: DataNodes serve the clients' read and write requests and replicate the blocks across the different nodes.
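To make the block bookkeeping concrete, the following sketch estimates how many blocks a file is split into and how many block replicas the DataNodes end up storing. It assumes a 128 MB block size and a replication factor of 3, which are common defaults; the real values come from the dfs.blocksize and dfs.replication settings of your cluster. The function names are made up for this illustration.

```shell
#!/usr/bin/env bash
# Rough arithmetic only: assumes 128 MB blocks and replication factor 3
# unless other values are passed in. Function names are illustrative.

# block_count FILE_BYTES [BLOCK_BYTES]
block_count() {
  local file_bytes="$1"
  local block_bytes="${2:-$((128 * 1024 * 1024))}"
  # Ceiling division: a partially filled last block still counts as one block.
  echo $(( (file_bytes + block_bytes - 1) / block_bytes ))
}

# replica_count FILE_BYTES [REPLICATION] [BLOCK_BYTES]
replica_count() {
  local file_bytes="$1"
  local replication="${2:-3}"
  local block_bytes="${3:-$((128 * 1024 * 1024))}"
  echo $(( $(block_count "$file_bytes" "$block_bytes") * replication ))
}

# Example: a 1 GiB file
block_count $((1024 * 1024 * 1024))     # prints 8
replica_count $((1024 * 1024 * 1024))   # prints 24
```

The NameNode only has to remember the 8 block entries and where their replicas live; the 24 replicas themselves are stored on the DataNodes.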
Commands for manipulating HDFS files
There are two ways to query and manipulate HDFS files from the command line: “hadoop fs” and “hdfs dfs”.
The difference is that “fs” refers to a generic file system and can point to any supported one, such as the local FS, HFTP FS, S3 FS, or HDFS itself. In contrast, “dfs” is specific to the HDFS file system.
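As a small illustration of the generic side, the URI scheme of the path is what tells “hadoop fs” which file system to use. The sketch below wraps the listing command in a function. HADOOP_CMD is a hypothetical override hook invented for this example (it is not a Hadoop setting), and the NameNode host and port in the usage comments are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: the path's URI scheme selects the file system that "hadoop fs" uses.
# HADOOP_CMD is a made-up hook so the function can be tried without a cluster,
# for example with HADOOP_CMD=echo to print the command instead of running it.

list_path() {
  "${HADOOP_CMD:-hadoop}" fs -ls "$1"
}

# Usage (commented out because it needs a Hadoop installation;
# host name and port are placeholders):
#   list_path file:///tmp                         # local file system
#   list_path hdfs://namenode.example.com:8020/   # explicit HDFS URI
#   list_path /                                   # default file system (fs.defaultFS)
```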
Commands to manipulate files with “hadoop fs”
These commands are run from the command line. Before you can use them, you need to start the Hadoop services:
$ hadoop/sbin/start-dfs.sh
$ hadoop/sbin/start-yarn.sh
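One way to confirm that the start scripts worked is to look at the output of jps, the JDK tool that lists running Java processes. The sketch below reads that output and prints any expected daemon that is missing. The function name is made up, and the list of daemons assumes a single-node setup where HDFS and YARN run on the same machine.

```shell
#!/usr/bin/env bash
# Sketch: report which Hadoop daemons are absent from `jps` output.
# Assumes a single-node setup; adjust the daemon list for a real cluster.

# Reads `jps` output ("<pid> <name>" per line) on standard input.
missing_daemons() {
  local jps_output daemon
  jps_output=$(cat)
  for daemon in NameNode DataNode ResourceManager NodeManager; do
    # Compare the whole second column so that "SecondaryNameNode"
    # is not mistaken for "NameNode".
    printf '%s\n' "$jps_output" |
      awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' ||
      echo "$daemon"
  done
}

# Usage (needs a JDK on the PATH):
#   jps | missing_daemons
```

An empty result means all four daemons were found.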
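Format the NameNode to reset the file system metadata and delete past references. This wipes the metadata of any data already stored in HDFS, so it is normally done only on a new or disposable cluster, with the services stopped (recent releases prefer the equivalent “hdfs namenode -format”):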
$ hadoop namenode -format
Copy a local file to HDFS:
$ hadoop fs -put /ruta-local/ficheroLocal.txt /ruta-hdfs/ficheroHDFS.txt
$ hadoop fs -put /home/datos/consumos.csv /user/hadoop/consumos/consumos.csv
Copy a file from HDFS to the local file system:
$ hadoop fs -get /ruta-hdfs/ficheroHDFS.txt /ruta-local/ficheroLocal.txt
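The two copy commands can be combined into a simple round-trip check: upload a file, download it again under another name, and compare the two local copies. This is only a sketch. The function name and the HADOOP_CMD override hook are made up for the example (HADOOP_CMD is not a Hadoop setting), and the paths in the usage comment are the placeholder paths used above.

```shell
#!/usr/bin/env bash
# Sketch: upload a file, download it again, and verify the copy is identical.
# HADOOP_CMD is a made-up hook that lets the function run against a stub.

# hdfs_roundtrip LOCAL_SRC HDFS_PATH LOCAL_COPY
hdfs_roundtrip() {
  local src="$1"
  local hdfs_path="$2"
  local copy="$3"
  # -put fails if the destination already exists (add -f to overwrite).
  "${HADOOP_CMD:-hadoop}" fs -put "$src" "$hdfs_path" &&
    "${HADOOP_CMD:-hadoop}" fs -get "$hdfs_path" "$copy" &&
    cmp -s "$src" "$copy"   # exit status 0 only if both files match byte for byte
}

# Usage (needs a running cluster):
#   hdfs_roundtrip /ruta-local/ficheroLocal.txt /ruta-hdfs/ficheroHDFS.txt /tmp/copia.txt \
#     && echo "round trip OK"
```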
List the contents of the root directory:
$ hadoop fs -ls /
Display the contents of a file stored in HDFS:
$ hadoop fs -cat /ruta-hdfs/ficheroHDFS.txt
Create a directory:
$ hadoop fs -mkdir miDirectorio
Create a directory together with any missing parent directories:
$ hadoop fs -mkdir -p miDirectorio/subdirectorio
Delete a directory and all its contents:
$ hadoop fs -rm -r miDirectorio
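The directory commands above can be combined into a small helper that empties a working directory before a new run. The sketch uses “hadoop fs -test -d”, which returns exit status 0 when the path is a directory, so that the delete is only attempted when there is something to remove. The function name and the HADOOP_CMD override hook are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch: delete an HDFS directory if it exists, then recreate it empty.
# HADOOP_CMD is a made-up hook (not a Hadoop setting) for dry runs and tests.

# reset_hdfs_dir HDFS_DIR
reset_hdfs_dir() {
  local dir="$1"
  if "${HADOOP_CMD:-hadoop}" fs -test -d "$dir"; then
    # Removes the directory and everything below it.
    "${HADOOP_CMD:-hadoop}" fs -rm -r "$dir" || return 1
  fi
  # -p also creates missing parents and does not fail if the directory exists.
  "${HADOOP_CMD:-hadoop}" fs -mkdir -p "$dir"
}

# Usage (needs a running cluster):
#   reset_hdfs_dir miDirectorio/subdirectorio
```

Setting HADOOP_CMD=echo prints the three commands instead of running them, which is a cheap way to review what would be deleted.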
Commands to manipulate files with “hdfs dfs”
List the root directory:
$ hdfs dfs -ls /
List the subdirectory “prueba” (“test”):
$ hdfs dfs -ls /prueba
Copy local files to HDFS:
$ hdfs dfs -copyFromLocal /directorio_local/ /directorio_hdfs/
Copy files from HDFS to the local file system:
$ hdfs dfs -get /directorio_hdfs/ /directorio_local/
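After copying a whole directory, a quick sanity check is to compare the number of files on both sides. The sketch below relies on “hdfs dfs -count”, whose output columns are directory count, file count, content size, and path. The function names and the HDFS_CMD override hook are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: compare the file count of a local directory with its HDFS copy.
# HDFS_CMD is a made-up hook (not a Hadoop setting) for dry runs and tests.

# Prints the number of files under an HDFS path (second column of -count).
hdfs_file_count() {
  "${HDFS_CMD:-hdfs}" dfs -count "$1" | awk '{ print $2 }'
}

# Prints the number of regular files under a local directory.
local_file_count() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Usage (needs a running cluster):
#   [ "$(local_file_count /directorio_local/)" = "$(hdfs_file_count /directorio_hdfs/)" ] \
#     && echo "file counts match"
```

Matching counts do not prove that the contents are identical, but a mismatch is a fast signal that the copy was incomplete.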
Other commands: appendToFile, cat, chgrp, chmod, chown, copyFromLocal, copyToLocal, count, cp, du, dus, expunge, get, getfacl, getmerge, ls, lsr, mkdir, moveFromLocal, moveToLocal, mv, put, rm, rmr, setfacl, setrep, stat, tail, test, text, touchz. Of these, dus, lsr, and rmr are deprecated in favor of “du -s”, “ls -R”, and “rm -r”.
Source: official web documentation of the commands