Apache Hadoop Archives

Apache Hadoop YARN

by Diego Calvo | Sep 12, 2018 | Apache Hadoop, Big Data

YARN definition: YARN (Yet Another Resource Negotiator) is a data operating system and distributed resource manager, introduced with Hadoop 2 as the evolution of Hadoop MapReduce. The most significant change in Hadoop 2 over Hadoop 1 is that the thread technology...

Kerberos

by Diego Calvo | Sep 4, 2018 | Apache Hadoop, Big Data

Kerberos definition: Kerberos is an authentication protocol that allows two computers to prove their identity to each other securely. It is implemented on a client-server architecture and works on the basis of tickets that serve to demonstrate the identity of the...

Connect to Hadoop HDFS with Scala

by Diego Calvo | Aug 10, 2018 | Apache Hadoop, Apache Spark, Big Data

Write data to HDFS: Example of how to write RDD data to Hadoop HDFS.
Delete the file if it exists:
    import scala.sys.process._
    "hdfs dfs -rm -R /pruebas" !
Write an RDD to HDFS:
    val rdd = sc.parallelize(List((0, 60), (0, 56), (0, 54), (0,...

Apache Flink (batch & streaming processing)

by Diego Calvo | Jul 5, 2018 | Apache Hadoop, Big Data

Flink definition: Apache Flink is a native low-latency data flow processing engine that provides data distribution, communication and fault tolerance capabilities. Flink was developed in Java and Scala by the Technical University of Berlin and is currently the start-up...

Big data-Data visualization tools

by Diego Calvo | Jul 5, 2018 | Apache Hadoop, Big Data

Data visualization tools for Big Data ecosystems are classified in the following blocks. Notebooks: Jupyter, Zeppelin. Graphic libraries: Google Charts, D3.js, Plotly. Graphic analysis tools: Kibana, Shiny. Video Recorder: Loggly. Proprietary tools: Splunk, Tableau, Qlik, Google...

Big data processing frameworks

by Diego Calvo | Jul 5, 2018 | Apache Hadoop, Big Data

Data processing frameworks for Big Data ecosystems are classified in the following blocks. Batch processing: Hadoop MapReduce (batch processing engine). Real-time processing: Apache Storm, Apache Samza, IBM InfoSphere, Apache S4 (Yahoo), Apache Tez. Hybrid...
