- Apache Ranger is a framework to enable, monitor and manage comprehensive data security across the Hadoop platform.
- Apache Sentry is a system for enforcing fine-grained, role-based authorization to data and metadata stored in a Hadoop cluster.
- Apache Knox is an application gateway for interacting with the REST APIs and UIs of Apache Hadoop deployments.
- Kerberos is a network authentication protocol that lets two parties prove their identity to each other securely over an untrusted network.
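To make role-based authorization more concrete, here is a minimal sketch in plain Python. It is not Sentry's API: the `Privilege` class, the role, group, and user names, and the `is_allowed` helper are all hypothetical and exist only for illustration. It assumes a model in which users belong to groups, groups are granted roles, and roles carry privileges on databases and tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Privilege:
    action: str    # e.g. "SELECT" or "INSERT"
    database: str
    table: str     # "*" matches every table in the database


# Hypothetical policy data, for illustration only.
ROLE_PRIVILEGES = {
    "analyst": {Privilege("SELECT", "sales", "*")},
    "etl": {
        Privilege("INSERT", "sales", "orders"),
        Privilege("SELECT", "sales", "orders"),
    },
}
GROUP_ROLES = {"analysts": {"analyst"}, "pipeline": {"etl"}}
USER_GROUPS = {"alice": {"analysts"}, "bob": {"pipeline"}}


def is_allowed(user, action, database, table):
    """Return True if any role reachable from the user grants the request."""
    for group in USER_GROUPS.get(user, ()):
        for role in GROUP_ROLES.get(group, ()):
            for priv in ROLE_PRIVILEGES.get(role, ()):
                if (
                    priv.action == action
                    and priv.database == database
                    and priv.table in ("*", table)
                ):
                    return True
    # Deny by default: unknown users and ungranted actions are rejected.
    return False
```

For example, `is_allowed("alice", "SELECT", "sales", "orders")` is granted through the `analyst` role, while the same user is denied `INSERT` on that table.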
Machine learning tools
- Apache Mahout is a distributed linear algebra framework with a mathematically expressive Scala DSL, designed for quickly implementing algorithms.
- Spark MLlib is Spark's machine learning library; its original API is built on RDDs.
- Spark ML is the machine learning library that provides a higher-level API built on DataFrames.
- FlinkML is the machine learning library for Apache Flink.
Data labelling tools
- Apache Falcon
- Apache Atlas
Log processing tools
- Logstash: Open-source log management tool that loads, transforms, filters, and stores logs so that they can be searched.
- Apache Chukwa
- Fluentd: Open-source data collector for managing logs.
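The load, transform, filter, and save stages that log tools of this kind perform can be sketched in a few lines of Python. This is an illustration only, not the configuration syntax or API of Logstash, Chukwa, or Fluentd. The log line layout (`timestamp LEVEL message`), the `LINE_PATTERN` regex, and the function names are assumptions made for the example.

```python
import json
import re

# Assumed line layout: "<timestamp> <LEVEL> <message>"
LINE_PATTERN = re.compile(r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def parse(line):
    """Transform a raw line into a structured event, or None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def run_pipeline(lines, keep_levels=("WARN", "ERROR")):
    """Load raw lines, parse them, and keep only events at the chosen levels."""
    events = []
    for line in lines:
        event = parse(line)
        if event is None:
            continue  # drop lines that are not in the assumed layout
        if event["level"] not in keep_levels:
            continue  # filter out levels we are not interested in
        events.append(event)
    return events


def save(events):
    """Serialize events as JSON Lines, one event per line."""
    return "\n".join(json.dumps(event, sort_keys=True) for event in events)


def search(events, term):
    """Case-insensitive substring search over the event messages."""
    return [e for e in events if term.lower() in e["message"].lower()]


raw_lines = [
    "2024-01-01T10:00:00Z INFO service started",
    "2024-01-01T10:00:05Z ERROR disk quota exceeded",
    "not a log line",
    "2024-01-01T10:00:09Z WARN disk usage at 91%",
]
events = run_pipeline(raw_lines)
stored = save(events)
```

Here `events` holds the two structured events that survive filtering, and `search(events, "quota")` returns only the `ERROR` event. Real log tools add buffering, pluggable inputs and outputs, and an indexed store for search, all of which this sketch leaves out.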