NiFi installation and implementation

NiFi introduction

NiFi lets you build data pipelines in a web GUI.
Inside NiFi, each event handled by the system is called a flow file. A flow file is stored as a file and carries a set of attributes. Flow files are received, transformed, routed, split and transferred by processors. Many processors are available by default, including processors to:
  • Receive messages from Syslog, HTTP, FTP, HDFS, Kafka, …
  • Transfer messages to Kafka, Elasticsearch, Solr, MongoDB, SQL RDBMS, HDFS, HTTP, FTP, …
  • Transform messages: split a line of text into multiple attributes using Grok, regular expressions, mappings, ...
  • Aggregate multiple messages into a single new flow file
  • Route messages to different processors based on attributes or content.
  • Map content to a record defined in the external Registry component of the HDF platform. The Registry lets you define the format of so-called records, which can then be reused across multiple flows.
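To make the flow file and processor vocabulary more concrete, here is a toy model in Python. It is only an illustration of the concepts above, not NiFi code: the names FlowFile, split_lines and route_on_attribute are hypothetical and do not correspond to any NiFi API.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a flow file: a content payload plus key/value attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def split_lines(flow_file):
    """Mimic a 'split' processor: emit one new flow file per line of content."""
    parts = []
    for index, line in enumerate(flow_file.content.splitlines()):
        # Each child keeps the parent's attributes and records its position.
        attributes = dict(flow_file.attributes, **{"fragment.index": str(index)})
        parts.append(FlowFile(content=line, attributes=attributes))
    return parts


def route_on_attribute(flow_file, attribute, routes, default="unmatched"):
    """Mimic a 'route' processor: pick a destination name from an attribute value."""
    return routes.get(flow_file.attributes.get(attribute), default)
```

For instance, split_lines(FlowFile(b"a\nb", {"source": "syslog"})) yields two flow files that both keep the "source" attribute, and route_on_attribute can then send each of them to a destination chosen from that attribute.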
With the GUI, you can pause a processor and inspect its queues, watch how flow files move through the flow, read error messages, move connections between processors “live”, and more.
To protect the GUI, you can set up SSL access to it with client certificates, so that only the clients you allow can use it. Once SSL certificate authentication and authorization are enabled, you can give different permissions to different users (full admin, read-only, …).
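As a rough sketch of what certificate-protected access looks like from a client, the Python snippet below opens an HTTPS connection that presents a client certificate. Everything specific in it is an assumption to adapt: the hostname and port are placeholders, the path /nifi-api/flow/about is assumed to be available on your NiFi version, and the CA, certificate and key files are assumed to be in PEM format.

```python
import ssl
import urllib.request


def nifi_api_url(hostname, port, path="/nifi-api/flow/about"):
    """Build the HTTPS URL of a NiFi REST endpoint (path is an assumed example)."""
    return f"https://{hostname}:{port}{path}"


def build_client_context(ca_file, cert_file, key_file):
    """Create a TLS context that trusts ca_file and presents a client certificate."""
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def fetch_about(hostname, port, context, timeout=10.0):
    """Call the endpoint with the client certificate and return the response body."""
    url = nifi_api_url(hostname, port)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A client whose certificate is not authorized in NiFi should be rejected by the server, which is a quick way to check that the permissions behave as intended.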
NiFi also comes with a so-called expression language, which lets you compute attribute values with a small function-based syntax, for example ${filename:toUpper()}.
All the information about NiFi can be found on the documentation site: https://nifi.apache.org/docs.html

 

NiFi installation

Whether NiFi is installed during the initial cluster setup or added as a service to an existing cluster, the steps are as follows:
In the list of services supported by the cluster, choose NiFi and click “Next”.

1.services list.PNG

On the next screen, choose the host(s) of your cluster on which to run the service. If the service can be installed on more than one host, a green plus icon appears to the left of the host chosen by the wizard. Once the service is assigned to the host(s) you want, click “Next”.
On the “Assign Slaves and Clients” page, choose the host on which the “NiFi Certificate Authority” must run. Exactly one host must be selected. Then click “Next”.
On the “Customize service” page, you must set two passwords in the NiFi configuration. Open the “NiFi” tab and fill in a password for each of the proposed parameters. Click “Next”.
You can now review the changes that will be made. Once satisfied, click “Deploy”.
The next page shows the progress of the service deployment. When it finishes, any error that occurred during the process is reported with a link to the error logs. If everything went fine, click “Next”.
Click on “Complete”.
Once the installation is complete, you can access your newly deployed NiFi at http://<hostname>:9090/nifi
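A quick way to confirm that the GUI is answering after the installation is a small reachability check. The sketch below is a minimal example in Python using only the standard library; the function names are made up for this page, and the hostname you pass in is whichever host you assigned NiFi to in the wizard.

```python
import urllib.error
import urllib.request


def nifi_ui_url(hostname, port=9090):
    """Build the URL of the NiFi web GUI (9090 is the port used on this page)."""
    return f"http://{hostname}:{port}/nifi"


def nifi_ui_is_up(hostname, port=9090, timeout=5.0):
    """Return True if the NiFi GUI answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(nifi_ui_url(hostname, port), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout or HTTP error status.
        return False
```

NiFi can take some time to start, so a False result right after deployment does not necessarily mean the installation failed; retry after a short wait before looking at the logs.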
 
You can change some parameters, for instance the port on which the web GUI listens, in the first section of the configuration tab of the NiFi service in Ambari:

2.nifi config.PNG

Click “Save”. Once the configuration is saved, restart the NiFi service so that the change takes effect.
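The restart is normally done from the Ambari GUI, but it can also be scripted through the Ambari REST API by stopping and then starting the service. The sketch below only builds the requests and rests on several assumptions to verify against your Ambari version: the service is registered as NIFI, the states INSTALLED (stopped) and STARTED are accepted, and the Ambari URL, cluster name and credentials are placeholders.

```python
import base64
import json
import urllib.request


def service_state_request(ambari_url, cluster, user, password, state, service="NIFI"):
    """Build a PUT request asking Ambari to move a service to the given state.

    state is assumed to be "INSTALLED" (stopped) or "STARTED".
    """
    body = {
        "RequestInfo": {"context": f"Set {service} to {state} via REST"},
        "Body": {"ServiceInfo": {"state": state}},
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{ambari_url}/api/v1/clusters/{cluster}/services/{service}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        # Ambari expects the X-Requested-By header on modifying requests.
        headers={"Authorization": f"Basic {token}", "X-Requested-By": "ambari"},
    )
```

Sending the request with urllib.request.urlopen(request) first for INSTALLED and then for STARTED amounts to a restart. Ambari handles these requests asynchronously, so the stop should be allowed to finish before the start is sent.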
 

NiFi usage

The next pages of the NiFi topic present various NiFi implementation cases.
Use the navigator below to learn more.