NiFi for Apache - the flow using records and registry


In a previous guide, we set up MiNiFi on web servers to export Apache access log events to a central NiFi server. We then saw an example of a flow built on this NiFi server to handle these events. That flow used standard NiFi processors and manipulated each event as a string. Now we will start a new flow that achieves the same purpose using a record-oriented approach.
Along the way, we will see how easy record-oriented flow files are to use and how they can speed up the deployment of a flow.

Pieces needed from before

NiFi and the Hortonworks Registry


The Hortonworks Registry is a service running on your Hortonworks DataFlow cluster that lets you centrally store and distribute schemas describing how the data you manipulate is organized.
The Registry is a web application offering:
  • A web interface to add and modify schemas
  • A REST API that can be used by any other service to retrieve schema information
The Registry retains the previous versions of a schema each time you update it.
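
To make the REST API point concrete, here is a minimal Python sketch of how another service might retrieve the latest version of a schema. Everything specific in it is an assumption to check against your own installation: the host name registry.example.com, port 7788, the path /api/v1/schemaregistry/schemas/<name>/versions/latest, and the "version" and "schemaText" fields of the response. The helper function names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the Registry; adjust host and port to your cluster.
REGISTRY_URL = "http://registry.example.com:7788"


def latest_version_url(base_url, schema_name):
    """Build the URL of the latest version of a schema.

    The path layout is an assumption; check it against your Registry.
    """
    return "{}/api/v1/schemaregistry/schemas/{}/versions/latest".format(
        base_url.rstrip("/"), quote(schema_name, safe="")
    )


def parse_schema_version(payload):
    """Return (version, schema) from a JSON response body.

    Assumes the body carries a "version" number and a "schemaText"
    field holding the Avro schema as a JSON string.
    """
    info = json.loads(payload)
    return info["version"], json.loads(info["schemaText"])


def fetch_latest_schema(base_url, schema_name):
    """Fetch and parse the latest version of a schema (needs network access)."""
    with urlopen(latest_version_url(base_url, schema_name)) as response:
        return parse_schema_version(response.read().decode("utf-8"))
```

Calling fetch_latest_schema(REGISTRY_URL, "my-schema") against a running Registry would return the version number together with the parsed schema. Because the Registry keeps older versions, a client that needs a fixed layout could request a specific version instead of the latest one.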