how to

Sqoop2 integration with Sentry

Sqoop2 finally supports Role-Based Access Control (RBAC), as described in this blog post. Similarly, Apache Sentry added bindings for Sqoop2 to provide RBAC as a service. Installing: Sentry Sqoop integration will be released as part of Sentry 1.6.0. Until then, this feature is available in trunk: $ git clone $ mvn clean install -DskipTests…

A Round Trip From MySQL to Hive

Today we will show you how to transfer data between MySQL and Hive using Sqoop. The instructions in this blog were performed on a single-node Hadoop cluster with HDFS and HiveServer2 installed. Before getting started, please make sure that you already have a Hadoop cluster. You can either set up a Hadoop cluster and Hive from scratch…

Managing passwords in Sqoop

Sqoop makes it easy to transfer data in and out of Hadoop. In this post, we’ll cover the different options available for managing passwords, with the exception of data-source-specific integrations such as Oracle Wallet. Motivation. Here’s a basic Sqoop command: sqoop import --connect jdbc:mysql:// --username sqoop --password sqoop --table tbl The username and password are both…


Kite Adds JSON Support

Kite’s CSV format support is one of its most popular features. It provides a quick way to get CSV data into a recommended format (Avro or Parquet) without writing an Avro schema by hand or dealing directly with file layout. In the recent 0.18.0 release, Kite adds the same level of support for JSON. Kite…

Integrating Apache NiFi with Apache Kafka

A couple of weeks ago, Joey Echeverria wrote a fantastic blog post about how to get started with Apache NiFi (Incubating). In it, Joey outlines how to quickly build a simple dataflow that automatically picks up any data from the /dropbox directory on your computer and pushes the data to HDFS. He…