Playing with Kite in Sqoop2

Kite is a high-level data layer for Hadoop. Kite’s API is built around datasets. A dataset is a consistent interface for working with your data. Datasets are uniquely identified by URIs, e.g. dataset:hive:hive_db/hive_table. You have control of implementation details, such as whether to use Avro or Parquet format, HDFS or HBase storage, and snappy compression…

Sqoop2 integration with Sentry

Sqoop2 finally supports Role Based Access Control (RBAC) as described in this blog post. Similarly, Apache Sentry added bindings for Sqoop2 to provide RBAC as a service. Installing: Sentry Sqoop integration will be released as part of Sentry 1.6.0. Until then, this feature is available in trunk: $ git clone https://github.com/apache/incubator-sentry.git $ mvn clean install -DskipTests…

A Round Trip From MySQL to Hive

Today we will show you how to transfer data between MySQL and Hive using Sqoop. The instructions in this blog are performed on a single node Hadoop cluster with HDFS and HiveServer2 installed. Before getting started, please make sure that you have a Hadoop cluster already. You can either set up a Hadoop cluster and Hive from scratch…

Role Based Access Control in Sqoop2

Brief Introduction: Sqoop 2 recently added several security features in the Sqoop 1.99.6 release that enable its use in environments where security concerns have to be addressed. These include simple authorization and 3rd party authorization through Sentry. This blog post will detail how to set up Sqoop2 with role based access control. Role based access control development in Sqoop2 was a…

PostgreSQL repository added to Sqoop2

If you’ve been watching Sqoop2 development the past few months, you’ve probably noticed a lot has changed. In the spirit of growth and change, I’ve added another (not embedded) repository: PostgreSQL. This is really important for Sqoop2 for a few reasons: PostgreSQL is a mature database that has HA deployments; Apache Derby is no longer your…