Posted On: Mar 14, 2023
Amazon EMR is excited to announce a new capability that lets you apply AWS Lake Formation based table- and column-level permissions to write operations (INSERT INTO and INSERT OVERWRITE) on an Amazon S3 data lake, for Apache Hive jobs submitted through the Amazon EMR Steps API. This feature allows data administrators to define and enforce fine-grained table- and column-level security for customers accessing data via Apache Hive running on Amazon EMR.
Amazon EMR integration with AWS Lake Formation allows you to define and enforce database-, table-, and column-level permissions with open source data processing engines such as Apache Spark and Apache Hive running on Amazon EMR. Prior to this release, data administrators could define and enforce Lake Formation based permissions on databases, tables, and columns only for read-only workloads with Apache Hive on EMR. With this release, you can also use Hive to write to or alter Lake Formation-enabled tables. This means you can enforce Lake Formation based database-, table-, and column-level permissions when your customers run INSERT INTO, INSERT OVERWRITE, and ALTER TABLE queries. To use Lake Formation based permissions, customers must use the AWS Glue Data Catalog as the metastore.
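As a rough illustration of submitting a Hive write query through the EMR Steps API, the sketch below builds a step definition and passes it to the boto3 `add_job_flow_steps` call. The database, table, and column names, the S3 script location, the step name, and the helper function names are all hypothetical. The sketch assumes the cluster is already configured for Lake Formation and that the Hive script has been uploaded to S3; neither is shown here.

```python
# Hypothetical HiveQL that would be stored in the script file on S3.
# Table and column names are placeholders, not taken from the announcement.
EXAMPLE_HIVE_SCRIPT = """
INSERT INTO TABLE sales_db.orders_clean
SELECT order_id, amount
FROM sales_db.orders_raw
WHERE amount > 0;
"""


def build_hive_step(script_s3_uri, name="lf-hive-write"):
    """Build an EMR step definition that runs a Hive script via command-runner.jar."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "hive-script",
                "--run-hive-script",
                "--args",
                "-f",
                script_s3_uri,
            ],
        },
    }


def submit_hive_step(cluster_id, script_s3_uri):
    """Submit the Hive step to a running cluster and return the new step ID."""
    # Imported lazily so build_hive_step can be used without the AWS SDK installed.
    import boto3

    emr = boto3.client("emr")
    response = emr.add_job_flow_steps(
        JobFlowId=cluster_id,
        Steps=[build_hive_step(script_s3_uri)],
    )
    return response["StepIds"][0]
```

A call such as `submit_hive_step("j-XXXXXXXXXXXXX", "s3://example-bucket/scripts/insert_orders.q")` would queue the script on the given cluster; both arguments are placeholders. Whether the write succeeds then depends on the Lake Formation permissions granted on the target table.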
This feature is available with Amazon EMR release 6.10 for Amazon EMR on EC2 clusters in all Regions where Amazon EMR is available. To get started, refer to the Integrate Amazon EMR with AWS Lake Formation section in the Amazon EMR documentation.