AWS Machine Learning Blog

Detect and protect sensitive data with Amazon Lex and Amazon CloudWatch Logs

In today’s digital landscape, the protection of personally identifiable information (PII) is not just a regulatory requirement, but a cornerstone of consumer trust and business integrity. Organizations use advanced natural language detection services like Amazon Lex for building conversational interfaces and Amazon CloudWatch for monitoring and analyzing operational data. One risk many organizations face is [ ]

Replace INTENT_NAME and SLOT_NAME with your preferred intent and slot names, respectively.

CloudWatch data protection log group policies for data identifiers

Sensitive data that’s ingested by CloudWatch Logs can be safeguarded by using log group data protection policies. These policies let you audit and mask sensitive data that appears in log events ingested by the log groups in your account.

CloudWatch Logs supports both managed and custom data identifiers.

Managed data identifiers offer preconfigured data types to protect financial data, personal health information (PHI), and PII. For some types of managed data identifiers, detection also depends on finding certain keywords in proximity to the sensitive data. Each managed data identifier is designed to detect a specific type of sensitive data, such as a name, email address, account number, AWS secret access key, or passport number for a particular country or region. When creating a data protection policy, you can configure it to use these identifiers to analyze logs ingested by the log group and take action when they are detected.

CloudWatch Logs data protection can detect the categories of sensitive data by using managed data identifiers.

To configure managed data identifiers on the CloudWatch console, complete the following steps:

1. On the CloudWatch console, under Logs in the navigation pane, choose Log groups.
2. Select your log group and on the Actions menu, choose Create data protection policy.
3. Under Auditing and masking configuration, for Managed data identifiers, select all the identifiers that the data protection policy should apply to.
4. Choose the data store to apply the policy to and save the changes.

Custom data identifiers let you define your own custom regular expressions that can be used in your data protection policy. With custom data identifiers, you can target business-specific PII use cases that managed data identifiers don’t provide. For example, you can use custom data identifiers to look for a company-specific account number format.

To create a custom data identifier on the CloudWatch console, complete the following steps:

1. On the CloudWatch console, under Logs in the navigation pane, choose Log groups.
2. Select your log group and on the Actions menu, choose Create data protection policy.
3. Under Custom Data Identifier configuration, choose Add custom data identifier.
4. Create your own regex patterns to identify sensitive information that is unique to your organization or specific use case.
5. After you add your data identifier, choose the data store to apply this policy to.
6. Choose Activate data protection.

For details about the types of data that can be protected, refer to Types of data that you can protect.

Monitor and protect data with Amazon S3

In this section, we demonstrate how to protect your data in S3 buckets.

Encrypt audio recordings in S3 buckets

PII can often be captured in audio recordings, especially in sectors like customer service, healthcare, and financial services, where sensitive information is frequently exchanged over voice interactions. To comply with domain-specific regulatory requirements, organizations must adopt stringent measures for managing PII in audio files. One approach is to disable the recording feature entirely if it poses too high a risk of non-compliance or if the value of the recordings doesn’t justify the potential privacy implications.
However, if audio recordings are essential, streaming the audio data in real time using Amazon Kinesis provides a scalable and secure method to capture, process, and analyze audio data. This data can then be exported to a secure and compliant storage solution, such as Amazon S3, which can be configured to meet specific compliance needs including encryption at rest. You can use AWS KMS or AWS CloudHSM to manage encryption keys, offering robust mechanisms to encrypt audio files at rest, thereby securing the sensitive information they might contain. Implementing these encryption measures makes sure that even if data breaches occur, the encrypted PII remains inaccessible to unauthorized parties. Configuring these AWS services allows organizations to balance the need for audio data capture with the imperative to protect sensitive information and comply with regulatory standards.

S3 bucket security configurations

You can use an AWS CloudFormation template to configure various security settings for an S3 bucket that stores Amazon Lex data like audio recordings and logs. For more information, see Creating a stack on the AWS CloudFormation console.
See the following example code:

    AWSTemplateFormatVersion: '2010-09-09'
    Description: Create a secure S3 bucket with KMS encryption to store Lex Data
    Resources:
      S3Bucket:
        Type: AWS::S3::Bucket
        Properties:
          BucketName: YOUR_LEX_DATA_BUCKET
          AccessControl: Private
          # Block every form of public access to the bucket and its objects
          PublicAccessBlockConfiguration:
            BlockPublicAcls: true
            BlockPublicPolicy: true
            IgnorePublicAcls: true
            RestrictPublicBuckets: true
          # Encrypt objects at rest with the AWS managed KMS key for Amazon S3
          BucketEncryption:
            ServerSideEncryptionConfiguration:
              - ServerSideEncryptionByDefault:
                  SSEAlgorithm: aws:kms
                  KMSMasterKeyID: alias/aws/s3
          VersioningConfiguration:
            Status: Enabled
          ObjectLockConfiguration:
            ObjectLockEnabled: Enabled
            Rule:
              DefaultRetention:
                Mode: GOVERNANCE
                Years: 5
          # Send server access logs to a separate bucket (a bucket name, not a !Ref)
          LoggingConfiguration:
            DestinationBucketName: YOUR_SERVER_ACCESS_LOG_BUCKET
            LogFilePrefix: lex-bucket-logs/

The template defines the following properties:

- BucketName: Specifies your bucket. Replace YOUR_LEX_DATA_BUCKET with your preferred bucket name.
- AccessControl: Sets the bucket access control to Private, denying public access by default.
- PublicAccessBlockConfiguration: Explicitly blocks all public access to the bucket and its objects.
- BucketEncryption: Enables server-side encryption using the default KMS encryption key ID, alias/aws/s3, managed by AWS for Amazon S3. You can also create custom KMS keys. For instructions, refer to Creating symmetric encryption KMS keys.
- VersioningConfiguration: Enables versioning for the bucket, allowing you to maintain multiple versions of objects.
- ObjectLockConfiguration: Enables object lock with a governance mode retention period of 5 years, preventing objects from being deleted or overwritten during that period.
- LoggingConfiguration: Enables server access logging for the bucket, directing log files to a separate logging bucket for auditing and analysis purposes. Replace YOUR_SERVER_ACCESS_LOG_BUCKET with the name of your logging bucket.

This is just an example; you may need to adjust the configurations based on your specific requirements and security best practices.
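As a rough illustration of the properties listed above, the following Python sketch checks a dictionary shaped like the template's Properties section for each safeguard. The function name audit_bucket_properties and the EXAMPLE_PROPERTIES dictionary are hypothetical, introduced only for this example. The sketch inspects plain dictionaries and makes no AWS calls, so it is a sanity check on a template, not a substitute for validating the deployed bucket.

```python
"""Check an S3 bucket Properties mapping for the safeguards described above.

Hypothetical helper for illustration; it inspects a plain dict and makes no AWS calls.
"""

PUBLIC_ACCESS_FLAGS = (
    "BlockPublicAcls",
    "BlockPublicPolicy",
    "IgnorePublicAcls",
    "RestrictPublicBuckets",
)


def audit_bucket_properties(props):
    """Return a list of findings; an empty list means every check passed."""
    findings = []

    # All four public access block flags should be switched on.
    public_access = props.get("PublicAccessBlockConfiguration", {})
    for flag in PUBLIC_ACCESS_FLAGS:
        if public_access.get(flag) is not True:
            findings.append(f"{flag} is not enabled")

    # At least one default-encryption rule should use KMS.
    rules = props.get("BucketEncryption", {}).get("ServerSideEncryptionConfiguration", [])
    algorithms = {
        rule.get("ServerSideEncryptionByDefault", {}).get("SSEAlgorithm") for rule in rules
    }
    if "aws:kms" not in algorithms:
        findings.append("default encryption does not use aws:kms")

    if props.get("VersioningConfiguration", {}).get("Status") != "Enabled":
        findings.append("versioning is not enabled")

    if props.get("ObjectLockConfiguration", {}).get("ObjectLockEnabled") != "Enabled":
        findings.append("object lock is not enabled")

    if not props.get("LoggingConfiguration", {}).get("DestinationBucketName"):
        findings.append("server access logging has no destination bucket")

    return findings


# Mirrors the Properties section of the example template, with placeholder names.
EXAMPLE_PROPERTIES = {
    "BucketName": "YOUR_LEX_DATA_BUCKET",
    "AccessControl": "Private",
    "PublicAccessBlockConfiguration": {flag: True for flag in PUBLIC_ACCESS_FLAGS},
    "BucketEncryption": {
        "ServerSideEncryptionConfiguration": [
            {
                "ServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "alias/aws/s3",
                }
            }
        ]
    },
    "VersioningConfiguration": {"Status": "Enabled"},
    "ObjectLockConfiguration": {"ObjectLockEnabled": "Enabled"},
    "LoggingConfiguration": {
        "DestinationBucketName": "YOUR_SERVER_ACCESS_LOG_BUCKET",
        "LogFilePrefix": "lex-bucket-logs/",
    },
}

if __name__ == "__main__":
    print(audit_bucket_properties(EXAMPLE_PROPERTIES))  # prints []
```

Returning a list of findings rather than a single pass/fail flag makes it easier to see which safeguard is missing when a template is adjusted for other requirements.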
Monitor and protect with data governance controls and risk management policies

In this section, we demonstrate how to protect your data by using a service control policy (SCP). To create an SCP, see Creating an SCP.

Prevent changes to an Amazon Lex chatbot using an SCP

To prevent changes to an Amazon Lex chatbot using an SCP, create one that denies the specific actions related to modifying or deleting the chatbot. For example, you could use the following SCP:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Deny",
          "Action": [
            "lex:DeleteBot",
            "lex:DeleteBotAlias",
            "lex:DeleteBotChannelAssociation",
            "lex:DeleteBotVersion",
            "lex:DeleteIntent",
            "lex:DeleteSlotType",
            "lex:DeleteUtterances",
            "lex:PutBot",
            "lex:PutBotAlias",
            "lex:PutIntent",
            "lex:PutSlotType"
          ],
          "Resource": [
            "arn:aws:lex:*:YOUR_ACCOUNT_ID:bot:YOUR_BOT_NAME",
            "arn:aws:lex:*:YOUR_ACCOUNT_ID:intent:YOUR_BOT_NAME:*",
            "arn:aws:lex:*:YOUR_ACCOUNT_ID:slottype:YOUR_BOT_NAME:*"
          ],
          "Condition": {
            "StringNotEquals": {
              "aws:PrincipalArn": "arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_IAM_ROLE"
            }
          }
        }
      ]
    }

The code defines the following:

- Effect: This is set to Deny, which means that the specified actions will be denied.
- Action: This contains a list of actions related to modifying or deleting Amazon Lex bots, bot aliases, intents, and slot types.
- Resource: This lists the Amazon Resource Names (ARNs) for your Amazon Lex bot, intents, and slot types. Replace YOUR_ACCOUNT_ID with your AWS account ID and YOUR_BOT_NAME with the name of your Amazon Lex bot.
- Condition: This exempts a specific IAM role from the deny, so the policy applies to every principal except that role. The operator is StringNotEquals because a Deny with StringEquals would block only the named role, which is the opposite of the stated goal. Replace YOUR_ACCOUNT_ID with your AWS account ID and YOUR_IAM_ROLE with the name of the AWS Identity and Access Management (IAM) provisioning role you want to exempt.
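When the same policy is needed for several bots or accounts, it can help to generate the document rather than edit the placeholders by hand. The following Python sketch builds the policy as a dictionary and serializes it to JSON. The function name build_lex_scp and the sample account ID, bot name, and role name are hypothetical. One assumption to note: the sketch uses StringNotEquals so that the deny skips the named role, matching the goal of letting only the provisioning role make changes.

```python
import json

# Lex actions that modify or delete a bot and its components (taken from the SCP above).
LEX_CHANGE_ACTIONS = [
    "lex:DeleteBot",
    "lex:DeleteBotAlias",
    "lex:DeleteBotChannelAssociation",
    "lex:DeleteBotVersion",
    "lex:DeleteIntent",
    "lex:DeleteSlotType",
    "lex:DeleteUtterances",
    "lex:PutBot",
    "lex:PutBotAlias",
    "lex:PutIntent",
    "lex:PutSlotType",
]


def build_lex_scp(account_id, bot_name, exempt_role):
    """Build an SCP that denies Lex changes to everyone except exempt_role."""
    arn_prefix = f"arn:aws:lex:*:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": list(LEX_CHANGE_ACTIONS),
                "Resource": [
                    f"{arn_prefix}:bot:{bot_name}",
                    f"{arn_prefix}:intent:{bot_name}:*",
                    f"{arn_prefix}:slottype:{bot_name}:*",
                ],
                # Assumption: StringNotEquals, so the deny applies to every
                # principal other than the exempt provisioning role.
                "Condition": {
                    "StringNotEquals": {
                        "aws:PrincipalArn": f"arn:aws:iam::{account_id}:role/{exempt_role}"
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical sample values for illustration only.
    policy = build_lex_scp("111122223333", "OrderBot", "LexProvisioner")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the policy editor when creating the SCP.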
When this SCP is attached to an AWS Organizations organizational unit (OU) or an individual AWS account, it allows only the specified provisioning role to modify or delete the specified Amazon Lex bot, intents, and slot types, and prevents all other IAM entities (users, roles, or groups) within that OU or account from doing so. This SCP only prevents changes to the Amazon Lex bot and its components. It doesn’t restrict other actions, such as invoking the bot or retrieving its configuration. If more actions need to be restricted, you can add them to the Action list in the SCP.

Prevent changes to a CloudWatch Logs log group using an SCP

To prevent changes to a CloudWatch Logs log group using an SCP, create one that denies the specific actions related to modifying or deleting the log group. The following is an example SCP that you can use:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Deny",
          "Action": [
            "logs:DeleteLogGroup",
            "logs:PutRetentionPolicy"
          ],
          "Resource": "arn:aws:logs:*:YOUR_ACCOUNT_ID:log-group:/aws/YOUR_LOG_GROUP_NAME*",
          "Condition": {
            "StringNotEquals": {
              "aws:PrincipalArn": "arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_IAM_ROLE"
            }
          }
        }
      ]
    }

The code defines the following:

- Effect: This is set to Deny, which means that the specified actions will be denied.
- Action: This includes the logs:DeleteLogGroup and logs:PutRetentionPolicy actions. Denying them prevents deleting the log group and modifying its retention policy, respectively.
- Resource: This lists the ARN for your CloudWatch Logs log group. Replace YOUR_ACCOUNT_ID with your AWS account ID and YOUR_LOG_GROUP_NAME with the name of your log group.
- Condition: This exempts a specific IAM role from the deny, so the policy applies to every principal except that role. The operator is StringNotEquals because a Deny with StringEquals would block only the named role. Replace YOUR_ACCOUNT_ID with your AWS account ID and YOUR_IAM_ROLE with the name of the IAM provisioning role you want to exempt.
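To get a feel for which requests a statement like the one above would deny, the following Python sketch is a toy matcher. It is not IAM's actual evaluation logic: it handles a single Deny statement, only the StringEquals and StringNotEquals operators on aws:PrincipalArn, and case-sensitive wildcard matching through fnmatch. The function name is_denied and all sample ARNs are hypothetical, and the sample statement assumes StringNotEquals so that one role is exempt.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Policy fields may hold one string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def is_denied(statement, principal_arn, action, resource_arn):
    """Toy check: would this single Deny statement match the request?"""
    if statement.get("Effect") != "Deny":
        return False
    # The action and the resource must each match at least one pattern.
    if not any(fnmatchcase(action, p) for p in _as_list(statement["Action"])):
        return False
    if not any(fnmatchcase(resource_arn, p) for p in _as_list(statement["Resource"])):
        return False
    # Every condition operator must hold for the statement to apply.
    for operator, keys in statement.get("Condition", {}).items():
        expected = keys["aws:PrincipalArn"]
        if operator == "StringEquals":
            matched = principal_arn == expected
        elif operator == "StringNotEquals":
            matched = principal_arn != expected
        else:
            raise ValueError(f"unsupported operator: {operator}")
        if not matched:
            return False
    return True


# Hypothetical sample values for illustration only.
ADMIN_ROLE = "arn:aws:iam::111122223333:role/LogAdmin"
DEVELOPER_ROLE = "arn:aws:iam::111122223333:role/Developer"
LOG_GROUP = "arn:aws:logs:us-east-1:111122223333:log-group:/aws/lex/orders"

STATEMENT = {
    "Effect": "Deny",
    "Action": ["logs:DeleteLogGroup", "logs:PutRetentionPolicy"],
    "Resource": "arn:aws:logs:*:111122223333:log-group:/aws/lex/orders*",
    # Assumption: StringNotEquals, so only ADMIN_ROLE escapes the deny.
    "Condition": {"StringNotEquals": {"aws:PrincipalArn": ADMIN_ROLE}},
}

if __name__ == "__main__":
    print(is_denied(STATEMENT, DEVELOPER_ROLE, "logs:DeleteLogGroup", LOG_GROUP))  # True
    print(is_denied(STATEMENT, ADMIN_ROLE, "logs:DeleteLogGroup", LOG_GROUP))      # False
```

Swapping the operator to StringEquals in STATEMENT flips both results, which is a quick way to see why the choice of operator decides whether a role is blocked or exempted.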
Similar to the preceding chatbot SCP, when this SCP is attached to an Organizations OU or an individual AWS account, it allows only the specified provisioning role to delete the specified CloudWatch Logs log group or modify its retention policy, and prevents all other IAM entities (users, roles, or groups) within that OU or account from performing these actions. This SCP only prevents deleting the log group and changing its retention policy. It doesn’t restrict other actions, such as creating or deleting log streams within the log group or modifying other log group configurations. To restrict additional actions, add them to the Action list in the SCP. Also, this SCP will apply to all log groups that match the specified resource ARN pattern. To target a specific log group, modify the Resource value accordingly.

Restrict viewing of unmasked sensitive data in CloudWatch Logs Insights using an SCP

When you create a data protection policy, by default, any sensitive data that matches the data identifiers you’ve selected is masked at all egress points, including CloudWatch Logs Insights, metric filters, and subscription filters. Only users who have the logs:Unmask IAM permission can view unmasked data. The following is an SCP you can use:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "RestrictUnmasking",
          "Effect": "Deny",
          "Action": "logs:Unmask",
          "Resource": "arn:aws:logs:*:YOUR_ACCOUNT_ID:log-group:YOUR_LOG_GROUP:*",
          "Condition": {
            "StringNotEquals": {
              "aws:PrincipalArn": "arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_IAM_ROLE"
            }
          }
        }
      ]
    }

It defines the following:

- Effect: This is set to Deny, which means that the specified actions will be denied.
- Action: This includes logs:Unmask. Denying it prevents viewing sensitive data in unmasked form.
- Resource: This lists the ARN for your CloudWatch Logs log group. Replace YOUR_ACCOUNT_ID with your AWS account ID and YOUR_LOG_GROUP with the name of your log group.
- Condition: This exempts a specific IAM role from the deny, so the policy applies to every principal except that role. The operator is StringNotEquals because a Deny with StringEquals would block only the named role. Replace YOUR_ACCOUNT_ID with your AWS account ID and YOUR_IAM_ROLE with the name of the IAM provisioning role you want to exempt.

Similar to the previous SCPs, when this SCP is attached to an Organizations OU or an individual AWS account, it allows only the specified provisioning role to unmask sensitive data from the CloudWatch Logs log group, and prevents all other IAM entities (users, roles, or groups) within that OU or account from doing so. This SCP only restricts unmasking. It doesn’t restrict other actions, such as creating or deleting log streams within the log group or modifying other log group configurations. To restrict additional actions, add them to the Action list in the SCP. Also, this SCP will apply to all log groups that match the specified resource ARN pattern. To target a specific log group, modify the Resource value accordingly.

Clean up

To avoid incurring additional charges, clean up your resources:

1. Delete the Amazon Lex bot:
   a. On the Amazon Lex console, choose Bots in the navigation pane.
   b. Select the bot to delete and on the Action menu, choose Delete.
2. Delete the associated Lambda function:
   a. On the Lambda console, choose Functions in the navigation pane.
   b. Select the function associated with the bot and on the Action menu, choose Delete.
3. Delete the account-level data protection policy. For instructions, see DeleteAccountPolicy.
4. Delete the CloudWatch Logs log group data protection policy:
   a. On the CloudWatch console, under Logs in the navigation pane, choose Log groups.
   b. Choose your log group.
   c. On the Data protection tab, under Log group policy, choose the Actions menu and choose Delete policy.
5. Delete the S3 bucket that stores the Amazon Lex data:
   a. On the Amazon S3 console, choose Buckets in the navigation pane.
   b. Select the bucket you want to delete, then choose Delete.
   c. To confirm that you want to delete the bucket, enter the bucket name and choose Delete bucket.
6. Delete the CloudFormation stack. For instructions, see Deleting a stack on the AWS CloudFormation console.
7. Delete the SCP. For instructions, see Deleting an SCP.
8. Delete the KMS key. For instructions, see Deleting AWS KMS keys.

Conclusion

Securing PII within AWS services like Amazon Lex and CloudWatch requires a comprehensive and proactive approach. By following the steps in this post (identifying and classifying data, locating data stores, monitoring and protecting data in transit and at rest, and implementing SCPs for Amazon Lex and Amazon CloudWatch), organizations can create a robust security framework. This framework not only protects sensitive data, but also complies with regulatory standards and mitigates potential risks associated with data breaches and unauthorized access. Regular audits, continuous monitoring, and security measures that are updated in response to emerging threats and technological advancements are crucial. Adopting these practices allows organizations to safeguard their digital assets, maintain customer trust, and build a reputation for strong data privacy and security in the digital landscape.

About the Authors

Rashmica Gopinath is a software development engineer with Amazon Lex. Rashmica is responsible for developing new features, improving the service’s performance and reliability, and ensuring a seamless experience for customers building conversational applications. Rashmica is dedicated to creating innovative solutions that enhance human-computer interaction. In her free time, she enjoys winding down with the works of Dostoevsky or Kafka.

Dipkumar Mehta is a Principal Consultant with the Amazon ProServe Natural Language AI team. He focuses on helping customers design, deploy, and scale end-to-end Conversational AI solutions in production on AWS.
He is also passionate about improving customer experience and driving business outcomes by leveraging data. Additionally, Dipkumar has a deep interest in Generative AI, exploring its potential to revolutionize various industries and enhance AI-driven applications.

David Myers is a Sr. Technical Account Manager with AWS Enterprise Support. With over 20 years of technical experience, observability has been part of his career from the start. David loves improving customers’ observability experiences at Amazon Web Services.

Sam Patel is a Security Consultant specializing in safeguarding Generative AI (GenAI), Artificial Intelligence systems, and Large Language Models (LLM) for Fortune 500 companies. Serving as a trusted advisor, he invents and spearheads the development of cutting-edge best practices for secure AI deployment, empowering organizations to leverage transformative AI capabilities while maintaining stringent security and privacy standards.

Published: 2024-07-23T20:01:37










