Privacy-Preserving Data Security and Identity Management to Support Machine Learning Use Cases
The proliferation of wearable devices equipped with sensors that monitor human activity and biometrics allows for cost-effective data collection at the edge. Cloud-based machine learning environments give researchers and clinicians access to powerful analytic tools that can analyze and interpret the data to provide actionable insights, often in the form of machine learning based algorithms. However, the ability to collect user data brings new security complexities with regard to data privacy, data access, and control. To overcome these security challenges, Tozny integrated its end-to-end encryption and Identity and Access Management solutions, TozStore and TozID, with the Databricks cloud-based machine learning tools to provide a complete solution for edge-based biometric data collection and analysis.
The next generation of edge-based data collection and analysis will include human biometric data that will be used for many applications, including health monitoring, anxiety and stress management, and cancer research, to name only a few. Activity monitoring devices such as Fitbit, Garmin, and Apple Watch, which are equipped with GPS, blood oxygen saturation, electrocardiogram (ECG), electrodermal activity (EDA), temperature, stress, and other sensors, make distributed, large-scale collection of human biometric data possible. Integrating monitoring devices with mobile phones, which serve as the data gateways to the cloud, has eliminated traditional hurdles such as data acquisition costs and data transport complexities while opening participation to essentially anyone with Internet access. However, many challenges remain, including protecting the identity of the participant, protecting the data throughout the data lifecycle, controlling data access, and integrating with the state-of-the-art tools required for sophisticated data analysis. Tozny’s end-to-end data encryption and Identity and Access Management solutions have been integrated with the Databricks cloud-based environment, with its Machine Learning models and data analysis tools, to deliver a complete solution for mobile and edge-based biometric data collection and analysis.
When collecting human biometric data, privacy is critical in order to both recruit participants and maintain compliance with regulations including HIPAA, GDPR, and others. Secure Cryptographic Authentication using zero trust technology provides strong security while offering a frictionless way for participants, clinicians, and researchers to authenticate themselves and obtain authorization to access sensitive biometric data. TozID facilitates encryption-backed identity management to easily and securely manage authentication and authorization using zero trust technology. With TozID, the credentials are never transmitted in plain text from a mobile device or browser. For increased security, the credentials can consist of both biometrics and user passwords, which are used to derive the set of encryption keys used for identity authentication and for controlling data access. When the device tries to log in, TozID sends a login challenge, which the device signs with the private signing key. TozID then verifies the signature using the public key associated with that Identity, and the user is allowed to authenticate only if the signature is correct. Once authenticated, users can use their encryption keys to encrypt data and send it from their mobile device to TozStore for storage, searching, and sharing with other users or platforms (e.g., Databricks).
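As a rough illustration of deriving a set of keys from user credentials, the sketch below stretches a password with PBKDF2 and then derives separate, purpose-specific keys from the result. The source does not say which key derivation function or signature scheme TozID uses, so the function names, the purpose labels, and the iteration count are all assumptions made for this example. The signing step itself is not shown because it requires a public-key signature library outside the Python standard library.

```python
import hashlib
import hmac

# Hypothetical helper names; TozID's real derivation scheme is not described in the text.

def derive_master_secret(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Stretch a password into a 32-byte master secret with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def derive_subkey(master: bytes, purpose: str) -> bytes:
    """Derive a purpose-specific 32-byte key by keying HMAC-SHA256 with the master secret."""
    return hmac.new(master, purpose.encode("utf-8"), hashlib.sha256).digest()

def derive_key_set(password: str, salt: bytes, iterations: int = 600_000) -> dict:
    """Return distinct key material for signing and for encryption (labels are assumptions)."""
    master = derive_master_secret(password, salt, iterations)
    return {
        # Seed that a signature scheme could turn into a signing key pair (not shown here).
        "signing_seed": derive_subkey(master, "signing"),
        # Symmetric key material for encrypting data on the device.
        "encryption_key": derive_subkey(master, "encryption"),
    }
```

Because the keys are recomputed on the device from the credentials and a stored salt, only derived public material ever needs to leave the device, which is consistent with the statement that credentials are never transmitted in plain text.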
TozStore provides end-to-end data security and privacy using strong encryption to protect sensitive user information and data. Data is encrypted when it is generated on the edge device and remains encrypted through transmission and storage. Applying cryptography at the point of data generation provides critical protection for sensitive user data and private information throughout the entire data lifecycle. Even though data remains encrypted while in storage, it can be searched using its metadata, and policies can be established to control access to all or part of the data based on authorization levels. TozID provides the framework for user authentication and authorization to control access to particular data types or segments using policies such as Role-Based Access Control (RBAC), Attribute-Based Access Control (ABAC), or others. Once the user has been positively identified, TozID looks up the user’s recorded attributes and roles and compares them to the authorization rules governing data access. For instance, if the user is in a “research” group, they may have access to some resources and not others.
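The comparison of a user's roles and attributes against access rules can be sketched as follows. This is a minimal illustration that combines an RBAC check with an ABAC check, not TozID's actual policy engine; the `User` and `Policy` classes, their fields, and the example resource and attribute names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)          # e.g. {"research"}
    attributes: dict = field(default_factory=dict)   # e.g. {"study": "stress-2024"}

@dataclass
class Policy:
    resource: str
    allowed_roles: set = field(default_factory=set)          # RBAC part of the rule
    required_attributes: dict = field(default_factory=dict)  # ABAC part of the rule

def is_authorized(user: User, policy: Policy) -> bool:
    """Grant access only if the user holds an allowed role AND matches every required attribute."""
    has_role = bool(user.roles & policy.allowed_roles)
    has_attributes = all(
        user.attributes.get(key) == value
        for key, value in policy.required_attributes.items()
    )
    return has_role and has_attributes
```

For example, a policy with `allowed_roles={"research"}` would admit a user in the “research” group and reject a user who only holds a “clinician” role. A policy with no allowed roles denies everyone, so the sketch fails closed.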
Databricks is a commercial and open-source platform that enables data scientists and clinicians to collaborate on research using Machine Learning models and data analysis tools. The Tozny Data Agent facilitates secure data transfers. It can be installed on any Linux, macOS, or Windows compatible environment, such as Databricks. For authenticated users, the Tozny Data Agent securely transfers the encrypted data sets they are authorized to access from TozStore to the Databricks environment. Once the data is transferred, the Tozny Data Agent decrypts it for consumption within the Databricks environment for Machine Learning model training, validation, and analysis.
Integrated Tozny and Databricks Workflow
The workflow will allow researchers to initiate complex statistical queries or Machine Learning model training and analysis from their devices, using the Tozny API to orchestrate the Tozny Data Agent to access and decrypt the data. When researchers log in to the portal, they will see the list of data sets they are authorized to use, based on the roles and groups they have been assigned by Administrators. If the researcher’s access is valid and approved by TozID, the Tozny Data Agent will transfer the data from TozStore to Databricks and decrypt it within the Databricks environment, enabling data analysis using the Machine Learning models and tools in that environment.
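The workflow described above can be sketched as two steps: listing the data sets a researcher's groups allow, and gating the transfer-and-decrypt step on that same check. Everything here is a hypothetical stand-in for illustration. `Dataset`, `visible_datasets`, and `run_transfer` are not part of the Tozny API, and the `fetch_encrypted` and `decrypt` callables are placeholders for whatever the Tozny Data Agent actually does.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass(frozen=True)
class Dataset:
    name: str
    allowed_groups: frozenset  # groups an Administrator has granted access to

def visible_datasets(researcher_groups: Iterable[str], catalog: Iterable[Dataset]) -> list:
    """Return the names of data sets that share at least one group with the researcher."""
    groups = set(researcher_groups)
    return [d.name for d in catalog if groups & d.allowed_groups]

def run_transfer(
    researcher_groups: Iterable[str],
    dataset: Dataset,
    fetch_encrypted: Callable[[str], bytes],  # placeholder: fetch ciphertext from storage
    decrypt: Callable[[bytes], bytes],        # placeholder: decrypt inside the analysis environment
) -> bytes:
    """Check authorization first, and only then fetch and decrypt the data set."""
    if not set(researcher_groups) & dataset.allowed_groups:
        raise PermissionError(f"access to {dataset.name!r} is not authorized")
    ciphertext = fetch_encrypted(dataset.name)
    return decrypt(ciphertext)
```

Checking authorization before any data is fetched means an unauthorized request never touches the encrypted records, let alone the decrypted output.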
The proliferation of biometric data collected at the edge will accelerate as the adoption of activity monitoring devices increases. This biometric data can provide clinicians and researchers with actionable insights that can be used, for example, for both therapeutic and preventive care. Improvements in data analysis and the availability of cloud-based machine learning tools make such analysis possible. However, challenges must be overcome to ensure that the privacy of the user is not compromised and that data access is controlled. Tozny’s end-to-end encryption and Identity and Access Management have been integrated with Databricks to provide secure data collection and data analysis in a privacy-preserving manner.