At SambaSafety, their mission is to promote safer communities by reducing risk through data insights. Since 1998, SambaSafety has been North America’s leading provider of cloud-based mobility risk management software for organizations with commercial and non-commercial drivers. SambaSafety serves more than 15,000 global employers and insurance carriers with driver risk and compliance monitoring, online training, deep risk analytics, and risk pricing solutions. By collecting, correlating, and analyzing driver records, telematics, corporate, and other sensor data, SambaSafety not only helps employers better enforce safety policies and reduce claims, but also helps insurers make informed underwriting decisions and background screeners perform accurate, efficient pre-hire checks.
Not all drivers have the same risk profile. The more time you spend behind the wheel, the higher your risk profile. SambaSafety’s team of data scientists has developed complex and robust modeling solutions designed to accurately quantify this risk profile. However, the team sought support to deploy this solution for batch and real-time inference in a consistent and reliable manner.
In this post, we discuss how SambaSafety used AWS machine learning (ML) and continuous integration and continuous delivery (CI/CD) tools to deploy their existing data science application for batch inference. SambaSafety worked with AWS Advanced Consulting Partner Firemind to deliver a solution that leveraged AWS CodeStar, AWS Step Functions, and Amazon SageMaker for this workload. With AWS CI/CD and AI/ML products, SambaSafety’s data science team did not have to change their existing development workflow to take advantage of continuous model learning and inference.
Customer use case
SambaSafety’s data science team has long used the power of data to inform their business. They had several experienced engineers and scientists creating insightful models that improved the quality of risk analysis on their platform. The challenges this team faced were not related to data science. SambaSafety’s data science team needed help connecting their existing data science workflow to a continuous delivery solution.
SambaSafety’s data science team maintained several script-like artifacts as part of their development workflow. These scripts performed several tasks, including data preprocessing, feature engineering, model generation, model tuning, and model comparison and validation. These scripts were all run manually when new training data arrived in their environment. Additionally, these scripts did not handle model versioning or hosting for inference. SambaSafety’s data science team developed manual solutions to produce new models, but the process became tedious and time-consuming.
To free up SambaSafety’s highly skilled data science team to innovate on new ML workloads, SambaSafety needed to automate the manual tasks associated with maintaining existing models. In addition, the solution needed to replicate the manual workflow used by SambaSafety’s data science team and decide whether to proceed based on the results of those scripts. Finally, the solution had to integrate with their existing code base. SambaSafety’s data science team used a code storage solution external to AWS; the final pipeline had to be intelligent enough to incorporate updates to their code base, which was written primarily in R.
Solution overview
The following diagram illustrates the solution architecture, which was informed by one of the open source architectures maintained by SambaSafety delivery partner Firemind.
The solution provided by Firemind for SambaSafety’s data science team was built around two ML pipelines. The first ML pipeline trains the model using SambaSafety’s custom data preprocessing, training, and testing scripts. The resulting model artifact is deployed to SageMaker-managed model endpoints for batch and real-time inference. The second ML pipeline handles inference requests against the hosted model. In this way, the training pipeline is separated from the inference pipeline.
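To make the deployment stage concrete, the following is a minimal sketch, using boto3, of how a trained model artifact could be registered and exposed through a SageMaker real-time endpoint. The model name, container image, S3 path, IAM role, and instance type are hypothetical placeholders rather than SambaSafety’s actual configuration.

```python
# Minimal sketch (assumed names and ARNs), not SambaSafety's actual code:
# register a trained model artifact and host it on a SageMaker real-time endpoint.
import boto3

sm = boto3.client("sagemaker")
role_arn = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # assumed role

# Register the model artifact produced by the training pipeline
sm.create_model(
    ModelName="driver-risk-model-v1",
    ExecutionRoleArn=role_arn,
    PrimaryContainer={
        "Image": "<inference-container-image-uri>",  # e.g., a custom image serving the R model
        "ModelDataUrl": "s3://example-bucket/models/driver-risk-model-v1/model.tar.gz",
    },
)

# Describe how the endpoint should host the model
sm.create_endpoint_config(
    EndpointConfigName="driver-risk-model-v1-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "driver-risk-model-v1",
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
    }],
)

# Create the real-time endpoint that the inference pipeline queries;
# the same registered model can also back SageMaker batch transform jobs.
sm.create_endpoint(
    EndpointName="driver-risk-model-v1-endpoint",
    EndpointConfigName="driver-risk-model-v1-config",
)
```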
One of the challenges of this project was replicating the manual steps taken by SambaSafety’s data scientists. The Firemind team used Step Functions and SageMaker Processing to accomplish this task. Step Functions lets you run discrete tasks in AWS using AWS Lambda functions, Amazon Elastic Kubernetes Service (Amazon EKS) workers, or, in this case, SageMaker. SageMaker Processing lets you define jobs that run on managed ML instances within the SageMaker ecosystem. Each run of a Step Functions workflow keeps its own logs, run history, and details about the success or failure of the job.
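The sketch below illustrates this orchestration pattern under assumed names and ARNs: an Amazon States Language definition that runs one of the team’s scripts as a SageMaker Processing job, waits for it to finish, and branches before deploying. It is a simplified illustration, not the actual Firemind pipeline definition.

```python
# Minimal sketch of a Step Functions state machine that runs a SageMaker
# Processing job and branches on the result. All names, ARNs, and images are
# hypothetical; the real pipeline contains more states (tuning, comparison, etc.).
import json
import boto3

definition = {
    "StartAt": "PreprocessAndTrain",
    "States": {
        "PreprocessAndTrain": {
            "Type": "Task",
            # The .sync integration makes Step Functions wait for the job to complete
            "Resource": "arn:aws:states:::sagemaker:createProcessingJob.sync",
            "Parameters": {
                "ProcessingJobName.$": "$.job_name",
                "RoleArn": "arn:aws:iam::123456789012:role/SageMakerExecutionRole",
                "AppSpecification": {
                    "ImageUri": "<custom-r-processing-image-uri>",
                    "ContainerEntrypoint": ["Rscript", "/opt/ml/code/train_model.R"],
                },
                "ProcessingResources": {
                    "ClusterConfig": {
                        "InstanceCount": 1,
                        "InstanceType": "ml.m5.xlarge",
                        "VolumeSizeInGB": 30,
                    }
                },
            },
            # Keep the execution input intact; the job's details land under this path
            "ResultPath": "$.processing_result",
            "Next": "ModelAcceptable",
        },
        "ModelAcceptable": {
            # Simplified check on job status; the real workflow would inspect the
            # model comparison and validation scripts' outputs before deploying
            "Type": "Choice",
            "Choices": [{
                "Variable": "$.processing_result.ProcessingJobStatus",
                "StringEquals": "Completed",
                "Next": "DeployModel",
            }],
            "Default": "StopWithoutDeploying",
        },
        "DeployModel": {"Type": "Pass", "End": True},  # placeholder for deployment states
        "StopWithoutDeploying": {"Type": "Succeed"},
    },
}

sfn = boto3.client("stepfunctions")
sfn.create_state_machine(
    name="training-pipeline",
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsExecutionRole",
    definition=json.dumps(definition),
)
```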
The team used Step Functions and SageMaker, along with Lambda, to automate the training, setup, deployment, and inference workloads. The only remaining piece was the continuous integration of code changes into this deployment pipeline. Firemind implemented a CodeStar project that maintained a link to SambaSafety’s existing code repository. When SambaSafety’s data science team pushes an update to a specific branch of their code base, CodeStar captures the changes and triggers the automation.
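As a hedged illustration of that trigger, the following Lambda handler starts the training state machine when a commit event for the tracked branch arrives. The environment variable, event shape, and names are assumptions for illustration, not the actual CodeStar wiring.

```python
# Minimal sketch (hypothetical names and event shape): a Lambda handler that
# kicks off the training state machine whenever new code lands on the tracked branch.
import json
import os

import boto3

sfn = boto3.client("stepfunctions")
STATE_MACHINE_ARN = os.environ["TRAINING_STATE_MACHINE_ARN"]  # assumed environment variable


def handler(event, context):
    """Start a new training pipeline run for the commit described in the event."""
    commit_id = event.get("detail", {}).get("commitId", "unknown")
    response = sfn.start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        name=f"training-{commit_id[:8]}",  # unique, human-readable execution name
        input=json.dumps({"commit_id": commit_id, "job_name": f"train-{commit_id[:8]}"}),
    )
    return {"executionArn": response["executionArn"]}
```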
Conclusion
SambaSafety’s new serverless MLOps pipeline has had a significant impact on their delivery capabilities. Integrating data science with software development enables their teams to work together seamlessly. Their automated model deployment solution reduces lead times by up to 70%.
SambaSafety also had the following to say:
“By automating our data science models and integrating them into the software development lifecycle, we have been able to achieve new levels of efficiency and accuracy in our services. This has enabled us to stay ahead of the competition and provide innovative solutions to our clients. Our clients will greatly benefit from this with faster turnaround times and improved accuracy of our solutions.”
SambaSafety reached out to their AWS account team about this challenge. The AWS account and solutions architecture teams worked with SambaSafety to identify this solution by drawing on our strong partner network. Connect with your AWS account team to identify similar transformative opportunities for your business.
About the authors
Dan Ferguson is an AI/ML Specialist Solutions Architect (SA) on the Private Equity Solutions Architecture team at Amazon Web Services. Dan helps private equity-backed portfolio companies leverage AI/ML technologies to achieve their business goals.
Khalil Adib is a Data Scientist at Firemind, driving the innovation Firemind can deliver to its customers in the magical worlds of AI and ML. Khalil keeps up with the latest and greatest technologies and models, ensuring Firemind is always on the bleeding edge.
Jason Matthews is a Cloud Engineer at Firemind, leading the end-to-end delivery of projects to customers, from writing pipelines with IaC to building data engineering workloads with Python to pushing the boundaries of ML. Jason is also a key contributor to Firemind’s open source projects.