Improve Diagnostic Accuracy
Utilize advanced AI and ML models trained on extensive datasets to enhance the precision of medical image analysis, despite variations in image quality and scanner types.
Enhance Processing Efficiency
Implement a scalable and high-performance architecture capable of ingesting, processing, and storing a continuous stream of incoming data without delay. This is essential to manage the sheer volume and velocity of data, with patient scans arriving every few seconds.
Enable Near Real-Time Analysis
Facilitate near real-time data processing to provide radiologists with timely results for review. This is particularly critical during peak hours when scans accumulate in queues, necessitating rapid processing to avoid delays in patient care.
Streamline Medical Imaging Workflow
Optimize each stage of the complex medical imaging workflow, from preprocessing raw DICOM (Digital Imaging and Communications in Medicine, the standard for the communication and management of medical imaging information) data to running computationally intensive inference models and saving results back into DICOM storage. Efficiently manage the computational resources required for these tasks to control costs.
Ensure Compliance and Security
Maintain the security and privacy of patient data to comply with stringent healthcare regulations. Implement robust security measures throughout the workflow to mitigate the risk of data breaches or unauthorized access.
Adapt to Ongoing Challenges
Continuously update and retrain ML models to adapt to evolving medical knowledge and emerging pathologies. Address data heterogeneity to ensure model accuracy across various image qualities and scanner types.
Databricks forms the heart of our solution, providing a robust data intelligence platform. Its data engineering and machine learning capabilities are used to improve diagnostic accuracy, processing efficiency, and data management.
The architecture diagram illustrates our implementation:
1. Data Ingestion and Storage:
• CT scans and associated metadata are ingested and stored in Azure Blob Storage and SQL Server respectively. This setup ensures that all incoming data is efficiently captured and securely stored for further processing.
• Azure Blob Storage and SQL Server were chosen because they provide the scalability and security needed for large volumes of medical imaging data.
2. Data Preprocessing:
• The ingested data is preprocessed to prepare it for model training and inference. This involves cleaning and normalizing the data, converting it into a format suitable for machine learning algorithms.
Figure: some of the DICOM image preprocessing steps.
• Databricks Jobs and Workflows are used to automate the preprocessing steps, ensuring consistency and efficiency in handling the data.
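The cleaning and normalization described in this step can be illustrated with a minimal sketch. It assumes CT pixel data arrives as raw stored integers together with the DICOM RescaleSlope and RescaleIntercept values, converts them to Hounsfield units (HU), clips to an intensity window, and scales to the range [0, 1]. The function names and the window bounds (-1000 to 400 HU) are illustrative assumptions, not the parameters of the actual pipeline. Plain Python lists keep the example dependency-free; a production job would more likely use NumPy and a DICOM library.

```python
from typing import List

Slice = List[List[float]]


def to_hounsfield(raw: List[List[int]], slope: float, intercept: float) -> Slice:
    """Apply the linear DICOM rescale: HU = stored_value * slope + intercept."""
    return [[px * slope + intercept for px in row] for row in raw]


def window_and_normalize(hu: Slice, low: float = -1000.0, high: float = 400.0) -> Slice:
    """Clip HU values to [low, high] and scale them linearly to [0, 1].

    The default window is an assumed example, not a value from the source.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    span = high - low
    return [[(min(max(v, low), high) - low) / span for v in row] for row in hu]


# Hypothetical 2x2 slice of stored pixel values.
raw_slice = [[0, 1024], [1424, 3000]]
hu_slice = to_hounsfield(raw_slice, slope=1.0, intercept=-1024.0)
normalized = window_and_normalize(hu_slice)
```

Values below the window (such as air) map to 0 and values above it (such as dense bone or metal) map to 1, so every slice reaches the model on the same intensity scale regardless of scanner.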
3. Model Training and Inference:
• Preprocessed data is used to train proprietary deep learning models, particularly Convolutional Neural Networks (CNNs) for image segmentation and classification. These models are crucial for accurate diagnosis.
◦ MLflow Integration: Databricks integrates with MLflow to track experiments, manage models, and support reproducible workflows.
◦ GPU Utilization: GPU resources are leveraged for computationally intensive tasks, accelerating model training and inference.
4. Metrics and Monitoring:
• The performance of the models is monitored continuously to ensure they meet the desired accuracy and efficiency standards.
• MLflow provides metrics tracking and model versioning, enabling continuous evaluation and improvement of the models.
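As an illustration of the kind of metrics that could be tracked for segmentation and binary classification models, here is a minimal sketch. The source does not state which metrics are used; the Dice coefficient, sensitivity, specificity, and accuracy are common choices and are assumed here. Masks and labels are flattened to lists of 0/1 values, and all names are hypothetical.

```python
from typing import Dict, Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice overlap between two binary masks: 2*|A and B| / (|A| + |B|)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p & t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly, so define Dice as 1.0 in that case.
    return 1.0 if total == 0 else 2.0 * intersection / total


def classification_metrics(pred: Sequence[int], target: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy for binary labels (1 = positive)."""
    if len(pred) != len(target) or not target:
        raise ValueError("inputs must be non-empty and of equal length")
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(target),
    }


# Hypothetical flattened segmentation masks and classification labels.
dice = dice_coefficient([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
metrics = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
```

Values like these could then be logged for each model version so that a drop in accuracy after retraining is visible before the model is promoted.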
5. Results Storage and Access:
• The predicted results are stored back into SQL Server and made accessible through a user portal. This ensures that radiologists can access and review the diagnostic results promptly.
• SQL Server stores the results, and the user portal gives healthcare professionals an interface to access and review them.
By leveraging the comprehensive suite of Databricks services, including Databricks Jobs and Workflows, integrated MLflow, and integration with Azure storage solutions, we streamline the preprocessing of vast quantities of DICOM files and associated annotations. Data is transferred and managed through Azure Blob Storage, and CPU and GPU resources are balanced so that each processing task runs on the type of compute suited to it.
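One way to picture this CPU/GPU split is a two-task job in which preprocessing runs on a CPU cluster and inference runs on a GPU cluster once preprocessing has finished. The sketch below builds such a definition as a Python dictionary shaped like a Databricks Jobs API payload. The task keys, notebook paths, node types, runtime version strings, and worker counts are all placeholder assumptions to be checked against your own workspace; they do not describe the actual deployment.

```python
import json
from typing import Any, Dict, Sequence


def make_task(task_key: str, notebook_path: str, node_type_id: str,
              spark_version: str, num_workers: int,
              depends_on: Sequence[str] = ()) -> Dict[str, Any]:
    """Build one job task that runs a notebook on its own job cluster."""
    task: Dict[str, Any] = {
        "task_key": task_key,
        "notebook_task": {"notebook_path": notebook_path},
        "new_cluster": {
            "spark_version": spark_version,
            "node_type_id": node_type_id,
            "num_workers": num_workers,
        },
    }
    if depends_on:
        # Upstream tasks that must succeed before this task starts.
        task["depends_on"] = [{"task_key": key} for key in depends_on]
    return task


def build_imaging_job() -> Dict[str, Any]:
    """Assemble a hypothetical preprocessing -> inference job definition."""
    return {
        "name": "dicom-preprocess-and-inference",  # placeholder job name
        "tasks": [
            make_task("preprocess_dicom", "/Pipelines/preprocess_dicom",
                      node_type_id="Standard_DS4_v2",  # assumed CPU node type
                      spark_version="14.3.x-cpu-ml-scala2.12",  # assumed runtime
                      num_workers=4),
            make_task("run_inference", "/Pipelines/run_inference",
                      node_type_id="Standard_NC6s_v3",  # assumed GPU node type
                      spark_version="14.3.x-gpu-ml-scala2.12",  # assumed runtime
                      num_workers=1,
                      depends_on=["preprocess_dicom"]),
        ],
    }


job = build_imaging_job()
print(json.dumps(job, indent=2))
```

Giving each task its own cluster means GPU nodes are only provisioned for the inference task, which is one plausible way to keep compute costs under control.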
The Databricks environment significantly accelerates our deep learning model training, particularly for 3D image segmentation and binary classification, resulting in precise diagnostics and expedited results for our valued clients. This integrated approach not only enhances the accuracy of our diagnostics but also ensures that results are delivered in a timely manner, crucial for effective patient care.
Model training data volumes
The Databricks platform was leveraged to process a substantial volume of data across various body parts for model training. Below are the detailed counts of 3D scans and 2D images processed:
Operations metrics
Our use of the Databricks platform also improved our operational metrics, as shown in the table below:
Number of Jobs executed each day:
The following chart shows the daily execution of jobs, highlighting the platform’s capacity to handle a high volume of tasks efficiently:
Number of Jobs executed by month:
This chart illustrates the monthly job execution metrics, reflecting consistent high performance and reliability:
Interested in learning more about how ThoughtsWin Systems and Databricks can transform your medical imaging processes and workflows? Contact mahesh.shankar@thoughtswinsystems.com today to discover how our innovative solutions can help you achieve higher diagnostic accuracy and operational efficiency.
ThoughtsWin Systems leads the way in transformative technology solutions. Our proficiency in Data & Cloud Engineering, AI & Analytics, Modernization & Migration to Cloud, and Strategy & Governance powers business innovation.