Databricks Launches Lakehouse for Healthcare and Life Sciences

Databricks announced the launch of its new Lakehouse platform, the Databricks Lakehouse for Healthcare and Life Sciences.

According to the company’s press release, the Databricks Lakehouse for Healthcare and Life Sciences is a “one-stop platform for data management, analytics, and advanced AI use cases like disease prediction, medical image classification and biomarker discovery.” GE Healthcare, Regeneron, Thermo Fisher Scientific and Walgreens are among the early adopters of the platform.

On its product page, Databricks notes four main problems in healthcare data: incomplete or fragmented patient care data, the high cost and complexity of managing rapidly growing volumes of healthcare data, slow delivery of real-time information for critical care decisions, and a lack of strong machine learning capabilities for predictive analytics and data modeling.

The new platform promises to solve these problems by unifying structured and unstructured patient data, scaling data in the cloud for population-scale health insights, enabling real-time analytics with fast streaming data ingestion and processing, and advancing machine learning for forecasting and research analytics.

Source: Databricks

Specifically, the company says the platform “offers customers tailored data and AI solutions” through analytics accelerators, open source libraries, and a community of partners and organizations that includes Lovelytics for automated streaming data ingestion, John Snow Labs for analysis of unstructured text data with natural language processing, and ZS Associates for whole genome processing in biomedical research. Other features of the platform include ML-based disease risk prediction, automated numerical disease classification with deep learning, and tools for data modeling and cohort building.

“The opportunity for healthcare to be transformed with data and AI cannot be overemphasized. As organizations fully transition to electronic medical records, new types of data like genomics evolve, and IoT and wearables take off, the industry is inundated with massive amounts of data, but that data is siloed and teams don’t have the tools to use it properly,” said Michael Hartman, senior vice president of regulated industries at Databricks. “With Lakehouse for Healthcare and Life Sciences, we can drive transformation across the entire healthcare ecosystem and help our customers solve industry-specific challenges and, ultimately, drive better outcomes for the future of healthcare.”

This is the third industry-specific lakehouse platform the company has released so far this year, following Databricks Lakehouse for Retail and Databricks Lakehouse for Financial Services. To the uninitiated, the word “lakehouse” may seem like an empty buzzword, but the technology is gaining popularity for its efficiency. Organizations in the healthcare and life sciences industries have typically used more traditional data architectures such as data warehouses and silos, which are initially easy to use but become difficult to scale and costly to maintain as a company’s data and AI/ML workloads grow. Data lakes were born out of the need for high-performance, large-scale platforms capable of supporting heavy workloads with real-time data ingestion, but they can be difficult to build and maintain due to the time, resources and skilled data engineers needed to do so.

Source: Databricks

When you combine the simplicity and functionality of a traditional warehouse with the speed and scalability of a data lake, you get a lakehouse. As Datanami’s Alex Woodie noted, a lakehouse “provides the flexibility to handle less structured data types, such as text and image files, which are commonly used in data science and machine learning projects, but it also borrows the discipline of the data warehouse, particularly in terms of ensuring data quality and ensuring its lineage is tracked and governed.” Lakehouse platforms can automate data ingestion, processing and optimization within an infrastructure, which can enable companies to do more with their data: in this case, driving better patient outcomes and facilitating innovation in healthcare research and pharmaceutical manufacturing.

“We recognize the important role data plays in getting our products into the hands of those who need them most, and the Databricks Lakehouse solution for healthcare and life sciences helps us achieve this goal,” said Feng Liang, senior IT manager, Thermo Fisher Scientific. “This modern data and AI platform has allowed us to break down costly data silos, open up new opportunities for innovation, and become a more data-driven organization.”

