Today, data exists in many formats, is provided in real-time streams, and stretches across many different data centers and clouds, all over the world. From analytics, to data engineering, to AI/ML, to data-driven applications, the ways in which we leverage and share data continue to expand. Data has moved beyond the analyst and now impacts every employee, every customer, and every partner. With the dramatic growth in the amount and types of data, workloads, and users, we are at a tipping point where traditional data architectures – even when deployed in the cloud – are unable to unlock the full potential of data. As a result, the data-to-value gap is growing.
To address these challenges, we are unveiling several data cloud innovations today that allow our customers to work with limitless data, across all workloads, and extend access to everyone. These announcements include BigLake and Spanner change streams to further unify customer data while ensuring it’s delivered in real time, as well as Vertex AI Workbench and Model Registry to close the data-to-AI value gap. And to bring data within reach for anyone, we are announcing a unified business intelligence (BI) experience that includes a new Workspace integration, along with new programs that further enable our data cloud partner ecosystem.
Removing all data limits
Today, we are announcing the preview of BigLake, a data lake storage engine, to remove data limits by unifying data lakes and warehouses. Managing data across disparate lakes and warehouses creates silos and increases risk and cost, especially when data needs to be moved. BigLake allows companies to unify their data warehouses and lakes to analyze data without worrying about the underlying storage format or system, which eliminates the need to duplicate or move data from a source and reduces cost and inefficiencies.
With BigLake, customers gain fine-grained access controls, with an API spanning Google Cloud and open file formats like Parquet, along with open-source processing engines like Apache Spark. These capabilities extend a decade’s worth of innovations with BigQuery to data lakes on Google Cloud Storage to enable a flexible and cost-effective open lakehouse architecture.
Twitter already uses storage capabilities with BigQuery to remove data limits and better understand how people use their platform and what types of content they might be interested in. As a result, they are able to serve content across trillions of events per day with an ads pipeline that runs more than 3M aggregations per second.
Another major innovation we’re announcing today is Spanner change streams. Coming soon, this new product will further remove data limits for our customers, allowing them to track changes within their Spanner database in real time in order to unlock new value. Spanner change streams tracks Spanner inserts, updates, and deletes to stream the changes in real time across a customer’s entire Spanner database. This ensures customers always have access to the freshest data as they can easily replicate changes from Spanner to BigQuery for real-time analytics, trigger downstream application behavior using Pub/Sub, or store changes in Google Cloud Storage (GCS) for compliance. With the addition of change streams, Spanner, which currently processes over 2 billion requests per second at peak with up to 99.999% availability, now gives customers endless possibilities to process their data.
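To make the pattern concrete, here is a minimal Python sketch of the idea described above: a stream of insert, update, and delete records is delivered in order to several downstream consumers. It is an in-memory illustration only and does not call the Spanner API. The names `ChangeRecord`, `ModType`, `fan_out`, `make_replica_sink`, and `make_audit_sink` are hypothetical, and the replica and audit log merely stand in for targets such as an analytics table or a compliance archive.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class ModType(Enum):
    INSERT = "INSERT"
    UPDATE = "UPDATE"
    DELETE = "DELETE"


@dataclass(frozen=True)
class ChangeRecord:
    """Hypothetical, simplified change record (not the Spanner wire format)."""
    table: str
    mod_type: ModType
    key: str
    new_values: Dict[str, object]


# A sink is any callable that consumes one change record.
Sink = Callable[[ChangeRecord], None]


def fan_out(records: List[ChangeRecord], sinks: List[Sink]) -> None:
    """Deliver every change record, in order, to every sink."""
    for record in records:
        for sink in sinks:
            sink(record)


def make_replica_sink(replica: Dict[str, Dict[str, object]]) -> Sink:
    """Sink that keeps an in-memory replica in step with the source table."""
    def apply(record: ChangeRecord) -> None:
        if record.mod_type is ModType.DELETE:
            replica.pop(record.key, None)
        else:
            # INSERT creates the row; UPDATE merges the changed columns into it.
            replica.setdefault(record.key, {}).update(record.new_values)
    return apply


def make_audit_sink(log: List[str]) -> Sink:
    """Sink that appends one line per change, e.g. for a compliance archive."""
    def append(record: ChangeRecord) -> None:
        log.append(f"{record.mod_type.value} {record.table}/{record.key}")
    return append


def replay_example():
    """Replay a small, made-up sequence of changes through both sinks."""
    replica: Dict[str, Dict[str, object]] = {}
    audit_log: List[str] = []
    changes = [
        ChangeRecord("orders", ModType.INSERT, "o1", {"status": "new"}),
        ChangeRecord("orders", ModType.UPDATE, "o1", {"status": "shipped"}),
        ChangeRecord("orders", ModType.INSERT, "o2", {"status": "new"}),
        ChangeRecord("orders", ModType.DELETE, "o1", {}),
    ]
    fan_out(changes, [make_replica_sink(replica), make_audit_sink(audit_log)])
    return replica, audit_log
```

Calling `replay_example()` returns the replica and the audit log after all four changes have been applied. Because every sink sees the same ordered stream, adding another consumer means appending one more callable to the list.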
Remove the limits of your data workloads
Our AI portfolio is powered by Vertex AI, a managed platform with every ML tool needed to build, deploy and scale models, and is optimized to work seamlessly with data workloads in BigQuery and beyond. Today, we’re announcing new Vertex AI innovations that will provide customers with an even more streamlined experience to get AI models into production faster and make maintenance even easier.
Vertex AI Workbench, which is now generally available, brings data and ML systems into a single interface so that teams have a common toolset across data analytics, data science, and machine learning. With native integrations across BigQuery, Serverless Spark, and Dataproc, Vertex AI Workbench enables teams to build, train and deploy ML models 5X faster than traditional notebooks. In fact, a global retailer was able to drive millions of dollars in incremental sales and deliver 15% faster speed to market with Vertex AI Workbench.
With Vertex AI, customers have the ability to regularly update their models. But managing the sheer number of artifacts involved can quickly get out of hand. To make it easier to manage the overhead of model maintenance, we are announcing new MLOps capabilities with Vertex AI Model Registry. Now in preview, Vertex AI Model Registry provides a central repository for discovering, using, and governing machine learning models, including those in BigQuery ML. This makes it easy for data scientists to share models and application developers to use them, ultimately enabling teams to turn data into real-time decisions, and be more agile in the face of shifting market dynamics.
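As a rough illustration of what a central model repository provides, the Python sketch below keeps versioned model entries in memory and supports lookup by name and by label. It is not the Vertex AI SDK. The class `InMemoryModelRegistry`, its methods, the label keys, and the `gs://example-bucket/...` paths are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ModelVersion:
    """One registered version of a model (hypothetical, simplified)."""
    name: str
    version: int
    artifact_uri: str
    labels: Dict[str, str] = field(default_factory=dict)


class InMemoryModelRegistry:
    """Toy registry: versions are numbered per model name, starting at 1."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[ModelVersion]] = {}

    def register(
        self,
        name: str,
        artifact_uri: str,
        labels: Optional[Dict[str, str]] = None,
    ) -> ModelVersion:
        """Store a new version of `name` and return its entry."""
        versions = self._versions.setdefault(name, [])
        entry = ModelVersion(name, len(versions) + 1, artifact_uri, dict(labels or {}))
        versions.append(entry)
        return entry

    def latest(self, name: str) -> ModelVersion:
        """Return the most recently registered version of `name`."""
        return self._versions[name][-1]

    def search(self, **labels: str) -> List[ModelVersion]:
        """Return every version whose labels match all given key/value pairs."""
        return [
            version
            for versions in self._versions.values()
            for version in versions
            if all(version.labels.get(k) == v for k, v in labels.items())
        ]


def registry_example() -> InMemoryModelRegistry:
    """Build a small registry with made-up models and labels."""
    registry = InMemoryModelRegistry()
    registry.register("churn", "gs://example-bucket/churn/1", {"team": "growth"})
    registry.register("churn", "gs://example-bucket/churn/2", {"team": "growth"})
    registry.register("forecast", "gs://example-bucket/forecast/1", {"team": "supply"})
    return registry
```

With this shape, a data scientist registers each retrained model as a new version, and an application developer asks for `latest("churn")` without needing to know where the artifact is stored.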
Extending the reach of your data
Today, we are launching Connected Sheets for Looker, and the ability to access Looker data models within Data Studio. Customers now have the ability to interact with data however they choose, whether it be through Looker Explore, from Google Sheets, or using the drag-and-drop Data Studio interface. This new unified Google Cloud business intelligence (BI) platform will make it easier for everyone to access and unlock insights from data, to drive innovation, and to make data-driven decisions. This unified BI experience makes it easy to tap into governed, trusted enterprise data, to incorporate new data sets and calculations, and to collaborate with peers.
Mercado Libre, the largest online commerce and payments ecosystem in Latin America, has been an early adopter of Connected Sheets for Looker. Using this integration, they have been able to provide broader access to data through a spreadsheet interface that their employees are already familiar with. By lowering the barrier to entry, they have been able to build a data-driven culture in which everyone can inform their decisions with data.
Doubling down on the data cloud partner ecosystem
Closing the data-to-value gap with these data innovations would not be possible without our incredible partner ecosystem. Today, there are more than 700 software partners powering their applications using Google’s data cloud. Many partners, like Bloomreach, Equifax, Exabeam, Quantum Metric, and ZoomInfo, have started using our data cloud capabilities with the Built with BigQuery initiative, which provides access to dedicated engineering teams, co-marketing, and go-to-market support.
Our customers want partner solutions that are tightly integrated and optimized with products like BigQuery. So today, we’re announcing Google Cloud Ready – BigQuery, a new validation that recognizes partner solutions like those from Fivetran, Informatica and Tableau that meet a core set of functional and interoperability requirements. Today, we already recognize more than 25 partners in this new Google Cloud Ready – BigQuery program, which reduces the costs customers incur when evaluating new tools while also adding support for new customer use cases.
We’re also announcing a new Database Migration Program to help our customers efficiently and effectively accelerate the move from on-premises and other clouds to Google’s industry-leading managed database services. This includes tooling, resources, and expertise from alliances like Deloitte, as well as incentives from Google to offset the cost of migrating databases.
We remain committed to continued innovation with the leading data and analytics companies where our customers are investing. This week Databricks, Fivetran, MongoDB, Neo4j, and Redis are all announcing significant new capabilities for customers on Google Cloud.
All of these announcements and more will be shared in detail at our Data Cloud Summit. Be sure to watch the data cloud strategy sessions and breakouts, and get access to hands-on content. There is no doubt the future of data holds limitless possibilities, and we are thrilled to be on this data cloud journey.