BigQuery’s AI-assisted data preparation is now in preview

In today’s data-driven world, the ability to efficiently transform raw data into actionable insights is paramount. However, data preparation and cleaning are often a significant challenge. According to Gartner®1, “Gartner clients now report that 90% or more of their time is spent preparing data (as high as 94% in complex industries) for advanced analytics, data science and data engineering.”

Reducing this time and efficiently transforming raw data into insights is crucial for staying competitive. Earlier this month, Google Cloud introduced BigQuery data preparation, an AI-first solution that streamlines and simplifies the data preparation process as part of Gemini in BigQuery.

Now in preview, BigQuery data preparation provides a number of capabilities:

BigQuery data preparation helps you ensure the accuracy and reliability of your data, leading to more informed business decisions. BigQuery data preparation automates data quality checks and integrates with other Google Cloud services such as Dataform and Cloud Storage, providing a unified and scalable environment for your data needs.

How does it work?

Getting started is easy. When you sample a BigQuery table in BigQuery data preparation, Gemini in BigQuery uses state-of-the-art foundation models to evaluate the data and schema, then generates data preparation recommendations such as filter and transformation suggestions. For example, it knows how to identify valid date formats by country and which columns can act as join keys, accelerating the data engineering process.

In one example (using synthetic data), the Birthdate column contains two different date formats and is of type STRING. BigQuery data preparation suggests: “Convert column Birthdate from type string to date with the following format(s): ‘%Y-%m-%d’, ‘%m/%d/%Y’”. After you apply the suggestion card, you can verify in the preview data that the transformed column is of type DATE.
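To make the effect of that suggestion concrete, here is a minimal Python sketch of the same idea: try each suggested format in turn and convert the string to a date. This is only an illustration of the transformation logic, not how BigQuery data preparation implements it. The function name parse_birthdate and the sample values are hypothetical; only the two format strings come from the suggestion card quoted above.

```python
from datetime import date, datetime
from typing import Optional

# The two formats named in the suggestion card.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_birthdate(value: str) -> Optional[date]:
    """Try each candidate format in order; return None if none match."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date()
        except ValueError:
            # This format did not match; fall through to the next one.
            continue
    return None


# Synthetic sample values (hypothetical, for illustration only).
raw_birthdates = ["1990-04-23", "07/15/1985", "not a date"]
parsed = [parse_birthdate(v) for v in raw_birthdates]
print(parsed)
```

Values that match neither format come back as None in this sketch, which makes rows that would need a further cleaning step easy to spot.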

With BigQuery’s AI-assisted data preparation, you can:

What BigQuery customers are saying 

Customers are already solving numerous challenges with BigQuery data preparation. 

GAF is a major manufacturer of roofing materials in North America, and is adopting data preparation for creating data transformation pipelines on BigQuery.

“GAF is looking to modernize the ETL infrastructure and adopt a BigQuery native, low-code solution. BigQuery data preparation will help our skilled business users and the analytics team in the data preparation processes for the enablement of self-service analytics.” – Puja Panchagnula, Management Director – Enterprise Data Management & Analytics, GAF

mCloud Technologies helps businesses in sectors like energy, buildings, and manufacturing to optimize the performance, reliability, and sustainability of their assets.

“We receive data feeds from our partners. BigQuery data preparation allows our product managers to prepare and operate the file data feeds with little to no help from our data engineering team.” – Jim Christian, Chief Product and Technology Officer, mCloud Technologies

Public Value Technologies is a joint venture between two German public broadcasting organizations (ARD).

“Public Value Technologies receives data feeds from our media partners for our data mesh solution and AI applications. BigQuery data preparation allows our data analysts and scientists to rapidly integrate the data feeds that standardize and preprocess the data in a low code way.” – Korbinian Schwinger, Team Lead Data Engineer, Public Value Technologies 

Getting started

With its powerful AI capabilities, intuitive interface, and tight integration with the Google Cloud ecosystem, BigQuery data preparation is set to revolutionize the way organizations manage and prepare their data. By automating tedious tasks, improving data quality, and empowering users, this innovative solution reduces the time you spend preparing data and improves your productivity. 
