Terms such as machine learning and deep learning come up more and more often these days. Many of us know what machine learning is, yet few have actually put it to use. Last year, Google announced a cloud service for machine learning: Google Cloud Machine Learning.
Google Cloud Machine Learning is a managed platform on the Google Cloud Platform that lets you easily build machine learning models of any size. Behind Google Cloud Machine Learning (Cloud ML) is TensorFlow, a machine learning library developed by Google. Even so, you can try Cloud ML easily without knowing TensorFlow or machine learning.
This time, we will use Cloud ML to analyze images with MNIST, a data set often used in machine learning tutorials. MNIST is a collection of handwritten digit images, each paired with its correct label. We train a machine learning model on this data set so that it can predict which digit a handwritten image represents. The training script and the data set are already provided, so we only need to follow the instructions.
Just keep in mind that Cloud ML is currently in beta, and the specifications are subject to change.
MNIST consists of handwritten digit images such as those above, each with a correct answer label. (Quoted from MNIST For ML Beginners)
What is Google Cloud Machine Learning?
Cloud ML is a service that lets general users access the same cloud machine learning capabilities that Google uses internally. (As of February 2017, provided as a beta.)
To build models, it uses TensorFlow, the powerful framework behind many Google products, from Google Photos to Google Cloud Speech. Because it is integrated with global load balancing, machine learning applications can scale automatically and be made available to users around the world.
Cloud Machine Learning Costs
Cloud ML charges for training models and for running predictions. Managing your machine learning resources in the cloud, however, is free. The cost of each operation is explained in the following sections.
In addition to these costs, models and other resources must be stored in a Google Cloud Storage bucket throughout the Cloud ML workflow, for example when you upload a training package to run a training job or save a model file before deploying a version. Training results and batch prediction output are also stored in the Cloud Storage bucket. You can delete these files as soon as an operation completes, so unless you work at a very large scale, storage costs stay small.
Cloud ML Training Costs
ML Training Units
Users can choose the type of processing cluster for their training jobs. As a shortcut, you can pick one of the predefined configurations called “scale tiers”. Each scale tier corresponds to a different number of ML training units, which determines the price.
Training costs can be calculated from the following formula:
(Number of ML training units) × (cost per unit per hour) × (job duration [in minutes]) ÷ 60
Example:
Suppose a data scientist runs a job on the STANDARD_1 scale for 15 minutes. With 10 ML training units at $0.49 per unit, the cost is calculated as follows:
10 × $0.49 × 15 ÷ 60 = $1.225
So the total cost for this training job is about $1.23.
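The same arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the function name training_cost is hypothetical, and the 10 units and $0.49 per unit are simply the figures from the example above, not a current price list. It uses Decimal so that the result rounds to cents the same way as the hand calculation.

```python
from decimal import Decimal, ROUND_HALF_UP


def training_cost(units, price_per_unit_hour, minutes):
    """Return the training cost in dollars, rounded to the nearest cent.

    Implements: units x price per unit per hour x minutes / 60.
    Hypothetical helper for illustration; not part of any Google tool.
    """
    raw = (Decimal(units) * Decimal(price_per_unit_hour)
           * Decimal(minutes) / Decimal(60))
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Figures from the example above: 10 units, $0.49 per unit, 15 minutes.
cost = training_cost(10, "0.49", 15)
print(f"${cost}")
```

Passing the price as a string keeps it exact; a float such as 0.49 would carry binary rounding error into the result.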
Cloud ML Prediction Costs
Batch Prediction
For batch prediction, you are charged for the number of predictions *) and the processing time **).
*) Predictions are totaled each month, and the charge is rounded to the nearest cent.
**) Processing time is billed per minute for each job, with a minimum charge of 10 minutes.
***) The discount applies once 100 million requests have been reached for the month.
Online Predictions
The other Cloud ML features are available in beta, but the online prediction service is still in alpha. While it remains in alpha, it is free to use. (As of February 2017)
Running Cloud ML
Now, let’s actually run Cloud ML. Cloud ML is operated mainly from a shell. You can run it locally, but then you have to install the required software on your own machine. We recommend Cloud Shell, which runs in the cloud, and we will use it here. The procedure is as follows.
- Set up the environment needed to run Cloud ML
- Run training
- Create a model
- Predict handwritten digit images
Set up the environment needed to run Cloud ML
To set up the environment, first configure the project and then install the tools.
Enable the required APIs
From the menu, select “API Manager”, then enable the APIs below. Any API that is already enabled can be left as is.
- Cloud Machine Learning API
- Compute Engine API
- Cloud Logging API
- Cloud Storage API
- Cloud Storage JSON API
- BigQuery API
Configuring Environment
After configuring the project, set up the environment in Cloud Shell.
Code
# Install the tools needed to run Cloud ML
curl https://raw.githubusercontent.com/GoogleCloudPlatform/cloudml-samples/master/tools/setup_cloud_shell.sh | bash
# Add the installed tools to the PATH
export PATH=${HOME}/.local/bin:${PATH}
Training
Once the preparation for Cloud ML is complete, the next step is to run training to create a machine learning model. The steps are as follows:
- Confirm that training runs locally.
- Submit the training script to the cloud.
Local training
Before training in the cloud, it is worth confirming that training runs locally. A local check is normally done with a small subset of the data, but this data set has only about 70,000 samples, so we use it as is.
The steps below are optional and can be skipped.
Code
# Change to the working directory.
cd ~/google-cloud-ml/samples/mnist/deployable/
# Delete the output files of any previous training run.
rm -f data/{checkpoint,events,export}*
# Run training locally.
gcloud beta ml local train \
  --package-path=trainer \
  --module-name=trainer.task
Create a model
There are two ways to create a model:
Create it in Cloud Shell
Code
# Set the model name (the name below is only an example).
MODEL_NAME=mnist_${USER}_$(date +%Y%m%d_%H%M%S)
# Create a Cloud ML model.
gcloud beta ml models create ${MODEL_NAME}
# Create the version used for prediction (the first version created becomes the default version).
# ${TRAIN_PATH} is assumed to already hold the output location of the cloud training job.
gcloud beta ml models versions create \
  --origin=${TRAIN_PATH}/model/ \
  --model=${MODEL_NAME} \
  v1
Create it in the GCP Console
You can also create models and versions in the GCP Console. The steps are as follows:
- From the menu, choose [Machine Learning] > [Model]
- Click [Create Model]
- Type the name of the model
- Click [Create]
- Return to the model list screen, then click the name of the new model.
- Click [+ Create Version]. Enter the name and the source (for the source, enter ${TRAIN_PATH}/model/)
- Click [Create] and return to the model list screen.
Predict handwritten digit images
The model, trained to recognize handwritten digits, is now complete, and you are ready to predict which digit a handwritten image represents. To make a prediction, you need the model you just created and the image data to be predicted. The data for prediction is prepared as a JSON file. Let’s take a quick look at its contents.
Code
head -n1 data/predict_sample.tensor.json
The image attribute contains the handwritten digit image data (the input you want the model to predict) as float values. The key attribute contains the index of the image within the JSON file.
Converting the data above into images gives the result below. Let’s see how the model identifies them.
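A conversion of this kind can be sketched in Python as follows. This is an illustration, not the script used to produce the figure. It assumes each line of the file is a JSON object with an image field holding 784 floats (a 28 × 28 MNIST image, stored row by row) and a key field. The function names are hypothetical, and the 0.5 threshold assumes pixel values are scaled between 0 and 1. Instead of drawing a picture, it renders the digit as ASCII art.

```python
import json

SIDE = 28  # MNIST images are 28 x 28 pixels


def parse_instance(line):
    """Parse one JSON line into (key, rows of pixel values)."""
    record = json.loads(line)
    pixels = record["image"]  # assumed field name
    if len(pixels) != SIDE * SIDE:
        raise ValueError(f"expected {SIDE * SIDE} pixels, got {len(pixels)}")
    rows = [pixels[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]
    return record["key"], rows  # "key" is an assumed field name


def to_ascii(rows, threshold=0.5):
    """Render pixel rows as text: '#' for ink, '.' for background."""
    return "\n".join(
        "".join("#" if value > threshold else "." for value in row)
        for row in rows
    )


def show_first_instance(path):
    """Print the first instance of a JSON-lines prediction file."""
    with open(path) as f:
        key, rows = parse_instance(f.readline())
    print(f"key={key}")
    print(to_ascii(rows))
```

To try it on the sample file, call show_first_instance("data/predict_sample.tensor.json") from the working directory used above.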
Use the online prediction service
Run the following command to send a prediction request to the Cloud ML online prediction service. Specify the model name and the JSON file name.
Code
gcloud beta ml predict --model=${MODEL_NAME} \
  --json-instances=data/predict_sample.tensor.json
This is what the screen looks like after the prediction finishes. Let’s compare it with the handwritten digit images above.
Every prediction is correct except the first. With 9 out of 10 correct, the identification accuracy on this sample comes to 90%.
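The accuracy figure is simply the share of predictions that match the correct labels. A minimal Python sketch might look like this. The accuracy function is hypothetical, and the two lists are placeholder values chosen to mirror the 9-out-of-10 result, not the actual output above.

```python
def accuracy(predicted, expected):
    """Return the fraction of predictions that match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("need at least one prediction")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Placeholder values for illustration only: the first prediction is wrong
# and the other nine are right.
expected = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
predicted = [7, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(f"accuracy: {accuracy(predicted, expected):.0%}")
```

Ten samples are far too few for a reliable estimate, so treat the number as a rough indication only.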
This time, we trained on a small data set of about 70,000 samples. Training on a larger data set should improve accuracy.
Closing remarks
Machine learning is a deep subject, and it can seem hard to get started, but we hope this article has sparked your interest in it.
This blog covers not only machine learning but also GCP in general, so please have a look at apps-gcp. In addition, Cloud Ace offers contract development on GAE and consulting on cloud and machine learning, so don’t hesitate to contact us.