Linear Regression Using TensorFlow with Examples


TensorFlow is a popular open-source library for high-performance numerical computation in machine learning and deep learning. A major reason for its wide acceptance is the library’s support for APIs in various languages. These APIs ease model building and improve the performance of project execution. With the TensorFlow APIs, you can fully control your computations and run many machine learning models much faster. The TensorFlow APIs can be broadly classified into two groups: low-level APIs and high-level APIs. 

Low-level APIs are generally more elementary and detailed, giving you full control of your functions and computations; they allow you to build and optimize your model from scratch. High-level APIs, on the other hand, are relatively simpler, with predefined functions. Computations that would require long stretches of code with low-level APIs can often be executed in a single statement, as the sketch below illustrates. Simply put, high-level APIs are easier to use. 
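To give a feel for the difference, here is a minimal sketch (assuming TensorFlow 1.x, the version used throughout this tutorial). At the low level, you build the computation graph yourself and run it in a session; with a high-level API such as Keras, a complete model takes only a few statements.

#low-level style: build a graph of operations explicitly, then run it in a session
import tensorflow as tf

a = tf.constant(2.0)
b = tf.constant(3.0)
total = a + b

with tf.Session() as sess:
    print(sess.run(total))   #prints 5.0

#high-level style: Keras assembles and configures a whole model in a few lines
model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=(1,))])
model.compile(optimizer='sgd', loss='mse')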

Common high-level APIs include Keras and the Estimator toolbox. In this tutorial, we will use the Estimator toolbox to build, train, and evaluate a machine learning model. We will focus on the linear regression algorithm and look at the various methods that can be used to build the model. 

By the end of this tutorial, you will learn:

  • What linear regression is
  • How to build a line of best fit with Python
  • How to build a linear regression model with TensorFlow
  • What feature columns are
  • What an input function is and why it is necessary
  • What batch sizes, epochs, and steps are
  • How to feed data to your TensorFlow model using a Pandas DataFrame
  • How to feed data to your TensorFlow model using NumPy arrays and dictionaries. 

Let’s begin with understanding what linear regression is. 

Introduction to Linear Regression

A linear regression model is a model that shows how two variables are related. The linear regression algorithm seeks the line that best fits the two variables, which can then be used to predict the value of one variable given the other. The variable being predicted is called the dependent variable, whereas the variable used for the prediction is called the independent variable. A model with just two variables like this is called a simple linear regression model. 

A system can nevertheless have more than two variables. When more than two variables are involved, the model is called a multiple linear regression model. 

Let’s say we are dealing with a simple linear regression model. The independent variable is conventionally denoted by x while the dependent variable is denoted by y. The line that best describes how x is related to y is given by the formula,

y = mx + b

Where y is the dependent variable, 

x is the independent variable,
m is the weight (slope) of the regression line,
and b is the bias of the regression line.
Given two points on the line, the slope m can be found using the formula
m = (y2 - y1)/(x2 - x1)
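
For example, if the line passes through the points (1, 3) and (3, 7), the slope is m = (7 - 3)/(3 - 1) = 2.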

Sometimes, an error term β is added to the formula to account for the fact that x and y do not always have a perfectly linear relationship. The equation of the regression line then becomes 

y = mx + b + β

Where it is not added, the implication is that knowing m, x, and b is sufficient to ascertain the value of y. 

The slope of the regression line indicates whether the relationship between the dependent and independent variables is positive, negative, or non-existent. 

  • If the regression line is flat such that the slope of the line is zero, it means there is no relationship whatsoever between the dependent and independent variables. In other words, an increase in one variable does not affect the other variable. 
  • If the regression line slopes downwards, with the upper end of the line pointing towards the y-axis and the lower end pointing towards the x-axis, it implies that there exists a negative relationship between the dependent and independent variables. In other words, as one variable increases, the other decreases. 
  • If the regression line slopes upwards, such that the upper end of the line points away from the origin and the lower end points towards the origin, it implies that there exists a positive relationship between the dependent and independent variables. In other words, as one variable increases, the other increases as well. 

Now that we have a good understanding of what linear regression is about, let’s see how to build one in Python. 

Creating a Linear Regression Line in Python

We can create the line of best fit between two variables using Python. First, we need to create some random data for the x and y axes. For this, we will need the NumPy and Matplotlib libraries.

#import the necessary libraries
import numpy as np
from matplotlib import pyplot as plt

We will define a random seed to ensure the randomly generated numbers remain the same each time the program is run. This is good practice for keeping our results reproducible.

#define a random seed
np.random.seed(42)

At this point, we can create our randomly generated data for both the x and y axes, along with some noise. 

#create random points for the x and y axis from 0 to 20
xs = np.linspace(0, 20, 50)
ys = np.linspace(20, 0, 50)

# add some positive and negative noise on the y axis
ys += np.random.uniform(-2, 2, 50)

Plotting the graph, we have

#plot the graph
plt.plot(xs, ys, '.')
[Figure: scatter plot of the randomly generated points]

Using the least-squares estimates for the slope and intercept, we can calculate the weight and bias of the graph and plot the line of best fit. 

#define the weight of the regression line
m = (((np.mean(xs) * np.mean(ys)) - np.mean(xs * ys)) /
    ((np.mean(xs) ** 2) - np.mean(xs ** 2)))

#define the bias of the regression line
b = np.mean(ys) - (m * np.mean(xs))

#define the equation of the regression line
regression_line = [(m * x) + b for x in xs]

#plot the graph
plt.plot(xs, ys, '.')
plt.plot(xs, regression_line)

Output:

[Figure: the scattered data points with the fitted regression line]

And there you have it: the regression line. The straight line represents the line of best fit for the randomly generated data.

But this is a simple scenario with just one dependent and one independent variable. In many real-life situations, you will be dealing with more than one independent variable, and sometimes more than one dependent variable. For instance, the popular Iris dataset has 4 independent variables (sepal length, sepal width, petal length, and petal width) that determine the dependent variable (the species of the flower). 

In such cases, determining the line of best fit with the above method is difficult. It would be impossible to even visualize the data, since it has 5 variables in total (we can visualize in at most 3 dimensions). Thus, a more sophisticated approach, involving training and evaluating the model, is employed to determine the regression line. Let’s understand how this works. 

How Training a Linear Regression Model Works

Let’s say we are dealing with the Iris dataset. The independent variables are the sepal length, sepal width, petal length, and petal width, while the dependent variable is the species of the flower. In other words, to predict the species of the flower, you need to supply the sepal length, petal length, sepal width, and petal width. The equation of the linear regression line is therefore 

y = m1(sepal length) + m2(petal length) + m3(sepal width) + m4(petal width) + b + β

Where m1, m2, m3, and m4 are the weights and b is the bias
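
In vector form, this prediction is just a dot product of the weights with the features, plus the bias. Here is a minimal NumPy sketch with made-up, purely illustrative weights:

import numpy as np

#hypothetical weights m1..m4 and bias b, for illustration only
weights = np.array([0.3, 0.5, -0.2, 0.8])
b = 0.1

#one flower: sepal length, petal length, sepal width, petal width
x = np.array([5.1, 1.4, 3.5, 0.2])

#the predicted value is the weighted sum of the features plus the bias
y_pred = np.dot(weights, x) + b
print(y_pred)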

When training begins, the linear regression algorithm initializes random numbers for the weights and bias and computes the predicted value for every observation in the data. After this is done, the error in the prediction is calculated by subtracting the predicted values from the actual values. 

Error = y_actual − y_predicted

The goal is to make this error as small as possible. This error measure is technically called the cost function. For linear regression problems, the cost function commonly used is the mean of the squared errors. The value is called the mean squared error (MSE) and is mathematically represented as 

MSE = (1/n) Σᵢ₌₁ⁿ (T(xᵢ) − yᵢ)²

Here T(xᵢ) is the predicted value for the i-th observation, while yᵢ is the actual value. T represents the weights, which are adjusted continuously until the MSE is as small as possible.
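
To make this concrete, here is a minimal sketch of computing the MSE with NumPy, using small made-up arrays of predicted and actual values:

import numpy as np

#hypothetical predicted and actual values, for illustration only
y_pred = np.array([2.5, 0.0, 2.1, 7.8])
y_actual = np.array([3.0, -0.5, 2.0, 7.5])

#mean squared error: the average of the squared differences
mse = np.mean((y_pred - y_actual) ** 2)
print(mse)   #0.15

So the MSE tells us how wrong the current weights are. But how are the weights altered?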

After the mean squared error has been computed, the weights are systematically corrected using an optimizer. There is a plethora of optimizers, but the most common is the gradient descent optimizer. Gradient descent finds the derivative, or gradient, by measuring how a change in the weight affects the error. If the gradient is positive, the weight needs to be reduced. If, however, the gradient is negative, the weight must be increased. This process repeats for the different weights until the derivative is very close to zero. Each pass of the process is called an iteration, and the point where the derivative is approximately zero is called the local minimum. 

But there is one thing to note in this process: what informs how much the weight should be changed after each iteration? The idea of gradient descent is better explained with the analogy of a man walking down a hill. Taking giant strides, he will most likely overshoot the lowest part of the hill, because he keeps taking giant strides even when he is already close. On the other hand, taking baby steps will take a long time to get him to the lowest part of the hill. His best bet is to take giant strides at the start and shorten them as he descends. 

Bringing it back to the machine learning algorithm, the size of each weight change is determined by the learning rate. The learning rate determines how large or small the weight updates should be so as to reach the local minimum quickly. If the learning rate is too large, gradient descent may overshoot and never reach the local minimum. If it is too small, it will take a long time to get there. The sketch below shows the idea in action. 
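
Here is a minimal NumPy sketch of gradient descent fitting y = mx + b to noisy data. This illustrates the idea only; it is not what the TensorFlow optimizer does internally, and the learning rate and iteration count are illustrative choices.

import numpy as np

np.random.seed(42)
xs = np.linspace(0, 20, 50)
ys = 2 * xs + 1 + np.random.uniform(-2, 2, 50)   #noisy line with true m = 2, b = 1

m, b = 0.0, 0.0          #initialize the weight and bias
learning_rate = 0.005
n = len(xs)

for i in range(5000):    #each pass is one iteration
    error = (m * xs + b) - ys
    #gradients of the MSE with respect to m and b
    dm = (2 / n) * np.sum(error * xs)
    db = (2 / n) * np.sum(error)
    #move against the gradient, scaled by the learning rate
    m -= learning_rate * dm
    b -= learning_rate * db

print(m, b)   #should approach m ≈ 2, b ≈ 1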

[Figure: the effect of small vs. large learning rates on gradient descent. Source: Builtin]

Your learning rate must be carefully chosen such that the cost function decreases rapidly in the first few iterations and stabilizes at some point, as seen in the figure below. 

[Figure: training loss decreasing rapidly and then stabilizing over iterations]

In the figure above, we can see that the loss stabilizes after about the 600th iteration. That means the algorithm found the local minimum after tweaking the weights roughly 600 times. The model has learned the data and is ready to make predictions on completely new data. 

Training a Linear Regression Model with TensorFlow (Example)

In this section, we will train a linear regression model using the TensorFlow Estimator API, tf.estimator. We will use the popular Boston housing dataset for this example, imported from the Scikit-learn dataset repository. 

We will start by importing the necessary libraries

#import necessary libraries
import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn.datasets import load_boston

Then we load the dataset.

#load the dataset
boston = load_boston()

The loaded dataset is a dictionary-like object whose keys point to the various pieces of information that can be extracted from it. 
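
For instance, you can list the available keys (a quick sketch):

#list the keys of the dataset object
print(boston.keys())   #includes 'data', 'target', 'feature_names' and 'DESCR'

Among these keys, DESCR holds a human-readable description of the dataset, which we can print.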

#check the description of the dataset
print(boston.DESCR)

Output:

Boston house prices dataset
---------------------------

**Data Set Characteristics:**  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
        - B        1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
        - LSTAT    % lower status of the population
        - MEDV     Median value of owner-occupied homes in $1000's

    :Missing Attribute Values: None

    :Creator: Harrison, D. and Rubinfeld, D.L.

This is a copy of UCI ML housing dataset.
https://archive.ics.uci.edu/ml/machine-learning-databases/housing/

As seen above, the dataset records the median value of owner-occupied homes in $1000’s, given various attributes such as the per capita crime rate by town, the proportion of residential land zoned for lots over 25,000 square feet, and others. Let’s see what the dataset looks like.

#convert the dataset into a dataframe
df = pd.DataFrame(boston.data, columns=boston.feature_names)
#print the first 5 rows of the dataframe
print(df.head())

Output:

CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD    TAX  \
0  0.00632  18.0   2.31   0.0  0.538  6.575  65.2  4.0900  1.0  296.0   
1  0.02731   0.0   7.07   0.0  0.469  6.421  78.9  4.9671  2.0  242.0   
2  0.02729   0.0   7.07   0.0  0.469  7.185  61.1  4.9671  2.0  242.0   
3  0.03237   0.0   2.18   0.0  0.458  6.998  45.8  6.0622  3.0  222.0   
4  0.06905   0.0   2.18   0.0  0.458  7.147  54.2  6.0622  3.0  222.0   

   PTRATIO       B  LSTAT  
0     15.3  396.90   4.98  
1     17.8  396.90   9.14  
2     17.8  392.83   4.03  
3     18.7  394.63   2.94  
4     18.7  396.90   5.33 

The target column is kept separately and needs to be added to the dataframe. This is done by assigning it as a new column.

#add the target column to the dataframe
df['MEDV'] = boston.target 
#print the first 5 rows of the data frame
print(df.head())

Output:

CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD    TAX  \
0  0.00632  18.0   2.31   0.0  0.538  6.575  65.2  4.0900  1.0  296.0   
1  0.02731   0.0   7.07   0.0  0.469  6.421  78.9  4.9671  2.0  242.0   
2  0.02729   0.0   7.07   0.0  0.469  7.185  61.1  4.9671  2.0  242.0   
3  0.03237   0.0   2.18   0.0  0.458  6.998  45.8  6.0622  3.0  222.0   
4  0.06905   0.0   2.18   0.0  0.458  7.147  54.2  6.0622  3.0  222.0   

   PTRATIO       B  LSTAT  MEDV  
0     15.3  396.90   4.98  24.0  
1     17.8  396.90   9.14  21.6  
2     17.8  392.83   4.03  34.7  
3     18.7  394.63   2.94  33.4 
4     18.7  396.90   5.33  36.2 

You can feed your data to the tf.estimator model in various ways. In this tutorial, we will use two different methods to pass the data into the estimator: a Pandas DataFrame and NumPy arrays. There are other ways to feed data into your TensorFlow model, but we will limit this tutorial to these two. Let’s begin with Pandas. 

Using Pandas

Step 1: Specify the feature and target columns

For easy reference, we keep the feature columns and the name of the target column in their own variables. 

#define the feature columns and target columns
features = df[boston.feature_names]
target = 'MEDV'

Step 2: Define the estimator

Just before you define your model, or estimator, TensorFlow requires you to define what are called feature columns. Feature columns are a data preprocessing step that transforms raw data into a form the TensorFlow estimator understands. You may see the feature columns as a bridge between the raw data and the estimator. Note that only the features need to be passed, not the target column. 

You can create the feature columns with TensorFlow’s tf.feature_column module. Since all the columns contain continuous numbers, the numeric_column() method will be used. To transform all the columns in one line of code, you can use a list comprehension.

#convert the feature columns into Tensorflow numeric column
feature_columns = [tf.feature_column.numeric_column(i) for i in features]

If, however, the dataset contains categorical features, you will need to convert them using other methods such as categorical_column_with_vocabulary_list() or indicator_column(), as sketched below. 
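
For illustration, here is a sketch of how a hypothetical categorical feature named 'neighborhood' (not present in the Boston data) could be converted:

#hypothetical example: encode a categorical column for the estimator
neighborhood = tf.feature_column.categorical_column_with_vocabulary_list(
    'neighborhood', vocabulary_list=['north', 'south', 'east', 'west'])

#wrap it in an indicator column to one-hot encode it
neighborhood_one_hot = tf.feature_column.indicator_column(neighborhood)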

After defining the feature columns, you can define your estimator. The estimator requires two arguments: the feature columns (which we just defined) and the model directory where the model parameters and graph will be stored. We will name the model directory ‘LinRegTrain’.

There are six commonly used premade estimators in the tf.estimator module, three each for regression and classification problems. 

For regression problems, you may select

1. LinearRegressor

2. DNNRegressor

3. DNNLinearCombinedRegressor

For classification problems, you may select

1. LinearClassifier

2. DNNClassifier

3. DNNLinearCombinedClassifier

For this tutorial, we shall use the LinearRegressor estimator, instantiated via tf.estimator.LinearRegressor. 

#define the linear regression estimator

estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns, model_dir='LinRegTrain')

Output:

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': 'LinRegTrain', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x00000146D467B908>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

Step 3: Split the data into train and test data

We need to split the data into train data and test data. The model will be trained on the training dataset and evaluated on the test dataset. First, you define a training size, set here to 75% of all observations. The train and test data are then selected using the iloc indexer. 

#define the training size of the dataset to be 75%
training_size = int(len(df) * 0.75)

#define the train and test data
train = df.iloc[:training_size, :]
test = df.iloc[training_size:, :]

Step 4: Train the model

The next step is to train the model. This is done using the estimator.train() method.

The method takes an input function, which itself wraps the X data (features), y data (labels), batch size, number of epochs, shuffling, and so on. These terms may sound alien, but let’s take time to demystify them one at a time. 

  1. Input function: TensorFlow estimators take in data through what is called an input function. TensorFlow deals with tensors, and so all forms of data (streaming data, in-memory data, custom data, etc.) must be converted into tensors. The input function generates tensors from the raw data and supplies them to the TensorFlow estimator. The input function also configures how the model trains on or evaluates the data; this is why you also define the batch size, the number of epochs, shuffling, etc. We’ll discuss these terms momentarily. 

The input data needed for the input function can be either NumPy arrays or a Pandas DataFrame. In this method, the input data will be passed as a Pandas DataFrame. We’ll discuss how to use NumPy in the next method. 

  2. Batch size: TensorFlow was designed to accommodate large datasets and parallel computing. When working with large datasets, you can train on several machines (parallel computing) or on a single machine if you lack the resources. If you are using one computer to train on a large dataset, it is impossible to expose your model to all the data at once; your computer’s memory will run out of space. This is where defining your batch size comes into play. 

The batch size feeds your data to the model in batches. If you have data with 5000 observations and you define a batch size of 100, the data will be split into 50 batches. That means 100 observations will be fed to your model per iteration. 

  3. Epoch: An epoch is one full pass of the model over all the data. If the number of epochs is set to 2, the model will be exposed to the data twice; the second time, it uses updated weights with the aim of reducing the loss. 

When num_epochs is set to None, as in the input functions below, the data is repeated indefinitely, and training ends only after the number of steps you specify has been reached. The number of steps is simply the number of iterations (batches) you want the model to train for.

If you have 5000 observations and set the batch size to 100, it will take 50 iterations to see all the data. Setting the number of epochs to 4 therefore requires 4 × 50 = 200 iterations. Another way of achieving this is to set the steps argument to 200 and leave num_epochs as None; training will then run for 200 iterations.

  4. Shuffling: It is always a good idea to shuffle your data during training. This ensures your model does not memorize the particular order and quirks of your data hook, line, and sinker. When it does, the model will not make good predictions on new data, even though it fits the training data very well. This is called overfitting, and it must be avoided. 

Now that we understand the useful parameters for creating an input function, let’s create one. Remember, we are using the pandas_input_fn function here. 

def input_fn(dataset, batch_size=128, num_epochs=None, shuffle=True):
    """This function defines the input function for the dataset."""
    return tf.estimator.inputs.pandas_input_fn(
        x=dataset[boston.feature_names],
        y=dataset['MEDV'],
        batch_size=batch_size,
        num_epochs=num_epochs,
        shuffle=shuffle
    )

And then, we can finally train the model.

#train the model with 2000 steps
estimator.train(input_fn=input_fn(train, num_epochs=None), steps=2000)

Output:

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from LinRegTrain\model.ckpt-19081
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 19081 into LinRegTrain\model.ckpt.
INFO:tensorflow:loss = 2068.8728, step = 19082
INFO:tensorflow:global_step/sec: 159.792
INFO:tensorflow:loss = 3538.3533, step = 19182 (0.631 sec)
INFO:tensorflow:global_step/sec: 200.115
INFO:tensorflow:loss = 1659.8722, step = 19282 (0.500 sec)
INFO:tensorflow:global_step/sec: 203.782
INFO:tensorflow:loss = 3306.3003, step = 19382 (0.491 sec)
INFO:tensorflow:global_step/sec: 207.158
INFO:tensorflow:loss = 3048.774, step = 19482 (0.483 sec)
INFO:tensorflow:global_step/sec: 206.305
INFO:tensorflow:loss = 3164.751, step = 19582 (0.485 sec)
INFO:tensorflow:global_step/sec: 208.453
INFO:tensorflow:loss = 2899.857, step = 19682 (0.481 sec)
INFO:tensorflow:global_step/sec: 206.728
INFO:tensorflow:loss = 3106.3613, step = 19782 (0.482 sec)
INFO:tensorflow:global_step/sec: 200.517
INFO:tensorflow:loss = 2640.4854, step = 19882 (0.499 sec)
INFO:tensorflow:global_step/sec: 202.545
INFO:tensorflow:loss = 2857.7683, step = 19982 (0.494 sec)
INFO:tensorflow:global_step/sec: 204.616
INFO:tensorflow:loss = 2167.958, step = 20082 (0.489 sec)
INFO:tensorflow:global_step/sec: 204.198
INFO:tensorflow:loss = 2442.528, step = 20182 (0.491 sec)
INFO:tensorflow:global_step/sec: 207.588
INFO:tensorflow:loss = 2945.9646, step = 20282 (0.482 sec)
INFO:tensorflow:global_step/sec: 210.646
INFO:tensorflow:loss = 3567.1733, step = 20382 (0.477 sec)
INFO:tensorflow:global_step/sec: 211.093
INFO:tensorflow:loss = 3195.5977, step = 20482 (0.472 sec)
INFO:tensorflow:global_step/sec: 205.035
INFO:tensorflow:loss = 2235.641, step = 20582 (0.488 sec)
INFO:tensorflow:global_step/sec: 208.453
INFO:tensorflow:loss = 2183.0503, step = 20682 (0.480 sec)
INFO:tensorflow:global_step/sec: 206.729
INFO:tensorflow:loss = 3767.7236, step = 20782 (0.484 sec)
INFO:tensorflow:global_step/sec: 198.135
INFO:tensorflow:loss = 3152.8857, step = 20882 (0.507 sec)
INFO:tensorflow:global_step/sec: 182.586
INFO:tensorflow:loss = 2599.474, step = 20982 (0.546 sec)
INFO:tensorflow:Saving checkpoints for 21081 into LinRegTrain\model.ckpt.
INFO:tensorflow:Loss for final step: 2783.6982.

Step 5: Evaluate the model

Evaluating the model enables you to check how well it can make predictions. Just as when training the model, you need an input function for evaluation as well. 

It’s good practice to reuse the input function you used for training, changing only the data passed to it. This time, you pass the test data. 

Let’s evaluate the model. 

#evaluate the model
evaluation = estimator.evaluate(input_fn=input_fn(test, num_epochs=10, shuffle=True))

Output:

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2020-11-09T23:06:48Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from LinRegTrain\model.ckpt-21081
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2020-11-09-23:06:49
INFO:tensorflow:Saving dict for global step 21081: average_loss = 43.917557, global_step = 21081, label/mean = 14.948031, loss = 5577.53, prediction/mean = 18.892693
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 21081: LinRegTrain\model.ckpt-21081

The evaluation reports an average loss (mean squared error) of about 43.9. Since prices are in $1000’s, that corresponds to a root mean squared error of roughly 6.6, i.e. about $6,600. To put this in perspective, let’s look at the average price of a house according to the data. 

train['MEDV'].describe()

Output:

count    379.000000
mean      25.074406
std        8.801969
min       11.800000
25%       19.400000
50%       22.800000
75%       28.700000
max       50.000000
Name: MEDV, dtype: float64

As seen above, the average price of a house is about $25,000. The model’s performance can, however, still be tweaked by changing the training parameters such as the number of epochs, batch size, etc. 

Using NumPy Arrays

We can also feed the data into the TensorFlow estimator using NumPy arrays. We start by splitting the data into train and test data. 

#define the training size of the dataset to be 75%
training_size = int(len(df) * 0.75)

#define the train and test data
train = df[:training_size].values
test = df[training_size:].values

The values attribute converts the dataframe into a NumPy array. The train and test data are then split into X_train, y_train, X_test, and y_test.

def prepare_data(df):
    """This function splits the data array into X and y data."""
    X = df[:, :-1]
    y = df[:, -1]
    return X, y

#define the X_train, y_train, X_test, y_test data
X_train, y_train = prepare_data(train)
X_test, y_test = prepare_data(test)

The feature columns can now be defined using the code below. 

#convert the feature columns into Tensorflow numeric column
feature_columns = [tf.feature_column.numeric_column('x', shape=X_train.shape[1:])]

And now, the estimator can be defined using the feature columns above. 

#define the linear regression estimator
estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns, model_dir='LinRegTrain1')

Output:

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': 'LinRegTrain1', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000001B878B46278>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

Afterward, the input function for the train dataset is defined. 

#define the train input function
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x = {'x': X_train},
    y = y_train,
    batch_size=128,
    num_epochs=None,
    shuffle=True,
    )

Now, the model can be trained with the input function defined above. 

#train the model
estimator.train(input_fn=train_input_fn, steps=5000)

Output:

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from LinRegTrain1\model.ckpt-31500
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 31500 into LinRegTrain1\model.ckpt.
INFO:tensorflow:loss = 2298.775, step = 31501
INFO:tensorflow:global_step/sec: 529.409
INFO:tensorflow:loss = 3787.5515, step = 31601 (0.192 sec)
INFO:tensorflow:global_step/sec: 637.311
INFO:tensorflow:loss = 4150.9785, step = 31701 (0.157 sec)
INFO:tensorflow:global_step/sec: 515.754
INFO:tensorflow:loss = 2896.7476, step = 31801 (0.193 sec)
INFO:tensorflow:global_step/sec: 592.061
INFO:tensorflow:loss = 2679.3975, step = 31901 (0.169 sec)
INFO:tensorflow:global_step/sec: 621.468
INFO:tensorflow:loss = 2945.3008, step = 32001 (0.162 sec)
INFO:tensorflow:global_step/sec: 671.53
INFO:tensorflow:loss = 3135.522, step = 32101 (0.148 sec)
INFO:tensorflow:global_step/sec: 645.523
INFO:tensorflow:loss = 1737.3533, step = 32201 (0.155 sec)
INFO:tensorflow:global_step/sec: 667.063
INFO:tensorflow:loss = 2122.1785, step = 32301 (0.150 sec)
INFO:tensorflow:global_step/sec: 625.357
INFO:tensorflow:loss = 2111.141, step = 32401 (0.160 sec)
INFO:tensorflow:global_step/sec: 667.048
INFO:tensorflow:loss = 3291.2444, step = 32501 (0.151 sec)
INFO:tensorflow:global_step/sec: 667.05
INFO:tensorflow:loss = 4172.6313, step = 32601 (0.150 sec)
INFO:tensorflow:global_step/sec: 667.041
INFO:tensorflow:loss = 3981.296, step = 32701 (0.149 sec)
INFO:tensorflow:global_step/sec: 662.638
INFO:tensorflow:loss = 3058.733, step = 32801 (0.151 sec)
INFO:tensorflow:global_step/sec: 690.052
INFO:tensorflow:loss = 2693.0422, step = 32901 (0.146 sec)
INFO:tensorflow:global_step/sec: 667.048
INFO:tensorflow:loss = 3583.536, step = 33001 (0.151 sec)
INFO:tensorflow:global_step/sec: 667.046
INFO:tensorflow:loss = 2732.2446, step = 33101 (0.150 sec)
INFO:tensorflow:global_step/sec: 694.842
INFO:tensorflow:loss = 1517.0491, step = 33201 (0.143 sec)
INFO:tensorflow:global_step/sec: 699.704
INFO:tensorflow:loss = 3136.3606, step = 33301 (0.142 sec)
INFO:tensorflow:global_step/sec: 637.307
INFO:tensorflow:loss = 2074.9668, step = 33401 (0.157 sec)
INFO:tensorflow:global_step/sec: 690.049
INFO:tensorflow:loss = 2896.9585, step = 33501 (0.145 sec)
INFO:tensorflow:global_step/sec: 645.534
INFO:tensorflow:loss = 3446.4941, step = 33601 (0.156 sec)
INFO:tensorflow:global_step/sec: 595.577
INFO:tensorflow:loss = 3641.3157, step = 33701 (0.168 sec)
INFO:tensorflow:global_step/sec: 602.755
INFO:tensorflow:loss = 2726.9165, step = 33801 (0.166 sec)
INFO:tensorflow:global_step/sec: 625.359
INFO:tensorflow:loss = 2953.2432, step = 33901 (0.163 sec)
INFO:tensorflow:global_step/sec: 617.635
INFO:tensorflow:loss = 1818.0906, step = 34001 (0.158 sec)
INFO:tensorflow:global_step/sec: 671.529
INFO:tensorflow:loss = 3186.2725, step = 34101 (0.149 sec)
INFO:tensorflow:global_step/sec: 667.047
INFO:tensorflow:loss = 3824.3748, step = 34201 (0.150 sec)
INFO:tensorflow:global_step/sec: 461.072
INFO:tensorflow:loss = 2467.5938, step = 34301 (0.221 sec)
INFO:tensorflow:global_step/sec: 446.707
INFO:tensorflow:loss = 4125.1543, step = 34401 (0.220 sec)
INFO:tensorflow:global_step/sec: 483.367
INFO:tensorflow:loss = 2444.977, step = 34501 (0.212 sec)
INFO:tensorflow:global_step/sec: 555.873
INFO:tensorflow:loss = 2906.3044, step = 34601 (0.176 sec)
INFO:tensorflow:global_step/sec: 581.73
INFO:tensorflow:loss = 2992.641, step = 34701 (0.171 sec)
INFO:tensorflow:global_step/sec: 456.846
INFO:tensorflow:loss = 2374.501, step = 34801 (0.222 sec)
INFO:tensorflow:global_step/sec: 529.453
INFO:tensorflow:loss = 2264.888, step = 34901 (0.187 sec)
INFO:tensorflow:global_step/sec: 481.045
INFO:tensorflow:loss = 2630.542, step = 35001 (0.208 sec)
INFO:tensorflow:global_step/sec: 529.405
INFO:tensorflow:loss = 3225.4219, step = 35101 (0.190 sec)
INFO:tensorflow:global_step/sec: 492.893
INFO:tensorflow:loss = 1567.6238, step = 35201 (0.202 sec)
INFO:tensorflow:global_step/sec: 431.279
INFO:tensorflow:loss = 3448.526, step = 35301 (0.234 sec)
INFO:tensorflow:global_step/sec: 629.291
INFO:tensorflow:loss = 2485.2834, step = 35401 (0.157 sec)
INFO:tensorflow:global_step/sec: 653.975
INFO:tensorflow:loss = 2805.2188, step = 35501 (0.152 sec)
INFO:tensorflow:global_step/sec: 667.048
INFO:tensorflow:loss = 2969.5796, step = 35601 (0.151 sec)
INFO:tensorflow:global_step/sec: 676.062
INFO:tensorflow:loss = 2702.0142, step = 35701 (0.147 sec)
INFO:tensorflow:global_step/sec: 641.384
INFO:tensorflow:loss = 2972.1235, step = 35801 (0.157 sec)
INFO:tensorflow:global_step/sec: 575.043
INFO:tensorflow:loss = 3671.458, step = 35901 (0.175 sec)
INFO:tensorflow:global_step/sec: 662.639
INFO:tensorflow:loss = 2267.0298, step = 36001 (0.150 sec)
INFO:tensorflow:global_step/sec: 704.633
INFO:tensorflow:loss = 2972.4639, step = 36101 (0.141 sec)
INFO:tensorflow:global_step/sec: 694.84
INFO:tensorflow:loss = 2400.421, step = 36201 (0.146 sec)
INFO:tensorflow:global_step/sec: 667.038
INFO:tensorflow:loss = 2839.9067, step = 36301 (0.148 sec)
INFO:tensorflow:global_step/sec: 680.673
INFO:tensorflow:loss = 1892.6975, step = 36401 (0.147 sec)
INFO:tensorflow:Saving checkpoints for 36500 into LinRegTrain1\model.ckpt.
INFO:tensorflow:Loss for final step: 2119.7617.

To evaluate the model, you need to define another input function, this time using the test dataset. 

#define the test input function
test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x = {'x': X_test},
    y = y_test,
    batch_size=128,
    num_epochs=10,
    shuffle=True
    )

Finally, we can evaluate the model to see how it performs. 

#evaluate the model 
estimator.evaluate(input_fn=test_input_fn)

Output:

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2020-11-10T01:12:56Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from LinRegTrain1\model.ckpt-36500
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2020-11-10-01:12:56
INFO:tensorflow:Saving dict for global step 36500: average_loss = 46.48516, global_step = 36500, label/mean = 14.948031, loss = 5903.6157, prediction/mean = 19.212402
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 36500: LinRegTrain1\model.ckpt-36500
Out[70]:
{'average_loss': 46.48516,
 'label/mean': 14.948031,
 'loss': 5903.6157,
 'prediction/mean': 19.212402,
 'global_step': 36500}
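
Beyond evaluation, the trained estimator can also generate predictions for new data. Here is a minimal sketch using the same NumPy input format (num_epochs=1 and shuffle=False so each row is predicted exactly once):

#define a prediction input function over the test features
pred_input_fn = tf.estimator.inputs.numpy_input_fn(
    x = {'x': X_test},
    num_epochs=1,
    shuffle=False
    )

#predict returns a generator of dictionaries, one per observation
predictions = estimator.predict(input_fn=pred_input_fn)
for pred in list(predictions)[:5]:
    print(pred['predictions'])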

To round off, we have discussed the pathway to building a TensorFlow model using a TensorFlow high-level API, tf.estimator. We started by explaining the theory behind linear regression and afterward created a linear regression line with Python. 

We then built a linear regression model trained on the Boston housing dataset. In the example, we outlined the steps necessary to train and evaluate your model and how to tweak its performance accordingly. In the next tutorial, we shall introduce you to data preprocessing techniques and how to improve a model built on another TensorFlow high-level API, Keras.
