Predicting building energy consumption with TensorFlow — custom estimators
In my previous post I demonstrated a very coarse way of using TensorFlow to predict the energy consumption of a building. It showed that it is quite easy to get something running quickly. However, visibility into what was actually going on under the hood was very limited, and so were the options for modifying the model.
The obvious next step is to consider a more modifiable neural network, which I have done by building upon the TensorFlow tutorial on custom estimators. My aim was to build an estimator where the activation functions could be changed, and to see how the activation functions affected the outcomes. The full code, with some example figures from my tests, can be found on GitHub.
Let’s go through the process of creating a custom estimator compared to an “out-of-the-box” solution. The most substantial changes are in the function declarations, i.e. in the file “pred_func_custom.py”. Last time the creation of the estimator, training and activation methods was left to TensorFlow, but this time all of that will be done separately. Below is the function used to define our model.
Function defining the main model
import tensorflow as tf

def my_dnnmodel(features, labels, mode, params):
    """ This function defines a DNN model """
    # Define the feature columns, i.e. the input layer
    net = tf.feature_column.input_layer(features, params['feature_cols'])
    # Construct the hidden layers - the activation function here is the
    # Rectified Linear Unit (relu), more:
    # https://www.tensorflow.org/versions/r0.12/api_docs/python/nn/activation_functions_
    for units in params['hidden_units']:
        net = tf.layers.dense(net, units=units, activation=tf.nn.relu)
        # change the activation to see what happens, for example: relu6, crelu, selu, softplus or dropout (random)
    # And the output layer - as logits, the final form will be calculated with
    # some activation operator, i.e. tf.nn...
    preds = tf.layers.dense(net, params['n_out'], activation=None)
    # Compute predictions
    pred_gas = tf.nn.relu(preds)
    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = {
            'Preds': preds,              # Predictions from the network
            'Abs': tf.abs(preds),        # Absolute values of those predictions
            'Act_pred': pred_gas         # With relu activation
        }
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)
    # Reshape for evaluation and cast to 64-bit floats
    preds = tf.reshape(preds, shape=(-1,))
    preds = tf.cast(preds, tf.float64)
    pred_gas = tf.reshape(pred_gas, shape=(-1,))
    pred_gas = tf.cast(pred_gas, tf.float64)
    # Computation of loss - using mean squared error
    loss = tf.losses.mean_squared_error(labels=labels, predictions=preds)
    # Compute the evaluation metric
    meanrelerror = tf.metrics.mean_relative_error(labels=labels,
                                                  predictions=pred_gas,
                                                  normalizer=labels,
                                                  name='meanrelerr_op')
    # Define metrics for evaluation
    metrics = {'meanrelerror': meanrelerror}
    tf.summary.scalar('meanrelerror', meanrelerror[1])
    # Evaluation method
    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(
            mode, loss=loss, eval_metric_ops=metrics)
    # Train method - Adagrad optimizer used
    assert mode == tf.estimator.ModeKeys.TRAIN
    # Optimizer for training
    optimizer = tf.train.AdagradOptimizer(
        learning_rate=params['learn_rate'])
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
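For completeness, this is roughly how such a model function gets wired into an estimator. The params keys match those used in my_dnnmodel above; the feature column name, layer sizes, learning rate and input functions below are illustrative placeholders rather than my actual configuration.

import tensorflow as tf
from pred_func_custom import my_dnnmodel

# Placeholder feature column - substitute the inputs your data actually has,
# e.g. the outside temperature used for predicting gas consumption.
feature_cols = [tf.feature_column.numeric_column('outside_temp')]

regressor = tf.estimator.Estimator(
    model_fn=my_dnnmodel,            # the custom model function defined above
    params={
        'feature_cols': feature_cols,
        'hidden_units': [80, 80],    # two hidden layers of 80 neurons each
        'n_out': 1,                  # one output: the predicted consumption
        'learn_rate': 0.1,           # learning rate for the Adagrad optimizer
    })

# Training and evaluation then use input functions that yield
# (features, labels) batches:
# regressor.train(input_fn=train_input_fn, steps=5000)
# regressor.evaluate(input_fn=eval_input_fn)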
As one can see, we now have many more factors to play with. Different activation functions can be tested and used within different layers of the network. So far I have been experimenting with different activation functions and different numbers of neurons in the network. Next I will present some of the results to give a picture of what the different activation functions actually mean.
Below are figures from some runs I made with different numbers of neurons and activation functions. At first I just used the relu activation function. Relu is an abbreviation for “rectified linear unit”, which is also the activation used last time. The first two pictures show how changing the number of neurons in the two layers affects the outcome. Interestingly, with more neurons the capability to catch some of the distinguishable features of the data becomes better. Information about TensorFlow activation functions can be found here.
Note that since the splitting of the data was done randomly for each run, a direct comparison is not quite justified with this method. To accurately compare two models one would need to use the same training and evaluation data. I know, I know, I have to do that…
From left to right: deep neural networks 80–80-relu, 5–5-relu.
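For the record, fixing the comparison issue with a fixed split would only take a couple of lines, for example with scikit-learn. Here data and 'gas_consumption' are placeholder names for the dataframe and target column, not the ones used in the repository.

from sklearn.model_selection import train_test_split

# Placeholder dataframe 'data' with a placeholder target column 'gas_consumption'.
# A fixed random_state gives the same train/evaluation split on every run,
# so different models can be compared on identical data.
train_x, eval_x, train_y, eval_y = train_test_split(
    data.drop(columns=['gas_consumption']),
    data['gas_consumption'],
    test_size=0.2,
    random_state=42)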
I then created some models with activation functions like selu and softplus, which are both non-linear activation functions. Selu stands for scaled exponential linear unit. Softplus simply computes the function log(exp(features) + 1). As one can see, the models are a lot “smoother”. It might be interesting to try a neural network with a combination of such a smoother activation function and then the relu.
From left to right: DNNs 80–80-selu, 80–80-softplus
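Trying out selu or softplus currently means editing the activation by hand inside the model function. One way to make it selectable would be to pass it in through the params dictionary — the helper below is a hypothetical sketch, not something that exists in the repository.

import tensorflow as tf

def hidden_layers(net, hidden_units, activation=tf.nn.relu):
    """Stack dense layers with a selectable activation function."""
    for units in hidden_units:
        net = tf.layers.dense(net, units=units, activation=activation)
    return net

# Inside my_dnnmodel the hidden-layer loop could then become, e.g.:
#   net = hidden_layers(net, params['hidden_units'],
#                       activation=params.get('activation', tf.nn.relu))
# and the estimator would be built with params={'activation': tf.nn.selu, ...}
# or tf.nn.softplus.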
So, making those baby steps towards something more elegant, slowly but surely. Next I will most probably be testing different combinations of activation functions. I will also try to make the comparison of different models more transparent and easier. I am also planning to add another input to the model, for example wind speed or the internal temperature of the building, just to see if the predictions become any better.
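To give a rough idea of what adding such an input would involve: another column in the data plus a matching feature column. The column names below are hypothetical.

import tensorflow as tf

# Hypothetical extra inputs alongside the existing ones - the actual column
# names depend on how the weather and sensor data end up being stored.
feature_cols = [
    tf.feature_column.numeric_column('outside_temp'),
    tf.feature_column.numeric_column('wind_speed'),
    tf.feature_column.numeric_column('internal_temp'),
]
# The input functions would then also have to supply 'wind_speed' and
# 'internal_temp' for each timestamp in the features dictionary.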
I have also been working on my Raspberry Pi 3 and already have a working temperature logger with some nice LED indicators. When I have time I will give out some more details on that, and ideas of how I could embed a neural network into it. Exciting! Or at least I think it is.
Feel free to comment or give ideas if you feel like it.
To fellow practitioners,
Eramismus
This post was originally published on Medium
I am a practical Finn with interests spanning energy, digitalisation and society. I am currently working towards a PhD in England. Among other things I try to explore the layered nature of sustainability through philosophy and technology.