Choosing between R and Python has always been a matter of debate. The machine learning world has long been divided over which language to prefer. But with the explosion of deep learning, the balance shifted towards Python, which had an enormous list of deep learning libraries and frameworks that R lacked (until now).
I personally switched from R to Python simply because I wanted to dive into deep learning, which was almost impossible with R. But not anymore!
With the launch of Keras in R, this fight is back at the center. Python was slowly becoming the de facto language for deep learning models. But with the release of the Keras library in R, with TensorFlow (CPU and GPU compatible) as its backend, it is likely that R will again fight Python for the podium, even in the deep learning space.
Below we will see how to install Keras with TensorFlow in R and build our first neural network model on the classic MNIST dataset in RStudio.
Installing Keras in RStudio is very simple. Just follow the steps below and you will be ready to build your first neural network model in R.
install.packages("devtools")
devtools::install_github("rstudio/keras")
The above step installs the keras package from its GitHub repository. Now it is time to load keras into R and install TensorFlow.
library(keras)
By default, the CPU version of TensorFlow is installed. Use the command below to download it.
install_tensorflow()
To install TensorFlow with GPU support on a single-user/desktop system, use the command below.
install_tensorflow(gpu=TRUE)
For a multi-user installation, refer to this installation guide.
Now that we have keras and TensorFlow installed in RStudio, let us build our first neural network in R to classify the MNIST digits.
Below is the list of models that can be built in R using Keras.
Let us start with building a very simple MLP model using just a single hidden layer to try and classify handwritten digits.
# loading the keras library
library(keras)
# loading the MNIST dataset that ships with keras
data <- dataset_mnist()
# separating the train and test sets
train_x <- data$train$x
train_y <- data$train$y
test_x <- data$test$x
test_y <- data$test$y
rm(data)
# flattening each 28x28 image into a 784-element vector for the MLP and scaling pixel values to [0, 1]
train_x <- array(train_x, dim = c(dim(train_x)[1], prod(dim(train_x)[-1]))) / 255
test_x <- array(test_x, dim = c(dim(test_x)[1], prod(dim(test_x)[-1]))) / 255
# converting the target variable to one-hot encoded vectors using the keras built-in function
train_y <- to_categorical(train_y, 10)
test_y <- to_categorical(test_y, 10)
# defining a keras sequential model
model <- keras_model_sequential()
# defining the model with 1 input layer [784 neurons], 1 hidden layer [784 neurons] with dropout rate 0.4
# and 1 output layer [10 neurons], i.e. one neuron per digit from 0 to 9
model %>%
  layer_dense(units = 784, input_shape = 784) %>%
  layer_dropout(rate = 0.4) %>%
  layer_activation(activation = 'relu') %>%
  layer_dense(units = 10) %>%
  layer_activation(activation = 'softmax')
# compiling the defined model with accuracy as the metric and adam as the optimiser
model %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = 'adam',
  metrics = c('accuracy')
)
# fitting the model on the training dataset
model %>% fit(train_x, train_y, epochs = 100, batch_size = 128)
# evaluating the model on the test dataset
loss_and_metrics <- model %>% evaluate(test_x, test_y, batch_size = 128)
The above code reached a training accuracy of 99.14% and a validation accuracy of 96.89%. It ran on my i5 processor and took around 13.5 s per epoch, whereas on a TITAN X GPU the validation accuracy was 98.44%, with an average epoch taking 2 s.
For the sake of comparison, I implemented the same MNIST problem in Python too. There should not be any difference, since keras in R creates a conda environment and runs Keras in it. Still, you can find the equivalent Python code below.
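As a rough sanity check on the size of the network defined above, here is a small Python sketch that counts its trainable parameters. It uses only the layer sizes from the model (784 inputs, 784 hidden units, 10 outputs). The helper name dense_params is my own for illustration and is not part of keras. Dropout and activation layers add no parameters, so only the two dense layers count.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of one fully connected layer: weights plus biases."""
    return n_inputs * n_units + n_units


# Layer sizes taken from the model above: 784 inputs -> 784 hidden -> 10 outputs.
hidden = dense_params(784, 784)   # 784*784 weights + 784 biases
output = dense_params(784, 10)    # 784*10 weights + 10 biases
total = hidden + output

print(hidden, output, total)  # 615440 7850 623290
```

So even this "very simple" single-hidden-layer MLP has a little over 620,000 weights to learn, which helps explain the gap in epoch time between the CPU and the GPU.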
#importing the required libraries for the MLP model
import keras
from keras.models import Sequential
import numpy as np
import pandas as pd
#loading the MNIST dataset from keras
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
#flattening x_train and x_test into 784-element rows and scaling the pixel values to [0, 1]
x_train=np.reshape(x_train,(x_train.shape[0],-1))/255
x_test=np.reshape(x_test,(x_test.shape[0],-1))/255
#performing one-hot encoding on target variables for train and test
y_train=pd.get_dummies(y_train)
y_test=pd.get_dummies(y_test)
y_train=np.array(y_train)
y_test=np.array(y_test)
#defining model with one input layer[784 neurons], 1 hidden layer[784 neurons] with dropout rate 0.4 and 1 output layer [10 neurons]
model=Sequential()
from keras.layers import Dense, Dropout
model.add(Dense(784, input_dim=784, activation='relu'))
#the dropout layer has to be added to the model, otherwise it has no effect
model.add(Dropout(rate=0.4))
model.add(Dense(10,activation='softmax'))
# compiling model using adam optimiser and accuracy as metric
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
# fitting model and performing validation
model.fit(x_train,y_train,epochs=50,batch_size=128,validation_data=(x_test,y_test))
The above model achieved a validation accuracy of 98.42% on the same GPU. So, as we guessed initially, the results are essentially the same.
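Both listings perform the same two preprocessing steps before training: each image is flattened into one long vector with pixel values scaled to the range 0 to 1, and each label is turned into a one-hot vector. The sketch below shows what those steps do using plain Python lists on a tiny made-up 2x2 "image". The function names flatten_and_scale and one_hot are hypothetical helpers written for this illustration; they are not keras or pandas functions.

```python
def flatten_and_scale(image, max_value=255):
    """Flatten a 2D grid of pixels row by row and scale each value to [0, 1]."""
    return [pixel / max_value for row in image for pixel in row]


def one_hot(label, num_classes=10):
    """Return a vector of zeros with a single 1.0 at the position of the label."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


# A made-up 2x2 image standing in for a 28x28 MNIST digit.
tiny_image = [[0, 255],
              [51, 102]]

print(flatten_and_scale(tiny_image))  # [0.0, 1.0, 0.2, 0.4]
print(one_hot(3))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

For a real MNIST image the flattened vector has 28 x 28 = 784 entries, which is why the input layer is defined with 784 neurons.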
If this was your first deep learning model in R, I hope you enjoyed it. With very simple code, you were able to classify handwritten digits with 98% accuracy. This should be motivation enough to get you started with deep learning.
If you have already worked with the keras deep learning library in Python, you will find the syntax and structure of the keras library in R very similar. In fact, the keras package in R creates a conda environment and installs everything required to run keras in that environment. But I am more excited to now see data scientists building real-life deep learning models in R. As the saying goes, the competition should never stop. I would also like to hear your views on this new development for R. Feel free to comment.
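If you are wondering what the accuracy figure actually measures, here is a minimal sketch in plain Python. The softmax output layer turns the ten raw scores into probabilities, the predicted digit is the class with the highest probability, and accuracy is the share of predictions that match the true labels. The scores and labels below are made up (and use three classes instead of ten to keep it short), and the helper names are my own, not keras functions.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def accuracy(prob_rows, true_labels):
    """Share of rows whose most probable class equals the true label."""
    predicted = [row.index(max(row)) for row in prob_rows]
    correct = sum(p == t for p, t in zip(predicted, true_labels))
    return correct / len(true_labels)


# Made-up raw scores for four samples and three classes.
scores = [[2.0, 0.5, 0.1],
          [0.2, 0.1, 3.0],
          [1.0, 1.5, 0.3],
          [0.1, 0.2, 0.3]]
labels = [0, 2, 0, 2]

probs = [softmax(row) for row in scores]
print(accuracy(probs, labels))  # 0.75
```

Here three of the four predictions are right, so the accuracy is 0.75. The accuracy metric reported during training and evaluation follows the same idea over the whole dataset.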
This article came at the right time for me...!!! Thank you so much...
i am not able to install keras in r studio, triggers an error post installing dev tools? any idea how...?
Can you post the error output below so that I may help?
Thanks for the qualitative article :)
Hi NSS, Thanks a lot for this. But there is an error and does not let install any of keras or tensorflow. > devtools::install_github("rstudio/keras") Error in loadNamespace(name) : there is no package called ‘devtools’ > library(keras) Error in library(keras) : there is no package called ‘keras’ > install_tensorflow() Error: could not find function "install_tensorflow" it is not recognizing "devtools" what to do for same? not reaching upto gthub for their libraries. ankit
You need the devtools package. Run the command: install.packages("devtools")
Very helpful article, well written and explained. Also, most important thing is R is now back in competition and finally I can do Deep Learning in R much better.
Great! I definitely love R. And with this innovative breakthrough, we are in for business! Yes! Thanks so much @NSS <- You R.O.C.K.!
Hello NSS et al... I got this error/warning after installation. What does it mean? I am using a Windows 7 HP 440 CORE i3 machine. I have Anaconda 3 and Python 3.5 already installed, Does it have anything to do with it? "Installation of TensorFlow complete. Warning messages: 1: In normalizePath(path.expand(path), winslash, mustWork) : path[1]="C:\Users\efeo\AppData\Local\CONTIN~1\ANACON~2\envs\tensorflow/python.exe": The system cannot find the file specified 2: In normalizePath(path.expand(path), winslash, mustWork) : path[1]="C:\Users\efeo\AppData\Local\CONTIN~1\ANACON~2\envs\tensorflow/python.exe": The system cannot find the file specified" >
Unfortunately, this is still an open issue on GitHub. You can track its progress here and see when it is resolved: https://github.com/rstudio/htmltools/issues/61
Hi, the command install_tensorflow() (for CPU) brings this error message: Error: Installing TensorFlow requires a 64-bit version of Python 3.5 Please install 64-bit Python 3.5 to continue, supported versions include: - Anaconda Python (Recommended): https://www.continuum.io/downloads#windows - Python Software Foundation : https://www.python.org/downloads/release/python-353/ Note that if you install from Python Software Foundation you must install exactly Python 3.5 (as opposed to 3.6 or higher). All other steps worked fine devtools, keras. Thank you for your reply
Since R creates a conda instance for keras, you should first install the Anaconda distribution for your system and then try installing keras.
Hi, Thank you for this article. But I am not able to fit the model. Seems like there is something wrong with Python. Here is the debug message: Detailed traceback: File "C:\Users\zjin\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\models.py", line 835, in fit initial_epoch=initial_epoch) File "C:\Users\zjin\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\engine\training.py", line 1494, in fit initial_epoch=initial_epoch) File "C:\Users\zjin\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\engine\training.py", line 1144, in _fit_loop callbacks.on_batch_end(batch_index, batch_logs) File "C:\Users\zjin\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\callbacks.py", line 131, in on_batch_end callback.on_batch_end(batch, logs) File "C:\Users\zjin\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contr
Thx for the article! It will be very helpful! :) 1 Q: Do i need to install python to be able to install tensorflow in R? I got this message: Error: Installing TensorFlow requires a 64-bit version of Python 3.5
Yes, Keras creates a conda instance in R. So you will need a compatible python version to run Keras in R.
I omit some part. The full debug message is: Error in py_call_impl(callable, dots$args, dots$keywords) : AttributeError: 'NoneType' object has no attribute 'write' Detailed traceback: File "C:\Users\zjin\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\models.py", line 835, in fit initial_epoch=initial_epoch) File "C:\Users\zjin\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\engine\training.py", line 1494, in fit initial_epoch=initial_epoch) File "C:\Users\zjin\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\engine\training.py", line 1144, in _fit_loop callbacks.on_batch_end(batch_index, batch_logs) File "C:\Users\zjin\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\callbacks.py", line 131, in on_batch_end callback.on_batch_end(batch, logs) File "C:\Users\zjin\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contr
Hi, this was an issue because of the reticulate package. To address this, close all your R sessions and run the following command. Everything should be fine. devtools::install_github("rstudio/reticulate")
must i have GitHub installed on my laptop before i can complete this installation? i ran the first 2 installation commands and got this error. Installation failed: Problem with the SSL CA cert (path? access rights?) is there something i am doing wrong?
You probably have your curl built against two SSL certificates but a single CA key. Probably you have openssl and nss installed at the same time. Follow the steps below. Inside R, run remove.packages('curl'). Close R and open a shell window. Type the command apt-get remove libcurl4-nss-dev. Open R and then type install.packages('curl'). It should work fine now.
Great post.........Excited to see what lies ahead !!! :D
Even I am, Thank you .
Hi NSS, I have been using Anaconda distribution with Python 3.6. Now I installed devtools, keras and tensorflow packages in RStudio. When I am loading keras and reading the dataset using below commands, I am getting the below errors. library(keras) data library(keras) > data<-dataset_mnist() Error: Python module tensorflow.contrib.keras.python.keras was not found. Detected Python configuration: python: C:\Users\SXD76C~1.PAR\AppData\Local\CONTIN~1\ANACON~1/envs/r-tensorflow/python.exe libpython: C:/Users/SXD76C~1.PAR/AppData/Local/CONTIN~1/ANACON~1/envs/r-tensorflow/python35.dll pythonhome: C:\Users\SXD76C~1.PAR\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1 version: 3.5.3 |Continuum Analytics, Inc.| (default, May 15 2017, 10:43:23) [MSC v.1900 64 bit (AMD64)] Architecture: 64bit numpy: C:\Users\SXD76C~1.PAR\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\numpy numpy_version: 1.13.0 tensorflow: C:\Users\SXD76C~1.PAR\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow python versions found: C:\Users\SXD76C~1.PAR\AppData\Local\CONTIN~1\ANACON~1/envs/r-tensorflow/python.exe C:\Users\SXD76C~1.PAR\AppData\Local\CONTIN~1\ANACON~1\python.exe ------------------------ It seems issue with my Python version 3.6. How can we use Python 3.5 with Anaconda distribution? Can you please help on this one?
Try reinstalling the reticulate package. This usually happens when two Python modules conflict. This should fix the issue: devtools::install_github("rstudio/reticulate")
How do I determine the training accuracy and validation accuracy of model? I ran the code and got this in the end, trying to get the accuracy. > table(loss_and_metrics) loss_and_metrics.2 loss_and_metrics.1 0.9869 0.0931646115280099 1
Thanks NSS. its very nice that for KERAS library we can implement NN in R. Hope we can have many more libraries in R in near future.
Hello, Thanks for sharing this topic and code with us. I tried using the code as it is and ran into multiple problems inside my RStudio. First of all, I had to struggle to install tensorflow and keras on RStudio version 1.0.143. After overcoming the installation hurdles, I cut-pasted the code as it is inside RStudio. I got an error about "%>%" not being recognized. So I added library(magrittr) at the beginning. Then I got an error: Error in eval(expr, envir, enclos) : object 'model' not found This was at each mention of "model". Please help me with the code.
Hello all, I am getting below error at the 2nd last line of the code,let me know its resolution NSS: model %>% fit(train_x, train_y, epochs = 100, batch_size = 128) Error in py_call_impl(callable, dots$args, dots$keywords) : AttributeError: 'NoneType' object has no attribute 'write' Detailed traceback: File "C:\Users\DIGVIJ~1.VYA\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\models.py", line 835, in fit initial_epoch=initial_epoch) File "C:\Users\DIGVIJ~1.VYA\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\engine\training.py", line 1494, in fit initial_epoch=initial_epoch) File "C:\Users\DIGVIJ~1.VYA\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\engine\training.py", line 1144, in _fit_loop callbacks.on_batch_end(batch_index, batch_logs) File "C:\Users\DIGVIJ~1.VYA\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\keras\python\keras\callbacks.py", line 131, in on_batch_end callback.on_batch_end(batch, logs) File "C:\Users\DIGVIJ~1.VYA\AppData\Local\CONTIN~1\ANACON~1\envs\R-T
I gave the solution to this in one of the comments. Install the reticulate package again. Check the comments for the command to install reticulate.
I am getting the following error while installing tensor flow backend:- Error: Prerequisites for installing TensorFlow not available.
Please mention the OS that you are using.
I am getting the error- Error: Prerequisites for installing TensorFlow not available.
Do you have an Anaconda distribution installed on your system? If not, I would suggest installing it first. It will install all the packages required for installing TensorFlow.
Hi, After install_tensorflow() , installation gets completed. Then following lines showing error : data <- dataset_mnist() Error: Python module tensorflow.contrib.keras.python.keras was not found. Detected Python configuration: python: C:\Users\AppData\Local\CONTIN~1\ANACON~1\python.exe libpython: C:/Users/AppData/Local/CONTIN~1/ANACON~1/python27.dll pythonhome: C:\Users\AppData\Local\CONTIN~1\ANACON~1 version: 2.7.13 |Anaconda 4.3.1 (64-bit)| (default, Dec 19 2016, 13:29:36) [MSC v.1500 64 bit (AMD64)] Architecture: 64bit numpy: C:\Users\AppData\Local\CONTIN~1\ANACON~1\lib\site-packages\numpy numpy_version: 1.11.3 tensorflow: [NOT FOUND] I was wondering how to fix the error. Thanx in anticipation.
Hi Team, I am getting same error as mentioned here by Amit and Shan. Please see below. > setwd("C:\\Users\\Kaustav\\Anaconda\\envs\\r-tensorflow") > library(keras) > data<-dataset_mnist() Error: Python module tensorflow.contrib.keras.python.keras was not found. Detected Python configuration: python: C:\Users\Kaustav\Anaconda\python.exe libpython: C:/Users/Kaustav/Anaconda/python27.dll pythonhome: C:\Users\Kaustav\Anaconda version: 2.7.12 |Anaconda 2.3.0 (64-bit)| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)] Architecture: 64bit numpy: C:\Users\Kaustav\Anaconda\lib\site-packages\numpy numpy_version: 1.9.2 tensorflow: [NOT FOUND] python versions found: C:\Users\Kaustav\Anaconda\envs\R-TENS~1\python.exe C:\Users\Kaustav\Anaconda\python.exe My tensorflow is getting installed under : C:\Users\Kaustav\Anaconda\envs\r-tensorflow BUT when using data<-dataset_minst() ..... it is looking in a different location. Please can someone assist me in resolving this issue.
Even I am also getting same kind of error... Please let me know how to resolve this.
Was wondering why you divide by 255 after converting the 2D array to a 1D array?
Just to normalize the data and bring it between 0 and 1.
I am getting an error: Error in is_backend("tensorflow") : could not find function "is_backend" I cannot find the 'is_backend' R function anywhere, Please let me know how I can resolve this error. Thanks!
Hi NSS, great article! I`m researching deep learning models optimized by genetic algorithms and have to run thousands of models in a row. Do you know if it is possible do restart the python environment created by keras? After some generations the system slows down big time.
Hello, I installed keras and tensorflow successfully in my system. Now, there is problem in loading the dataset. The following is the error. > data<-dataset_mnist() Error: could not find function "dataset_mnist" Please help Thanks.
Thanks for this wonderful walk-through. However, if I need to retrieve the images as given in this practice hack 'Identify the digits', how should I go about it? I'm weak in python but have a relatively better grasp over R. Say there is a train folder with an image folder inside it having the images. There is also a csv file in the train folder which contains the train set labels. Additionally, I also have a test set csv. I'm finding it hard to read the files and prepare it into train_x, train_y, test_x and test_y. Can someone tell me how to go about it? If not in codes, can someone point me to any article or blog. PS: I have gone through almost all blogs in AV. I love AV for that reason. Thanks in advance