JalFaizy Shaikh — Published On November 7, 2016 and Last Modified On July 5th, 2020


Art has transcended eons of human existence. We can trace it from prehistoric times, such as the Harappan art of the Indus Valley Civilization, to the contemporary art of modern times. Mostly, art has been a means to express one's creativity and one's viewpoint of how we perceive the world. As the legendary Leonardo da Vinci said,

“Painting is poetry that is seen rather than felt”.

What we sometimes forget is that most art follows a pattern. A pattern that pleases us and makes sense to our brain. The next time you see a painting, try to notice the brush strokes in it. You will see a pattern arising out of the painting. We humans are skilled at recognising these patterns. Over the years, our neural mechanisms have developed to be exceptionally good at recognising patterns in the wild.

Now you may ask why I am going on about art and patterns. It is because I will show you how to create art with the help of artificial brains! In this article, we will build an artificial neural network that extracts style from one image and replicates it on another. So are you ready?


Table of Contents

  • What is Neural Art?
  • Getting into the brain of an artificial artist
  • Coding it up
  • Where to go from here?
  • Additional Resources


What is Neural Art?

Let’s try to understand this topic with an example.

Source [1]

The above image is the famous “The Starry Night” by Vincent van Gogh. Just look at the painting for a few minutes. What do you see? Do you notice the brush strokes? Do you see the curves and edges that define each and every object, making it so easy for you to recognise them?

Now let’s do a quick assignment. Try to remember the patterns you see. Just cram your brain with every little detail. Done? Ok now take a look at the next image.

Source [2]

This is a photograph of a town called Tübingen, located in Germany. For the next step of the assignment, close your eyes and try to replicate the style of “The Starry Night” onto this image. Ask yourself: if you were Van Gogh (hypothetically, of course!) and were asked to draw this photograph keeping in mind the styles you memorized before, how would you do it?






Did you do it? Great! You just made a neural art!




Want to see what an artificial neural network can do?

Source [2]

You may ask how a machine accomplished such a task. It's simple once you get the gist of it!

What the neural network does is try to extract the “important points” from both images; that is, it recognizes which attributes define each picture and learns from them. These learned attributes form an internal representation of the neural network, which can be seen below.

Source [2]

Now that you know the theoretical concepts involved in neural art, let's get to know the practical aspects of implementing it.


Getting into the brain of an artificial artist

Neural Art works in the following way:

  • We first define the loss functions necessary to generate our result, namely the style loss, the content loss, and the total variation loss.
  • We define our optimization function, i.e. the backpropagation algorithm. Here we use L-BFGS because it is faster and more efficient for smaller data.
  • Then we set the style and content attributes of our model.
  • Then we pass an image (preferably our base image) to our model and optimize it to minimize all the losses defined above.

Before we jump in, let's go over some important points you ought to know. While most of the fundamentals of neural networks are covered in this article, I will reiterate some of them and explain a few extra things.

  • What is a loss function? A loss function calculates the difference between the predicted values and the true values. It essentially tells you how much error has occurred in a calculation. In any machine learning algorithm, the loss function is used to estimate how the model performs on data. This is especially helpful in the case of neural networks, where you iteratively try to make your model perform better. When implementing neural art, you have to keep track of three loss functions, namely:
    • Content loss, i.e. the difference between the “content” of the resulting image and the base image. This ensures that your model does not deviate too much from the base image.
    • Style loss, i.e. the difference between the “style” of the resulting image and the reference image. To compute it, you first calculate the gram matrix of each image's features and then find their difference. The gram matrix is essentially the covariance of an image's feature maps with themselves. This is done to transfer the reference style to the resulting image.
    • Total variation loss, i.e. the difference between neighbouring pixels of the resulting image. This is done so that the image remains visually coherent.
  • What is an optimization function? Once we have calculated the loss, we try to minimize it by changing the model's parameters. An optimization function tells us how much change is required so that our model is better “optimized”. Here we use the Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS), a quasi-Newton method that uses approximate second-order derivative information to find a local minimum; in practice, we use its limited-memory variant, L-BFGS. Read this article to get a mathematical perspective on the algorithm.
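The three losses above can be sketched in plain NumPy. This is an illustrative sketch, not the exact code from the notebook: the feature maps are stand-in arrays, and the scaling constant in the style loss follows the convention of the original paper.

```python
# Sketch of the three neural-art losses, computed here on stand-in
# (channels, height, width) arrays rather than real VGG feature maps.
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (channels, height, width) feature map: the inner
    products between every pair of flattened channel activations."""
    channels = features.shape[0]
    flat = features.reshape(channels, -1)   # flatten each channel
    return flat @ flat.T

def content_loss(base_features, result_features):
    """Squared difference between base and result features, keeping the
    result close to the base image's content."""
    return np.sum((result_features - base_features) ** 2)

def style_loss(style_features, result_features):
    """Squared difference between gram matrices, transferring the channel
    correlations (the 'style') of the reference image."""
    S = gram_matrix(style_features)
    R = gram_matrix(result_features)
    channels = style_features.shape[0]
    size = style_features.shape[1] * style_features.shape[2]
    return np.sum((S - R) ** 2) / (4.0 * channels**2 * size**2)

def total_variation_loss(image):
    """Squared difference between neighbouring pixels, encouraging a
    visually coherent (smooth) result."""
    dh = image[:, 1:, :] - image[:, :-1, :]   # vertical neighbours
    dw = image[:, :, 1:] - image[:, :, :-1]   # horizontal neighbours
    return np.sum(dh ** 2) + np.sum(dw ** 2)
```

In the actual implementation these quantities are computed symbolically on the network's feature maps, but the arithmetic is the same.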

Now that we’ve understood what our flow will be to build a neural art, let’s get down and start hacking stuff!


Coding it up!

This Diwali was an interesting one for me. I decided to do some research on neural art and on how India illuminates during Diwali. I came across the image “India on Diwali night” and thought of creating something along the same lines. To do that, we will combine the two images below with the help of neural art.

reference_image base_image

Source [3]

First, we will lay the groundwork.

Step 0: Install Keras and its dependencies. For this, we will be using a Theano backend. Change your backend by following the steps mentioned here. Additionally, you have to set the proper dimension ordering for images. In the keras.json file, where you changed the backend, set image_dim_ordering to ‘th’. It should look like this,

"image_dim_ordering": "th"

Step 1: Then go to your working directory and set your directory structure as below

|-- keras_NeuralStyle                 # this is your working directory
|   |-- base_image.jpg                # this is your base image
|   |-- reference_image.jpg           # this is your reference image

Step 2: Start a Jupyter notebook in your working directory by typing jupyter notebook and implement the following code. I will provide a step-by-step overview of what each block does.

  • First, you have to import all the modules necessary to implement the code


  • Then set the paths of the images you want to carry out the project on.


  • Define the necessary variables and give them values as below. Note that these values can be changed, but that may change the output drastically. Also, make sure the value of the img_nrows variable is the same as img_ncols. This is necessary for the gram matrix computation to work.
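As a rough illustration, the variable block might look like the following. The specific weight values here are assumptions drawn from common style-transfer settings, not necessarily the ones in the notebook:

```python
# Illustrative values only -- they are tunable, and changing them can
# alter the output drastically.
total_variation_weight = 1.0    # weight of the total variation loss
style_weight = 1.0              # weight of the style loss
content_weight = 0.025          # weight of the content loss

img_nrows = 400                 # output image height
img_ncols = 400                 # output image width; must equal img_nrows
                                # for the gram matrix computation to work
```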



  • Then we define helper functions. These are responsible for image preprocessing.
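A minimal sketch of what such helpers typically do for a VGG-style network: subtract the ImageNet channel means, reorder RGB to BGR, and move the channel axis first to match the ‘th’ dimension ordering. The exact helpers in the notebook may differ; these are stated assumptions:

```python
# Sketch of VGG-style preprocessing helpers ('th' dimension ordering).
import numpy as np

IMAGENET_MEANS = np.array([103.939, 116.779, 123.68])  # BGR channel means

def preprocess_image(img):
    """(rows, cols, 3) RGB image -> (1, 3, rows, cols) float batch."""
    x = img.astype('float64')[:, :, ::-1]   # RGB -> BGR
    x -= IMAGENET_MEANS                     # centre each channel
    x = x.transpose(2, 0, 1)                # move channels first
    return x[np.newaxis, ...]               # add batch axis

def deprocess_image(x):
    """Inverse of preprocess_image, back to a displayable RGB image."""
    img = x[0].transpose(1, 2, 0).copy()    # channels last
    img += IMAGENET_MEANS                   # undo the centring
    img = img[:, :, ::-1]                   # BGR -> RGB
    return np.clip(img, 0, 255).astype('uint8')
```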



  • Create input placeholders to pass images to the model



  • Load a pre-trained neural network model (If you don’t know what pre-training is, go through this discussion)



  • Print the model summary to see what the model is



  • Store the names of all the layers of the neural network as a dictionary along with their outputs



  • As defined above, we set the loss functions



  • We then set the content and style attributes …



  • And set the gradients and final output function for neural art



  • We define the functions to calculate loss and gradients

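The caching pattern behind these loss/gradient functions can be sketched as follows. The toy eval_loss_and_grads below is a stand-in for the real function that runs the network, but the Evaluator class mirrors the standard approach: compute loss and gradients in a single pass, then serve them to scipy through two separate callbacks.

```python
# Sketch of the Evaluator pattern used with scipy's L-BFGS interface.
import numpy as np

def eval_loss_and_grads(x):
    """Stand-in for the network pass: a simple quadratic loss."""
    loss = float(np.sum(x ** 2))
    grads = 2.0 * x
    return loss, grads

class Evaluator(object):
    """Caches the gradients computed alongside the loss, so that scipy's
    separate loss/grads callbacks trigger only one evaluation."""
    def __init__(self):
        self.loss_value = None
        self.grads_values = None

    def loss(self, x):
        loss_value, grad_values = eval_loss_and_grads(x)
        self.loss_value = loss_value
        self.grads_values = grad_values
        return self.loss_value

    def grads(self, x):
        grad_values = np.copy(self.grads_values)
        self.loss_value = None      # reset the cache for the next round
        self.grads_values = None
        return grad_values

evaluator = Evaluator()
```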


  • Now we take the base image as input and iterate on it to get our final image. On my local machine, one iteration takes about a minute. Depending on your resources (and patience), it should take at most 5 minutes to get the output. You can also increase the number of iterations to further refine the result.
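The shape of that iteration loop can be demonstrated with scipy.optimize.fmin_l_bfgs_b on a stand-in quadratic loss; the real loop feeds in the network's loss and gradient functions instead, and the target below is a hypothetical stand-in:

```python
# Toy version of the optimization loop: scipy's L-BFGS routine repeatedly
# refines a flattened "image" to minimize a loss. The quadratic loss here
# is a stand-in for the real style/content/variation losses.
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

target = np.linspace(0.0, 1.0, 12)   # stand-in for the "ideal" pixels

def loss(x):
    return float(np.sum((x - target) ** 2))

def grads(x):
    return 2.0 * (x - target)

x = np.random.rand(12)               # start from a random "image"
for i in range(5):                   # a handful of L-BFGS rounds
    x, min_val, info = fmin_l_bfgs_b(loss, x.flatten(),
                                     fprime=grads, maxfun=20)
    print('Iteration %d, loss: %f' % (i, min_val))
```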


  • And after a long wait, we will get this beautiful image!



NOTE: The code file can be viewed on github here.


Where to go from here?

We have seen a small demo of a significant development in the art world. Many modifications have been made to this method to make the results even more aesthetically pleasing. For example, I really like this implementation, in which different styles are applied to different regions of the image.


The first two images are the masks, which help to set which part should be stylized. The next two images represent the styles to be used. The last image is the base image that has to be stylized.

Below is the output that is generated by neural art.


Looks awesome, doesn't it? I am sure that, like me, you are fascinated to try your hand at neural art. To help you get started, I have covered the basics of neural art and how you can create your first image. I am sure you are eager to explore more, so I am adding some additional resources just for you.


Additional Resources

These are some of the best resources I have come across on neural art. Go ahead and enter the fascinating world of neural art.




Image Sources

[1] https://www.wikiart.org/en/vincent-van-gogh/the-starry-night-1889

[2] https://arxiv.org/abs/1508.06576

[3] Google


End Notes

I hope you found this article inspiring. Now it's time for you to go through it and make art yourself! If you create some art, do share it with the community. If you have any doubts, I'd love to interact with you in the comments. And to gain expertise in working with neural networks, don't forget to try out our deep learning practice problem – Identify the Digits.

You can test your skills and knowledge. Check out Live Competitions and compete with the best data scientists from all over the world.

About the Author

JalFaizy Shaikh
JalFaizy Shaikh

Faizan is a Data Science enthusiast and a Deep learning rookie. A recent Comp. Sc. undergrad, he aims to utilize his skills to push the boundaries of AI research.


35 thoughts on "Creating an artificial artist: Color your photos using Neural Networks"

Jack says: November 07, 2016 at 9:24 am
We generally use backpropagation to train a neural network to better estimate the weights during the training phase, but here the usage is much different as in the model used is already trained, if that is the case why do we need a loss function and why do we need backpropagation? Reply
Faizan Shaikh
Faizan Shaikh says: November 07, 2016 at 10:59 am
Hi Jack, Pre-training doesn't necessarily mean that the model is trained on the "intended" dataset. It may be trained on another dataset and the knowledge can be transferred to a new one (refer to this discussion https://discuss.analyticsvidhya.com/t/pre-trained-deep-learning/11003/2?u=jalfaizy ). The model we've loaded here is trained on the ImageNet dataset, and our motive for using it is as a fine-tuned feature extractor. That's why we need a loss function and that's why we're optimizing it with backprop. I intend to write an article explaining pre-training & fine-tuning in the future. Do check it out. Reply
Chiuyee Lau
Chiuyee Lau says: November 08, 2016 at 1:41 am
Just want to make sure when training the neural networks, the base image is the input and the reference image is the output. the blended image is actually the intermediate one in the cnn? Thanks! Reply
Chiuyee Lau
Chiuyee Lau says: November 08, 2016 at 1:46 am
Generally my question is what is the training image and what is the target image in this case? Are they both the base image? Reply
Shivam Adarsh
Shivam Adarsh says: November 08, 2016 at 7:28 am
How to choose style weight and content weight? If we are taking the base image as a face, do we need to increase the content weight and decrease style weight? Also what is the range of these weights? Reply
Faizan Shaikh
Faizan Shaikh says: November 08, 2016 at 7:34 am
Hi Chiuyee, both base image and reference image are inputs, and the blended image produced by the CNN is the output. Reply
Faizan Shaikh
Faizan Shaikh says: November 08, 2016 at 7:44 am
In neural art, you are basically trying to extract attributes from both the base image and the reference image, so that the resulting image has some of both but is not exactly either of them. So you can say that both the base image and the reference image are training images, because you use them when optimizing the losses. Also, unlike normal machine learning problems, you don't have a concrete "target". The output depends on what kind of blend you want. Try changing the initialized weights in block [3] (i.e. style_weight etc.) and see for yourself. Reply
Faizan Shaikh
Faizan Shaikh says: November 08, 2016 at 9:10 am
Hi Shivam, Choice of style and content weights depend upon the artistic style you want to produce. If we take an example of a face, it would be better to have a high ( content / style ) ratio because you don't want the face to be much distorted. The authors of original paper did a good survey of various ( content / style ) ratio. According to the original paper, "We can therefore smoothly regulate the emphasis on either reconstructing the content or the style (Fig 3, along the columns). A strong emphasis on style will result in images that match the appearance of the artwork, effectively giving a texturised version of it, but hardly show any of the photograph’s content (Fig 3, first column). When placing strong emphasis on content, one can clearly identify the photograph, but the style of the painting is not as well-matched (Fig 3, last column). For a specific pair of source images one can adjust the trade-off between content and style to create visually appealing images." As far as range of weights is considered, the paper mentions that it should be a non-zero number. I would suggest you to experiment it on your end and share the findings for the community. Reply
Shaby says: November 08, 2016 at 3:11 pm
For a noob like me, this looks awesome! Thank you very much for sharing. Thumbs up :-) Reply
Faizan Shaikh
Faizan Shaikh says: November 08, 2016 at 4:08 pm
Thanks! Reply
Chiuyee Lau
Chiuyee Lau says: November 08, 2016 at 4:49 pm
I still do not quite understand. If there is no target, how can you apply the back propagation and update weights? Reply
prakash says: November 08, 2016 at 5:38 pm
this is awesome.. can be this done as a project.? Reply
Faizan Shaikh
Faizan Shaikh says: November 08, 2016 at 8:44 pm
Yes you can! Reply
Faizan Shaikh
Faizan Shaikh says: November 08, 2016 at 8:49 pm
I did not say that you don't have a target, I said you don't have a "concrete" target, i.e. its not defined clearly which image should be "better" artistically. I would recommend you to go through the research paper ( https://arxiv.org/abs/1508.06576 ) If you still have a doubt, ask it in discussion portal. I'm sure someone in the community would help you. Reply
Chiuyee Lau
Chiuyee Lau says: November 08, 2016 at 8:54 pm
I have read that paper. It says you can start from a white noise image and the target is the base image. The reference image is used to divert the image to be close to the reference one. Some other paper also suggests you can also start from the base image. Reply
Faizan Shaikh
Faizan Shaikh says: November 09, 2016 at 8:39 am
Yes you are right. Starting with the base image would converge faster than random noise, so we've used it here. Also, if you see; targets of a neural network depend on what loss function you've defined. Here we've defined three, which each of them affects our model in a specific way. You're in a way saying to the network, "the output of this layer should closely resemble this image" Reply
prakash says: November 09, 2016 at 9:55 am
thanks for the reply . since we are using already trained weights will that count on the project.?.or the model completely trained by us counts..? Reply
Fateh says: November 09, 2016 at 3:55 pm
Hi Faizan Can you please add a file with the above code so that we can test it out on our own machines.Presently the code is in the form of images and cannot be copied. Thanks Reply
Faizan Shaikh
Faizan Shaikh says: November 09, 2016 at 4:54 pm
Hi Fateh, sorry for the wait. Here's the link to the code on github ( https://github.com/faizankshaikh/Random_Projects/tree/master/keras_NeuralStyle ) Reply
Fateh says: November 09, 2016 at 5:05 pm
Thanks a lot for a great article and the code Reply
Faizan Shaikh
Faizan Shaikh says: November 09, 2016 at 6:00 pm
You are welcome :) Reply
Faizan Shaikh
Faizan Shaikh says: November 09, 2016 at 6:08 pm
As I've said in the comments above, pre-training doesn't necessarily mean that the model is trained on the "intended" dataset. So yes, you can use them for your project. I intend to write an article explaining pre-training & fine-tuning in the future. Do check it out. Reply
bikram kachari
bikram kachari says: November 11, 2016 at 1:12 pm
I am getting the following error - "ValueError: all the input array dimensions except for the concatenation axis must match exactly " the stacktrace points to the line input_tensor = K.concatenate([base_image, ref_image, final_image], axis=0) Reply
Faizan Shaikh
Faizan Shaikh says: November 12, 2016 at 8:16 am
Hi bikram, is there anything that you changed in the code? because it works fine for me Reply
Alvydas Vitkauskas
Alvydas Vitkauskas says: November 13, 2016 at 7:30 pm
I had the same issue. Inserting the following line at the start of cell 5 solved it. K.set_image_dim_ordering('th') Reply
Faizan says: November 14, 2016 at 8:07 am
Thanks for posting the solution Alvydas! In the code above, I assume that keras is using theano as backend. So the image ordering would follow theano protocols Reply
Alvydas Vitkauskas
Alvydas Vitkauskas says: November 14, 2016 at 10:04 am
Yes, results of cell 1 in my environment said "Using Theano backend", but dim_ordering was still "tf" without setting it explicitly to "th". Reply
Steven Ross
Steven Ross says: November 16, 2016 at 10:10 pm
Can I do this type of visualization in R? is There a R package for that? Reply
Faizan Shaikh
Faizan Shaikh says: November 17, 2016 at 6:44 am
I have not searched extensively but I haven't found a similar implementation in R. It will surely be a good project to do this in R Reply
avaneesh Kumar
avaneesh Kumar says: November 19, 2016 at 7:47 pm
Please revert to my problem, I am unable to solve it. Reply
Faizan Shaikh
Faizan Shaikh says: November 21, 2016 at 5:41 am
Could you describe in detail what is your problem? Reply
Nisheeth Golakiya
Nisheeth Golakiya says: November 26, 2016 at 4:14 pm
I am getting 'ValueError - all the input array dimensions except for the concatenation axis must match exactly' on line 6(...fprime=evaluator.grads, maxfun=20) in cell 17. Reply
Faizan Shaikh
Faizan Shaikh says: November 26, 2016 at 4:20 pm
I've updated the steps according to your feedback. Thanks! Reply
Faizan Shaikh
Faizan Shaikh says: November 26, 2016 at 4:21 pm
Hi Nisheeth, Did you change the dimension ordering as explained in step 0? ("image_dim_ordering": "th") Reply
Nisheeth Golakiya
Nisheeth Golakiya says: November 26, 2016 at 4:37 pm
Sorry, my bad. Reply
