How to start a Deep Learning project?

Images generated from PaintsChainer.

In a 6-part series, I will explain the whole journey from starting to finishing a deep learning (DL) project. We will use an automatic Manga colorization project we did to illustrate the deep learning design, debugging and tuning process.

The whole series for “How to start a Deep Learning project?” consists of six parts:

· Part 1: Start a Deep Learning project.
· Part 2: Build a Deep Learning dataset.
· Part 3: Deep Learning designs.
· Part 4: Visualize Deep Network models and metrics.
· Part 5: Debug a Deep Learning Network.
· Part 6: Improve Deep Learning Models performance & network tuning.

Part 1: Start a Deep Learning project

What projects to pick?

Many AI projects are not that serious and are pretty fun. In early 2017, as part of my research on Generative Adversarial Networks (GANs), I started a project to colorize Japanese Manga. The problem is difficult, but it was fascinating, in particular because I cannot draw! When looking for projects, look beyond incremental improvements: make a product that is marketable, or create a new model that learns faster and better.

Debugging a Deep Network (DN) is tough

Deep Learning (DL) training consists of millions of iterations to build a model. Locating bugs is hard, and the training breaks easily. Start with something simple and make changes incrementally. Model optimizations like regularization can always wait until after the code is debugged. Visualize your predictions and model metrics frequently. Make something work first so you have a baseline to fall back on. Do not get stuck in a big model.

Start small, move small.
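
To make "start small" concrete, here is a minimal sketch of a baseline run. It is not code from the Manga project: the four-point toy dataset, the one-parameter-pair linear model, and the function name train_tiny_baseline are all assumptions chosen so the example runs with the Python standard library alone. The point is the habit, not the model: get the simplest thing training, log the loss regularly, and confirm it falls before adding anything else.

```python
import random


def train_tiny_baseline(steps=500, lr=0.1, log_every=100, seed=0):
    """Fit y = w*x + b to four points with plain gradient descent.

    Returns the learned (w, b) and the list of per-step losses.
    """
    rng = random.Random(seed)
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]  # toy data generated from y = 2x + 1
    w, b = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    n = len(xs)
    losses = []
    for step in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n  # mean squared error
        losses.append(loss)
        if step % log_every == 0:
            # Log the metric regularly instead of waiting for the end.
            print(f"step {step:4d}  loss {loss:.6f}")
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


w, b, losses = train_tiny_baseline()
# If the loss on a toy problem does not fall, debug before scaling up.
assert losses[-1] < losses[0]
```

Once a run like this behaves, it becomes the baseline to fall back on: make one change at a time and compare the new loss curve against it.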

Measure and learn

Plan big, fail big. Most personal projects last from two to four months for the first release. That is pretty short, since research, debugging and experiments take time. We schedule complex experiments to run overnight. By early morning, we want enough information to make our next move. As a good rule of thumb, those experiments should not run longer than 12 hours in the early phase. To achieve that, we narrowed the scope to single Anime characters. In hindsight, we should have reduced the scope further. We have many design tests and we need to turn around fast. Analyze where your models fail. Don’t plan too far ahead. We want to measure and learn fast.

Build, measure and learn.
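
One way to apply the 12-hour rule of thumb is to time a few iterations first and work out how many fit in the budget. The sketch below is an illustration only: the function name iterations_within_budget and the 0.5 seconds per iteration figure are assumptions, not measurements from the Manga project.

```python
def iterations_within_budget(seconds_per_iteration, budget_hours=12.0):
    """Return how many whole training iterations fit in the time budget."""
    if seconds_per_iteration <= 0:
        raise ValueError("seconds_per_iteration must be positive")
    return int(budget_hours * 3600 / seconds_per_iteration)


# Hypothetical timing: 0.5 s per iteration, measured on a short trial run.
print(iterations_within_budget(0.5))  # 86400
```

If the experiment you have in mind needs far more iterations than this, that is a signal to shrink the dataset, the model, or the scope before launching the overnight run.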

Research vs. product

When we started the Manga project in the spring of 2017, Kevin Frans had a Deepcolor project to colorize Manga with spatial color hints using a GAN.

Source: Kevin Frans

Many AI fields are pretty competitive. When defining the goal, you want to push hard enough so the project is still relevant when it is done. GAN models are pretty complex, and their quality was usually not product-ready in early 2017. Nevertheless, if you smartly narrow what the product can handle, you may push the quality high enough for a commercial product. To achieve that, select your training samples carefully. For any DL project, strike a good balance among model generalization, capacity, and accuracy.


Training real-life models with a GPU is a must. It is 20 to 100 times faster than a CPU. The lowest-priced Amazon GPU option, a p2.xlarge spot instance, costs about $7.5/day, and the price goes up to $75/day for an instance with 8 GPUs. Training models can be expensive, considering that Google can spend a whole week training NLP models using one thousand servers. In our Manga project, some experiments took over 2 days. We spent an average of $150/week. For faster iterations, faster instances can run the bill up to $1500/week. Instead of using cloud computing, you can purchase a standalone machine. A desktop with an Nvidia GeForce GTX 1080 Ti cost about $2200 in Feb 2018. It is about 5x faster than a P2 instance in training a fine-tuned VGG model.
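
The cloud-versus-desktop trade-off can be checked with some quick arithmetic on the figures quoted above. This is a rough sketch, not a full cost model: the function name breakeven_weeks is made up for illustration, and the calculation ignores electricity, resale value, and the speed difference between the machines.

```python
# Figures quoted in the text (early 2018 prices).
P2_SPOT_PER_DAY = 7.5         # cheapest GPU spot instance, $/day
CLOUD_SPEND_TYPICAL = 150.0   # our average spend, $/week
CLOUD_SPEND_FAST = 1500.0     # spend with faster instances, $/week
DESKTOP_COST = 2200.0         # desktop with a GTX 1080 Ti, $


def breakeven_weeks(hardware_cost, cloud_spend_per_week):
    """Weeks of cloud spending that add up to the hardware purchase price."""
    if cloud_spend_per_week <= 0:
        raise ValueError("cloud_spend_per_week must be positive")
    return hardware_cost / cloud_spend_per_week


print(f"spot price per hour: ${P2_SPOT_PER_DAY / 24:.4f}")  # $0.3125
print(f"{breakeven_weeks(DESKTOP_COST, CLOUD_SPEND_TYPICAL):.1f} weeks")  # 14.7 weeks
print(f"{breakeven_weeks(DESKTOP_COST, CLOUD_SPEND_FAST):.1f} weeks")     # 1.5 weeks
```

At the typical spend, the desktop pays for itself in roughly three and a half months, which is about the length of one personal project; at the faster-instance spend, it pays for itself in under two weeks.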

Timeline

We define our development in four phases, with the last three phases executed in multiple iterations.

• Project research
• Design
• Implementation and debugging
• Experiment and tuning

Project research

We research current offerings to explore their weaknesses. Many GAN-type solutions utilize spatial color hints. The drawings are a little washed out or muddy. The colors sometimes bleed. We set a 2-month timeframe for our project with 2 top priorities: generate color without hints and improve color fidelity. Our goal is:

Color a grayscale Manga drawing without hints on single Anime characters.

Standing on the shoulders of giants

Then we study related research and open source projects. Spend a good amount of time doing research. Gain intuition about where existing models are flawed or perform well. Many people go through at least a few dozen papers and projects before starting their implementations. For example, when we dug deep into GANs, we found over a dozen new GAN models: DRAGAN, cGAN, LSGAN, etc. Reading research papers can be painful. Skim through the paper quickly to grab the core ideas. Pay attention to the figures. After knowing what is important, read the paper again.

Deep learning (DL) code is condensed but difficult to troubleshoot. Research papers often omit details. Many projects start off with open source implementations that show success on similar problems. Search hard. Try a few options. We located code implementations of different GAN variants. We traced the code and gave them a few test drives. We replaced the generative network of one implementation with an image encoder and a decoder. As a special bonus, we found the hyperparameters already in place were pretty decent. Otherwise, searching for the initial hyperparameters can be tedious when the code is still buggy.

Part 2

Have fun and be innovative. There are plenty of cool applications of deep learning waiting for you. Feel free to leave comments on nice AI projects you find.

In Part 1, we talked about the general principles of a Deep Learning project. Except when using academic datasets, the effort to build a dataset is usually overlooked and underestimated. In Part 2: Build a Deep Learning dataset, we discuss how to build a dataset for a better model.
