Friday, May 5, 2017

Handwriting Recognition with Deep Learning

In this post, we will go over an example to apply deep learning, in particular convolutional neural network, to recognize handwritten numbers from 0 to 9 using TensorFlow backend Keras. If you don't have deep learning environment setup yet, checkout this post.

Thankfully, there is publicly available handwritten digits and its labels on the web that we can use. For more info, check out MNIST website. Even better, Keras has MNIST data module, so we will use  it.

Use your favorite editor and create mnist.py file with the following:

Upon running the file with
$ python mnist.py

you will see convolutional network being trained with data and tested with unseen data with accuracy of about 99%!

Thursday, May 4, 2017

Setup Deep Learning Development Environment in Ubuntu

In this tutorial, I will go through step by step instructions to setup deep learning development environment for Ubuntu. We will install the following python packages on fresh Ubuntu 16.04:
opencv
tensorflow
keras
matplotlib
numpy
scipy
sklearn
tk

Let's dig it! First, we need to install pip, which helps us install these python packages with ease.
$ sudo apt-get install python-pip -y

You may want to upgrade pip to the latest version:
$ pip install --upgrade pip

Next, let's install python packages:
$ sudo -H pip install tensorflow keras numpy scipy matplotlib sklearn

For OpenCV and TK, we need to install it from apt-get:
$ sudo apt-get install libopencv-dev python-opencv python-tk -y

That's it! Now you are ready to develop your neural network with tensorflow backend keras! If you want to test out if your environment is successfully setup, check out this post.

Tuesday, May 2, 2017

Displaying Exact Values for Digital Color Meter on Mac OS X

Mac OS X's built-in app Digital Color Meter is an extremely useful tool for me, and I love it. However, the only problem is that it is not exact.

I manually created an image with RGB values of arbitrary numbers, say (38, 205, 86) using OpenCV. I then opened up the image with Preview, and measured the color with Digital Color Meter. Unfortunately, its output was close but not exact.

I also noticed that there is an drop down menu, such as
Display in native values
Display in P3
...

After playing around with all of these options, I am very happy to report that the option Display in sRGB produces exact match with the pixel!


Saturday, April 22, 2017

Hosting Multiple Websites on a Single IP Address

It is not too difficult to host multiple websites on a single server and a single IP address. Here is how to do it with Apache server, based on article1, article2, and article3. I am assuming that you already have a server setup with apache2. If not, please refer to this tutorial for instructions.

The first step is to create virtual hosts where each virtual host serves each different website. For example, assume you want to serve domain names: hi.com and hello.com. You will need to create two virtual hosts, so that one will serve hi.com while the other will serve hello.com.

On Ubuntu or Debian, apache default server configuration files are in /etc/apache2 directory, while web server directory is /var/www.

First, copy the default configuration file and create two virtual host config files:
$ sudo cp /etc/apache2/sites-available/000-default.conf /etc/apache2/sites-available/hi.conf
$ sudo cp /etc/apache2/sites-available/000-default.conf /etc/apache2/sites-available/hello.conf

Next, edit /etc/apache2/sites-available/hi.conf similar to below:
 <VirtualHost *:80>
    ServerName hi.com
    ServerAlias www.hi.com
    DocumentRoot /var/www/hi
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>


Note that you must create /var/www/hi directory that contains files to serve for hi.com.

Similarly, edit /etc/apache2/sites-available/hello.conf in the same manner.
 <VirtualHost *:80>
    ServerName hello.com
    ServerAlias www.hello.com
    DocumentRoot /var/www/hello
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>


Again, you will need to create /var/www/hello directory that will serve visitors to hello.com.

Next, enable the new virtual host configuration files:
$ sudo a2ensite hi.conf 
$ sudo a2ensite hello.conf 

Next, reload apache2 so that the change takes effect
$ sudo service apache2 reload

If you want to test these out, refer to this excellent article for more details.


These steps up to here will complete the setup for the server side. Now, it is time to setup your domain name configurations.

To direct any visitor who enters hi.com or hello.com to your virtual hosts, you will need to add A Record. Take a look at this article for more details.

Essentially, create A Record for hi.com and hello.com to direct to your server's IP address, and apache server will then take care of directing visitors of hi.com to your hi.com virtual host, and visitors of hello.com to the virtual host of hello.com that you have set up above.

*** Note: make sure not to enable forwarding of your domain name to your server. That was my first attempt, and it did not work. You will need to set up A Record instead in order for your apache server to point to appropriate virtual host.

Tuesday, April 18, 2017

Local Thresholding from Zxing Library (C++)

In this tutorial, I will go over how to call local threshold method in zxing library, ported to C++ (zxing-cpp) instead of Java. If you want to implement it in Java, refer to this tutorial. I will assume that OpenCV library is installed on the system.

First, one needs to download zxing-cpp
$ git clone https://github.com/glassechidna/zxing-cpp.git
$ mkdir build && cd build

To integrate zxing-cpp with system's OpenCV, one needs to first search for OpenCVConfig.cmake file:
$ find / -iname opencvconfig.cmake 2>/dev/null
/usr/local/Cellar/opencv3/3.2.0/share/OpenCV/OpenCVConfig.cmake

In my case, I installed OpenCV from Homebrew, and its config cmake file is in /usr/local/Cellar/opencv3/3.2.0/share/OpenCV/ directory. Export OpenCV_DIR environment variable to point to this path:
$ export OpenCV_DIR='/usr/local/Cellar/opencv3/3.2.0/share/OpenCV/'

Now, we are ready to compile and install:
$ cmake -G "Unix Makefiles" ..
$ make -j2
$ sudo make install

Next, change directory to your working directory and create zxing.cpp source file as below:

To compile, we first need to copy MatSource.h into zxing include directory. I am assuming that you have cloned the zxing-cpp repository into ~/zxing-cpp-master.
$ sudo cp ~/zxing-cpp-master/opencv/src/zxing/MatSource.h /usr/local/include/zxing/
where /usr/local/include/zxing is the folder that contains zxing-cpp library's header files.

We also need to locate where zxing-cpp library file is installed. You can look it up from the output after sudo make install command above, or search the file as shown below.
$ find / -name libzxing.a 2>/dev/null
/usr/local/lib/libzxing.a

Finally, we are ready to compile.
$ g++ zxing.cpp `pkg-config --libs opencv` -lzxing -L/usr/local/lib/

Make sure to replace /usr/local/lib/ above after -L option with the directory where libzxing.a file is installed in your system.

To execute, run the following:
$ ./a.out some_image.jpg

You should see local threshold binary image of your input image file.

Monday, April 17, 2017

Local Thresholding from Zxing Library (Java)

Although OpenCV has its own local threshold method, such as adaptiveThreshold, I was looking for something a bit sophisticated and better. After some search, I came across Zxing's HybridBinarizer class that does the job much better than simple adaptiveThreshold from OpenCV. So, below is a very rough code to make use of this excellent library in Java. If you want to do this in C++, refer to this tutorial.



I admit that the code above is a bit messy, but it is just for testing out to see this actually works. In writing the code, I would like to acknowledge the following especially helpful references.

To compile the file and run it, first, create Test.java with the code above. Then, download Zxing's core library file (most recent version at the time of writing this is core-3.3.0.jar) form here. Then, run the following, assuming the jar file is in the same directory.

$ javac Test.java -cp ".:core-3.3.0.jar"
$ java -cp ".:core-3.3.0.jar" Test image_to_test.jpg

This will output the local thresholded image binary_output.jpg to the same folder. If you want to know more about java and javac -cp option, refer to this tutorial.

Sunday, April 16, 2017

Java Compilation with Package and Jar Dependencies

In this tutorial, I will go over Java package and dependencies basics.

Let's go over the simplest java code with package declaration. Let's assume we are working on ~/java_package directory. Create Hello.java file below in the current directory:


To compile and execute this, we will first create a directory whose name is equal to the package name. In this case, the directory name should be pkgtest. So, create this directory and move Hello.java file into this. Run all of the following from ~/java_package directory:
$ mkdir pkgtest
$ mv Hello.java pkgtest/

Next, we need to compile. This is very simple.
$ javac pkgtest/Hello.java

Finally, we need to run it.
$ java pkgtest/Hello
hello

Note that you must run this from ~/java_package directory; otherwise java will complain with the error below:
$ cd pkgtest
$ java Hello
Error: Could not find or load main class Hello


Next, we will see how to import the package. Modify pkgtest/Hello.java and create ./JarTest.java as below:


To compile and run, run the following from ~/java_package directory:
$ javac JarTest.java
$ java JarTest
hello
hi


Next, we will create a jar library file and use this to compile and run. First, let's create the jar library. Run the following in ~/java_package directory
$ javac pkgtest/Hello.java
$ jar cvf pkgtest.jar pkgtest/Hello.class
added manifest
adding: pkgtest/Hello.class(in = 648) (out= 372)(deflated 42%)

This creates pkgtest.jar file, which contains Hello class.

Next, we will rename the pkgtest directory so that we don't compile from the source.
$ mv pkgtest pkgtest_bk

Let's see what happens if don't specify the jar file as we compile.
$ javac JarTest.java
JarTest.java:1: error: package pkgtest does not exist
import pkgtest.Hello;
              ^
JarTest.java:5: error: cannot find symbol
        Hello.print("hello");
        ^
  symbol:   variable Hello
  location: class JarTest
JarTest.java:6: error: cannot find symbol
        Hello h = new Hello();
        ^
  symbol:   class Hello
  location: class JarTest
JarTest.java:6: error: cannot find symbol
        Hello h = new Hello();
                      ^
  symbol:   class Hello
  location: class JarTest
4 errors

As you can see above, java compiler complains that it cannot find some symbols, since we renamed the package directory. To resolve this, let's link the jar file:
$ javac -cp ".:pkgtest.jar" JarTest.java
$ java -cp ".:pkgtest.jar" JarTest
hello
hi

Viola! Let's call it a day.

Friday, April 7, 2017

Useful Environment Variables for Mac OS X

Here is a list of some useful environment variables for Mac OS X:

C_INCLUDE_PATH: header search path for gcc

CPLUS_INCLUDE_PATH: header search path for g++

Thursday, April 6, 2017

Dynamic Programming: Maximizing Stock Profit Example

In this tutorial, I will go over a simple dynamic programming example. Dynamic programming simply refers to breaking down a complicated problem into simpler sub-problems and saving their results to refer back. I think of dynamic programming as an extension to recursion where we store its child recursive solutions, or more simply recursion + memoization.

Let's take a look at an example. Assume that you somehow have the ability to predict stock market for the next N days. Say you can trade stock exactly once a day, and the total number of trades you can make within the next N days is constrained to some number, say n. In addition, you can not own more than 1 stock at a time.

So, basically, given stock prices for next N consecutive days, [x1, x2, ..., xN], and given maximum of n total trades all on different days, when will you buy and sell each stock so that you can maximize the profit? Note that you must buy-sell-buy-sell and so forth with no consecutive buys or sells.

For example, if the next 5 day market price will be [5, 2, 2, 5, 3], and if I have at most 2 buy-sell trades to do, then I would buy on day 2 or 3 and sell on day 4, leaving me with +3 in profit. I will buy-sell only once, since this maximizes the profit.

It is easy when N is small, but it gets very difficult as N grows large. Given N = 50 with prices given by [8, 44, 39, 4, 41, 0, 19, 29, 44, 43, 44, 21, 34, 39, 23, 7, 1, 24, 30, 34, 40, 26, 28, 11, 44, 19, 48, 12, 5, 10, 2, 21, 0, 24, 8, 44, 36, 20, 6, 2, 4, 21, 14, 6, 47, 31, 23, 29, 4, 36] and you have at most 10 buy-sell trades allowed. What is your maximum profit?

To tackle this problem, I would first break down the problem. Here, it is asking for the maximum profit for at most n stock buy-sell stock trades. I will split it into sub-problems where I can find the maximum profit for exactly n buy-sell trades, and later I can search for the max among them.

So, the sub-problem is now to find the maximum profit given N-day forecast where I need to trade exactly n times. Since each trade is either buy or sell, I will once again split this sub-problem into sub-sub-problem as follows:

B(x, N, n): given N-day forecast of prices x, find the maximum cash after exactly n-1 buy-sell trades and another buy by the Nth day.

S(x, N, n): given N-day forecast of prices x, find the maximum cash after exactly n buy-sell trades by the Nth day.

With these functions, it is not too difficult to come up with a simple recursion relationship. Perhaps it is easy to see this from Python code below:

So, that is the recursive part of dynamic programming. Now, we need to deal with the memoization part. Note that with the recursion above, the performance will be very slow, since the number of recursive calls exponentially grows, where they are called with the same parameters multiple times. While we can store the results for all possible parameters, this will take up too much memory ~ O(N*N). A better way is to start from N = 1 and store all possible values for n, and increment N by 1 each time. This requires O(N) space.

It is probably better to explain in code below:

To run it:
$ python dp2.py 10
[4, 1, 4, 6, 2, 7, 4, 9, 7, 4]
[0, 8, 12, 15, 12, 6]

That is, given 10-day stock price [4, 1, 4, 6, 2, 7, 4, 9, 7, 4], the most profit you can make by trading up to 5 buy-sell trades using aforementioned rule is 0, 8, 12, 15, 12, and 6. Now, to find the answer to the original question, we simply take the maximum of the array. Given 2 maximum buy-sell trades, the most profit one can make is 12.

By the way, the problem become trivial if maximum allowed trade is N/2, in which case one simply buys at local minimum and sells at local maximum. Hence, this algorithm shines when n is less than N/2.

Sunday, April 2, 2017

How to Deal with Argument List Too Long Error Message

In Mac OS X terminal,when you provide too many arguments to commands, you may get the error message below:

$ rm *.jpg
-bash: /bin/rm: Argument list too long

This is because there are too many jpg files in the directory, all of which are fed into rm arguments. Here are two ways to achieve it.

$ find . -name '*.jpg' -exec rm {} \;

$ find . -name '*.jpg' | xargs rm

Note that they will be slow, since we are basically calling the rm command as many times as the number of jpg files in the folder, but this works though!