Monday, May 18, 2015

Image processing using OpenCV on Hadoop using HIPI


HIPI meets OpenCV


Problem



This project tackles the problem of processing large volumes of image data on Apache Hadoop, using the Hadoop Image Processing Interface (HIPI) for storage and efficient distributed processing, combined with OpenCV, an open source library of rich image processing algorithms. A program that counts the number of faces in a collection of images is demonstrated.

Background



Processing a large set of images on a single machine can be very time consuming and costly. HIPI is an image processing library designed to be used with Apache Hadoop MapReduce, a software framework for storing and processing big data in a distributed fashion on large clusters of commodity hardware. HIPI facilitates efficient and high-throughput image processing with MapReduce-style parallel programs typically executed on a cluster. It provides a solution for storing a large collection of images on the Hadoop Distributed File System (HDFS) and making them available for efficient distributed processing.


OpenCV (Open Source Computer Vision) is an open source library of rich image processing algorithms, mainly aimed at real-time computer vision. Starting with version 2.4.4, OpenCV ships desktop Java bindings, which can be used with Apache Hadoop.

Goal



This project demonstrates how HIPI and OpenCV can be used together to count the total number of faces in a big image dataset.


Overview of Steps






Big Data Set

Test images for face detection
Input: image dataset containing 158 images (34 MB)
Format: PNG image files


The downloaded images were in GIF format; I used the Mac OS X Preview app to convert them to PNG.


Other sources for face detection image datasets:

Technologies Used:



Software Used: Purpose

VMWare Fusion: Software hypervisor for running the Cloudera Quickstart VM
Cloudera Quickstart VM: VM with a single-node Hadoop cluster for testing and running MapReduce programs
IntelliJ IDEA 14 CE: Java IDE for editing and compiling Java code
Hadoop Image Processing Interface (HIPI): Image processing library designed for the Apache Hadoop MapReduce parallel programming framework; stores large collections of images on HDFS for efficient distributed processing
OpenCV: An image processing library aimed at real-time computer vision
Apache Hadoop: Distributed processing of large data sets


References:

Steps

1. Download VMWare Fusion



VMware Fusion is a software hypervisor developed by VMware for computers running OS X with Intel processors. Fusion allows Intel-based Macs to run operating systems such as Microsoft Windows, Linux, NetWare, or Solaris on virtual machines, along with their Mac OS X operating system, using a combination of paravirtualization, hardware virtualization and dynamic recompilation. (http://en.wikipedia.org/wiki/VMware_Fusion)


Download and install VMWare Fusion from the following URL; it will be used to run the Cloudera Quickstart VM.





2. Download and Setup Cloudera Quickstart VM 5.4.x



The Cloudera QuickStart VMs contain a single-node Apache Hadoop cluster, complete with example data, queries, scripts, and Cloudera Manager to manage your cluster. The VMs run CentOS 6.4 and are available for VMware, VirtualBox, and KVM. This will help us get started with all the tools needed to run image processing on Hadoop.


Download the Cloudera Quickstart VM 5.4.x from the following URL. Quickstart VM 5.4 has Hadoop 2.6 installed, which is needed for HIPI.




Open VMWare Fusion and open the VM




3. Getting started with HIPI



The following steps demonstrate how to set up HIPI to run a MapReduce job on Apache Hadoop.

Setup Hadoop

The Cloudera Quickstart VM 5.4.x comes pre-installed with Hadoop 2.6, which is needed for running HIPI.


Check that Hadoop is installed and is the correct version:


[cloudera@quickstart Project]$ which hadoop
/usr/bin/hadoop
[cloudera@quickstart Project]$ hadoop version
Hadoop 2.6.0-cdh5.4.0
Subversion http://github.com/cloudera/hadoop -r c788a14a5de9ecd968d1e2666e8765c5f018c271
Compiled by jenkins on 2015-04-21T19:18Z
Compiled with protoc 2.5.0
From source with checksum cd78f139c66c13ab5cee96e15a629025
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.4.0.jar


Install Apache Ant

Install Apache Ant and check that it is on the PATH:


[cloudera@quickstart Project]$ which ant
/usr/local/apache-ant/apache-ant-1.9.2/bin/ant


Install and build HIPI

There are two ways to install HIPI:
  1. Clone the latest HIPI distribution from GitHub and build from source. (https://github.com/uvagfx/hipi)
  2. Download a precompiled JAR from the downloads page. (http://hipi.cs.virginia.edu/downloads.html)


Clone HIPI GitHub repository


The best way to check and verify that your system is properly setup is to clone the official GitHub repository and build the tools and example programs.


[cloudera@quickstart Project]$ git clone https://github.com/uvagfx/hipi.git
Initialized empty Git repository in /home/cloudera/Project/hipi/.git/
remote: Counting objects: 2882, done.
remote: Total 2882 (delta 0), reused 0 (delta 0), pack-reused 2882
Receiving objects: 100% (2882/2882), 222.33 MiB | 7.03 MiB/s, done.
Resolving deltas: 100% (1767/1767), done.


Download Apache Hadoop tarball
Download the Apache Hadoop tarball from the following URL and untar it; this is needed to build HIPI.


[cloudera@quickstart Project]$ tar -xvzf /mnt/hgfs/CSCEI63/project/hadoop-2.6.0-cdh5.4.0.tar.gz
[cloudera@quickstart Project]$ ls
hadoop-2.6.0-cdh5.4.0  hipi


Build HIPI binaries


Change directory to the hipi repo and build HIPI.


[cloudera@quickstart Project]$ cd hipi/
[cloudera@quickstart hipi]$ ls
3rdparty   data  examples  license.txt  release
build.xml  doc   libsrc    README.md    util


Before building HIPI, the hadoop.home and hadoop.version properties in the build.xml file should be updated to the path of the Hadoop installation and the version of Hadoop we are using. Change:


build.xml



  <!-- IMPORTANT: You must update the following two properties according to your Hadoop setup -->
   <!-- <property name="hadoop.home" value="/usr/local/Cellar/hadoop/2.6.0/libexec/share/hadoop" /> -->
   <!-- <property name="hadoop.version" value="2.6.0" /> -->


to


<!-- IMPORTANT: You must update the following two properties according to your Hadoop setup -->
   <property name="hadoop.home" value="/home/cloudera/Project/hadoop-2.6.0-cdh5.4.0/share/hadoop" />
   <property name="hadoop.version" value="2.6.0-cdh5.4.0" />


Build HIPI using ant


[cloudera@quickstart hipi]$ ant
Buildfile: /home/cloudera/Project/hipi/build.xml
hipi:
   [javac] Compiling 30 source files to /home/cloudera/Project/hipi/lib
     [jar] Building jar: /home/cloudera/Project/hipi/lib/hipi-2.0.jar
    [echo] Hipi library built.


compile:
   [javac] Compiling 1 source file to /home/cloudera/Project/hipi/bin
     [jar] Building jar: /home/cloudera/Project/hipi/examples/covariance.jar
    [echo] Covariance built.


all:


BUILD SUCCESSFUL
Total time: 36 seconds


Make sure the build produced the tools and examples:


[cloudera@quickstart hipi]$ ls
3rdparty  build.xml  doc       lib     license.txt  release  util
bin       data       examples  libsrc  README.md    tool
[cloudera@quickstart hipi]$ ls tool/
hibimport.jar
[cloudera@quickstart hipi]$ ls examples/
covariance.jar          hipi              runCreateSequenceFile.sh
createsequencefile.jar  jpegfromhib.jar   runDownloader.sh
downloader.jar          rumDumpHib.sh     runJpegFromHib.sh
dumphib.jar             runCovariance.sh  testimages.txt


Sample MapReduce Java Program


Create a SampleProgram.java class in a “sample” folder to run a simple MapReduce program using the sample.hib created in a later step:


[cloudera@quickstart hipi]$ mkdir sample
[cloudera@quickstart hipi]$ vi sample/SampleProgram.java


SampleProgram.java



import hipi.image.FloatImage;
import hipi.image.ImageHeader;
import hipi.imagebundle.mapreduce.ImageBundleInputFormat;


import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;


import java.io.IOException;


public class SampleProgram extends Configured implements Tool {


  public static class SampleProgramMapper extends Mapper<ImageHeader, FloatImage, IntWritable, FloatImage> {
      public void map(ImageHeader key, FloatImage value, Context context)
              throws IOException, InterruptedException {


          // Verify that image was properly decoded, is of sufficient size, and has three color channels (RGB)
          if (value != null && value.getWidth() > 1 && value.getHeight() > 1 && value.getBands() == 3) {


              // Get dimensions of image
              int w = value.getWidth();
              int h = value.getHeight();


              // Get pointer to image data
              float[] valData = value.getData();


              // Initialize 3 element array to hold RGB pixel average
              float[] avgData = {0,0,0};


              // Traverse image pixel data in raster-scan order and update running average
              for (int j = 0; j < h; j++) {
                  for (int i = 0; i < w; i++) {
                      avgData[0] += valData[(j*w+i)*3+0]; // R
                      avgData[1] += valData[(j*w+i)*3+1]; // G
                      avgData[2] += valData[(j*w+i)*3+2]; // B
                  }
              }


              // Create a FloatImage to store the average value
              FloatImage avg = new FloatImage(1, 1, 3, avgData);


              // Divide by number of pixels in image
              avg.scale(1.0f/(float)(w*h));


              // Emit record to reducer
              context.write(new IntWritable(1), avg);


          } // If (value != null...


      } // map()
  }


  public static class SampleProgramReducer extends Reducer<IntWritable, FloatImage, IntWritable, Text> {
      public void reduce(IntWritable key, Iterable<FloatImage> values, Context context)
              throws IOException, InterruptedException {


          // Create FloatImage object to hold final result
          FloatImage avg = new FloatImage(1, 1, 3);


          // Initialize a counter and iterate over IntWritable/FloatImage records from mapper
          int total = 0;
          for (FloatImage val : values) {
              avg.add(val);
              total++;
          }


          if (total > 0) {
              // Normalize sum to obtain average
              avg.scale(1.0f / total);
              // Assemble final output as string
              float[] avgData = avg.getData();
              String result = String.format("Average pixel value: %f %f %f", avgData[0], avgData[1], avgData[2]);
              // Emit output of job which will be written to HDFS
              context.write(key, new Text(result));
          }


      } // reduce()
  }


  public int run(String[] args) throws Exception {
      // Check input arguments
      if (args.length != 2) {
          System.out.println("Usage: firstprog <input HIB> <output directory>");
          System.exit(0);
      }


      // Initialize and configure MapReduce job
      Job job = Job.getInstance();
      // Set input format class which parses the input HIB and spawns map tasks
      job.setInputFormatClass(ImageBundleInputFormat.class);
      // Set the driver, mapper, and reducer classes which express the computation
      job.setJarByClass(SampleProgram.class);
      job.setMapperClass(SampleProgramMapper.class);
      job.setReducerClass(SampleProgramReducer.class);
      // Set the types for the key/value pairs passed to/from map and reduce layers
      job.setMapOutputKeyClass(IntWritable.class);
      job.setMapOutputValueClass(FloatImage.class);
      job.setOutputKeyClass(IntWritable.class);
      job.setOutputValueClass(Text.class);


      // Set the input and output paths on the HDFS
      FileInputFormat.setInputPaths(job, new Path(args[0]));
      FileOutputFormat.setOutputPath(job, new Path(args[1]));


      // Execute the MapReduce job and block until it completes
      boolean success = job.waitForCompletion(true);


      // Return success or failure
      return success ? 0 : 1;
  }


  public static void main(String[] args) throws Exception {
      ToolRunner.run(new SampleProgram(), args);
      System.exit(0);
  }


}
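The averaging arithmetic in the mapper can be checked in isolation. Below is a plain-Java sketch (no HIPI types; class and method names are my own) of the same loop over an interleaved RGB float buffer:

```java
// Plain-Java version of the mapper's averaging loop. HIPI's FloatImage stores
// pixels as interleaved RGB floats in raster-scan order, so channel c of
// pixel (i, j) in a w-wide image lives at index (j*w + i)*3 + c.
public class AverageSketch {

    // Average each channel over all w*h pixels, as the mapper and reducer do together.
    static float[] average(float[] data, int w, int h) {
        float[] avg = {0f, 0f, 0f};
        for (int j = 0; j < h; j++)
            for (int i = 0; i < w; i++)
                for (int c = 0; c < 3; c++)
                    avg[c] += data[(j * w + i) * 3 + c];
        for (int c = 0; c < 3; c++)
            avg[c] /= (float) (w * h);
        return avg;
    }

    public static void main(String[] args) {
        // 2x1 image: one black pixel and one white pixel average to mid-gray
        float[] img = {0f, 0f, 0f, 1f, 1f, 1f};
        float[] a = average(img, 2, 1);
        System.out.println(a[0] + " " + a[1] + " " + a[2]); // prints 0.5 0.5 0.5
    }
}
```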


Add a new build target to hipi/build.xml to build SampleProgram.java and create a sample.jar library:


build.xml

 <target name="sample">
     <antcall target="compile">
     <param name="srcdir" value="sample" />
     <param name="jarfilename" value="sample.jar" />
     <param name="jardir" value="sample" />
     <param name="mainclass" value="SampleProgram" />
     </antcall>
  </target>
...


Build SampleProgram


[cloudera@quickstart hipi]$ ant sample
Buildfile: /home/cloudera/Project/hipi/build.xml
...
compile:
     [jar] Building jar: /home/cloudera/Project/hipi/sample/sample.jar


BUILD SUCCESSFUL
Total time: 16 seconds


Running a sample HIPI MapReduce Program



Create a sample.hib file on HDFS from the sample images provided with HIPI using the hibimport tool; this will be the input to the MapReduce program.


[cloudera@quickstart hipi]$ ls data/test/ImageBundleTestCase/read/
0.jpg  1.jpg  2.jpg  3.jpg
[cloudera@quickstart hipi]$ hadoop jar tool/hibimport.jar data/test/ImageBundleTestCase/read examples/sample.hib
** added: 2.jpg
** added: 3.jpg
** added: 0.jpg
** added: 1.jpg
Created: examples/sample.hib and examples/sample.hib.dat
[cloudera@quickstart hipi]$ hadoop fs -ls examples
Found 2 items
-rw-r--r--   1 cloudera cloudera         80 2015-05-09 22:19 examples/sample.hib
-rw-r--r--   1 cloudera cloudera    1479345 2015-05-09 22:19 examples/sample.hib.dat


Running Hadoop MapReduce program


[cloudera@quickstart hipi]$ hadoop jar sample/sample.jar examples/sample.hib examples/output
15/05/09 23:05:10 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/05/09 23:07:05 INFO mapreduce.Job: Job job_1431127776378_0001 running in uber mode : false
15/05/09 23:07:05 INFO mapreduce.Job:  map 0% reduce 0%
15/05/09 23:08:55 INFO mapreduce.Job:  map 57% reduce 0%
15/05/09 23:09:04 INFO mapreduce.Job:  map 100% reduce 0%
15/05/09 23:09:38 INFO mapreduce.Job:  map 100% reduce 100%
15/05/09 23:09:39 INFO mapreduce.Job: Job job_1431127776378_0001 completed successfully
File Output Format Counters
Bytes Written=50


Check the program output:
[cloudera@quickstart hipi]$ hadoop fs -ls examples/output
Found 2 items
-rw-r--r--   1 cloudera cloudera          0 2015-05-09 23:09 examples/output/_SUCCESS
-rw-r--r--   1 cloudera cloudera         50 2015-05-09 23:09 examples/output/part-r-00000


The average pixel value calculated over all the images is:


[cloudera@quickstart hipi]$ hadoop fs -cat examples/output/part-r-00000
1 Average pixel value: 0.420624 0.404933 0.380449


4. Getting started with OpenCV using Java



OpenCV is an image processing library containing a large collection of image processing functions. Starting from version 2.4.4, OpenCV includes desktop Java bindings. We will use version 2.4.11 to build the Java bindings and use them with HIPI to run image processing.


Download OpenCV source:



The zip bundle of the OpenCV 2.4.11 source can be downloaded from the following URL; unzip it into the ~/Project/opencv directory.


[cloudera@quickstart Project]$ mkdir opencv && cd opencv
[cloudera@quickstart opencv]$ unzip /mnt/hgfs/CSCEI63/project/opencv-2.4.11.zip


CMake build system

CMake is a family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files. CMake generates native makefiles and workspaces that can be used in the compiler environment of your choice. (http://www.cmake.org/)


OpenCV is built with CMake. Download the CMake binaries from http://www.cmake.org/download/ and untar the bundle into the ~/Project/opencv directory:


[cloudera@quickstart opencv]$ tar -xvzf cmake-3.2.2-Linux-x86_64.tar.gz


Build OpenCV for Java



The following steps detail how to build OpenCV on Linux.
Configure the OpenCV build for Linux:


[cloudera@quickstart opencv-2.4.11]$ ../cmake-3.2.2-Linux-x86_64/bin/cmake -DBUILD_SHARED_LIBS=OFF .
.
.
.  Target "opencv_haartraining_engine" links to itself.
This warning is for project developers.  Use -Wno-dev to suppress it.


-- Generating done
-- Build files have been written to: /home/cloudera/Project/opencv/opencv-2.4.11


Build OpenCV


[cloudera@quickstart opencv-2.4.11]$ make
.
.
.
[100%] Building CXX object apps/traincascade/CMakeFiles/opencv_traincascade.dir/imagestorage.cpp.o
Linking CXX executable ../../bin/opencv_traincascade
[100%] Built target opencv_traincascade
Scanning dependencies of target opencv_annotation
[100%] Building CXX object apps/annotation/CMakeFiles/opencv_annotation.dir/opencv_annotation.cpp.o
Linking CXX executable ../../bin/opencv_annotation
[100%] Built target opencv_annotation


This creates a jar containing the Java interface (bin/opencv-2411.jar) and a native dynamic library containing the Java bindings and all the OpenCV functionality (lib/libopencv_java2411.so). We’ll use these files to build and run the OpenCV program.


[cloudera@quickstart opencv-2.4.11]$ ls lib | grep .so
libopencv_java2411.so
[cloudera@quickstart opencv-2.4.11]$ ls bin | grep .jar
opencv-2411.jar


Running OpenCV Face Detection Program



The following steps verify that OpenCV is set up correctly and works as expected.


Create a new directory sample and create an Ant build.xml file in it.


[cloudera@quickstart opencv]$ mkdir sample && cd sample
[cloudera@quickstart sample]$ vi build.xml

build.xml



<project name="Main" basedir="." default="rebuild-run">


  <property name="src.dir"     value="src"/>


  <property name="lib.dir"     value="${ocvJarDir}"/>
  <path id="classpath">
      <fileset dir="${lib.dir}" includes="**/*.jar"/>
  </path>


  <property name="build.dir"   value="build"/>
  <property name="classes.dir" value="${build.dir}/classes"/>
  <property name="jar.dir"     value="${build.dir}/jar"/>


  <property name="main-class"  value="${ant.project.name}"/>


  <target name="clean">
      <delete dir="${build.dir}"/>
  </target>


  <target name="compile">
      <mkdir dir="${classes.dir}"/>
      <javac includeantruntime="false" srcdir="${src.dir}" destdir="${classes.dir}" classpathref="classpath"/>
  </target>


  <target name="jar" depends="compile">
      <mkdir dir="${jar.dir}"/>
      <jar destfile="${jar.dir}/${ant.project.name}.jar" basedir="${classes.dir}">
          <manifest>
              <attribute name="Main-Class" value="${main-class}"/>
          </manifest>
      </jar>
  </target>


  <target name="run" depends="jar">
      <java fork="true" classname="${main-class}">
          <sysproperty key="java.library.path" path="${ocvLibDir}"/>
          <classpath>
              <path refid="classpath"/>
              <path location="${jar.dir}/${ant.project.name}.jar"/>
          </classpath>
      </java>
  </target>


  <target name="rebuild" depends="clean,jar"/>


  <target name="rebuild-run" depends="clean,run"/>


</project>


Write a program using OpenCV to detect the number of faces in an image.


DetectFaces.java



import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Scalar;
import org.opencv.highgui.*;
import org.opencv.core.MatOfRect;
import org.opencv.core.Point;
import org.opencv.core.Rect;
import org.opencv.objdetect.CascadeClassifier;


import java.io.File;


/**
* Created by dmalav on 4/30/15.
*/
public class DetectFaces {


  public void run(String imageFile) {
      System.out.println("\nRunning DetectFaceDemo");


      // Create a face detector from the cascade file in the resources
      // directory.
      String xmlPath = "/home/cloudera/project/opencv-examples/lbpcascade_frontalface.xml";
      System.out.println(xmlPath);
      CascadeClassifier faceDetector = new CascadeClassifier(xmlPath);
      Mat image = Highgui.imread(imageFile);


      // Detect faces in the image.
      // MatOfRect is a special container class for Rect.
      MatOfRect faceDetections = new MatOfRect();
      faceDetector.detectMultiScale(image, faceDetections);


      System.out.println(String.format("Detected %s faces", faceDetections.toArray().length));


      // Draw a bounding box around each face.
      for (Rect rect : faceDetections.toArray()) {
          Core.rectangle(image, new Point(rect.x, rect.y), new Point(rect.x + rect.width, rect.y + rect.height), new Scalar(0, 255, 0));
      }


      File f = new File(imageFile);
      System.out.println(f.getName());
      // Save the visualized detection.
      String filename = f.getName();
      System.out.println(String.format("Writing %s", filename));
      Highgui.imwrite(filename, image);


  }
}


Main.java



import org.opencv.core.Core;


import java.io.File;


public class Main {


  public static void main(String... args) {


      System.loadLibrary(Core.NATIVE_LIBRARY_NAME);


if (args.length == 0) {
  System.err.println("Usage Main /path/to/images");
  System.exit(1);
}


      File[] files = new File(args[0]).listFiles();
      showFiles(files);
  }


  public static void showFiles(File[] files) {
      DetectFaces faces = new DetectFaces();
      for (File file : files) {
          if (file.isDirectory()) {
              System.out.println("Directory: " + file.getName());
              showFiles(file.listFiles()); // Calls same method again.
          } else {
              System.out.println("File: " + file.getAbsolutePath());
              faces.run(file.getAbsolutePath());
          }
      }
  }
}


Build Face Detection Java Program


[cloudera@quickstart sample]$ ant -DocvJarDir=/home/cloudera/Project/opencv/opencv-2.4.11/bin -DocvLibDir=/home/cloudera/Project/opencv/opencv-2.4.11/lib jar
Buildfile: /home/cloudera/Project/opencv/sample/build.xml


compile:
   [mkdir] Created dir: /home/cloudera/Project/opencv/sample/build/classes
   [javac] Compiling 2 source files to /home/cloudera/Project/opencv/sample/build/classes


jar:
   [mkdir] Created dir: /home/cloudera/Project/opencv/sample/build/jar
     [jar] Building jar: /home/cloudera/Project/opencv/sample/build/jar/Main.jar


BUILD SUCCESSFUL
Total time: 3 seconds


This build creates a build/jar/Main.jar file which can be used to detect faces from images stored in a directory:


[cloudera@quickstart sample]$ java -cp ../opencv-2.4.11/bin/opencv-2411.jar:build/jar/Main.jar -Djava.library.path=../opencv-2.4.11/lib Main /mnt/hgfs/CSCEI63/project/images2
File: /mnt/hgfs/CSCEI63/project/images2/addams-family.png


Running DetectFaceDemo
/home/cloudera/Project/opencv/sample/lbpcascade_frontalface.xml
Detected 7 faces
addams-family.png
Writing addams-family.png


OpenCV detected faces




OpenCV does a fairly good job detecting front-facing faces when the lbpcascade_frontalface.xml classifier is used. OpenCV provides other classifiers that can detect rotated faces and other face orientations.


5. Configure Hadoop for OpenCV



To run OpenCV Java code, the native library (Core.NATIVE_LIBRARY_NAME) must be on Hadoop's java.library.path. The following steps detail how to set up the OpenCV native library with Hadoop.


Copy the OpenCV native library libopencv_java2411.so to /etc/opencv/lib:


[cloudera@quickstart opencv]$ pwd
/home/cloudera/Project/opencv
[cloudera@quickstart opencv]$ ls
cmake-3.2.2-Linux-x86_64  opencv-2.4.11  sample  test
[cloudera@quickstart opencv]$ sudo cp opencv-2.4.11/lib/libopencv_java2411.so /etc/opencv/lib/


Set up JAVA_LIBRARY_PATH in the /usr/lib/hadoop/libexec/hadoop-config.sh file to point to the OpenCV native library.


[cloudera@quickstart Project]$ vi  /usr/lib/hadoop/libexec/hadoop-config.sh
.
.
# setup 'java.library.path' for native-hadoop code if necessary


if [ -d "${HADOOP_PREFIX}/build/native" -o -d "${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR" ]; then


 if [ -d "${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR" ]; then
   if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
     JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}:${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR
   else
     JAVA_LIBRARY_PATH=${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR
   fi
 fi
fi


# setup opencv native library path
JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}:/etc/opencv/lib


.
.
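The appended line above extends the path unconditionally, which leaves a leading ":" when JAVA_LIBRARY_PATH is empty. A guarded form, sketched here in the script's own if/else style (variable values are the ones used in this walkthrough):

```shell
# Start empty for the demonstration; in hadoop-config.sh this may already be set.
JAVA_LIBRARY_PATH=""
OPENCV_LIB=/etc/opencv/lib

# Guarded append: only prefix the existing path when it is non-empty.
if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
  JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}:${OPENCV_LIB}
else
  JAVA_LIBRARY_PATH=${OPENCV_LIB}
fi
echo "$JAVA_LIBRARY_PATH"
```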


6. HIPI with OpenCV



This step details the Java code that combines HIPI with OpenCV.


HIPI uses the HipiImageBundle class to represent a collection of images on HDFS and FloatImage to represent an image in memory. The FloatImage must be converted to OpenCV's Mat format for image processing, face counting in this case.


The following method converts a FloatImage to a Mat:


      // Convert HIPI FloatImage to OpenCV Mat
      public Mat convertFloatImageToOpenCVMat(FloatImage floatImage) {


          // Get dimensions of image
          int w = floatImage.getWidth();
          int h = floatImage.getHeight();


          // Get pointer to image data
          float[] valData = floatImage.getData();


          // 3-element array to hold one pixel's RGB values
          double[] rgb = {0.0,0.0,0.0};


          Mat mat = new Mat(h, w, CvType.CV_8UC3);


          // Traverse image pixel data in raster-scan order and copy each pixel into the Mat
          for (int j = 0; j < h; j++) {
              for (int i = 0; i < w; i++) {
                  rgb[0] = (double) valData[(j*w+i)*3+0] * 255.0; // R
                  rgb[1] = (double) valData[(j*w+i)*3+1] * 255.0; // G
                  rgb[2] = (double) valData[(j*w+i)*3+2] * 255.0; // B
                  mat.put(j, i, rgb);
              }
          }


          return mat;
      }
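The index and scaling math above can be verified without OpenCV. A small sketch follows (helper names are my own; the explicit round-and-clamp is an assumption, since the original relies on Mat.put to saturate the scaled doubles to 8 bits):

```java
// Sketch of the index and scaling math used in convertFloatImageToOpenCVMat.
// HIPI's FloatImage stores interleaved RGB floats in [0,1]; channel c of
// pixel (i, j) in a w-wide image lives at flat index (j*w + i)*3 + c.
public class PixelMath {

    // Flat index of channel c for pixel (i, j) in a w-wide interleaved buffer.
    static int index(int i, int j, int w, int c) {
        return (j * w + i) * 3 + c;
    }

    // Scale a [0,1] float sample to an 8-bit value.
    static int toByte(float v) {
        return Math.min(255, Math.max(0, Math.round(v * 255.0f)));
    }

    public static void main(String[] args) {
        // In a 2-pixel-wide image, the green channel of pixel (1, 1) is at index 10
        System.out.println(index(1, 1, 2, 1)); // prints 10
        System.out.println(toByte(0.5f));      // prints 128
    }
}
```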


To count the number of faces in an image we need to create a CascadeClassifier, which uses a classifier file. This file must be present on HDFS; it can be distributed with the Job.addCacheFile method and later retrieved in the Mapper class.


    public int run(String[] args) throws Exception {
     ....


      // Initialize and configure MapReduce job
      Job job = Job.getInstance();
      
      ....
      
      // add cascade file
      job.addCacheFile(new URI("/user/cloudera/lbpcascade_frontalface.xml#lbpcascade_frontalface.xml"));


      // Execute the MapReduce job and block until it completes
      boolean success = job.waitForCompletion(true);


      // Return success or failure
      return success ? 0 : 1;
  }
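The "#name" fragment in the cached URI determines the symlink name created in the task's working directory, which is why the mapper can later open "./lbpcascade_frontalface.xml". A small sketch of that naming convention (class and method names are my own):

```java
import java.net.URI;

// Sketch of the distributed-cache naming rule: the "#name" fragment of the
// cached URI becomes the symlink name in the task's working directory.
public class CacheName {

    static String localName(String cacheUri) {
        URI u = URI.create(cacheUri);
        if (u.getFragment() != null) {
            return u.getFragment();                     // explicit "#name" wins
        }
        String p = u.getPath();
        return p.substring(p.lastIndexOf('/') + 1);     // otherwise the file name
    }

    public static void main(String[] args) {
        System.out.println(localName("/user/cloudera/lbpcascade_frontalface.xml#lbpcascade_frontalface.xml"));
        // prints lbpcascade_frontalface.xml
    }
}
```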


Override the Mapper's setup method to load the OpenCV native library and create the CascadeClassifier for face detection:


public static class FaceCountMapper extends Mapper<ImageHeader, FloatImage, IntWritable, IntWritable> {


      // Create a face detector from the cascade file in the resources
      // directory.
      private CascadeClassifier faceDetector;


      
      public void setup(Context context)
              throws IOException, InterruptedException {


          // Load OpenCV native library
          try {
              System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
          } catch (UnsatisfiedLinkError e) {
              System.err.println("Native code library failed to load.\n" + e + Core.NATIVE_LIBRARY_NAME);
              System.exit(1);
          }


          // Load cached cascade file for front face detection and create CascadeClassifier
          if (context.getCacheFiles() != null && context.getCacheFiles().length > 0) {
              URI mappingFileUri = context.getCacheFiles()[0];


              if (mappingFileUri != null) {
                  faceDetector = new CascadeClassifier("./lbpcascade_frontalface.xml");


              } else {
                  System.out.println(">>>>>> NO MAPPING FILE");
              }
          } else {
              System.out.println(">>>>>> NO CACHE FILES AT ALL");
          }


          super.setup(context);
      } // setup()
      ....
}


Full listing of FaceCount.java.


Mapper:
  1. Load OpenCV native library
  2. Create CascadeClassifier
  3. Convert HIPI FloatImage to OpenCV Mat
  4. Detect and count faces in the image
  5. Write number of faces detected to context


Reducer:
  1. Count number of files processed
  2. Count number of faces detected
  3. Output number of files and faces detected
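The reducer steps above can be sketched in plain Java (no Hadoop types; class and method names are my own):

```java
import java.util.Arrays;
import java.util.List;

// A Hadoop-free sketch of the reducer: sum per-image face counts and
// tally how many images were processed.
public class FaceCountSketch {

    // Mirrors FaceCountReducer: one input value per image, each a face count.
    static int[] reduce(List<Integer> perImageFaces) {
        int images = 0;
        int faces = 0;
        for (int f : perImageFaces) {
            faces += f;
            images++;
        }
        return new int[]{images, faces};
    }

    public static void main(String[] args) {
        // Three images with 2, 0, and 5 detected faces
        int[] r = reduce(Arrays.asList(2, 0, 5));
        System.out.println(r[0] + " images, " + r[1] + " faces"); // prints 3 images, 7 faces
    }
}
```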


FaceCount.java



import hipi.image.FloatImage;
import hipi.image.ImageHeader;
import hipi.imagebundle.mapreduce.ImageBundleInputFormat;


import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;


import org.opencv.core.*;
import org.opencv.objdetect.CascadeClassifier;


import java.io.IOException;
import java.net.URI;


public class FaceCount extends Configured implements Tool {


  public static class FaceCountMapper extends Mapper<ImageHeader, FloatImage, IntWritable, IntWritable> {


      // Create a face detector from the cascade file in the resources
      // directory.
     private CascadeClassifier faceDetector;


      // Convert HIPI FloatImage to OpenCV Mat
      public Mat convertFloatImageToOpenCVMat(FloatImage floatImage) {


          // Get dimensions of image
          int w = floatImage.getWidth();
          int h = floatImage.getHeight();


          // Get pointer to image data
          float[] valData = floatImage.getData();


          // 3-element array to hold one pixel's RGB values
          double[] rgb = {0.0,0.0,0.0};


          Mat mat = new Mat(h, w, CvType.CV_8UC3);


          // Traverse image pixel data in raster-scan order and copy each pixel into the Mat
          for (int j = 0; j < h; j++) {
              for (int i = 0; i < w; i++) {
                  rgb[0] = (double) valData[(j*w+i)*3+0] * 255.0; // R
                  rgb[1] = (double) valData[(j*w+i)*3+1] * 255.0; // G
                  rgb[2] = (double) valData[(j*w+i)*3+2] * 255.0; // B
                  mat.put(j, i, rgb);
              }
          }


          return mat;
      }


      // Count faces in image
      public int countFaces(Mat image) {


          // Detect faces in the image.
          // MatOfRect is a special container class for Rect.
          MatOfRect faceDetections = new MatOfRect();
          faceDetector.detectMultiScale(image, faceDetections);


          return faceDetections.toArray().length;
      }


      public void setup(Context context)
              throws IOException, InterruptedException {


          // Load OpenCV native library
          try {
              System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
          } catch (UnsatisfiedLinkError e) {
              System.err.println("Native code library failed to load.\n" + e + Core.NATIVE_LIBRARY_NAME);
              System.exit(1);
          }


          // Load cached cascade file for front face detection and create CascadeClassifier
          if (context.getCacheFiles() != null && context.getCacheFiles().length > 0) {
              URI mappingFileUri = context.getCacheFiles()[0];


              if (mappingFileUri != null) {
                  faceDetector = new CascadeClassifier("./lbpcascade_frontalface.xml");


              } else {
                  System.out.println(">>>>>> NO MAPPING FILE");
              }
          } else {
              System.out.println(">>>>>> NO CACHE FILES AT ALL");
          }


          super.setup(context);
      } // setup()


      public void map(ImageHeader key, FloatImage value, Context context)
              throws IOException, InterruptedException {


          // Verify that image was properly decoded, is of sufficient size, and has three color channels (RGB)
          if (value != null && value.getWidth() > 1 && value.getHeight() > 1 && value.getBands() == 3) {


              Mat cvImage = this.convertFloatImageToOpenCVMat(value);


              int faces = this.countFaces(cvImage);


              System.out.println(">>>>>> Detected Faces: " + Integer.toString(faces));


              // Emit record to reducer
              context.write(new IntWritable(1), new IntWritable(faces));


          } // If (value != null...


      } // map()
  }


  public static class FaceCountReducer extends Reducer<IntWritable, IntWritable, IntWritable, Text> {
      public void reduce(IntWritable key, Iterable<IntWritable> values, Context context)
              throws IOException, InterruptedException {


          // Initialize counters and iterate over the IntWritable/IntWritable records from the mapper
          int total = 0;
          int images = 0;
          for (IntWritable val : values) {
              total += val.get();
              images++;
          }


          String result = String.format("Total face detected: %d", total);
          // Emit output of job which will be written to HDFS
          context.write(new IntWritable(images), new Text(result));
      } // reduce()
  }


  public int run(String[] args) throws Exception {
      // Check input arguments
      if (args.length != 2) {
          System.out.println("Usage: facecount <input HIB> <output directory>");
          System.exit(1);
      }


      // Initialize and configure MapReduce job
      Job job = Job.getInstance();
      // Set input format class which parses the input HIB and spawns map tasks
      job.setInputFormatClass(ImageBundleInputFormat.class);
      // Set the driver, mapper, and reducer classes which express the computation
      job.setJarByClass(FaceCount.class);
      job.setMapperClass(FaceCountMapper.class);
      job.setReducerClass(FaceCountReducer.class);
      // Set the types for the key/value pairs passed to/from map and reduce layers
      job.setMapOutputKeyClass(IntWritable.class);
      job.setMapOutputValueClass(IntWritable.class);
      job.setOutputKeyClass(IntWritable.class);
      job.setOutputValueClass(Text.class);


      // Set the input and output paths on the HDFS
      FileInputFormat.setInputPaths(job, new Path(args[0]));
      FileOutputFormat.setOutputPath(job, new Path(args[1]));


      // Add the cascade classifier file to the distributed cache
      job.addCacheFile(new URI("/user/cloudera/lbpcascade_frontalface.xml#lbpcascade_frontalface.xml"));


      // Execute the MapReduce job and block until it completes
      boolean success = job.waitForCompletion(true);


      // Return success or failure
      return success ? 0 : 1;
  }


  public static void main(String[] args) throws Exception {
      int exitCode = ToolRunner.run(new FaceCount(), args);
      System.exit(exitCode);
  }


}
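The pixel indexing and scaling inside convertFloatImageToOpenCVMat can be checked in isolation: HIPI's FloatImage stores interleaved RGB samples as floats in [0, 1], so band c of pixel (i, j) lives at flat index (j*w+i)*3+c, and multiplying by 255 maps a sample into OpenCV's 8-bit range. A minimal sketch in plain Java (no Hadoop or OpenCV dependencies; the class and method names here are mine, purely illustrative):

```java
public class PixelIndexDemo {

    // Flat index of band c (0=R, 1=G, 2=B) for pixel (i, j) in a
    // w-wide interleaved RGB float array, as used by HIPI's FloatImage
    static int flatIndex(int i, int j, int w, int c) {
        return (j * w + i) * 3 + c;
    }

    // Scale a HIPI float sample in [0, 1] to an 8-bit value in [0, 255]
    static int toByteRange(float v) {
        return Math.round(v * 255.0f);
    }

    public static void main(String[] args) {
        int w = 4;                                 // image width
        System.out.println(flatIndex(2, 1, w, 0)); // red band of pixel (2, 1) -> 18
        System.out.println(toByteRange(0.5f));     // mid-gray sample -> 128
    }
}
```

If a Mat filled with indices computed this way still comes out wrong, the remaining suspects are the value range (the samples may already be in [0, 255]) and the channel order; both are worth verifying on a single known image before running the full job.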


7. Build FaceCount.java as facecount.jar



Create a new facecount directory in the hipi folder (where HIPI was built) and copy FaceCount.java from the previous step into it.


[cloudera@quickstart hipi]$ pwd
/home/cloudera/Project/hipi
[cloudera@quickstart hipi]$ mkdir facecount
[cloudera@quickstart hipi]$ cp /mnt/hgfs/CSCEI63/project/hipi/src/FaceCount.java facecount/
[cloudera@quickstart hipi]$ ls
3rdparty   data      facecount    libsrc       README.md  sample
bin        doc       hipiwrapper  license.txt  release    tool
build.xml  examples  lib          my.diff      run.sh     util
[cloudera@quickstart hipi]$ ls facecount/
FaceCount.java


Modify the HIPI build.xml Ant script to link against the OpenCV jar file and add a new build target, facecount.


build.xml



<project basedir="." default="all">


<target name="setup">


....


  <!-- opencv dependencies -->
  <property name="opencv.jar" value="../opencv/opencv-2.4.11/bin/opencv-2411.jar" />


  <echo message="Properties set."/>
</target>


<target name="compile" depends="setup,test_settings,hipi">


  <mkdir dir="bin" />
 
  <!-- Compile -->
  <javac debug="yes" nowarn="on" includeantruntime="no" srcdir="${srcdir}" destdir="./bin" classpath="${hadoop.classpath}:./lib/hipi-${hipi.version}.jar:${opencv.jar}">
    <compilerarg value="-Xlint:deprecation" />
  </javac>
 
  <!-- Create the jar -->
  <jar destfile="${jardir}/${jarfilename}" basedir="./bin">
    <zipfileset src="./lib/hipi-${hipi.version}.jar" />
    <zipfileset src="${opencv.jar}" />
    <manifest>
        <attribute name="Main-Class" value="${mainclass}" />
    </manifest>
  </jar>
 
</target>
....


 <target name="facecount">
    <antcall target="compile">
    <param name="srcdir" value="facecount" />
    <param name="jarfilename" value="facecount.jar" />
    <param name="jardir" value="facecount" />
    <param name="mainclass" value="FaceCount" />
    </antcall>
 </target>


 <target name="all" depends="hipi,hibimport,downloader,dumphib,jpegfromhib,createsequencefile,covariance" />


 <!-- Clean -->
 <target name="clean">
   <delete dir="lib" />
   <delete dir="bin" />
   <delete>
     <fileset dir="." includes="examples/*.jar,experiments/*.jar" />
   </delete>
 </target>
</project>


Build FaceCount.java


[cloudera@quickstart hipi]$ ant facecount
Buildfile: /home/cloudera/Project/hipi/build.xml


facecount:


setup:
    [echo] Setting properties for build task...
    [echo] Properties set.


test_settings:
    [echo] Confirming that hadoop settings are set...
    [echo] Properties are specified properly.


hipi:
    [echo] Building the hipi library...


hipi:
   [javac] Compiling 30 source files to /home/cloudera/Project/hipi/lib
     [jar] Building jar: /home/cloudera/Project/hipi/lib/hipi-2.0.jar
    [echo] Hipi library built.


compile:
     [jar] Building jar: /home/cloudera/Project/hipi/facecount/facecount.jar


BUILD SUCCESSFUL
Total time: 12 seconds


Check that facecount.jar was built in the facecount directory:


[cloudera@quickstart hipi]$ ls facecount/
facecount.jar  FaceCount.java


8. Run FaceCount MapReduce job



Set up input images



[cloudera@quickstart hipi]$ ls /mnt/hgfs/CSCEI63/project/images-png/
217.png               eugene.png                    patio.png
221.png               ew-courtney-david.png         people.png
3.png                 ew-friends.png                pict_28.png
….


Create HIB



The primary input type to a HIPI program is a HipiImageBundle (HIB), which stores a collection of images on the Hadoop Distributed File System (HDFS). Use the hibimport tool to create a HIB (project/input.hib) from the images in the local directory /mnt/hgfs/CSCEI63/project/images-png/ by executing the following commands from the HIPI root directory:


[cloudera@quickstart hipi]$ hadoop fs -mkdir project
[cloudera@quickstart hipi]$ hadoop jar tool/hibimport.jar /mnt/hgfs/CSCEI63/project/images-png/ project/input.hib
** added: 217.png
** added: 221.png
** added: 3.png
** added: addams-family.png
** added: aeon1a.png
** added: aerosmith-double.png
.
.
.
** added: werbg04.png
** added: window.png
** added: wxm.png
** added: yellow-pages.png
** added: ysato.png
Created: project/input.hib and project/input.hib.dat


Run MapReduce



Create a run-facecount.sh script that removes any previous output directory and runs the MapReduce job:


run-facecount.sh



#!/bin/bash
hadoop fs -rm -R project/output
hadoop jar facecount/facecount.jar project/input.hib project/output


[cloudera@quickstart hipi]$ bash run-facecount.sh
15/05/12 16:48:06 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted project/output
15/05/12 16:48:17 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/05/12 16:48:20 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/05/12 16:48:21 INFO input.FileInputFormat: Total input paths to process : 1
Spawned 1map tasks
15/05/12 16:48:22 INFO mapreduce.JobSubmitter: number of splits:1
15/05/12 16:48:22 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1431127776378_0049
15/05/12 16:48:25 INFO impl.YarnClientImpl: Submitted application application_1431127776378_0049
15/05/12 16:48:25 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1431127776378_0049/
15/05/12 16:48:25 INFO mapreduce.Job: Running job: job_1431127776378_0049
15/05/12 16:48:58 INFO mapreduce.Job: Job job_1431127776378_0049 running in uber mode : false
15/05/12 16:48:58 INFO mapreduce.Job:  map 0% reduce 0%
15/05/12 16:49:45 INFO mapreduce.Job:  map 3% reduce 0%
15/05/12 16:50:53 INFO mapreduce.Job:  map 5% reduce 0%
15/05/12 16:50:57 INFO mapreduce.Job:  map 8% reduce 0%
15/05/12 16:51:01 INFO mapreduce.Job:  map 11% reduce 0%
15/05/12 16:51:09 INFO mapreduce.Job:  map 15% reduce 0%
15/05/12 16:51:13 INFO mapreduce.Job:  map 22% reduce 0%
15/05/12 16:51:18 INFO mapreduce.Job:  map 25% reduce 0%
15/05/12 16:51:21 INFO mapreduce.Job:  map 28% reduce 0%
15/05/12 16:51:29 INFO mapreduce.Job:  map 31% reduce 0%
15/05/12 16:51:32 INFO mapreduce.Job:  map 33% reduce 0%
15/05/12 16:51:45 INFO mapreduce.Job:  map 38% reduce 0%
15/05/12 16:51:57 INFO mapreduce.Job:  map 51% reduce 0%
15/05/12 16:52:07 INFO mapreduce.Job:  map 55% reduce 0%
15/05/12 16:52:10 INFO mapreduce.Job:  map 58% reduce 0%
15/05/12 16:52:14 INFO mapreduce.Job:  map 60% reduce 0%
15/05/12 16:52:18 INFO mapreduce.Job:  map 63% reduce 0%
15/05/12 16:52:26 INFO mapreduce.Job:  map 100% reduce 0%
15/05/12 16:52:59 INFO mapreduce.Job:  map 100% reduce 100%
15/05/12 16:53:01 INFO mapreduce.Job: Job job_1431127776378_0049 completed successfully
15/05/12 16:53:02 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=1576
FILE: Number of bytes written=226139
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=35726474
HDFS: Number of bytes written=27
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=205324
Total time spent by all reduces in occupied slots (ms)=29585
Total time spent by all map tasks (ms)=205324
Total time spent by all reduce tasks (ms)=29585
Total vcore-seconds taken by all map tasks=205324
Total vcore-seconds taken by all reduce tasks=29585
Total megabyte-seconds taken by all map tasks=210251776
Total megabyte-seconds taken by all reduce tasks=30295040
Map-Reduce Framework
Map input records=157
Map output records=157
Map output bytes=1256
Map output materialized bytes=1576
Input split bytes=132
Combine input records=0
Combine output records=0
Reduce input groups=1
Reduce shuffle bytes=1576
Reduce input records=157
Reduce output records=1
Spilled Records=314
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=11772
CPU time spent (ms)=45440
Physical memory (bytes) snapshot=564613120
Virtual memory (bytes) snapshot=3050717184
Total committed heap usage (bytes)=506802176
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=35726342
File Output Format Counters
Bytes Written=27


Check the results. The output key (157) is the number of images processed; the face total is 0 because of the conversion bug described in the Issues section:


[cloudera@quickstart hipi]$ hadoop fs -ls project/output
Found 2 items
-rw-r--r--   1 cloudera cloudera          0 2015-05-12 16:52 project/output/_SUCCESS
-rw-r--r--   1 cloudera cloudera         27 2015-05-12 16:52 project/output/part-r-00000
[cloudera@quickstart hipi]$ hadoop fs -cat project/output/part-r-00000
157 Total face detected: 0
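Reading that output line: the key (157) is the number of images the reducer saw, and the value is the formatted face total. Stripped of the Hadoop types, the reducer's aggregation amounts to the following (a plain-Java sketch; ReduceSketch is an illustrative name, not part of the job):

```java
import java.util.List;

public class ReduceSketch {

    // Mirrors FaceCountReducer: sum the per-image face counts emitted by the
    // mappers and count how many images contributed, joining key and value
    // with a tab the way TextOutputFormat does
    static String reduce(List<Integer> faceCounts) {
        int total = 0;
        int images = 0;
        for (int count : faceCounts) {
            total += count;
            images++;
        }
        return images + "\tTotal face detected: " + total;
    }

    public static void main(String[] args) {
        // Three images with 0, 2, and 1 detected faces
        System.out.println(reduce(List.of(0, 2, 1)));
    }
}
```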


9. Summary



OpenCV provides a very rich set of image processing tools. Combined with HIPI's efficient, high-throughput parallel processing, it can be a great solution for processing very large image datasets quickly. Together, these tools can help researchers and engineers alike achieve high-performance image processing.


10. Issues



The wrapper function that converts a HIPI FloatImage to an OpenCV Mat did not work as expected and produced an incorrect image after conversion. As a result, no faces were detected. I contacted the HIPI team but did not receive a reply before finishing this project, so my results show “0” faces detected.
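One detail worth ruling out when bridging the two libraries (an assumption on my part, not a confirmed cause of this bug): OpenCV conventionally stores interleaved pixels in BGR order, while HIPI's FloatImage uses RGB, so a converter that copies bands straight across hands the classifier swapped red and blue channels. A minimal in-place swap in plain Java:

```java
public class ChannelSwap {

    // Swap the first and third samples of an interleaved 3-channel pixel,
    // converting RGB to BGR (or back) in place
    static void swapRB(double[] pixel) {
        double tmp = pixel[0];
        pixel[0] = pixel[2];
        pixel[2] = tmp;
    }

    public static void main(String[] args) {
        double[] pixel = {10.0, 20.0, 30.0}; // R, G, B
        swapRB(pixel);                       // now B, G, R
        System.out.println(pixel[0] + " " + pixel[1] + " " + pixel[2]);
    }
}
```

Channel order alone would usually degrade detection rather than eliminate it entirely, so the value range and row/column ordering of the conversion deserve the same scrutiny.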


11. Benefits (Pros/Cons)



Pros: HIPI is a great tool for processing very large volumes of images on a Hadoop cluster; combined with OpenCV, it can be very powerful.
Cons: Converting the HIPI FloatImage format to the OpenCV Mat format is not straightforward, and the conversion issues prevented OpenCV from processing the images correctly.


12. Lessons Learned


  1. Setting up the Cloudera Quickstart VM
  2. Using HIPI to run MapReduce on a large volume of images
  3. Using OpenCV in a Java environment
  4. Setting up Hadoop to load native libraries
  5. Using cached files on HDFS

207 comments:

  1. yes, I like it. I have question for Dinesh Malav. You can use SIFTDectector (Opencv) to find the same image on Hadoop using HIPI?

    ReplyDelete
  2. I am not familiar with SIFTDectector but a wrapper to convert HIPI FloatImage to any required format can be written and used with OpenCV.

    ReplyDelete
    Replies
    1. Hi Dinesh Malav,

      Subject: Image matching with hipi

      I want to know whether is it possible to search a image in the bundle that match to the image I give. Thanks

      Delete
  4. Hello Dinesh

    Did you try to run this MapReduce prog. in Hipi using eclipse by having separate classes like driver, mapper and reducer. If yes, please give me the steps of eclipse. Creating with single file.java works fine with me.. any help!

    -Prasad

    ReplyDelete
  8. hi,dinesh
    i have one project in which i have to give image of one person it will search into a video frame by frame,match and give the result like count of face detection,timing of appearance etc. so is it possible with HiPi and opencv??

    ReplyDelete
  9. I have my hipi with build.gradle file..!!! Where do i need to specify the opencv dependencies in hipi instead of build.xml........??????

    ReplyDelete
    Replies
    1. did you get any clue for this?

      Delete
    2. Well i am too facing same issue..did anyone find solution?

      Delete
    3. I think it is because you are using latest HIPI version, Dinesh used HIPI 2.0.
      I don't have solution to fix it though...

      Delete
  10. Does the bug u mentioned is solved

    ReplyDelete
  11. Hey Dinesh, nice tutorial. Very helpful.
    Can you help a bit more. I am getting problem with opencv native library. The library is loaded. But still I am getting the error:

    Exception in thread "main" java.lang.UnsatisfiedLinkError: org.opencv.core.Mat.n_Mat()J
    at org.opencv.core.Mat.n_Mat(Native Method)
    at org.opencv.core.Mat.(Mat.java:24)
    at xugglerTest.readVideoFile.run(readVideoFile.java:100)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at xugglerTest.readVideoFile.main(readVideoFile.java:113)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

    ReplyDelete
  13. Hey Dinesh,very nice Cool tutorial!!
    So,could u help bit more to send me a copy of ur demo code.pretty much thanks for you.(fengbosapphire@foxmail.com)!

    ReplyDelete
  14. Hi,
    I'm try to run the jar file but i got exception from container failure.i couldn't fix the error. plz any one help me to resolve this error.....

    ReplyDelete
  15. hi , Does HIPI use datnode , namenode , job tracker and tasktracker ?

    ReplyDelete
  16. hi , Does HIPI use datnode , namenode , job tracker and tasktracker after culling step?

    If yes what would be the flow of processing?

    ReplyDelete
  17. Hi Dinesh,
    Nice tutorial.
    Does HIPI work on Hadoop-1.2.1 ? As i am new to Hadoop I installed basic version Hadoop-1.2.1. So,can I start installing HIPI with this or do I need to install Hadoop-2.6.0.

    ReplyDelete
  19. Hi Dinesh,

    Thanks for sharing a good tutorial for Hadoop, Hipi and OpenCV.

    I am getting the following error while running my program on Hadoop with Hipi:


    Exception in thread "main" java.lang.NoClassDefFoundError: hipi/imagebundle/mapreduce/ImageBundleInputFormat at AveragePixelColor.run(AveragePixelColor.java:114) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at AveragePixelColor.main(AveragePixelColor.java:144) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

    Caused by: java.lang.ClassNotFoundException: hipi.imagebundle.mapreduce.ImageBundleInputFormat at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 10 more


    Hipi JAR file is already included in HADOOP_CLASSPATH and I tried providing the JAR using -libjars as well. No success. Do you have any idea of resolving it.

    Regards,

    ReplyDelete
    Replies
    1. Try this Workaround: sudo cp ~/Downloads/hipi-2.0.jar /usr/lib/hadoop/

      Delete
  21. hello sir,
    thanks for this awesome tutorial.
    can you please send me the complete build.xml file for HIPI which is shown in step 7 or if possible can you please tell us how to build this FaceCount.java using gradle since the latest github repo of HIPI doesn't contain build.xml. (my mail id: bscniitmunna@gmail.com)

    ReplyDelete
  23. Hi Dinesh,

    Issues section describes that the number of faced detected seem to be zero due to some HIPI related errors. Is there any way that we can resolve this. We are trying to build the same use case here. It would be really helpful if we can get a solution to this at the earliest.

    ReplyDelete
    Replies
    1. Hi Afzal,
      I have my hipi with build.gradle file instead of build.xml.Am i missing something.......??????

      Delete
  26. I just did you example and It works, I don't know if hipi team has changed something but I'm getting the found faces! Thank you so much.

    I use open cv 3.0.0 instead...

    ReplyDelete
  30. Thanks in advance for the article. However, the VM link is no longer available.

    ReplyDelete
  32. Buildfile: /home/faisal/hipi/sample/build.xml

    BUILD FAILED
    /home/faisal/hipi/sample/build.xml:1: Unexpected element "{}target" {antlib:org.apache.tools.ant}target

    Total time: 0 seconds



    hi i got this error can any one help

    ReplyDelete
  33. These instructions are 2 years old, please use new libraries. I will try to update instruction in with newer libraries.

    ReplyDelete
    Replies
    1. can you direct me to the new libraries or a send a link to new way
      thanks a lot

      Delete
  35. http://worldofbigdata-inaction.blogspot.in/2017/02/processing-images-in-hadoop-using.html

    This is my blog and explains another issue that I felt while using HIPI and solution to overcome the same

    ReplyDelete
  36. @ Dinesh - Can you please provide any other methods which can be used for image processing in hadoop?

    ReplyDelete
  57. Hi Malav, Great tutorial,

    I have an IllegalArgumentException while running a mapreduce job:

    hadoop jar facecount/facecount.jar HipiprojectHib/TestImageFace.hib project/output

    Brief Error log:

    18/03/29 09:27:59 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/abiodun/.staging/job_1522223364790_0024
    Exception in thread "main" java.lang.IllegalArgumentException: Can not create a Path from an empty string
    at org.apache.hadoop.fs.Path.checkPathArg(Path.java:163)
    at org.apache.hadoop.fs.Path.(Path.java:175)
    at org.apache.hadoop.fs.Path.(Path.java:120)
    at hipi.imagebundle.HipiImageBundle.readBundleHeader(HipiImageBundle.java:364)


    I will appreciate any suggestion to help resolve this issue. Many thanks

    ReplyDelete
    python training in rajajinagar
    Python training in btm
    Python training in usa

    ReplyDelete
  75. I recently came across your blog and have been reading along. I thought I would leave my first comment.
    java training in marathahalli | java training in btm layout

    java training in jayanagar | java training in electronic city

    ReplyDelete
  76. I prefer to study this kind of material. Nicely written information in this post, the quality of content is fine and the conclusion is lovely. Things are very open and intensely clear explanation of issues
    python training in velachery
    python training institute in chennai

    ReplyDelete
  77. Your good knowledge and kindness in playing with all the pieces were very useful. I don’t know what I would have done if I had not encountered such a step like this.
    DevOps online Training

    ReplyDelete
  78. This comment has been removed by the author.

    ReplyDelete
  79. The blog is so interactive and Informative , you should write more blogs like this Big Data Hadoop Online Training Bangalore

    ReplyDelete
  80. Hi, nice blog.. This was simply superb,, thanks for sharing such a nice post with us...
    DevOps Online Training

    ReplyDelete
  81. I found this post interesting and worth reading. Keep going and putting efforts into good things. Thank you!! Best Python Online Training || Learn Python Course

    ReplyDelete
  82. Hey, Wow all the posts are very informative for the people who visit this site. Good work! We also have a Website. Please feel free to visit our site. Thank you for sharing.Well written article Thank You Sharing with Us pmp training Chennai | pmp training centers in Chennai | pmp training institutes in Chennai | pmp training and certification in Chennai | pmp training in velachery

    ReplyDelete
  83. Really useful information. We are providing best data science online training from industry experts.

    Data Science Classes in Pune

    ReplyDelete
  84. Great article with Nice explanation about it so its very helpful for us thank you for sharing..,


    Best Event Stalls Exhibition in India
    Best Event Technology Services in India

    ReplyDelete

  85. Nice blog..! I really loved reading through this article. Thanks for sharing such
    a amazing post with us and keep blogging...
    Gmat coachining in hyderabad
    Gmat coachining in kukatpally
    Gmat coachining in Banjarahills

    ReplyDelete

  86. Worthful Hadoop tutorial. Appreciate a lot for taking up the pain to write such a quality content on Hadoop tutorial. Just now I watched this similar Hadoop tutorial and I think this will enhance the knowledge of other visitors for sureHadoop Online Training

    ReplyDelete
  87. Thank you for benefiting from time to focus on this kind of, I feel firmly about it and also really like comprehending far more with this particular subject matter. In case doable, when you get know-how, is it possible to thoughts modernizing your site together with far more details? It’s extremely useful to me.
    I really enjoy simply reading all of your weblogs. Simply wanted to inform you that you have people like me who appreciate your work. Definitely a great post I would like to read this
    Data Science training in Chennai
    Data science training in Bangalore
    Data science training in pune
    Data science online training
    Data Science Interview questions and answers
    Data science training in bangalore

    ReplyDelete
  88. I simply wanted to write down a quick word to say thanks to you for those wonderful tips and hints you are showing on this site.
    python Online training in chennai
    python training institute in marathahalli
    python training institute in btm
    Python training course in Chennai

    ReplyDelete

  89. Whoa! I’m enjoying the template/theme of this website. It’s simple, yet effective. A lot of times it’s very hard to get that “perfect balance” between superb usability and visual appeal. I must say you’ve done a very good job with this.

    AWS TRAINING IN BTM LAYOUT | AWS TRAINING IN BANGALORE
    AWS Training in Marathahalli | AWS Training in Bangalore

    ReplyDelete
  90. Read all the information that i've given in above article. It'll give u the whole idea about it.
    AWS Training in pune
    AWS Online Training
    AWS Training in Bangalore

    ReplyDelete
  91. Thanks for sharing the good information and post more information. I need some facilitate to my website. please check once http://talentflames.com/
    training and placemen