HIPI meets OpenCV
Problem
This project addresses the problem of processing large collections of images on Apache Hadoop, using the Hadoop Image Processing Interface (HIPI) for storage and efficient distributed processing, combined with OpenCV, an open source library of rich image processing algorithms. A program that counts the number of faces in a collection of images is demonstrated.
Background
Processing a large set of images on a single machine can be very time consuming and costly. HIPI is an image processing library designed to be used with Apache Hadoop MapReduce, a software framework for sorting and processing big data in a distributed fashion on large clusters of commodity hardware. HIPI facilitates efficient, high-throughput image processing with MapReduce-style parallel programs typically executed on a cluster. It provides a solution for storing a large collection of images on the Hadoop Distributed File System (HDFS) and making them available for efficient distributed processing.
OpenCV (Open Source Computer Vision) is an open source library of rich image processing algorithms, mainly aimed at real-time computer vision. Starting with version 2.4.4, OpenCV provides Java bindings, which can be used with Apache Hadoop.
Goal
This project demonstrates how HIPI and OpenCV can be used together to count the total number of faces in a large image dataset.
Overview of Steps
Big Data Set
Test images for face detection
Input: Image data containing 158 images (34 MB)
Format: PNG image files
The downloaded images were in GIF format; I used the Mac OS X Preview application to convert them to PNG.
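As an alternative to converting the images one by one in Preview, the same conversion can be scripted with the standard javax.imageio API. The sketch below is a minimal batch converter; the GifToPng class name and its command-line usage are my own illustration, not part of the project.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

public class GifToPng {
    // Convert a single GIF file to a PNG file next to it, using the
    // standard ImageIO readers/writers (GIF and PNG are always available).
    public static File convert(File gifFile) throws IOException {
        BufferedImage img = ImageIO.read(gifFile);           // decode GIF
        String name = gifFile.getName().replaceFirst("\\.gif$", ".png");
        File pngFile = new File(gifFile.getParentFile(), name);
        ImageIO.write(img, "png", pngFile);                  // encode PNG
        return pngFile;
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            File out = convert(new File(path));
            System.out.println("Wrote " + out.getPath());
        }
    }
}
```

Run it as `java GifToPng *.gif` from the directory holding the downloaded images.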
Other sources for face detection image datasets:
Technologies Used:
- VMWare Fusion: software hypervisor for running the Cloudera Quickstart VM
- Cloudera Quickstart VM: VM providing a single-node Hadoop cluster for testing and running map/reduce programs
- IntelliJ IDEA 14 CE: Java IDE for editing and compiling Java code
- Hadoop Image Processing Interface (HIPI): image processing library designed for the Apache Hadoop MapReduce parallel programming framework, for storing large collections of images on HDFS and processing them efficiently in a distributed fashion
- OpenCV: an image processing library aimed at real-time computer vision
- Apache Hadoop: distributed processing of large data sets
References:
Steps
1. Download VMWare Fusion
VMware Fusion is a software hypervisor developed by VMware for computers running OS X with Intel processors. Fusion allows Intel-based Macs to run operating systems such as Microsoft Windows, Linux, NetWare, or Solaris on virtual machines, alongside Mac OS X, using a combination of paravirtualization, hardware virtualization and dynamic recompilation. (http://en.wikipedia.org/wiki/VMware_Fusion)
Download and install VMWare Fusion from the following URL; this will be used to run the Cloudera Quickstart VM.
2. Download and Setup Cloudera Quickstart VM 5.4.x
The Cloudera QuickStart VMs contain a single-node Apache Hadoop cluster, complete with example data, queries, scripts, and Cloudera Manager to manage your cluster. The VMs run CentOS 6.4 and are available for VMware, VirtualBox, and KVM. This will help us get started with all the tools needed to run image processing on Hadoop.
Download the Cloudera Quickstart VM 5.4.x from the following URL. Quickstart VM 5.4 has Hadoop 2.6 installed, which is needed for HIPI.
http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms/cdh-5-4-x.html
Current link: http://www.cloudera.com/downloads/quickstart_vms/5-8.html
Open VMWare Fusion and open the VM
3. Getting started with HIPI
The following steps demonstrate how to set up HIPI to run a MapReduce job on Apache Hadoop.
Setup Hadoop
The Cloudera Quickstart VM 5.4.x comes pre-installed with Hadoop 2.6, which is needed for running HIPI.
Check Hadoop is installed and has correct version:
[cloudera@quickstart Project]$ which hadoop
/usr/bin/hadoop
[cloudera@quickstart Project]$ hadoop version
Hadoop 2.6.0-cdh5.4.0
Subversion http://github.com/cloudera/hadoop -r c788a14a5de9ecd968d1e2666e8765c5f018c271
Compiled by jenkins on 2015-04-21T19:18Z
Compiled with protoc 2.5.0
From source with checksum cd78f139c66c13ab5cee96e15a629025
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.4.0.jar
Install Apache Ant
Install Apache Ant and check that it is on the PATH:
[cloudera@quickstart Project]$ which ant
/usr/local/apache-ant/apache-ant-1.9.2/bin/ant
Install and build HIPI
Install HIPI by cloning the latest HIPI distribution from GitHub and building it from source. (https://github.com/uvagfx/hipi)
Clone HIPI GitHub repository
The best way to check and verify that your system is properly setup is to clone the official GitHub repository and build the tools and example programs.
[cloudera@quickstart Project]$ git clone https://github.com/uvagfx/hipi.git
Initialized empty Git repository in /home/cloudera/Project/hipi/.git/
remote: Counting objects: 2882, done.
remote: Total 2882 (delta 0), reused 0 (delta 0), pack-reused 2882
Receiving objects: 100% (2882/2882), 222.33 MiB | 7.03 MiB/s, done.
Resolving deltas: 100% (1767/1767), done.
Download Apache Hadoop tarball
Download the Apache Hadoop tarball from the following URL and untar it; this is needed to build HIPI.
[cloudera@quickstart Project]$ tar -xvzf /mnt/hgfs/CSCEI63/project/hadoop-2.6.0-cdh5.4.0.tar.gz
[cloudera@quickstart Project]$ ls
hadoop-2.6.0-cdh5.4.0 hipi
Build HIPI binaries
Change directory to the hipi repo and build HIPI.
[cloudera@quickstart Project]$ cd hipi/
[cloudera@quickstart hipi]$ ls
3rdparty data examples license.txt release
build.xml doc libsrc README.md util
Before building HIPI, update the hadoop.home and hadoop.version properties in the build.xml file to the path of the Hadoop installation and the version of Hadoop being used. Change:
build.xml
<!-- IMPORTANT: You must update the following two properties according to your Hadoop setup -->
<!-- <property name="hadoop.home" value="/usr/local/Cellar/hadoop/2.6.0/libexec/share/hadoop" /> -->
<!-- <property name="hadoop.version" value="2.6.0" /> -->
to
<!-- IMPORTANT: You must update the following two properties according to your Hadoop setup -->
<property name="hadoop.home" value="/home/cloudera/Project/hadoop-2.6.0-cdh5.4.0/share/hadoop" />
<property name="hadoop.version" value="2.6.0-cdh5.4.0" />
Build HIPI using ant
[cloudera@quickstart hipi]$ ant
Buildfile: /home/cloudera/Project/hipi/build.xml
…
hipi:
[javac] Compiling 30 source files to /home/cloudera/Project/hipi/lib
[jar] Building jar: /home/cloudera/Project/hipi/lib/hipi-2.0.jar
[echo] Hipi library built.
compile:
[javac] Compiling 1 source file to /home/cloudera/Project/hipi/bin
[jar] Building jar: /home/cloudera/Project/hipi/examples/covariance.jar
[echo] Covariance built.
all:
BUILD SUCCESSFUL
Total time: 36 seconds
Make sure it builds tools and examples:
[cloudera@quickstart hipi]$ ls
3rdparty build.xml doc lib license.txt release util
bin data examples libsrc README.md tool
[cloudera@quickstart hipi]$ ls tool/
hibimport.jar
[cloudera@quickstart hipi]$ ls examples/
covariance.jar hipi runCreateSequenceFile.sh
createsequencefile.jar jpegfromhib.jar runDownloader.sh
downloader.jar runDumpHib.sh runJpegFromHib.sh
dumphib.jar runCovariance.sh testimages.txt
Sample MapReduce Java Program
Create a SampleProgram.java class in a "sample" folder to run a simple map/reduce program using the sample.hib that will be created in a later step:
[cloudera@quickstart hipi]$ mkdir sample
[cloudera@quickstart hipi]$ vi sample/SampleProgram.java
SampleProgram.java
import hipi.image.FloatImage;
import hipi.image.ImageHeader;
import hipi.imagebundle.mapreduce.ImageBundleInputFormat;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import java.io.IOException;
public class SampleProgram extends Configured implements Tool {
public static class SampleProgramMapper extends Mapper<ImageHeader, FloatImage, IntWritable, FloatImage> {
public void map(ImageHeader key, FloatImage value, Context context)
throws IOException, InterruptedException {
// Verify that image was properly decoded, is of sufficient size, and has three color channels (RGB)
if (value != null && value.getWidth() > 1 && value.getHeight() > 1 && value.getBands() == 3) {
// Get dimensions of image
int w = value.getWidth();
int h = value.getHeight();
// Get pointer to image data
float[] valData = value.getData();
// Initialize 3 element array to hold RGB pixel average
float[] avgData = {0,0,0};
// Traverse image pixel data in raster-scan order and update running average
for (int j = 0; j < h; j++) {
for (int i = 0; i < w; i++) {
avgData[0] += valData[(j*w+i)*3+0]; // R
avgData[1] += valData[(j*w+i)*3+1]; // G
avgData[2] += valData[(j*w+i)*3+2]; // B
}
}
// Create a FloatImage to store the average value
FloatImage avg = new FloatImage(1, 1, 3, avgData);
// Divide by number of pixels in image
avg.scale(1.0f/(float)(w*h));
// Emit record to reducer
context.write(new IntWritable(1), avg);
} // If (value != null...
} // map()
}
public static class SampleProgramReducer extends Reducer<IntWritable, FloatImage, IntWritable, Text> {
public void reduce(IntWritable key, Iterable<FloatImage> values, Context context)
throws IOException, InterruptedException {
// Create FloatImage object to hold final result
FloatImage avg = new FloatImage(1, 1, 3);
// Initialize a counter and iterate over IntWritable/FloatImage records from mapper
int total = 0;
for (FloatImage val : values) {
avg.add(val);
total++;
}
if (total > 0) {
// Normalize sum to obtain average
avg.scale(1.0f / total);
// Assemble final output as string
float[] avgData = avg.getData();
String result = String.format("Average pixel value: %f %f %f", avgData[0], avgData[1], avgData[2]);
// Emit output of job which will be written to HDFS
context.write(key, new Text(result));
}
} // reduce()
}
public int run(String[] args) throws Exception {
// Check input arguments
if (args.length != 2) {
System.out.println("Usage: firstprog <input HIB> <output directory>");
System.exit(0);
}
// Initialize and configure MapReduce job
Job job = Job.getInstance();
// Set input format class which parses the input HIB and spawns map tasks
job.setInputFormatClass(ImageBundleInputFormat.class);
// Set the driver, mapper, and reducer classes which express the computation
job.setJarByClass(SampleProgram.class);
job.setMapperClass(SampleProgramMapper.class);
job.setReducerClass(SampleProgramReducer.class);
// Set the types for the key/value pairs passed to/from map and reduce layers
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(FloatImage.class);
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(Text.class);
// Set the input and output paths on the HDFS
FileInputFormat.setInputPaths(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
// Execute the MapReduce job and block until it completes
boolean success = job.waitForCompletion(true);
// Return success or failure
return success ? 0 : 1;
}
public static void main(String[] args) throws Exception {
int exitCode = ToolRunner.run(new SampleProgram(), args);
System.exit(exitCode);
}
}
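One thing worth noting about the job above: each mapper emits one average per image, and the reducer then averages those averages, so every image contributes equally to the final result regardless of its pixel count. A minimal sketch of that two-stage aggregation with plain arrays (the MeanOfMeans class below is illustrative only, not part of HIPI):

```java
public class MeanOfMeans {
    // Stage 1 (mapper): average all values of one image's channel.
    public static float imageMean(float[] pixels) {
        float sum = 0f;
        for (float p : pixels) sum += p;
        return sum / pixels.length;
    }

    // Stage 2 (reducer): average the per-image means; each image
    // counts once, no matter how many pixels it has.
    public static float reduce(float[] imageMeans) {
        float sum = 0f;
        for (float m : imageMeans) sum += m;
        return sum / imageMeans.length;
    }
}
```

If a pixel-weighted global average were wanted instead, the mapper would need to emit the pixel count alongside the sum so the reducer could weight each image accordingly.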
Add a new build target to hipi/build.xml to build SampleProgram.java and create a sample.jar library:
build.xml
…
<target name="sample">
<antcall target="compile">
<param name="srcdir" value="sample" />
<param name="jarfilename" value="sample.jar" />
<param name="jardir" value="sample" />
<param name="mainclass" value="SampleProgram" />
</antcall>
</target>
...
Build SampleProgram
[cloudera@quickstart hipi]$ ant sample
Buildfile: /home/cloudera/Project/hipi/build.xml
...
compile:
[jar] Building jar: /home/cloudera/Project/hipi/sample/sample.jar
BUILD SUCCESSFUL
Total time: 16 seconds
Running a sample HIPI MapReduce Program
Create a sample.hib file on HDFS from the sample images provided with HIPI using the hibimport tool; this will be the input to the MapReduce program.
[cloudera@quickstart hipi]$ ls data/test/ImageBundleTestCase/read/
0.jpg 1.jpg 2.jpg 3.jpg
[cloudera@quickstart hipi]$ hadoop jar tool/hibimport.jar data/test/ImageBundleTestCase/read examples/sample.hib
** added: 2.jpg
** added: 3.jpg
** added: 0.jpg
** added: 1.jpg
Created: examples/sample.hib and examples/sample.hib.dat
[cloudera@quickstart hipi]$ hadoop fs -ls examples
Found 2 items
-rw-r--r-- 1 cloudera cloudera 80 2015-05-09 22:19 examples/sample.hib
-rw-r--r-- 1 cloudera cloudera 1479345 2015-05-09 22:19 examples/sample.hib.dat
Running Hadoop MapReduce program
[cloudera@quickstart hipi]$ hadoop jar sample/sample.jar examples/sample.hib examples/output
15/05/09 23:05:10 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
…
15/05/09 23:07:05 INFO mapreduce.Job: Job job_1431127776378_0001 running in uber mode : false
15/05/09 23:07:05 INFO mapreduce.Job: map 0% reduce 0%
15/05/09 23:08:55 INFO mapreduce.Job: map 57% reduce 0%
15/05/09 23:09:04 INFO mapreduce.Job: map 100% reduce 0%
15/05/09 23:09:38 INFO mapreduce.Job: map 100% reduce 100%
15/05/09 23:09:39 INFO mapreduce.Job: Job job_1431127776378_0001 completed successfully
…
File Output Format Counters
Bytes Written=50
Check the program output:
[cloudera@quickstart hipi]$ hadoop fs -ls examples/output
Found 2 items
-rw-r--r-- 1 cloudera cloudera 0 2015-05-09 23:09 examples/output/_SUCCESS
-rw-r--r-- 1 cloudera cloudera 50 2015-05-09 23:09 examples/output/part-r-00000
The average pixel value calculated over all the images is:
[cloudera@quickstart hipi]$ hadoop fs -cat examples/output/part-r-00000
1 Average pixel value: 0.420624 0.404933 0.380449
4. Getting started with OpenCV using Java
OpenCV is an image processing library. It contains a large collection of image processing functions. Starting from version 2.4.4, OpenCV includes desktop Java bindings. We will use version 2.4.11 to build the Java bindings and use them with HIPI for image processing.
Download OpenCV source:
The zip bundle for the OpenCV 2.4.11 source can be downloaded from the following URL; unzip it into the ~/Project/opencv directory.
[cloudera@quickstart Project]$ mkdir opencv && cd opencv
[cloudera@quickstart opencv]$ unzip /mnt/hgfs/CSCEI63/project/opencv-2.4.11.zip
CMake build system
CMake is a family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files. CMake generates native makefiles and workspaces that can be used in the compiler environment of your choice. (http://www.cmake.org/)
OpenCV is built with CMake. Download the CMake binaries from http://www.cmake.org/download/ and untar the bundle into the ~/Project/opencv directory:
[cloudera@quickstart opencv]$ tar -xvzf cmake-3.2.2-Linux-x86_64.tar.gz
Build OpenCV for Java
The following steps detail how to build OpenCV on Linux.
Configure OpenCV for builds on Linux
[cloudera@quickstart opencv-2.4.11]$ ../cmake-3.2.2-Linux-x86_64/bin/cmake -DBUILD_SHARED_LIBS=OFF .
.
.
. Target "opencv_haartraining_engine" links to itself.
This warning is for project developers. Use -Wno-dev to suppress it.
-- Generating done
-- Build files have been written to: /home/cloudera/Project/opencv/opencv-2.4.11
Build OpenCV
[cloudera@quickstart opencv-2.4.11]$ make
.
.
.
[100%] Building CXX object apps/traincascade/CMakeFiles/opencv_traincascade.dir/imagestorage.cpp.o
Linking CXX executable ../../bin/opencv_traincascade
[100%] Built target opencv_traincascade
Scanning dependencies of target opencv_annotation
[100%] Building CXX object apps/annotation/CMakeFiles/opencv_annotation.dir/opencv_annotation.cpp.o
Linking CXX executable ../../bin/opencv_annotation
[100%] Built target opencv_annotation
This will create a jar containing the Java interface (bin/opencv-2411.jar) and a native dynamic library containing the Java bindings and all of OpenCV (lib/libopencv_java2411.so). We'll use these files to build and run the OpenCV program.
[cloudera@quickstart opencv-2.4.11]$ ls lib | grep .so
libopencv_java2411.so
[cloudera@quickstart opencv-2.4.11]$ ls bin | grep .jar
opencv-2411.jar
Running OpenCV Face Detection Program
The following steps verify that OpenCV is set up correctly and works as expected.
Create a new directory named sample and create an ant build.xml file in it.
[cloudera@quickstart opencv]$ mkdir sample && cd sample
[cloudera@quickstart sample]$ vi build.xml
build.xml
<project name="Main" basedir="." default="rebuild-run">
<property name="src.dir" value="src"/>
<property name="lib.dir" value="${ocvJarDir}"/>
<path id="classpath">
<fileset dir="${lib.dir}" includes="**/*.jar"/>
</path>
<property name="build.dir" value="build"/>
<property name="classes.dir" value="${build.dir}/classes"/>
<property name="jar.dir" value="${build.dir}/jar"/>
<property name="main-class" value="${ant.project.name}"/>
<target name="clean">
<delete dir="${build.dir}"/>
</target>
<target name="compile">
<mkdir dir="${classes.dir}"/>
<javac includeantruntime="false" srcdir="${src.dir}" destdir="${classes.dir}" classpathref="classpath"/>
</target>
<target name="jar" depends="compile">
<mkdir dir="${jar.dir}"/>
<jar destfile="${jar.dir}/${ant.project.name}.jar" basedir="${classes.dir}">
<manifest>
<attribute name="Main-Class" value="${main-class}"/>
</manifest>
</jar>
</target>
<target name="run" depends="jar">
<java fork="true" classname="${main-class}">
<sysproperty key="java.library.path" path="${ocvLibDir}"/>
<classpath>
<path refid="classpath"/>
<path location="${jar.dir}/${ant.project.name}.jar"/>
</classpath>
</java>
</target>
<target name="rebuild" depends="clean,jar"/>
<target name="rebuild-run" depends="clean,run"/>
</project>
Write a program using OpenCV to detect the number of faces in an image.
DetectFaces.java
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Scalar;
import org.opencv.highgui.*;
import org.opencv.core.MatOfRect;
import org.opencv.core.Point;
import org.opencv.core.Rect;
import org.opencv.objdetect.CascadeClassifier;
import java.io.File;
/**
* Created by dmalav on 4/30/15.
*/
public class DetectFaces {
public void run(String imageFile) {
System.out.println("\nRunning DetectFaceDemo");
// Create a face detector from the cascade file in the resources
// directory.
String xmlPath = "/home/cloudera/project/opencv-examples/lbpcascade_frontalface.xml";
System.out.println(xmlPath);
CascadeClassifier faceDetector = new CascadeClassifier(xmlPath);
Mat image = Highgui.imread(imageFile);
// Detect faces in the image.
// MatOfRect is a special container class for Rect.
MatOfRect faceDetections = new MatOfRect();
faceDetector.detectMultiScale(image, faceDetections);
System.out.println(String.format("Detected %s faces", faceDetections.toArray().length));
// Draw a bounding box around each face.
for (Rect rect : faceDetections.toArray()) {
Core.rectangle(image, new Point(rect.x, rect.y), new Point(rect.x + rect.width, rect.y + rect.height), new Scalar(0, 255, 0));
}
File f = new File(imageFile);
System.out.println(f.getName());
// Save the visualized detection.
String filename = f.getName();
System.out.println(String.format("Writing %s", filename));
Highgui.imwrite(filename, image);
}
}
Main.java
import org.opencv.core.Core;
import java.io.File;
public class Main {
public static void main(String... args) {
System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
if (args.length == 0) {
System.err.println("Usage Main /path/to/images");
System.exit(1);
}
File[] files = new File(args[0]).listFiles();
showFiles(files);
}
public static void showFiles(File[] files) {
DetectFaces faces = new DetectFaces();
for (File file : files) {
if (file.isDirectory()) {
System.out.println("Directory: " + file.getName());
showFiles(file.listFiles()); // Calls same method again.
} else {
System.out.println("File: " + file.getAbsolutePath());
faces.run(file.getAbsolutePath());
}
}
}
}
Build Face Detection Java Program
[cloudera@quickstart sample]$ ant -DocvJarDir=/home/cloudera/Project/opencv/opencv-2.4.11/bin -DocvLibDir=/home/cloudera/Project/opencv/opencv-2.4.11/lib jar
Buildfile: /home/cloudera/Project/opencv/sample/build.xml
compile:
[mkdir] Created dir: /home/cloudera/Project/opencv/sample/build/classes
[javac] Compiling 2 source files to /home/cloudera/Project/opencv/sample/build/classes
jar:
[mkdir] Created dir: /home/cloudera/Project/opencv/sample/build/jar
[jar] Building jar: /home/cloudera/Project/opencv/sample/build/jar/Main.jar
BUILD SUCCESSFUL
Total time: 3 seconds
This build creates a build/jar/Main.jar file, which can be used to detect faces in images stored in a directory:
[cloudera@quickstart sample]$ java -cp ../opencv-2.4.11/bin/opencv-2411.jar:build/jar/Main.jar -Djava.library.path=../opencv-2.4.11/lib Main /mnt/hgfs/CSCEI63/project/images2
File: /mnt/hgfs/CSCEI63/project/images2/addams-family.png
Running DetectFaceDemo
/home/cloudera/Project/opencv/sample/lbpcascade_frontalface.xml
Detected 7 faces
addams-family.png
Writing addams-family.png
OpenCV detected faces
OpenCV does a fairly good job of detecting front-facing faces when the lbpcascade_frontalface.xml classifier is used. OpenCV provides other classifiers that can detect rotated faces and other face orientations.
5. Configure Hadoop for OpenCV
To run OpenCV Java code, the native library (Core.NATIVE_LIBRARY_NAME) must be added to Hadoop's java.library.path. The following steps detail how to set up the OpenCV native library with Hadoop.
Copy the OpenCV native library libopencv_java2411.so to /etc/opencv/lib:
[cloudera@quickstart opencv]$ pwd
/home/cloudera/Project/opencv
[cloudera@quickstart opencv]$ ls
cmake-3.2.2-Linux-x86_64 opencv-2.4.11 sample test
[cloudera@quickstart opencv]$ sudo cp opencv-2.4.11/lib/libopencv_java2411.so /etc/opencv/lib/
Set up JAVA_LIBRARY_PATH in the /usr/lib/hadoop/libexec/hadoop-config.sh file to point to the OpenCV native library.
[cloudera@quickstart Project]$ vi /usr/lib/hadoop/libexec/hadoop-config.sh
.
.
# setup 'java.library.path' for native-hadoop code if necessary
if [ -d "${HADOOP_PREFIX}/build/native" -o -d "${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR" ]; then
if [ -d "${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR" ]; then
if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}:${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR
else
JAVA_LIBRARY_PATH=${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR
fi
fi
fi
# setup opencv native library path
JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}:/etc/opencv/lib
.
.
6. HIPI with OpenCV
This step details the Java code that combines HIPI with OpenCV.
HIPI uses the HipiImageBundle class to represent a collection of images on HDFS, and FloatImage to represent an image in memory. The FloatImage must be converted to the OpenCV Mat format for image processing (counting faces, in this case).
The following method converts a FloatImage to a Mat:
// Convert HIPI FloatImage to OpenCV Mat
public Mat convertFloatImageToOpenCVMat(FloatImage floatImage) {
// Get dimensions of image
int w = floatImage.getWidth();
int h = floatImage.getHeight();
// Get pointer to image data
float[] valData = floatImage.getData();
// Array to hold one RGB pixel
double[] rgb = {0.0,0.0,0.0};
Mat mat = new Mat(h, w, CvType.CV_8UC3);
// Traverse image pixel data in raster-scan order, scaling each channel from [0,1] to [0,255]
for (int j = 0; j < h; j++) {
for (int i = 0; i < w; i++) {
rgb[0] = (double) valData[(j*w+i)*3+0] * 255.0; // R
rgb[1] = (double) valData[(j*w+i)*3+1] * 255.0; // G
rgb[2] = (double) valData[(j*w+i)*3+2] * 255.0; // B
mat.put(j, i, rgb);
}
}
return mat;
}
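The index arithmetic and the [0,1] to [0,255] scaling in the loop above can be checked in isolation with plain arrays, without the HIPI or OpenCV types. The PixelScale helper below is illustrative only; it assumes, as the conversion method does, that FloatImage stores pixels in interleaved RGB raster order with values in [0,1].

```java
public class PixelScale {
    // Interleaved RGB raster order: pixel (i, j), channel c lives at
    // index (j*w + i)*3 + c -- the same layout the conversion loop uses.
    public static int index(int i, int j, int w, int c) {
        return (j * w + i) * 3 + c;
    }

    // Scale an interleaved float buffer in [0,1] to 8-bit values in [0,255],
    // mirroring what convertFloatImageToOpenCVMat feeds into Mat.put.
    public static int[] toBytes(float[] data, int w, int h) {
        int[] out = new int[w * h * 3];
        for (int j = 0; j < h; j++) {
            for (int i = 0; i < w; i++) {
                for (int c = 0; c < 3; c++) {
                    int idx = index(i, j, w, c);
                    out[idx] = Math.round(data[idx] * 255.0f);
                }
            }
        }
        return out;
    }
}
```

In the actual conversion, Mat.put receives the scaled values as doubles and saturate-casts them to unsigned bytes; the explicit rounding here is only for the standalone check.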
To count the number of faces in an image we need to create a CascadeClassifier, which uses a classifier file. This file must be present on HDFS; this can be accomplished with the Job.addCacheFile method, and the file is later retrieved in the Mapper class.
public int run(String[] args) throws Exception {
....
// Initialize and configure MapReduce job
Job job = Job.getInstance();
....
// add cascade file
job.addCacheFile(new URI("/user/cloudera/lbpcascade_frontalface.xml#lbpcascade_frontalface.xml"));
// Execute the MapReduce job and block until it completes
boolean success = job.waitForCompletion(true);
// Return success or failure
return success ? 0 : 1;
}
Override the Mapper's setup() method to load the OpenCV native library and create the CascadeClassifier for face detection:
public static class FaceCountMapper extends Mapper<ImageHeader, FloatImage, IntWritable, IntWritable> {
// Create a face detector from the cascade file in the resources
// directory.
private CascadeClassifier faceDetector;
public void setup(Context context)
throws IOException, InterruptedException {
// Load OpenCV native library
try {
System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
} catch (UnsatisfiedLinkError e) {
System.err.println("Native code library failed to load.\n" + e + Core.NATIVE_LIBRARY_NAME);
System.exit(1);
}
// Load cached cascade file for front face detection and create CascadeClassifier
if (context.getCacheFiles() != null && context.getCacheFiles().length > 0) {
URI mappingFileUri = context.getCacheFiles()[0];
if (mappingFileUri != null) {
faceDetector = new CascadeClassifier("./lbpcascade_frontalface.xml");
} else {
System.out.println(">>>>>> NO MAPPING FILE");
}
} else {
System.out.println(">>>>>> NO CACHE FILES AT ALL");
}
super.setup(context);
} // setup()
....
}
Full listing of FaceCount.java.
Mapper:
- Load OpenCV native library
- Create CascadeClassifier
- Convert HIPI FloatImage to OpenCV Mat
- Detect and count faces in the image
- Write number of faces detected to context
Reducer:
- Count number of files processed
- Count number of faces detected
- Output number of files and faces detected
FaceCount.java
import hipi.image.FloatImage;
import hipi.image.ImageHeader;
import hipi.imagebundle.mapreduce.ImageBundleInputFormat;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.opencv.core.*;
import org.opencv.objdetect.CascadeClassifier;
import java.io.IOException;
import java.net.URI;
public class FaceCount extends Configured implements Tool {
public static class FaceCountMapper extends Mapper<ImageHeader, FloatImage, IntWritable, IntWritable> {
// Create a face detector from the cascade file in the resources
// directory.
private CascadeClassifier faceDetector;
// Convert HIPI FloatImage to OpenCV Mat
public Mat convertFloatImageToOpenCVMat(FloatImage floatImage) {
// Get dimensions of image
int w = floatImage.getWidth();
int h = floatImage.getHeight();
// Get pointer to image data
float[] valData = floatImage.getData();
// Array to hold one RGB pixel
double[] rgb = {0.0,0.0,0.0};
Mat mat = new Mat(h, w, CvType.CV_8UC3);
// Traverse image pixel data in raster-scan order, scaling each channel from [0,1] to [0,255]
for (int j = 0; j < h; j++) {
for (int i = 0; i < w; i++) {
rgb[0] = (double) valData[(j*w+i)*3+0] * 255.0; // R
rgb[1] = (double) valData[(j*w+i)*3+1] * 255.0; // G
rgb[2] = (double) valData[(j*w+i)*3+2] * 255.0; // B
mat.put(j, i, rgb);
}
}
return mat;
}
// Count faces in image
public int countFaces(Mat image) {
// Detect faces in the image.
// MatOfRect is a special container class for Rect.
MatOfRect faceDetections = new MatOfRect();
faceDetector.detectMultiScale(image, faceDetections);
return faceDetections.toArray().length;
}
public void setup(Context context)
throws IOException, InterruptedException {
// Load OpenCV native library
try {
System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
} catch (UnsatisfiedLinkError e) {
System.err.println("Native code library failed to load.\n" + e + Core.NATIVE_LIBRARY_NAME);
System.exit(1);
}
// Load cached cascade file for front face detection and create CascadeClassifier
if (context.getCacheFiles() != null && context.getCacheFiles().length > 0) {
URI mappingFileUri = context.getCacheFiles()[0];
if (mappingFileUri != null) {
faceDetector = new CascadeClassifier("./lbpcascade_frontalface.xml");
} else {
System.out.println(">>>>>> NO MAPPING FILE");
}
} else {
System.out.println(">>>>>> NO CACHE FILES AT ALL");
}
super.setup(context);
} // setup()
public void map(ImageHeader key, FloatImage value, Context context)
throws IOException, InterruptedException {
// Verify that image was properly decoded, is of sufficient size, and has three color channels (RGB)
if (value != null && value.getWidth() > 1 && value.getHeight() > 1 && value.getBands() == 3) {
Mat cvImage = this.convertFloatImageToOpenCVMat(value);
int faces = this.countFaces(cvImage);
System.out.println(">>>>>> Detected Faces: " + Integer.toString(faces));
// Emit record to reducer
context.write(new IntWritable(1), new IntWritable(faces));
} // If (value != null...
} // map()
}
public static class FaceCountReducer extends Reducer<IntWritable, IntWritable, IntWritable, Text> {
public void reduce(IntWritable key, Iterable<IntWritable> values, Context context)
throws IOException, InterruptedException {
// Initialize counters and iterate over IntWritable records from the mapper
int total = 0;
int images = 0;
for (IntWritable val : values) {
total += val.get();
images++;
}
String result = String.format("Total faces detected: %d", total);
// Emit output of job which will be written to HDFS
context.write(new IntWritable(images), new Text(result));
} // reduce()
}
public int run(String[] args) throws Exception {
// Check input arguments
if (args.length != 2) {
System.out.println("Usage: firstprog <input HIB> <output directory>");
System.exit(0);
}
// Initialize and configure MapReduce job
Job job = Job.getInstance();
// Set input format class which parses the input HIB and spawns map tasks
job.setInputFormatClass(ImageBundleInputFormat.class);
// Set the driver, mapper, and reducer classes which express the computation
job.setJarByClass(FaceCount.class);
job.setMapperClass(FaceCountMapper.class);
job.setReducerClass(FaceCountReducer.class);
// Set the types for the key/value pairs passed to/from map and reduce layers
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(IntWritable.class);
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(Text.class);
// Set the input and output paths on the HDFS
FileInputFormat.setInputPaths(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
// add cascade file
job.addCacheFile(new URI("/user/cloudera/lbpcascade_frontalface.xml#lbpcascade_frontalface.xml"));
// Execute the MapReduce job and block until it completes
boolean success = job.waitForCompletion(true);
// Return success or failure
return success ? 0 : 1;
}
public static void main(String[] args) throws Exception {
int exitCode = ToolRunner.run(new FaceCount(), args);
System.exit(exitCode);
}
}
7. Build FaceCount.java as facecount.jar
Create a new facecount directory in the hipi folder (where HIPI was built) and copy FaceCount.java from the previous step into it.
[cloudera@quickstart hipi]$ pwd
/home/cloudera/Project/hipi
[cloudera@quickstart hipi]$ mkdir facecount
[cloudera@quickstart hipi]$ cp /mnt/hgfs/CSCEI63/project/hipi/src/FaceCount.java facecount/
[cloudera@quickstart hipi]$ ls
3rdparty data facecount libsrc README.md sample
bin doc hipiwrapper license.txt release tool
build.xml examples lib my.diff run.sh util
[cloudera@quickstart hipi]$ ls facecount/
FaceCount.java
Modify the HIPI build.xml Ant script to link against the OpenCV jar file and add a new build target, facecount.
build.xml
<project basedir="." default="all">
  <target name="setup">
    ....
    <!-- opencv dependencies -->
    <property name="opencv.jar" value="../opencv/opencv-2.4.11/bin/opencv-2411.jar" />
    <echo message="Properties set."/>
  </target>

  <target name="compile" depends="setup,test_settings,hipi">
    <mkdir dir="bin" />
    <!-- Compile -->
    <javac debug="yes" nowarn="on" includeantruntime="no" srcdir="${srcdir}" destdir="./bin" classpath="${hadoop.classpath}:./lib/hipi-${hipi.version}.jar:${opencv.jar}">
      <compilerarg value="-Xlint:deprecation" />
    </javac>
    <!-- Create the jar -->
    <jar destfile="${jardir}/${jarfilename}" basedir="./bin">
      <zipfileset src="./lib/hipi-${hipi.version}.jar" />
      <zipfileset src="${opencv.jar}" />
      <manifest>
        <attribute name="Main-Class" value="${mainclass}" />
      </manifest>
    </jar>
  </target>

  ....

  <target name="facecount">
    <antcall target="compile">
      <param name="srcdir" value="facecount" />
      <param name="jarfilename" value="facecount.jar" />
      <param name="jardir" value="facecount" />
      <param name="mainclass" value="FaceCount" />
    </antcall>
  </target>

  <target name="all" depends="hipi,hibimport,downloader,dumphib,jpegfromhib,createsequencefile,covariance" />

  <!-- Clean -->
  <target name="clean">
    <delete dir="lib" />
    <delete dir="bin" />
    <delete>
      <fileset dir="." includes="examples/*.jar,experiments/*.jar" />
    </delete>
  </target>
</project>
Build FaceCount.java
[cloudera@quickstart hipi]$ ant facecount
Buildfile: /home/cloudera/Project/hipi/build.xml
facecount:
setup:
[echo] Setting properties for build task...
[echo] Properties set.
test_settings:
[echo] Confirming that hadoop settings are set...
[echo] Properties are specified properly.
hipi:
[echo] Building the hipi library...
hipi:
[javac] Compiling 30 source files to /home/cloudera/Project/hipi/lib
[jar] Building jar: /home/cloudera/Project/hipi/lib/hipi-2.0.jar
[echo] Hipi library built.
compile:
[jar] Building jar: /home/cloudera/Project/hipi/facecount/facecount.jar
BUILD SUCCESSFUL
Total time: 12 seconds
Check that facecount.jar was built under the facecount directory:
[cloudera@quickstart hipi]$ ls facecount/
facecount.jar FaceCount.java
8. Run FaceCount MapReduce job
Set up input images
[cloudera@quickstart hipi]$ ls /mnt/hgfs/CSCEI63/project/images-png/
217.png eugene.png patio.png
221.png ew-courtney-david.png people.png
3.png ew-friends.png pict_28.png
….
Create HIB
The primary input type to a HIPI program is a HipiImageBundle (HIB), which stores a collection of images on the Hadoop Distributed File System (HDFS). Use the hibimport tool to create a HIB (project/input.hib) from the images in the local directory /mnt/hgfs/CSCEI63/project/images-png/ by running the following commands from the HIPI root directory:
[cloudera@quickstart hipi]$ hadoop fs -mkdir project
[cloudera@quickstart hipi]$ hadoop jar tool/hibimport.jar /mnt/hgfs/CSCEI63/project/images-png/ project/input.hib
** added: 217.png
** added: 221.png
** added: 3.png
** added: addams-family.png
** added: aeon1a.png
** added: aerosmith-double.png
.
.
.
** added: werbg04.png
** added: window.png
** added: wxm.png
** added: yellow-pages.png
** added: ysato.png
Created: project/input.hib and project/input.hib.dat
Run MapReduce
Create a run-facecount.sh script that removes any previous output and runs the MapReduce job:
run-facecount.sh
#!/bin/bash
hadoop fs -rm -R project/output
hadoop jar facecount/facecount.jar project/input.hib project/output
[cloudera@quickstart hipi]$ bash run-facecount.sh
15/05/12 16:48:06 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted project/output
15/05/12 16:48:17 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/05/12 16:48:20 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/05/12 16:48:21 INFO input.FileInputFormat: Total input paths to process : 1
Spawned 1 map tasks
15/05/12 16:48:22 INFO mapreduce.JobSubmitter: number of splits:1
15/05/12 16:48:22 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1431127776378_0049
15/05/12 16:48:25 INFO impl.YarnClientImpl: Submitted application application_1431127776378_0049
15/05/12 16:48:25 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1431127776378_0049/
15/05/12 16:48:25 INFO mapreduce.Job: Running job: job_1431127776378_0049
15/05/12 16:48:58 INFO mapreduce.Job: Job job_1431127776378_0049 running in uber mode : false
15/05/12 16:48:58 INFO mapreduce.Job: map 0% reduce 0%
15/05/12 16:49:45 INFO mapreduce.Job: map 3% reduce 0%
15/05/12 16:50:53 INFO mapreduce.Job: map 5% reduce 0%
15/05/12 16:50:57 INFO mapreduce.Job: map 8% reduce 0%
15/05/12 16:51:01 INFO mapreduce.Job: map 11% reduce 0%
15/05/12 16:51:09 INFO mapreduce.Job: map 15% reduce 0%
15/05/12 16:51:13 INFO mapreduce.Job: map 22% reduce 0%
15/05/12 16:51:18 INFO mapreduce.Job: map 25% reduce 0%
15/05/12 16:51:21 INFO mapreduce.Job: map 28% reduce 0%
15/05/12 16:51:29 INFO mapreduce.Job: map 31% reduce 0%
15/05/12 16:51:32 INFO mapreduce.Job: map 33% reduce 0%
15/05/12 16:51:45 INFO mapreduce.Job: map 38% reduce 0%
15/05/12 16:51:57 INFO mapreduce.Job: map 51% reduce 0%
15/05/12 16:52:07 INFO mapreduce.Job: map 55% reduce 0%
15/05/12 16:52:10 INFO mapreduce.Job: map 58% reduce 0%
15/05/12 16:52:14 INFO mapreduce.Job: map 60% reduce 0%
15/05/12 16:52:18 INFO mapreduce.Job: map 63% reduce 0%
15/05/12 16:52:26 INFO mapreduce.Job: map 100% reduce 0%
15/05/12 16:52:59 INFO mapreduce.Job: map 100% reduce 100%
15/05/12 16:53:01 INFO mapreduce.Job: Job job_1431127776378_0049 completed successfully
15/05/12 16:53:02 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=1576
FILE: Number of bytes written=226139
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=35726474
HDFS: Number of bytes written=27
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=205324
Total time spent by all reduces in occupied slots (ms)=29585
Total time spent by all map tasks (ms)=205324
Total time spent by all reduce tasks (ms)=29585
Total vcore-seconds taken by all map tasks=205324
Total vcore-seconds taken by all reduce tasks=29585
Total megabyte-seconds taken by all map tasks=210251776
Total megabyte-seconds taken by all reduce tasks=30295040
Map-Reduce Framework
Map input records=157
Map output records=157
Map output bytes=1256
Map output materialized bytes=1576
Input split bytes=132
Combine input records=0
Combine output records=0
Reduce input groups=1
Reduce shuffle bytes=1576
Reduce input records=157
Reduce output records=1
Spilled Records=314
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=11772
CPU time spent (ms)=45440
Physical memory (bytes) snapshot=564613120
Virtual memory (bytes) snapshot=3050717184
Total committed heap usage (bytes)=506802176
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=35726342
File Output Format Counters
Bytes Written=27
Check results:
[cloudera@quickstart hipi]$ hadoop fs -ls project/output
Found 2 items
-rw-r--r-- 1 cloudera cloudera 0 2015-05-12 16:52 project/output/_SUCCESS
-rw-r--r-- 1 cloudera cloudera 27 2015-05-12 16:52 project/output/part-r-00000
[cloudera@quickstart hipi]$ hadoop fs -cat project/output/part-r-00000
157 Total face detected: 0
9. Summary
OpenCV provides a very rich set of image processing tools. Combined with HIPI's efficient, high-throughput parallel image processing, it can be a great solution for processing very large image datasets quickly. Together, these tools can help researchers and engineers alike achieve high-performance image processing.
10. Issues
The wrapper function that converts a HIPI FloatImage to an OpenCV image did not work as expected and produced an incorrect image after conversion, which caused the detector to find no faces. I contacted the HIPI team but did not receive a reply before finishing this project. As a result, my output shows “0” faces detected.
11. Benefits (Pros/Cons)
Pros: HIPI is a great tool for processing very large volumes of images on a Hadoop cluster, and combining it with OpenCV makes it even more powerful.
Cons: Converting HIPI's FloatImage format to OpenCV's Mat format is not straightforward, and conversion errors prevent OpenCV from processing the images correctly.
12. Lessons Learned
- Setting up the Cloudera Quickstart VM
- Using HIPI to run MapReduce jobs on a large volume of images
- Using OpenCV in a Java environment
- Setting up Hadoop to load native libraries
- Using cached files on HDFS
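On the native-library lesson: map tasks run in their own JVMs, so the OpenCV JNI library (libopencv_java2411.so for OpenCV 2.4.11) must be on each task's java.library.path, not just the client's. One way to arrange this is via the map task JVM options in mapred-site.xml; the path below is an assumption for this Cloudera VM setup, and the library must exist at that path on every cluster node:

```
<!-- mapred-site.xml: make the OpenCV JNI library visible to map task JVMs.
     The directory is an example; use wherever libopencv_java2411.so was
     built or installed on your cluster nodes. -->
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1024m -Djava.library.path=/home/cloudera/Project/opencv/opencv-2.4.11/lib</value>
</property>
```

Without this (or an equivalent, such as shipping the .so via the distributed cache), `System.loadLibrary` in the mapper fails with UnsatisfiedLinkError.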
ReplyDeleteDevOps Online Training
Hi,
ReplyDeleteI must appreciate you for providing such a valuable content for us. This is one amazing piece of article. Helped a lot in increasing my knowledge.
Cloud computing Training in Chennai
Hadoop Training in Chennai
Best institute for Cloud computing in Chennai
Cloud computing Training Chennai
Big Data Course in Chennai
Big Data Hadoop Training in Chennai
Does your blog have a contact page? I’m having problems locating it but, I’d like to shoot you an email. I’ve got some recommendations for your blog you might be interested in hearing.
ReplyDeleteAWS Training in Pune | Best Amazon Web Services Training in Pune
AWS Tutorial |Learn Amazon Web Services Tutorials |AWS Tutorial For Beginners
Amazon Web Services Training in OMR , Chennai | Best AWS Training in OMR,Chennai
AWS Training in Chennai |Best Amazon Web Services Training in Chennai
Amazon Web Services Training in Pune | Best AWS Training in Pune
I found this post interesting and worth reading. Keep going and putting efforts into good things. Thank you!! Best Python Online Training || Learn Python Course
ReplyDeleteHey, Wow all the posts are very informative for the people who visit this site. Good work! We also have a Website. Please feel free to visit our site. Thank you for sharing.Well written article Thank You Sharing with Us pmp training Chennai | pmp training centers in Chennai | pmp training institutes in Chennai | pmp training and certification in Chennai | pmp training in velachery
ReplyDeleteReally useful information. We are providing best data science online training from industry experts.
ReplyDeleteData Science Classes in Pune
Great article with Nice explanation about it so its very helpful for us thank you for sharing..,
ReplyDeleteBest Event Stalls Exhibition in India
Best Event Technology Services in India
ReplyDeleteNice blog..! I really loved reading through this article. Thanks for sharing such
a amazing post with us and keep blogging...
Gmat coachining in hyderabad
Gmat coachining in kukatpally
Gmat coachining in Banjarahills
ReplyDeleteWorthful Hadoop tutorial. Appreciate a lot for taking up the pain to write such a quality content on Hadoop tutorial. Just now I watched this similar Hadoop tutorial and I think this will enhance the knowledge of other visitors for sureHadoop Online Training
Whoa! I’m enjoying the template/theme of this website. It’s simple, yet effective. A lot of times it’s very hard to get that “perfect balance” between superb usability and visual appeal. I must say you’ve done a very good job with this.
ReplyDeleteAWS TRAINING IN CHENNAI | BEST AWS TRAINING IN CHENNAI
aws training in chennai"> aws training in chennai | Advanced AWS Training in Chennai
AWS Training in Velachery, Chennai | Best AWS Training in Velachery, Chennai
AWS Training in Tambaram | Best AWS Training in Tambaram
AWS Training in Sholinganallur, Chennai
AWS Training in Annanagar, Chennai | AWS Training in Chennai
AWS Training in Chennai | Advanced AWS Training in Chennai
Thank you for benefiting from time to focus on this kind of, I feel firmly about it and also really like comprehending far more with this particular subject matter. In case doable, when you get know-how, is it possible to thoughts modernizing your site together with far more details? It’s extremely useful to me.
ReplyDeleteI really enjoy simply reading all of your weblogs. Simply wanted to inform you that you have people like me who appreciate your work. Definitely a great post I would like to read this
Data Science training in Chennai
Data science training in Bangalore
Data science training in pune
Data science online training
Data Science Interview questions and answers
Data science training in bangalore
I simply wanted to write down a quick word to say thanks to you for those wonderful tips and hints you are showing on this site.
ReplyDeletepython Online training in chennai
python training institute in marathahalli
python training institute in btm
Python training course in Chennai
ReplyDeleteWhoa! I’m enjoying the template/theme of this website. It’s simple, yet effective. A lot of times it’s very hard to get that “perfect balance” between superb usability and visual appeal. I must say you’ve done a very good job with this.
AWS TRAINING IN BTM LAYOUT | AWS TRAINING IN BANGALORE
AWS Training in Marathahalli | AWS Training in Bangalore
Read all the information that i've given in above article. It'll give u the whole idea about it.
ReplyDeleteAWS Training in pune
AWS Online Training
AWS Training in Bangalore
Thanks for sharing the good information and post more information. I need some facilitate to my website. please check once http://talentflames.com/
training and placemen