The word count example explained at http://static.springsource.org/spring-hadoop/docs/current/reference/html/batch-wordcount.html did not run for me.

I followed these steps:

1. Imported and compiled the source (bundled with the spring-data-hadoop distribution) in the Spring Tool Suite IDE on a Windows box.
2. Exported the executable jar with all required dependencies using the 'installApp' option in the IDE (Run As -> Gradle Build -> installApp).
3. Copied 'build/install/batch-wordcount' to the Linux Hadoop cluster.
4. Executed the sample using './build/install/wordcount/bin/wordcount classpath:/launch-context.xml job1'.

However, the execution failed with a ClassNotFoundException for the class org.apache.hadoop.examples.WordCount$TokenizerMapper.
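In plain Hadoop code, this kind of ClassNotFoundException is usually avoided by pointing the job at the jar that contains the mapper, so that jar gets shipped to the task nodes. Below is a minimal driver sketch for comparison; it is not the Spring sample's code, and the input/output paths are purely illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.examples.WordCount;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountDriver {
        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "wordcount");
            // Ship the jar containing WordCount (and its nested TokenizerMapper)
            // to the task nodes; without it the tasks cannot load the class.
            job.setJarByClass(WordCount.class);
            job.setMapperClass(WordCount.TokenizerMapper.class);
            job.setCombinerClass(WordCount.IntSumReducer.class);
            job.setReducerClass(WordCount.IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path("/user/gutenberg/input"));    // illustrative path
            FileOutputFormat.setOutputPath(job, new Path("/user/gutenberg/output")); // illustrative path
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }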

In a previous blog post we got to know enterprise integration patterns (EIPs). In this post we will look at Apache Camel, a framework that realizes those patterns.

About Camel:

Apache Camel is an open source project, almost five years old, with a large community of users. At the heart of the framework is an engine that does the job of mediation and routes messages from one system to another.
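To make the routing idea concrete, here is a minimal route in Camel's Java DSL. This is only a sketch; the file endpoints and the five-second lifetime are arbitrary choices for the demo:

    import org.apache.camel.builder.RouteBuilder;
    import org.apache.camel.impl.DefaultCamelContext;

    public class FileMoveDemo {
        public static void main(String[] args) throws Exception {
            DefaultCamelContext context = new DefaultCamelContext();
            context.addRoutes(new RouteBuilder() {
                @Override
                public void configure() {
                    // route every file dropped into data/inbox over to data/outbox
                    from("file:data/inbox").to("file:data/outbox");
                }
            });
            context.start();
            Thread.sleep(5000); // keep the engine alive long enough to pick up files
            context.stop();
        }
    }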

Here are a couple of issues I encountered when I started experimenting with Spring for Apache Hadoop.

First, the Hadoop job that I was running was not appearing on the Map/Reduce Administration console or the JobTracker interface.

Second, I was trying to run the job from the Spring Tool Suite (STS) IDE on a Windows machine, whereas the Hadoop cluster was on Linux machines. Permission issues (AccessControlException) inhibited job execution in this mode.
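The first issue usually means the job never reached the cluster at all: if mapred.job.tracker is not set, Hadoop 1.x silently falls back to the local job runner, and a locally run job never appears on the JobTracker UI. A sketch of the relevant configuration, with assumed host names, looks like this:

    import org.apache.hadoop.conf.Configuration;

    public class ClusterConf {
        public static Configuration create() {
            Configuration conf = new Configuration();
            // Host and port values below are assumptions; use your cluster's.
            conf.set("fs.default.name", "hdfs://master:9000");  // NameNode
            // Without this property the local job runner is used and the job
            // never shows up on the Map/Reduce Administration console.
            conf.set("mapred.job.tracker", "master:9001");      // JobTracker
            return conf;
        }
    }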

A couple of setup issues observed while installing and using Hive.

Related software versions: Hadoop: 1.0.3, HBase: 0.92.1, Hive: 0.9.0

I installed Hive using the steps mentioned at http://cloudblog.8kmiles.com/2012/01/31/hive-installation/ to set up the various environment variables and create the required directories.

NoClassDefFoundError / classpath issue

However, when I tried starting Hive using bin/hive, I found myself deep in NoClassDefFoundError issues like the one mentioned below.

In this blog entry we will go through some of the Enterprise Integration Patterns (EIPs). These are well-known design patterns that aim to solve integration challenges. After reading this, one should have a head start in designing integration solutions.

In short, EIPs are known design patterns that provide solutions to problems faced during application integration.
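As a taste of what these patterns look like, below is a framework-free sketch of one of them, the content-based router, which inspects a message and forwards it to the channel that matches its content. All the types and channel names here are hypothetical, purely for illustration:

    // A minimal sketch of the content-based router pattern.
    interface Channel { void send(Message m); }

    class Message {
        final String type;
        final String body;
        Message(String type, String body) { this.type = type; this.body = body; }
    }

    class ContentBasedRouter {
        private final Channel bookOrders;   // hypothetical channel names
        private final Channel otherOrders;

        ContentBasedRouter(Channel bookOrders, Channel otherOrders) {
            this.bookOrders = bookOrders;
            this.otherOrders = otherOrders;
        }

        // Examine the message content and forward it to the matching channel.
        void route(Message m) {
            if ("book".equals(m.type)) {
                bookOrders.send(m);
            } else {
                otherOrders.send(m);
            }
        }
    }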

[A new approach to building next-generation enterprise applications]

JavaScript has been in the web world for a while and has become one of the most popular, best-known, best-understood, and also most hated languages available. There are many reasons why JavaScript is hated, and there is plenty of literature about how it turned into a vamp, while others try to see the good parts of it.

This article is not about cursing this multifaceted language; instead, we will try to explore more of its possibilities.

In a previous blog, 'Marking the map', we discussed the different frameworks and APIs available for marking geographical coordinates on a map. This edition is an extension of that: here we will discuss an algorithm for extracting geographical coordinates (i.e. latitude and longitude) from an image. The solution requires implementing several image processing algorithms and a bit of mathematics. Let's discuss the requirement in brief.

Yesterday we were working on setting up our first Hadoop cluster. Though there is plenty of online documentation on this, we still faced a few challenges. In this post I provide details of the problems we faced and their solutions:

Passwordless login from NameNode to DataNode and vice versa:

Setting up passwordless login from the NameNode to the DataNodes was the easy part.

UMLGraph allows the declarative specification and drawing of UML class and sequence diagrams. The specification is done in text, which is then transformed into the appropriate graphical representations.

UMLGraph is implemented as a javadoc doclet (a program satisfying the doclet API that specifies the content and format of the output generated by the javadoc tool). Furthermore, the output of UMLGraph needs to be post-processed with the Graphviz dot program.
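As a rough illustration, a UMLGraph class-diagram specification is ordinary Java source annotated with UMLGraph's javadoc tags. The class names below are made up for this example; @opt sets diagram options globally via the special UMLOptions class, @hidden keeps that class out of the drawing, and @composed declares a composition relationship:

    /**
     * Global diagram options: show attributes and operations for all classes.
     * @opt attributes
     * @opt operations
     * @hidden
     */
    class UMLOptions {}

    /**
     * A car is drawn as composed of four wheels.
     * @composed 1 - 4 Wheel
     */
    class Car {
        private String model;
        public void drive() {}
    }

    class Wheel {
        private int diameter;
    }

Running javadoc with the UMLGraph doclet over such a file emits a Graphviz .dot file, which dot then renders to PNG or SVG; the exact doclet class name and jar location depend on the UMLGraph version installed.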

With my latest assignment I have started exploring Hadoop and related technologies. While exploring HDFS and playing with it, I came across these two syntaxes for querying HDFS:

> hadoop dfs

> hadoop fs

Initially I could not differentiate between the two and kept wondering why we have two different syntaxes for a common purpose.
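Under the hood, both commands drive Hadoop's generic FileSystem abstraction, which resolves a concrete implementation (HDFS, the local filesystem, and so on) from the URI scheme in the configuration. Here is a small sketch of that API, with an assumed NameNode address:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListRoot {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("fs.default.name", "hdfs://localhost:9000"); // assumed NameNode address
            // FileSystem.get picks the implementation from the URI scheme,
            // which is what makes the generic 'fs' form filesystem-agnostic.
            FileSystem fs = FileSystem.get(conf);
            for (FileStatus status : fs.listStatus(new Path("/"))) {
                System.out.println(status.getPath());
            }
        }
    }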