With my latest assignment I have started exploring Hadoop and related technologies. While exploring HDFS and playing with it, I came across these two syntaxes for querying HDFS:

> hadoop dfs
> hadoop fs

Initially I could not differentiate between the two and kept wondering why we have two different syntaxes for a common purpose. I searched the web, found that other people had the same question, and below are their reasonings:

Per Chris's explanation, it looks like there is no difference between the two syntaxes. If we look at the definitions of the two commands (hadoop fs and hadoop dfs) in $HADOOP_HOME/bin/hadoop:
...
elif [ "$COMMAND" = "datanode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_DATANODE_OPTS"
elif [ "$COMMAND" = "fs" ] ; then
  CLASS=org.apache.hadoop.fs.FsShell
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "dfs" ] ; then
  CLASS=org.apache.hadoop.fs.FsShell
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "dfsadmin" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.DFSAdmin
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
...
That is his reasoning: both commands invoke the exact same class, org.apache.hadoop.fs.FsShell, so there is no difference between them.
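
If you want to verify this yourself, here is a minimal sketch of my own (not from Chris's answer) that invokes that same FsShell class directly from Java; it assumes the Hadoop client jars and configuration are on the classpath:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsShell;
import org.apache.hadoop.util.ToolRunner;

public class FsShellCheck {
    public static void main(String[] args) throws Exception {
        // "hadoop fs -ls /" and "hadoop dfs -ls /" both dispatch to org.apache.hadoop.fs.FsShell,
        // so running it directly is equivalent to either command.
        int exitCode = ToolRunner.run(new Configuration(), new FsShell(), new String[] { "-ls", "/" });
        System.exit(exitCode);
    }
}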

I was not fully convinced by this, so I looked for a more convincing answer, and here are a few excerpts which made better sense to me:

fs relates to a generic file system which can point to any file system, such as the local file system, HDFS, etc., while dfs is specific to HDFS. So fs can be used to perform operations from/to either the local file system or the Hadoop distributed file system, whereas dfs relates only to HDFS.

Below are the excerpts from the Hadoop documentation which describe these two as different shells.


FS Shell
The FileSystem (FS) shell is invoked by bin/hadoop fs. All the FS shell commands take path URIs as arguments. The URI format is scheme://authority/path. For HDFS the scheme is hdfs, and for the local filesystem the scheme is file. The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory such as /parent/child can be specified as hdfs://namenodehost/parent/child or simply as /parent/child (given that your configuration is set to point to hdfs://namenodehost). Most of the commands in FS shell behave like corresponding Unix commands.

DFShell
The HDFS shell is invoked by bin/hadoop dfs. All the HDFS shell commands take path URIs as arguments. The URI format is scheme://authority/path. For HDFS the scheme is hdfs, and for the local filesystem the scheme is file. The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory such as /parent/child can be specified as hdfs://namenode:namenodeport/parent/child or simply as /parent/child (given that your configuration is set to point to namenode:namenodeport). Most of the commands in HDFS shell behave like corresponding Unix commands.

So from the above it can be concluded that it all depends upon the configured scheme. When using these two commands with an absolute URI, i.e. scheme://a/b, the behavior is identical. It is only the default scheme each shell is documented with (file for fs and hdfs for dfs) that causes the difference in behavior.
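
To make the scheme behaviour concrete, here is a small sketch of my own (not from the excerpts) using Hadoop's Java FileSystem API; the namenode host and paths are placeholders:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class SchemeCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Explicit "file" scheme always resolves to the local file system
        FileSystem local = FileSystem.get(URI.create("file:///tmp"), conf);

        // Explicit "hdfs" scheme always resolves to HDFS ("namenodehost" is a placeholder)
        FileSystem hdfs = FileSystem.get(URI.create("hdfs://namenodehost/"), conf);

        // No scheme: whatever default file system the configuration points to is used
        FileSystem dflt = FileSystem.get(conf);

        System.out.println(local.getClass().getSimpleName()); // LocalFileSystem
        System.out.println(hdfs.getClass().getSimpleName());  // DistributedFileSystem
        System.out.println(dflt.getClass().getSimpleName());  // depends on configuration
    }
}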

You might be looking for a way to profile (capture method execution time in) your Spring application. Spring provides different ways to profile an application. Profiling should be treated as a separate concern, and Spring AOP offers an easy way to separate it: with Spring AOP you can profile your methods without making any changes to the actual classes.

You just need to perform a few simple steps to configure your application for profiling using Spring AOP:


  • In the application context file, add the tag below to enable load-time weaving:
         <context:load-time-weaver/>
         With this configuration, AspectJ's load-time weaver is registered with the current class loader. All classes
         (which match the patterns in the aspect class below) are weaved at load time as defined in the
         aspect class file.

  • Create the aspect class for profiling.
                This aspect is written to capture a method's execution time. You can configure the
                classes which need to be profiled with @Around advice (see the hypothetical service sketch after this list).

package com.xxx.profiling.aspect;

import org.apache.log4j.Logger;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.stereotype.Component;

/**
 * Aspect class to profile the configured methods
 */
@Component
@Aspect
public class ProfilingAspect {
   
    /** Logger instance. */
   
    private static final Logger LOGGER = Logger.getLogger(ProfilingAspect.class);
   
    public ProfilingAspect() {
        
         LOGGER.info("Initializing Profiling Aspect... ");
    }
   
    // Shared advice body: logs the method signature, arguments and elapsed time
    public Object profileMethods(ProceedingJoinPoint thisJoinPoint) throws Throwable {
      
        StringBuffer logMessage = new StringBuffer();       
        logMessage.append(thisJoinPoint.getSignature().getName());
        logMessage.append("(");
       
        // Collect arguments value
        Object[] args = thisJoinPoint.getArgs();
        for (int i = 0; i < args.length; i++) {
            logMessage.append(args[i]).append(",");
        }
        if (args.length > 0) {
            logMessage.deleteCharAt(logMessage.length() - 1);
        }
       
        logMessage.append(") ");
       
        // Calculate the method's elapsed time, keeping the advised method's return value
        logMessage.append(" [Elapsed Time: ");
        long start = System.currentTimeMillis();
        Object result = thisJoinPoint.proceed();
        long end = System.currentTimeMillis();
        logMessage.append(end - start);
        logMessage.append(" ms]");

        LOGGER.debug("Profiling : " + thisJoinPoint.getTarget().getClass().getName() + "." + logMessage.toString());

        // Return the original result so callers of non-void methods are not affected
        return result;
    }
   
    // All public methods in the services package will be weaved for profiling
    @Around("execution(public * com.xxx.services..*.*(..))")
    public Object profileServiceMethods(ProceedingJoinPoint thisJoinPoint) throws Throwable {
        return profileMethods(thisJoinPoint);
    }
   
    // All public methods in the dao package will be weaved for profiling
    @Around("execution(public * com.xxx.dao..*.*(..))")
    public Object profileDAOMethods(ProceedingJoinPoint thisJoinPoint) throws Throwable {
        return profileMethods(thisJoinPoint);
    }
}

  • Scan the aspect's package in the application context file so that this aspect class is initialized, e.g.
          <context:component-scan base-package="com.xxx.profiling.aspect" />
  • Download the spring-agent.jar file from here
  • Configure a JVM parameter pointing to the spring-agent jar path
          -javaagent:C:/yourpath/spring-agent-2.5.6.jar
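
For illustration, a hypothetical service like the one below (the package, class, and method names are my assumptions, not part of the original setup) falls under the com.xxx.services..* pointcut, so its public methods get weaved and timed automatically once the steps above are in place:

package com.xxx.services;

import org.springframework.stereotype.Service;

/**
 * Hypothetical service used only to show what the pointcut matches:
 * it lives under com.xxx.services, so every public method is
 * intercepted by profileServiceMethods() at load time.
 */
@Service
public class UserService {

    public String getUserDetail(long userId) {
        // ... real lookup logic would go here ...
        return "user-" + userId;
    }
}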

You are done with the required configuration and implementation. Now it's time to see your execution statistics.

Start your application server and verify whether the aspect class is initialized successfully. You should see the log message below in the application log file; if it appears, your aspect is initialized properly.

Initializing Profiling Aspect...

Whenever your service or data access class methods are executed, you should see logs like the ones below.


Profiling : com.xxx.dao.MyDAO.getUserDetail(123)  [Elapsed Time: 246 ms]
Profiling : com.xxx.services.MyService.getUserDetail(123)  [Elapsed Time: 247 ms]


Apart from this approach, you can also go for other profiling options.
