Feb 5
HDFS to Mongo migration
One of our current assignments required migrating data from HDFS to a Mongo database. The data in HDFS was in JSON format, which was a plus since Mongo natively supports JSON documents.
I started looking for strategies to carry out this migration. My first thought was to use a Hadoop MapReduce job to read from HDFS and insert the data into Mongo. I was looking specifically at the Spring-Data and Spring-Data-Mongo frameworks to get this done.
Varun, one of my team members, suggested that mongoimport might be able to do this. I thought of giving it a try and started looking at the different options it offers.
Linux pipes and redirection do the trick: without the --file parameter, mongoimport reads from standard input, so the output from HDFS can be piped directly into the mongoimport command.
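To make the pipe approach concrete, here is a minimal sketch of what such a command could look like. It is not the exact command we ran: the function name, HDFS path, database, and collection are hypothetical placeholders, and it assumes both the hadoop and mongoimport CLIs are on the PATH and that the HDFS files hold one JSON document per line.

```shell
#!/bin/sh
# Sketch: stream JSON documents from HDFS straight into mongoimport.
# The function name and all arguments are hypothetical placeholders.

migrate_hdfs_to_mongo() {
    hdfs_path=$1     # HDFS file or glob, e.g. /data/events/part-*
    db=$2            # target Mongo database
    collection=$3    # target Mongo collection

    # `hadoop fs -cat` writes the file contents to stdout. mongoimport is
    # given no --file option, so it reads the documents from stdin instead.
    # The path is quoted so that any glob is expanded by HDFS, not by the
    # local shell.
    hadoop fs -cat "$hdfs_path" | mongoimport --db "$db" --collection "$collection"
}

# Example invocation (on a machine with both CLIs installed):
# migrate_hdfs_to_mongo '/data/events/part-*' mydb events
```

Nothing is staged on local disk here; the data flows from HDFS through the pipe into Mongo. Note that a plain pipe reports only the exit status of mongoimport, so a failure in the HDFS read would need a separate check.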