How to connect to HDFS using Java?
Required library files
For the Cloudera distribution:
1. log4j-1.2.17.jar
2. commons-logging-1.0.4.jar
3. guava-r09-jarjar.jar
4. hadoop-core-0.20.2.jar
For Hadoop 2.7.2, the required JARs (all located in the common/lib folder) are listed below; a compile/run sketch follows the list.
- commons-io-2.4.jar
- guava-11.0.2.jar
- hadoop-common-2.7.2.jar
- htrace-core-3.1.0-incubating.jar
- protobuf-java-2.5.0.jar
- slf4j-api-1.7.10.jar
- commons-logging-1.1.3.jar
- hadoop-auth-2.7.2.jar
- hadoop-hdfs-2.7.2.jar
- log4j-1.2.17.jar
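Rather than listing each JAR by hand, one way to compile and run the examples below is to let the hadoop command build the classpath for you. A minimal sketch, assuming the hadoop CLI is on your PATH (hadoop classpath prints the distribution's JAR locations):

# Compile against the cluster's own JARs
javac -cp "$(hadoop classpath)" ReadFileSystem.java
# Run with the current directory added for the compiled class
java -cp "$(hadoop classpath):." ReadFileSystem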
Location:
On the master node, running hadoop version shows the core JAR to use:
[root@oel6 ~]# hadoop version
Hadoop 0.20.2-cdh3u6
Subversion file:///data/1/tmp/topdir/BUILD/hadoop-0.20.2-cdh3u6 -r efb405d2aa54039bdf39e0733cd0bb9423a1eb0a
Compiled by root on Wed Mar 20 13:11:26 PDT 2013
From source with checksum 3277b62b2872d77555cfbc5a202f81c4
This command was run using /usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u6.jar
So use /usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u6.jar.
Basic code (adapted from http://www.folkstalk.com/2013/06/connect-to-hadoop-hdfs-through-java.html):
Read FileSystem
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class ReadFileSystem {
    public static void main(String[] args) throws IOException, URISyntaxException {
        Configuration conf = new Configuration();
        // Connect to the NameNode (replace IP with your NameNode address)
        FileSystem hdfs = FileSystem.get(new URI("hdfs://IP:9000"), conf);
        // List the entries under the /new directory
        FileStatus[] fileStatus = hdfs.listStatus(new Path("hdfs://IP:9000/new"));
        Path[] paths = FileUtil.stat2Paths(fileStatus);
        System.out.println("***** Contents of the Directory *****");
        for (Path path : paths) {
            System.out.println(path);
        }
        hdfs.close();
    }
}
Running hdfs getconf -confKey fs.default.name on the server shows the correct DFS location to use in the URI.
Sample output:
***** Contents of the Directory *****
hdfs://IP:9000/new/123.txt
hdfs://IP:9000/new/newww.txt
hdfs://IP:9000/new/sterin.txt
hdfs://IP:9000/new/ucm
hdfs://IP:9000/new/valut
hdfs://IP:9000/new/weblayout
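Rather than hardcoding hdfs://IP:9000, the client can pick up fs.default.name from the cluster configuration itself. A minimal sketch, assuming core-site.xml is on the classpath (the class name ShowDefaultFs is hypothetical):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class ShowDefaultFs {
    public static void main(String[] args) throws IOException {
        // Assumption: core-site.xml is on the classpath, so the
        // Configuration object already knows fs.default.name
        Configuration conf = new Configuration();
        System.out.println("Default FS: " + conf.get("fs.default.name"));
        // FileSystem.get(conf) connects to that default filesystem
        // without any hardcoded URI
        FileSystem hdfs = FileSystem.get(conf);
        System.out.println("Connected to: " + hdfs.getUri());
        hdfs.close();
    }
}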
Write a file to HDFS
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;

public class CopyFileToHDFS {
    public static void main(String[] args) throws IOException, URISyntaxException {
        // 1. Get an instance of Configuration
        Configuration configuration = new Configuration();
        // 2. Create an InputStream to read the data from the local file
        InputStream inputStream = new BufferedInputStream(
                new FileInputStream("/tmp/sample.txt"));
        // 3. Get the HDFS instance
        FileSystem hdfs = FileSystem.get(new URI("hdfs://IP:9000"), configuration);
        // 4. Open an OutputStream to write the data; this is
        //    obtained from the FileSystem
        OutputStream outputStream = hdfs.create(
                new Path("hdfs://IP:9000/forsterin/Hadoop_File.txt"),
                new Progressable() {
                    @Override
                    public void progress() {
                        // Called periodically to report write progress
                        System.out.println("....");
                    }
                });
        try {
            IOUtils.copyBytes(inputStream, outputStream, 4096, false);
        } finally {
            IOUtils.closeStream(inputStream);
            IOUtils.closeStream(outputStream);
        }
    }
}
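To confirm the write succeeded, you can read the file back and stream it to stdout. A minimal sketch along the same lines (the IP placeholder and path mirror the example above):

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class ReadFileFromHDFS {
    public static void main(String[] args) throws IOException, URISyntaxException {
        Configuration conf = new Configuration();
        FileSystem hdfs = FileSystem.get(new URI("hdfs://IP:9000"), conf);
        // Open the file written by CopyFileToHDFS and copy it to stdout
        FSDataInputStream in = hdfs.open(
                new Path("hdfs://IP:9000/forsterin/Hadoop_File.txt"));
        try {
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}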