Hadoop HDFS and Java: how to connect to the HDFS file system from Java and which packages are needed

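⑴ What are the core technologies of Hadoop in big data

The Hadoop core architecture is divided into four modules:

1. Hadoop Common: provides the Java libraries and utilities that the other Hadoop modules need.

2. Hadoop YARN: provides job scheduling and cluster resource management.

3. Hadoop HDFS: a distributed file system that gives applications high-throughput access to their data.

4. Hadoop MapReduce: an offline (batch) computation engine for parallel processing of large data sets.

Features:

Hadoop's strengths are high reliability, high scalability, high efficiency and high fault tolerance. After more than ten years of development, Hadoop is still recognized by the industry and holds an important position in the market.

Hadoop has an important place in the big data technology stack. Anyone learning big data needs to learn Hadoop and get a solid grasp of its core framework.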

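⑵ Does using the HDFS Java API require configuring a Hadoop environment

Yes, it does.
You need to place several configuration files in your project's resources directory:
hdfs-site.xml yarn-site.xml core-site.xml
Depending on your situation, if you run MapReduce programs you also need mapred-site.xml. The exact parameters in these files depend on how your Hadoop cluster is configured; see the official configuration documentation.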
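
Hadoop looks these files up on the classpath, which is why they go under resources. A quick way to confirm that they are actually visible to the program is a plain-JDK check like the sketch below. The class and method names (`ConfigOnClasspath`, `missing`) are made up for illustration; the only real API used is `ClassLoader.getResource`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports which config files are not visible on the classpath. */
final class ConfigOnClasspath {

    /** Returns the names that the given class loader cannot find. */
    static List<String> missing(ClassLoader loader, String... names) {
        List<String> notFound = new ArrayList<>();
        for (String name : names) {
            if (loader.getResource(name) == null) {
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        // File names taken from the answer above.
        List<String> notFound = missing(
                ConfigOnClasspath.class.getClassLoader(),
                "core-site.xml", "hdfs-site.xml", "yarn-site.xml");
        System.out.println("Missing from classpath: " + notFound);
    }
}
```

An empty list means all three files were found; anything listed is probably not in the resources directory or was not copied into the build output.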

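⑶ How does Java connect to the HDFS file system, and which packages are needed

The Apache Hadoop project provides an API for working with files in HDFS from a Java project. It covers opening, reading, writing and deleting files, as well as creating and deleting directories and listing all files in a directory.
1. Download Hadoop from http://hadoop.apache.org/releases.html, unpack it, and add all of its jars to the project's lib directory.
2. Processing steps in the program: 1) obtain a Configuration object, 2) obtain a FileSystem object, 3) perform the file operations. A simple example: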
package org.jrs.wlh;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * PutMeger.java
 * Uploads data from the local file system to HDFS using the Java API.
 */
public class PutMeger {

    public static void main(String[] args) throws IOException {

        String[] str = new String[]{"E:\\hadoop\\UploadFileClient.java", "hdfs://master:9000/user/hadoop/inccnt.java"};
        Configuration conf = new Configuration();
        // Resolve the file system from the hdfs:// URI so the target does not depend on fs.defaultFS
        FileSystem fileS = FileSystem.get(URI.create(str[1]), conf);
        FileSystem localFile = FileSystem.getLocal(conf); // a FileSystem object for the local file system

        Path input = new Path(str[0]); // local input path
        Path out = new Path(str[1]); // output path on HDFS

        // try-with-resources closes the output stream so that buffered data reaches HDFS
        try (FSDataOutputStream outStream = fileS.create(out)) {
            FileStatus[] inputFile = localFile.listStatus(input); // list of files under the input path
            for (int i = 0; i < inputFile.length; i++) {
                System.out.println(inputFile[i].getPath().getName());
                try (FSDataInputStream in = localFile.open(inputFile[i].getPath())) {
                    byte[] buffer = new byte[1024];
                    int bytesRead;
                    while ((bytesRead = in.read(buffer)) > 0) { // read the data in chunks of bytes
                        outStream.write(buffer, 0, bytesRead);
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

}
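
The read/write loop in the example above is an ordinary java.io copy loop, since FSDataInputStream and FSDataOutputStream are subclasses of the JDK stream classes. The sketch below isolates that loop using in-memory streams so it can be run without a cluster. The class name `CopyLoopDemo` and the buffer size of 4 are arbitrary choices for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo of the buffer copy loop, using in-memory streams instead of HDFS. */
final class CopyLoopDemo {

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int bytesRead;
        // read() returns -1 at end of stream.
        while ((bytesRead = in.read(buffer)) != -1) {
            // Write only the bytes just read; the rest of the buffer may hold stale data.
            out.write(buffer, 0, bytesRead);
            total += bytesRead;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello,hadoop!".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink, 4);
        System.out.println(copied + " bytes copied: "
                + new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Passing `bytesRead` to `write` matters on the last chunk, which is usually shorter than the buffer.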

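⑷ What are the three ways to write data to a file on the HDFS file system with the Java API? Please describe all three

Summary: 1. Set up the environment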

⑸ hadoop2.6.0 hdfs error: could not create the Java Virtual Machine

You need to set JAVA_HOME as a system environment variable.
Also, have you set your PATH?

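⑹ I am tinkering with Hadoop and wrote a Java program to connect to HDFS. Running it shows the following; what is the problem? Please help!

Your Hadoop is version 2.x, but it is still configured the 1.x way. Either update the configuration or go back to the earlier Hadoop version.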

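⑺ Do you need a Java background to learn Hadoop

Author: markxiao
Link: https://www.hu.com/question/34185054/answer/149007333
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, credit the source.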

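In industry, Hadoop mostly runs on Linux, and Hadoop itself is implemented in Java, so it is best to be comfortable programming on Linux. As for Java, being able to read it is enough to start with, so that you can look at the source code when you run into problems.
If you do not know Java, you can write MapReduce programs with the open-source Streaming tool, where you only have to deal with stdin and stdout. Baidu previously developed the bistreaming and HCE components, which let you write jobs in C/C++; I am not sure whether they have been open-sourced.
If you want to customize things such as an InputFormat or OutputFormat, or call the HDFS or YARN Java interfaces, then knowing Java becomes necessary.
As for a Hadoop learning path, what follows is based on my own route and may not suit you, so treat it as a reference only. I do not know the details of YARN and HDFS deeply, so it leans toward MapReduce.
(1) Practice: once you understand the basic MapReduce principles, write some MapReduce programs modeled on the demos, then look at the job monitoring page and learn what its metrics mean; it is a great help when analyzing jobs. Process some larger data sets, then analyze the metrics afterward and ask: is there still room to optimize this job? Which part of the job takes the most time? If it fails, can you locate the error from the logs? At this stage you may run into all kinds of problems, such as how Streaming handles binary data, or many small files causing poor performance.
(2) Theory: after some time practicing, you should be familiar with the ideas behind MapReduce. Now you can study how MapReduce runs. What does job submission look like? What steps does the map output phase go through? How does the shuffle work? With large data volumes, how does the reduce phase guarantee that records with the same key end up together?
(3) Reading the source: if you are proficient at using and tuning MapReduce and are interested in the source, you can start reading it. At the upper layer there are MapReduce and Streaming; at a more basic level you can read the HDFS and YARN implementations; at the bottom layer you can read Hadoop's RPC implementation.
Finally, these three stages are only a rough division based on my own experience, and they can be interleaved freely.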
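
To make the point about Streaming concrete, here is a minimal word-count mapper that only reads stdin and writes stdout. The answer recommends Streaming for people who do not use Java; the sketch is in Java only to match the rest of this page, and the same stdin/stdout contract applies in any language. The class name `StreamingWordMapper` is hypothetical, and the tab-separated `word<TAB>1` output assumes Streaming's default of treating the text before the first tab as the key.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

/** Hypothetical Streaming-style mapper: one "word<TAB>1" record per input word. */
final class StreamingWordMapper {

    /** Reads lines from in and writes one key/value record per word to out. */
    static void map(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.write(word + "\t1\n");
                }
            }
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // The mapper's whole interface to the framework is stdin and stdout.
        map(new InputStreamReader(System.in, StandardCharsets.UTF_8),
            new OutputStreamWriter(System.out, StandardCharsets.UTF_8));
    }
}
```

Because the logic lives in `map(Reader, Writer)`, it can be tried locally by piping a text file into the program before submitting it to a cluster.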

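⑻ A Java program cannot access files under HDFS and reports a missing-block exception; could an expert please help

Let us take a closer look at Hadoop's FileSystem class. This class is used to interact with Hadoop's file systems. Although we are mainly targeting HDFS here, our code should still use only the abstract class FileSystem, so that it can work with any Hadoop file system. When writing test code we can test against the local file system and use HDFS in deployment; only the configuration changes, not the code.
In versions after Hadoop 1.x a new file system interface called FileContext was introduced. A single FileContext instance can handle multiple file systems, and its interface is cleaner and more uniform.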

⑼ How to read and write HDFS with the Java API

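Reading and writing HDFS with the Java API

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.Date;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;

public class FSOptr {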

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        makeDir(conf);
        rename(conf);
        delete(conf);
    }

    // Create a directory
    private static void makeDir(Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path dir = new Path("/user/hadoop/data/20140318");
        boolean result = fs.mkdirs(dir); // create the directory
        System.out.println("make dir :" + result);

        // Create a file and write content to it
        Path dst = new Path("/user/hadoop/data/20140318/tmp");
        byte[] buff = "hello,hadoop!".getBytes();
        FSDataOutputStream outputStream = fs.create(dst);
        outputStream.write(buff, 0, buff.length);
        outputStream.close();
        FileStatus[] files = fs.listStatus(dst);
        for (FileStatus file : files) {
            System.out.println(file.getPath());
        }
        fs.close();
    }

    // Rename a file
    private static void rename(Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path oldName = new Path("/user/hadoop/data/20140318/1.txt");
        Path newName = new Path("/user/hadoop/data/20140318/2.txt");
        // rename returns false when it fails, for example if the source does not exist
        boolean renamed = fs.rename(oldName, newName);
        System.out.println("rename :" + renamed);

        FileStatus[] files = fs.listStatus(new Path("/user/hadoop/data/20140318"));
        for (FileStatus file : files) {
            System.out.println(file.getPath());
        }
        fs.close();
    }

    // Delete files
    @SuppressWarnings("deprecation")
    private static void delete(Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/user/hadoop/data/20140318");
        if (fs.isDirectory(path)) {
            FileStatus[] files = fs.listStatus(path);
            for (FileStatus file : files) {
                // the deprecated delete(Path) was equivalent to a recursive delete
                fs.delete(file.getPath(), true);
            }
        } else {
            fs.delete(path, true);
        }

        // Alternatively, delete the path and everything under it in one call
        fs.delete(path, true);

        fs.close();
    }

    /**
     * Download: copies a file from HDFS to the local disk.
     *
     * @param hdfsSrc1
     *            address of the file stored in HDFS
     * @param localSrc1
     *            local file address, i.e. the path of the local file
     */
    public boolean sendFromHdfs(String hdfsSrc1, String localSrc1) {

        Configuration conf = new Configuration();
        FileSystem fs = null;
        try {
            fs = FileSystem.get(URI.create(hdfsSrc1), conf);
            Path hdfs_path = new Path(hdfsSrc1);
            Path local_path = new Path(localSrc1);

            fs.copyToLocalFile(hdfs_path, local_path);

            return true;
        } catch (IOException e) {
            e.printStackTrace();
        }
        return false;
    }

    /**
     * Upload: copies a local file into HDFS.
     *
     * @param localSrc
     *            local file address, i.e. the path of the local file
     * @param hdfsSrc
     *            address of the file stored in HDFS
     */
    public boolean sendToHdfs1(String localSrc, String hdfsSrc) {
        InputStream in;
        try {
            in = new BufferedInputStream(new FileInputStream(localSrc));
            Configuration conf = new Configuration(); // obtain the configuration object
            FileSystem fs; // the file system
            try {
                fs = FileSystem.get(URI.create(hdfsSrc), conf);
                // Create an output stream
                OutputStream out = fs.create(new Path(hdfsSrc),
                        new Progressable() {
                            // override the progress method
                            public void progress() {
                                // System.out.println("Uploaded one buffer's worth of data!");
                            }
                        });
                // Connect the two streams so that the input stream feeds the output stream.
                // in is the input stream, out is the output stream, 10240 is the buffer size,
                // and true closes both streams when the copy finishes.
                IOUtils.copyBytes(in, out, 10240, true);
                return true;
            } catch (IOException e) {
                e.printStackTrace();
            }

        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
        return false;
    }

    /**
     * Move a file.
     *
     * @param old_st the path where the file is currently stored
     * @param new_st the directory to move it to
     */
    public boolean moveFileName(String old_st, String new_st) {

        try {
            // Download to the server's local disk
            boolean down_flag = sendFromHdfs(old_st, "/home/hadoop/文檔/temp");

            // Upload from the server's local disk to the new path
            new_st = new_st + old_st.substring(old_st.lastIndexOf("/"));
            boolean uplod_flag = sendToHdfs1("/home/hadoop/文檔/temp", new_st);

            if (down_flag && uplod_flag) {
                // Delete the source file only after both copies have succeeded,
                // so a failed transfer does not lose the data
                Configuration conf = new Configuration();
                FileSystem fs = FileSystem.get(URI.create(old_st), conf);
                fs.delete(new Path(old_st), true);
                return true;
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return false;
    }

    // Copy a local file to HDFS
    private static void CopyFromLocalFile(Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path src = new Path("/home/hadoop/word.txt");
        Path dst = new Path("/user/hadoop/data/");
        fs.copyFromLocalFile(src, dst);
        fs.close();
    }

    // List all subdirectories and files under the given directory
    private static void getAllChildFile(Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/user/hadoop");
        getFile(path, fs);
    }

    private static void getFile(Path path, FileSystem fs) throws Exception {
        FileStatus[] fileStatus = fs.listStatus(path);
        for (int i = 0; i < fileStatus.length; i++) {
            if (fileStatus[i].isDirectory()) {
                // recurse into subdirectories
                getFile(fileStatus[i].getPath(), fs);
            } else {
                System.out.println(fileStatus[i].getPath().toString());
            }
        }
    }

    // Check whether a file exists
    private static boolean isExist(Configuration conf, String path) throws Exception {
        FileSystem fileSystem = FileSystem.get(conf);
        return fileSystem.exists(new Path(path));
    }

    // Get information about all host nodes in the HDFS cluster
    private static void getAllClusterNodeInfo(Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        // the cast only works when the configured file system is HDFS
        DistributedFileSystem hdfs = (DistributedFileSystem) fs;
        DatanodeInfo[] dataNodeStats = hdfs.getDataNodeStats();
        String[] names = new String[dataNodeStats.length];
        System.out.println("list of all the nodes in HDFS cluster:"); // print info

        for (int i = 0; i < dataNodeStats.length; i++) {
            names[i] = dataNodeStats[i].getHostName();
            System.out.println(names[i]); // print info
        }
    }

    // get the locations of a file in HDFS
    private static void getFileLocation(Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path f = new Path("/user/cluster/dfs.txt");
        FileStatus filestatus = fs.getFileStatus(f);
        BlockLocation[] blkLocations = fs.getFileBlockLocations(filestatus, 0, filestatus.getLen());
        int blkCount = blkLocations.length;
        for (int i = 0; i < blkCount; i++) {
            String[] hosts = blkLocations[i].getHosts();
            // print the host names; printing the array itself would only show its reference
            System.out.println(String.join(", ", hosts));
        }
    }

    // get HDFS file last modification time
    private static void getModificationTime(Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path f = new Path("/user/cluster/dfs.txt");
        FileStatus filestatus = fs.getFileStatus(f);

        long modificationTime = filestatus.getModificationTime(); // measured in milliseconds since the epoch

        Date d = new Date(modificationTime);
        System.out.println(d);
    }

}
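
Two small pieces of the class above are plain JDK logic and can be tried without a cluster: the destination path that moveFileName derives from the source path, and the conversion of getModificationTime's epoch milliseconds into a readable timestamp. The sketch below isolates both. The class name `HdfsPathAndTimeHelpers`, the method names, and the `/user/hadoop/archive` directory are hypothetical, and `java.time.Instant` is swapped in for the `Date` used above to get a time-zone-independent string.

```java
import java.time.Instant;

/** Hypothetical helpers mirroring two bits of string and time handling from FSOptr. */
final class HdfsPathAndTimeHelpers {

    /**
     * Appends the last component of source (including its leading "/") to targetDir.
     * Assumes source contains at least one "/"; otherwise substring throws.
     */
    static String targetPath(String source, String targetDir) {
        return targetDir + source.substring(source.lastIndexOf('/'));
    }

    /** Formats milliseconds since the epoch as an ISO-8601 UTC timestamp. */
    static String formatMillis(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        System.out.println(targetPath("/user/hadoop/data/20140318/tmp", "/user/hadoop/archive"));
        System.out.println(formatMillis(0L));
    }
}
```

When source and destination are on the same HDFS instance, `FileSystem.rename(Path, Path)` can move a file directly, which avoids the download and re-upload through local disk.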
