OP: Lisrelchen

Apache HBase™ Reference Guide



Attachment: Apache HBase Reference Guide.pdf (13.9 MB)




#2 · Lisrelchen · posted 2017-5-29 00:29:13
Example 40. Create, modify and delete a Table Using Java

package com.example.hbase.admin;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.io.compress.Compression.Algorithm;

public class Example {

  private static final String TABLE_NAME = "MY_TABLE_NAME_TOO";
  private static final String CF_DEFAULT = "DEFAULT_COLUMN_FAMILY";

  public static void createOrOverwrite(Admin admin, HTableDescriptor table) throws IOException {
    if (admin.tableExists(table.getTableName())) {
      admin.disableTable(table.getTableName());
      admin.deleteTable(table.getTableName());
    }
    admin.createTable(table);
  }

  public static void createSchemaTables(Configuration config) throws IOException {
    try (Connection connection = ConnectionFactory.createConnection(config);
         Admin admin = connection.getAdmin()) {

      HTableDescriptor table = new HTableDescriptor(TableName.valueOf(TABLE_NAME));
      table.addFamily(new HColumnDescriptor(CF_DEFAULT).setCompressionType(Algorithm.NONE));

      System.out.print("Creating table. ");
      createOrOverwrite(admin, table);
      System.out.println(" Done.");
    }
  }

  public static void modifySchema(Configuration config) throws IOException {
    try (Connection connection = ConnectionFactory.createConnection(config);
         Admin admin = connection.getAdmin()) {

      TableName tableName = TableName.valueOf(TABLE_NAME);
      if (!admin.tableExists(tableName)) {
        System.out.println("Table does not exist.");
        System.exit(-1);
      }

      HTableDescriptor table = admin.getTableDescriptor(tableName);

      // Add a new column family to the existing table
      HColumnDescriptor newColumn = new HColumnDescriptor("NEWCF");
      newColumn.setCompactionCompressionType(Algorithm.GZ);
      newColumn.setMaxVersions(HConstants.ALL_VERSIONS);
      admin.addColumn(tableName, newColumn);

      // Update an existing column family
      HColumnDescriptor existingColumn = new HColumnDescriptor(CF_DEFAULT);
      existingColumn.setCompactionCompressionType(Algorithm.GZ);
      existingColumn.setMaxVersions(HConstants.ALL_VERSIONS);
      table.modifyFamily(existingColumn);
      admin.modifyTable(tableName, table);

      // Disable the table
      admin.disableTable(tableName);

      // Delete an existing column family
      admin.deleteColumn(tableName, CF_DEFAULT.getBytes("UTF-8"));

      // Delete the table (it must be disabled first)
      admin.deleteTable(tableName);
    }
  }

  public static void main(String... args) throws IOException {
    Configuration config = HBaseConfiguration.create();

    // Add any necessary configuration files (hbase-site.xml, core-site.xml)
    config.addResource(new Path(System.getenv("HBASE_CONF_DIR"), "hbase-site.xml"));
    config.addResource(new Path(System.getenv("HADOOP_CONF_DIR"), "core-site.xml"));
    createSchemaTables(config);
    modifySchema(config);
  }
}

#3 · Lisrelchen · posted 2017-5-29 00:30:46
Example 46. HBaseContext Usage Example

This example shows how HBaseContext can be used to do a foreachPartition on an RDD in Scala:

val sc = new SparkContext("local", "test")
val config = new HBaseConfiguration()

...

val hbaseContext = new HBaseContext(sc, config)

rdd.hbaseForeachPartition(hbaseContext, (it, conn) => {
  val bufferedMutator = conn.getBufferedMutator(TableName.valueOf("t1"))
  it.foreach((putRecord) => {
    val put = new Put(putRecord._1)
    putRecord._2.foreach((putValue) => put.addColumn(putValue._1, putValue._2, putValue._3))
    bufferedMutator.mutate(put)
  })
  // Flush and close inside the partition closure so that all buffered
  // Puts are sent before the pooled connection is released
  bufferedMutator.flush()
  bufferedMutator.close()
})

#4 · Lisrelchen · posted 2017-5-29 00:32:52
Example 47. bulkPut Example with DStreams

Below is an example of bulkPut with DStreams. It is very close in feel to the RDD bulk put; a sketch of the RDD form follows the example.

val sc = new SparkContext("local", "test")
val config = new HBaseConfiguration()

val hbaseContext = new HBaseContext(sc, config)
val ssc = new StreamingContext(sc, Milliseconds(200))

val rdd1 = ...
val rdd2 = ...

val queue = mutable.Queue[RDD[(Array[Byte], Array[(Array[Byte],
    Array[Byte], Array[Byte])])]]()

queue += rdd1
queue += rdd2

val dStream = ssc.queueStream(queue)

dStream.hbaseBulkPut(
  hbaseContext,
  TableName.valueOf(tableName),
  (putRecord) => {
    val put = new Put(putRecord._1)
    putRecord._2.foreach((putValue) => put.addColumn(putValue._1, putValue._2, putValue._3))
    put
  })
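For comparison, here is a minimal sketch of the RDD form of bulkPut that the DStream version above mirrors, assuming the hbase-spark implicit RDD functions are in scope (e.g. import org.apache.hadoop.hbase.spark.HBaseRDDFunctions._) and reusing rdd1, hbaseContext, and tableName from the example:

rdd1.hbaseBulkPut(
  hbaseContext,
  TableName.valueOf(tableName),
  (putRecord) => {
    // Same record-to-Put conversion as in the DStream version
    val put = new Put(putRecord._1)
    putRecord._2.foreach((putValue) => put.addColumn(putValue._1, putValue._2, putValue._3))
    put
  })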

#5 · Lisrelchen · posted 2017-5-29 00:49:26
Example 48. Bulk Loading Example

The following example shows bulk loading with Spark.

val sc = new SparkContext("local", "test")
val config = new HBaseConfiguration()

val hbaseContext = new HBaseContext(sc, config)

val stagingFolder = ...
val rdd = sc.parallelize(Array(
      (Bytes.toBytes("1"),
        (Bytes.toBytes(columnFamily1), Bytes.toBytes("a"), Bytes.toBytes("foo1"))),
      (Bytes.toBytes("3"),
        (Bytes.toBytes(columnFamily1), Bytes.toBytes("b"), Bytes.toBytes("foo2.b"))), ...

// Map each record to a (KeyFamilyQualifier, value) pair and write HFiles
// into the staging folder
rdd.hbaseBulkLoad(hbaseContext,
  TableName.valueOf(tableName),
  t => {
    val rowKey = t._1
    val family: Array[Byte] = t._2(0)._1
    val qualifier = t._2(0)._2
    val value = t._2(0)._3

    val keyFamilyQualifier = new KeyFamilyQualifier(rowKey, family, qualifier)

    Seq((keyFamilyQualifier, value)).iterator
  },
  stagingFolder.getPath)

// Complete the load by moving the generated HFiles into the table's regions
val load = new LoadIncrementalHFiles(config)
load.doBulkLoad(new Path(stagingFolder.getPath),
  conn.getAdmin, table, conn.getRegionLocator(TableName.valueOf(tableName)))
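The final doBulkLoad step uses conn and table handles that the example does not define. A minimal sketch of the assumed setup, reusing config and tableName from above:

// Assumed setup (not shown in the original example): an open Connection
// and the target Table handle for the final doBulkLoad step
val conn = ConnectionFactory.createConnection(config)
val table = conn.getTable(TableName.valueOf(tableName))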

#6 · Lisrelchen · posted 2017-5-29 00:51:02
Example 49. Using Additional Parameters

val sc = new SparkContext("local", "test")
val config = new HBaseConfiguration()

val hbaseContext = new HBaseContext(sc, config)

val stagingFolder = ...
val rdd = sc.parallelize(Array(
      (Bytes.toBytes("1"),
        (Bytes.toBytes(columnFamily1), Bytes.toBytes("a"), Bytes.toBytes("foo1"))),
      (Bytes.toBytes("3"),
        (Bytes.toBytes(columnFamily1), Bytes.toBytes("b"), Bytes.toBytes("foo2.b"))), ...

// Per-family HFile write options: compression, bloom filter type,
// block size, and data block encoding
val familyHBaseWriterOptions = new java.util.HashMap[Array[Byte], FamilyHFileWriteOptions]
val f1Options = new FamilyHFileWriteOptions("GZ", "ROW", 128, "PREFIX")

familyHBaseWriterOptions.put(Bytes.toBytes("columnFamily1"), f1Options)

rdd.hbaseBulkLoad(hbaseContext,
  TableName.valueOf(tableName),
  t => {
    val rowKey = t._1
    val family: Array[Byte] = t._2(0)._1
    val qualifier = t._2(0)._2
    val value = t._2(0)._3

    val keyFamilyQualifier = new KeyFamilyQualifier(rowKey, family, qualifier)

    Seq((keyFamilyQualifier, value)).iterator
  },
  stagingFolder.getPath,
  familyHBaseWriterOptions,
  compactionExclude = false,
  HConstants.DEFAULT_MAX_FILE_SIZE)

val load = new LoadIncrementalHFiles(config)
load.doBulkLoad(new Path(stagingFolder.getPath),
  conn.getAdmin, table, conn.getRegionLocator(TableName.valueOf(tableName)))

#7 · Lisrelchen · posted 2017-5-29 00:52:21
Example 50. Using thin record bulk load

val sc = new SparkContext("local", "test")
val config = new HBaseConfiguration()

val hbaseContext = new HBaseContext(sc, config)

val stagingFolder = ...
val rdd = sc.parallelize(Array(
      ("1",
        (Bytes.toBytes(columnFamily1), Bytes.toBytes("a"), Bytes.toBytes("foo1"))),
      ("3",
        (Bytes.toBytes(columnFamily1), Bytes.toBytes("b"), Bytes.toBytes("foo2.b"))), ...

rdd.hbaseBulkLoadThinRows(hbaseContext,
      TableName.valueOf(tableName),
      t => {
        val rowKey = t._1

        // Collect all (family, qualifier, value) cells of the row
        // into a single FamiliesQualifiersValues object
        val familyQualifiersValues = new FamiliesQualifiersValues
        t._2.foreach(f => {
          val family: Array[Byte] = f._1
          val qualifier = f._2
          val value: Array[Byte] = f._3

          familyQualifiersValues += (family, qualifier, value)
        })
        (new ByteArrayWrapper(Bytes.toBytes(rowKey)), familyQualifiersValues)
      },
      stagingFolder.getPath,
      new java.util.HashMap[Array[Byte], FamilyHFileWriteOptions],
      compactionExclude = false,
      20)

val load = new LoadIncrementalHFiles(config)
load.doBulkLoad(new Path(stagingFolder.getPath),
  conn.getAdmin, table, conn.getRegionLocator(TableName.valueOf(tableName)))

#8 · Lisrelchen · posted 2017-5-29 01:33:11
Example 11. Examples

# Create a namespace
create_namespace 'my_ns'

# Create my_table in the my_ns namespace
create 'my_ns:my_table', 'fam'

# Drop the namespace
drop_namespace 'my_ns'

# Alter the namespace
alter_namespace 'my_ns', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}

#9 · Lisrelchen · posted 2017-5-29 01:33:49
Example 12. Examples

# namespace=foo, table qualifier=bar
create 'foo:bar', 'fam'

# namespace=default, table qualifier=bar
create 'bar', 'fam'

#10 · Lisrelchen · posted 2017-5-29 01:34:41
Example 13. Modify the Maximum Number of Versions for a Column Family

This example uses HBase Shell to keep a maximum of 5 versions of all columns in column family f1. You could also use HColumnDescriptor; see the sketch below.

hbase> alter 't1', NAME => 'f1', VERSIONS => 5
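A minimal sketch of the same change through the client API, using only the HTableDescriptor/HColumnDescriptor calls already shown in Example 40 (Scala syntax; the connection setup is assumed):

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.util.Bytes

val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
val admin = connection.getAdmin
val tableName = TableName.valueOf("t1")

// Fetch the current descriptor, raise the maximum versions on family f1,
// and push the modified schema back to the cluster
val descriptor = admin.getTableDescriptor(tableName)
descriptor.getFamily(Bytes.toBytes("f1")).setMaxVersions(5)
admin.modifyTable(tableName, descriptor)

admin.close()
connection.close()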
