Thread starter: Lisrelchen

Apache HBase ™ Reference Guide


#1 (OP)
Lisrelchen posted on 2017-5-29 00:27:25

Hidden content of this post:

Apache HBase Reference Guide.pdf (13.9 MB)






#2
Lisrelchen posted on 2017-5-29 00:29:13
Example 40. Create, modify and delete a Table Using Java

package com.example.hbase.admin;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.io.compress.Compression.Algorithm;

public class Example {

  private static final String TABLE_NAME = "MY_TABLE_NAME_TOO";
  private static final String CF_DEFAULT = "DEFAULT_COLUMN_FAMILY";

  public static void createOrOverwrite(Admin admin, HTableDescriptor table) throws IOException {
    if (admin.tableExists(table.getTableName())) {
      admin.disableTable(table.getTableName());
      admin.deleteTable(table.getTableName());
    }
    admin.createTable(table);
  }

  public static void createSchemaTables(Configuration config) throws IOException {
    try (Connection connection = ConnectionFactory.createConnection(config);
         Admin admin = connection.getAdmin()) {

      HTableDescriptor table = new HTableDescriptor(TableName.valueOf(TABLE_NAME));
      table.addFamily(new HColumnDescriptor(CF_DEFAULT).setCompressionType(Algorithm.NONE));

      System.out.print("Creating table. ");
      createOrOverwrite(admin, table);
      System.out.println(" Done.");
    }
  }

  public static void modifySchema(Configuration config) throws IOException {
    try (Connection connection = ConnectionFactory.createConnection(config);
         Admin admin = connection.getAdmin()) {

      TableName tableName = TableName.valueOf(TABLE_NAME);
      if (!admin.tableExists(tableName)) {
        System.out.println("Table does not exist.");
        System.exit(-1);
      }

      HTableDescriptor table = admin.getTableDescriptor(tableName);

      // Update existing table
      HColumnDescriptor newColumn = new HColumnDescriptor("NEWCF");
      newColumn.setCompactionCompressionType(Algorithm.GZ);
      newColumn.setMaxVersions(HConstants.ALL_VERSIONS);
      admin.addColumn(tableName, newColumn);

      // Update existing column family
      HColumnDescriptor existingColumn = new HColumnDescriptor(CF_DEFAULT);
      existingColumn.setCompactionCompressionType(Algorithm.GZ);
      existingColumn.setMaxVersions(HConstants.ALL_VERSIONS);
      table.modifyFamily(existingColumn);
      admin.modifyTable(tableName, table);

      // Disable an existing table
      admin.disableTable(tableName);

      // Delete an existing column family
      admin.deleteColumn(tableName, CF_DEFAULT.getBytes("UTF-8"));

      // Delete a table (needs to be disabled first)
      admin.deleteTable(tableName);
    }
  }

  public static void main(String... args) throws IOException {
    Configuration config = HBaseConfiguration.create();

    // Add any necessary configuration files (hbase-site.xml, core-site.xml)
    config.addResource(new Path(System.getenv("HBASE_CONF_DIR"), "hbase-site.xml"));
    config.addResource(new Path(System.getenv("HADOOP_CONF_DIR"), "core-site.xml"));
    createSchemaTables(config);
    modifySchema(config);
  }
}
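
Worth noting about the example above: createOrOverwrite disables and deletes the table if it already exists before recreating it, so the program can be rerun safely but will wipe any data already stored in MY_TABLE_NAME_TOO; modifySchema likewise finishes by disabling and deleting the table again in order to demonstrate the delete calls.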

#3
Lisrelchen posted on 2017-5-29 00:30:46
Example 46. HBaseContext Usage Example

This example shows how HBaseContext can be used to do a foreachPartition on an RDD in Scala:

val sc = new SparkContext("local", "test")
val config = new HBaseConfiguration()

...

val hbaseContext = new HBaseContext(sc, config)

rdd.hbaseForeachPartition(hbaseContext, (it, conn) => {
  val bufferedMutator = conn.getBufferedMutator(TableName.valueOf("t1"))
  it.foreach((putRecord) => {
    val put = new Put(putRecord._1)
    putRecord._2.foreach((putValue) => put.addColumn(putValue._1, putValue._2, putValue._3))
    bufferedMutator.mutate(put)
  })
  bufferedMutator.flush()
  bufferedMutator.close()
})
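
The rdd written by hbaseForeachPartition above is elided ("..."). A hypothetical definition with the expected shape, a row key plus an array of (family, qualifier, value) cells, could look like the sketch below; the "cf" family and the sample values are made up and would have to exist in table t1:

// hypothetical sample data: (rowKey, Array((family, qualifier, value)))
val rdd = sc.parallelize(Array(
  (Bytes.toBytes("1"),
    Array((Bytes.toBytes("cf"), Bytes.toBytes("a"), Bytes.toBytes("value1")))),
  (Bytes.toBytes("2"),
    Array((Bytes.toBytes("cf"), Bytes.toBytes("a"), Bytes.toBytes("value2"))))))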

#4
Lisrelchen posted on 2017-5-29 00:32:52
Example 47. bulkPut Example with DStreams

Below is an example of bulkPut with DStreams. It is very close in feel to the RDD bulk put.

val sc = new SparkContext("local", "test")
val config = new HBaseConfiguration()

val hbaseContext = new HBaseContext(sc, config)
val ssc = new StreamingContext(sc, Milliseconds(200))

val rdd1 = ...
val rdd2 = ...

val queue = mutable.Queue[RDD[(Array[Byte], Array[(Array[Byte], Array[Byte], Array[Byte])])]]()

queue += rdd1
queue += rdd2

val dStream = ssc.queueStream(queue)

dStream.hbaseBulkPut(
  hbaseContext,
  TableName.valueOf(tableName),
  (putRecord) => {
    val put = new Put(putRecord._1)
    putRecord._2.foreach((putValue) => put.addColumn(putValue._1, putValue._2, putValue._3))
    put
  })
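
For comparison, this is what the plain RDD bulk put mentioned above looks like; a minimal sketch, assuming the same sc, config and hbaseContext, a tableName string, and an rdd of (rowKey, Array[(family, qualifier, value)]) tuples as in Example 46:

rdd.hbaseBulkPut(hbaseContext,
  TableName.valueOf(tableName),
  (putRecord) => {
    // build one Put per record, exactly as in the DStream version above
    val put = new Put(putRecord._1)
    putRecord._2.foreach((putValue) => put.addColumn(putValue._1, putValue._2, putValue._3))
    put
  })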

#5
Lisrelchen posted on 2017-5-29 00:49:26
Example 48. Bulk Loading Example

The following example shows bulk loading with Spark.

val sc = new SparkContext("local", "test")
val config = new HBaseConfiguration()

val hbaseContext = new HBaseContext(sc, config)

val stagingFolder = ...
val rdd = sc.parallelize(Array(
  (Bytes.toBytes("1"),
    (Bytes.toBytes(columnFamily1), Bytes.toBytes("a"), Bytes.toBytes("foo1"))),
  (Bytes.toBytes("3"),
    (Bytes.toBytes(columnFamily1), Bytes.toBytes("b"), Bytes.toBytes("foo2.b"))), ...

rdd.hbaseBulkLoad(hbaseContext,
  TableName.valueOf(tableName),
  t => {
    val rowKey = t._1
    val family: Array[Byte] = t._2(0)._1
    val qualifier = t._2(0)._2
    val value = t._2(0)._3

    val keyFamilyQualifier = new KeyFamilyQualifier(rowKey, family, qualifier)

    Seq((keyFamilyQualifier, value)).iterator
  },
  stagingFolder.getPath)

val load = new LoadIncrementalHFiles(config)
load.doBulkLoad(new Path(stagingFolder.getPath),
  conn.getAdmin, table, conn.getRegionLocator(TableName.valueOf(tableName)))
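
tableName, columnFamily1, stagingFolder, conn and table are left undefined in the excerpt above. Purely as an illustration of what the final doBulkLoad step expects (all names and paths below are hypothetical, not from the guide), they could be prepared along these lines:

val tableName = "t1"                                        // hypothetical target table
val columnFamily1 = "cf"                                    // hypothetical column family used in the sample RDD
val stagingFolder = new java.io.File("/tmp/hbase-staging")  // temporary directory the HFiles are written to
val conn = ConnectionFactory.createConnection(config)       // connection used for Admin, Table and RegionLocator
val table = conn.getTable(TableName.valueOf(tableName))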

#6
Lisrelchen posted on 2017-5-29 00:51:02
Example 49. Using Additional Parameters

val sc = new SparkContext("local", "test")
val config = new HBaseConfiguration()

val hbaseContext = new HBaseContext(sc, config)

val stagingFolder = ...
val rdd = sc.parallelize(Array(
  (Bytes.toBytes("1"),
    (Bytes.toBytes(columnFamily1), Bytes.toBytes("a"), Bytes.toBytes("foo1"))),
  (Bytes.toBytes("3"),
    (Bytes.toBytes(columnFamily1), Bytes.toBytes("b"), Bytes.toBytes("foo2.b"))), ...

val familyHBaseWriterOptions = new java.util.HashMap[Array[Byte], FamilyHFileWriteOptions]
val f1Options = new FamilyHFileWriteOptions("GZ", "ROW", 128, "PREFIX")

familyHBaseWriterOptions.put(Bytes.toBytes("columnFamily1"), f1Options)

rdd.hbaseBulkLoad(hbaseContext,
  TableName.valueOf(tableName),
  t => {
    val rowKey = t._1
    val family: Array[Byte] = t._2(0)._1
    val qualifier = t._2(0)._2
    val value = t._2(0)._3

    val keyFamilyQualifier = new KeyFamilyQualifier(rowKey, family, qualifier)

    Seq((keyFamilyQualifier, value)).iterator
  },
  stagingFolder.getPath,
  familyHBaseWriterOptions,
  compactionExclude = false,
  HConstants.DEFAULT_MAX_FILE_SIZE)

val load = new LoadIncrementalHFiles(config)
load.doBulkLoad(new Path(stagingFolder.getPath),
  conn.getAdmin, table, conn.getRegionLocator(TableName.valueOf(tableName)))
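
The per-family options above are positional; read against the FamilyHFileWriteOptions constructor they appear to map to compression, Bloom filter type, block size and data block encoding. An annotated version of the same line, as an assumption worth checking against the hbase-spark version in use:

val f1Options = new FamilyHFileWriteOptions(
  "GZ",      // compression codec for this family's HFiles
  "ROW",     // Bloom filter type
  128,       // HFile block size
  "PREFIX")  // data block encoding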

#7
Lisrelchen posted on 2017-5-29 00:52:21
Example 50. Using thin record bulk load

val sc = new SparkContext("local", "test")
val config = new HBaseConfiguration()

val hbaseContext = new HBaseContext(sc, config)

val stagingFolder = ...
val rdd = sc.parallelize(Array(
  ("1",
    (Bytes.toBytes(columnFamily1), Bytes.toBytes("a"), Bytes.toBytes("foo1"))),
  ("3",
    (Bytes.toBytes(columnFamily1), Bytes.toBytes("b"), Bytes.toBytes("foo2.b"))), ...

rdd.hbaseBulkLoadThinRows(hbaseContext,
  TableName.valueOf(tableName),
  t => {
    val rowKey = t._1

    val familyQualifiersValues = new FamiliesQualifiersValues
    t._2.foreach(f => {
      val family: Array[Byte] = f._1
      val qualifier = f._2
      val value: Array[Byte] = f._3

      familyQualifiersValues += (family, qualifier, value)
    })
    (new ByteArrayWrapper(Bytes.toBytes(rowKey)), familyQualifiersValues)
  },
  stagingFolder.getPath,
  new java.util.HashMap[Array[Byte], FamilyHFileWriteOptions],
  compactionExclude = false,
  20)

val load = new LoadIncrementalHFiles(config)
load.doBulkLoad(new Path(stagingFolder.getPath),
  conn.getAdmin, table, conn.getRegionLocator(TableName.valueOf(tableName)))
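
Compared with Example 48, the map function here emits an entire row at once: a ByteArrayWrapper around the row key paired with a FamiliesQualifiersValues collecting every (family, qualifier, value) cell of that row, which is what makes this the thin-record variant of the bulk load.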

#8
Lisrelchen posted on 2017-5-29 01:33:11
Example 11. Examples

#Create a namespace
create_namespace 'my_ns'

#Create my_table in the my_ns namespace
create 'my_ns:my_table', 'fam'

#Drop the namespace
drop_namespace 'my_ns'

#Alter the namespace
alter_namespace 'my_ns', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
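
The same namespace operations can also be done from code through the Admin API; a minimal Scala sketch, assuming an open Connection named connection (not part of the original shell example):

val admin = connection.getAdmin

// Create a namespace
admin.createNamespace(NamespaceDescriptor.create("my_ns").build())

// Create my_ns:my_table with one column family 'fam'
val desc = new HTableDescriptor(TableName.valueOf("my_ns:my_table"))
desc.addFamily(new HColumnDescriptor("fam"))
admin.createTable(desc)

// Alter the namespace: set a configuration property
admin.modifyNamespace(NamespaceDescriptor.create("my_ns")
  .addConfiguration("PROPERTY_NAME", "PROPERTY_VALUE").build())

// Drop the namespace (it must contain no tables)
admin.deleteNamespace("my_ns")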

#9
Lisrelchen posted on 2017-5-29 01:33:49
Example 12. Examples

#namespace=foo and table qualifier=bar
create 'foo:bar', 'fam'

#namespace=default and table qualifier=bar
create 'bar', 'fam'

#10
Lisrelchen posted on 2017-5-29 01:34:41
Example 13. Modify the Maximum Number of Versions for a Column Family

This example uses HBase Shell to keep a maximum of 5 versions of all columns in column family f1. You could also use HColumnDescriptor.

hbase> alter 't1', NAME => 'f1', VERSIONS => 5
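
The HColumnDescriptor route mentioned above, as a minimal sketch in Scala against the Java client API; it assumes an Admin handle named admin and an existing table t1, neither of which is set up in the original example:

val tableName = TableName.valueOf("t1")
val hcd = new HColumnDescriptor("f1")
hcd.setMaxVersions(5)                 // keep at most 5 versions of every column in f1
admin.modifyColumn(tableName, hcd)    // push the changed family descriptor to the table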
