本篇主要演示Hive的数据操作,包括向表中装载数据、插入数据、创建表以及导出数据。
一 向表中装载数据
1 语法结构
[sql]
- LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
2 示例
[sql]
- hive> load data local inpath '/home/hadoop/emp/'
- > into table emp;
- Loading data to table hive.emp
- OK
- Time taken: 0.95 seconds
- hive> select *from emp;
- OK
- emp.id emp.name
- 1 Alen
- 2 Jane
- 3 Tom
- 4 Peter
- Time taken: 0.252 seconds, Fetched: 4 row(s)
- hive>
二 通过查询语句向表中插入数据
1 语法结构
[sql]
- INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1 FROM from_statement;
- INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;
- Hive extension (multiple inserts):
- FROM from_statement
- INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1
- [INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2]
- [INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2] ...;
- FROM from_statement
- INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1
- [INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2]
- [INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2] ...;
- Hive extension (dynamic partition inserts):
- INSERT OVERWRITE TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;
- INSERT INTO TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;
2 示例
[sql]
- hive> create table t_emp(id int , name string);
- OK
- Time taken: 1.611 seconds
- hive> insert overwrite table t_emp
- > select *from emp;
- hive> select *from t_emp;
- OK
- t_emp.id t_emp.name
- 1 Alen
- 2 Jane
- 3 Tom
- 4 Peter
- Time taken: 0.296 seconds, Fetched: 4 row(s)
三 单个查询语句中创建表并加载数据
通过Create Table AS实现:
[sql]
- hive> create table t_emp as
- > select *from emp;
- hive> select *from t_emp;
- OK
- t_emp.id t_emp.name
- 1 Alen
- 2 Jane
- 3 Tom
- 4 Peter
- Time taken: 0.311 seconds, Fetched: 4 row(s)
四 导出数据
1 语法结构
[html]
- Standard syntax:
- INSERT OVERWRITE [LOCAL] DIRECTORY directory1
- [ROW FORMAT row_format] [STORED AS file_format] (Note: Only available starting with Hive 0.11.0)
- SELECT ... FROM ...
- Hive extension (multiple inserts):
- FROM from_statement
- INSERT OVERWRITE [LOCAL] DIRECTORY directory1 select_statement1
- [INSERT OVERWRITE [LOCAL] DIRECTORY directory2 select_statement2] ...
- row_format
- : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]
- [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
- [NULL DEFINED AS char] (Note: Only available starting with Hive 0.13)
2 示例
[sql]
- hive> insert overwrite local directory '/home/hadoop/emp'
- > select *from emp;
- [hadoop@strong ~]$ vim /home/hadoop/emp/000000_0
- 1^AAlen
- 2^AJane
- 3^ATom
- 4^APeter
注:数据中的列与列之间的分隔符是^A(ascii码是\00001)。
3 使用Hadoop命令导出
[sql]
- [hadoop@strong ~]$ hadoop fs -copyToLocal /user/hive/warehouse/hive.db/emp/ /home/hadoop/
- [hadoop@strong ~]$ cat emp/*
- 1Alen
- 2Jane
- 3Tom
- 4Peter
五 插入值到表中
1 语法结构
[sql]
- Standard Syntax:
- INSERT INTO TABLE tablename [PARTITION (partcol1[=val1], partcol2[=val2] ...)] VALUES values_row [, values_row ...]
- Where values_row is:
- ( value [, value ...] )
- where a value is either null or any valid SQL literal
2 示例
[sql]
- hive> insert into emp values(5,'Jack'),(6,'Winter');
六 修改操作
1 语法结构
[sql]
- Standard Syntax:
- UPDATE tablename SET column = value [, column = value ...] [WHERE expression]
注:update操作仅仅在支持ACID的表上执行。
七 删除操作
1 语法结构
[sql]
- Standard Syntax:
- DELETE FROM tablename [WHERE expression]
注:delete操作仅仅在支持ACID的表上执行。
八 合并操作
1 语法结构
[sql]
- Standard Syntax:
- MERGE INTO <target table> AS T USING <source expression/table> AS S
- ON <boolean expression1>
- WHEN MATCHED [AND <boolean expression2>] THEN UPDATE SET <set clause list>
- WHEN MATCHED [AND <boolean expression3>] THEN DELETE
- WHEN NOT MATCHED [AND <boolean expression4>] THEN INSERT VALUES<value list>
注:merge操作仅仅在支持ACID的表上执行。
登录 | 立即注册