Analyzing XML Data:
- To load the xml data in local directory
$cat>xmltestfile.txt
<emp><ename>Kiran</ename><sal>10000</sal></emp>
<emp><ename>Seshu</ename><sal>20000</sal></emp>
<emp><ename>Ramu</ename><sal>30000</sal></emp>
<emp><ename>Rama</ename><sal>40000</sal></emp>
<emp><ename>Srinu</ename><sal>50000</sal></emp>
<emp><ename>Ravi</ename><sal>60000</sal></emp>
<emp><ename>Sandhya</ename><sal>70000</sal></emp>
^d
- To create hive table
Hive>create table
xmldata(str string);
- To load data into hive table from local xml file
hive> load data local
inpath 'xmltestfile.txt' into table xmldata;
- To create another table to extract the xml data
hive>create table
xmld1(ename array<string>, sal array<string>);
- To insert xmld table data into xmld1 table
hive>insert overwrite table
xmld1 select xpath(str, 'emp/ename/text()'), xpath(str,
'emp/sal/text()') from xmldata;
- To create another table to convert the data from array type to normal type
hive> create table
xmld2(ename string, sal int);
- To insert the data to xmld2 from xmld1
hive> insert overwrite
table xmld2 select ename[0],sal[0] from xmld1;
XPath UDFs:
XPath expressions :
Ex:
hive>
SELECT xpath(\'<a><b
id="foo">b1</b><b
id="bar">b2</b></a>\',\'//@id\')
> FROM
src LIMIT
1;
[foo","bar]
hive>
SELECT xpath
(\'<a><b
class="bb">b1</b><b>b2</b><b>b3</b><c
class="bb">c1</c>
<c>c2</c></a>\',
\'a/*[@class="bb"]/text()\')
> FROM
src LIMIT
1;
[b1","c1]
(The long XML string was wrapped
for space.)
hive>
SELECT xpath_double
(\'<a><b>2</b><c>4</c></a>\',
\'a/b +
a/c\')
> FROM
src LIMIT
1;
6.0
No comments:
Post a Comment