HIVE: Both Left and Right Aliases Encountered in Join

Recently was stuck on this error while trying to do a JOIN in HIVE. Below is my SQL
create table ma.internet_v2 as
select A.msisdn, A.imsi, A.dt,
A.start_time, A.end_time, A.url, A.ttl_connection_dur_ms,
A.ttl_upload_bytes, A.ttl_download_bytes, A.ttl_cdr_cnt,
coalesce(B.domain_desc,A.domain_desc) domain_desc,
coalesce(B.subdomain_desc, A.subdomain_desc) subdomain_desc
db.internet A
left outer join
hfz_domain_name_mapping B
on instr(A.url, B.url_pattern) > 0;
HIVE then give me and error message "Both Left and Right Aliases Encountered in Join".

After going through some mail threads it seems that HIVE can support certain types of join.
Only equality joins, outer joins, and left semi joins are supported in Hive. Hive does not support join conditions that are not equality conditions as it is very difficult to express such conditions as a map/reduce job.
A poster from Stack Overflow suggested that one should use WHERE instead of ON. Duly following his advice, it seemed to do the trick. Final script is as below.
create table ma.internet_v2 as select A.msisdn, A.imsi, A.dt, A.start_time, A.end_time, A.url, A.ttl_connection_dur_ms,  A.ttl_upload_bytes, A.ttl_download_bytes, A.ttl_cdr_cnt,  coalesce(B.domain_desc,A.domain_desc) domain_desc, coalesce(B.subdomain_desc, A.subdomain_desc) subdomain_desc from db.internet A left outer join hfz_domain_name_mapping B where instr(A.url, B.url_pattern) > 0;
And now to wait for my map reduce to finish. Time to get some coffee :)


Popular posts from this blog

Assign select result to variable in Netezza stored procedure

Splitting value in Netezza using array_split