A SQL query needs to parse a JSON string stored in a field. However, the Chinese characters in that string were escaped when it was saved, so they are stored as Unicode escape sequences in a format like this:

\u6e38\u620f

In theory, the SQL condition would be written as follows:

get_json_object(extends,'$.cate')='\u6e38\u620f'

However, before the statement is actually executed it passes through several layers of parsing, and each layer can consume backslashes. So how many backslashes have to be written to select the correct rows?

The experimental conclusions are as follows:

At the interactive prompt

spark-hive>

two backslashes are enough:

spark-hive> ……get_json_object(extends,'$.cate')='\\u6e38\\u620f'……

If the statement is executed with

spark-hive -e "" > out.txt

four backslashes are needed, because the shell processes the double-quoted string first:

spark-hive -e "……get_json_object(extends,'$.cate')='\\\\u6e38\\\\u620f'……" > out.txt
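
The doubling can be reproduced without Spark. The following is a rough sketch, not the actual parser: it assumes a POSIX shell, and it assumes that the SQL string-literal parser collapses each pair of backslashes into one, which matches the experiment above but is not taken from the Spark or Hive source. The variable names are made up for illustration, and sed only stands in for the SQL layer.

```shell
#!/bin/sh
# Layer 1: the shell. Inside double quotes, \\ is an escape for one literal
# backslash, so four typed backslashes reach the program as two.
after_shell="\\\\u6e38\\\\u620f"
printf 'after the shell:      %s\n' "$after_shell"

# Layer 2 (assumption): the SQL string-literal parser also turns \\ into one
# literal backslash. sed is used here as a stand-in for that parser.
after_sql=$(printf '%s\n' "$after_shell" | sed 's/\\\\/\\/g')
printf 'after the SQL parser: %s\n' "$after_sql"
```

At the interactive prompt only the second layer applies, which would explain why two backslashes are enough there.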

