How to detect 4-byte UTF8 characters in Oracle

You may have made a mistake when building the regular expression. Here is a short example.

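-- Create a test table: four supplementary (4-byte UTF8) characters written as
-- surrogate pairs, one Cyrillic letter (\041f), and two ASCII letters.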
create table tmp_a as 
select unistr('\D841\DF0E') col from dual;
insert into tmp_a(col)
values(UNISTR('\D800\DC00'));
insert into tmp_a(col)
values(UNISTR('\D800\DC01'));
insert into tmp_a(col)
values(UNISTR('\D803\DC03'));
insert into tmp_a(col)
values(UNISTR('\041f'));
insert into tmp_a(col)
values('a');
insert into tmp_a(col)
values('b');


-- Then check: there should be 7 rows, and only 4 of them should pass the check (non-zero chk).
select col, dump(col), regexp_instr(col, '['||UNISTR('\F090\8080')||'-'||UNISTR('\F48F\BFBF')||']') as chk from tmp_a;


-- Finally, we can build the following query with regexp_like, as in your example.
select count(*)
  from tmp_a
 where regexp_like(col, '['||UNISTR('\F090\8080')||'-'||UNISTR('\F48F\BFBF')||']');

This works on Oracle 11.2.0.4 and 12.2.0.1.
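
As a cross-check that does not rely on a regular expression, the sketch below compares LENGTH2 (documented as counting UCS2 code units) with LENGTH4 (documented as counting UCS4 code points). This is an alternative technique, not part of the example above, and it has not been verified on the versions listed. It assumes that a character outside the Basic Multilingual Plane counts as two UCS2 units but only one UCS4 code point, so the two counts should differ only for rows that contain such a character. It reuses the tmp_a table; the aliases ucs2_units and ucs4_points are made up for illustration.

```sql
-- Cross-check (sketch, assumption: a surrogate pair counts as 2 in LENGTH2
-- and as 1 in LENGTH4): list rows holding at least one supplementary character.
select col,
       length2(col) as ucs2_units,
       length4(col) as ucs4_points
  from tmp_a
 where length2(col) > length4(col);
```

If both approaches return the same rows for your data, that is a reasonable sign that the regular expression is not matching anything it should not.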