I’m having issues when running the command MERGE INTO on Snowflake. The data is located in a bucket on S3. The files format are .snappy.parquet.
The command runs well, it identifies the files in S3, but it loads only NULL values to the table. The total row numbers are also good.
I confirmed that @myExternalStageToS3 points to the right location by running a query which returned the expected values:
SELECT
$1:DAY,
$1:CHANNEL_CATEGORY,
$1:SOURCE,
$1:PLATFORM,
$1:LOB
@myExternalStageToS3
(FILE_FORMAT => 'sf_parquet_format')
As it is a new table with no records, the condition uses INSERT.
MERGE INTO myTable as target USING
(
SELECT
$1:DAY,
$1:CHANNEL_CATEGORY,
$1:SOURCE,
$1:PLATFORM,
$1:LOB
FROM @myExternalStageToS3
(FILE_FORMAT => 'sf_parquet_format')
) as src
ON target.CHANNEL_CATEGORY = src.$1:CHANNEL_CATEGORY
AND target.SOURCE = src.$1:SOURCE
WHEN MATCHED THEN
UPDATE SET
DAY= src.$1:DAY
,CHANNEL_CATEGORY= src.$1:CHANNEL_CATEGORY
,SOURCE= src.$1:SOURCE
,PLATFORM= src.$1:PLATFORM
,LOB= src.$1:LOB
WHEN NOT MATCHED THEN
INSERT
(
DAY,
CHANNEL_CATEGORY,
SOURCE,
PLATFORM,
LOB
) VALUES
(
src.$1:DAY,
src.$1:CHANNEL_CATEGORY,
src.$1:SOURCE,
src.$1:PLATFORM,
src.$1:LOB
);
The sf_parque_format was created with these details:
create or replace file format sf_parquet_format
type = 'parquet'
compression = auto;
Do you have any idea what am I missing?
>Solution :
The query inside USING part was altered(data type casts and aliases):
MERGE INTO myTable as target USING (
SELECT
$1:DAY::TEXT AS DAY,
$1:CHANNEL_CATEGORY::TEXT AS CHANNEL_CATEGORY,
$1:SOURCE::TEXT AS SOURCE,
$1:PLATFORM::TEXT AS PLATFROM,
$1:LOB::TEXT AS LOB
FROM @myExternalStageToS3
(FILE_FORMAT => 'sf_parquet_format')
) as src
ON target.CHANNEL_CATEGORY = src.CHANNEL_CATEGORY
AND target.SOURCE = src.SOURCE
WHEN MATCHED THEN
UPDATE SET
DAY= src.DAY
,PLATFORM= src.PLATFORM
,LOB= src.LOB
WHEN NOT MATCHED THEN
INSERT (
DAY,
CHANNEL_CATEGORY,
SOURCE,
PLATFORM,
LOB
) VALUES (
src.DAY,
src.CHANNEL_CATEGORY,
src.SOURCE,
src.PLATFORM,
src.LOB
);
The UPDATE part does not require ,CHANNEL_CATEGORY= src.CHANNEL_CATEGORY ,SOURCE= src.SOURCE as condition is already met by ON clasue.