Snowflake SQL group-by behaving differently depending whether columns are referenced by position or alias

I am trying to understand why the group by function is yielding different results in snowflake depending on how I reference the group-by fields. Here are two Queries that I believe should yield the same result, but do NOT:

Query using explicit field alias references:

select
    hash('SHA2_256', CONCAT('field1','field2','field3','field4')) as hash
    ,field1
    ,field2
    ,field3
    ,field4
    ,count(*) as count
from <table>
    where 
        <some filters>
    group by hash, field1, field2, field3, field4;

Query using positional references to fields:

select
    hash('SHA2_256', CONCAT('field1','field2','field3','field4')) as hash
    ,field1
    ,field2
    ,field3
    ,field4
    ,count(*) as count
from <table>
    where 
        <same filters as above>
    group by 1,2,3,4,5;

The first query yields significantly more records, suggesting maybe it isn’t applying a grouping field that is being applied in the second query, but based on the snowflake docs I really believe they should be the same. How are these two different?

>Solution :

The clue is that the aliased expression hash does not overshadow existing columns so:

select
    hash('SHA2_256', CONCAT('field1','field2','field3','field4')) as hash
    ,field1
    ,field2
    ,field3
    ,field4
    ,count(*) as count
from <table>
where <some filters>
group by hash, field1, field2, field3, field4;

is

select
    hash('SHA2_256', CONCAT('field1','field2','field3','field4')) as hash
    ,field1
    ,field2
    ,field3
    ,field4
    ,count(*) as count
from <table>
where <some filters>
group by <table>.hash, field1, field2, field3, field4;

which is different than:

select
    hash('SHA2_256', CONCAT('field1','field2','field3','field4')) as hash
    ,field1
    ,field2
    ,field3
    ,field4
    ,count(*) as count
from <table>
where <same filters as above>
group by 1,2,3,4,5;

Leave a Reply