Wednesday, July 18, 2012

HAVING and CALCULATED in PROC SQL

HAVING: The distinction between HAVING and the WHERE clause is that HAVING conditions can reference summary statistics and are evaluated after aggregations are performed. Thus they take effect “downstream,” on the output side of the process. So we can use WHERE statement in the data sets to replace this HAVING, as is done below.

CALCULATED: if you want to use the calculated result during the SQL select clause, you need to add CALCULATED otherwise there will be error like: ERROR: The following columns were not found in the contributing tables: price_range

In the following the first one is faster than the second, although they generate the same results.


proc sql;
create table outdata.event_trends1 as
select event_id, active_max_price-active_min_price as price_range, max(calculated price_range) as max_range, datepart(trend_time) as trend_date
from fansnapd.event_trends
group by event_id
having calculated price_range=max_range
order by event_id;
quit;

proc sql;
create table outdata.event_trends(where=(price_range=max_range)) as
select event_id, active_max_price-active_min_price as price_range, max(calculated price_range) as max_range, datepart(trend_time) as trend_date
from fansnapd.event_trends
group by event_id
order by event_id;
quit;

No comments:

Post a Comment