第 12 章 - SQL语言艺术免费阅读 - 其他类型

hit_counter a,

    hit_counter b

    where b.tcomstamp = a.tcomstamp + 300

    and b.statistic_id = a.statistic_id

    order by a.tcomstamp, a.statistic_id

    上述脚本有重大缺陷：如果第二个累计值不是正好在第一个累计值之后5分钟取得的，那么就无

    法连接这两条记录。于是，我们改以“范围条件”定义连接。查询如下：

    select a.tcomstamp,

    a.statistic_id,

    (b.counter - a.counter) * 60 /

    (b.tcomstamp - a.tcomstamp) hits_per_minute

    from hit_counter a,

    hit_counter b

    where b.tcomstamp between a.tcomstamp + 200

    and a.tcomstamp + 400

    and b.statistic_id = a.statistic_id

    order by a.tcomstamp, a.statistic_id

    这个方法还是有缺陷：前后两次计算累计值的时间间隔，如果不介于200 到400 秒之间（例

    如取样频率改变了），如此之大的时间跨度就会引起风险。

    我们还有更安全的方法，就是使用基于“记录窗口（windows of rows）”的OLAP函数（OLAP

    function）。难以想象，这种本质上不太符合关系理论的技术可以显著提升xìng能，但应作为查询

    优化的最后手段使用。借助partition 子句，OLAP函数支持“分别处理结果集的不同子集”，比如

    分别对它们进行排序、总计等处理。借助OLAP 函数row_number()，可以根据statistic_id 建立

    子集，然后按时间戳增大的顺序为不同统计赋予连续整数编号，接下来，就可以连接statistic_id

    和两个序号了，如下例子所示：

    select a.tcomstamp,

    a.statistic_id,

    (b.counter - a.counter) * 60 /

    (b.tcomstamp - a.tcomstamp)

    from (select tcomstamp,

    statistic_id,

    counter,

    row_number( ) over (partition by statistic_id

    order by tcomstamp) rn

    from hit_counter) a,

    (select tcomstamp,

    statistic_id,

    counter,

    row_number( ) over (partition by statistic_id

    order by tcomstamp) rn

    from hit_counter) b