索引

定义

索引是为了加速对表中数据行的检索而创建的一种分散的存储结构。索引是针对表而建立的，它是由数据页面以外的索引页面组成的，每个索引页面中的行都会含有逻辑指针，以便加速检索物理数据。类似的像书的目录，前面有标题，后面有页数，便于方便快速的查找内容，而不是全量搜索。

索引工作原理

创建一个索引时，Hubble对你指定的列进行索引，它创建了列的备份，有了索引SQL可以便捷得使用索引来过滤值，而不需要全表逐行数据进行扫描，极大地提高了检索范围，时间上大幅度提高。

索引的应用

Hubble每个表都会自动创建一个primary索引，它对主键进行索引，如果没有主键，则为每个行唯一的rowid进行索引, 我们建议建表时候始终创建一个主键，而不要使用rowid，因为创建的主键将提供更好的查询性能。主键索引有助于过滤表的主键，可以通过以下方式对索引进行管理维护：

在CREATE TABLE建表时的INDEX子句中使用，也就是说在建表时候指定索引，Hubble还会自动为具有Unique约束的列创建二级索引。
对于已存在的表，使用CREATE INDEX语法进行建表。
使用ALTER TABLE添加唯一约束，它会为约束列自动创建索引。
在WHERE子句中用等号运算符（=、IN）过滤的列应在索引中排在第一位，并且在使用不等号运算符引用的列之前。
删除索引的语句drop index index_name。

优缺点

优点：

大大加快数据的检索速度
加速表和表之间的连接
在使用分组和排序子句进行数据检索时，可以显著减少查询中分组和排序的时间

缺点：

索引需要占物理空间
当对表中的数据进行增加、删除和修改的时候，索引也要动态的维护，降低了数据的维护速度

索引列

在设计索引时，重要的是要考虑索引哪些列以及列出它们的顺序。以下是一些指南，可帮助您做出最佳选择：每个表的主键（我们建议始终定义主键）会自动建立索引。它创建的索引不能更改，也不能在创建表后更改其主键，因此在建表时，确定如何创建主键索引是非常重要的。索引对查询很有帮助，即使仅过滤其列的前缀。例如，如果您创建列的索引(A，B，C)，则查询过滤有(A)或(A，B)的列都可以使用该索引。但是，未过滤带有(A)列的查询将不会从索引中得到性能提升的体现。索引在一张表中可以选用多个，但在实际应用中每张表的索引不会超过3个。

索引示例

示例1(主键索引)

 create table  custinfo(
    cust_no   varchar(30) not null,
    cust_name varchar(30) ,
    cust_card_no  varchar(18),  
    cust_phoneno  DECIMAL(15),
    primary key(cust_no)   
);

可以通过语句查看索引信息

show index from custinfo;

  table_name | index_name | non_unique | seq_in_index | column_name | direction | storing | implicit  
+------------+------------+------------+--------------+-------------+-----------+---------+----------+
  custinfo   | primary    |   false    |            1 | cust_no     | ASC       |  false  |  false

示例2(非主键索引)

create table  custinfo_bak(
    cust_no   varchar(30) not null,
    cust_name varchar(30) ,
    cust_card_no  varchar(18),  
    cust_phoneno  DECIMAL(15)
       
);

create index custbak_index on custinfo_bak (cust_card_no);

示例3(联合索引)

联合索引特别强调顺序性，联合索引大致类似上面的B+树结构，所以当索引的维护其实是以第一个字段来优先排序的，如果查询条件里没有第一个字段就没法通过索引比较来定位数据(like等造成的索引失效除外)。

只要列中包含有NULL值都将不会被包含在索引中，联合索引中只要有一列含有NULL值，那么这一列对于此联合索引就是无效的，所以我们在数据库设计时尽可能不要让字段的默认值为NULL。

 create table  school_student(
    stu_no     varchar(30) not null,
    stu_name   varchar(30) ,
    stu_class  varchar(18),  
    stu_order  DECIMAL(15)
    
);

create index ss_index on school_student (stu_no,stu_name,stu_class);

> select * from school_student where stu_no='1' and stu_name='zhangsan'; --走索引
> select * from school_student where stu_no='1' and stu_class='1';       --走索引
> select * from school_student where stu_no='1' and stu_name='zhangsan' and stu_class='1'; --走索引
> select * from school_student where stu_name='zhangsan' and stu_class='1'; --不走索引
> select * from school_student where stu_name='zhangsan'  ;--不走索引
> select * from school_student where stu_class='zhangsan' ;--不走索引

综上：只有包含第一列(stu_no)的时候才会走索引.

示例4(二级索引的应用)

o_ol_cnt通过STORING存储在索引中，则不再需要索引联接index join。

CREATE TABLE order_t (
	o_id integer NOT NULL,
	o_d_id integer NOT NULL,
	o_w_id integer NOT NULL,
	o_c_id integer not NULL,
	o_ol_cnt integer not NULL,
	o_all_local integer not NULL,
	PRIMARY KEY (o_w_id ASC, o_d_id ASC, o_id DESC),
	UNIQUE INDEX order_idx (o_w_id ASC, o_d_id ASC, o_c_id ASC, o_id DESC) 
);

explain select o_ol_cnt from order_t where o_d_id > 5 and o_w_id > 5 and o_c_id >2000;

       tree       |    field    |           description             
+-----------------+-------------+----------------------------------+
                  | distributed | true                              
                  | vectorized  | false                             
  render          |             |                                   
   └── index-join |             |                                   
        │         | table       | order_t@primary                   
        └── scan  |             |                                   
                  | table       | order_t@order_idx                 
                  | spans       | /6/6/2001-                        
                  | filter      | (o_d_id > 5) AND (o_c_id > 2000)

CREATE TABLE order_bak (
	o_id integer NOT NULL,
	o_d_id integer NOT NULL,
	o_w_id integer NOT NULL,
	o_c_id integer not NULL,
	o_ol_cnt integer not NULL,
	o_all_local integer not NULL,
	PRIMARY KEY (o_w_id ASC, o_d_id ASC, o_id DESC),
	UNIQUE INDEX order_idx (o_w_id ASC, o_d_id ASC, o_c_id ASC, o_id DESC) STORING (o_ol_cnt) --区分点
);

explain select o_ol_cnt from order_bak where o_d_id > 5 and o_w_id > 5 and o_c_id >2000;

    tree    |    field    |           description             
+-----------+-------------+----------------------------------+
            | distributed | true                              
            | vectorized  | false                             
  render    |             |                                   
   └── scan |             |                                   
            | table       | order_bak@order_idx               
            | spans       | /6/6/2001-                        
            | filter      | (o_d_id > 5) AND (o_c_id > 2000)

通过比较可看出：数据的查询仅需要从二级索引中读取，因此效率更高。

下载PDF

本页导航

定义
索引工作原理
索引的应用
优缺点
索引列
索引示例