Concepts and data structures in InfluxDB

overview

database -> retention policy -> shard group -> shard -> bolt.DB

measurement&series

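measurement and series form the presentation layer of a database: a measurement corresponds to a metric in the database, and a series is the sequence of points within a metric that share the same tags. Both are used to look up data.

The data itself is stored under the database's RetentionPolicy objects (not a great name for it); a database can have many of them.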
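
To make the idea of "points sharing the same tags form one series" concrete, here is a minimal sketch. The `seriesKey` helper and its key format are assumptions made for illustration; they are not taken from the InfluxDB source.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesKey is a hypothetical helper: it builds a canonical key from a
// measurement name and a tag set, so that points with identical tags
// map to the same series regardless of the order the tags arrive in.
func seriesKey(measurement string, tags map[string]string) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random, so sort first

	var b strings.Builder
	b.WriteString(measurement)
	for _, k := range keys {
		b.WriteString("," + k + "=" + tags[k])
	}
	return b.String()
}

func main() {
	a := seriesKey("cpu", map[string]string{"host": "a", "region": "us"})
	b := seriesKey("cpu", map[string]string{"region": "us", "host": "a"})
	c := seriesKey("cpu", map[string]string{"host": "b", "region": "us"})
	fmt.Println(a == b) // same measurement and tags: same series
	fmt.Println(a == c) // one tag value differs: different series
}
```

A real index would map such a key to a numeric series id, as the `series map[uint64]*Series` field in the database struct suggests.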

type database struct {
	name string

	policies          map[string]*RetentionPolicy // retention policies by name
	continuousQueries []*ContinuousQuery          // continuous queries

	defaultRetentionPolicy string

	// in memory indexing structures
	measurements map[string]*Measurement // measurement name to object and index
	series       map[uint64]*Series      // map series id to the Series object
	names        []string                // sorted list of the measurement names
}
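
The struct above keeps retention policies in a map by name, next to the name of the default policy. A minimal sketch of how a lookup with a default fallback might work is shown here. The types are trimmed down, and `policyOrDefault` is a hypothetical helper, not a function from the InfluxDB source.

```go
package main

import "fmt"

// RetentionPolicy is trimmed down to the one field this sketch needs.
type RetentionPolicy struct {
	Name string
}

// database keeps only the fields involved in policy lookup.
type database struct {
	policies               map[string]*RetentionPolicy
	defaultRetentionPolicy string
}

// policyOrDefault is a hypothetical helper: an empty name falls back to
// the default retention policy, and an unknown name yields nil.
func (db *database) policyOrDefault(name string) *RetentionPolicy {
	if name == "" {
		name = db.defaultRetentionPolicy
	}
	return db.policies[name]
}

func main() {
	db := &database{
		policies: map[string]*RetentionPolicy{
			"default": {Name: "default"},
			"weekly":  {Name: "weekly"},
		},
		defaultRetentionPolicy: "default",
	}
	fmt.Println(db.policyOrDefault("").Name)          // falls back to the default policy
	fmt.Println(db.policyOrDefault("weekly").Name)    // explicit policy name
	fmt.Println(db.policyOrDefault("missing") == nil) // unknown policy
}
```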

retention policy

Its shardGroups field holds a slice of shard groups.

type RetentionPolicy struct {
	// Unique name within database. Required.
	Name string `json:"name"`

	// Length of time to keep data around. A zero duration means keep the data forever.
	Duration time.Duration `json:"duration"`

	// Length of time to create shard groups in.
	ShardGroupDuration time.Duration `json:"shardGroupDuration"`

	// The number of copies to make of each shard.
	ReplicaN uint32 `json:"replicaN"`

	shardGroups []*ShardGroup
}

shard group

A shard group contains a slice of shards.

// ShardGroup represents a group of shards created for a single time range.
type ShardGroup struct {
	ID        uint64    `json:"id,omitempty"`
	StartTime time.Time `json:"startTime,omitempty"`
	EndTime   time.Time `json:"endTime,omitempty"`
	Shards    []*Shard  `json:"shards,omitempty"`
}
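
Since each shard group covers a single time range, routing a point takes two steps: find the group whose range contains the timestamp, then pick a shard inside it. The sketch below assumes a half-open range [StartTime, EndTime) and a simple modulo on the series id. Both choices, and the helper names `groupFor` and `shardFor`, are illustrative assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// Shard is reduced to its ID for this sketch.
type Shard struct{ ID uint64 }

// ShardGroup mirrors the struct above, without JSON tags.
type ShardGroup struct {
	ID        uint64
	StartTime time.Time
	EndTime   time.Time
	Shards    []*Shard
}

// contains assumes a half-open range: StartTime <= t < EndTime.
func (g *ShardGroup) contains(t time.Time) bool {
	return !t.Before(g.StartTime) && t.Before(g.EndTime)
}

// groupFor returns the first group whose range contains t, or nil.
func groupFor(groups []*ShardGroup, t time.Time) *ShardGroup {
	for _, g := range groups {
		if g.contains(t) {
			return g
		}
	}
	return nil
}

// shardFor picks a shard by series id modulo the shard count
// (an assumed placement rule for this sketch).
func (g *ShardGroup) shardFor(seriesID uint64) *Shard {
	if len(g.Shards) == 0 {
		return nil
	}
	return g.Shards[seriesID%uint64(len(g.Shards))]
}

func main() {
	base := time.Date(2015, 3, 10, 0, 0, 0, 0, time.UTC)
	groups := []*ShardGroup{
		{ID: 1, StartTime: base, EndTime: base.Add(time.Hour),
			Shards: []*Shard{{ID: 10}, {ID: 11}}},
		{ID: 2, StartTime: base.Add(time.Hour), EndTime: base.Add(2 * time.Hour),
			Shards: []*Shard{{ID: 20}, {ID: 21}}},
	}

	g := groupFor(groups, base.Add(90*time.Minute))
	fmt.Println(g.ID, g.shardFor(7).ID) // prints "2 21"
}
```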

shard

What actually stores the data in a database is bolt.DB, which InfluxDB wraps as a shard.

type Shard struct {
	ID          uint64   `json:"id,omitempty"`
	DataNodeIDs []uint64 `json:"nodeIDs,omitempty"` // owners

	mu    sync.RWMutex
	index uint64        // highest replicated index
	store *bolt.DB      // underlying data store
	conn  MessagingConn // streaming connection to broker

	stats *Stats // In-memory stats

	wg      sync.WaitGroup // pending goroutines
	closing chan struct{}  // close notification
}

The store field of a shard holds a pointer to a bolt.DB, which is used to read and write the data.

So a database stores its data through the following chain:
database -> retention policy -> shard group -> shard -> bolt.DB

replicaN ranges over [1, len(nodes)]
shardN = len(nodes) / replicaN
so shardN ranges from len(nodes) down to 1
e.g. if len(nodes) == 100 && replicaN == 5
then shardN = 20
and these 20 shards compose a shard group
which means the cluster now has 5 shard groups, num(shard group) == replicaN
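
The shardN arithmetic above can be checked with a few lines of Go. This is a sketch only: `shardCount` is a hypothetical helper, and clamping the inputs and the result to at least 1 is an assumption added so that the division is always well defined.

```go
package main

import "fmt"

// shardCount follows the note above: shardN = len(nodes) / replicaN,
// using integer division. Clamping replicaN and the result to a
// minimum of 1 is an assumption of this sketch.
func shardCount(nodeN, replicaN int) int {
	if replicaN < 1 {
		replicaN = 1
	}
	n := nodeN / replicaN
	if n < 1 {
		n = 1
	}
	return n
}

func main() {
	for _, replicaN := range []int{1, 5, 100} {
		fmt.Printf("nodes=100 replicaN=%d shardN=%d\n", replicaN, shardCount(100, replicaN))
	}
}
```

With 100 nodes this prints shardN values of 100, 20 and 1, matching the range from len(nodes) down to 1.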