Tag Archives: DB Design

Primary Key 跟 Table 可以在不同的 File Group

use Northwind
go
ALTER DATABASE [Northwind] ADD FILEGROUP [fg]
ALTER DATABASE [Northwind] ADD FILE
( NAME = N’Northwind1′, FILENAME = N’C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\Northwind1.ndf’  )
TO FILEGROUP [fg]

CREATE TABLE t(c1 int not null,c2 int)

CREATE CLUSTERED INDEX idx ON t(c1) ON [PRIMARY]

ALTER TABLE t ADD CONSTRAINT pk PRIMARY KEY(c1) ON fg

SELECT o.[name], o.[type], i.[name], i.[index_id], f.[name]
FROM sys.indexes i JOIN sys.filegroups f ON i.data_space_id = f.data_space_id
INNER JOIN sys.all_objects o ON i.[object_id] = o.[object_id]
AND o.name=’t’

image

DROP TABLE t

ALTER DATABASE [Northwind]  REMOVE FILE [Northwind1]
GO
ALTER DATABASE [Northwind] REMOVE FILEGROUP [fg]
GO

廣告

讓 SQL Server 所有的 File 一起長大

承修提了一個有趣的 trace flag,SQL 2008 後新增的 trace flag 1117,可以參考

http://blogs.msdn.com/b/axperf/archive/2011/09/12/consider-enabling-trace-flag-1117-on-dynamics-ax-sql-server.aspx

我的疑問是:一起等大長大,但若本來 SQL Server Instance 跨的硬碟大小不同,會不會造成因為小的空間不夠而無法增大?這或許要試試…

讀取 heap 的 FirstIAM

use tempdb

create table t(id int identity,c1 char(894))
–就算指定 100% 的 fillfactor 和 pad_index,在資料表擺放資料或建置索引時
–每個資料頁依然會保留空間
create index idx on t(id,c1) with fillfactor=100, PAD_INDEX
set nocount on
go
insert t values(‘hello’)
go 4000

 

select convert(char(30), name) ‘name’,
id,
indid,
convert(varchar(2), (convert(int, substring(first, 6, 1)) * power(2, 8)) +
(convert(int, substring(first, 5, 1)))) + ‘:’ +
convert(varchar(11),
(convert(int, substring(first, 4, 1)) * power(2, 24)) +
(convert(int, substring(first, 3, 1)) * power(2, 16)) +
(convert(int, substring(first, 2, 1)) * power(2, 8)) +
(convert(int, substring(first, 1, 1)))) ‘first’,
first,
convert(varchar(2), (convert(int, substring(root, 6, 1)) * power(2, 8)) +
(convert(int, substring(root, 5, 1)))) + ‘:’ +
convert(varchar(11),
(convert(int, substring(root, 4, 1)) * power(2, 24)) +
(convert(int, substring(root, 3, 1)) * power(2, 16)) +
(convert(int, substring(root, 2, 1)) * power(2, 8)) +
(convert(int, substring(root, 1, 1)))) ‘root’,
root,
convert(varchar(2), (convert(int, substring(firstIAM, 6, 1)) * power(2, 8))
+ (convert(int, substring(firstIAM, 5, 1)))) + ‘:’ +
convert(varchar(11),
(convert(int, substring(firstIAM, 4, 1)) * power(2, 24)) +
(convert(int, substring(firstIAM, 3, 1)) * power(2, 16)) +
(convert(int, substring(firstIAM, 2, 1)) * power(2, 8)) +
(convert(int, substring(firstIAM, 1, 1)))) ‘firstIAM’,
firstIAM
from sys.sysindexes
where id=OBJECT_ID(‘t’)

image

dbcc traceon(3604)
dbcc page(‘tempdb’,1,342,3)

執行結果

PAGE: (1:342)

BUFFER:

BUF @0x0000000089FC5A40

bpage = 0x0000000089646000           bhash = 0x0000000000000000           bpageno = (1:342)
bdbid = 2                            breferences = 0                      bcputicks = 0
bsampleCount = 0                     bUse1 = 11231                        bstat = 0xc0000b
blog = 0xcbbbb89b                    bnext = 0x0000000000000000          

PAGE HEADER:

Page @0x0000000089646000

m_pageId = (1:342)                   m_headerVersion = 1                  m_type = 10
m_typeFlagBits = 0x0                 m_level = 0                          m_flagBits = 0x0
m_objId (AllocUnitId.idObj) = 580029 m_indexId (AllocUnitId.idInd) = 8448
Metadata: AllocUnitId = 2377900641264402432                              
Metadata: PartitionId = 8502796127483068416                               Metadata: IndexId = 0
Metadata: ObjectId = 1694017166      m_prevPage = (0:0)                   m_nextPage = (0:0)
pminlen = 90                         m_slotCnt = 2                        m_freeCnt = 6
m_freeData = 8182                    m_reservedCnt = 0                    m_lsn = (420:255:130)
m_xactReserved = 0                   m_xdesId = (0:0)                     m_ghostRecCnt = 0
m_tornBits = 0                      

Allocation Status

GAM (1:2) = ALLOCATED                SGAM (1:3) = NOT ALLOCATED          
PFS (1:1) = 0x70 IAM_PG MIXED_EXT ALLOCATED   0_PCT_FULL                  DIFF (1:6) = NOT CHANGED
ML (1:7) = NOT MIN_LOGGED           

IAM: Header @0x0000000018CFA064 Slot 0, Offset 96

sequenceNumber = 0                   status = 0x0                         objectId = 0
indexId = 0                          page_count = 0                       start_pg = (1:0)

IAM: Single Page Allocations @0x0000000018CFA08E

Slot 0 = (1:330)                     Slot 1 = (1:359)                     Slot 2 = (1:364)
Slot 3 = (1:372)                     Slot 4 = (1:377)                     Slot 5 = (1:379)
Slot 6 = (1:382)                     Slot 7 = (1:387)                    

IAM: Extent Alloc Status Slot 1 @0x0000000018CFA0C2

(1:0)        – (1:448)      = NOT ALLOCATED                              
(1:456)      -              =     ALLOCATED                              
(1:464)      – (1:472)      = NOT ALLOCATED                              
(1:480)      -              =     ALLOCATED                              
(1:488)      -              = NOT ALLOCATED                              
(1:496)      -              =     ALLOCATED                              
(1:504)      – (1:552)      = NOT ALLOCATED                              
(1:560)      -              =     ALLOCATED                              
(1:568)      – (1:584)      = NOT ALLOCATED                              
(1:592)      -              =     ALLOCATED                              
(1:600)      -              = NOT ALLOCATED                              
(1:608)      -              =     ALLOCATED                              
(1:616)      – (1:632)      = NOT ALLOCATED                              
(1:640)      -              =     ALLOCATED                              
(1:648)      – (1:688)      = NOT ALLOCATED                              
(1:696)      -              =     ALLOCATED                              
(1:704)      – (1:752)      = NOT ALLOCATED                              
(1:760)      -              =     ALLOCATED                              
(1:768)      -              = NOT ALLOCATED                              
(1:776)      -              =     ALLOCATED                               
               
                         

Heap 的 Row Forward 與 Clustered Index 的 Page Split

觀察動態長度資料修改時,若該 page 無法提供足夠空間放變動後資料,會造成的影響。

若資料表為 Heap,則 SQL Server 會在該紀錄原址放一個 Forward 指標,並將新資料放在新增的 Page 內,這在修改上效率較佳。但查詢大量資料時效率較差,因為需來回逐紀錄讀取 Page

若資料表為 Clustered,則 SQL Server 會做 Page Split,這在大量修改超過長度的紀錄時效率較差。但查詢大量資料時效率較佳

Heap

建立 Heap:

use tempdb
drop table Heap
create table Heap(c1 int identity,c2 varchar(1000))
go
set nocount on
insert Heap values(‘a’)
go 1000

begin tran
    update Heap set c2=REPLICATE(‘a’,500) where c1<10

觀察 Performance Counter

\SQL Server:Access Methods\Forwarded Records/sec  : 9.007

\SQL Server:Access Methods\Page Splits/sec : 2.002


    exec sp_lock @@spid

image

    dbcc traceon(3604)
    dbcc page(tempdb,1,455,3)
commit

DBCC 的執行已經完成。如果 DBCC 印出錯誤訊息,請連絡您的系統管理員。

PAGE: (1:455)

BUFFER:

BUF @0x0000000097FF6A00

bpage = 0x0000000097E70000           bhash = 0x0000000000000000           bpageno = (1:455)
bdbid = 2                            breferences = 0                      bcputicks = 0
bsampleCount = 0                     bUse1 = 26877                        bstat = 0xc0000b
blog = 0xcbcbbbbb                    bnext = 0x0000000000000000          

PAGE HEADER:

Page @0x0000000097E70000

m_pageId = (1:455)                   m_headerVersion = 1                  m_type = 1
m_typeFlagBits = 0x4                 m_level = 0                          m_flagBits = 0x8
m_objId (AllocUnitId.idObj) = 628154 m_indexId (AllocUnitId.idInd) = 28160
Metadata: AllocUnitId = 7926335385338773504                              
Metadata: PartitionId = 7782220191937069056                               Metadata: IndexId = 0
Metadata: ObjectId = 1930489956      m_prevPage = (0:0)                   m_nextPage = (0:0)
pminlen = 8                          m_slotCnt = 428                      m_freeCnt = 455
m_freeData = 7025                    m_reservedCnt = 63                   m_lsn = (151:433:215)
m_xactReserved = 63                  m_xdesId = (0:21857)                 m_ghostRecCnt = 0
m_tornBits = 0                      

Allocation Status

GAM (1:2) = ALLOCATED                SGAM (1:3) = ALLOCATED              
PFS (1:1) = 0x63 MIXED_EXT ALLOCATED  95_PCT_FULL                         DIFF (1:6) = NOT CHANGED
ML (1:7) = NOT MIN_LOGGED           

Slot 0 Offset 0x1b20 Length 9

Record Type = FORWARDING_STUB        Record Attributes =                  Record Size = 9

Memory Dump @0x000000001694BB20

0000000000000000:   04e30100 00010090 00†††††††††††††††††………       
Forwarding to  =  file 1 page 483 slot 144                               

Slot 1 Offset 0x1b29 Length 9

 

select avg_fragmentation_in_percent,fragment_count,avg_fragment_size_in_pages,page_count,avg_page_space_used_in_percent,forwarded_record_count
from sys.dm_db_index_physical_stats(DB_ID(),NULL,NULL,NULL,’DETAILED’)
WHERE object_id=object_id(‘Heap’)

結果

avg_fragmentation_in_percent fragment_count       avg_fragment_size_in_pages page_count           avg_page_space_used_in_percent forwarded_record_count
—————————- ——————– ————————– ——————– —————————— ———————-
50                           2                    1.5                        3                    93.451939708426                9

查詢資料時會浪費 I/O,往返讀取 Forward 紀錄

set statistics io on
select * from Heap where c1 < 10

資料表 ‘Heap’。掃描計數 1,邏輯讀取 12,實體讀取 0,讀取前讀取 0,LOB 邏輯讀取 0,LOB 實體讀取 0,LOB 讀取前讀取 0。

SQL Server 2008 後才提供重整 Heap 的能力

alter table Heap rebuild

select avg_fragmentation_in_percent,fragment_count,avg_fragment_size_in_pages,page_count,avg_page_space_used_in_percent,forwarded_record_count
from sys.dm_db_index_physical_stats(DB_ID(),NULL,NULL,NULL,’DETAILED’)
WHERE object_id=object_id(‘Heap’)

avg_fragmentation_in_percent fragment_count       avg_fragment_size_in_pages page_count           avg_page_space_used_in_percent forwarded_record_count
—————————- ——————– ————————– ——————– —————————— ———————-
50                           2                    2                          4                    69.4495922905856               0

重整後 I/O 減少了

select * from Heap where c1 < 10

資料表 ‘Heap’。掃描計數 1,邏輯讀取 4,實體讀取 0,讀取前讀取 0,LOB 邏輯讀取 0,LOB 實體讀取 0,LOB 讀取前讀取 0。

 

Clustered Index

若是採用 Clustered Index,則會造成 Page Split

建立 Clustered Index:

drop table tblClustered
create table tblClustered(c1 int identity primary key,c2 varchar(1000))
go


set nocount on
insert tblClustered values(‘a’)
go 1000

begin tran
    update tblClustered set c2=REPLICATE(‘a’,500) where c1<10

觀察 Performance Counter

\SQL Server:Access Methods\Forwarded Records/sec : 0

\SQL Server:Access Methods\Page Splits/sec : 7.999

exec sp_lock @@spid

image

dbcc traceon(3604)
dbcc page(tempdb,1,448,3)
commit

DBCC 的執行已經完成。如果 DBCC 印出錯誤訊息,請連絡您的系統管理員。

PAGE: (1:448)

BUFFER:

BUF @0x0000000099FF2A40

bpage = 0x0000000099DC6000           bhash = 0x0000000000000000           bpageno = (1:448)
bdbid = 2                            breferences = 0                      bcputicks = 0
bsampleCount = 0                     bUse1 = 27917                        bstat = 0xc0000b
blog = 0xbcbcbbbb                    bnext = 0x0000000000000000          

PAGE HEADER:

Page @0x0000000099DC6000

m_pageId = (1:448)                   m_headerVersion = 1                  m_type = 1
m_typeFlagBits = 0x4                 m_level = 0                          m_flagBits = 0x8000
m_objId (AllocUnitId.idObj) = 647529 m_indexId (AllocUnitId.idInd) = 3328
Metadata: AllocUnitId = 936748764929523712                               
Metadata: PartitionId = 216172818896715776                                Metadata: IndexId = 1
Metadata: ObjectId = 39007220        m_prevPage = (1:240)                 m_nextPage = (1:247)
pminlen = 8                          m_slotCnt = 217                      m_freeCnt = 3192
m_freeData = 4582                    m_reservedCnt = 0                    m_lsn = (159:376:27)
m_xactReserved = 0                   m_xdesId = (0:0)                     m_ghostRecCnt = 0
m_tornBits = 0                      

Allocation Status

GAM (1:2) = ALLOCATED                SGAM (1:3) = ALLOCATED              
PFS (1:1) = 0x60 MIXED_EXT ALLOCATED   0_PCT_FULL                         DIFF (1:6) = NOT CHANGED
ML (1:7) = NOT MIN_LOGGED           

Slot 0 Offset 0x60 Length 515

Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP VARIABLE_COLUMNS
Record Size = 515                   
Memory Dump @0x000000001011A060

0000000000000000:   30000800 08000000 02000001 00030261 †0…………..a
0000000000000010:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000020:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000030:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000040:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000050:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000060:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000070:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000080:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000090:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000000A0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000000B0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000000C0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000000D0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000000E0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000000F0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000100:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000110:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000120:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000130:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000140:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000150:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000160:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000170:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000180:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000190:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000001A0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000001B0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000001C0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000001D0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000001E0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000000000001F0:   61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
0000000000000200:   616161†††††††††††††††††††††††††††††††aaa             

Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

select avg_fragmentation_in_percent,fragment_count,avg_fragment_size_in_pages,page_count,avg_page_space_used_in_percent,forwarded_record_count
from sys.dm_db_index_physical_stats(DB_ID(),NULL,NULL,NULL,’DETAILED’)
WHERE object_id=object_id(‘tblClustered’)

結果

avg_fragmentation_in_percent fragment_count       avg_fragment_size_in_pages page_count           avg_page_space_used_in_percent forwarded_record_count
—————————- ——————– ————————– ——————– —————————— ———————-
60                           4                    1.25                       5                    55.5497899678774              NULL
0                            1                    1                          1                    0.778354336545589             NULL

查詢資料時直接 Scan

set statistics io on
select * from Heap where c1 < 10

資料表 ‘tblClustered’。掃描計數 1,邏輯讀取 3,實體讀取 0,讀取前讀取 0,LOB 邏輯讀取 0,LOB 實體讀取 0,LOB 讀取前讀取 0。

資料壓縮可以依不同 table 不同 Partition 定義不同的壓縮方式

CREATE DATABASE [test] ON  PRIMARY
( NAME = N’test’, FILENAME = N’C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\test.mdf’),
FILEGROUP [fg2]
( NAME = N’test2′, FILENAME = N’C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\test2.ndf’),
FILEGROUP [fg3]
( NAME = N’test3′, FILENAME = N’C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\test3.ndf’)
LOG ON
( NAME = N’test_log’, FILENAME = N’C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\test_log.ldf’)
GO

USE [test]
GO
CREATE PARTITION FUNCTION [fn_PartFn](varchar(50)) AS RANGE LEFT FOR VALUES (N’F’,N’L’)

CREATE PARTITION SCHEME [fn_PartSc] AS PARTITION [fn_PartFn] TO ([PRIMARY], [fg2],[fg3])

CREATE TABLE [dbo].[tb](
    [c1] [int] NOT NULL,
    [c2] [char](10) NULL,
    [c3] [varchar](50) NOT NULL
    PRIMARY KEY (c3,c1)
)ON [fn_PartSc]([c3])

CREATE TABLE [dbo].[tb2](
    [c1] [int] NOT NULL,
    [c2] [char](10) NULL,
    [c3] [varchar](50) NOT NULL
    PRIMARY KEY (c3,c1)
)ON [fn_PartSc]([c3])

ALTER TABLE [dbo].[tb] REBUILD PARTITION=ALL
WITH
(DATA_COMPRESSION = NONE ON PARTITIONS(1),
DATA_COMPRESSION = ROW ON PARTITIONS(2),
DATA_COMPRESSION = PAGE ON PARTITIONS(3)
)

ALTER TABLE [dbo].[tb2] REBUILD PARTITION = 1 WITH(DATA_COMPRESSION = ROW )
ALTER TABLE [dbo].[tb2] REBUILD PARTITION = 2 WITH(DATA_COMPRESSION = PAGE )
ALTER TABLE [dbo].[tb2] REBUILD PARTITION = 3 WITH(DATA_COMPRESSION = NONE )

insert tb values(1,’a’,’a’),(2,’g’,’g’),(3,’m’,’m’)
insert tb2 values(1,’a’,’a’),(2,’g’,’g’),(3,’m’,’m’)

SELECT $PARTITION.fn_PartFn(c3) AS Partition,
COUNT(*) AS [COUNT] FROM tb
GROUP BY $PARTITION.fn_PartFn(c3)
ORDER BY Partition ;

SELECT
p.partition_number AS [PartitionNumber],
p.data_compression AS [DataCompression],
prv.value AS [RightBoundaryValue],
fg.name AS [FileGroupName],
CAST(p.rows AS float) AS [RowCount],
OBJECT_NAME(idx.Object_ID) ObjectName
FROM
sys.tables AS tbl
INNER JOIN sys.indexes AS idx ON idx.object_id = tbl.object_id
INNER JOIN sys.partitions AS p ON p.object_id=CAST(tbl.object_id AS int) AND p.index_id=idx.index_id
INNER JOIN sys.indexes AS indx ON p.object_id = indx.object_id and p.index_id = indx.index_id
LEFT OUTER JOIN sys.destination_data_spaces AS dds ON dds.partition_scheme_id = indx.data_space_id and dds.destination_id = p.partition_number
LEFT OUTER JOIN sys.partition_schemes AS ps ON ps.data_space_id = indx.data_space_id
LEFT OUTER JOIN sys.partition_range_values AS prv ON prv.boundary_id = p.partition_number and prv.function_id = ps.function_id
LEFT OUTER JOIN sys.filegroups AS fg ON  fg.data_space_id = dds.data_space_id or fg.data_space_id = indx.data_space_id
ORDER BY
[PartitionNumber],idx.object_id ASC

USE master
DROP DATABASE test

建立複合索引的順序

前晚許志學顧問分享了一個須注意的 primary key 結合 clustered index 的案例。例如多對多關係時,鍵值順序如何?

CREATE TABLE A(A1 INT PRIMARY KEY)
CREATE TABLE B(B1 INT PRIMARY KEY)

CREATE TABLE C(A1 INT,B1 INT PRIMARY KEY(A1,B1))
還是
CREATE TABLE C(A1 INT,B1 INT PRIMARY KEY(B1,A1))

ALTER TABLE C ADD CONSTRAINT FK_A FOREIGN KEY(A1) REFERENCES A(A1)
ALTER TABLE C ADD CONSTRAINT FK_B FOREIGN KEY(B1) REFERENCES B(B1)

例如:A 代表訂單,B 代表消費原因,一張訂單可能有多種消費原因,一種消費原因可能發生在多張訂單上,則 A1 是訂單編號,B1 是消費原因編號

我贊同他的觀點並加上自己的想法:

  • 若考慮大量資料新增,則哪個資料表常新增,則依它的新增順序讓該欄位排前面,避免 page split
  • 若以查詢而言,則越多記錄的,相較起來值越唯一的放在前面,大量重複的資料不適合當索引
  • C table 不一定要採用複合鍵當主鍵,仍可以另建 identity 欄位當主鍵

若設計一層層展開的子 table,每更深一層的子 table 都加上父 table 的 PK 再加一欄子 table 有意義的欄位;當作子 table 的 PK。則原理相同,通常建複合主鍵也是叢集索引時,依然是該子 table 的流水號放第一個欄位。但,我依然會嘗試另建子 table 欄位內的 identity 流水號當作 PK 與叢集索引。避免重集索引太大,且意義多,容易需要變動。

驗證未發生 Checkpoint 前,交易結果僅在 Log 和 Memory,未在資料庫檔

中午與俊宇聊天,他提及設定 recovery interval 來減少 I/O,讓我興起驗證是否這些動作是否僅有 .ldf 和記憶體有,但資料庫檔沒有,這麼做可以減少資料庫檔的 I/O,因為 .ldf 的 I/O 是免不了的。

換句話說,可能對同一筆紀錄發生的十次修改,這在 ldf 有十次紀錄。但因為延長 recovery interval,造成只做了一次 checkpoint,把十次對 memory data page 的最終結果,僅對 mdf 更新一次。

結果證明俊宇說的應該可行

–增長 CheckPoint 的週期

sp_configure ‘recovery interval (min)’,30
reconfigure

–簡單建點資料

create database dbTest
go
use dbTest
create table tb(c1 int identity, c2 char(100) default(‘Hello’),c3 uniqueidentifier default(newsequentialid()))
CHECKPOINT;

insert tb default values
select * from tb

begin tran
insert tb default values
exec sp_lock @@spid

image

commit
dbcc traceon(3604)
dbcc page(dbTest,1,80,1)

執行結果:

DBCC 的執行已經完成。如果 DBCC 印出錯誤訊息,請連絡您的系統管理員。

PAGE: (1:80)

BUFFER:

BUF @0x0000000086FABE80

bpage = 0x00000000861FC000           bhash = 0x0000000000000000           bpageno = (1:80)
bdbid = 28                           breferences = 0                      bcputicks = 0
bsampleCount = 0                     bUse1 = 48270                        bstat = 0xc0000b
blog = 0xca2159bb                    bnext = 0x0000000000000000

PAGE HEADER:

Page @0x00000000861FC000

m_pageId = (1:80)                    m_headerVersion = 1                  m_type = 1
m_typeFlagBits = 0x4                 m_level = 0                          m_flagBits = 0x8000
m_objId (AllocUnitId.idObj) = 29     m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594039828480
Metadata: PartitionId = 72057594038779904                                 Metadata: IndexId = 0
Metadata: ObjectId = 2105058535      m_prevPage = (0:0)                   m_nextPage = (0:0)
pminlen = 124                        m_slotCnt = 2                        m_freeCnt = 7838
m_freeData = 350                     m_reservedCnt = 0                    m_lsn = (36:144:3)
m_xactReserved = 0                   m_xdesId = (0:0)                     m_ghostRecCnt = 0
m_tornBits = 0

Allocation Status

GAM (1:2) = ALLOCATED                SGAM (1:3) = ALLOCATED
PFS (1:1) = 0x61 MIXED_EXT ALLOCATED  50_PCT_FULL                         DIFF (1:6) = CHANGED
ML (1:7) = NOT MIN_LOGGED

DATA:

Slot 0, Offset 0x60, Length 127, DumpStyle BYTE

Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP     Record Size = 127

Memory Dump @0x0000000014B9A060

0000000000000000:   10007c00 01000000 48656c6c 6f202020 †..|…..Hello   
0000000000000010:   20202020 20202020 20202020 20202020 †
0000000000000020:   20202020 20202020 20202020 20202020 †
0000000000000030:   20202020 20202020 20202020 20202020 †
0000000000000040:   20202020 20202020 20202020 20202020 †
0000000000000050:   20202020 20202020 20202020 20202020 †
0000000000000060:   20202020 20202020 20202020 0b9dc040 †            …@
0000000000000070:   7a4211e0 a505705a b6d3467d 030000††††zB….pZ..F}…

Slot 1, Offset 0xdf, Length 127, DumpStyle BYTE

Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP     Record Size = 127

Memory Dump @0x0000000014B9A0DF

0000000000000000:   10007c00 02000000 48656c6c 6f202020 †..|…..Hello
0000000000000010:   20202020 20202020 20202020 20202020 †
0000000000000020:   20202020 20202020 20202020 20202020 †
0000000000000030:   20202020 20202020 20202020 20202020 †
0000000000000040:   20202020 20202020 20202020 20202020 †
0000000000000050:   20202020 20202020 20202020 20202020 †
0000000000000060:   20202020 20202020 20202020 23f4a126 †            #..&
0000000000000070:   7a4211e0 a505705a b6d3467d 030000††††zB….pZ..F}…

OFFSET TABLE:

Row – Offset
1 (0x1) – 223 (0xdf)
0 (0x0) – 96 (0x60)

 

證明 Memory 資料都有上去

dbcc log(dbTest,3)

在 RowLog Contents 0 欄位的確看得到兩筆紀錄

立刻關掉 SQL Server,不准讓 Log 的內容更新到 mdf

SHUTDOWN WITH NOWAIT;

透過檔案總管刪掉 .ldf 後,重新啟動。透過以下語法救回 DB,等於是重建 ldf,但先前的交易內容就回不來了

alter database dbTest set emergency
alter database dbTest set single_user
DBCC CHECKDB (dbTest, REPAIR_ALLOW_DATA_LOSS)WITH  ALL_ERRORMSGS;

執行結果:

檔案啟用錯誤。實體檔案名稱 “C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\dbTest_log.LDF" 可能不正確。
無法重建記錄,因為關閉資料庫時仍有開啟的交易/使用者、資料庫未發生檢查點,或資料庫是唯讀的。如果手動刪除交易記錄檔,或因硬體或環境失敗而遺失交易記錄檔,就可能發生這種錯誤。
警告: 資料庫 ‘dbTest’ 的記錄已經重建。已遺失交易一致性。RESTORE 鏈已中斷,伺服器已經沒有先前記錄檔的內容,因此您必須知道這些內容。您應該執行 DBCC CHECKDB 來驗證實體一致性。資料庫已進入僅限 DBO 模式。您準備好讓資料庫能供使用後,必須重設資料庫選項,並刪除任何額外的記錄檔。

讓資料庫回復正常:

alter database dbTest set multi_user
use dbTest
select * from tb

果然那些資料都沒有。

SSD 用在資料庫檔,SAS HD 用在交易紀錄檔

看到 SQL MVP 的 e-mail 討論最近 HP 在 TPC-C 發佈的 SQL 2005 交易紀錄

http://www.tpc.org/results/FDR/TPCC/HP_ProLiant_DL580G7_2.26GHz_fdr_100830_Energy_v2.pdf

The database data was housed on 256 SSD drives at 120GB, two 72GB drives for the operating system, two 146GB drives for additional operating system space, 48 drives at 300GB for database log and 80 drives at 300 GB for backup and 60 day space. There were 8 x LSI 9200-16e controllers connected to 16 x D2700 storage boxes with 16 x 120GB SSD drives each for database data and 5 x 300GB drives each for backup, 48 x 300GB drives on one MSA P2000 G3 controller connected to 1 D2700 storage box for database log, and 2 x 72GB drives plus 2 x 146GB drives on the SMART P410i controller for the operating system. Section 1.2 of this report details the distribution of database tables across all disks.

MVP 們討論的重點是 SSD 用在資料庫檔,SAS HD 用在交易紀錄檔。因為 SSD 的隨機讀出效率高,大量多次寫入效率不穩且易損毀(隨機寫入更慘)。其技術原因可參考如下

http://en.wikipedia.org/wiki/Write_amplification

這份 HP 的報告有趣的地方是沒說 Tempdb 放在什麼樣格式的硬碟,是否也用一組 SSD 放 DB File 而 SAS HD 放 TLog,但因 Tempdb 是大量讀寫,是否比較適合全放在 SAS HD?