PostgreSQL tuple alignment padding - 对齐规则,以及如何选择字段顺序
PostgreSQL tuple内部有ALIGN机制,因此字段顺序选择实际上也是有讲究的,选择不好,可能因为ALIGN导致占用空间放大。
src/backend/access/common/heaptuple.c
*BeforePostgres8.3varlenasalwayshada4-bytelengthheader,and*thereforealwaysneeded4-bytealignment(atleast).Thiswastedspace*forshortvarlenas,forexampleCHAR(1)took5bytesandcouldneedupto*3additionalpaddingbytesforalignment.*Now,ashortvarlena(upto126databytes)isreducedtoa1-byteheader*andwedon'talignit.Tohidethisfromdatatype-specificfunctionsthat*don'twanttodealwithit,suchadatumisconsidered"toasted"andwill*beexpandedbacktothenormal4-byte-headerformatbypg_detoast_datum.*(Inperformance-criticalcodepathswecanusepg_detoast_datum_packed*andtheappropriateaccessmacrostoavoidthatoverhead.)Notethatthis*conversionisperformeddirectlyinheap_form_tuple,withoutinvoking*tuptoaster.c.
https://www.postgresql.org/docs/devel/static/catalog-pg-type.html
对齐规则tuple对齐规则
src/include/access/tupmacs.h
/**att_align_datumalignsthegivenoffsetasneededforadatumofalignment*requirementattalignandtyplenattlen.attdatumistheDatumvariable*weintendtopackintoatuple(it'sonlyaccessedifwearedealingwith*avarlenatype).NotethatthisassumestheDatumwillbestoredas-is;*callersthatareintendingtoconvertnon-shortvarlenadatumstoshort*formathavetoaccountforthatthemselves.*/#defineatt_align_datum(cur_offset,attalign,attlen,attdatum)\(\((attlen)==-1&&VARATT_IS_SHORT(DatumGetPointer(attdatum)))?\(uintptr_t)(cur_offset):\att_align_nominal(cur_offset,attalign)\)/**att_align_pointerperformsthesamecalculationasatt_align_datum,*butisusedwhenwalkingatuple.attptristhecurrentactualdata*pointer;whenaccessingavarlenafieldwehaveto"peek"toseeifwe*arelookingatapadbyteorthefirstbyteofa1-byte-headerdatum.*(Azerobytemustbeeitherapadbyte,orthefirstbyteofacorrectly*aligned4-bytelengthword;ineithercasewecanalignsafely.Anon-zero*bytemustbeeithera1-bytelengthword,orthefirstbyteofacorrectly*aligned4-bytelengthword;ineithercaseweneednotalign.)**Note:somecallerspassa"char*"pointerforcur_offset.Thisis*abitofahackbutshouldworkallrightaslongasuintptr_tisthe*correctwidth.*/#defineatt_align_pointer(cur_offset,attalign,attlen,attptr)\(\((attlen)==-1&&VARATT_NOT_PAD_BYTE(attptr))?\(uintptr_t)(cur_offset):\att_align_nominal(cur_offset,attalign)\)/**att_align_nominalalignsthegivenoffsetasneededforadatumofalignment*requirementattalign,ignoringanyconsiderationofpackedvarlenadatums.*Therearethreemainusecasesforusingthismacrodirectly:**weknowthattheattinquestionisnotvarlena(attlen!=-1);*inthiscaseitischeaperthantheabovemacrosandjustasgood.**weneedtoestimatealignmentpaddingcostabstractly,iewithout*referencetoarealtuple.Wemustassumetheworstcasethat*allvarlenasarealigned.**withinarrays,weunconditionallyalignvarlenas(XXXthisshouldbe*revisited,probably).**Theattaligncasesaretestedinwhatishopefullysomethingliketheir*frequencyofoccurrence.*/#defineatt_align_nominal(cur_offset,attalign)\(\((attalign)=='i')?INTALIGN(cur_offset):\(((attalign)=='c')?(uintptr_t)(cur_offset):\(((attalign)=='d')?DOUBLEALIGN(cur_offset):\(\AssertMacro((attalign)=='s'),\SHORTALIGN(cur_offset)\)))\)通过pg_attribute系统表,查看对齐情况
https://www.postgresql.org/docs/devel/static/catalog-pg-type.html
typalign is the alignment required when storing a value of this type. It applies to storage on disk as well as most representations of the value inside PostgreSQL. When multiple values are stored consecutively, such as in the representation of a complete row on disk, padding is inserted before a datum of this type so that it begins on the specified boundary. The alignment reference is the beginning of the first datum in the sequence.
Possiblevaluesare:c=charalignment,i.e.,noalignmentneeded.s=shortalignment(2bytesonmostmachines).i=intalignment(4bytesonmostmachines).d=doublealignment(8bytesonmanymachines,butbynomeansall).
表示在这个字段前面必须已经以以上大小对齐。
例如,
c表示,这个字段的前面已占用空间(包括24字节的tuple head)必须1字节的倍数。如果不是,则前面一个字段末尾必须有padding,使得符合这个对齐条件。
s表示,这个字段前面已占用空间(包括24字节的tuple head)必须是2字节的倍数。如果不是,则前面一个字段末尾必须有padding,使得符合这个对齐条件。
i表示,这个字段前面已占用空间(包括24字节的tuple head)必须是4字节的倍数。如果不是,则前面一个字段末尾必须有padding,使得符合这个对齐条件。
d表示,这个字段前面已占用空间(包括24字节的tuple head)必须是8字节的倍数。如果不是,则前面一个字段末尾必须有padding,使得符合这个对齐条件。
例如1、
1,4,8
会变成
1,3(padding),4,8
2、
1,1,8
会变成
1,1,6(padding),8
3、
1,1,1,8
会变成
1,1,1,5,8例子
1、空行
postgres=#selectpg_column_size(row());pg_column_size----------------24(1row)
2、PADDING
postgres=#selectpg_column_size(row(char'a',char'b',int4'1'));pg_column_size----------------32(1row)postgres=#selectpg_column_size(row(char'a',char'b',int8'1'));pg_column_size----------------40(1row)相关文档举例
https://blog.2ndquadrant.com/on-rocks-and-sand/
1、
SELECTpg_column_size(ROW())ASempty,pg_column_size(ROW(0::SMALLINT))ASbyte2,pg_column_size(ROW(0::BIGINT))ASbyte8,pg_column_size(ROW(0::SMALLINT,0::BIGINT))ASbyte16;--24,2,6(padding),8empty|byte2|byte8|byte16-------+-------+-------+--------24|26|32|40
2、包含padding的表
CREATETABLEuser_order(is_shippedBOOLEANNOTNULLDEFAULTFALSE,user_idBIGINTNOTNULL,order_totalNUMERICNOTNULL,order_dtTIMESTAMPTZNOTNULL,order_typeSMALLINTNOTNULL,ship_dtTIMESTAMPTZ,item_ctINTNOTNULL,ship_costNUMERIC,receive_dtTIMESTAMPTZ,tracking_cdTEXT,idBIGSERIALPRIMARYKEYNOTNULL);
查看这个表的对齐规则
SELECTa.attname,t.typname,t.typalign,t.typlenFROMpg_classcJOINpg_attributeaON(a.attrelid=c.oid)JOINpg_typetON(t.oid=a.atttypid)WHEREc.relname='user_order'ANDa.attnum>=0ORDERBYa.attnum;attname|typname|typalign|typlen-------------+-------------+----------+--------is_shipped|bool|c|1user_id|int8|d|8order_total|NUMERIC|i|-1order_dt|timestamptz|d|8order_type|int2|s|2ship_dt|timestamptz|d|8item_ct|int4|i|4ship_cost|NUMERIC|i|-1receive_dt|timestamptz|d|8tracking_cd|text|i|-1id|int8|d|8
3、插入测试数据
135MB
INSERTINTOuser_order(is_shipped,user_id,order_total,order_dt,order_type,ship_dt,item_ct,ship_cost,receive_dt,tracking_cd)SELECTTRUE,1000,500.00,now()-INTERVAL'7days',3,now()-INTERVAL'5days',10,4.99,now()-INTERVAL'3days','X5901324123479RROIENSTBKCV4'FROMgenerate_series(1,1000000);SELECTpg_relation_size('user_order')ASsize_bytes,pg_size_pretty(pg_relation_size('user_order'))ASsize_pretty;size_bytes|size_pretty------------+-------------141246464|135MB
4、调整字段顺序,可以把padding消除掉,基于前面说的规则即可。
SELECTpg_column_size(ROW())ASempty_row,pg_column_size(ROW(0::NUMERIC))ASno_val,pg_column_size(ROW(1::NUMERIC))ASno_dec,pg_column_size(ROW(9.9::NUMERIC))ASwith_dec,pg_column_size(ROW(1::INT2,1::NUMERIC))AScol2,pg_column_size(ROW(1::INT4,1::NUMERIC))AScol4,pg_column_size(ROW(1::NUMERIC,1::INT4))ASround8;empty_row|no_val|no_dec|with_dec|col2|col4|round8-----------+--------+--------+----------+------+------+--------24|27|29|31|31|33|36
SELECTpg_column_size(ROW())ASempty_row,pg_column_size(ROW(''::TEXT))ASno_text,pg_column_size(ROW('a'::TEXT))ASmin_text,pg_column_size(ROW(1::INT4,'a'::TEXT))AStwo_col,pg_column_size(ROW('a'::TEXT,1::INT4))ASround4;empty_row|no_text|min_text|two_col|round4-----------+---------+----------+---------+--------24|25|26|30|32
SELECTpg_column_size(ROW())ASempty_row,pg_column_size(ROW(1::SMALLINT))ASint2,pg_column_size(ROW(1::INT))ASint4,pg_column_size(ROW(1::BIGINT))ASint8,pg_column_size(ROW(1::SMALLINT,1::BIGINT))ASpadded,pg_column_size(ROW(1::INT,1::INT,1::BIGINT))ASnot_padded;empty_row|int2|int4|int8|padded|not_padded-----------+------+------+------+--------+------------24|26|28|32|40|40
5、消除tuple padding的优化,字段顺序如下
5.1、定长字段(从大到小)
5.2、变长字段
DROPTABLEuser_order;CREATETABLEuser_order(idBIGSERIALPRIMARYKEYNOTNULL,user_idBIGINTNOTNULL,order_dtTIMESTAMPTZNOTNULL,ship_dtTIMESTAMPTZ,receive_dtTIMESTAMPTZ,item_ctINTNOTNULL,order_typeSMALLINTNOTNULL,is_shippedBOOLEANNOTNULLDEFAULTFALSE,tracking_cdTEXT,order_totalNUMERICNOTNULL,ship_costNUMERIC);INSERTINTOuser_order(is_shipped,user_id,order_total,order_dt,order_type,ship_dt,item_ct,ship_cost,receive_dt,tracking_cd)SELECTTRUE,1000,500.00,now()-INTERVAL'7days',3,now()-INTERVAL'5days',10,4.99,now()-INTERVAL'3days','X5901324123479RROIENSTBKCV4'FROMgenerate_series(1,1000000);postgres=#\dt+user_orderListofrelationsSchema|Name|Type|Owner|Size|Description--------+------------+-------+----------+--------+-------------public|user_order|table|postgres|112MB|(1row)
6、优化后的padding情况,可以看到已经消除了padding,空间降低到112MB。
SELECTa.attname,t.typname,t.typalign,t.typlenFROMpg_classcJOINpg_attributeaON(a.attrelid=c.oid)JOINpg_typetON(t.oid=a.atttypid)WHEREc.relname='user_order'ANDa.attnum>=0ORDERBYt.typlenDESC;attname|typname|typalign|typlen-------------+-------------+----------+--------id|int8|d|8user_id|int8|d|8order_dt|timestamptz|d|8ship_dt|timestamptz|d|8receive_dt|timestamptz|d|8item_ct|int4|i|4order_type|int2|s|2is_shipped|bool|c|1tracking_cd|text|i|-1ship_cost|NUMERIC|i|-1order_total|NUMERIC|i|-1小结
消除tuple PADDING, 字段顺序规则:
1、定长字段(从大到小)
2、变长字段
本文的例子简单的说明了padding引入的TUPLE变大的情况,使用以上规则调整字段顺序后,空间占用下降了10%左右。
参考src/include/access/tupmacs.h
src/backend/access/common/heaptuple.c
https://blog.2ndquadrant.com/on-rocks-and-sand/
《Greenplum 优化CASE - 对齐JOIN字段类型,使用数组代替字符串,降低字符串处理开销,列存降低扫描开销》
《PostgreSQL 10.0 preview 性能增强 - pg_xact align(cacheline对齐)》
《未对齐(alignment)造成SSD 写放大一例》
https://yq.aliyun.com/articles/237
https://www.pgcon.org/2012/schedule/attachments/258_212_Internals%20Of%20PostgreSQL%20Wal.pdf
https://www.postgresql.org/docs/devel/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS
https://www.postgresql.org/docs/devel/static/wal-reliability.html
原文地址:https://github.com/digoal/blog/blob/master/201810/20181001_01.md
声明:本站所有文章资源内容,如无特殊说明或标注,均为采集网络资源。如若本站内容侵犯了原著者的合法权益,可联系本站删除。