What is the purpose of PAD_INDEX in this SQL Server constraint?

I have the following constraint being applied to one of my tables, but I don't know what PAD_INDEX means.

Can someone enlighten me?

CONSTRAINT [PK_Employees] PRIMARY KEY CLUSTERED
(
    [EmployeeId] ASC
) WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
        ^--------------^
        this part here

sql sql-server indexing

edited May 5 '17 at 18:47
HappyTown

asked Jul 28 '11 at 9:47
radio star

  • Hi and welcome to Stack Overflow. Please review How to Ask and faq for information on how to write good/appropriate questions here on Stack Overflow. I took the liberty of cleaning up your question, to make it less likely to accrue a lot of down-votes.
    – Lasse Vågsæther Karlsen
    Jul 28 '11 at 10:03

5 Answers

An index in SQL Server is a B-Tree.

  • FILLFACTOR applies to the bottom layer.

    This is the leaf node/data layer in the picture below.

  • PAD_INDEX ON means "Apply FILLFACTOR to all layers".

    This covers the intermediate levels in the picture below (between root and data).

This means that PAD_INDEX is only useful if FILLFACTOR is set. FILLFACTOR determines (roughly) how much free space is left in a data page.

A picture from MSDN:

[Image: B-Tree structure]
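
To see the effect of the two options on a real index, something like the following sketch could be used. It assumes the constraint from the question lives on a table called dbo.Employees (the table name is a guess; only PK_Employees and EmployeeId appear in the question), and the FILLFACTOR value of 80 is purely illustrative. In DETAILED mode, sys.dm_db_index_physical_stats returns a row for each level of the index, so the leaf and intermediate fill can be compared directly.

```sql
-- Rebuild the index behind the primary key, leaving about 20% free space
-- in the leaf pages (FILLFACTOR = 80) and, because PAD_INDEX = ON,
-- in the intermediate-level pages as well.
-- dbo.Employees is an assumed table name; 80 is an arbitrary example value.
ALTER INDEX [PK_Employees] ON [dbo].[Employees]
REBUILD WITH (FILLFACTOR = 80, PAD_INDEX = ON);

-- Show how full the pages are at each level of the B-Tree.
-- index_level 0 is the leaf level; the highest index_level is the root.
-- index_id 1 is the clustered index.
SELECT index_level,
       page_count,
       avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.Employees'), 1, NULL, 'DETAILED')
ORDER BY index_level;
```

Running the second query once after a rebuild with PAD_INDEX = OFF and once with PAD_INDEX = ON should show the difference in the rows where index_level is greater than 0. On a small table there may be no intermediate level at all, in which case the option has nothing to act on.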






  • On this page msdn.microsoft.com/en-us/library/ms186869.aspx it says when pad_index is on, "The percentage of free space that is specified by FILLFACTOR is applied to the intermediate-level pages of the index". Will it also apply to the root level? Maybe it's just a books online oversight.
    – Ogrish Man
    Dec 4 '16 at 13:32

  • No. The root level's fill is determined by the number of 'next to root' blocks required to hold all the rows in the index given the fill requested. If you were to set the root fill explicitly you would have no control over the fill of the intermediate blocks because their count would be dictated by the number of entries in the root (to match the fill you set) and intermediate block fill would be dictated by the number of rows covered by the index. So you can only control one or the other, not both.
    – bielawski
    Oct 26 at 13:55

Basically, you set PAD_INDEX = ON if you expect a lot of random changes to the index regularly.

That helps avoid index page splits.

I set it on when I expect 30%+ of random records included in the index to be deleted on a regular basis.






  • Always like comments on practical use cases.
    – neizan
    Aug 18 '15 at 14:07

  • Nice, this explains things. If we do not make deletes regularly, don't set it on.
    – MonsterMMORPG
    Feb 19 '17 at 15:44

  • I would expect that you would only pad if inserts dramatically outweighed deletes. Deletes free up space, resulting in a lower need to pad an index to prevent splits.
    – bielawski
    Oct 26 at 14:00

From MSDN:

PAD_INDEX = { ON | OFF }

Specifies index padding. The default is OFF.

ON:
The percentage of free space that is specified by fillfactor is applied to the intermediate-level pages of the index.

OFF or fillfactor is not specified:
The intermediate-level pages are filled to near capacity, leaving sufficient space for at least one row of the maximum size the index can have, considering the set of keys on the intermediate pages.

The PAD_INDEX option is useful only when FILLFACTOR is specified, because PAD_INDEX uses the percentage specified by FILLFACTOR. If the percentage specified for FILLFACTOR is not large enough to allow for one row, the Database Engine internally overrides the percentage to allow for the minimum. The number of rows on an intermediate index page is never less than two, regardless of how low the value of fillfactor.

In backward compatible syntax, WITH PAD_INDEX is equivalent to WITH PAD_INDEX = ON.
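
Whether an existing index was created with these options can be checked in the catalog. A minimal sketch, assuming a table named dbo.Employees (a hypothetical name, not given in the question): sys.indexes exposes both settings per index, and a fill_factor of 0 means the server default was used.

```sql
-- List the fill factor and padding setting of every index on one table.
-- dbo.Employees is an assumed table name used only for illustration.
SELECT i.name        AS index_name,
       i.type_desc,                    -- CLUSTERED, NONCLUSTERED, ...
       i.fill_factor,                  -- 0 = default, otherwise the percentage
       i.is_padded                     -- 1 = PAD_INDEX is ON
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.Employees');
```

For the constraint in the question, this would be expected to show is_padded = 0, since it was created with PAD_INDEX = OFF.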






  • Downvote because it is purely a copy from Microsoft with no additional elaboration.
    – GaTechThomas
    Mar 17 '14 at 19:41

  • Actually, I'm upvoting this because it is the explanation that made the most sense... even if it is C&P'd from somewhere else.
    – TomXP411
    May 19 '14 at 19:25

  • Downvote because the question asks for the purpose. The purely technical explanation of what it is says nothing about the purpose or the cases in which to use it.
    – Magier
    Oct 7 '15 at 8:52

This is actually a highly complex subject. Turning on PAD_INDEX can have dramatic effects on read performance and memory pressure in large tables. The larger the table, the bigger the effect. As a rule, I'd say you want to leave it off unless you fall into one of some NOT UNCOMMON categories. If you do, follow this advice carefully. As I show in the example case below, adjusting FILLFACTOR when PAD_INDEX is ON can have a disproportionately large effect that needs to be carefully balanced.

  1. PAD_INDEX ALWAYS has a detrimental effect on reads! The lower your FILLFACTOR, the bigger the effect, so you need to pay close attention to the value of FILLFACTOR when you turn it on. On large tables you essentially stop thinking about FILLFACTOR in terms of reducing leaf splits and start thinking about its effect on intermediate bloat vs. intermediate splits.

  2. PAD_INDEX rarely has a useful effect on indexes with fewer than 100,000 rows and NEVER has a positive effect on indexes covering identity or insert-time type columns, where inserts always go to the end of the table.

  3. From the above you should see that if you turn PAD_INDEX on, you must carefully balance the negative effects against the positive.

Rules of thumb: PAD_INDEX is rarely useful on non-clustered indexes unless they are quite wide, on clustered indexes of very narrow tables, or on tables that have fewer than 100K rows unless inserts are highly clustered, and even then it can be questionable.

You MUST understand how it works:
When you insert into an index, the row must fit into the leaf block that contains the appropriate range of keys. Clustered indexes typically have much wider rows than non-clustered indexes, so their leaf blocks hold fewer rows. FILLFACTOR creates space for new rows in the leaf, but with very wide rows, or with a large volume of inserts that are clustered together rather than evenly distributed, it's often impractical or impossible to create enough slack (100% minus the fill percentage) to prevent splits.

When a split occurs, a new intermediate row is created to point to the new block, and that row must fit into its appropriate intermediate block. If that intermediate block is full, it must first be split. If you are particularly unlucky, splits can cascade all the way to the root. When the root splits you end up creating a new index level.

The point of PAD_INDEX is to force a minimum amount of free space in your intermediate-level blocks.

After a rebuild there may be little or no free space in the intermediate levels. So you can have massive splitting of your intermediates all over the place if you have lots of leaf splits and PAD_INDEX isn't turned on!

Mostly, though, splits can be managed with FILLFACTOR. The bigger split problems happen with insert patterns that virtually guarantee you won't have enough free space. Turning PAD_INDEX on then helps by providing space in the levels above the leaf, so that when a split does occur you are less likely to incur lots of multilevel splits.
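
One way to check whether splits are actually happening at the leaf level or above it is to read the cumulative allocation counters SQL Server keeps per index. This is a sketch only, assuming a table named dbo.CustomerActivity (a hypothetical name). For an index, leaf_allocation_count and nonleaf_allocation_count in sys.dm_db_index_operational_stats count page allocations at the leaf level and above it, which largely correspond to page splits. Pages appended at the end of an ever-increasing key are counted too, so the numbers need interpreting.

```sql
-- Cumulative page allocations (roughly: page splits) per index,
-- separated into leaf level and non-leaf (intermediate/root) levels.
-- dbo.CustomerActivity is an assumed table name used only for illustration.
SELECT i.name AS index_name,
       s.leaf_allocation_count,
       s.nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats(
         DB_ID(), OBJECT_ID(N'dbo.CustomerActivity'), NULL, NULL) AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id  = s.index_id;
```

The counters are cumulative and are lost when the index's metadata leaves the cache (for example after a restart), so it is more telling to sample them before and after a representative day of inserts than to read them once.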



Example Case

I have a customer table with 100K rows. On any given day about 5% of my customers will be active. I have a table that records activity by customer by time. On average a customer performs 20 actions and each description takes, on average, 1 KB. So I collect 100 MB of data per day. Let's assume I've got a year already in the table, so about 36 GB.

The table has inserts of 1 KB rows with customer_number and insert_time (in that order) as key columns. Clearly the average customer will split an 8K leaf block several times while inserting their expected 20 rows, because each row will insert immediately after the preceding row in the same block until it splits and splits and splits (which makes one consider a heap with only non-clustered indexes...). If the intermediate block pointing to the appropriate leaf doesn't have enough room for at least 4 rows (in reality probably 8, but...), the intermediate block will need to split. Given that this example's key takes up 22 bytes, an intermediate block can hold 367 entries. This means I need 6% free space in my intermediate block, or a fill of 94%, to hold the 4 entries.

Notice that even a 1% FILLFACTOR won't stop leaf block splits, since a block can only hold 8 rows. Setting FILLFACTOR to 80% will only allow 1 row to be added before the leaf splits, but will inject over 800 bytes of free space per intermediate block if PAD_INDEX is on. That's ~800 empty bytes for EVERY intermediate block when I only need 88.

This is really important! If I have 36M rows already in the table, using 80% means 294 rows per intermediate block, meaning 122K blocks, meaning I've injected 98 MB into my intermediate block structure. By contrast, 94% lets 345 rows fit per block, so there are only 104K intermediate blocks (yes, I'm leaving out the levels closer to the root for simplicity). Adding 88 bytes to each of 104K blocks adds only 9.2 MB, as opposed to 98 MB.
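
The arithmetic in the paragraph above can be reproduced with a throwaway query. This is only a sketch of the answer's own back-of-envelope model: the 367 entries per block, the 36M rows, and the 800 and 88 free bytes per block are the figures quoted above, and like the text it counts one intermediate entry per row and looks only at the first intermediate level. The variable and column names are made up for the illustration.

```sql
-- Back-of-envelope model of intermediate-level bloat at two fill factors.
-- All inputs are the example's assumptions, not measured values.
DECLARE @entries_per_block int    = 367;       -- 22-byte key entries per full block
DECLARE @rows_in_table     bigint = 36000000;  -- roughly one year of activity rows

SELECT x.fill_pct,
       x.entries_per_block,
       x.blocks,
       x.blocks * x.free_bytes_per_block / 1000000.0 AS free_space_mb
FROM (
    SELECT f.fill_pct,
           f.free_bytes_per_block,
           -- entries that fit in a block at this fill, rounded as in the text
           CAST(ROUND(@entries_per_block * f.fill_pct / 100.0, 0) AS int) AS entries_per_block,
           -- number of intermediate blocks needed to cover all rows
           CAST(CEILING(@rows_in_table
                        / ROUND(@entries_per_block * f.fill_pct / 100.0, 0)) AS bigint) AS blocks
    FROM (VALUES (80, 800),   -- FILLFACTOR 80: ~800 free bytes per block
                 (94, 88))    -- FILLFACTOR 94: 88 free bytes per block
         AS f(fill_pct, free_bytes_per_block)
) AS x;
```

The two result rows should land close to the 122K blocks / 98 MB and 104K blocks / 9.2 MB quoted above, which makes it easy to try other fill factors under the same assumptions.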



Now consider that only 5% of my customers did anything. Some did more than 20 things and some fewer, so some blocks split anyway. Since only 275 KB were actually needed to hold the day's index rows (100K / 8 * 22), the best case is that only 8.9 MB of my 9.2 MB were dead air. If split prevention is important, it's well worth 9 MB; however, I'd be thinking harder about 98 MB.

So by turning PAD_INDEX on I should be giving up on controlling leaf splits entirely and turning to controlling intermediate splits.

DON'T bother worrying about anything but the first intermediate level! There is a butterfly effect induced by any clustering (in this case clustering of customer_number) that will throw any calculation you make out the window. Unless your inserts are perfectly uniform, your margin of error in finding the right number to balance bloat against splits is typically far bigger than the effect of the free space in the levels closer to the root.






@bielawski
You describe only cases where PAD_INDEX = ON and FILLFACTOR is between 1 and 99.
What do you think about setting PAD_INDEX = ON and FILLFACTOR = 0 or 100 when I insert ordered rows, each of which is always newer than the previous one?

    CREATE CLUSTERED INDEX [IX_z_arch_export_dzienny_pre] ON [dbo].[z_arch_export_daily_pre]
    (
        [Date] ASC,
        [Object Code] ASC,
        [From date] ASC,
        [Person_role] ASC,
        [Departure] ASC,
        [Room code] ASC,
        [period_7_14] ASC
    ) WITH (PAD_INDEX = ON, FILLFACTOR = 100)

    INSERT INTO z_arch_export_daily_pre
    SELECT * FROM export_daily_pre
    ORDER BY [Date] ASC, [Object Code] ASC, [From date] ASC, [Person_role] ASC, [Departure] ASC, [Room code] ASC, [period_7_14] ASC

I am 100% sure that all new rows will be inserted "at the end" of the index, and only with these options (PAD_INDEX = ON, FILLFACTOR = 100) could I achieve 0.01% index fragmentation after the insert.
Is there anything dangerous about these settings under those assumptions?






  • This should be a comment, not an answer :)
    – m__
    Nov 8 at 14:24

  • Yes, I know, but when I click "add comment" under @bielawski's answer I am told that I need at least 50 reputation :(
    – Peter_K
    Nov 8 at 14:27











edited Oct 29 at 7:42

answered Jul 28 '11 at 10:02
gbn











answered Jul 3 '15 at 10:46
SQLador








    • 5




      Always like comments on practical use cases.
      – neizan
      Aug 18 '15 at 14:07










    • nice this explains the things. if we do not make deletes regularly dont set it on
      – MonsterMMORPG
      Feb 19 '17 at 15:44










    • I would expect that you would only pad if inserts dramatically outweighed deletes. Deletes free up space resulting in a lower need to pad an index in the prevention of splits.
      – bielawski
      Oct 26 at 14:00














    • 5




      Always like comments on practical use cases.
      – neizan
      Aug 18 '15 at 14:07










    • nice this explains the things. if we do not make deletes regularly dont set it on
      – MonsterMMORPG
      Feb 19 '17 at 15:44










    • I would expect that you would only pad if inserts dramatically outweighed deletes. Deletes free up space resulting in a lower need to pad an index in the prevention of splits.
      – bielawski
      Oct 26 at 14:00








    5




    5




    Always like comments on practical use cases.
    – neizan
    Aug 18 '15 at 14:07




    Always like comments on practical use cases.
    – neizan
    Aug 18 '15 at 14:07












    nice this explains the things. if we do not make deletes regularly dont set it on
    – MonsterMMORPG
    Feb 19 '17 at 15:44




    nice this explains the things. if we do not make deletes regularly dont set it on
    – MonsterMMORPG
    Feb 19 '17 at 15:44












    I would expect that you would only pad if inserts dramatically outweighed deletes. Deletes free up space resulting in a lower need to pad an index in the prevention of splits.
    – bielawski
    Oct 26 at 14:00




    I would expect that you would only pad if inserts dramatically outweighed deletes. Deletes free up space resulting in a lower need to pad an index in the prevention of splits.
    – bielawski
    Oct 26 at 14:00










    up vote
    19
    down vote













    From MSDN:



    PAD_INDEX = { ON | OFF }



    Specifies index padding. The default is OFF.



    ON:
    The percentage of free space that is specified by fillfactor is applied to the intermediate-level pages of the index.



    OFF or fillfactor is not specified:
    The intermediate-level pages are filled to near capacity, leaving sufficient space for at least one row of the maximum size the index can have, considering the set of keys on the intermediate pages.



    The PAD_INDEX option is useful only when FILLFACTOR is specified, because PAD_INDEX uses the percentage specified by FILLFACTOR. If the percentage specified for FILLFACTOR is not large enough to allow for one row, the Database Engine internally overrides the percentage to allow for the minimum. The number of rows on an intermediate index page is never less than two, regardless of how low the value of fillfactor.



    In backward compatible syntax, WITH PAD_INDEX is equivalent to WITH PAD_INDEX = ON.






    share|improve this answer





















    • Downvote because it is purely a copy from Microsoft with no additional elaboration.
      – GaTechThomas
      Mar 17 '14 at 19:41






    • 10




      Actually, I'm upvoting this because it is the explanation that made the most sense... even if it is C&P'd from somewhere else.
      – TomXP411
      May 19 '14 at 19:25








    • 1




      Downvote because the question asks for the purpose. The pure technical explanation what is is says nothing about the purpose and cases when to use it.
      – Magier
      Oct 7 '15 at 8:52















    answered Jul 28 '11 at 9:57









    Edwin de Koning











    up vote
    0
    down vote













    This is actually a highly complex subject. Turning on PAD_INDEX can have dramatic effects on read performance and memory pressure in large tables. The larger the table, the bigger the effect. As a rule, I'd say you want to leave it off unless you fall into one of a few not-uncommon categories, and then you should follow this advice carefully. As I show in the example case below, adjusting FILLFACTOR when PAD_INDEX is ON can have an outsized effect that needs to be carefully balanced.




    1. PAD_INDEX ALWAYS has a detrimental effect on reads! The lower your FILLFACTOR, the bigger the effect, so you need to pay close attention to the value of FILLFACTOR when you turn it on. On large tables you essentially stop thinking about FILLFACTOR in terms of reducing leaf splits and start thinking about its effect on intermediate bloat vs. intermediate splits.

    2. PAD_INDEX rarely has a useful effect on indexes with fewer than 100,000 rows and NEVER has a positive effect on indexes covering identity or insert-time type columns, where inserts are always to the end of the table.

    3. From the above you should see that if you turn PAD_INDEX on, you must carefully balance the negative effects against the positive.


    Rules of thumb: PAD_INDEX is rarely useful on non-clustered indexes unless they are quite wide, on clustered indexes of very narrow tables, or on tables that have fewer than 100K rows unless inserts are highly clustered, and even then it can be questionable.



    You MUST understand how it works:
    When you insert into an index, the row must fit into the leaf block that contains the appropriate range of keys. Clustered indexes typically have much wider rows than non-clustered indexes, so their leaf blocks hold fewer rows. FILLFACTOR creates space for new rows in the leaf, but with very wide rows, or with a large volume of inserts that are clustered together rather than evenly distributed, it's often impractical or impossible to create enough slack (100% minus the fill percentage) to prevent splits.



    When a split occurs, a new intermediate row is created to point to the new block, and that row must fit into its appropriate intermediate block. If that intermediate block is full, it must first be split. Splits can run all the way up to the root if you are particularly unlucky. When the root splits, you end up creating a new index level.



    The point of PAD_INDEX is to force a minimum amount of free space in your intermediate level blocks.
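One way to see that free space for yourself is to look at page density per index level. The query below is only a sketch: the table name dbo.CustomerActivity is hypothetical, and index_id 1 assumes you are inspecting the clustered index. In the output, index_level 0 is the leaf level and higher numbers are the intermediate levels up to the root.

```sql
-- Assumption: dbo.CustomerActivity is a hypothetical table with a clustered index (index_id = 1).
-- DETAILED mode scans every page, so run it off-hours on large tables.
SELECT  ips.index_level,                     -- 0 = leaf, highest value = root
        ips.page_count,
        ips.record_count,
        ips.avg_page_space_used_in_percent   -- how full the pages at this level are
FROM    sys.dm_db_index_physical_stats(
            DB_ID(),
            OBJECT_ID(N'dbo.CustomerActivity'),
            1,          -- clustered index
            NULL,       -- all partitions
            'DETAILED') AS ips
WHERE   ips.alloc_unit_type_desc = 'IN_ROW_DATA'
ORDER BY ips.index_level DESC;
```

Comparing the intermediate-level rows before and after a rebuild with PAD_INDEX = ON should show whether the padding was actually applied.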



    After a rebuild there may be little or no free space in the intermediate levels. So if you have lots of leaf splits and PAD_INDEX isn't turned on, you can have massive splitting of your intermediates all over the place!



    Mostly, though, splits can be managed with FILLFACTOR. The bigger split problems happen with insert patterns that virtually guarantee you won't have enough free space. Turning PAD_INDEX on helps alleviate this by providing space at the intermediate levels, so when a leaf split does occur you are less likely to incur lots of multilevel splits.



    Example Case



    I have a customer table with 100K rows. On any given day about 5% of my customers will be active. I have a table that records activity by customer by time. On average a customer performs 20 actions and each description takes, on average, 1 KB. So I collect about 100 MB of data per day (5,000 customers × 20 rows × 1 KB), and let's assume I've already got a year in the table, so about 36 GB.



    The table takes inserts of 1 KB rows, with customer_number and insert_time (in that order) as the key columns. Clearly the average customer will split an 8K leaf block several times while inserting their expected 20 rows, because each row inserts immediately after the preceding row in the same block until it splits and splits and splits (which makes one consider a heap with only non-clustered indexes...). If the intermediate block pointing to the appropriate leaf doesn't have room for at least 4 new entries (in reality probably 8, but...), the intermediate will need to split. Given that this example's key takes up 22 bytes, an intermediate block can hold 367 entries. This means I need 6% free space in my intermediate block, or a fill of 94%, to hold the 4 entries.



    Notice that even a 1% FILLFACTOR won't stop leaf block splits, since a block can only hold 8 rows. Setting FILLFACTOR to 80% will only allow 1 row to be added before the leaf splits, but it will inject over 800 bytes of free space per intermediate block if PAD_INDEX is on. That's ~800 empty bytes for EVERY intermediate block when I only need 88.



    This is really important! If I have 36M rows already in the table, using 80% means 294 rows per intermediate block, meaning 122K blocks, meaning I've injected 98 MB into my intermediate block structure. By contrast, 94% lets 345 rows fit per block, so there are only 104K intermediate blocks (yes, I'm leaving out the other intermediate levels for simplicity). Adding 88 bytes to each of 104K blocks adds only 9.2 MB, as opposed to 98 MB.
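The arithmetic in this paragraph can be reproduced with a small query. This is only a back-of-the-envelope sketch that takes the figures above as given: 36M rows, 367 entries per completely full intermediate block, and the per-block free-space figures of 800 and 88 bytes. It models nothing beyond the first intermediate level.

```sql
-- All inputs are the example's own figures; nothing here is measured.
DECLARE @rows bigint = 36000000;            -- leaf rows already in the table
DECLARE @entries_per_full_page int = 367;   -- 22-byte entries per full intermediate block

SELECT  f.fill_pct,
        p.entries_per_page,
        CEILING(@rows / p.entries_per_page) AS intermediate_pages,
        CEILING(@rows / p.entries_per_page) * f.free_bytes_per_page / 1000000.0 AS free_space_mb
FROM    (VALUES (80, 800), (94, 88)) AS f(fill_pct, free_bytes_per_page)
CROSS APPLY (
            -- entries that fit per block at the given fill percentage, rounded to a whole entry
            SELECT ROUND(@entries_per_full_page * f.fill_pct / 100.0, 0) AS entries_per_page
        ) AS p;
```

Under these assumptions the two rows come out at roughly 122K blocks with about 98 MB of padding for 80%, and roughly 104K blocks with about 9.2 MB for 94%, in line with the numbers above.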



    Now consider that only 5% of my customers did anything. Some did more than 20 things and some fewer, so some blocks split anyway, and since only 275 KB were actually needed to hold the day's index rows (100K / 8 × 22), the best case is that only 8.9 MB of my 9.2 MB was dead air. If split prevention is important, it's well worth 9 MB; however, I'd be thinking harder about 98 MB.



    So by turning PAD_INDEX on, I should be giving up on controlling leaf splits entirely and turning to controlling intermediate splits.



    DON'T bother worrying about anything but the first intermediate level! There is a butterfly effect induced by any clustering (in this case clustering of customer_number) that will throw any calculation you make out the window. Unless your inserts are perfectly uniform, your margin of error in finding the right number to balance bloat against splits is typically far bigger than the effect of the block space at the other levels.






        edited Oct 29 at 19:58

























        answered Oct 26 at 17:35









        bielawski























            up vote
            0
            down vote













            @bielawski
            You describe only the cases where PAD_INDEX = ON and FILLFACTOR is between 1 and 99.
            What do you think about setting PAD_INDEX = ON with FILLFACTOR = 0 or 100 when I insert ordered rows, each of which is always newer than the previous one?



            CREATE CLUSTERED INDEX [IX_z_arch_export_dzienny_pre] ON [dbo].[z_arch_export_daily_pre]
            (
            [Date] ASC,
            [Object Code] ASC,
            [From date] ASC,
            [Person_role] ASC,
            [Departure] ASC,
            [Room code] ASC,
            [period_7_14] ASC
            )WITH (PAD_INDEX = ON, FILLFACTOR=100)


            insert into z_arch_export_daily_pre
            select * from export_daily_pre
            order by [Date] ASC,[Object Code] ASC,[From date] ASC,[Person_role] ASC,[Departure] ASC,[Room code] ASC,[period_7_14] ASC


            I am 100% sure that all new rows will be inserted "at the end" of the index, and only with these options (PAD_INDEX = ON, FILLFACTOR = 100) could I achieve 0.01% index fragmentation after the insert.
            Is there anything dangerous about these settings under those assumptions?
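For anyone wanting to double-check which settings are actually stored on an index like this one, a catalog query along these lines may help. It is only a sketch against the table named in the post. Note that fill factor values 0 and 100 are documented as equivalent, so PAD_INDEX has no free-space percentage to apply in that case.

```sql
-- Shows the stored fill factor and padding flag for every index on the table from the post.
SELECT  i.name,
        i.type_desc,
        i.fill_factor,   -- fill factor stored for the index
        i.is_padded      -- 1 when PAD_INDEX is ON
FROM    sys.indexes AS i
WHERE   i.object_id = OBJECT_ID(N'dbo.z_arch_export_daily_pre');
```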





























            • This should be a comment, not an answer :)
              – m__
              Nov 8 at 14:24






            • 1




              Yes, I know, but when I click "add comment" under @bielawski's answer I'm told that I need at least 50 reputation :(
              – Peter_K
              Nov 8 at 14:27















            edited Nov 8 at 14:13
            answered Nov 8 at 14:00
            Peter_K
            11





























