Chapter 14: Indexing Structures for Files
CHAPTER 14: INDEXING STRUCTURES FOR FILES
Pre-Publication Material: This is draft manuscript yet to be copy edited or paged. Copyright AWL 2004
Answers to Selected Exercises
14.14 Consider a disk with block size B = 512 bytes. A block pointer is P = 6 bytes long, and a record pointer is P_R = 7 bytes long. A file has r = 30,000 EMPLOYEE records of fixed length. Each record has the following fields: NAME (30 bytes), SSN (9 bytes), DEPARTMENTCODE (9 bytes), ADDRESS (40 bytes), PHONE (9 bytes), BIRTHDATE (8 bytes), SEX (1 byte), JOBCODE (4 bytes), SALARY (4 bytes, real number). An additional byte is used as a deletion marker.
(a) Calculate the record size R in bytes.
(b) Calculate the blocking factor bfr and the number of file blocks b assuming an unspanned organization.
(c) Suppose the file is ordered by the key field SSN and we want to construct a primary index on SSN. Calculate (i) the index blocking factor bfr_i (which is also the index fan-out fo); (ii) the number of first-level index entries and the number of first-level index blocks; (iii) the number of levels needed if we make it into a multi-level index; (iv) the total number of blocks required by the multi-level index; and (v) the number of block accesses needed to search for and retrieve a record from the file, given its SSN value, using the primary index.
(d) Suppose the file is not ordered by the key field SSN and we want to construct a secondary index on SSN. Repeat the previous exercise (part c) for the secondary index and compare with the primary index.
(e) Suppose the file is not ordered by the non-key field DEPARTMENTCODE and we want to construct a secondary index on DEPARTMENTCODE using Option 3 of Section 5.1.3, with an extra level of indirection that stores record pointers. Assume there are 1000 distinct values of DEPARTMENTCODE, and that the EMPLOYEE records are evenly distributed among these values.
Calculate (i) the index blocking factor bfr_i (which is also the index fan-out fo); (ii) the number of blocks needed by the level of indirection that stores record pointers; (iii) the number of first-level index entries and the number of first-level index blocks; (iv) the number of levels needed if we make it a multi-level index; (v) the total number of blocks required by the multi-level index and the blocks used in the extra level of indirection; and (vi) the approximate number of block accesses needed to search for and retrieve all records in the file having a specific DEPARTMENTCODE value using the index.
(f) Suppose the file is ordered by the non-key field DEPARTMENTCODE and we want to construct a clustering index on DEPARTMENTCODE that uses block anchors (every new value of DEPARTMENTCODE starts at the beginning of a new block). Assume there are 1000 distinct values of DEPARTMENTCODE, and that the EMPLOYEE records are evenly distributed among these values. Calculate (i) the index blocking factor bfr_i (which is also the index fan-out fo); (ii) the number of first-level index entries and the number of first-level index blocks; (iii) the number of levels needed if we make it a multi-level index; (iv) the total number of blocks required by the multi-level index; and (v) the number of block accesses needed to search for and retrieve all records in the file having a specific DEPARTMENTCODE value using the clustering index (assume that multiple blocks in a cluster are either contiguous or linked by pointers).
(g) Suppose the file is not ordered by the key field SSN and we want to construct a B+-tree access structure (index) on SSN. Calculate (i) the orders p and p_leaf of the B+-tree; (ii) the number of leaf-level blocks needed if blocks are approximately 69% full (rounded up for convenience); (iii) the number of levels needed if internal nodes are also 69% full (rounded up for convenience); (iv) the total number of blocks required by the B+-tree; and (v) the number of block accesses needed to search for and retrieve a record from the file, given its SSN value, using the B+-tree.
Answer:
(a) Record length R = (30 + 9 + 9 + 40 + 9 + 8 + 1 + 4 + 4) + 1 = 115 bytes
(b) Blocking factor bfr = floor(B/R) = floor(512/115) = 4 records per block
        Number of blocks needed for file b = ceiling(r/bfr) = ceiling(30000/4) = 7500
(c) i. Index record size R_i = (V_SSN + P) = (9 + 6) = 15 bytes
        Index blocking factor bfr_i = fo = floor(B/R_i) = floor(512/15) = 34
    ii. Number of first-level index entries r_1 = number of file blocks b = 7500 entries
        Number of first-level index blocks b_1 = ceiling(r_1/bfr_i) = ceiling(7500/34) = 221 blocks
    iii. We can calculate the number of levels as follows:
        Number of second-level index entries r_2 = number of first-level blocks b_1 = 221 entries
        Number of second-level index blocks b_2 = ceiling(r_2/bfr_i) = ceiling(221/34) = 7 blocks
        Number of third-level index entries r_3 = number of second-level index blocks b_2 = 7 entries
        Number of third-level index blocks b_3 = ceiling(r_3/bfr_i) = ceiling(7/34) = 1
        Since the third level has only one block, it is the top index level.
        Hence, the index has x = 3 levels
    iv. Total number of blocks for the index b_i = b_1 + b_2 + b_3 = 221 + 7 + 1 = 229 blocks
    v. Number of block accesses to search for a record = x + 1 = 3 + 1 = 4
(d) i. Index record size R_i = (V_SSN + P) = (9 + 6) = 15 bytes
        Index blocking factor bfr_i = (fan-out) fo = floor(B/R_i) = floor(512/15) = 34 index records per block
        (This has not changed from part (c) above)
        (Alternative solution: The previous solution assumes that leaf-level index blocks contain block pointers; it is also possible to assume that they contain record pointers, in which case the leaf-level index record size would be V_SSN + P_R = 9 + 7 = 16 bytes. The leaf-level calculations would then have to use R_i = 16 bytes rather than R_i = 15 bytes, so we get:
        Index record size R_i = (V_SSN + P_R) = (9 + 7) = 16 bytes
        Leaf-level index blocking factor bfr_i = floor(B/R_i) = floor(512/16) = 32 index records per block
        However, for internal nodes, block pointers are always used, so the fan-out for internal nodes fo would still be 34.)
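The level-by-level arithmetic used for the primary index in part (c), and repeated for the secondary index in part (d), can be checked mechanically. The following is a minimal sketch, not part of the original answer: the helper name index_levels and the variable names are assumptions chosen for illustration, and it assumes, as the answer does, that every level above the first uses block pointers.

```python
import math

def index_levels(entries, first_level_fanout, upper_fanout):
    """Return the number of blocks at each index level, first level first.

    Hypothetical helper: keeps adding levels until one block (the top) remains.
    """
    levels = [math.ceil(entries / first_level_fanout)]
    while levels[-1] > 1:
        # Each block of the level below needs one entry in the level above.
        levels.append(math.ceil(levels[-1] / upper_fanout))
    return levels

# Parameters taken from the exercise statement.
B, P, P_R = 512, 6, 7          # block size, block pointer, record pointer (bytes)
r = 30000                      # number of EMPLOYEE records
V_SSN = 9                      # size of the SSN key (bytes)
FIELD_BYTES = [30, 9, 9, 40, 9, 8, 1, 4, 4]

R = sum(FIELD_BYTES) + 1       # part (a): fields plus the deletion marker
bfr = B // R                   # part (b): unspanned blocking factor
b = math.ceil(r / bfr)         # part (b): number of file blocks

fo = B // (V_SSN + P)            # fan-out with block pointers
fo_leaf_alt = B // (V_SSN + P_R) # alternative: record pointers at the first level

primary = index_levels(b, fo, fo)                 # one entry per file block
secondary = index_levels(r, fo, fo)               # one entry per record
secondary_alt = index_levels(r, fo_leaf_alt, fo)  # alternative solution

for name, levels in [("primary", primary),
                     ("secondary", secondary),
                     ("secondary (alt)", secondary_alt)]:
    print(name, levels, "levels:", len(levels),
          "total blocks:", sum(levels), "accesses:", len(levels) + 1)
```

The only difference between the primary and secondary cases is the number of first-level entries (file blocks versus file records), which is why the same loop serves both.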
    ii. Number of first-level index entries r_1 = number of file records r = 30000
        Number of first-level index blocks b_1 = ceiling(r_1/bfr_i) = ceiling(30000/34) = 883 blocks
        (Alternative solution:
        Number of first-level index entries r_1 = number of file records r = 30000
        Number of first-level index blocks b_1 = ceiling(r_1/bfr_i) = ceiling(30000/32) = 938 blocks)
    iii. We can calculate the number of levels as follows:
        Number of second-level index entries r_2 = number of first-level index blocks b_1 = 883 entries
        Number of second-level index blocks b_2 = ceiling(r_2/bfr_i) = ceiling(883/34) = 26 blocks
        Number of third-level index entries r_3 = number of second-level index blocks b_2 = 26 entries
        Number of third-level index blocks b_3 = ceiling(r_3/bfr_i) = ceiling(26/34) = 1
        Since the third level has only one block, it is the top index level.
        Hence, the index has x = 3 levels
        (Alternative solution:
        Number of second-level index entries r_2 = number of first-level index blocks b_1 = 938 entries
        Number of second-level index blocks b_2 = ceiling(r_2/bfr_i) = ceiling(938/34) = 28 blocks
        Number of third-level index entries r_3 = number of second-level index blocks b_2 = 28 entries
        Number of third-level index blocks b_3 = ceiling(r_3/bfr_i) = ceiling(28/34) = 1
        Since the third level has only one block, it is the top index level.
        Hence, the index has x = 3 levels)
    iv. Total number of blocks for the index b_i = b_1 + b_2 + b_3 = 883 + 26 + 1 = 910
        (Alternative solution:
        Total number of blocks for the index b_i = b_1 + b_2 + b_3 = 938 + 28 + 1 = 967)
    v. Number of block accesses to search for a record = x + 1 = 3 + 1 = 4
(e) i. Index record size R_i = (V_DEPARTMENTCODE + P) = (9 + 6) = 15 bytes
        Index blocking factor bfr_i = (fan-out) fo = floor(B/R_i) = floor(512/15) = 34 index records per block
    ii. There are 1000 distinct values of DEPARTMENTCODE, so the average number of records for each value is (r/1000) = (30000/1000) = 30
        Since a record pointer size P_R = 7 bytes, the number of bytes needed at the level of indirection for each value of DEPARTMENTCODE is 7 * 30 = 210 bytes, which fits in one block. Hence, 1000 blocks are needed for the level of indirection.
    iii. Number of first-level index entries r_1 = number of distinct values of DEPARTMENTCODE = 1000 entries
        Number of first-level index blocks b_1 = ceiling(r_1/bfr_i) = ceiling(1000/34) = 30 blocks
    iv. We can calculate the number of levels as follows:
        Number of second-level index entries r_2 = number of first-level index blocks b_1 = 30 entries
        Number of second-level index blocks b_2 = ceiling(r_2/bfr_i) = ceiling(30/34) = 1
        Hence, the index has x = 2 levels
    v. Total number of blocks for the index b_i = b_1 + b_2 + b_indirection = 30 + 1 + 1000 = 1031 blocks
    vi. Number of block accesses to search for and retrieve the block containing the record pointers at the level of indirection = x + 1 = 2 + 1 = 3 block accesses
        If we assume that the 30 records are distributed over 30 distinct blocks, we need an additional 30 block accesses to retrieve all 30 records. Hence, total block accesses needed on average to retrieve all the records with a given value for DEPARTMENTCODE = x + 1 + 30 = 33
(f) i. Index record size R_i = (V_DEPARTMENTCODE + P) = (9 + 6) = 15 bytes
        Index blocking factor bfr_i = (fan-out) fo = floor(B/R_i) = floor(512/15) = 34 index records per block
    ii. Number of first-level index entries r_1 = number of distinct DEPARTMENTCODE values = 1000 entries
        Number of first-level index blocks b_1 = ceiling(r_1/bfr_i) = ceiling(1000/34) = 30 blocks
    iii. We can calculate the number of levels as follows:
        Number of second-level index entries r_2 = number of first-level index blocks b_1 = 30 entries
        Number of second-level index blocks b_2 = ceiling(r_2/bfr_i) = ceiling(30/34) = 1
        Since the second level has one block, it is the top index level.
        Hence, the index has x = 2 levels
    iv. Total number of blocks for the index b_i = b_1 + b_2 = 30 + 1 = 31 blocks
    v. Number of block accesses to search for the first block in the cluster of blocks = x + 1 = 2 + 1 = 3
        The 30 records are clustered in ceiling(30/bfr) = ceiling(30/4) = 8 blocks.
        Hence, total block accesses needed on average to retrieve all the records with a given DEPARTMENTCODE = x + 8 = 2 + 8 = 10 block accesses
(g) i. For a B+-tree of order p, the following inequality must be satisfied for each internal tree node:
        (p * P) + ((p - 1) * V_SSN) < B, or
        (p * 6) + ((p - 1) * 9) < 512, which gives 15p < 521, so p = 34
        For leaf nodes, assuming that record pointers are included in the leaf nodes, the following inequality must be satisfied:
        (p_leaf * (V_SSN + P_R)) + P < B, or
        (p_leaf * (9 + 7)) + 6 < 512, which gives 16 p_leaf < 506, so p_leaf = 31
    ii. Assuming that nodes are 69% full on the average, the average number of key values in a leaf node is 0.69 * p_leaf = 0.69 * 31 = 21.39.
        If we round this up for convenience, we get 22 key values (and 22 record pointers) per leaf node. Since the file has 30000 records and hence 30000 values of SSN, the number of leaf-level nodes (blocks) needed is b_1 = ceiling(30000/22) = 1364 blocks
    iii. We can calculate the number of levels as follows:
        The average fan-out for the internal nodes (rounded up for convenience) is fo = ceiling(0.69 * p) = ceiling(0.69 * 34) = ceiling(23.46) = 24
        Number of second-level tree blocks b_2 = ceiling(b_1/fo) = ceiling(1364/24) = 57 blocks
        Number of third-level tree blocks b_3 = ceiling(b_2/fo) = ceiling(57/24) = 3
        Number of fourth-level tree blocks b_4 = ceiling(b_3/fo) = ceiling(3/24) = 1
        Since the fourth level has only one block, the tree has x = 4 levels (counting the leaf level).
        Note: We could use the formula:
        x = ceiling(log_fo(b_1)) + 1 = ceiling(log_24(1364)) + 1 = 3 + 1 = 4 levels
    iv. Total number of blocks for the tree b_i = b_1 + b_2 + b_3 + b_4 = 1364 + 57 + 3 + 1 = 1425 blocks
    v. Number of block accesses to search for a record = x + 1 = 4 + 1 = 5
14.15 A PARTS file with Part# as key field includes records with the following Part# values: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16,
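Returning to Exercise 14.14(g), the B+-tree orders and level counts follow directly from the two node-capacity inequalities and the 69% fill assumption. The sketch below is illustrative only: the function names bplus_orders and bplus_levels are assumptions, not names from the text, and the code follows the answer's conventions (strict inequalities, fill-adjusted capacities rounded up).

```python
import math

def bplus_orders(B, key_size, P, P_R):
    """Largest p and p_leaf satisfying the strict inequalities in part (g) i.

    Internal: p*P + (p-1)*key_size < B   ->  p*(P + key_size) < B + key_size
    Leaf:     p_leaf*(key_size + P_R) + P < B
    The largest integer n with n*d < N is (N - 1) // d.
    """
    p = (B + key_size - 1) // (P + key_size)
    p_leaf = (B - P - 1) // (key_size + P_R)
    return p, p_leaf

def bplus_levels(n_keys, p, p_leaf, fill=0.69):
    """Blocks per tree level, leaf level first, with nodes `fill` full on average."""
    keys_per_leaf = math.ceil(fill * p_leaf)   # rounded up for convenience
    fanout = math.ceil(fill * p)               # average internal fan-out
    levels = [math.ceil(n_keys / keys_per_leaf)]
    while levels[-1] > 1:
        levels.append(math.ceil(levels[-1] / fanout))
    return levels

# Values from the exercise: B = 512, SSN is 9 bytes, P = 6, P_R = 7, r = 30000.
p, p_leaf = bplus_orders(B=512, key_size=9, P=6, P_R=7)
levels = bplus_levels(30000, p, p_leaf)

print("p =", p, "p_leaf =", p_leaf)
print("blocks per level (leaf first):", levels)
print("levels x =", len(levels), "total blocks =", sum(levels))
print("block accesses to retrieve a record =", len(levels) + 1)
```

Computing the levels with a loop rather than the logarithm formula avoids floating-point edge cases when the leaf count is an exact power of the fan-out.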
