The following information relates to the BTSCANNER configuration parameter
BTSCANNER num=<number>:prio=<high|low>:rangescan=<number>:threshold=<number>
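As an illustration only, the syntax line above can be split into its four fields with a few lines of Python. This is a minimal sketch that assumes the colon-separated form exactly as written in this article; the function name parse_btscanner and the example values are hypothetical and are not part of any Informix tool.

```python
def parse_btscanner(value):
    """Parse a BTSCANNER value in the colon-separated form shown in this article."""
    allowed = {"num", "prio", "rangescan", "threshold"}
    settings = {}
    for field in value.split(":"):
        name, sep, raw = field.partition("=")
        name, raw = name.strip(), raw.strip()
        if not sep or name not in allowed:
            raise ValueError(f"unrecognised BTSCANNER field: {field!r}")
        if name == "prio":
            # prio accepts only the two documented settings.
            if raw not in ("high", "low"):
                raise ValueError(f"prio must be high or low, got {raw!r}")
            settings[name] = raw
        else:
            # num, rangescan and threshold are numeric.
            settings[name] = int(raw)
    return settings


# Illustrative values only, not recommendations.
example = parse_btscanner("num=2:prio=low:rangescan=100:threshold=5000")
```

Checking the value before it goes into the configuration file catches typos such as a misspelt field name before the instance is restarted.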
Question If the threshold is set low, the btscanner will be continually scanning the leaf pages,
which causes excessive I/O. If it is set high, there will be large numbers of
'dirty' index pages, which causes normal threads to do extra work on a
regular basis. How should the threshold be set?
Answer The threshold ( onmode -C threshold {size} )
sets the number of deleted items that must be encountered on an index before it
is placed on the hot list to be cleaned. The right size depends on usage,
so it has to be tuned for each system.
Question What is the resource use of the btscanner like? For example, if the btscanner
is given a low priority on a busy system, will it still be able to keep up with
the cleaning it needs to do, or will it be overwhelmed because it rarely gets the
chance to run? Would this be characterised by lots of yielding scanner
threads?
Answer The priority of the btscanner thread(s) can only be set to either low or high,
low being a priority lower than normal user threads, and high being a
priority equal to that of normal user threads. If the system is busy and you
want index cleaning to occur, use onmode -C high
to increase the priority as above.
Question Does this work with version 9.40x? Currently we have it disabled and any
onmode commands relating to range scanning just return a usage message.
How is it enabled?
Answer The Light Range Scan is incorporated in 9.40x. It is only applicable when
the number of indexes on a partition is exactly one (i.e. detached
indexes or an attached index with only one index on the table). For a full
description of the implementation see the rest of this article. When the
onmode -range {value} command is used, Light Range
scanning will occur automatically when only one index exists on that
partition.
Question Is it possible to permanently configure the number of btscanner threads? At
the moment additional threads can be added with onmode, but these are lost
when the instance is restarted. If there were 20 btscanner threads but
only one table in the hot list, would only one thread clean it?
Sometimes some threads will do a lot of reads but no writes. If one scanner
was assigned to one table, then surely all would do some writes?
Answer By default the engine starts exactly one thread, which runs at low priority; you can run onstat -C to see this. In the
later engines the number of threads can be configured at startup. In earlier
engines there is no way to configure it to start multiple threads each time. If
you want to add more, you must do that with onmode after each restart.
The new btscanner replaces the btree cleaner implementation of earlier Informix Dynamic Server versions. The design covers three major areas that have changed:
workload generation,
The workload for cleaning indexes is determined by keeping track of how
many times each index caused the server to do extra work. The index
that causes the server to do the most extra work will be the next
index cleaned by the btree scanner thread(s).
how the btree is cleaned of dirty items,
An index will have its entire leaf level examined for deleted items.
Upon finding a deleted index item, the cleaner will test lock the item,
then undertake a foreground remove of the item, and then determine whether the
page warrants compression.
the use of multiple threads.
The implementation now allows the dynamic allocation of threads for
configurable workloads.
Submission of work to be cleaned will now be accomplished by profiling the number of times a reader of the index encounters deleted items that require the reader to do extra work. The extra work will be profiled for each index, and will be the basis for developing a Hot List which will drive the workload of the btree scanner.
The hot list is created under the following conditions:
Any btree scanner that sees a sort task pending will acquire the task,
mark the sort task as in progress, and scan the partitions to
build a list of part numbers, key numbers and dirty hits.
This list is then sorted by hits and replaces the previous hot list.
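The hot list construction can be sketched in Python as follows. This is not the engine's code: the IndexHits record, the build_hot_list function, the sample part numbers and the choice of "greater than or equal to" for the threshold test are all assumptions made for illustration.

```python
from typing import NamedTuple


class IndexHits(NamedTuple):
    """One profiled index (hypothetical record layout)."""
    partnum: int      # partition number of the index
    key: int          # index key number within the partition
    dirty_hits: int   # times a reader met deleted items on this index


def build_hot_list(profile, threshold):
    """Keep indexes whose dirty hits reach the threshold, busiest first."""
    candidates = [entry for entry in profile if entry.dirty_hits >= threshold]
    # Most hits first: that index is costing readers the most extra work.
    return sorted(candidates, key=lambda entry: entry.dirty_hits, reverse=True)


# Sample profile with made-up part numbers and hit counts.
profile = [
    IndexHits(0x100101, 1, 40),
    IndexHits(0x100102, 1, 900),
    IndexHits(0x100103, 2, 5000),
]
hot = build_hot_list(profile, threshold=500)
```

With these sample numbers the index with 5000 hits is cleaned first, and the index with 40 hits stays off the list until the threshold allows it on.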
The btree scanner thread now replaces the btree cleaner thread.
Correspondingly, the name of the thread(s) has also been changed
to btscanner #. The btree scanner has three main task groups, which are
processed in order.
Its typical task cycle therefore consists of checking each of the following groups for work:
Administrative Tasks | Exit a thread | Start a new thread | Kill | Enable | Disable | Set Priority High | Set Priority Low | Yield N | Yield 0 | Set Threshold |
Sort Task | Sort in Progress | Sort Pending |
Cleaning Task | Scan Index |
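The ordering of the three task groups can be pictured with a short Python sketch. It only shows the order in which the groups are checked; the function run_task_cycle, the queue types and the decision to drain every administrative task before moving on are assumptions, not a description of the engine internals.

```python
from collections import deque


def run_task_cycle(admin_tasks, sort_pending, hot_list):
    """Run one pass over the three task groups and return what was done."""
    done = []
    # Group 1: administrative tasks (start, stop, priority, threshold, ...).
    while admin_tasks:
        done.append(("admin", admin_tasks.popleft()))
    # Group 2: a pending sort task rebuilds the hot list.
    if sort_pending:
        done.append(("sort", "rebuild hot list"))
    # Group 3: clean the next index on the hot list, if there is one.
    if hot_list:
        done.append(("clean", hot_list.popleft()))
    return done


# Hypothetical inputs: one admin command, a pending sort, one hot index.
cycle = run_task_cycle(
    deque(["Set Priority High"]), True, deque([(0x100123, 1)])
)
```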
There are now two types of scan that may be undertaken to clean an index. The first is a Basic Index Scan and the second is a Light Range Scan. N.B. The Light Range Scan can only be used when the number of indexes on a partition is exactly one (i.e. detached indexes or an attached index with only one index on the table). The Basic Index Scan, however, can be used at any time.
The Basic Index (Leaf) Scan starts at the root node of the index and walks down to the leftmost leaf. Once at the leaf level, each node is checked for dirty items and cleaned if required. The leaf node's next pointer is used to move to the next node to be processed, until the last leaf node has been visited.
Advantages:
Examines all index nodes from left to right.
Looks at the buffer pool, so pages that have not yet been flushed to disk are also examined.
Able to operate when more than one index key exists in a partition.
Disadvantages:
Reads every index page into the buffer pool.
Slow due to the amount of I/O operations involved.
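The Basic Index (Leaf) Scan described above can be sketched on a toy in-memory tree. The Node class and the (key, deleted) item layout are hypothetical stand-ins for real index pages, and locking and page compression are left out.

```python
class Node:
    """Toy index node; leaves carry (key, deleted) items and a next pointer."""

    def __init__(self, items=None, children=None):
        self.items = items or []
        self.children = children or []
        self.next = None


def leftmost_leaf(root):
    """Walk down from the root, always taking the first child."""
    node = root
    while node.children:
        node = node.children[0]
    return node


def basic_leaf_scan(root):
    """Visit every leaf left to right and remove items flagged as deleted."""
    visited = removed = 0
    leaf = leftmost_leaf(root)
    while leaf is not None:
        visited += 1
        kept = [(key, deleted) for key, deleted in leaf.items if not deleted]
        removed += len(leaf.items) - len(kept)
        leaf.items = kept
        leaf = leaf.next  # follow the leaf node's next pointer
    return visited, removed


# Two leaves under one root, each holding one deleted item.
left = Node(items=[(1, False), (2, True)])
right = Node(items=[(3, True), (4, False)])
left.next = right
root = Node(children=[left, right])
result = basic_leaf_scan(root)
```

Every leaf is visited whether or not it holds deleted items, which is why this scan is thorough but expensive in I/O.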
The Light Range Scan was added to improve the performance of index cleaning.
It combines several other online performance features, such as light
scanning and range scanning into the index cleaning scan. When a user
submits a request for cleaning, the minimum and maximum logical page numbers
are tracked off the memory partition. The scan then starts by reading a
block of pages from the disk starting with the lowest logical page that a
request has been submitted for. This block is then examined for any leaf
pages having deleted items. Any pages having deleted items are then read
into the buffer pool and cleaned.
During the cleaning of dirty pages, an asynchronous I/O is submitted for
the next block to process until the highest logical page number is
encountered.
Advantages:
Uses light I/O scans.
Only scans between the high and low boundaries.
Disadvantages:
Does not clean index pages that have not been flushed to disk.
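The block-by-block behaviour of the Light Range Scan can be sketched as follows. The dictionary of pages stands in for the on-disk image of the index, and light_range_scan, block_size and the page fields are hypothetical names. The asynchronous read-ahead of the next block is not modelled here; blocks are simply processed one after another.

```python
def light_range_scan(pages, low, high, block_size):
    """Scan logical pages low..high in blocks and clean dirty leaf pages.

    pages maps a logical page number to a dict with the keys
    "is_leaf" (bool) and "deleted" (count of deleted items).
    Returns the page numbers that were cleaned.
    """
    cleaned = []
    start = low
    while start <= high:
        end = min(start + block_size - 1, high)
        # Examine one block; only pages inside the low/high range are read.
        for pageno in range(start, end + 1):
            page = pages.get(pageno)
            if page and page["is_leaf"] and page["deleted"] > 0:
                page["deleted"] = 0  # stands in for the real page cleaning
                cleaned.append(pageno)
        start = end + 1
    return cleaned


# Made-up pages: page 20 is dirty but lies outside the requested range.
pages = {
    10: {"is_leaf": True, "deleted": 3},
    11: {"is_leaf": False, "deleted": 0},
    12: {"is_leaf": True, "deleted": 0},
    13: {"is_leaf": True, "deleted": 1},
    20: {"is_leaf": True, "deleted": 5},
}
cleaned_pages = light_range_scan(pages, low=10, high=13, block_size=2)
```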
This new implementation involves changes to the server engine, so it should not directly affect how applications are run. Applications that previously did large batch updates or deletions on a single table will no longer bottleneck on the btree cleaner latch. End users will not have to change their code in order to take advantage of the new features of the btree scanner. The changes are to engine algorithms, which will only affect the DBA's tuning of resources.
Both the onmode and the onstat commands have a new option, -C:
onmode -C start {count} | There can be a maximum of 32 btree scanner threads running at one time. If a count is not specified a default count of 1 is assumed. |
onmode -C stop {count} | This command is used to stop or kill btree scanner threads. This command will not execute immediately, but will take place on the assignment of the next unit of work. If a count is not specified a default count of 1 is assumed. |
onmode -C threshold {size} | Sets the minimum number of deleted items an index must encounter before an index will be placed onto the hot list. Once all indexes above the threshold have been cleaned then indexes below this threshold will be added to the hot list. |
onmode -C high | Sets the priority of all running btree scanner threads. This will set the priority of the btree scanner threads equal to that of normal users. |
onmode -C low | This command sets the priority of all running btree scanner threads lower than normal users. This will allow the btree scanner threads to consume only spare resources and ensure that they will not use CPU cycles of normal users. |
onmode -C enable | Enables the btree scanner thread(s) after the disable command has been issued. (Normally only used during testing) |
onmode -C disable | Disables the btree scanner thread(s) from generating a sort list or scanning any indexes for deleted items. (Normally only used during testing) |
onstat -C | Prints the profile information about the btree scanner subsystem and about each btree scanner thread active. |
onstat -C prof | Print the profile information for the system and each thread. |
onstat -C hot | Print the hot list index key in the order they are to be cleaned. |
onstat -C part | Print all partitions with index statistics. |
onstat -C clean | Show information about all partitions which have been cleaned or are in need of being cleaned. |
onstat -C all | Print all onstat -C options. |
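For scripted administration, the onmode -C forms listed above can be assembled by a small helper before being handed to the shell. The sketch below only builds the argument list and does not run anything. The helper name onmode_c is hypothetical, and rejecting a start count above 32 is an assumption derived from the stated maximum of 32 running threads.

```python
MAX_SCANNERS = 32  # maximum number of btree scanner threads, per the table above


def onmode_c(action, value=None):
    """Build the argument list for one of the onmode -C forms listed above."""
    if action in ("start", "stop"):
        if value is None:
            # The server assumes a count of 1 when none is given.
            return ["onmode", "-C", action]
        count = int(value)
        if count < 1 or (action == "start" and count > MAX_SCANNERS):
            raise ValueError(f"invalid thread count: {count}")
        return ["onmode", "-C", action, str(count)]
    if action == "threshold":
        if value is None:
            raise ValueError("threshold needs a size")
        return ["onmode", "-C", "threshold", str(int(value))]
    if action in ("high", "low", "enable", "disable"):
        return ["onmode", "-C", action]
    raise ValueError(f"unknown onmode -C action: {action!r}")


# Example: request four extra scanner threads, then raise their priority.
commands = [onmode_c("start", 4), onmode_c("high")]
```

The resulting lists can be passed to subprocess.run on a host where the Informix environment is set, which keeps quoting problems out of the script.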
Number of threads versus Priority:
By default the btree scanner threads run at a lower priority than user threads, so
when the system becomes busy the cleaning of the indexes will not occur
as fast. If the system is busy and you want index cleaning to occur,
set the threads to high priority (a priority equal to that of normal
users).
Active Threads | The number of currently running B-tree scanners |
Global Commands | The commands that have been requested to run |
Number of partition scans | The number of times the B-tree scanner has examined all partitions looking for index partitions to clean |
Main Block | The pointer to the B-tree scanner's main block |
BTC Admin | The pointer to the current assigned admin thread |
BTS info | The pointer to the current B-tree scanner information |
Id | The B-tree scanner id and array position |
Prio | The current priority assigned to the B-tree scanner. HIGH - The B-tree scanner runs with the same priority as a normal user. LOW - The B-tree scanner runs behind all normal users. |
Partnum | The partition number of the index the B-tree scanner is cleaning. If set to 0 then it is not currently cleaning |
Key | The index key number that the B-tree scanner is cleaning |
Cmd | The current command being processed by this B-tree scanner |
Number of leaves pages Scanned | The number of pages the B-tree scanner has read and processed |
Number of leaves with deleted items | The number of leaf pages in which the B-tree scanner has deleted items |
Time spent cleaning (sec) | A very gross estimate of the number of whole seconds the B-tree scanner has spent cleaning indexes |
Current Item | The current item which is being cleaned. If this number is greater than the size of the list, then the entire list has been assigned or has been already cleaned. |
List Size | The number of items on the hot list |
Hit Threshold | The number of dirty items that must be encountered on a specific index before the index will be placed on the hot list. If the B-tree scanner has been idle for over 5 minutes it might decide to internally lower this number. |
List Created | The time the current list was created |
List expires in | The number of seconds left before this list will expire |
Range Scan Threshold | An index that contains more than this number of pages will be cleaned using range scanning. If the value is -1 then range scan cleaning is disabled |
Partnum | The partition of the index which needs to be cleaned |
Key | The key number of the index which needs to be cleaned |
Hits | The number of hits encountered. * - This index has been assigned to be cleaned or has already been cleaned |
Partnum | The partition of the index which needs to be cleaned. C - the index is in the process of being cleaned. N - the index may NOT be cleaned |
Dirty Hits | The current number of hits (index items which have not been removed from the index which a user has encountered) on this index |
Clean Time | The time in seconds spent cleaning this index |
Pg Examined | The number of pages which have been examined by a B-tree scanner for this index |
Items Del | The number of dirty items removed from this index |
Pages/Sec | The average number of pages cleaned by the B-tree scanner per second on this index key. This should only be used for gross performance calculations because a timer with 1 second granularity is used, so the precision is low. |
Partnum | The partition of the index which needs to be cleaned. C - the index is in the process of being cleaned. N - the index may NOT be cleaned |
Low | The lowest logical page which needs to be scanned |
High | The highest page which needs to be scanned |
Size | The current number of pages which exist in this partition |
Saving | The percentage of pages saved by not having to scan the entire index |
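The two derived figures in these tables, Pages/Sec and Saving, can be approximated from the other columns. The formulas below are assumptions based on the field descriptions above, not the engine's exact arithmetic, and the function names are hypothetical.

```python
def pages_per_sec(pg_examined, clean_time_sec):
    """Average pages per second; None when the 1-second timer shows 0."""
    if clean_time_sec <= 0:
        # Whole-second granularity: a short clean can report 0 seconds.
        return None
    return pg_examined / clean_time_sec


def range_saving_pct(low, high, size):
    """Percentage of pages skipped by scanning only low..high (assumed formula)."""
    if size <= 0:
        return 0.0
    scanned = min(max(0, high - low + 1), size)
    return 100.0 * (size - scanned) / size


# Made-up figures: 1200 pages in 4 seconds, and a 100-page range
# inside a 1000-page partition.
rate = pages_per_sec(1200, 4)
saving = range_saving_pct(100, 199, 1000)
```

Because the timer only counts whole seconds, the rate is best compared across long cleaning runs rather than read as a precise measurement.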
The sentences "Therefore, each modification is done in a series. The operations make one attempt at a modification. If the index is locked, the operation fails." are not worded very well. Thank you for bringing it to my attention; I will work with the team on improving it.
Let me expand a little on how bts behaves. It does depend a little on the version of IDS:
In 11.50.xC5 and earlier, the BTS index operations execute in a single bts VP and each operation is executed sequentially. Multiple readers and writers can interleave their operations.
There is an internal lock to ensure that no concurrent (threaded) operations execute on the VP, and multiple VPs were not allowed. There is a slim possibility that the lock would time out and the SQL statement would fail, but I have not had a report of this happening. That was essentially the behaviour for bts.1.00, bts.1.10 and bts.2.00 up to and including 11.50.xC5.
Starting with 11.50.xC6, we enabled bts to work with multiple bts VPs. There is still a restriction that each VP only executes one operation at a time; however, multiple bts VPs may be created with the VPCLASS onconfig variable, or more bts VPs can be added with the onmode -p command. (Also, the noyield flag is no longer required when defining a bts VPCLASS.)
There may be multiple readers and writers of any bts index across several bts VPs. There is a critical section in the transaction commit that puts an exclusive (write) lock on the BTS index during this phase. The lock will wait a significantly long time before it times out.
The compact operation still takes an exclusive lock. The purpose of compact is to release space: for an index built in an extent space it frees space back to the file system, and for an index built in an sbspace (a bts.2.00 feature) it frees pages back to the sbspace.
But unlike the first release of BTS, BTS in 11.50 will eventually reuse pages in the index when new rows are inserted after rows have been deleted. This greatly reduces the need to compact.