“enq: TX – allocate ITL entry” wait event on index block

Over the weekend, some processes on one of our databases locked up.  Support killed some sessions to get things back running, but on Monday I wanted to try to drill into what had occurred.

Querying dba_hist_active_sess_history I identified most problem sessions were waiting on the event “enq: TX – allocate ITL entry”.

For more details about what this event is, and why it happens, Arup Nanda has a good blog post (with pictures) about why it occurs.  In summary if a session wants to update a block, but there is no  remaining space in the block to create a new ITL entry, then it has to wait for one of the sessions currently holding an existing entry to commit (or rollback).

dba_hist_active_sess_history also identified, via the current_obj# column, that the object involved was  an index on a STATUS field, and I used the current_file#, and current_block# from one of the records to identify one of the blocks involved.  Below are some extracts from the block dump:

 seg/obj: 0x4bf2e  csc: 0x49.7a33a31d  itc: 169  flg: E  typ: 2 - INDEX
     brn: 0  bdba: 0x70490 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0069.004.00000004  0x00c01ce8.0001.41  C---    0  scn 0x0049.78de4f73
0x02   0x012a.000.00000004  0x00c0f996.0002.22  --U-    1  fsc 0x000d.7a33aa37
0x03   0x016c.003.0000000b  0x00c0584a.0010.07  --U-    1  fsc 0x000d.7a35cd79
0x04   0x0144.00e.0000000b  0x00c0e3fa.0004.2e  --U-    1  fsc 0x000d.7a33aa24
0x05   0x0130.00c.00000005  0x00c033cf.0000.34  C---    0  scn 0x0049.7a2ec617
0x06   0x0005.009.0006b8f0  0x00c0f190.8b4b.1c  --U-    1  fsc 0x000d.7a33a967
0x07   0x0160.011.00000004  0x00c04e80.0001.0c  C---    0  scn 0x0049.7a2ec71a
0x08   0x0115.002.00000005  0x00c03082.0001.02  --U-    1  fsc 0x000d.7a33a9f8
0x09   0x0028.001.0000000e  0x00c0e93b.0010.01  C---    0  scn 0x0049.7a2eca94
0x0a   0x00d4.005.00000005  0x00c0275d.0001.1e  --U-    1  fsc 0x000d.7a33aa49
0x0b   0x014c.005.00000007  0x00c0ee84.0003.25  C---    0  scn 0x0049.7a2eb3fb
0x0c   0x00a4.010.0000000b  0x00c0e6ad.0010.40  --U-    1  fsc 0x000d.7a35ce41
...
0xa6   0x000f.015.0006bd01  0x00c25946.7676.21  --U-    1  fsc 0x000d.7a33a763
0xa7   0x0025.002.00000011  0x00c051b2.0011.17  --U-    1  fsc 0x000d.7a33a711
0xa8   0x0151.004.00000004  0x00c04afe.0000.09  --U-    1  fsc 0x000d.7a33a8c4
0xa9   0x00fc.01d.00000003  0x00c02b51.0001.0e  --U-    1  fsc 0x000d.7a33a785
...
row#0[1109] flag: ---D---, lock: 169, len=11
col 0; len 1; (1):  30
col 1; len 6; (6):  00 09 c8 8b 00 5a
row#1[1098] flag: ---D---, lock: 164, len=11
col 0; len 1; (1):  30
col 1; len 6; (6):  00 09 c8 8b 00 5b
row#2[1087] flag: ---D---, lock: 75, len=11
col 0; len 1; (1):  30
col 1; len 6; (6):  00 09 c8 8b 00 5c
row#3[3969] flag: ---DS--, lock: 0, len=11
col 0; len 1; (1):  30
col 1; len 6; (6):  00 09 c8 93 00 05
row#4[1307] flag: ---D---, lock: 155, len=11
col 0; len 1; (1):  30
col 1; len 6; (6):  00 09 c8 9e 00 51
...
row#173[2330] flag: -------, lock: 0, len=11
col 0; len 1; (1):  35
col 1; len 6; (6):  00 07 04 8c 00 14
row#174[2319] flag: -------, lock: 0, len=11
col 0; len 1; (1):  35
col 1; len 6; (6):  00 07 04 8c 00 15
row#175[2308] flag: -------, lock: 0, len=11
col 0; len 1; (1):  35
col 1; len 6; (6):  00 07 04 8c 00 16

In particular note that the block contains 175 index entries and 169 ITL entries.  This makes me suspect that 168 transactions were accessing the block simultaneously (one ITL entry is reserved by Oracle for leaf node splits).

Time to look at the process and understand what is occurring.  A highly simplified version of the process is provided below:

INSERT RECORD INTO EVENT_TABLE STATUS 0
DO SOME STUFF
UPDATE RECORD SET STATUS = 9
DO SOME MORE STUFF
COMMIT

As this is process is provided via a web-service, many sessions can be performing this task at the same time.  At this point I had a pretty good idea about what was occurring:

  • Many sessions each insert a row into EVENT_TABLE at the same time
  • Index entries for all these new records are likely to be in the same block (the index is on status, all these rows have the same value for status).
  • Each session will need to grab an ITL entry from this same index block to perform the update (note index entry update is really done as a delete and then an insert)
  • Other sessions will need to wait until ITL entries are released

Based on this hypothesis, time to build a test case:

CREATE TABLE ITL_TEST_TABLE (ID NUMBER PRIMARY KEY,
                             STATUS VARCHAR2(2 CHAR) DEFAULT '0');
CREATE SEQUENCE IT_TEST_SEQUENCE;
CREATE INDEX ITL_TEST_STATUS ON ITL_TEST_TABLE(STATUS);

CREATE OR REPLACE PROCEDURE ITL_TEST_PROC(sleep number) 
AS
   INSERTED_ID NUMBER;
BEGIN
   INSERT INTO ITL_TEST_TABLE (ID, STATUS) VALUES (IT_TEST_SEQUENCE.NEXTVAL, NULL) RETURNING ID INTO INSERTED_ID;
   UPDATE ITL_TEST_TABLE SET STATUS = '9' WHERE ID = INSERTED_ID;
   DBMS_LOCK.SLEEP(sleep);
   COMMIT;
END;   
/
>cat test_itl.sql
exec itl_test_proc(60);
exit;
>cat test_itl.sh:
#!/bin/bash
for i in {1..200}; do 
   nohup sqlplus / as sysdba @test_itl &
done;

When I execute test_itl.sh and query v$session I can see there are 32 sessions waiting on ‘enq: TX – allocate ITL entry’.  Exactly the same as the real incident, 168 of these sessions can run this code simultaneously, the rest are blocked.

I am following up with the developers about resolving the issue, some of my ideas are:

  • Initially create record with status NULL, so initial index entry will not be created.
  • Optimize the logic in process so that transactions do not hold uncommitted data for so long
  • Hold off on creating row until end of process until status is known, and then just inserting rather than inserting and then updating.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s