|
The behavior of an application using MyBatis changes when second level
caching is enabled. For example, consider the following use case. Table A: ------------------------------------- id name age 1 'John Smith' 22 2 'Jane Doe' 20 1) SqlSession 1 executes "select * from A". 2) SqlSession 1 closes. 3) SqlSession 2 executes "delete from A where id = 1" 4) SqlSession 2 executes "select * from A" When second level caching isn't enabled Step 4 returns the following row from the database. 2 'Jane Doe' 20 When second level caching is enabled, Step 4 returns the the following rows from the database. 1 'John Smith' 22 2 'Jane Doe' 20 The difference in behavior is a result of Step 4 using the second level cache (which was populated by Step 1). However, Step 4 incorrectly uses the second level cache in this case because the cache has become stale (for SqlSession 2) as a result of Step 3. Issue 524 was open recently identifying this problem. Earlier last year this issue was discussed in the thread below but not completely understood. Issue: http://code.google.com/p/mybatis/issues/detail?id=524 Thread: http://groups.google.com/group/mybatis-user/browse_thread/thread/71ab76b1b083deb2/722729e7c13a0e5e?lnk=gst&q=second+level+cache#722729e7c13a0e5e This post is an attempt to gather community feedback on whether the behavior of the second level cache should be changed so that the results returned in Step 4 are consistent whether second level caching is enabled or not. Two potential options have been identified at this point: 1) Clear the second level cache immediately when an insert occurs. 2) Ignore stale second level caches for a SqlSession that has performed an insert/update/delete. |
|
What should SqlSession 3 get if it executes "select * from A" sometime
between steps 3 and 4? I think it should get both rows. What should happen if SqlSession 2 does a rollback instead of a commit? I think the cache should be left as is. This is the heart of the problem - I don't think we should allow dirty reads from the cache (or at least it should be configurable somehow). But I do understand the problem for SqlSession 2. I think I'm in favor of a solution like #2 - once you insert/update/delete in a session no further selects in that session should use the cache, OR update the cache. And the cache should still be cleared at the commit. I haven't looked at the code to know how big an effort this is, but I think that would address the issue. Thoughts?? Jeff Butler On Thu, Feb 23, 2012 at 10:13 AM, Robert Hafner <[hidden email]> wrote: > The behavior of an application using MyBatis changes when second level > caching is enabled. For example, consider the following use case. > > Table A: > ------------------------------------- > id name age > 1 'John Smith' 22 > 2 'Jane Doe' 20 > > 1) SqlSession 1 executes "select * from A". > 2) SqlSession 1 closes. > 3) SqlSession 2 executes "delete from A where id = 1" > 4) SqlSession 2 executes "select * from A" > > When second level caching isn't enabled Step 4 returns the following > row from the database. > 2 'Jane Doe' 20 > > When second level caching is enabled, Step 4 returns the the following > rows from the database. > 1 'John Smith' 22 > 2 'Jane Doe' 20 > > The difference in behavior is a result of Step 4 using the second > level cache (which was populated by Step 1). However, Step 4 > incorrectly uses the second level cache in this case because the cache > has become stale (for SqlSession 2) as a result of Step 3. > > Issue 524 was open recently identifying this problem. Earlier last > year this issue was discussed in the thread below but not completely > understood. > > Issue: > http://code.google.com/p/mybatis/issues/detail?id=524 > > Thread: > http://groups.google.com/group/mybatis-user/browse_thread/thread/71ab76b1b083deb2/722729e7c13a0e5e?lnk=gst&q=second+level+cache#722729e7c13a0e5e > > This post is an attempt to gather community feedback on whether the > behavior of the second level cache should be changed so that the > results returned in Step 4 are consistent whether second level caching > is enabled or not. Two potential options have been identified at this > point: > > 1) Clear the second level cache immediately when an insert occurs. > 2) Ignore stale second level caches for a SqlSession that has > performed an insert/update/delete. > > > > > > > |
|
Both solutions should work. None of them will cause dirty reads.
I am talking from memory but there is already a dirty flag in session that indicates that an update has been called (that will also mark the cache to be flushed when commit is done). It seems that this is just avoiding reading from the cache when session is dirty. I don't see a problem with updating the 2nd level cache on commit with selects if those selects have been executed after the updates. Looks that solution #2 is better. The only problem I see with it is that caches are difficult to understand as they are now and this will make them more complex. That is the good point of option #1, its trivial! |
|
In reply to this post by Jeff Butler
I agree SqlSession3 should get both rows and if SqlSession2 performs a rollback the cache shouldn't be modified. I also tend to agree that the cache shouldn't perform dirty reads (at least when the transaction isolation level is set to READ COMMITTED). Not really sure if it should behave differently in other isolation levels. In general I think the caching behavior should not differ from the "no-caching" behavior.
On Thu, Feb 23, 2012 at 12:10 PM, Jeff Butler <[hidden email]> wrote: What should SqlSession 3 get if it executes "select * from A" sometime |
|
In reply to this post by Eduardo Macarron
I think there is a problem with #1.
In this scenario, step 3 would clear the cache, step 4 would refill the cache with uncommitted changes. If SqlSession 2 does a rollback in step 5, then the cache is incorrect. I guess we could clear the cache on rollback - that would be an optimistic caching strategy that allows dirty reads :) Jeff Butler On Thu, Feb 23, 2012 at 2:02 PM, Eduardo <[hidden email]> wrote: > Both solutions should work. None of them will cause dirty reads. > > I am talking from memory but there is already a dirty flag in session > that indicates that an update has been called (that will also mark the > cache to be flushed when commit is done). It seems that this is just > avoiding reading from the cache when session is dirty. I don't see a > problem with updating the 2nd level cache on commit with selects if > those selects have been executed after the updates. > > Looks that solution #2 is better. The only problem I see with it is > that caches are difficult to understand as they are now and this will > make them more complex. That is the good point of option #1, its > trivial! |
|
Maybe I should say "allows reads of uncommitted changes", that's more
accurate than dirty read. Jeff Butler On Thu, Feb 23, 2012 at 2:16 PM, Jeff Butler <[hidden email]> wrote: > I think there is a problem with #1. > > In this scenario, step 3 would clear the cache, step 4 would refill > the cache with uncommitted changes. If SqlSession 2 does a rollback > in step 5, then the cache is incorrect. I guess we could clear the > cache on rollback - that would be an optimistic caching strategy that > allows dirty reads :) > > Jeff Butler > > > On Thu, Feb 23, 2012 at 2:02 PM, Eduardo <[hidden email]> wrote: >> Both solutions should work. None of them will cause dirty reads. >> >> I am talking from memory but there is already a dirty flag in session >> that indicates that an update has been called (that will also mark the >> cache to be flushed when commit is done). It seems that this is just >> avoiding reading from the cache when session is dirty. I don't see a >> problem with updating the 2nd level cache on commit with selects if >> those selects have been executed after the updates. >> >> Looks that solution #2 is better. The only problem I see with it is >> that caches are difficult to understand as they are now and this will >> make them more complex. That is the good point of option #1, its >> trivial! |
|
In reply to this post by Jeff Butler
I'm not sure I understand completely. In this scenario are you changing when the cache is populated from commit time to immediately? Or just forcing the cache to be cleared when an insert/update/delete occurs.
On Thu, Feb 23, 2012 at 2:16 PM, Jeff Butler <[hidden email]> wrote: I think there is a problem with #1. |
|
I think the way MyBatis works now is that a select will immediately
populate the cache if necessary. insert/update/delete cause the cache to be cleared at commit time. I thought you were proposing that the cache be cleared immediately on insert/update/delete rather than waiting for the commit. That would cause the problem I mentioned. Of course, I may be wrong about my understanding - but I don't think MyBatis waits for a commit to populate the cache on select. I may have misunderstood the proposal. Can you elaborate on what #1 looks like and how it deals with uncommitted changes? Jeff Butler On Thu, Feb 23, 2012 at 2:33 PM, Robert Hafner <[hidden email]> wrote: > I'm not sure I understand completely. In this scenario are you changing when > the cache is populated from commit time to immediately? Or just forcing the > cache to be cleared when an insert/update/delete occurs. > > > On Thu, Feb 23, 2012 at 2:16 PM, Jeff Butler <[hidden email]> wrote: >> >> I think there is a problem with #1. >> >> In this scenario, step 3 would clear the cache, step 4 would refill >> the cache with uncommitted changes. If SqlSession 2 does a rollback >> in step 5, then the cache is incorrect. I guess we could clear the >> cache on rollback - that would be an optimistic caching strategy that >> allows dirty reads :) >> >> Jeff Butler >> >> >> On Thu, Feb 23, 2012 at 2:02 PM, Eduardo <[hidden email]> >> wrote: >> > Both solutions should work. None of them will cause dirty reads. >> > >> > I am talking from memory but there is already a dirty flag in session >> > that indicates that an update has been called (that will also mark the >> > cache to be flushed when commit is done). It seems that this is just >> > avoiding reading from the cache when session is dirty. I don't see a >> > problem with updating the 2nd level cache on commit with selects if >> > those selects have been executed after the updates. >> > >> > Looks that solution #2 is better. The only problem I see with it is >> > that caches are difficult to understand as they are now and this will >> > make them more complex. That is the good point of option #1, its >> > trivial! > > |
|
My understanding thus far may be incorrect. I was under the impression the cacheable results from SqlSession X are only available to another SqlSession after X commits. I thought this is what is meant by a "Transactional cache". If that's not the case, then we have a bigger problem than I initially thought because caching allows dirty reads and our transaction isolation level does not. Can you explain (maybe with some examples) how caching is designed to work today?
Option 1: Clear the L2 cache immediately on inserts/update/delete. I'm not sure how this affects other SqlSessions that are running concurrently that may have gotten results from the cache already. Option 2: Do not use caches after an insert/update/delete is invoked. This behavior may be an all or nothing approach. Or it could isolate out the stale caches and only return results form the "valid" caches. On Feb 23, 2012, at 2:41 PM, Jeff Butler wrote: > I think the way MyBatis works now is that a select will immediately > populate the cache if necessary. insert/update/delete cause the cache > to be cleared at commit time. I thought you were proposing that the > cache be cleared immediately on insert/update/delete rather than > waiting for the commit. That would cause the problem I mentioned. Of > course, I may be wrong about my understanding - but I don't think > MyBatis waits for a commit to populate the cache on select. > > I may have misunderstood the proposal. Can you elaborate on what #1 > looks like and how it deals with uncommitted changes? > > Jeff Butler > > > On Thu, Feb 23, 2012 at 2:33 PM, Robert Hafner <[hidden email]> wrote: >> I'm not sure I understand completely. In this scenario are you changing when >> the cache is populated from commit time to immediately? Or just forcing the >> cache to be cleared when an insert/update/delete occurs. >> >> >> On Thu, Feb 23, 2012 at 2:16 PM, Jeff Butler <[hidden email]> wrote: >>> >>> I think there is a problem with #1. >>> >>> In this scenario, step 3 would clear the cache, step 4 would refill >>> the cache with uncommitted changes. If SqlSession 2 does a rollback >>> in step 5, then the cache is incorrect. I guess we could clear the >>> cache on rollback - that would be an optimistic caching strategy that >>> allows dirty reads :) >>> >>> Jeff Butler >>> >>> >>> On Thu, Feb 23, 2012 at 2:02 PM, Eduardo <[hidden email]> >>> wrote: >>>> Both solutions should work. None of them will cause dirty reads. >>>> >>>> I am talking from memory but there is already a dirty flag in session >>>> that indicates that an update has been called (that will also mark the >>>> cache to be flushed when commit is done). It seems that this is just >>>> avoiding reading from the cache when session is dirty. I don't see a >>>> problem with updating the 2nd level cache on commit with selects if >>>> those selects have been executed after the updates. >>>> >>>> Looks that solution #2 is better. The only problem I see with it is >>>> that caches are difficult to understand as they are now and this will >>>> make them more complex. That is the good point of option #1, its >>>> trivial! >> >> |
|
Well my understanding might be wrong as I said. TransactionalCache
does indeed have some code to deal with cache updates on commit. So maybe #1 will work after all. Eduardo seems to think it will work and he's definitely "the man" right now :) What would be great is to write some unit tests for this stuff so we can try it out thoroughly. Anyone up for some multi-threaded synchronized JUnit magic? I might do it myself, but not for a few days. Jeff Butler On Thu, Feb 23, 2012 at 3:05 PM, Robert Hafner <[hidden email]> wrote: > My understanding thus far may be incorrect. I was under the impression the cacheable results from SqlSession X are only available to another SqlSession after X commits. I thought this is what is meant by a "Transactional cache". If that's not the case, then we have a bigger problem than I initially thought because caching allows dirty reads and our transaction isolation level does not. Can you explain (maybe with some examples) how caching is designed to work today? > > Option 1: Clear the L2 cache immediately on inserts/update/delete. I'm not sure how this affects other SqlSessions that are running concurrently that may have gotten results from the cache already. > > Option 2: Do not use caches after an insert/update/delete is invoked. This behavior may be an all or nothing approach. Or it could isolate out the stale caches and only return results form the "valid" caches. > > > > > > > On Feb 23, 2012, at 2:41 PM, Jeff Butler wrote: > >> I think the way MyBatis works now is that a select will immediately >> populate the cache if necessary. insert/update/delete cause the cache >> to be cleared at commit time. I thought you were proposing that the >> cache be cleared immediately on insert/update/delete rather than >> waiting for the commit. That would cause the problem I mentioned. Of >> course, I may be wrong about my understanding - but I don't think >> MyBatis waits for a commit to populate the cache on select. >> >> I may have misunderstood the proposal. Can you elaborate on what #1 >> looks like and how it deals with uncommitted changes? >> >> Jeff Butler >> >> >> On Thu, Feb 23, 2012 at 2:33 PM, Robert Hafner <[hidden email]> wrote: >>> I'm not sure I understand completely. In this scenario are you changing when >>> the cache is populated from commit time to immediately? Or just forcing the >>> cache to be cleared when an insert/update/delete occurs. >>> >>> >>> On Thu, Feb 23, 2012 at 2:16 PM, Jeff Butler <[hidden email]> wrote: >>>> >>>> I think there is a problem with #1. >>>> >>>> In this scenario, step 3 would clear the cache, step 4 would refill >>>> the cache with uncommitted changes. If SqlSession 2 does a rollback >>>> in step 5, then the cache is incorrect. I guess we could clear the >>>> cache on rollback - that would be an optimistic caching strategy that >>>> allows dirty reads :) >>>> >>>> Jeff Butler >>>> >>>> >>>> On Thu, Feb 23, 2012 at 2:02 PM, Eduardo <[hidden email]> >>>> wrote: >>>>> Both solutions should work. None of them will cause dirty reads. >>>>> >>>>> I am talking from memory but there is already a dirty flag in session >>>>> that indicates that an update has been called (that will also mark the >>>>> cache to be flushed when commit is done). It seems that this is just >>>>> avoiding reading from the cache when session is dirty. I don't see a >>>>> problem with updating the 2nd level cache on commit with selects if >>>>> those selects have been executed after the updates. >>>>> >>>>> Looks that solution #2 is better. The only problem I see with it is >>>>> that caches are difficult to understand as they are now and this will >>>>> make them more complex. That is the good point of option #1, its >>>>> trivial! >>> >>> > |
|
I am still the rookie Jeff :)
TransactionalCache holds the 2nd level cache and a temporary buffer (log) - All reads are done from the 2nd level cache (never from the buffer) - All select results comming from database are stored in the buffer and are moved to the 2nd level cache on commit. - When an update is executed the temporary buffer is flushed inmediately and the session is marked so the 2nd level cache will be flushed if the session ends with commit. Change #2 consists on disabling the 2nd level cache access for dirty sessions. All selects in a dirty session will hit the database. It looks that #1 will work fine. #2 is a bit more complex to understand, but seems it will behave better and the code change is minor in both options. I have attached the code to issue #524. I will really appreciate you all to have a look at the change. Thanks in advance! |
|
In reply to this post by Jeff Butler
I put together a test case which exercises the use cases we have discussed. I don't believe the test case needs to be multi-threaded if the transaction manager is configured to use "JDBC". Unless of course you have concerns with the ReadWrite lock on the cache in a multithreaded use case. (Correct me if I am wrong here). So far the only test case that fails is the one that I originally reported. I'll attach the test code to issue 524 so that you can review it.
On Thu, Feb 23, 2012 at 3:24 PM, Jeff Butler <[hidden email]> wrote: Well my understanding might be wrong as I said. TransactionalCache |
|
I have tried to integrate your cases in mybatis tests with hsqldb and
derby but they fail with deadlocks. Anyway, I would say that test1 is enough. I have uploaded an updated snapshot so you can test it. Looks that option #2 works. Please let me know you opinion on committing this change for 3.1. |
|
I tried on H2 as well and it failed with a deadlock. I think its a limitation of running the db in embeded mode. But I didn't investigate it much. Instead I ran the testcase against MySQL. |
|
Yep, I suppose they are blocking the whole table. Anyway thanks for
the testing with MySQL. I think the fix is good and safe, at least safer than not having it. Let's wait for the opinion of other devs about adding it to next release. 2012/2/25 Robert Hafner <[hidden email]>: > I tried on H2 as well and it failed with a deadlock. I think its a > limitation of running the db in embeded mode. But I didn't investigate it > much. Instead I ran the testcase against MySQL. > > On Feb 24, 2012 3:18 PM, "Eduardo Macarron" <[hidden email]> > wrote: >> >> I have tried to integrate your cases in mybatis tests with hsqldb and >> derby but they fail with deadlocks. >> >> Anyway, I would say that test1 is enough. I have uploaded an updated >> snapshot so you can test it. >> >> Looks that option #2 works. Please let me know you opinion on >> committing this change for 3.1. |
|
Ok guys. I will include this change for 3.1 as a bug fix. I suppose it
will be released this weekend or maybe next (it depends on Simo availability) Thanks for your help and your patience Robert! |
|
I'm writing some multi-threaded tests for this. I'll commit them
tomorrow hopefully. Jeff On Mon, Feb 27, 2012 at 12:06 PM, Eduardo <[hidden email]> wrote: > Ok guys. I will include this change for 3.1 as a bug fix. I suppose it > will be released this weekend or maybe next (it depends on Simo > availability) > > Thanks for your help and your patience Robert! |
|
Thank you Jeff! I will keep the issue open then until we have an OK.
2012/2/27 Jeff Butler <[hidden email]>: > I'm writing some multi-threaded tests for this. I'll commit them > tomorrow hopefully. > > Jeff > > On Mon, Feb 27, 2012 at 12:06 PM, Eduardo <[hidden email]> wrote: >> Ok guys. I will include this change for 3.1 as a bug fix. I suppose it >> will be released this weekend or maybe next (it depends on Simo >> availability) >> >> Thanks for your help and your patience Robert! |
|
This has been a frustrating exercise! I've tried with both HSQLDB and
Derby in both embedded and server modes. I can't make a test for the rollback scenario work. Row locking in the DB prevents the scenario I was concerned about. If I test with cache enabled, it works exactly as it should. Disabling the cache causes the tests to hang because of row locks in the DB. I think we're good to go with closing this issue. Jeff On Mon, Feb 27, 2012 at 12:21 PM, Eduardo Macarron <[hidden email]> wrote: > Thank you Jeff! I will keep the issue open then until we have an OK. > > 2012/2/27 Jeff Butler <[hidden email]>: >> I'm writing some multi-threaded tests for this. I'll commit them >> tomorrow hopefully. >> >> Jeff >> >> On Mon, Feb 27, 2012 at 12:06 PM, Eduardo <[hidden email]> wrote: >>> Ok guys. I will include this change for 3.1 as a bug fix. I suppose it >>> will be released this weekend or maybe next (it depends on Simo >>> availability) >>> >>> Thanks for your help and your patience Robert! |
|
I had the same problem :( Anyway thanks for your help the more eyes we
have on this, the better. This one was the last issue. I think 3.1 is ready to be released. |
| Powered by Nabble | Edit this page |
