Oracle’s JDBC driver + prefetch == garbage [collection]

February 23rd, 2007 | Ryan Porter

The Problem

Recently, we were experiencing major performance problems with loading documents from the database. Profiling did not isolate a single cause; everything (including unrelated, background operations) seemed slow. So, we started logging garbage collection, and found that we were collecting garbage at a rate of 20GB/min!

Memory-allocation profiling revealed that the worst offender, by far, was OracleStatement.prepareAccessors(). Interestingly, it only caused a problem when our result set included a LOB. For such queries, it allocates a 1 MB object, regardless of whether the query returned any results at all.

Google searches revealed others who saw similar problems when accessing LOBs, but no solutions other than upgrading or changing drivers. We were already using the latest Oracle JDBC driver, and reverting to earlier versions of it did not help. Switching to a different driver did solve the problem; however, pushing that change to production would require extensive testing to ensure that we were not trading one problem for another (or for several).

I was about to start conducting these tests when John discovered that we were setting the OracleConnection parameter “defaultRowPrefetch” to 1000. This parameter determines how many rows are pulled back from the database on each round-trip, and increasing this value from its default of 10 will normally yield a performance gain. As an experiment, I set the value to 1, and re-profiled memory allocation. The amount of memory allocated by OracleStatement.prepareAccessors() decreased by about three orders of magnitude. Thus, it appears that when a query can return a LOB, Oracle’s JDBC driver allocates approximately “rowPrefetch” KB of memory, even when zero rows are returned.
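For reference, here is a minimal sketch of one way this parameter can be supplied: as a connection property handed to the driver when the connection is opened. The class name, method name, and credentials are placeholders of mine, not code from our system.

```java
import java.util.Properties;

// Hypothetical helper: builds the properties for an Oracle JDBC connection.
final class OraclePrefetchConfig {

    static Properties connectionProperties(String user, String password, int rowPrefetch) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Number of rows the driver pulls back per round-trip (driver default: 10).
        props.setProperty("defaultRowPrefetch", Integer.toString(rowPrefetch));
        return props;
    }
}
```

The resulting Properties object would then be passed to DriverManager.getConnection(url, props), with the Oracle driver on the classpath.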

The Solution

Returning the “defaultRowPrefetch” parameter to 10 did rid us of our garbage collection problems. However, because this is a global setting, reverting it reduced the performance of many other queries which returned many rows with no LOBs. The prospect of setting “rowPrefetch” on a per-query basis was unappetizing, to say the least, but the performance loss was significant. In the end, we altered how we retrieve rows from the database so that the fetch size geometrically increases as we pull results back from the database.

Specifically, the first batch we retrieve contains at most 10 rows, after which we increase the batch size to 20. Once we’ve retrieved 20 more rows, we increase the fetch size to 40, and so on. In this way, we never allocate large amounts of memory for queries which return few (or no) results, but we still quickly ramp up to a large fetch size.
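The ramp-up described above might look roughly like the following. This is a sketch rather than our production code: the class, the RowMapper interface, and the cap of 1000 are assumptions made for illustration, and it relies on the driver honoring ResultSet.setFetchSize() calls made while the result set is being read.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a geometrically increasing fetch size.
final class GeometricFetch {
    static final int INITIAL_FETCH_SIZE = 10;
    static final int MAX_FETCH_SIZE = 1000; // assumed cap, not from the post

    /** Hypothetical callback that turns the current row into an object. */
    interface RowMapper<T> {
        T map(ResultSet rs) throws SQLException;
    }

    /** Doubles the fetch size, never exceeding the cap. */
    static int nextFetchSize(int current) {
        return (int) Math.min((long) current * 2, MAX_FETCH_SIZE);
    }

    /** Reads every row, doubling the fetch size each time a full batch has been consumed. */
    static <T> List<T> readAll(ResultSet rs, RowMapper<T> mapper) throws SQLException {
        List<T> rows = new ArrayList<>();
        int fetchSize = INITIAL_FETCH_SIZE;
        int rowsLeftInBatch = fetchSize;
        rs.setFetchSize(fetchSize);
        while (rs.next()) {
            rows.add(mapper.map(rs));
            if (--rowsLeftInBatch == 0) {
                // Batch exhausted: ask for twice as many rows on the next round-trip.
                fetchSize = nextFetchSize(fetchSize);
                rowsLeftInBatch = fetchSize;
                rs.setFetchSize(fetchSize);
            }
        }
        return rows;
    }
}
```

With these numbers, a query returning no rows only ever asks for a 10-row buffer, while a long result set reaches the cap after seven doublings.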

For large queries which returned no LOBs, this solution is still slower than when “defaultRowPrefetch” was 1000. However, the slowdown on those queries was minor, overall system performance was substantially improved, and, importantly, the changes did not require any per-query tuning.

Add speel checking to your Swng text components (the squiggly way)

February 12th, 2007 | Brien



Marking up txt

Let’s hook Swing text components up to some tokenizing logic: a spell checker (the example above uses Jazzy), a regex (the example above will pick out some electronic musicians), or something more advanced.

Like all Swing components, text components are factored into an MVC setup. The model is javax.swing.text.Document; the view is javax.swing.plaf.TextUI (which delegates out to a javax.swing.text.View, which is generated from some ViewFactory); and the controller is the text component itself. A very simple way to add the notion of a token to this setup is to create a new kind of Document – a TokenDocument.

When text is inserted into a token document, the new text needs to be tokenized, and existing tokens need to be shifted as well. We could do the shifting manually; however, every javax.swing.text.Document provides something called a sticky position (javax.swing.text.Position) that will do it for us. Sticky positions are automatically updated by the document to reflect insertions and deletions of text. They are also guaranteed to maintain their ordering – that is, if position A is <= position B, it will always remain that way. This means the token document can maintain sorted trees of sticky positions (to store tokens) without worrying that their sort order will change.
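To make the sticky-position idea concrete, here is a stripped-down sketch. It is not the TokenDocument from the download; the class name, the Token type, and the methods are made up for illustration, and the tokenizing step itself is left out. Only createPosition() and Position.getOffset() come from javax.swing.text.

```java
import java.util.Comparator;
import java.util.TreeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.PlainDocument;
import javax.swing.text.Position;

// Hypothetical sketch: a document that remembers tokens as pairs of sticky positions.
final class TokenDocumentSketch extends PlainDocument {

    /** A token covers the text between two sticky positions. */
    static final class Token {
        final Position start;
        final Position end;

        Token(Position start, Position end) {
            this.start = start;
            this.end = end;
        }
    }

    // Sorting by the live start offset relies on the document preserving position order.
    private static final Comparator<Token> BY_START =
            Comparator.comparingInt((Token t) -> t.start.getOffset());

    private final TreeSet<Token> tokens = new TreeSet<>(BY_START);

    /** Records a token; the document keeps its offsets current from here on. */
    Token addToken(int startOffset, int endOffset) throws BadLocationException {
        Token token = new Token(createPosition(startOffset), createPosition(endOffset));
        tokens.add(token);
        return token;
    }

    Token firstToken() {
        return tokens.first();
    }
}
```

If the document holds "hello wrold" and we mark offsets 6 through 11 as a token, inserting "oh, " at the front moves the token to offsets 10 through 15 without any bookkeeping on our part.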

Once we have the tokens in the model, we need to hook them into the view. We do this through a custom TextUI. It basically does everything a BasicTextUI does, except that it also paints a token layer underneath the highlights (above the background). In general with the javax.swing.text package, whenever code starts painting outside of the view bounds (for each offset, the tightest bounding box for the letter at that offset), the dirty region needs to be expanded to include everywhere that was painted. In this code, you’ll see a line in the UI to deal with the dirty region.

Playing wth lines

Custom strokes like the squiggle stroke (above left) and smoothed noise stroke (above right) help give meaning to lines. Also, they can make an interface more fun.
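A custom stroke is just an implementation of java.awt.Stroke that turns the outline it is given into some other shape. The sketch below is a plain zigzag, much cruder than the squiggle and noise strokes in the demo, and its class name and parameters are invented for illustration. It ignores the closing segment of closed paths to stay short.

```java
import java.awt.BasicStroke;
import java.awt.Shape;
import java.awt.Stroke;
import java.awt.geom.FlatteningPathIterator;
import java.awt.geom.Path2D;
import java.awt.geom.PathIterator;

// Hypothetical sketch: replaces a path with a zigzag that follows it.
final class ZigzagStroke implements Stroke {
    private final float amplitude;
    private final float halfWavelength;
    private final Stroke pen = new BasicStroke(1f);

    ZigzagStroke(float amplitude, float wavelength) {
        if (wavelength <= 0f) {
            throw new IllegalArgumentException("wavelength must be positive");
        }
        this.amplitude = amplitude;
        this.halfWavelength = wavelength / 2f;
    }

    @Override
    public Shape createStrokedShape(Shape shape) {
        Path2D.Float zigzag = new Path2D.Float();
        // Flatten curves into short line segments so we only deal with straight lines.
        PathIterator it = new FlatteningPathIterator(shape.getPathIterator(null), 1.0);
        float[] c = new float[6];
        float lastX = 0f, lastY = 0f;
        float next = halfWavelength; // distance along the segment to the next vertex
        int side = 1;                // which side of the path the next vertex goes on
        while (!it.isDone()) {
            int type = it.currentSegment(c);
            if (type == PathIterator.SEG_MOVETO) {
                zigzag.moveTo(c[0], c[1]);
                next = halfWavelength;
            } else if (type == PathIterator.SEG_LINETO) {
                float dx = c[0] - lastX, dy = c[1] - lastY;
                float len = (float) Math.hypot(dx, dy);
                if (len > 0f) {
                    float ux = dx / len, uy = dy / len;
                    while (next <= len) {
                        // Step along the segment, then offset perpendicular to it.
                        float px = lastX + ux * next, py = lastY + uy * next;
                        zigzag.lineTo(px - uy * amplitude * side, py + ux * amplitude * side);
                        side = -side;
                        next += halfWavelength;
                    }
                    next -= len; // carry the remainder into the following segment
                }
            }
            if (type != PathIterator.SEG_CLOSE) {
                lastX = c[0];
                lastY = c[1];
            }
            it.next();
        }
        // Give the zigzag some thickness with an ordinary 1-pixel pen.
        return pen.createStrokedShape(zigzag);
    }
}
```

Installing it is a matter of calling setStroke(new ZigzagStroke(2f, 4f)) on a Graphics2D before drawing the line under a token.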

Wrap pu

In this code, we extended the UI to paint lines under text. To change the display of the text itself, we would have to write our own View implementation (or more likely, extend PlainView [0]). This is not exclusive of the approach we took here. A more powerful View implementation could work in tandem with our custom UI, opening up even more ways to present information extracted from user-entered text.

Until next time!

[0] There’s a great introduction to this at Customizing a Text Editor, an article on the Sun Developer Network.

LICENSE — I wrote this using the Jazzy spell check engine + some open source trinkets (especially a Perlin noise generator). Except Jazzy (which is LGPL’d), all of it is Apache/BSD licensed.

