By Andreas Grabner
January 24, 2011 03:47 PM EST | Reads: 3,496
We use Confluence for both our internal wiki and our external Community Portal. I just came across a very nasty performance bug in the version we run on our external system: we run 3.2, and the User Search feature keeps me waiting several minutes each time I search for users. I analyzed the problem and want to give some answers to folks who have similar problems with either Confluence or search features in other software. It turns out that the typical N+1 Query and Requesting Too Much Data problem patterns are responsible for this performance problem.
While writing this article I reached out to a Performance Engineer at Atlassian to confirm this behavior and get some guidance. They were already aware of the problem, and a solution is on its way with Confluence 3.5. That's good news. But now to the problem we ran into and how I analyzed it.
We have almost 25k users on our external Confluence installation. I admit that is a lot, but Confluence works really well for us. When using the User Search feature, though, I always have to wait several minutes until I get to the empty User Search dialog where I can enter a filter on username or group membership.
When I search for a specific user or users of a specific group (where I know there is only a handful), the response of this dialog again takes about the same time to come back to me.
Analyzing Slow Request Time
We run dynaTrace on our Confluence installation, allowing me to analyze the internals of the request that opens the blank search dialog:
The PurePath shows that the Servlet first executes a query that returns all usernames and then executes the same SELECT statement for each individual username. Looking only at the database activity gives us an overview of how many statements are actually executed:
Each user is queried twice with an individual SQL statement, resulting in almost 50k SQL statements
The database view verifies my assumption that one SQL statement is executed for each user in the database. It also shows that the complete user database is queried twice, totaling almost 50k SQL statement executions. Remember: this is just opening the dialog that shows no results, just filter options. We can also look at the individual bind variables to verify that a different username gets passed to each of the SELECT * FROM userbase queries.
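The statement count is easy to reproduce in miniature. The following is only a sketch of the access pattern described above, not Confluence's actual code: it uses an in-memory SQLite database, and apart from the table name userbase (taken from the traced statements) the columns, the run helper and the function name are assumptions made for illustration.

```python
import sqlite3

# Hypothetical stand-in for the user table. Only the table name "userbase"
# comes from the traced statements; the columns are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE userbase (username TEXT PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO userbase VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

statement_count = 0


def run(sql, params=()):
    """Execute one statement and count it, as a database trace would."""
    global statement_count
    statement_count += 1
    return conn.execute(sql, params).fetchall()


def open_search_dialog_n_plus_one():
    # 1 statement: fetch every username.
    usernames = [row[0] for row in run("SELECT username FROM userbase")]
    # N statements per pass: fetch each user's details individually.
    # The trace showed this pass happening twice.
    for _ in range(2):
        for name in usernames:
            run("SELECT * FROM userbase WHERE username = ?", (name,))


open_search_dialog_n_plus_one()
print(statement_count)  # 1 + 2 * 1000 = 2001
```

With 1,000 users the sketch issues 2,001 statements. Scale the table to almost 25k users and the same pattern lands at almost 50k statements, which matches what the database view reports.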
When I enter filter criteria, either a username or a group name, I assumed the search would return much faster. It turns out, however, that even though the search result only contains the single user or the users in the specified group, the dialog takes almost as long to return as when I first opened it. Analyzing the PurePath for the request where I entered a specific username shows that there is one specific query for that username, which should be enough. After that query, however, we can see all the same queries for each user in the database again, and again twice!
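For comparison, here is a sketch of what the filtered search could look like if that one specific query really were the only one executed. It is an assumption-laden illustration, not the Confluence implementation: the search_users function, the run helper and the email column are hypothetical.

```python
import sqlite3

# Hypothetical user table; the column layout is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE userbase (username TEXT PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO userbase VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

executed = []  # every statement issued, kept for inspection


def run(sql, params=()):
    """Execute one statement and record it."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def search_users(username_filter=None):
    """Return matching users; a blank filter returns nothing and runs no query."""
    if not username_filter:
        return []
    return run(
        "SELECT username, email FROM userbase WHERE username = ?",
        (username_filter,),
    )


blank = search_users()        # opening the empty dialog
hits = search_users("user42")  # searching for one specific user
print(len(executed), hits)
```

Opening the blank dialog touches the database zero times, and the search for a single user costs exactly one parameterized statement regardless of how many users are in the table.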
Looking at all this I see the following problems:
Data is requested when it would not be necessary
When opening the Search Dialog it should not be necessary to execute any query against the user accounts, as the dialog comes up with an empty result anyway, prompting the user for filter criteria. Also, when querying for a specific username or a specific group it should not be necessary to fetch user details for every user in the database.
Queries are executed twice
There might be some duplicated code, or methods that are called twice while processing the page request. If the second pass is really necessary, caching the data from the first pass would definitely help here.
Lazy vs. Eager Loading (N+1 Query Problem)
The N+1 Query Problem means that a set of object identifiers is requested with one statement (the list of usernames in this example), and then a separate statement is executed for every object identifier to retrieve its details. We often see this behavior when O/R Mappers are used with a Lazy Loading fetching strategy where Eager Loading would make more sense. For more details on this, read the O/R Mapper related articles (Hibernate, JPA, …) from Alois Reitbauer.
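The difference between the two fetching strategies can be sketched without an O/R Mapper. The example below is a simplified illustration using plain SQL against an in-memory SQLite database; the CountingDb class, the load_lazy and load_eager functions and the table columns are hypothetical names chosen for this sketch.

```python
import sqlite3

# Hypothetical user table; the column layout is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE userbase (username TEXT PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO userbase VALUES (?, ?)",
    [(f"user{i:03d}", f"user{i:03d}@example.com") for i in range(200)],
)


class CountingDb:
    """Thin wrapper that counts how many statements were executed."""

    def __init__(self, connection):
        self.connection = connection
        self.statements = 0

    def query(self, sql, params=()):
        self.statements += 1
        return self.connection.execute(sql, params).fetchall()


def load_lazy(db):
    # 1 statement for the identifiers, then 1 statement per identifier.
    names = [row[0] for row in db.query(
        "SELECT username FROM userbase ORDER BY username")]
    return [
        db.query(
            "SELECT username, email FROM userbase WHERE username = ?", (name,)
        )[0]
        for name in names
    ]


def load_eager(db):
    # 1 statement that returns the details together with the identifiers.
    return db.query("SELECT username, email FROM userbase ORDER BY username")


lazy_db, eager_db = CountingDb(conn), CountingDb(conn)
lazy_rows = load_lazy(lazy_db)
eager_rows = load_eager(eager_db)
print(lazy_db.statements, eager_db.statements, lazy_rows == eager_rows)
# prints: 201 1 True
```

Both functions return the same rows. The lazy variant needs N+1 statements to get there, while the eager variant needs one, and every saved statement is also a saved round trip to the database.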
Confluence 3.5 addresses this problem
The online documentation indicates that performance might not be the best if you have a large number of users in the database. As I mentioned in the initial paragraph I reached out to Atlassian to ask for some guidance on solving this problem. They promptly got back to me with the answer that Confluence 3.5 is going to use a new user management system that will significantly improve performance for scenarios like ours. If anybody is interested check out the Development Releases that you can download for testing purposes.
Conclusion
We have written many blogs about these types of problems as we see them with our clients, and we have summarized the Top 10 Performance Problems with follow-up links to detailed articles. Any feedback on this blog is welcome. Thanks for reading.
Copyright © 2011 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Andreas Grabner
Andreas has over a decade of experience as an architect and developer, and currently works as a senior performance architect and technology strategist for dynaTrace Software, where he influences product strategy and works closely with customers in implementing performance management solutions across the application life cycle. He is a regular speaker at software conferences, writes for a number of technology publications, and blogs at http://blog.dynatrace.com