AppBoard 2.5 Builder: Caching & Polling Overview
Revision as of 09:37, 5 March 2015
This page summarizes the options available in AppBoard for caching and polling, and provides some recommendations for setting the caching and polling configuration to maximize performance and utility.
Definitions
AppBoard uses a demand-driven model of data collection: all requests for data originate from a client (Viewer or Builder), and the server then has to fulfill them. As a trivial example, when no clients are connected, the AppBoard server does not query data sources at all, unless Server Polling is enabled on Data Collections.
In a real-world deployment, the typical events that result in requests for data are:
- A client (Viewer or Builder) initially views a board.
- One or more visible widgets use Data Collections that are set to poll. By default, Data Collections are not configured for polling.
- A user interacts with the board, resulting in actions (switching to a new board, server-side filtering, etc.).
- An admin client (Builder) previews a Data Collection.
In this sense, polling refers specifically to Data Collections configured with a polling interval. Polling is driven by the client, and only for visible widgets that use a polling Data Collection. This is the recommended way to have the Viewer client update the display without requiring the end user to reload the client.
Regardless of how the client requests data, the server returns data according to the caching settings of the Data Sources. With caching disabled, the AppBoard server attempts to retrieve new data from the data source for every request. With a non-zero cache timeout, the AppBoard server delivers cached results if they exist and have not timed out; otherwise it fetches new data. Caching is only of benefit if multiple clients request the same data within the timeout window, or if a single client polls more frequently than the timeout window.
In addition, when Server Polling is enabled on Data Collections, the server registers jobs that prepare query results in advance, so that when a client polls, the results are returned immediately. The exception is when no job exists yet for the query, in which case a temporary job is created first.
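The cache timeout behaviour described above can be sketched as follows. This is a minimal illustration and not AppBoard's implementation: the class `TimedCache`, its `fetch` callback, the injectable `clock`, and the `demo` function are hypothetical names, and the timeout is assumed to be in seconds.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TimedCache:
    """Hypothetical sketch of a per-query cache with a timeout in seconds."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout   # 0 means caching is disabled
        self.clock = clock       # injectable so the behaviour can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}
        self.fetch_count = 0     # number of queries sent to the data source

    def get(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        # Serve the cached result only if it exists and has not timed out.
        if self.timeout > 0 and entry is not None and now - entry[0] < self.timeout:
            return entry[1]
        # Otherwise query the data source and remember the result.
        value = fetch()
        self.fetch_count += 1
        self._entries[key] = (now, value)
        return value


def demo() -> int:
    """Two clients request the same data 60 s apart with a 300 s timeout."""
    fake_now = [0.0]
    cache = TimedCache(timeout=300, clock=lambda: fake_now[0])
    cache.get("alarms", lambda: ["row"])  # first client: fetched from the source
    fake_now[0] = 60.0
    cache.get("alarms", lambda: ["row"])  # second client: served from the cache
    return cache.fetch_count
```

In this sketch `demo()` results in a single query to the data source, because the second request arrives inside the timeout window. With `timeout=0` every request would reach the data source.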
Configuration
- Refer to the specific Data Source pages for enabling caching.
- Refer to the Data Collections page for enabling polling.
Optimization
A balance has to be struck between keeping the client up to date and limiting the total number of queries the AppBoard server performs against external data sources.
Setting the Cache Timeout
- Consider how frequently the data is updated in the back-end data source. If the AppBoard server is querying source data that is updated every 20 minutes, and the cacheTimeout is set to 300 (5 minutes), this is not optimally efficient: about 75% of the time (three fetches out of every four) the server re-fetches source data that has not changed and is already in the server cache. On the other hand, if the source data is updated every 20 minutes and the cacheTimeout is set to 1200 (20 minutes), the data in the cache could be up to 40 minutes old by the time it is cached again: up to 20 minutes old when fetched, plus up to 20 minutes held in the cache.
- Consider how often the client will be polling for new data. For example, if the client is running a daily summary chart on a particular data source, simply updating the cache once per day may be sufficient for that purpose.
- Consider the complexity of the query that is being run against the server. If a simple query is being run, the cacheTimeout can be set to a low value (such as 1 minute or less) with minimal impact to the end-user.
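The arithmetic behind the first consideration can be written out as a short sketch. The function names are hypothetical, and the calculation assumes that clients request data continuously, so the cache is refreshed once per timeout window.

```python
def redundant_fetch_fraction(source_update_s: int, cache_timeout_s: int) -> float:
    """Fraction of cache refreshes that re-fetch unchanged source data."""
    if cache_timeout_s >= source_update_s:
        return 0.0  # the source changes at least as often as the cache refreshes
    fetches_per_update = source_update_s / cache_timeout_s
    # Only one fetch per source update can return new data.
    return (fetches_per_update - 1) / fetches_per_update


def worst_case_age_s(source_update_s: int, cache_timeout_s: int) -> int:
    """Worst-case age in seconds of cached data just before the next refresh."""
    # Data may be almost one update interval old when fetched,
    # and is then held for up to one cache timeout.
    return source_update_s + cache_timeout_s
```

For a source updated every 1200 seconds, `redundant_fetch_fraction(1200, 300)` gives 0.75, and `worst_case_age_s(1200, 1200)` gives 2400 seconds (40 minutes), matching the figures above.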
Setting the Client Polling Interval
- Consider how frequently the cache is refreshed for the Data Source used by that Data Collection. If the Data Source's cache is updated every 5 minutes, and the Polling is set to 60 (1 minute), this is not optimally efficient: about 80% of the time (four polls out of every five) the Data Collection polls data that has not changed and is already in the client's cache. On the other hand, if the Data Source's cache is updated every 5 minutes and the Polling is set to 300 (5 minutes), the data in the client could be up to 10 minutes old by the time it is refreshed. In this case, you might decide to set the Polling to 60 (1 minute), sacrificing efficiency to ensure that the data being displayed is never more than about 6 minutes old (up to 5 minutes in the server cache plus up to 1 minute between polls).
- Consider the Widgets that are using the Data Collection. If polling is set to every 10 seconds for a Widget that shows a monthly trend chart, this polling frequency will most likely cause unnecessary load on the client.
- Consider how many records are passed between the server and client at each polling interval. If the number of records returned is on the order of tens and not thousands (numbers will vary for any installation, depending on client performance and network speed), the client polling can be set to a very short interval. For example, returning a dozen records every 15 seconds should not impact performance. Returning hundreds of records every 15 seconds would slow the client.
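The trade-off in the first consideration can be checked with a small simulation. This is a simplified sketch under stated assumptions: the function name is hypothetical, and the server cache is modelled as changing on a fixed schedule, once every `cache_refresh_s` seconds.

```python
from typing import Optional, Tuple


def count_fresh_polls(cache_refresh_s: int, poll_interval_s: int,
                      duration_s: int) -> Tuple[int, int]:
    """Return (total polls, polls that delivered a new cache version)."""
    total = 0
    fresh = 0
    last_version: Optional[int] = None
    for t in range(0, duration_s, poll_interval_s):
        version = t // cache_refresh_s  # cache content changes once per refresh
        total += 1
        if version != last_version:
            fresh += 1
            last_version = version
    return total, fresh
```

Over one hour with a 5 minute cache refresh, `count_fresh_polls(300, 60, 3600)` reports 60 polls of which 12 deliver new data, so 80% of the polls return data the client already has. With `count_fresh_polls(300, 300, 3600)` all 12 polls deliver new data, at the cost of staler data on screen.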
Enabling Server Polling
- Consider how long the server response time is and how dynamic the Data Collection queries are. Queries are dynamic based on the query filter, and also when session variables or shim:query expressions are applied. Each unique query causes a unique scheduled job to be created. Use custom.properties to override settings such as the number of threads used to run server jobs or how often unused temporary jobs are cleaned up:
# Default server job thread count.
org.quartz.threadPool.threadCount=10
# Default thread priority.
org.quartz.threadPool.threadPriority=5
# Time in minutes before a temporary job is cleaned up.
appboard.pollingCleanupTime=5
# How often, in minutes, the cleanup of unused temporary jobs is run.
appboard.pollingCleanupJobRate=2
# Maximum time in seconds the server will wait when processing a job before
# returning results for the current Data Collection query. The remote timeout
# is passed in and will be used; the maximum only prevents locking up a thread.
appboard.pollingMaxWait=240
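To illustrate why dynamic queries increase the number of jobs, the following sketch models one job per unique resolved query, with cleanup of jobs that have not been polled recently. It is an assumption about the general mechanism and not AppBoard's scheduler: `JobRegistry`, `touch`, and `cleanup` are hypothetical names, and `CLEANUP_MINUTES` simply mirrors the appboard.pollingCleanupTime setting.

```python
from typing import Dict, Tuple

CLEANUP_MINUTES = 5  # mirrors appboard.pollingCleanupTime (assumed semantics)

QueryKey = Tuple[str, str]  # (data collection name, fully resolved filter)


class JobRegistry:
    """Hypothetical registry holding one job per unique resolved query."""

    def __init__(self) -> None:
        self._last_used: Dict[QueryKey, float] = {}  # key -> last poll, in minutes

    def touch(self, collection: str, resolved_filter: str, now_min: float) -> bool:
        """Record a client poll; return True if a new job had to be created."""
        key = (collection, resolved_filter)
        created = key not in self._last_used
        self._last_used[key] = now_min
        return created

    def cleanup(self, now_min: float) -> int:
        """Drop jobs not polled for CLEANUP_MINUTES; return how many were removed."""
        stale = [key for key, used in self._last_used.items()
                 if now_min - used >= CLEANUP_MINUTES]
        for key in stale:
            del self._last_used[key]
        return len(stale)

    def __len__(self) -> int:
        return len(self._last_used)
```

In this model, two users whose session variables resolve to different filters on the same Data Collection produce two jobs, while repeated polls with the same resolved filter reuse one job. The more a query varies per user, the more jobs the thread pool has to serve.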
The server polling rate is typically based on the client polling rate. If that rate is not set, it is based on the minimum cache timeout of the entity being queried or of any entity referenced by an association. Typically, server polling causes the data source to be polled at a slightly higher rate.
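The rate selection rule just described can be expressed as a short sketch. The function and parameter names are hypothetical, and all intervals are assumed to be in seconds.

```python
from typing import Iterable, Optional


def server_polling_rate_s(client_polling_s: Optional[int],
                          entity_cache_timeout_s: int,
                          associated_cache_timeouts_s: Iterable[int] = ()) -> int:
    """Pick the server polling interval in seconds."""
    if client_polling_s:  # a client rate that is set (non-zero) takes priority
        return client_polling_s
    # Otherwise fall back to the smallest cache timeout involved in the query.
    return min([entity_cache_timeout_s, *associated_cache_timeouts_s])
```

For example, with no client polling rate, an entity cache timeout of 300 and associated entities with timeouts of 120 and 600, `server_polling_rate_s(None, 300, [120, 600])` selects 120.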
Queries and Testing
In addition to setting the caching and polling intervals carefully, there are other elements of the system that should be controlled to maximize the performance of the AppBoard system. These include the following:
- In configuring Data Sources, server-side queries of data should be limited to exclude any data that will never be needed by AppBoard.
- The polling that takes place in the client has the greatest impact on the robustness of the AppBoard application. Data Collections should be configured to minimize both the amount of data requested and the frequency of that data being passed from the server to the client.
- When testing the system before it is released to production, different settings and combinations for polling and caching should be tested to help isolate any bottlenecks or inefficiencies.