Appboard/2.5/builder/caching and polling: Difference between revisions

imported>Doug yeager
imported>Jason.nicholls
 
{{DISPLAYTITLE:Caching & Polling Overview}}
[[Category:AppBoard 2.5]]
[[image:AdapterFlow.png]]
[[image:AdapterFlow.png|756px]]


This page summarizes the options available in AppBoard for caching and polling, and provides some recommendations for setting the caching and polling configuration to maximize performance and utility.
This page summarizes the options available in AppBoard for caching and polling, and provides some recommendations for setting the caching and polling configuration to maximize performance and provide the best user experience.


== Definitions ==
AppBoard uses a model of demand-driven data collection. By default, all requests for data originate from the client (Viewer or Builder), and the server then fulfills them. In other words, when no clients are connected, the AppBoard server does not query data sources at all. This default behaviour can be changed by enabling Server Polling on specific Data Collections.


== Definitions ==
In practice the following events result in requests for data:


AppBoard uses a model of demand driven data collection. This means that all requests for data originate from the client (Viewer or Builder) which the server then has to fulfill. In a trivial example, without any clients connected to the AppBoard server it does not perform any querying of data sources at all, unless Server Polling is enabled on Data Collections.
* Initially viewing a board with a client (Viewer or Builder).
* One or more ''visible'' widgets configured with Data Collection that have '''Client Polling''' enabled. By default Data Collections are not configured for Client Polling.
* Data Collections with '''Server Polling''' enabled in which case data requests will be made on the server without clients connected. By default Server Polling is disabled.
* User interaction with widgets resulting in actions (switching to a new board, server side filtering, etc...)
* An admin client (Builder) ''previews'' a data collection


In a real world example the typical events resulting in requests for data are:
When the AppBoard server receives a request for data it will service the request by either returning cached data, or by fetching new data from the data source. '''Caching''' is enabled as part of the Data Source configuration, either for the entire Data Source, or for individual Data Source entities.


* initially viewing a board with a client (Viewer or Builder).
Understanding when to use client polling, server polling, and setting appropriate caching is important to ensure a good user experience, reduce the load on data sources, and reduce the load on the AppBoard server.
* one or more ''visible'' widgets configured with Data Collections set to '''poll'''. By default Data Collections are not configured for polling.
* user interacts with the board resulting in actions (switching to a new board, server side filtering, etc...)
* an admin client (Builder) ''previews'' a data collection


In this sense '''polling''' specifically refers to Data Collections configured to have a polling interval. This is actually driven by the client only for visible widgets that have a polling Data Collection configured. This is the recommended way to have the Viewer client update the display without requiring the end-user reload the client.
When the data shown in a widget is updated at the source and those updates should appear in the widget automatically, turn on '''Client Polling'''. The AppBoard client then polls the server for updated data at the configured polling frequency. Client polling is only active for widgets on the currently visible Board.


Regardless of the method of requesting data from the client, the server actually returns data based on the '''caching''' settings of the Data Sources. With caching disabled this means the AppBoard server will attempt to retrieve new data from the data source for every request. With a non-zero cache timeout the AppBoard server will deliver cached results if they exist and haven't timed out, otherwise it will fetch new data. Caching is only of benefit if multiple clients are requesting the same data within the timeout window, or a single client happens to be polling more frequently than the timeout window.
When the data source is slow to respond, a client may have to wait for the response each time new data is fetched. How often this happens depends on the client polling interval and the cache settings of the data source. To avoid this problem, enable '''Server Polling''', which ensures the server always has a full cache and can respond to clients immediately.


== Configuration ==
# Consider the complexity of the query that is being run against the server.  If a simple query is being run, the cacheTimeout can be set to a low value (such as 1 minute or less) with minimal impact to the end-user.


=== Setting the Client Polling Interval ===
=== Client Polling ===
 
Client polling can impact the performance of the client as it is doing more work issuing requests, processing responses, and re-drawing widgets. It will also increase the amount of network traffic to/from the AppBoard server. There is also an impact on the AppBoard server having to process more requests which ties back into setting appropriate cache timeouts.
 
When configuring client polling consider:
 
# How frequently the source data is updated and the data source cache timeout. Pick a setting that provides an acceptable maximum data age but doesn't poll so frequently as to be of little benefit. The total maximum data age will be the cache_timeout + polling_frequency. For example, a 5 minute cache timeout and 1 minute polling frequency will mean in the ''worst case'' a client will see data up to <tt>5+1=6</tt> minutes old.
# User experience. Even when the source data is updated very frequently, disabling caching and setting a very fast polling frequency (down to 5 seconds) is not necessarily a good idea. Aside from the performance impact, dashboards with too many widgets updating too frequently can be visually distracting and difficult to follow. A general rule of thumb is to set client polling to between 1 and 5 minutes, depending on how frequently the source data is updated, and to use faster polling only where necessary.
# Size of the data set. Very large data sets will take longer to return to the client, longer to process by the client, and have a bigger impact on client performance as a result. For large data sets the client polling should be increased as much as is acceptable.
 
=== Server Polling ===
 
Enabling server polling schedules a job on the AppBoard server to make data requests for a Data Collection, just as a client would. The best use of this feature is to deal with slow to respond data sources by ensuring AppBoard always has data to return to clients immediately. Server polling jobs will continue at all times whether clients are connected or not, so will increase the ''idle'' load of the AppBoard server.


# Consider how frequently the cache is being refreshed for the Data Source being used by that Data Collection.  If the Data Source's cache is being updated every 5 minutes, and the Polling is set to 60 (1 minute), then this would not be optimally efficient since about 80% of the time the Data Collection would be polling the same old data that is already in the client's cache.  However, it should also be considered that if the Data Source's cache is being updated every 5 minutes, and the Polling is set to 300 (5 minutes), it is possible that the data in the client could be up to 10 minutes old by the time it is refreshed.  In this case, you might decide to set the Polling to 60 (1 minute), sacrificing efficiency in order to ensure that the data being displayed will never be more than 5 minutes old.
Please note with Server Polling enabled the AppBoard server will respond to clients with ''expired data'' if the cache timeout has been exceeded but fetching data from the source has not yet completed.
# Consider the Widgets that are using the Data Collection.  If you have polling set to every 10 seconds for a Widget that is showing a monthly trend chart, this frequency of polling will most likely be causing unnecessary load on the client.
# Consider how many records are being passed between the server and client at each polling interval.  If the number of records being returned are on the order of tens and not thousands (numbers will vary for any installation, depending on client performance and network speed), then the client polling could be set to a very short interval.  For example, returning a dozen records every 15 seconds should not impact performance.  Returning hundreds of records every 15 seconds would slow the client.


=== Enabling Server Polling ===
Server Polling does not have any settings of its own. Instead, it uses the Client Polling interval as the polling interval for the server job. If Client Polling is disabled, the interval defaults to 60 seconds. Although server polling occurs at this frequency, whether AppBoard requests new data from the data source depends on the cache timeout settings of the data source.


# Consider how long the server response time is and how dynamic the data collection queries are.  Queries are dynamic based on the query filter and also when session variables or shim:query expressions are applied. Each unique query causes a unique scheduled job to be created.  Use custom.properties to override settings such as the number of threads used to run server jobs or how often unused temporary jobs are cleaned up:
When enabling Server Polling consider:
  # Default server job thread count.
  org.quartz.threadPool.threadCount=10
  # Default thread priority.
  org.quartz.threadPool.threadPriority=5
  # Time in minutes before a temporary job is cleaned up.
  appboard.pollingCleanupTime=5
  # Interval in minutes at which cleanup of unused temporary jobs is run.
  appboard.pollingCleanupJobRate=2
  # Max time in seconds the server will wait when processing a job before returning results
  # for the current dc query.  The remote timeout is passed in and will be used.  The max is
  # only used to prevent locking up a thread.
  appboard.pollingMaxWait=240


The server polling rate will typically be based on the client polling rate. If that rate is not set, it will be based on the minimum cache timeout of the entity being queried or of any entity referenced by an association.  Typically the result of polling will cause the datasource to be polled at a slightly higher rate.
# If the data source is fast to respond, and an appropriate cache timeout is set, then Server Polling is ''not needed''.
# If the data is highly dynamic and caching is disabled then Server Polling is ''not appropriate''.
# If pulling new data from the data source is slow to respond, taking many seconds or even minutes, and a reasonable cache timeout can be configured, then Server Polling is a very good idea to improve client experience.


=== Queries and Testing ===

Latest revision as of 11:43, 5 March 2015

AdapterFlow.png

This page summarizes the options available in AppBoard for caching and polling, and provides some recommendations for setting the caching and polling configuration to maximize performance and provide the best user experience.

Definitions

AppBoard uses a model of demand-driven data collection. By default, all requests for data originate from the client (Viewer or Builder), and the server then fulfills them. In other words, when no clients are connected, the AppBoard server does not query data sources at all. This default behaviour can be changed by enabling Server Polling on specific Data Collections.

In practice the following events result in requests for data:

  • A client (Viewer or Builder) initially views a board.
  • One or more visible widgets are configured with Data Collections that have Client Polling enabled. By default, Data Collections are not configured for Client Polling.
  • Data Collections have Server Polling enabled, in which case data requests are made on the server even when no clients are connected. By default, Server Polling is disabled.
  • A user interacts with widgets, resulting in actions (switching to a new board, server-side filtering, etc.).
  • An admin client (Builder) previews a Data Collection.

When the AppBoard server receives a request for data it will service the request by either returning cached data, or by fetching new data from the data source. Caching is enabled as part of the Data Source configuration, either for the entire Data Source, or for individual Data Source entities.
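The cached-or-fetch decision described above can be sketched as a small time-based cache. This is only an illustration of the general technique, not AppBoard's implementation: the class name TimedCache, its parameters, and the injected clock are all hypothetical, and a timeout of zero is assumed to mean that caching is disabled.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TimedCache:
    """Illustrative time-based cache (hypothetical, not AppBoard code)."""

    def __init__(self, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        # Maps a query key to (time fetched, value).
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return a cached value if it is still fresh, else fetch a new one."""
        now = self._clock()
        entry = self._entries.get(key)
        # Assumption: a timeout of 0 disables caching, so every request fetches.
        if self.timeout_seconds > 0 and entry is not None:
            fetched_at, value = entry
            if now - fetched_at < self.timeout_seconds:
                return value
        value = fetch()
        self._entries[key] = (now, value)
        return value
```

With a 300 second timeout, repeated requests for the same key inside the window are served from the cache, and the first request after the window triggers a new fetch.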

Understanding when to use client polling, server polling, and setting appropriate caching is important to ensure a good user experience, reduce the load on data sources, and reduce the load on the AppBoard server.

When the data shown in a widget is updated at the source and those updates should appear in the widget automatically, turn on Client Polling. The AppBoard client then polls the server for updated data at the configured polling frequency. Client polling is only active for widgets on the currently visible Board.

When the data source is slow to respond, a client may have to wait for the response each time new data is fetched. How often this happens depends on the client polling interval and the cache settings of the data source. To avoid this problem, enable Server Polling, which ensures the server always has a full cache and can respond to clients immediately.

Configuration


Optimization

A balance has to be struck between keeping the client supplied with up-to-date information and limiting the total number of queries that the AppBoard server performs against external data sources.

Setting the Cache Timeout

  1. Consider how frequently the data is being updated in the back-end data source. If the AppBoard server is querying source data that is being updated every 20 minutes, and the cacheTimeout is set to 300 (5 minutes), then this would not be optimally efficient since about 75% of the time the server would be re-fetching old source data that is already in the server cache. However, it should also be considered that if the source data is being updated every 20 minutes, and the cacheTimeout is set to 1200 (20 minutes), it is possible that the data in the cache could be up to 40 minutes old by the time it is cached again.
  2. Consider how often the client will be polling for new data. For example, if the client is running a daily summary chart on a particular data source, simply updating the cache once per day may be sufficient for that purpose.
  3. Consider the complexity of the query that is being run against the server. If a simple query is being run, the cacheTimeout can be set to a low value (such as 1 minute or less) with minimal impact to the end-user.
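The arithmetic in item 1 can be reproduced with a short calculation. This is a rough sketch under simplifying assumptions that are not stated in the original: the source updates at a fixed interval, and client demand is constant so the cache is refreshed as soon as it expires. The function names are hypothetical.

```python
def redundant_fetch_fraction(update_interval_s: float, cache_timeout_s: float) -> float:
    """Fraction of cache refreshes that re-fetch data the cache already holds.

    Assumes the source changes once per update interval and the cache is
    refreshed every time it expires.
    """
    if update_interval_s <= 0 or cache_timeout_s <= 0:
        raise ValueError("intervals must be positive")
    fetches_per_update = update_interval_s / cache_timeout_s
    if fetches_per_update <= 1:
        # The cache is refreshed no more often than the source changes.
        return 0.0
    return 1.0 - 1.0 / fetches_per_update


def worst_case_cache_age_s(update_interval_s: float, cache_timeout_s: float) -> float:
    """Worst-case age of cached data: fetched just before a source update,
    then held for a full cache timeout."""
    return update_interval_s + cache_timeout_s


# Source updated every 20 minutes, cacheTimeout of 300 seconds.
print(redundant_fetch_fraction(1200, 300))       # 0.75
# Source updated every 20 minutes, cacheTimeout of 1200 seconds.
print(worst_case_cache_age_s(1200, 1200) / 60)   # 40.0 (minutes)
```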

Client Polling

Client polling can impact the performance of the client as it is doing more work issuing requests, processing responses, and re-drawing widgets. It will also increase the amount of network traffic to/from the AppBoard server. There is also an impact on the AppBoard server having to process more requests which ties back into setting appropriate cache timeouts.

When configuring client polling consider:

  1. How frequently the source data is updated and the data source cache timeout. Pick a setting that provides an acceptable maximum data age but doesn't poll so frequently as to be of little benefit. The total maximum data age is the cache timeout plus the polling interval. For example, with a 5 minute cache timeout and a 1 minute polling interval, in the worst case a client will see data up to 5+1=6 minutes old.
  2. User experience. Even when the source data is updated very frequently, disabling caching and setting a very fast polling frequency (down to 5 seconds) is not necessarily a good idea. Aside from the performance impact, dashboards with too many widgets updating too frequently can be visually distracting and difficult to follow. A general rule of thumb is to set client polling to between 1 and 5 minutes, depending on how frequently the source data is updated, and to use faster polling only where necessary.
  3. Size of the data set. Very large data sets take longer to return to the client and longer for the client to process, and so have a bigger impact on client performance. For large data sets, the client polling interval should be made as long as is acceptable.
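The trade-off in the list above can be estimated with two small helpers. This is a minimal sketch with hypothetical function names. It uses the worst-case age formula from item 1, and it assumes that every connected client polls one Data Collection at the same interval.

```python
def worst_case_data_age_s(cache_timeout_s: float, polling_interval_s: float) -> float:
    """Worst-case age of data shown in a widget: cache timeout plus polling interval."""
    return cache_timeout_s + polling_interval_s


def polls_per_hour(polling_interval_s: float, clients: int) -> float:
    """Requests per hour reaching the server for one polled Data Collection,
    assuming every client polls at the same interval."""
    if polling_interval_s <= 0:
        raise ValueError("polling interval must be positive")
    return 3600 / polling_interval_s * clients


# 5 minute cache timeout, 1 minute polling interval.
print(worst_case_data_age_s(300, 60) / 60)   # 6.0 (minutes)
# Ten clients polling every minute versus every 5 seconds.
print(polls_per_hour(60, 10))                # 600.0
print(polls_per_hour(5, 10))                 # 7200.0
```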

Server Polling

Enabling server polling schedules a job on the AppBoard server to make data requests for a Data Collection, just as a client would. The best use of this feature is to deal with slow-to-respond data sources by ensuring that AppBoard always has data to return to clients immediately. Server polling jobs run at all times, whether or not clients are connected, and so increase the idle load of the AppBoard server.

Note that with Server Polling enabled, the AppBoard server responds to clients with expired data if the cache timeout has been exceeded but the fetch from the source has not yet completed.

Server Polling does not have any settings of its own. Instead, it uses the Client Polling interval as the polling interval for the server job. If Client Polling is disabled, the interval defaults to 60 seconds. Although server polling occurs at this frequency, whether AppBoard requests new data from the data source depends on the cache timeout settings of the data source.
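The relationship between the server job interval and the cache timeout can be modelled with a short simulation. This is a simplified sketch, not AppBoard code: the function names are hypothetical, fetches are assumed to complete instantly, and a timeout of zero is assumed to mean that caching is disabled.

```python
from typing import Optional

DEFAULT_SERVER_POLL_S = 60  # default stated above when Client Polling is disabled


def server_poll_interval_s(client_polling_interval_s: Optional[int]) -> int:
    """Interval of the server polling job: the client interval, else 60 seconds."""
    if client_polling_interval_s is None or client_polling_interval_s <= 0:
        return DEFAULT_SERVER_POLL_S
    return client_polling_interval_s


def source_fetches(duration_s: int, poll_interval_s: int, cache_timeout_s: int) -> int:
    """Count how many server polls actually reach the data source.

    A poll only fetches from the source when the cached result is missing
    or at least cache_timeout_s old.
    """
    fetches = 0
    last_fetch = None
    for now in range(0, duration_s, poll_interval_s):
        if last_fetch is None or now - last_fetch >= cache_timeout_s:
            fetches += 1
            last_fetch = now
    return fetches


interval = server_poll_interval_s(None)      # Client Polling disabled
print(interval)                              # 60
print(source_fetches(3600, interval, 300))   # 12 source fetches from 60 polls
```

In this model the job runs 60 times in an hour, but with a 5 minute cache timeout only 12 of those runs query the data source.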

When enabling Server Polling consider:

  1. If the data source is fast to respond, and an appropriate cache timeout is set, then Server Polling is not needed.
  2. If the data is highly dynamic and caching is disabled then Server Polling is not appropriate.
  3. If pulling new data from the data source is slow to respond, taking many seconds or even minutes, and a reasonable cache timeout can be configured, then Server Polling is a very good idea to improve client experience.
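The three considerations above can be condensed into a small decision helper. This is an illustrative sketch only: the function and its return strings are hypothetical, and it assumes that any Data Source with caching disabled falls under item 2, which is a slightly broader reading than the original wording.

```python
def server_polling_advice(source_is_slow: bool, caching_enabled: bool) -> str:
    """Suggest whether Server Polling fits, following the three rules above."""
    if not caching_enabled:
        # Item 2: without a cache there is nothing for the server job to keep warm.
        return "not appropriate"
    if source_is_slow:
        # Item 3: a slow source with a reasonable cache timeout benefits most.
        return "recommended"
    # Item 1: a fast source with an appropriate cache timeout needs no help.
    return "not needed"


print(server_polling_advice(source_is_slow=True, caching_enabled=True))   # recommended
print(server_polling_advice(source_is_slow=False, caching_enabled=True))  # not needed
```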

Queries and Testing

In addition to setting the caching and polling intervals carefully, there are other elements of the system that should be controlled to maximize the performance of the AppBoard system. These include the following:

  • In configuring Data Sources, server-side queries of data should be limited to exclude any data that will never be needed by AppBoard.
  • The polling that takes place in the client has the greatest impact on the robustness of the AppBoard application. Data Collections should be configured to minimize both the amount of data requested and the frequency of that data being passed from the server to the client.
  • When testing the system before it is released in production, different settings and combinations for polling and caching should be tested to help isolate any bottlenecks or inefficiencies.