Search Performance

To understand search performance, it helps to look at how searches are executed.

A general set of rules applies:

General Prioritization Rules

  • Searches that are recent (less than 5 days) run on foreground threads, i.e. as fast as possible.
  • The number of threads is determined by the core count, with a minimum of 2.
  • Older, long-running searches run in background threads.
  • When a search starts taking too long, its priority is reduced.
  • Initial tasks are all given equal priority to allow interlacing of tasks by different searches and users.
  • Searches that exceed the default time-to-live (3 mins) will expire. The ttl can be set as a search parameter, e.g. ttl(10).
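
The rules above can be sketched in a few lines of Python. This is only an illustration, not Logscape's implementation: the function names are hypothetical, and it assumes that "recent" is measured from the start of the searched time range.

```python
import os
from datetime import datetime, timedelta

RECENT_WINDOW = timedelta(days=5)   # "recent" threshold from the rules above
DEFAULT_TTL_MINUTES = 3             # default time-to-live from the rules above


def worker_thread_count(core_count=None):
    """Thread count follows the core count, with a floor of 2."""
    cores = core_count if core_count is not None else (os.cpu_count() or 1)
    return max(2, cores)


def choose_pool(range_start, now):
    """Recent searches go to foreground threads, older ones to background.

    Assumption: recency is judged by the start of the searched time range.
    """
    return "foreground" if now - range_start < RECENT_WINDOW else "background"


def has_expired(elapsed_minutes, ttl_minutes=DEFAULT_TTL_MINUTES):
    """A search expires once it has run longer than its time-to-live."""
    return elapsed_minutes > ttl_minutes
```

For example, a single-core machine still gets 2 threads, and a search that has run for 4 minutes has expired under the default ttl but not under ttl(10).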

The Map-Reduce execution plan

A search runs through the steps listed below. Once they complete, the dashboard applies further reduction stages, and finally top()-style post filters are applied. This includes any post-level filtering such as GreaterThan or LessThan, which is applied to the aggregate results.

  • _host filters are evaluated.
  • _filename and _path filters are applied in determining content to search.
  • User permissions are applied.
  • Large content is broken down into multiple tasks.
  • All tasks execute concurrently.
  • As each search head processes an event it is scanned via index or meta-time-based index.
  • When a search-level filter is applied, the event is filtered before any parsing takes place, e.g.
    WARN AND ERROR : _type.equals(xxx)
    Note that a hitLimit can also be applied.
  • When the content-level filter passes, the event is mapped to fields and field-level filtering is applied (msg.contains(), cpu.gt(), etc.).
  • Results are aggregated and reduced before being sent back to aggregation engines and the dashboard.
    * | cpu.avg(_host)
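
The order of these steps can be illustrated with a small Python sketch. It is not Logscape code: the event layout ("cpu=80"), the function names and the sample data are all assumptions made for the example. Each task filters raw lines before parsing, applies a field-level filter, and returns partial sums that a reduce step merges into a per-host average, in the spirit of cpu.avg(_host).

```python
from collections import defaultdict

# Hypothetical raw events as (host, raw line) pairs; the layout is an assumption.
EVENTS = [
    ("web1", "WARN cpu=80"),
    ("web1", "INFO cpu=20"),
    ("web2", "ERROR cpu=90"),
    ("web1", "WARN cpu=60"),
    ("web2", "WARN cpu=40"),
]


def passes_text_filter(line, terms):
    """Search-level filter: cheap substring check on the raw line, before parsing."""
    return all(term in line for term in terms)


def parse_fields(line):
    """Map an event to key=value fields; only called for lines that passed the filter."""
    fields = {}
    for token in line.split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return fields


def map_task(events, terms, min_cpu):
    """Map stage for one task: text filter, parse, field filter, partial aggregate."""
    partial = defaultdict(lambda: [0.0, 0])   # host -> [sum, count]
    for host, line in events:
        if not passes_text_filter(line, terms):
            continue
        cpu = float(parse_fields(line)["cpu"])
        if cpu > min_cpu:                     # field-level filter, like cpu.gt()
            partial[host][0] += cpu
            partial[host][1] += 1
    return partial


def reduce_avg(partials):
    """Reduce stage: merge partial sums and counts into per-host averages."""
    merged = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for host, (total, count) in partial.items():
            merged[host][0] += total
            merged[host][1] += count
    return {host: total / count for host, (total, count) in merged.items()}


# Content split into two tasks, mapped independently, then reduced.
tasks = [EVENTS[:2], EVENTS[2:]]
result = reduce_avg(map_task(task, ("WARN",), 50) for task in tasks)
```

Because each task returns sums and counts rather than averages, the partial results can be merged in any order without changing the final value.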

Search Level Optimizations

Making a few changes to the search syntax can provide performance gains. It is also recommended that you read the searching functions page to better understand the tools at your disposal. A few tips:

  • Use a text filter instead of contains(string); a text filter is evaluated before any parsing takes place.
  • ttl(3) - Logscape times out a search after 3 mins and displays the data collected while it was running. To increase this, use the ttl function and set the number of minutes you want the search to stay open. Use this when searching over large volumes of data where speed is not a requirement.
  • hitlimit(number of events) - The hitLimit stops collecting data from an Agent once the set limit has been reached. This gives the user estimates and trends of the data being analyzed. Use this function when speed is a requirement.
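
The trade-off between ttl and hitLimit can be sketched as a collection loop that stops on whichever limit is reached first and returns what it has gathered so far. This is a hypothetical Python illustration of a single agent's collection, not Logscape's API; the collect function and its clock parameter are assumptions introduced for the example.

```python
import time


def collect(events, hit_limit=None, ttl_minutes=3, clock=time.monotonic):
    """Collect events until the hit limit or the time-to-live is reached.

    Either way, the partial results gathered so far are returned.
    """
    deadline = clock() + ttl_minutes * 60
    results = []
    for event in events:
        if clock() >= deadline:
            break                      # ttl exceeded: keep what was collected
        results.append(event)
        if hit_limit is not None and len(results) >= hit_limit:
            break                      # hit limit reached: stop collecting early
    return results
```

A low hit limit returns quickly with a sample that is enough to show trends, while a larger ttl lets the loop run longer over large volumes of data.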