Datatypes Tutorials

A Datatype is a loose schema that is assigned to files. This schema is used to extract fields and add structure to data being analysed.

You can use the following search syntax to filter your results by type:

  • _type.equals(MYNEWTYPE)
  • _type.includes(java-)
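
As a rough illustration of how these two filters differ, the sketch below assumes that equals is an exact match on the type name and includes is a substring match. That reading is an assumption based on the names, not confirmed behaviour, and the type names other than MYNEWTYPE are hypothetical.

```python
# Illustrative only: models the ASSUMED semantics of the two filters.
# equals   -> exact match on the type name (assumption)
# includes -> substring match on the type name (assumption)

def type_equals(data_type, name):
    """Return True when the type name matches exactly."""
    return data_type == name


def type_includes(data_type, fragment):
    """Return True when the type name contains the fragment."""
    return fragment in data_type


# Hypothetical type names, apart from MYNEWTYPE.
types = ["MYNEWTYPE", "java-gc", "java-log4j", "syslog"]

exact = [t for t in types if type_equals(t, "MYNEWTYPE")]
partial = [t for t in types if type_includes(t, "java-")]

print(exact)    # ['MYNEWTYPE']
print(partial)  # ['java-gc', 'java-log4j']
```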

Assigning a type to your data lets you go beyond simple keyword searches and run analytics on the fields extracted from it.

Extracting Fields

In this video we will create a new type that will extract the following fields:

  • date
  • time
  • level
  • package
  • thread
  • msg

You can use the sample text below to follow the example in the video:

2013-12-05 16:44:16,930 INFO orm-SHARED-5-1 (netty.NettyEndPoint)	 Stats:SHARED/stcp://10.28.1.154:11003 SendMsg:28 SendKb: 11 RecvMsg:29 RecvKb:9 ]
2013-12-05 16:44:25,978 INFO agent-local-sched-18-1 (agent.ResourceAgent)	AGENT QA-UK-FW-W82-11003-0 CPU:2 MemFree:348 MemUsePC:22.67 DiskFree:35898 DiskUsePC:29.75 SwapFree:1532 SwapUsePC:50.11
2013-12-05 16:44:25,978 INFO agent-local-sched-18-1 (agent.ResourceAgent)	AGENT QA-UK-FW-W82-11003-0 MEM MB MAX:990 COMMITED:447 USED:296 AVAIL:694 SysMemFree:348 TimeDelta:317 Threads:243
2013-12-05 16:45:16,933 INFO orm-SHARED-5-1 (netty.NettyEndPoint)	 Stats:SHARED/stcp://10.28.1.154:11003 SendMsg:47 SendKb: 19 RecvMsg:18 RecvKb:6 ]
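
If you want to experiment with the sample text outside the product, the six fields can be pulled out with a single regular expression. The sketch below is a hypothetical Python pattern written against the sample lines above. It is not the expression built in the video, and the names LINE_PATTERN and extract_fields are made up for illustration.

```python
import re

# Hypothetical pattern for the sample lines above (not the one from the video).
# Line layout: date time level thread (package)<whitespace>msg
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<thread>\S+)\s+"
    r"\((?P<package>[^)]+)\)\s*"
    r"(?P<msg>.*)$"
)


def extract_fields(line):
    """Return a dict of the six fields, or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


sample = (
    "2013-12-05 16:44:25,978 INFO agent-local-sched-18-1 (agent.ResourceAgent)\t"
    "AGENT QA-UK-FW-W82-11003-0 CPU:2 MemFree:348 MemUsePC:22.67 "
    "DiskFree:35898 DiskUsePC:29.75 SwapFree:1532 SwapUsePC:50.11"
)

fields = extract_fields(sample)
print(fields["level"], fields["thread"], fields["package"])
# INFO agent-local-sched-18-1 agent.ResourceAgent
```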

Creating Synthetic Fields

Synthetics are dynamic fields that do not occur on every line, e.g. usernames, heartbeats, CPU stats and so on.

Looking at the sample data above, we can see that memFree, sendMsg and sendKb do not occur on every line. We can use a synthetic field to extract these values.

A synthetic field needs a source field and a synthetic function. The synthetic function is applied to the contents of the predefined source field. The synthetic function can be a pattern, a text function like split, an arithmetic evaluator or a Groovy script.

In the video a pattern expression is used.
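
The idea of applying a pattern to a source field can be sketched in Python as follows. This is a minimal illustration, assuming msg is the source field. The function name synthetic_field and both patterns are hypothetical and are not the expressions used in the video.

```python
import re


def synthetic_field(source_value, pattern):
    """Apply a pattern to a source field value.

    Returns the first capture group, or None when the value is
    absent from this line (synthetic fields are optional).
    """
    match = re.search(pattern, source_value)
    return match.group(1) if match else None


# msg values taken from the sample lines above.
agent_msg = "AGENT QA-UK-FW-W82-11003-0 CPU:2 MemFree:348 MemUsePC:22.67"
stats_msg = "Stats:SHARED/stcp://10.28.1.154:11003 SendMsg:28 SendKb: 11 RecvMsg:29 RecvKb:9 ]"

# Hypothetical patterns; \b stops MemFree from matching inside SysMemFree.
MEM_FREE = r"\bMemFree:\s*(\d+)"
SEND_KB = r"\bSendKb:\s*(\d+)"

print(synthetic_field(agent_msg, MEM_FREE))  # 348
print(synthetic_field(stats_msg, SEND_KB))   # 11
print(synthetic_field(stats_msg, MEM_FREE))  # None
```

Returning None for a missing value mirrors the fact that a synthetic field is only present on the lines where its pattern matches.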

Saving Your Datatype

Each datatype needs:

  • a name
  • a tag or directory mask
  • a file mask

A tag refers to one of your data sources. Using a tag makes your datatype independent of the physical location of the data source, so it is recommended over the directory mask syntax:

tag:DataSourceTagName

A comma-separated list of tags and directory masks is also a valid entry:

tag:dev-gc-logs, ./opt/tomcat/logs,tag:uat-gc-logs
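
To make the structure of such an entry concrete, the sketch below splits it into tags and directory masks. It assumes that any item starting with tag: is a tag and that everything else is a directory mask. The function name parse_sources is hypothetical and does not reflect how the product itself parses the entry.

```python
def parse_sources(entry):
    """Split a comma-separated entry into (tags, directory_masks).

    Assumption: items prefixed with "tag:" are tags; all other
    non-empty items are treated as directory masks.
    """
    tags, dir_masks = [], []
    for item in entry.split(","):
        item = item.strip()  # tolerate spaces after commas
        if not item:
            continue
        if item.startswith("tag:"):
            tags.append(item[len("tag:"):])
        else:
            dir_masks.append(item)
    return tags, dir_masks


tags, dir_masks = parse_sources("tag:dev-gc-logs, ./opt/tomcat/logs,tag:uat-gc-logs")
print(tags)       # ['dev-gc-logs', 'uat-gc-logs']
print(dir_masks)  # ['./opt/tomcat/logs']
```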