Log files vary widely in structure and format. Some are very well structured, like CSV files, while others are fairly loose and application dependent.
A type applies a tabular structure over incoming data. Each row in the table corresponds to a single timestamped event and each column is a field and represents the data extracted from this event. The tabular structure is a loose schema for extracting commonly occurring features in the data.
Each type needs to be applied to data already in the system. The location of the data is identified by a tag (see Using Data Sources) or by a directory.
On the data types page, the Dir/Tag option can be a comma-separated list of directories and tags. The example below assigns the data type to the logscape-logs data source and a custom application log directory:
tag:logscape-logs,/opt/vserver/logs
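As a rough illustration of how such a list breaks down, the Python sketch below separates a Dir/Tag value into tags and directories. Only the `tag:` prefix and the sample value come from the example above; the function name `split_dir_tag` and the returned structure are hypothetical and do not describe Logscape's internals.

```python
def split_dir_tag(value):
    """Split a comma-separated Dir/Tag value into (tags, directories).

    Hypothetical helper for illustration only; not part of Logscape.
    """
    tags, dirs = [], []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        if entry.startswith("tag:"):
            # entries prefixed with "tag:" name a data source tag
            tags.append(entry[len("tag:"):])
        else:
            # anything else is treated as a directory path
            dirs.append(entry)
    return tags, dirs


tags, dirs = split_dir_tag("tag:logscape-logs,/opt/vserver/logs")
print("tags:", tags)
print("directories:", dirs)
```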
The structure of the data is defined using patterns or a split function. The diagram below shows the basic structure of a Java application log. The first column refers to the timestamp, the second to the level, and so on. A regular expression pattern is used to assign matches to fields.
Log4j Example
The pattern for this application log is shown below. Each group, identified by parentheses, is mapped sequentially onto the field table on the types page: the first row in the table corresponds to the first group in the pattern expression, the second row to the second group, and so on. In the example below, the first group, (2*), will match all dates, for example 2014-04-06, 2014-04-03 and so on. The second group will match the time field. The pattern in this case can be quite relaxed, since the first two columns of this application log always hold the date and the time.
^(2*)\s+(*)\s+(INFO|DEBUG|WARN|ERROR|FATAL)\s+(*)\s+(*)\s+(**)
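The sequential group-to-field mapping described above can be sketched in Python. This is a minimal sketch under stated assumptions, not Logscape's implementation: the shorthand groups are translated into standard regular expression syntax on the assumption that (2*) means "a token starting with 2", (*) means "a single token" and (**) means "the rest of the line". The field names and the sample log line are invented for illustration.

```python
import re

# Assumed translation of the pattern shorthand into standard regex syntax:
#   (2*) -> (2\S*)   a token starting with "2" (the date)
#   (*)  -> (\S+)    a single whitespace-delimited token
#   (**) -> (.*)     the remainder of the line
PATTERN = re.compile(
    r"^(2\S*)\s+(\S+)\s+(INFO|DEBUG|WARN|ERROR|FATAL)\s+(\S+)\s+(\S+)\s+(.*)"
)

# Hypothetical field names, listed in the same order as the pattern groups.
FIELDS = ["date", "time", "level", "thread", "class", "msg"]


def extract_fields(line):
    """Return a dict mapping field name -> matched group, or None if no match."""
    match = PATTERN.match(line)
    if match is None:
        return None
    # Group 1 goes to the first field, group 2 to the second, and so on.
    return dict(zip(FIELDS, match.groups()))


# Made-up sample line in a log4j-like layout.
sample = "2014-04-06 10:15:30,123 INFO main com.example.Service Started in 42 ms"
row = extract_fields(sample)
for name in FIELDS:
    print(name, "=", row[name])
```

Each matching line yields one row, and each group yields one column of that row, which is the tabular structure a type applies to the data.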
To test how the expression maps to the type fields, click the [TEST] button. This button does quite a few things:
This diagram shows how the regular expression groups will be mapped to the fields.
Log4j Example
Each group is mapped onto a field in sequential order. Each field can be configured by clicking on the field name. The example below shows the mapping for a Java application log.
Once you are happy with the data type, click save and explore the data from the search page. The following search returns all log4j data ingested by Logscape.
| _type.equals(log4j)
The fields defined in the type will appear in the fields section of the search page.
When creating a field, it is possible to select the "index" option. With this option selected, the field is computed and indexed at ingest time rather than at search time, which results in much faster searches. However, it also means the value is only available when looking at the data via the raw table view, rather than the Event or Table views.
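The general trade-off between computing a field at search time and indexing it at ingest time can be illustrated with a small Python sketch. It is a generic illustration only and does not describe how Logscape stores or indexes data; the sample events, the `LEVEL` pattern and both function names are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical field extractor: pulls the log level out of a line.
LEVEL = re.compile(r"\b(INFO|DEBUG|WARN|ERROR|FATAL)\b")

# Made-up sample events.
events = [
    "2014-04-06 10:15:30 INFO service started",
    "2014-04-06 10:15:31 ERROR connection refused",
    "2014-04-06 10:15:32 INFO request handled",
]


def search_time_lookup(events, level):
    """Search-time approach: run the extractor over every event on each search."""
    hits = []
    for i, line in enumerate(events):
        m = LEVEL.search(line)
        if m and m.group(1) == level:
            hits.append(i)
    return hits


def build_index(events):
    """Ingest-time approach: extract once and record which events hold each value."""
    index = defaultdict(list)
    for i, line in enumerate(events):
        m = LEVEL.search(line)
        if m:
            index[m.group(1)].append(i)
    return index


index = build_index(events)            # cost paid once, when data arrives
from_scan = search_time_lookup(events, "INFO")
from_index = index["INFO"]             # later searches become a dictionary lookup
print(from_scan == from_index)
```

Both approaches find the same events; the indexed version moves the extraction cost from every search to a single pass at ingest.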