1TB per day is a daunting amount of data; however, with a small amount of pre-planning, ingesting, indexing and searching this volume is a straightforward task when using Logscape.
Whilst a single Indexstore could theoretically ingest and index 1TB per day, search performance would suffer severely. Introducing additional Indexstores allows the data volume to be separated into manageable clusters, letting us leverage the increased processing and disk I/O capacity. For this specific deployment, we will install six well-provisioned Indexstores.
Assumptions

Using the specifications stated below, we get the following when comparing a 1 Indexstore environment against a 6 Indexstore environment.
Feature | 1x Indexstore | 6x Indexstore |
Disk Speed | 2GB/sec | 12GB/sec |
Network Bandwidth | 2.5GB/sec | 2.5GB/sec |
Search Speed (1TB) | 500 seconds | 83 seconds |
GB Searched Per Second | 2GB | 12GB |
Events Per Second (2KB per event) | 1,000,000 | 6,000,000 |
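The figures in the table above follow from simple arithmetic on the aggregate disk speed. A minimal sketch, using the table's own assumptions (a 1TB dataset, 2KB events, decimal units):

```python
def search_stats(disk_gb_per_sec, dataset_tb=1.0, event_kb=2.0):
    """Estimate search duration and event throughput from raw disk read speed."""
    dataset_gb = dataset_tb * 1000                      # decimal units, as in the table
    duration_sec = round(dataset_gb / disk_gb_per_sec)  # seconds to scan the dataset
    events_per_sec = int(disk_gb_per_sec * 1e9 / (event_kb * 1e3))
    return duration_sec, events_per_sec

print(search_stats(2))   # 1x Indexstore at 2GB/sec
print(search_stats(12))  # 6x Indexstores at 12GB/sec aggregate
```

Running this reproduces the 500-second and 83-second search durations, and shows that at 2KB per event, 12GB/sec of aggregate read speed corresponds to 6,000,000 events per second.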
Spread across six Indexstores, our ingest requirement of roughly 12MB/sec (1TB per day) becomes insignificant. However, even with the data spread evenly between hosts, at maximum retention each host will be responsible for searching 5TB of data, so disk speed and in-memory caching are still a concern.
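The ingest and per-host retention figures can be checked with the same back-of-envelope approach, assuming 1TB per day, 30 days of retention, and six Indexstores:

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

daily_tb = 1.0                   # 1TB ingested per day across the cluster
indexstores = 6
retention_days = 30

# Cluster-wide ingest rate in MB/sec (1TB = 1e6 MB, decimal units)
ingest_mb_per_sec = daily_tb * 1e6 / SECONDS_PER_DAY

# Data each host is responsible for searching at full retention
per_host_tb = daily_tb * retention_days / indexstores

print(f"ingest: ~{ingest_mb_per_sec:.1f} MB/sec, per host: {per_host_tb} TB")
```

1TB per day works out to roughly 11.6MB/sec of sustained ingest, and 30 days of data divided across six hosts leaves each Indexstore searching 5TB.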
Network Stats

Network requirements will vary with the individual I/O capability of your Indexstores, but the network must be provisioned to accommodate both data arriving at the Indexstores and the capacity needed to transfer data to the Manager at search time. To benchmark your network you can follow the steps here. For this scenario we will assume each Indexstore is connected via two bonded fibre channel connections, for an estimated bandwidth of 2.5GB/sec.
In order to gather accurate metrics it is recommended that you use a tool such as IOzone. However, rough assumptions can be made based upon disk type.
Type | Disk Speed |
Mechanical Drive | 150MB/sec |
SSD | 500MB/sec |
SSD x2 | 1GB/sec |
(SSD) RAID 5+1 | 2.5GB/sec |
Using these device speeds, in a scenario where we are searching 10 GB of data, we get the following search times.
Device | Dataset Size | IO Rate | Search Duration |
Mechanical Disk | 10GB | 150MB/sec | 66 seconds |
SSD | 10GB | 500MB/sec | 20 seconds |
SSD x2 | 10GB | 1GB/sec | 10 seconds |
(SSD) RAID 5+1 | 10GB | 2.5GB/sec | 4 seconds |
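These durations are simply the dataset size divided by the device's read rate. A short sketch reproducing the table (decimal units, durations truncated to whole seconds):

```python
# Device read speeds in MB/sec, taken from the disk-speed table above
disk_speeds_mb = {
    "Mechanical Disk": 150,
    "SSD": 500,
    "SSD x2": 1000,
    "(SSD) RAID 5+1": 2500,
}

dataset_gb = 10  # size of the data being searched

# duration = dataset size / IO rate, in seconds
durations = {device: int(dataset_gb * 1000 / speed)
             for device, speed in disk_speeds_mb.items()}

for device, secs in durations.items():
    print(f"{device}: {secs} seconds")
```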
Device | Manager | IndexStore | Forwarder |
Disk | RAID 5 (4+1), disks with at least 500MB/sec R/W | RAID 5 (4+1), disks with at least 500MB/sec R/W | - |
CPU | 48-core modern CPU | 48-core modern CPU | - |
Memory | 64GB, minimum of 8GB allocated to the Dashboard/AggSpace process | 64GB | 512MB |
Manager sizing is dependent upon the average size of a search, the number of concurrent users, and the number of pre-configured alerts. General advice is that the system should have enough RAM to cache search-time data, whilst also allowing an additional 5-10MB per dashboard search and alert.
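As a rough sizing sketch, the Manager's memory need can be estimated from the 8GB process minimum plus the 5-10MB per dashboard search and alert figure. The dashboard and alert counts below are hypothetical examples, not recommendations:

```python
BASE_AGGSPACE_MB = 8 * 1024   # minimum 8GB for the Dashboard/AggSpace process
PER_SEARCH_MB = 10            # upper end of the 5-10MB per search/alert guidance

dashboard_searches = 40       # hypothetical: concurrent dashboard searches
alerts = 25                   # hypothetical: pre-configured alerts

required_mb = BASE_AGGSPACE_MB + (dashboard_searches + alerts) * PER_SEARCH_MB
print(f"{required_mb} MB (~{required_mb / 1024:.1f} GB) for the AggSpace process")
```

Under these assumptions the process needs roughly 8.6GB, comfortably inside a 64GB host while leaving the remainder for OS-level caching of search-time data.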
Indexstores require high read speeds in order to search their indexes for the data requested by searches. Ideally every Indexstore should offer similar read performance: because of the load-balanced nature of a Logscape deployment, a search will be held up by the single Indexstore with the slowest read capability.
Forwarders are lightweight agents for which disk and CPU are of little concern, so long as disk I/O and CPU resources are not saturated. The agent itself is designed to use a minimal amount of RAM, and is operationally capable with as little as 256MB allocated.
The key value to note here is the search speed. The input requirements for 1TB of data per day are relatively low; however, serving 30 days' worth of collated storage back in the form of a search within a short span of time requires multiple Indexstores.