NI DIAdem can ingest a CSV text file with 87,264,000 random IEEE-754 64-bit floating-point values at a rate of 722,982 values/s, or 53.8 GiB/hr. Subsequent reading of the TDMS file after it is created takes about 0.1 sec (1.36E+09 values/s), and writing any changes takes about 3.0 sec (2.86E+07 values/s).
NI DIAdem data ingestion performance was measured to be at least 27x faster than the following online time series databases: InfluxDB, Cassandra, Elasticsearch, MongoDB, OpenTSDB, Graphite, and Splunk. The comparison is made on a per-server basis.
NI DIAdem is software for managing, storing, analyzing, and visualizing acquired measurement data. It is optimized to support the ingestion, storage, and analysis of test data from a wide variety of time series measurement data file formats.
InfluxDB is an open-source time series database (TSDB) for storage and retrieval of time series data. The TSDB is hosted in the cloud and tools are available online for data analysis and visualization.
As of October 2022, InfluxDB has whitepapers posted on their website demonstrating faster write (data ingestion) performance than the following competing time series databases: Cassandra (5x faster), Elasticsearch (3.8x faster), MongoDB (1.9x faster), OpenTSDB (5x faster), Graphite (14x faster), and Splunk (17x faster). The comparison characterizes the write (data ingestion) performance in terms of values per second. The values vary in data type and randomly by value, resulting in a random data package size. This makes it easy to assess the data ingestion performance for a variety of applications, but difficult to convert that ingestion rate to bytes/second/server for comparison to systems outside of those covered in the whitepapers. In all of the comparisons, 100 servers were used concurrently to process 87,264,000 values and the performance was measured over the ingestion of 100 values.
In the comparison of InfluxDB to Elasticsearch, the whitepaper claims the average write (ingestion) throughput of InfluxDB was 2,674,948 values per second utilizing 100 servers (or 26,749 values/s per server). In the comparison of InfluxDB to MongoDB, the whitepaper cites a write (ingestion) throughput for InfluxDB of 2,644,765 values per second utilizing 100 servers (or 26,447 values/s per server).
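The per-server figures above come from dividing each aggregate rate by the number of servers. A minimal Python sketch of that normalization follows; the function name `per_server_rate` is only illustrative, and the calculation assumes the load was spread evenly across all 100 servers, which the whitepapers do not state explicitly.

```python
def per_server_rate(total_values_per_s: float, servers: int) -> float:
    """Divide an aggregate ingestion rate evenly across the servers that produced it."""
    return total_values_per_s / servers


# Aggregate InfluxDB rates quoted above, each measured with 100 servers.
influx_vs_elasticsearch = per_server_rate(2_674_948, 100)
influx_vs_mongodb = per_server_rate(2_644_765, 100)

# The article truncates these results to whole values per second.
print(f"{influx_vs_elasticsearch:,.2f} values/s per server")
print(f"{influx_vs_mongodb:,.2f} values/s per server")
```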
In InfluxDB's performance comparison to MongoDB, a total of 87,264,000 values were created in the test data set, and then the performance was measured as 100 values were ingested. This type of measurement is difficult to replicate in DIAdem without adversely affecting the ingestion process.
I wrote an NI DIAdem script to create a CSV text file with 87,264,000 values consisting of 872,640 lines, with each line containing a Unix timestamp with nanosecond precision, followed by 100 random IEEE-754 64-bit floating-point numbers, all delimited by semicolons (e.g. 1676996959000000000;3.94714746398025E+299;4.55576917883636E+299;..). It took NI DIAdem 120.7 sec to read the 1.80 GiB uncompressed text file and write it to a new TDMS binary file, for an average ingestion rate of 722,982 values/s (87,264,000 values / 120.7 sec), or 53.8 GiB/hr. Reading the TDMS file after it is created takes about 0.1 sec (1.36E+09 values/s), and writing any changes takes only about 3.0 sec (2.86E+07 values/s).
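For readers who want to produce a file with the same layout, here is a rough Python sketch. It is not the DIAdem script used for the measurement. The function name `write_test_csv`, the value range, and the 1 ms spacing between timestamps are assumptions made for illustration, and Python prints exponents in lowercase (`e+299`) rather than the uppercase form shown in the example line.

```python
import csv
import io
import random
import time


def write_test_csv(stream, lines: int, values_per_line: int = 100, seed: int = 0) -> int:
    """Write semicolon-delimited rows: a nanosecond Unix timestamp, then random floats.

    Returns the number of floating-point values written (timestamps are not counted).
    """
    rng = random.Random(seed)
    writer = csv.writer(stream, delimiter=";", lineterminator="\n")
    timestamp_ns = time.time_ns()
    for _ in range(lines):
        # Assumed value range; the distribution used in the original test is not stated.
        values = [repr(rng.uniform(0.0, 1.0e300)) for _ in range(values_per_line)]
        writer.writerow([timestamp_ns] + values)
        timestamp_ns += 1_000_000  # assumed 1 ms between consecutive rows
    return lines * values_per_line


# Small in-memory demonstration; pass an open text file and a larger line count for a real run.
buffer = io.StringIO()
count = write_test_csv(buffer, lines=3)
print(count, "values written")
```

Each row holds 101 fields (one timestamp plus 100 values), but only the 100 floating-point values count toward the ingestion total.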
NI DIAdem was able to ingest the data 27 times faster than InfluxDB, relative to the InfluxDB per server rate of 26,447 values/s per server.
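The 27x figure can be checked directly from the numbers quoted in this article. The short Python sketch below uses illustrative variable names and assumes that a single DIAdem machine is compared against the per-server InfluxDB rate from the MongoDB whitepaper.

```python
# Figures quoted in the article.
values_ingested = 87_264_000
elapsed_s = 120.7                 # time for DIAdem to read the CSV and write the TDMS file
influx_per_server = 26_447        # values/s per server, from the MongoDB comparison

diadem_rate = values_ingested / elapsed_s      # values/s on one machine
speedup = diadem_rate / influx_per_server      # ratio against one InfluxDB server

print(f"DIAdem: {diadem_rate:,.0f} values/s")
print(f"Speedup over InfluxDB per server: {speedup:.1f}x")
```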
It is critical to recognize that all of the above tests were conducted with the source data in close proximity to the application (NI DIAdem or a time series database), and that a high-speed data connection existed between the source and the ingestion application.
Copyright © 2021,2022,2023 Mechatronic Solutions LLC, All Rights Reserved