For the past several years, NoSQL databases have provided a scalable, manageable, and stable way of dealing with modern scaling and data format challenges. The Aerospike real-time NoSQL database and key-value store has broken the predictable performance barrier by applying established distributed systems principles, new real-time optimization techniques, and flash storage (SSD) technologies.
Now, the newest release of Aerospike, dubbed Aerospike 3, adds support for distributed aggregations, complex and large data types, and queries using secondary indexes and user-defined functions (UDFs). Built on the Aerospike 2 architecture, the new release provides predictable high performance for processing terabytes of data and billions of transactions per day, all with 100% uptime. The user-defined functions let developers work with complex data types and distribute query processing using secondary indexes.
“Hyper-scale applications will become the new normal, and we believe developers of these demanding applications should have the power to adapt the database to fit their applications rather than constraining their software to the structure of the database,” said Srini V. Srinivasan, Aerospike founder and vice president of engineering and operations. “With Aerospike 3, we have achieved a significant milestone in realizing that vision.”
The Aerospike NoSQL database is aimed at applications that interact with consumers in real time, and the company says it gives businesses the flexibility to implement the replication approach that best supports their business and technology demands. Aerospike 3's new extensibility features push data processing into the database while keeping the platform's promise of no single point of failure, no hotspots, no data loss, no performance degradation, no maintenance windows, and no downtime. In addition, the company says it manages billions of objects and terabytes of data at more than 1 million transactions per second (TPS) on commodity servers.
“Aerospike’s unique flash-first, flash-optimized architecture is the foundation behind the performance, reliability and recoverability of applications using Aerospike 3.0,” said David Floyer, CTO and co-founder of Wikibon. “This architecture will allow Aerospike applications access to much greater database sizes compared with DRAM-based architectures.”
The DevOps Angle
In order to understand how Aerospike 3 will affect developers and operations, I reached out to Srini V. Srinivasan, Aerospike founder and vice president of engineering and operations, and asked about things that would directly impact the DevOps team.
“Every part of an application must be finely tuned to run at hyper-scale. It is the combination of complex data types, large data types, queries and UDFs in a fast, scalable and reliable database that is very powerful,” he said when I asked how developers would make use of the new features, especially user-defined functions (UDFs) and specialized data types. This matters most on the developer side of hyper-scale engineering, where some developers may already have come up with their own solutions.
“The ability to represent any data structure, push any logic to the database, run that logic on one or more records, as well as process a subset of records in parallel through a pipeline of UDFs gives developers extreme power and flexibility. Developers will want to use Aerospike’s new features because they can speed up development and benefit from optimizations like reducing network bandwidth by pushing processing close to the data and taking advantage of the distributed parallel processing power of the cluster that can scale linearly with additional nodes. Therefore, we do not expect developers to roll their own data types externally.”
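To make the idea of a pipeline of UDFs more concrete, here is a minimal sketch in plain Python. It does not use the Aerospike client or its UDF API; the record layout, the `NODES` list and every function name are hypothetical and exist only for illustration. Each "node" filters, maps and aggregates its own records, and only the small partial results are combined at the end, which is the bandwidth saving the quote describes.

```python
from functools import reduce

# Hypothetical records spread over two cluster nodes (plain dicts, not Aerospike objects).
NODES = [
    [{"user": "a", "campaign": "x", "clicks": 3},
     {"user": "b", "campaign": "y", "clicks": 5}],
    [{"user": "c", "campaign": "x", "clicks": 7},
     {"user": "d", "campaign": "x", "clicks": 0}],
]


def keep_active(record):
    """Filter stage: drop records that have no clicks."""
    return record["clicks"] > 0


def to_pair(record):
    """Map stage: keep only the fields the aggregation needs."""
    return (record["campaign"], record["clicks"])


def merge_counts(totals, pair):
    """Aggregate stage: add one (campaign, clicks) pair to the running totals."""
    campaign, clicks = pair
    totals = dict(totals)
    totals[campaign] = totals.get(campaign, 0) + clicks
    return totals


def run_on_node(records):
    """Run filter -> map -> aggregate next to the data and return a small partial result."""
    pairs = map(to_pair, filter(keep_active, records))
    return reduce(merge_counts, pairs, {})


def combine(partials):
    """Final reduce: merge the per-node partial results into one answer."""
    result = {}
    for partial in partials:
        for campaign, clicks in partial.items():
            result[campaign] = result.get(campaign, 0) + clicks
    return result


def clicks_per_campaign(nodes):
    return combine(run_on_node(records) for records in nodes)


print(clicks_per_campaign(NODES))  # {'x': 10, 'y': 5}
```

In this model only one small dictionary per node crosses the "network", no matter how many records each node holds.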
On the operations side, Aerospike makes some bold claims about reliability, and it points to Aerospike 2 as a tried-and-true model. “Aerospike 3 is based on the same underlying platform as Aerospike 2 and benefits from the same performance and reliability mechanisms that have been tested in over three years of continuous deployment at Internet scale,” Srinivasan explained. “All of the new features of Aerospike 3 incorporate the basics of reliability and high performance.”
He went on to describe the three basics behind that performance and reliability: speed, scale, and reliability.
The first is speed, achieved by incorporating extra indexes, parallel processing, and superior caching. “Aerospike secondary indexes are stored in DRAM and modeled on the fast design of its primary indexes. As described above, multi-threading and parallelism are used to effectively distribute query load across CPU cores and several other optimizations are used to minimize overhead. Caching Lua-state objects enables efficient execution of UDFs.”
The second is scale: since Aerospike is designed to work well at hyper-scale, that means dealing with I/O scaling and extra CPUs. “The Aerospike secondary index system quickly reduces the data fed into any stream processing by an order of magnitude. It is well established that indexing used wisely is much better for query processing than brute force parallelism. Aerospike uses indexing, I/O parallelism, CPU parallelism along with its innovative real-time prioritization techniques to achieve extremely high vertical and horizontal scale. Aerospike can execute low selectivity queries in parallel on cluster nodes and thus enables an application to simply increase the amount of data queried by increasing the number of nodes without substantially increasing query response times.”
The third is reliability, the be-all and end-all of the operations dream: a system designed to recover from faults, built on a foundation the industry has already tried and tested. “The Aerospike replication and shared-nothing architecture provide the seamless reliability that is a hallmark of Aerospike deployments. We have implemented the new extensibility features while still making them conform to the reliability requirements of the platform.”
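The speed and scale points can be illustrated with a small model, again in plain Python rather than anything from Aerospike itself. The `Node` class, the `by_city` index and the sample records below are assumptions made up for this sketch. Each node keeps an in-memory secondary index so a query touches only matching records, and the query is sent to all nodes in parallel before the results are gathered.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Hypothetical stand-in for one cluster node holding a slice of the records."""

    def __init__(self, records):
        self.records = dict(records)      # primary key -> record
        self.by_city = defaultdict(list)  # in-memory secondary index: city -> primary keys
        for key, rec in self.records.items():
            self.by_city[rec["city"]].append(key)

    def query_city(self, city):
        # The index lookup touches only matching records instead of scanning all of them.
        return [self.records[key] for key in self.by_city.get(city, [])]


def query_cluster(nodes, city):
    """Scatter the query to every node in parallel, then gather the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        parts = pool.map(lambda node: node.query_city(city), nodes)
        return [rec for part in parts for rec in part]


cluster = [
    Node({1: {"city": "Oslo", "spend": 10}, 2: {"city": "Rome", "spend": 4}}),
    Node({3: {"city": "Oslo", "spend": 6}}),
]

oslo_spend = sum(rec["spend"] for rec in query_cluster(cluster, "Oslo"))
print(oslo_spend)  # 16
```

Adding another `Node` to `cluster` adds data without adding a sequential step, which is the behavior the scale quote describes. The threads here only model that parallelism on one machine.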
The company has released two versions of Aerospike 3: Aerospike 3 Community Edition and Aerospike 3 Enterprise Edition. Community Edition supports a single cluster of up to two nodes within one data center and is free to use, whereas Enterprise Edition is available as an enterprise-wide license with no limits on the amount of storage or the number of data centers, clusters and nodes.
Contributors: Saroj Kar, Kyt Dotson