The precog engineering challenge

by Security Dude

“Developing a fault-tolerant, highly-performant, compact key-value data store” … “disk-backed implementation”

This article flew by my Twitter feed last week. I had to poll a couple of HackerSchoolers to figure out what the heck these guys were asking. It might be an in-memory database, or maybe an in-memory key-value store. So down the rabbit hole I go on my Sunday afternoon, poking around the interwebs to get an understanding of this challenge. Google tells me this is http://redis.io/

Fault-tolerant

In engineering, fault-tolerant design is a design that enables a system to continue operation, possibly at a reduced level (also known as graceful degradation), rather than failing completely, when some part of the system fails. The term is most commonly used to describe computer-based systems designed to continue more or less fully operational with, perhaps, a reduction in throughput or an increase in response time in the event of some partial failure. That is, the system as a whole is not stopped due to problems either in the hardware or the software. An example in another field is a motor vehicle designed so it will continue to be drivable if one of the tires is punctured. A structure is able to retain its integrity in the presence of damage due to causes such as fatigue, corrosion, manufacturing flaws, or impact. http://en.wikipedia.org/wiki/Fault-tolerant_design
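To make that definition a bit more concrete for myself, here is a minimal Python sketch of graceful degradation: a read that tries each replica in turn and only fails when every one of them is down. The names `Replica`, `ReplicaDown`, and `fault_tolerant_get` are made up for this illustration; nothing here comes from the Precog challenge or from Redis.

```python
class ReplicaDown(Exception):
    """Raised by a replica that cannot serve requests (hypothetical)."""


class Replica:
    """A stand-in for one copy of the data; purely illustrative."""

    def __init__(self, name, data, healthy=True):
        self.name = name
        self.data = dict(data)
        self.healthy = healthy

    def get(self, key):
        if not self.healthy:
            raise ReplicaDown(self.name)
        return self.data[key]


def fault_tolerant_get(replicas, key):
    """Try each replica in order; fail only if every replica is down.

    Returns the value plus the names of the replicas that failed, so the
    caller can see that the system is running in a degraded state.
    """
    failures = []
    for replica in replicas:
        try:
            return replica.get(key), failures
        except ReplicaDown as exc:
            failures.append(str(exc))
    raise RuntimeError(f"all replicas failed: {failures}")


replicas = [
    Replica("primary", {"user:1": "alice"}, healthy=False),  # simulated outage
    Replica("backup", {"user:1": "alice"}),
]
value, failed = fault_tolerant_get(replicas, "user:1")
print(value, failed)
```

The read still succeeds with the primary down; the cost is one extra attempt, which is the "increase in response time" the definition mentions.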

High Performance

Well, better than slow performance.

Key-Value Data Store

In computing, NoSQL (commonly interpreted as “not only SQL”) is a broad class of database management systems identified by non-adherence to the widely used relational database management system model. NoSQL databases are not built primarily on tables, and generally do not use structured query language for data manipulation.

NoSQL database systems are often highly optimized for retrieve and append operations and often offer little functionality beyond record storage (e.g. key–value stores). The reduced run-time flexibility compared to full SQL systems is compensated by marked gains in scalability and performance for certain data models.

In short, NoSQL database management systems are useful when working with a huge quantity of data when the data’s nature does not require a relational model. The data can be structured, but NoSQL is used when what really matters is the ability to store and retrieve great quantities of data, not the relationships between the elements. Usage examples might be to store millions of key–value pairs in one or a few associative arrays or to store millions of data records. This organization is particularly useful for statistical or real-time analyses of growing lists of elements (such as Twitter posts or the Internet server logs from a large group of users). http://en.wikipedia.org/wiki/NoSQL
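The "associative array" idea is easy to sketch. Below is a minimal in-memory key-value store in Python, assuming a plain dict is good enough as the backing structure. The class name `KeyValueStore` and its `put`/`get`/`delete` methods are my own choices for illustration, not an API from any real product.

```python
class KeyValueStore:
    """A tiny in-memory key-value store backed by a Python dict."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Insert a new pair or overwrite the existing value for this key.
        self._data[key] = value

    def get(self, key, default=None):
        # Look up by key only; there are no joins or relations to follow.
        return self._data.get(key, default)

    def delete(self, key):
        # Return True if the key existed and was removed, False otherwise.
        if key in self._data:
            del self._data[key]
            return True
        return False

    def __len__(self):
        return len(self._data)


store = KeyValueStore()
store.put("tweet:1", "hello world")
store.put("tweet:2", "key-value stores are simple")
store.put("tweet:1", "hello again")  # same key, so the value is replaced
print(store.get("tweet:1"), len(store))
```

Everything lives in memory here, so the data is gone when the process exits. That is the part the "disk-backed" requirement is meant to address.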

Disk-backed implementation

The Disk-backed object cache project is a supplement to memcache-based object cache, which will allow us to greatly increase the amount of space dedicated to caching parsed pages. http://www.mediawiki.org/wiki/Disk-backed_object_cache
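As far as I can tell, "disk-backed" for a key-value store means the data survives a restart because it lives in a file, with memory used to find things quickly. Here is one possible sketch in Python, assuming an append-only log of JSON lines plus an in-memory index of byte offsets. This is my own guess at a simple design, not how MediaWiki, Redis, or Precog do it; `DiskBackedStore` is a hypothetical name, and the sketch does not handle corrupted or partially written records.

```python
import json
import os
import tempfile


class DiskBackedStore:
    """Append-only log on disk plus an in-memory index (illustrative only)."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> byte offset of the latest record for that key
        open(path, "ab").close()  # create the log file if it does not exist
        self._rebuild_index()

    def _rebuild_index(self):
        # Replay the log from the start; later records overwrite earlier ones.
        with open(self.path, "rb") as f:
            while True:
                offset = f.tell()
                line = f.readline()
                if not line:
                    break
                self.index[json.loads(line)["key"]] = offset

    def put(self, key, value):
        record = json.dumps({"key": key, "value": value}).encode("utf-8") + b"\n"
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(record)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the write to disk
        self.index[key] = offset

    def get(self, key, default=None):
        offset = self.index.get(key)
        if offset is None:
            return default
        with open(self.path, "rb") as f:
            f.seek(offset)
            return json.loads(f.readline())["value"]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "store.log")
    store = DiskBackedStore(path)
    store.put("page:1", "parsed html")
    store.put("page:1", "parsed html v2")  # newer record wins
    reopened = DiskBackedStore(path)       # simulates a process restart
    recovered = reopened.get("page:1")
print(recovered)
```

The log only ever grows in this sketch, so a real implementation would also need some form of compaction to stay "compact".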

Links

http://precog.com/blog-precog-2/entry/do-you-have-what-it-takes-to-be-a-precog-engineer

http://news.techworld.com/applications/3411544/startups-reportedly-flocking-to-saps-hana-in-memory-database/

http://www.zodb.org/documentation/tutorial.html

http://www.pytables.org/docs/PyData2012-NYC.pdf
