Category Archives: database

Hey Database! Go Fine Tune Yourself

Everyone expects AI prices to go down in the long term. But in the short term, we have three things going on. First, token prices keep dropping, hurray for that. Second, subscription fees are going up, and vendors are dumping their all-you-can-eat plans for volume-based pricing. The more you use, the more you pay. I guess that’s fair. Third, hardware component pricing is going up and big companies are borrowing billions to build the latest and greatest AI data centers. What’s going on? Are we in the pets.com era of selling $40 worth of dog food for 20 bucks and making it up in volume? The real question is, how do we close this giant chasm of a value gap?

Molham Aref argues that enterprises must make agents smarter and cheaper. We have to solve two problems at the same time: making agents smart enough to handle real business decisions, and ensuring they are cost-effective enough to scale enterprise-wide. It sounds simple enough on the surface, but… it’s not. I’m going to talk about one of the ways we are doing that. But before I start: about six months ago, Greg Diamos and Naila Farooqui at RelationalAI wrote a blog post, “Introducing Superalignment for Relational Databases”. If you haven’t read it yet, please take the time to do it now or you may be a little lost on what follows. There is a line in there people sometimes overlook, even though it’s literally highlighted in bold:

The training dataset is the database itself.

Continue reading

Extending MySQL with VillageSQL

One of the things that made me fall head over heels for Neo4j so many years ago was just how extensible it was. If the database engineering team was busy rebuilding the clustering feature for the third time and didn’t have time to take care of my feature requests… I could just add them myself. Not to Neo4j directly, no, that would have been a horrible mess. Instead I could add any feature I wanted as an “Unmanaged Extension”. Later on they became Cypher Stored Procedures, but it was basically the same thing. You had access to the top level Java API that dealt with Nodes and Edges. You could use the Traversal API that dealt with Paths… and if you were feeling extra spicy that day you could go down to the Storage API that dealt with Cursors over raw bytes.

I had spent prior jobs working with Oracle and Microsoft SQL Server, so I had never had that kind of power and freedom before. Well, it took a long time, but that power has come to MySQL in the form of a change tracking fork called VillageSQL. There are already a bunch of extensions that add UUID and Network Address custom types, Cryptographic Functions, Multi-Dimensional Geometry, as well as AI helpers. So of course I had to try it out. I decided to add an extension for one of my other great loves, the Roaring Bitmap data structure.
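
For readers who have not met Roaring Bitmaps before, here is a rough Python sketch of the core idea: 32-bit integers are split into a high 16-bit key and a low 16-bit value, and each key gets either a sorted array (sparse) or a bitmap (dense) container. The class name `TinyRoaring` is made up for illustration, run containers are left out, and none of this is the VillageSQL extension code.

```python
from bisect import bisect_left

ARRAY_LIMIT = 4096  # above this many values, an array container becomes a bitmap


class TinyRoaring:
    """Toy roaring-style set of 32-bit unsigned integers (illustration only)."""

    def __init__(self):
        # high 16 bits -> container. A container is either a sorted list of
        # low 16-bit values (sparse) or a Python int used as a bitmap (dense).
        self.containers = {}

    def add(self, value):
        high, low = value >> 16, value & 0xFFFF
        container = self.containers.get(high, [])
        if isinstance(container, int):
            container |= 1 << low
        else:
            i = bisect_left(container, low)
            if i == len(container) or container[i] != low:
                container.insert(i, low)
            if len(container) > ARRAY_LIMIT:
                # Too dense for a sorted array: convert to a bitmap.
                bits = 0
                for v in container:
                    bits |= 1 << v
                container = bits
        self.containers[high] = container

    def __contains__(self, value):
        high, low = value >> 16, value & 0xFFFF
        container = self.containers.get(high)
        if container is None:
            return False
        if isinstance(container, int):
            return bool((container >> low) & 1)
        i = bisect_left(container, low)
        return i < len(container) and container[i] == low

    def __len__(self):
        return sum(
            bin(c).count("1") if isinstance(c, int) else len(c)
            for c in self.containers.values()
        )
```

The point of the two container types is that sparse chunks stay small as arrays, while dense chunks cap out at 8 KB as bitmaps.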

Continue reading

Declarative Query Languages are the Iraq War of Computer Science

It’s Memorial Day weekend in the United States. Some people are staying home, others are observing the holiday quietly and others still are using it as an excuse to party because they seem to have forgotten that the entire world is once again at war. At war with a tiny enemy, so small some people think it’s a hoax. The worst part is the enemy is in each other, our friends and neighbors. But Memorial Day is not about remembering the wars, but rather remembering the fallen. To remember those who gave all. Whatever you may think of war, all are terrible, some were necessary. I never served, so that’s about all I get to say about that.

About 14 years ago Ted Neward wrote a very long blog post on “The Vietnam of Computer Science”. There is a follow-up, and a short summary by Jeff Atwood as well. If you have never read them, I ask you to do so now… and with that, I believe Query Languages are the Iraq War of Computer Science.

Continue reading


Keeping Properties Secret in Neo4j

We’re an open source company with nothing to hide, but some of our customers have things they need to keep close to their chest. Sometimes you don’t want everybody to have access to salary information, or future predictions. Maybe you want to hide personally identifiable information (PII) or data covered by the Health Insurance Portability and Accountability Act (HIPAA). In Neo4j 3.4 we are introducing more security controls. We are starting with role-based, database-wide property key blacklists. That’s a bit of a mouthful, but let’s walk through an example to see one of the ways it can be utilized. Imagine you are working in “Area 51” and have to deal with very important information.
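
To make "role-based, database-wide property key blacklist" a little more concrete, here is a conceptual Python sketch. It is not Neo4j's API or configuration: the role names, property keys, and the `visible_properties` helper are all hypothetical, and it assumes that a key blacklisted for any one of a user's roles is hidden from that user.

```python
# Hypothetical mapping of role -> property keys hidden from that role.
# "Database-wide" here means the keys are hidden on every node and
# relationship, regardless of label or type.
BLACKLIST = {
    "users": {"classified"},
    "contractor": {"classified", "salary"},
}


def visible_properties(properties, roles, blacklist=BLACKLIST):
    """Return only the properties a user holding `roles` may see.

    Assumption: a key blacklisted for ANY of the user's roles is hidden.
    """
    hidden = set()
    for role in roles:
        hidden |= blacklist.get(role, set())
    return {key: value for key, value in properties.items() if key not in hidden}


# Example: the same record seen through different roles.
record = {"name": "Hangar 18", "classified": "alien tech", "salary": 100}
as_user = visible_properties(record, ["users"])
as_contractor = visible_properties(record, ["users", "contractor"])
```

With these made-up roles, a plain user would still see `salary` but not `classified`, while someone who also holds the contractor role would see only `name`.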
Continue reading


Neptune and Uranus

Last year Microsoft announced “Cosmos DB”, a multi-model database with graph support. I think multi-model databases are like Swiss Army knives: they can do everything, just not very well. I imagine you would design it to be as good as it can be at its main use case while not losing the ability to do other things. So it’s neither fully optimized for its main thing, nor very good at the other things. Maybe you can do pretty well with two things by making a few compromises, but if you try to do everything… it’s just not going to work out.

Can you imagine John Rambo stalking his enemies with an oversized Swiss Army knife? Here, let me help with the mental image:
Continue reading
