Let’s build something Outrageous – Part 2: Shards are ok!

When I was a new Java developer I would sometimes wake up in the middle of the night hyperventilating and covered in sweat. Usually from a nightmare about Maven and fighting with pom.xml. We dream of Software, but does Software dream? I don’t know, but I hope when Maven goes to sleep at night, it wakes up screaming thinking about CMake and CMakeLists.txt… I know I do.

Rather, I should say “I did”, because I went looking for help on YouTube and ran into a template for C++ projects by Jason Turner which made the nightmares stop. We’ll start our new project by blindly copying that into our repository and removing a few GUI-related things we won’t be using. Watch the video for all the details; I only understood half of it, but it was enough. There I learned about Conan.

Not the Barbarian but the Package Manager. We’ll use it to import the dependencies we talked about in part 1:

luajit/2.0.5
sol2/3.2.3
tsl-sparse-map/0.6.2
simdjson/0.9.6
roaring/0.2.66

Then we’ll split our project into two parts. One “graph” library where we’ll do the graphy part of the work and a web server component as our main where we handle the user requests and respond to them.

add_subdirectory(src/graph)

add_executable(ragedb src/main/main.cpp)
target_link_libraries(
        ragedb PRIVATE
        Graph
        Seastar::seastar
        CONAN_PKG::tsl-sparse-map
        CONAN_PKG::simdjson
        CONAN_PKG::roaring
        CONAN_PKG::sol2
        CONAN_PKG::luajit
)

We are using Seastar, which currently only runs on *nix, so I’m using Ubuntu 20.04 on Parallels on my Mac with CLion for my coding environment, but we will spin up an EC2 instance to test things later on. The setup instructions are in the README. Alright, with all that set, we’ll follow along with the Seastar tutorial and get our “something” going:

#include <seastar/core/app-template.hh>
#include <seastar/core/reactor.hh>
#include <iostream>

int main(int argc, char** argv) {
    seastar::app_template app;
    try {
        app.run(argc, argv, [] {
            std::cout << "Hello world!\n";
            std::cout << "This server has " << seastar::smp::count << " cores.\n";
            return seastar::make_ready_future<>();
        });
    } catch (...) {
        std::cerr << "Failed to start RageDB: "
                  << std::current_exception() << "\n";
        return 1;
    }
    return 0;
}

When I run this bit of code I get this for my output:

Hello world!
This server has 4 cores.

Super exciting, I know, but we have to start somewhere. Now let’s start building out our Graph and its Shards, and connect the two together. We will have multiple graphs, and each needs a name, so we can add that. Each graph will have shards, so we can add that as well. Finally we’ll need a way to start, stop and clear the graph.

    class Graph {
    private:
        std::string name;

    public:
        seastar::sharded<Shard> shard;
        explicit Graph(std::string _name) : name (std::move(_name)) {}

        std::string GetName();
        seastar::future<> Start();
        seastar::future<> Stop();
        void Clear();
    };

Don’t worry, we won’t go over every line of code; that would drive us both insane. But let’s take a look at the Shard class. We need our Shard to be a peering sharded service since the shards will need to talk to each other occasionally. We’ll keep track of the number of CPUs and the shard_id of each shard, and for now we’ll just have it let us know when it is starting and stopping.

    class Shard : public seastar::peering_sharded_service<Shard> {

    private:
        uint cpus;
        uint shard_id;

    public:
        explicit Shard(uint _cpus) : cpus(_cpus), shard_id(seastar::this_shard_id()) {
            std::stringstream ss;
            ss << "Starting Shard " << shard_id << '\n';
            std::cout << ss.str();
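        }

        // Called by Seastar on each shard when the sharded service is stopped.
        seastar::future<> stop();
    };

    seastar::future<> Shard::stop() {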
        std::stringstream ss;
        ss << "Stopping Shard " << seastar::this_shard_id() << '\n';
        std::cout << ss.str();
        return seastar::make_ready_future<>();
    }

Now we can wire a Graph inside our main, start it, and stop it. Instead of returning seastar::make_ready_future<>() from the lambda in our main, we will return the result of seastar::async, which runs our code in a Seastar thread (a lightweight context where we can wait on futures with .get()) and returns a future that resolves when that code completes:

            return seastar::async([&] {
                ragedb::Graph graph("rage");
                graph.Start().get();
                std::cout << "Started " << graph.GetName() << " graph \n";
                graph.Stop().get();
            });

When we run this bit of code we get:

Hello world!
This server has 4 cores.
Starting Shard 0
Starting Shard 1
Starting Shard 3
Starting Shard 2
Started rage graph 
Stopping Shard 0
Stopping Shard 3
Stopping Shard 1
Stopping Shard 2

Ok, so we have an empty graph. I’m sure that doesn’t impress anyone yet. Let’s keep going. We need to add a web server so we can talk to our graph. Luckily Seastar comes with one bundled in. We will listen on all IP addresses on the local machine with 0.0.0.0 and set the port to 7243, which spells “rage” on a phone keypad. Then we’ll set a route to return “hello” on /hello and start listening.
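
If you want to double check the port number, here is a minimal sketch that maps letters to the digits of a standard phone keypad. The names keypad_digit and keypad_digits are made up for this illustration; they are not part of RageDB or Seastar.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Map one ASCII letter to its digit on a standard phone keypad
// (2 = ABC, 3 = DEF, ... 7 = PQRS, 8 = TUV, 9 = WXYZ).
char keypad_digit(char letter) {
    // Index 0 is 'a', index 25 is 'z'.
    static const std::string digits = "22233344455566677778889999";
    const unsigned char c = static_cast<unsigned char>(letter);
    assert(std::isalpha(c) && "only ASCII letters have a keypad digit");
    return digits[std::tolower(c) - 'a'];
}

// Spell a whole word as keypad digits, one digit per letter.
std::string keypad_digits(const std::string& word) {
    std::string out;
    out.reserve(word.size());
    for (char letter : word) {
        out.push_back(keypad_digit(letter));
    }
    return out;
}
```

Calling keypad_digits("rage") gives "7243", which is the default we pass to the port option.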

app.add_options()("address", bpo::value<seastar::sstring>()->default_value("0.0.0.0"), "HTTP Server address");
app.add_options()("port", bpo::value<uint16_t>()->default_value(7243), "HTTP Server port");

auto&& config = app.configuration();

seastar::net::inet_address addr(config["address"].as<seastar::sstring>());
uint16_t port = config["port"].as<uint16_t>();
auto server = new seastar::http_server_control();
server->start().get();

server->set_routes([](seastar::routes& r) {
    r.add(seastar::operation_type::GET,
          seastar::url("/hello"),
          new seastar::function_handler([]([[maybe_unused]] seastar::const_req req) {
            return  "hello";
        }));
    }).get();

server->listen(seastar::socket_address{addr, port}).get();

When I point a web browser to http://0.0.0.0:7243/hello I get “hello” back! How cool is that? Not very, yeah. We need to connect our web server to our graph, otherwise this is not very useful. Let’s actually talk to our graph and our shards by adding a health check route that builds a response from all the shards.

HealthCheck healthCheck(graph);
server->set_routes([&healthCheck](routes& r) { healthCheck.set_routes(r); }).get();
...
healthCheck->add_str("/db/" + graph.GetName() + "/health_check");

I’m skipping some details, but the route handler we’ve created will eventually crawl up to the HealthCheck object, grab the graph, go to the shard on the core it is running on, and call HealthCheckPeered. That returns a list of strings, one from each shard, which the handler writes back as a JSON array:

return parent.graph.shard.local().HealthCheckPeered()
    .then([rep = std::move(rep)] (const std::vector<std::string>& checks) mutable {
        rep->write_body("json", json::stream_object(checks));
        return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
    });

In our Shard, we will add two methods, one that is peered and one that is local. The local one just responds that it is ok with its shard_id. The peered one calls this method on all the shards and using .map puts them in a vector for us:

    seastar::future<std::string> Shard::HealthCheck() {
        std::stringstream message;
        message << "Shard " << seastar::this_shard_id() << " is OK";
        return seastar::make_ready_future<std::string>(message.str());
    }

    seastar::future<std::vector<std::string>> Shard::HealthCheckPeered() {
        return container().map([](Shard &local_shard) {
            return local_shard.HealthCheck();
        });
    }
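
If the fan-out is hard to picture, here is a rough stand-in written with nothing but the standard library. It is only an analogy: Seastar futures are not std::future, and Seastar shards are long-lived and pinned to cores rather than spawned per call. The names health_check and map_shards are hypothetical and exist only in this sketch. It assumes, as the output at the end of this post suggests, that the results come back in shard order.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <string>
#include <vector>

// Stand-in for the local health check: each "shard" reports its own id.
std::string health_check(unsigned shard_id) {
    return "Shard " + std::to_string(shard_id) + " is OK";
}

// Stand-in for the peered call: run `fn` once per shard, in parallel,
// then collect the answers into a vector indexed by shard id.
std::vector<std::string> map_shards(
        unsigned shard_count,
        const std::function<std::string(unsigned)>& fn) {
    assert(shard_count > 0 && "need at least one shard");

    // Fan out: one asynchronous task per shard.
    std::vector<std::future<std::string>> pending;
    pending.reserve(shard_count);
    for (unsigned id = 0; id < shard_count; ++id) {
        pending.push_back(std::async(std::launch::async, fn, id));
    }

    // Fan in: read the results in shard order, whatever order they finish in.
    std::vector<std::string> results;
    results.reserve(shard_count);
    for (auto& answer : pending) {
        results.push_back(answer.get());
    }
    return results;
}
```

With four shards, map_shards(4, health_check) returns four strings, "Shard 0 is OK" through "Shard 3 is OK", in that order.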

Now, finally when we go to http://0.0.0.0:7243/db/rage/health_check we get a glorious:

["Shard 0 is OK","Shard 1 is OK","Shard 2 is OK","Shard 3 is OK"]
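
To see how a vector of strings turns into the response above, here is a minimal sketch of a JSON array writer. It is not Seastar's JSON code; json_escape and to_json_array are hypothetical helpers for this illustration, and the escaping only covers quotes and backslashes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Escape the two characters that would break a JSON string literal.
// (A complete encoder would also handle control characters.)
std::string json_escape(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        if (c == '"' || c == '\\') {
            out.push_back('\\');
        }
        out.push_back(c);
    }
    return out;
}

// Render a list of strings as a compact JSON array, e.g. ["a","b"].
std::string to_json_array(const std::vector<std::string>& items) {
    std::string out = "[";
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i > 0) {
            out += ',';
        }
        out += '"' + json_escape(items[i]) + '"';
    }
    out += ']';
    return out;
}
```

Feeding it the four health check strings produces exactly the line we saw in the browser.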

That was a lot to set up for what may not seem like a huge payoff, but think about it for a second. We’ve wired an HTTP route through a web server to our specific graph, then to a shard, then to all shards, and back to the user with a JSON response. We now have all the elements needed to talk to our graph using an HTTP API. If all that seastar::future, shard and peered stuff is completely new to you, please give this tutorial a read through so you can learn all about them. Keep track of the progress on this repository and don’t be afraid to comment or contribute. We’ll talk about shards some more in part 3, so stay tuned.

Max De Marzi

Graphs, Graphs, and nothing but the Graphs