Graphs, graphs everywhere

Anne Hunt
Jan 5, 2018

“A relentless barrage of ‘why’s’ is the best way to prepare your mind to pierce the clouded veil of thinking caused by the status quo. Use it often.” Shigeo Shingo

Knowledge graphs are suddenly everywhere. Google, Facebook, Microsoft, Amazon, LinkedIn, and Apple use them to drive product intelligence. These companies have armies of AI scientists and data engineers at their command. You know or suspect these graphs are the future of search and of cool user experience. So how does a small relational shop get a piece of the knowledge graph pie?

The answer has three parts:

  1. agile experimentation
  2. lightweight ontological methods
  3. just enough intelligence

Let’s work backwards, starting with the last.

I cannot overemphasize the importance of just enough intelligence. I was one of the first knowledge engineers working in Silicon Valley, hired to build and lead a team of ontologists at Ingenuity Systems. I was tasked with creating a team, infrastructure, and processes to quickly build a proprietary life-sciences knowledge base with near-100% data accuracy. A big choice early on was whether to start with an existing ontology such as Cyc or to start from scratch. I recognized pretty quickly that a system designed for general intelligence would be far too heavy a burden for our needs. Starting from scratch allowed us to set up a very lean system, building just what we needed and nothing more. Just enough intelligence, for us, meant we didn’t need generalized inference capabilities or an upper-level ontology.

The next question is whether to use full industry standards. We were lucky in that OWL, SPARQL, RDF, and their brethren didn’t exist yet, so we weren’t tempted to adopt them. We only had SQL to work with, so we quickly arrived at a graph data model that could be loaded into memory in parts and traversed there (for reasoning), then persisted back to our relational DB. This is what I now call a lightweight ontological approach: domain concepts with durable, globally unique identifiers that can be lassoed together as needed for graph operations, a sort of minimalist open data format. The identifiers are optionally dereferenceable URIs, depending on how or whether you want to share your graph’s entities.
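To make the idea concrete, here is a minimal sketch of such a model in Python with SQLite. It is an illustration under stated assumptions, not the system described above: the table names (`concept`, `edge`), the `is_a` predicate, the helper functions, and the example labels are all hypothetical.

```python
import sqlite3
import uuid
from collections import defaultdict, deque


def new_id() -> str:
    # Durable, globally unique identifier. A URN is used here; a
    # dereferenceable URI would work equally well if you share entities.
    return f"urn:uuid:{uuid.uuid4()}"


def create_schema(conn: sqlite3.Connection) -> None:
    # Two plain relational tables are enough to persist a graph.
    conn.executescript("""
        CREATE TABLE concept (id TEXT PRIMARY KEY, label TEXT NOT NULL);
        CREATE TABLE edge (
            subject   TEXT NOT NULL REFERENCES concept(id),
            predicate TEXT NOT NULL,
            object    TEXT NOT NULL REFERENCES concept(id),
            PRIMARY KEY (subject, predicate, object)
        );
    """)


def add_concept(conn: sqlite3.Connection, label: str) -> str:
    concept_id = new_id()
    conn.execute("INSERT INTO concept (id, label) VALUES (?, ?)", (concept_id, label))
    return concept_id


def add_edge(conn: sqlite3.Connection, subject: str, predicate: str, obj: str) -> None:
    conn.execute("INSERT INTO edge VALUES (?, ?, ?)", (subject, predicate, obj))


def load_adjacency(conn: sqlite3.Connection, predicate: str) -> dict:
    # Pull only the part of the graph needed (one predicate) into memory.
    adjacency = defaultdict(set)
    rows = conn.execute(
        "SELECT subject, object FROM edge WHERE predicate = ?", (predicate,)
    )
    for subject, obj in rows:
        adjacency[subject].add(obj)
    return adjacency


def reachable(adjacency: dict, start: str) -> set:
    # Breadth-first traversal: every node reachable from start, excluding start.
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical example data
conn = sqlite3.connect(":memory:")
create_schema(conn)
protein = add_concept(conn, "protein")
enzyme = add_concept(conn, "enzyme")
kinase = add_concept(conn, "kinase")
add_edge(conn, kinase, "is_a", enzyme)
add_edge(conn, enzyme, "is_a", protein)

ancestors = reachable(load_adjacency(conn, "is_a"), kinase)
```

Here `ancestors` holds the identifiers of both "enzyme" and "protein". The traversal happens entirely in memory, while SQL is used only for storage and for selecting which slice of the graph to load.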

Fast forward to today: agile experimentation is key. The choices of technologies for building a knowledge graph are almost endless. There are native graph databases such as Neo4j, AllegroGraph, and Amazon Neptune. There are NoSQL database technologies like MarkLogic that expose SPARQL endpoints. There are knowledge graph platforms like Stardog. And ontologies for almost anything can be found on the web.

As with any development project, start with a use case (a small one). See if you can use a simple graph data structure to do something interesting with that use case. Link the entities in the graph to the objects in your legacy databases as needed.
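One way this linking might look, as a rough sketch: each graph entity carries pointers back to rows in the existing relational tables. Everything named here (`LegacyRef`, `Entity`, `SmallGraph`, the `customers` and `products` tables, the `purchased` relationship, the `ex:` identifiers) is a hypothetical stand-in for whatever your own use case involves.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class LegacyRef:
    """Pointer to a row in an existing relational table (hypothetical names)."""
    table: str
    primary_key: int


@dataclass
class Entity:
    id: str      # durable identifier for the graph entity
    label: str
    legacy: List[LegacyRef] = field(default_factory=list)


class SmallGraph:
    """A deliberately simple in-memory graph: entities plus labeled edges."""

    def __init__(self) -> None:
        self.entities: Dict[str, Entity] = {}
        self.edges: Set[Tuple[str, str, str]] = set()

    def add(self, entity: Entity) -> Entity:
        self.entities[entity.id] = entity
        return entity

    def relate(self, subject: str, predicate: str, obj: str) -> None:
        self.edges.add((subject, predicate, obj))

    def neighbors(self, subject: str, predicate: str) -> List[Entity]:
        # Sorted by identifier so results are deterministic.
        ids = sorted(o for s, p, o in self.edges if s == subject and p == predicate)
        return [self.entities[i] for i in ids]


# Hypothetical use case: which legacy product rows relate to a customer?
graph = SmallGraph()
graph.add(Entity("ex:customer/1", "Acme Corp", [LegacyRef("customers", 42)]))
graph.add(Entity("ex:product/7", "Widget", [LegacyRef("products", 7)]))
graph.add(Entity("ex:product/9", "Gadget", [LegacyRef("products", 9)]))
graph.relate("ex:customer/1", "purchased", "ex:product/7")
graph.relate("ex:customer/1", "purchased", "ex:product/9")

rows_to_fetch = [
    ref
    for entity in graph.neighbors("ex:customer/1", "purchased")
    for ref in entity.legacy
]
```

The graph answers the relationship question, and `rows_to_fetch` tells you which rows to pull from the legacy database for the details. The legacy schema itself stays untouched.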

If you can build something interesting, the way forward will clarify itself later. That is, after all, the essence of agile development.

Written by Anne Hunt

Product leader, artist, and early developer of intelligent systems. Contact me if you want to talk about art, good software, or cool product ideas.
