Geeks With Blogs
Brendon Page Dev stuff

I’m going to be doing a Neo4j workshop up in JHB in November 2015 and thought I’d give an example of something that is easy to do in a graph database but challenging to do in a relational database. Before we begin: Neo4j is a graph database, that is, a database that uses a graph model to store data. Graph databases fall under the broader category of NoSQL databases.

The problem

I have a social network and want to recommend possible friends to my users.

In Neo4j I’ll be storing the data using the following structure:


In SQL Server I’ll be using:
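The relational structure can be inferred from the SQL query later in this post, which refers to a People table (Id, Name) and a FriendMaps table (MeId, FriendId). Here is a minimal sketch of it, assuming those are the only columns. It uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere; the column types, keys and the Ids 1 and 2 are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here; types and keys are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (
        Id   INTEGER PRIMARY KEY,
        Name TEXT NOT NULL
    );
    -- One row per directed FRIEND relationship: MeId -> FriendId.
    CREATE TABLE FriendMaps (
        MeId     INTEGER NOT NULL REFERENCES People(Id),
        FriendId INTEGER NOT NULL REFERENCES People(Id),
        PRIMARY KEY (MeId, FriendId)
    );
""")

# Hypothetical Ids; the post only tells us that Brendon is friends with Bob.
conn.execute("INSERT INTO People (Id, Name) VALUES (1, 'Brendon'), (2, 'Bob')")
conn.execute("INSERT INTO FriendMaps (MeId, FriendId) VALUES (1, 2)")

# Listing someone's friends already needs two joins through the mapping table.
friends = conn.execute("""
    SELECT Friend.Name
    FROM People AS Me
    INNER JOIN FriendMaps ON FriendMaps.MeId = Me.Id
    INNER JOIN People AS Friend ON Friend.Id = FriendMaps.FriendId
    WHERE Me.Name = 'Brendon'
""").fetchall()
print(friends)
```

The relationship lives in its own mapping table, so every hop along a FRIEND relationship costs another join.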


The social network

I’ve set up the following social network in both Neo4j and SQL Server:


I’ve arranged it so that it is easy to see who should be recommended to Brendon. I would want my query to recommend Louise to Brendon because she is a friend of 2 of Brendon’s friends, whereas Alice is only a friend of one of Brendon’s friends. I would also expect the query to exclude Bob because Brendon is already friends with Bob:
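The rule described above (count how many of my friends point at each candidate, and skip anyone I'm already friends with) can be sketched in a few lines of plain Python. The edge list is hypothetical: only the outcome for Brendon matches this post, and Carol and the exact edges are made up for illustration.

```python
from collections import Counter

# Hypothetical directed FRIEND edges (from, to). Only the outcome for Brendon
# matches the post; Carol and the exact edges are made up for illustration.
FRIEND = [
    ("Brendon", "Bob"),
    ("Brendon", "Carol"),
    ("Bob", "Louise"),
    ("Carol", "Louise"),
    ("Carol", "Alice"),
    ("Carol", "Bob"),
]


def recommend(me, edges):
    """Count directed friend-of-friend paths, skipping existing friends."""
    friends_of = {}
    for src, dst in edges:
        friends_of.setdefault(src, set()).add(dst)

    my_friends = friends_of.get(me, set())
    counts = Counter()
    for friend in my_friends:
        for fof in friends_of.get(friend, set()):
            # Don't recommend me to myself or people I'm already friends with.
            if fof != me and fof not in my_friends:
                counts[fof] += 1
    return counts.most_common()


print(recommend("Brendon", FRIEND))
```

With these edges Louise comes out first with 2 friends in common, Alice second with 1, and Bob is excluded.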


First Attempt

So I started off by writing the Cypher to get friend recommendations for Brendon (Cypher is the query language used by Neo4j):

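    // Reconstructed query: the Person label and name property are assumed
    MATCH (me:Person)-[:FRIEND]->(myFriend:Person)-[:FRIEND]->(friendOfFriend:Person)
    WHERE NOT (me)-[:FRIEND]->(friendOfFriend)
    AND me.name = 'Brendon'
    RETURN count(friendOfFriend) as friendsInCommon, friendOfFriend.name as suggestedFriend
    ORDER BY friendsInCommon DESC;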

Which returns:


Then I wrote SQL that would do the same:

    SELECT
        Me.Id                      AS MeId,
        FriendOfFriend.FriendId    AS SuggestedFriendId,
        COUNT(*)                   AS FriendsInCommon
    FROM
        People         AS Me
    INNER JOIN
        FriendMaps    AS MyFriends
          ON MyFriends.MeId = Me.Id
    INNER JOIN
        FriendMaps    AS FriendOfFriend
          ON MyFriends.FriendId = FriendOfFriend.MeId
    LEFT JOIN
        FriendMaps    AS FriendsWithMe
          ON  Me.Id = FriendsWithMe.MeId
          AND FriendOfFriend.FriendId = FriendsWithMe.FriendId
    WHERE
        FriendsWithMe.MeId IS NULL
        AND Me.Name = 'Brendon'
    GROUP BY Me.Id, FriendOfFriend.FriendId
    ORDER BY
        FriendsInCommon DESC

Which returns:


The first thing you’ll notice is that my SQL results contain only Ids, not names. To return names I would either have to add another join (back to the People table) or do a separate query, and both are additional overhead. In Cypher, on the other hand, I have access to both the Id and the Name and chose to return only the name. This isn’t too much of a big deal, but it is the first hint that Neo4j is more suited to this problem.
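To make the "another join" option concrete, here is a sketch of the recommendation query with People joined in a second time to look up the suggested friend's name. It runs against SQLite through Python's sqlite3 module rather than SQL Server. The data is hypothetical (Carol, the Ids and the exact edges are made up) and is arranged only so that the outcome for Brendon matches the one described earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE FriendMaps (MeId INTEGER NOT NULL, FriendId INTEGER NOT NULL);
    -- Hypothetical data: only the outcome for Brendon matches the post.
    INSERT INTO People VALUES
        (1, 'Brendon'), (2, 'Bob'), (3, 'Carol'), (4, 'Louise'), (5, 'Alice');
    INSERT INTO FriendMaps VALUES
        (1, 2), (1, 3), (2, 4), (3, 4), (3, 5), (3, 2);
""")

rows = conn.execute("""
    SELECT Suggested.Name AS SuggestedFriend,
           COUNT(*)       AS FriendsInCommon
    FROM People AS Me
    INNER JOIN FriendMaps AS MyFriends
        ON MyFriends.MeId = Me.Id
    INNER JOIN FriendMaps AS FriendOfFriend
        ON MyFriends.FriendId = FriendOfFriend.MeId
    LEFT JOIN FriendMaps AS FriendsWithMe
        ON  Me.Id = FriendsWithMe.MeId
        AND FriendOfFriend.FriendId = FriendsWithMe.FriendId
    -- The extra join: look the suggested person's name up in People.
    INNER JOIN People AS Suggested
        ON Suggested.Id = FriendOfFriend.FriendId
    WHERE FriendsWithMe.MeId IS NULL
      AND Me.Name = 'Brendon'
    GROUP BY Suggested.Id, Suggested.Name
    ORDER BY FriendsInCommon DESC
""").fetchall()
print(rows)
```

That brings the query to four joins just to answer "who, by name, should I suggest?".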

The second thing you might notice is that the Cypher is shorter and, if you are familiar with both languages, certainly easier to read. Again, this is a small hint towards Neo4j being more suited.

Road Block

My queries work great for Brendon, but if I try to use them to get friend recommendations for Louise I get no results! Why is this? Well, if we re-arrange the social network so that it is easy to see who should be recommended to Louise, you will notice that the friend relationships are not uniformly pointing away from our subject and towards their friends and their friends’ friends. We now have relationships pointing in both directions:


Ignoring Relationship Direction

To solve this let’s ignore the direction of the FRIEND relationship. Here is the updated Cypher query which recommends friends for Louise and ignores the direction of the FRIEND relationships:

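    // Reconstructed query: the Person label and name property are assumed
    MATCH (me:Person)-[:FRIEND]-(myFriend:Person)-[:FRIEND]-(friendOfFriend:Person)
    WHERE NOT (me)-[:FRIEND]-(friendOfFriend)
    AND me.name = 'Louise'
    RETURN count(friendOfFriend) as friendsInCommon, friendOfFriend.name as suggestedFriend
    ORDER BY friendsInCommon DESC;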

Which returns:


Yay, it works! You will notice that all I had to do was remove the arrows from the relationship definitions in the query. So wherever I had "-[:FRIEND]->" I now have "-[:FRIEND]-".

I started updating the SQL query to do the same thing but gave up after 30 minutes of unsuccessfully trying to figure it out. Granted I’m not a SQL guru, but I have been using it for most of my career and have solved a lot of interesting problems with it.
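For reference, one approach that might work is to treat every FriendMaps row as an edge in both directions, using a UNION inside a common table expression, and then run the same joins against that. The following is only a sketch under that assumption. It runs against SQLite through Python's sqlite3 module rather than SQL Server, the network is hypothetical (Carol, the Ids and the exact edges are made up), and it has not been checked against the network used in this post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE FriendMaps (MeId INTEGER NOT NULL, FriendId INTEGER NOT NULL);
    -- Hypothetical data, not the network from the post.
    INSERT INTO People VALUES
        (1, 'Brendon'), (2, 'Bob'), (3, 'Carol'), (4, 'Louise'), (5, 'Alice');
    INSERT INTO FriendMaps VALUES
        (1, 2), (1, 3), (2, 4), (3, 4), (3, 5), (3, 2);
""")

rows = conn.execute("""
    WITH Undirected AS (
        -- Each stored edge, plus the same edge reversed; UNION removes
        -- duplicates when a relationship already exists in both directions.
        SELECT MeId, FriendId FROM FriendMaps
        UNION
        SELECT FriendId, MeId FROM FriendMaps
    )
    SELECT Suggested.Name AS SuggestedFriend,
           COUNT(*)       AS FriendsInCommon
    FROM People AS Me
    INNER JOIN Undirected AS MyFriends
        ON MyFriends.MeId = Me.Id
    INNER JOIN Undirected AS FriendOfFriend
        ON FriendOfFriend.MeId = MyFriends.FriendId
    LEFT JOIN Undirected AS FriendsWithMe
        ON  FriendsWithMe.MeId = Me.Id
        AND FriendsWithMe.FriendId = FriendOfFriend.FriendId
    INNER JOIN People AS Suggested
        ON Suggested.Id = FriendOfFriend.FriendId
    WHERE FriendsWithMe.MeId IS NULL
      -- Walking an undirected edge out and back lands on me again.
      AND FriendOfFriend.FriendId <> Me.Id
      AND Me.Name = 'Louise'
    GROUP BY Suggested.Id, Suggested.Name
    ORDER BY FriendsInCommon DESC, Suggested.Name
""").fetchall()
print(rows)
```

Even if this holds up, it takes a CTE, an extra predicate to stop the subject being recommended to themselves, and a rewrite of every join, which is a lot more ceremony than deleting two arrows.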

Some might argue that my data is incomplete, that I should’ve added friend relationships in both directions, which would make the original queries work. But that isn’t the point. The point is that it is difficult to ignore relationship directions in SQL, and putting data in for the sake of a query is going to cause other issues for us. For example, what if the direction of the relationships had meaning? Say it indicated who added whom as a friend, and FRIEND relationships in both directions indicated that the other person has accepted the friend request. If I’d blindly added relationships in both directions so that my original recommendation queries worked, then I wouldn’t be able to do any of that.
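To illustrate what would be lost, here is a small Python sketch of that hypothetical "direction means who sent the request" model. The edges and the helper names accepted and pending are made up for illustration.

```python
# Hypothetical directed FRIEND edges: (requester, recipient).
FRIEND = {
    ("Brendon", "Bob"),
    ("Bob", "Brendon"),    # Bob added Brendon back, so the request is accepted
    ("Brendon", "Carol"),  # Carol has not reciprocated yet
}


def accepted(edges):
    """Pairs with a FRIEND edge in both directions, each pair listed once."""
    return sorted({tuple(sorted(e)) for e in edges if (e[1], e[0]) in edges})


def pending(edges):
    """Requests with no edge coming back the other way."""
    return sorted(e for e in edges if (e[1], e[0]) not in edges)


print(accepted(FRIEND))
print(pending(FRIEND))
```

If every relationship had been mirrored just to keep the directed queries working, pending would always be empty and the two states could no longer be told apart.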


Graph databases are good at doing what I like to call ad-hoc relationship queries and Cypher makes it easier to express, read and reason about relationships. Relational databases are more rigid in their relationship querying capabilities because relationships aren’t first class citizens and have to be modelled using table structures.

One thing that I have not touched on, but which I feel is worth a mention, is performance. I’d expect Neo4j’s query execution time to increase roughly linearly as the social network grows in size and complexity, whereas the SQL Server queries will be impacted severely as the network grows in complexity: the more connected each person is, the bigger the results of those joins are going to be!

Posted on Monday, October 26, 2015 4:35 PM

Comments on this post: Friend of Friend recommendations Neo4j and SQL Sever

# re: Friend of Friend recommendations Neo4j and SQL Sever
How can a use case like this be solved:
Get all nodes who are connected to at least K nodes with a path length <= 2
Left by Siddhartha Singh on Jun 01, 2017 5:04 PM


Copyright © brendonpage