Friend Suggestions At Cash App

At Cash App, customers can invite their friends to join the platform and be rewarded for doing so. We built an ML model to populate a “Suggested Friends” section in the Invite Friends screen to help customers quickly find friends to invite without having to scroll through their whole contact book. In this post, we describe how we built it, the scalability issues we solved, and other things we learned along the way.

Introduction

With Cash App, customers can easily send money to friends, family and businesses. These interactions form a social network of people (and merchants) transacting with each other, at a scale of 53 million monthly transacting actives (as of Q1 2023). As with many social networks, Cash App benefits from network effects: when people join Cash App, they often do so specifically to interact with others already there, and at the same time existing customers get more utility from having more of their friends in the network (for example, they can now avoid the hassle of carrying physical cash around, or of switching to another payment app). To lean on these network effects, Cash App has a referral program, where a customer can be rewarded for inviting their friends (we typically offer between $5 and $15; some rules apply). To invite a friend, a customer goes to a specific screen within the app; we will refer to it as the “Invite Friends screen” in this post. It currently looks like this:

invite friends screen

The Invite Friends screen on Cash App, as of August 2023.

At Cash App, consistent with our approach to privacy, we ask customers for permission to share their contact books with us. We do this to make it easy to find friends, protect their account, and prevent spam. Customers who decide to share their contact book benefit from the recommendation algorithm we discuss here.

The invitation flow consists of the following steps:

If all goes well, you both receive some extra cash in your balances, and you have one more friend you can transact with. Cash App has one more customer, who in turn can invite new friends (possibly in a viral cascade). When blended with other marketing programs, invitations have helped Cash App acquire a new transacting active at an average cost of $10 or less, with a payback period of less than one year (data from the 2022 investor day presentation).

If your contact book looks anything like mine, you’ll have your closest friends and family in there, coworkers, some extended family, some old friends from school or college, some acquaintances you may not have talked to in years, a range of businesses you interact with (from small contractors to big retail chains), and probably some people you don’t even remember.

Before we built the “suggested” section you can now see in the app, customers had to scroll through all of their contacts until they saw someone that they felt like inviting! In this project, we wanted to see if we could make it easier for customers to invite their friends, and we decided to build a section within the app to be populated with friends they might want to invite.

Good suggestions will result in more invitations and signups, while bad ones generally make for a bad customer experience (for example, being recommended friends that you don’t want to invite, or friends that don’t want to sign up for Cash App, meaning you will never get your reward).

As recommendations will never be perfect, we also wanted to make sure that the customer could still easily find their friends even when our system did not guess correctly. We currently show up to 3 suggestions, while your full contact book is still included just below in an easily scrollable alphabetical list.

invite friends screen sections

The Suggested and the All Contacts sections in the Invite Friends screen.

Modeling

The problem that we set out to solve is the following: among your friends who are not yet on Cash App, who are you going to invite? Or more specifically, who are the top-3 friends you are most likely to invite?

First, let’s dive into the data: how can we predict who you are going to invite to Cash App, given that we know almost nothing about your friends or how close you are to them in real life? Because they are not on Cash App, the only thing we would know about them is that they are friends with you and other Cash App customers: for example, if many Cash App customers have one person in their contact book, that person might be “at the edge” of the Cash App network, ready to be invited. But they might also not be interested in Cash App, or may have already been invited by other friends and consider any further invitation as spam: as this is a behavior we obviously don’t want to encourage, we include previous invitation history in the training data to help the model learn not to suggest these friends.

Let’s further frame the question as an ML problem:

For each (customer, friend) pair in our dataset, we calculate many features, which fall into three different categories (for most of these, we use our homegrown feature service, Blizzard, described in a previous post on this blog, to quickly define hundreds of features):

Features - Common Friends

What does it mean to have friends in common with someone? For example, take the following sentence:

You and your friend have 2 common friends on Cash App

This means that two of your friends who are on Cash App are also friends of this person. If you have many friends in common with someone, it is more likely that you are closely connected to them in the real world (your friendship “circles” overlap more). Furthermore, because many of their friends are already here, they also have more reasons to join Cash App and your invite (and the reward!) might be the one that convinces them to finally join.

Let’s take a look at what a sample friendship graph might look like:

Example customer and non-customer graph

An example friendship graph, with Cash App customers shaded in green, and their friends in white. In this network, certain friends have stronger connections to existing customers, which makes them more likely to be invited to Cash App. Icons from IconFinder, licensed under CC BY 4.0.

Mathematically, given a node \(x\), let’s call \(\Gamma(x)\) its one-hop neighborhood for a given edge class (the set of nodes connected to \(x\) by an edge) and \(k(x) = \vert\Gamma(x)\vert\) its degree (the number of such neighbors): the number of common friends between \(x\) and \(y\) is then just \(\vert\Gamma(x) \bigcap \Gamma(y)\vert\). There are many other metrics based on the intersection of one-hop neighborhoods, such as cosine similarity, the Jaccard index, the Adamic-Adar index and so on (e.g. see this non-exhaustive list). As they mostly differ only by normalization factors, in this post we will focus on the common friends feature, but the generalization is straightforward.
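
To make these definitions concrete, here is a minimal sketch in Python on a made-up toy graph (the node names, edges, and helper functions are ours for illustration, not Cash App data) computing the common friends count together with two of the normalized variants mentioned above:

from collections import defaultdict
from math import log

# Toy undirected friendship graph as an adjacency map: node -> set of neighbors.
# Node names and edges are illustrative only.
edges = [("A", "B"), ("A", "C"), ("A", "Z"), ("B", "Z"), ("C", "Z"), ("B", "D")]
gamma = defaultdict(set)
for u, v in edges:
    gamma[u].add(v)
    gamma[v].add(u)

def common_friends(x, y):
    # |Gamma(x) ∩ Gamma(y)|: the number of shared neighbors.
    return len(gamma[x] & gamma[y])

def jaccard(x, y):
    # Common friends normalized by the size of the combined neighborhoods.
    union = gamma[x] | gamma[y]
    return len(gamma[x] & gamma[y]) / len(union) if union else 0.0

def adamic_adar(x, y):
    # Shared neighbors weighted by 1 / log(degree), so very popular nodes count less.
    # Degree-1 neighbors are skipped to avoid dividing by log(1) = 0.
    return sum(1.0 / log(len(gamma[z]))
               for z in gamma[x] & gamma[y] if len(gamma[z]) > 1)

print(common_friends("A", "Z"), jaccard("A", "Z"), adamic_adar("A", "Z"))

All three metrics are functions of the same one-hop intersection, which is why the rest of the discussion can focus on the plain count.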

While we have drawn a simple graph above, there are multiple ways people can connect with each other. In graph theory language, this forms a heterogeneous graph, defined by:

The heterogeneous edge classes in this graph have different cardinalities and will overlap only partially: for example, you might have friends in your contact book that you’ve never interacted with in the app (it might even be most of them!), or friends not in your contact book that you interact with a lot on Cash App. These different types of edges allow us to define different kinds of “common friends” features: for example, “p2p common friends” would be friends that you’ve sent money to that also have your not-on-Cash-App friend in their contact book. There is also a time component on both nodes and edges, meaning we can calculate features like “recently added common friends”, or “most frequently paid common friends”. To simplify the discussion, in this post we only consider one class of undirected edges.
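
To illustrate how edge classes can compose into features like the one just described, here is a minimal Python sketch (both graphs and all names are made up for illustration): it keeps one adjacency map per edge class and counts “p2p common friends” between a customer and a friend not yet on Cash App:

from collections import defaultdict

def build(edge_list):
    # Build an undirected adjacency map for one edge class.
    adj = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return adj

# One graph per edge class; both are illustrative only.
contact_book = build([("A", "Y"), ("B", "Y"), ("A", "B"), ("A", "C")])
p2p_payments = build([("A", "B"), ("A", "C")])

def cross_class_common_friends(first_hop, second_hop, x, y):
    # Count intermediaries z with an edge (x, z) in one class
    # and an edge (z, y) in another class.
    return len(first_hop[x] & second_hop[y])

# "p2p common friends" between customer A and non-customer Y:
# people A has paid who also have Y in their contact book.
print(cross_class_common_friends(p2p_payments, contact_book, "A", "Y"))

The same pattern, applied to time-filtered edge lists, gives features like “recently added common friends”.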

Operationally, calculating the number of common friends between you and your friend means counting the number of triangles that have the edge between you and your friend as one of their sides. For example, let’s zoom in on a given customer in the above graph:

Example graph - common friend calculation walkthrough

Let’s pick customer A and examine a few (customer, friend) pairs:

In this example of an undirected graph, the calculation is symmetric, e.g. if B is a common friend of (A, 3) then A is a common friend of (B, 3), but this would not always be the case if we considered directed edges. In addition to directionality, heterogeneous edges and weighting nodes based on their features allow us to maximize the information extracted from the graph.

But what does it mean to compute the “number of triangles”? If the graph edges are stored in a table called graph with (source, destination) columns (this assumes the edges are directed, but the query can easily be tweaked to remove the direction), we can use the following SQL query:

-- For each (customer, not-yet-on-Cash-App friend) pair, count the Cash App
-- customers connected to both of them.
-- is_cash_customer / is_not_cash_customer stand in for however membership
-- is looked up in practice.
select t1.source
     , t1.destination
     , count(t3.destination) as count_common_friends
from graph as t1                         -- edge: customer -> friend not on Cash App
join graph as t2
  on t1.source = t2.source               -- edge: same customer -> a candidate common friend
join graph as t3
  on t3.source = t2.destination          -- edge: candidate -> the not-on-Cash-App friend,
  and t3.destination = t1.destination    -- closing the triangle
where is_cash_customer(t1.source)
  and is_not_cash_customer(t1.destination)
  and is_cash_customer(t2.destination)
group by t1.source, t1.destination

Inspecting the query: for a given source \(s_0\), the first self-join creates node pairs \((x_i, x_j)\), while the second one closes the triangle when an edge exists between the nodes of such a pair. The where clause keeps only (customer, non-customer) pairs, since for the problem at hand we are calculating this feature for friends not yet on Cash App. Finally, we aggregate by the (source, destination) link and count the number of entries, which is the number of common friends.

Let’s run through the logic for the example customer A and the highlighted nodes in the graph above:

Common friends count walkthrough

Counting common friends in the 1-hop neighborhood of a node.

Computationally, because we go two hops away from a node before returning to it on the final edge, if \(\langle k \rangle\) is the average node degree we generate \(\langle k \rangle^2\) candidate pairs before being able to check for the final edge. For a typical contact book with hundreds of contacts, this generates tens of thousands of candidate pairs (even if we only generate distinct pairs with exactly one Cash App customer in them, the asymptotic analysis is the same). With this calculation repeated over 50 million monthly actives (Cash App’s scale), we would generate on the order of trillions of candidate pairs; more than could be computed at reasonable cost or speed.
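
To put rough numbers on this estimate: with \(\langle k \rangle \approx 200\) contacts per book, each customer alone yields \(\langle k \rangle^2 \approx 4 \times 10^4\) candidate pairs, and across roughly \(5 \times 10^7\) monthly actives that is about \(5 \times 10^7 \times 4 \times 10^4 = 2 \times 10^{12}\) pairs, i.e. trillions.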

While parallelizing the calculation at the customer level by partitioning the graph used in the first two hops is possible (and is how we computed this feature on our smaller training data), it was not an effective solution for the full population: the two-hop candidate pairs typically span multiple partitions, so we would still need to load and join the full graph to close the last edge.

Common Friends Calculation at Scale

To solve this problem at scale, we used sampling techniques inspired by the paper “Dimension Independent Similarity Computation” by Zadeh and Goel to compute pairwise similarities. We sample nodes and edges to reduce the computational complexity of the problem to a fixed, manageable scale, at the cost of introducing some approximation error, which can be evaluated during model training (by comparing model performance using the exact and approximate values with different hyperparameter choices). Let’s walk through the same example graph we just discussed while detailing the algorithm as we implemented it:

Approximate common friends count walkthrough

Approximate count of common friends in the 1-hop neighborhood of a node. Dashed lines are dropped during sampling at each step.

To get an idea of how this approach helps in a more realistic case, let’s plug in some numbers: if the average contact book has 200 entries, choosing \(K_0=100\) (sampling half of the contacts in each book) would reduce the number of candidate pairs from 40k to 10k (a 4x improvement), and then choosing \(F=1000\) (sampling 1k of these 10k pairs) would reduce the size of the problem by an overall factor of 40! Of course, as edges and nodes are dropped some information is lost: the sampling factors are hyperparameters of the algorithm and should be tuned depending on the properties of the graph (degree, connectedness, …) and the acceptable loss in predictive power. In our case, we were able to achieve almost equal model performance using this technique, while accelerating the feature calculation from what would have been days to a few hours.
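
To make the procedure concrete, here is a minimal Python sketch of the idea on toy data (an illustration under our own assumptions, not our production implementation; the exact estimator and re-weighting are not covered in this post): it samples at most \(K_0\) contacts per book, keeps at most \(F\) of the resulting (customer, non-customer) candidate pairs, closes the triangle only for the kept pairs, and rescales the counts by the inverse of the pair-level sampling rate:

import random
from collections import defaultdict

# Toy data: contact books and membership flags are illustrative only.
contacts = {
    "A": {"B", "C", "X", "Y"},
    "B": {"A", "X", "Y"},
    "C": {"A", "Y"},
}
is_customer = {"A": True, "B": True, "C": True, "X": False, "Y": False}

K_0 = 3  # max contacts sampled from each contact book (first hop)
F = 2    # max candidate pairs kept per source customer (second hop)

def approx_common_friends(source, rng=random.Random(0)):
    # Approximate common-friends counts for (source, non-customer) pairs.
    book = sorted(contacts[source])
    sampled = rng.sample(book, min(K_0, len(book)))
    # Candidate pairs: a sampled Cash App customer z and a sampled non-customer y.
    pairs = [(z, y) for z in sampled if is_customer[z]
                    for y in sampled if not is_customer[y]]
    kept = rng.sample(pairs, min(F, len(pairs)))
    # Close the triangle on the kept pairs only, then rescale by the
    # inverse of the pair-level sampling rate to estimate the true count.
    scale = len(pairs) / len(kept) if kept else 1.0
    counts = defaultdict(float)
    for z, y in kept:
        if y in contacts.get(z, set()):
            counts[(source, y)] += scale
    return dict(counts)

print(approx_common_friends("A"))

Note that this sketch only corrects for the pair-level sampling; the first-hop downsampling adds further approximation error which, as described above, can be evaluated during model training. With values like \(K_0=100\) and \(F=1000\), the same structure yields the roughly 40x reduction discussed above.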

Conclusions

Finally, building a model is only a fraction of the work needed to impact customers, especially as this was the first time we computed and showed this type of recommendation. We also worked with our ML engineering teams to serve model predictions online, with our client engineers to show the recommendations in Cash App, and with our product data scientists to design and execute A/B tests to measure the impact of recommendations on our top-line metrics. After measuring statistically significant lifts in invitations, we rolled out the feature to all customers, and have since been monitoring the quality of the recommendations and how customers use them.

In the Personalization ML team, we will keep exploring opportunities to improve recommendations, iterating on signals and models with an eye on our long-term goal to make it easy for everyone to connect on Cash App.

If you’re interested in this type of problem or you have feedback on this post, feel free to contact me at angelo@squareup.com.