Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[server] Add relationship updater #70

Merged
merged 13 commits into from
Jun 24, 2020
Merged

Conversation

zrebcu411
Copy link
Contributor

It implements update_relationship endpoint which is a generic endpoint for sending friend requests, blocking users and accepting friend requests.

There is a small change: it allows to send request only to one user so it will be needed to change mobile implementation a bit. It's much easier and safer to handle one request at once. Also, it's common in social apps to invite each user separately I think.

I'm working on another generic endpoint called delete_relationship which will be responsible for unblocking users, removing from friends, canceling already sent friend requests or optionally declining a friend request.

KNOF_OF status in not supported in this PR

@CLAassistant
Copy link

CLAassistant commented Jun 12, 2020

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@Ashoat Ashoat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll also need to update server/src/deleters/user-deleters.js to make sure a user's relationships are all deleted when that user is deleted

It's much easier and safer to handle one request at once

Can you clarify why you think it's safer?

My thoughts: I think we'll need to be able to support sending multiple relationship requests at once, for instance if somebody syncs their contact lists or FB friends. We should make sure that we don't handle each request with individual SQL queries. It should be possible to do multiple requests at once in a performant and efficient way, but I admit it will make this endpoint more complex. Maybe we can discuss how it could be implemented over video chat on Monday?

@Ashoat
Copy link
Contributor

Ashoat commented Jun 12, 2020

RE CLAassistant: I think what's happening here is that you're using a different email on your GitHub account vs. in your commits. I've whitelisted your GitHub account with CLAassistant, but since it can't match your email to your GitHub account, it doesn't know you're whitelisted. I think this can be fixed if you just add the email you're using in your Git commits to GitHub, or alternately if you change the email you're using in your Git commits to match your GitHub account.

@zrebcu411
Copy link
Contributor Author

We'll also need to update server/src/deleters/user-deleters.js to make sure a user's relationships are all deleted when that user is deleted

It's much easier and safer to handle one request at once

Can you clarify why you think it's safer?

My thoughts: I think we'll need to be able to support sending multiple relationship requests at once, for instance if somebody syncs their contact lists or FB friends. We should make sure that we don't handle each request with individual SQL queries. It should be possible to do multiple requests at once in a performant and efficient way, but I admit it will make this endpoint more complex. Maybe we can discuss how it could be implemented over video chat on Monday?

Maybe not necessarily safer but definitely simpler. But simplicity might prevent doing mistakes on the client in terms of sync with a backend. What about if one of the ids sent from the client points to user that not exists? Should we insert only valid users and throw an error with a list of invalid id's or should we insert only if all of the provided IDs are valid? If the client decides to have optimistic updates it might be more complex to restore the state after a failure. It's harder to handle errors in general.

I always thought it's a good API design practice to expose endpoints that operate on a single resource. If it's needed there should be exposed another endpoint for a batch update.

I this case I thought we will create another endpoint for syncing FB friends. But I can refactor this endpoint to accept a list of users. We can discuss it in today's call

Copy link
Contributor

@Ashoat Ashoat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree simplicity is better whenever possible, and in most cases I think "overengineering" for hypothetical use cases doesn't make sense. But in this case, you'll be working on FB log-in integration very soon, so I think it makes sense to make the updater function handle multiple users right now.

To be clear, this isn't about the endpoint as much. If the UI only lets you send a friend request to one user at a time, then the endpoint can support just one user at a time. It's more about the updateRelationship function - I want it to be ready for when we need to update multiple relationships at once.

PS, just want to make sure you didn't miss this comment:

We'll also need to update server/src/deleters/account-deleters.js to make sure a user's relationships are all deleted when that user is deleted

@Ashoat
Copy link
Contributor

Ashoat commented Jun 17, 2020

Analysis

Key:

  1. insert: INSERT INTO
  2. update: UPDATE
  3. insert/update: INSERT INTO ... ON DUPLICATE KEY
  4. select: SELECT
  5. delete: DELETE FROM
  6. in parallel: this query can occur at the same time as the previous one
action know_of &
friend/request/block
know_of/friend &
friend request/block
send friend request 1 select on friend/request/block*,
1 insert/update on know_of**,
1 insert on friend/request/block (in parallel),
possibly 1 delete on friend/request/block† (in parallel)
1 select on request/block*,
1 insert/update on know_of/friend**,
1 insert/update on friend request/block (in parallel)
accept friend request 1 select on friend/request/block*,
1 update on friend/request/block
1 select on request/block*,
1 insert/update on know_of/friend,
1 delete on friend request/block (in parallel)
unfriend 1 delete on friend/request/block 1 insert/update on know_of/friend
block 1 insert/update on know_of‡,
1 delete on friend/request/block (in parallel),
1 insert on friend/request/block (in parallel)
1 insert/update on know_of/friend‡
1 insert/update on request/block (in parallel)
unblock 1 delete on friend/request/block 1 delete on request/block
  • * we need to make sure the user you are requesting hasn't blocked you, and want to check if they have already requested you
  • ** we need to make sure you know_of the user if you are requesting
  • † in case you friend request somebody you have previously blocked, we should unblock
  • ‡ since we will need to display a blocked user in the block list, we need to make sure we have blocked users in know_of, even if you didn't know_of them before (so like if you have a worst enemy and you look them up in the app immediately and block them)

Conclusion

It's a close call in this case. The two approaches are similar in performance.

Ultimately I think I am still leaning towards know_of/friend & friend request/block. I like that we are able to offer idempotency and uniqueness guarantees on the database layer, so we are less likely to get into a state of data corruption.


if (action === 'friend_request') {
const requestsOrBlockedSelectQuery = SQL`
SELECT user1, user2, status from friend_requests_blocks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a higher level, the point of using a UNION SELECT is to combine requestsOrBlockedSelectQuery and friendsOrBlockedMeSelectQuery, so we'll need to rewrite this query. But I want to share some comments about how you wrote this query, and show you how I would write it instead if we didn't need to refactor it.

First, I think there's an error here. Because AND takes precedence over OR, your status = ${requestsBlocksStatus.PENDING_FRIEND} condition only applies to your (user1 IN (${userIDs}) AND user2 = ${viewer.userID}) condition, not to your (user1 = ${viewer.userID} AND user2 IN (${userIDs})) condition.

Second, I have some style nits:

  1. Let's capitalize from since it's a SQL operator
  2. Let's indent OR so it's clear that it's part of the WHERE condition
  3. Let's indent the second line of the user1/user2 condition so it's clear that they're part of the same block
  4. I prefer putting simpler conditions first
SELECT user1, user2, status FROM relationships_directed
WHERE
  (status = ${requestsBlocksStatus.BLOCKED} AND user1 = ${viewer.userID}) OR
  (status = ${requestsBlocksStatus.PENDING_FRIEND} AND
    ((user1 = ${viewer.userID} AND user2 IN (${userIDs})) OR
      (user1 IN (${userIDs}) AND user2 = ${viewer.userID})))

OR user1 = ${viewer.userID} AND status = ${requestsBlocksStatus.BLOCKED}
`;

const friendsOrBlockedMeSelectQuery = SQL`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the goal of the UNION SELECT is to make it so we only need to make one query. Let's try and see if we can do both of these at once.

So we have six questions we might want to answer here:

  1. relationships_directed: Has the other user blocked me? We need to know this because if the other user has blocked you, you can't send them a friend request.
  2. relationships_directed: Have I blocked the other user? Hmm... I'm actually not sure if we need to answer this question. On the SQL side, if you've blocked the other user, the block in relationships_directed will be replaced by the friend request when you do the INSERT INTO ... ON DUPLICATE KEY, because it will be treated as a duplicate key. So we don't need to ask the question for SQL. How about once we get to Server awareness of client's user store #22? Will we need to do something different if we are removing a block? My instinct is no... all of the requesters' devices should get an updateTypes.UPDATE_USER update (or something like it) saying that a friend request has been sent, and they should know that this means the user is no longer blocked.
  3. relationships_directed: Has the other user already sent me a friend request? We need to know this because if so, we can go ahead and create the friendship now.
  4. relationships_directed: Have I already sent this user a friend request? If you've already sent a friend request to this user, we should avoid doing any further work. It's true that the INSERT INTO ... ON DUPLICATE KEY operation is idempotent, so it probably won't change anything. But since we're already dispatching this SELECT query, we might as well figure this out now instead of doing more queries. Plus, after Server awareness of client's user store #22 is implemented, we'll be able to skip dispatching useless updateTypes.UPDATE_USER updates if we know it's a no-op.
  5. relationships_undirected: Do I already have a friendship with the other user? If you already have a friendship with the other user, we should skip this user so that we avoid creating any friend request rows.
  6. relationships_undirected: Do I know_of the other user? If you search somebody by username that you're not currently in a thread with, you might not know_of them yet. But since you're requesting their friendship, we want to make sure we keep the user object updated on the client. We can make sure the know_of relationship exists by issuing a INSERT INTO ... ON DUPLICATE KEY query, and that query won't break things if you already have a row. But since we're already checking the relationships_undirected table, it's cheap here to see if the know_of exists, and will let us skip the later query.

1-4 actually cover all "directed" types of relationships, and 5-6 cover all types of "undirected" relationships. So we basically just want to know all of the relationships that these users share. If 2 wasn't crossed out, our query would be asking for all relationships between these users, and consequently would look pretty simple:

SELECT user1, user2, status
FROM relationships_directed
WHERE
  (user1 IN (${userIDs}) AND user2 = ${viewer.userID}) OR
  (user1 = ${viewer.userID} AND user2 IN (${userIDs}))
UNION SELECT user1, user2, status
FROM relationships_undirected
WHERE
  (user1 = ${viewer.userID} AND user2 IN (${userIDs})) OR
  (user1 IN (${userIDs}) AND user2 = ${viewer.userID})

With 2 crossed out, we just have to make some adjustments:

SELECT user1, user2, status
FROM relationships_directed
WHERE
  (user1 IN (${userIDs}) AND user2 = ${viewer.userID}) OR
  (status = ${requestsBlocksStatus.PENDING_FRIEND} AND
    user1 = ${viewer.userID} AND user2 IN (${userIDs}))
UNION SELECT user1, user2, status
FROM relationships_undirected
WHERE
  (user1 = ${viewer.userID} AND user2 IN (${userIDs})) OR
  (user1 IN (${userIDs}) AND user2 = ${viewer.userID})

const errors: RelationshipErrors = {};
const relationshipActions: UserActions = {};
for (const userID in relationshipsByUserId) {
const [relationship] = relationshipsByUserId[userID].filter(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We could have two relationship rows if each user blocked each other. In that case we need to check both rows, because if the other user blocked you, you can't friend request them. But if you blocked them, it is okay
  2. We're relying on database consistency here, which worries me a little bit. Databases have a habit of getting corrupted. What if in the process of accepting a friend request, our application randomly crashes (due to reasons outside our control) right after updating the relationships_undirected table with the friendship, but before deleting the friend request from the relationships_directed table?

Let's try and do something more robust. Maybe we can iterate through relationshipsByUserId and if we find any corruption, we'll console.warn, and continue with the "most important" relationship. Order of importance: block > friendship > friend request

Copy link
Contributor Author

@zrebcu411 zrebcu411 Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I don't think it's possible. The select query we have doesn't return rows where current user is is a blocker:
SELECT user1, user2, status
    FROM relationships_directed
    WHERE
      (user1 IN (${userIDs}) AND user2 = ${viewer.userID}) OR
      (status = ${directedStatus.PENDING_FRIEND} AND
        user1 = ${viewer.userID} AND user2 IN (${userIDs}))
    UNION SELECT user1, user2, status
    FROM relationships_undirected
    WHERE
      (user1 = ${viewer.userID} AND user2 IN (${userIDs})) OR
      (user1 IN (${userIDs}) AND user2 = ${viewer.userID})

But, as you mentioned in another comment we should unblock/unblock then friend request when you already blocked another user. In that case we have to change a query a bit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I'm not sure if I'm following. What do you mean by finding any corruption and continue with the "most important" relationship?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Oops, you're right!

when you already blocked another user. In that case we have to change a query a bit

I'm confused now, what needs to be changed? If you have blocked the user you are friend requesting, that would mean this query returns just a know_of row, right? Which gets converted to 'pending_friend' in the if (!relationship) { block below

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Corruption would be if you get multiple rows after doing this filter here. Continuing with the most important relationship would be picking the most important row.

ON DUPLICATE KEY UPDATE status = VALUES(status)
`;
const directedDeleteQuery = SQL`
DELETE FROM friend_requests_blocks WHERE user1 IN (${directedDeleteIDs}) and user2 = ${viewer.userID}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. If user2 is simultaneously blocking user1 while user1 is accepting user2's friend request, this could lead to a race condition where the block is deleted. Let's add status = ${directedStatus.PENDING_FRIEND} to the WHERE condition here
  2. Let's put the WHERE on a new line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. What about if users blocked each other? If user1 sends a friend request to user2 the only thing we want to do is to remove his corresponding BLOCK row. If I put a WHERE condition with status = ${directedStatus.PENDING_FRIEND} it wouldn't work I think. Is that correct?

I think it might be tricky to avoid all possible race conditions in this implemenation 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have already scoped this delete to only remove the rows where user2 is the actor. It doesn't address deleting rows where user1 has blocked user2.

const undirectedInsertQuery = SQL`
INSERT INTO know_of_friends (user1, user2, status)
VALUES ${undirectedInsertRows}
ON DUPLICATE KEY UPDATE status = VALUES(status)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can imagine a race condition here that can lead to database corruption.

Let's say two users friend request each other around the same time.

If the relationships_undirected update from the first request resolves first, then the second request sees this and thinks "great, we can create the friendship now". Let's say the second request finishes really quickly for some reason and the first request is stuck due to some weird behavior. Now the first request is going to run it's relationships_directed update to make sure know_of exists. In this scenario, the friendship could get overwritten with a know_of.

Of course this is an unlikely race condition, but I think it would be smart to guard against it here. Could we do something like this?

ON DUPLICATE KEY UPDATE status = MAX(status, VALUES(status))

for (const userID in relationshipsByUserId) {
const relationships = relationshipsByUserId[userID];

if (relationships.length === 3) {
Copy link
Contributor Author

@zrebcu411 zrebcu411 Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I implemented it like this since relationships can be one of the following cases:

[know_of, blocked_1, blocked_2]
[know_of, blocked_1]
[know_of, blocked_2]
[know_of, pending_friend_1]
[know_of, pending_friend_2]
[know_of, friends]
[know_of]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General advice when your code seem confusing or needs clarification

If you find yourself needing to explain something in a PR, that's usually a sign that the code isn't too readable. Instead of explaining over PR, I'd suggest:

  1. Preferred - rewrite the code so it's obvious what's going on
  2. Sometimes it's not possible to rewrite the code. In that case, it's always better to put a comment in the code instead of the PR

This particular case: avoiding brittle logic

In this particular case I also worry about the approach of counting rows because it is brittle.

  1. Database corruption can cause issues. For instance, imagine if two people friend request each other at the same time. If both requests check and see no current relationship, they will both create a friend request row. That could lead to three rows here
  2. What happens if somebody adds a new relationship type? Would they realize that they need to change this code?

Proposed solution

In my last review, I suggested:

if we find any corruption, we'll console.warn, and continue with the "most important" relationship. Order of importance: block > friendship > friend request

I think this is the jist of what we should be doing here. Instead of counting the number of relationships, let's check to see what type of relationships we have, and determine our action based on the priority of those relationships.

This approach would be more readable (no confusing counting of number of relationships), as well as less brittle (less likely to be broken by data corruption or new relationship types).

Code

To help explain, I've written up some code to give you an idea of how I would approach this problem. I think this code should behave identical to what you've written, but in my opinion it's more readable and clear about what's going on.

const viewerBlockedTarget = relationships.some(
  relationship => relationship.status === directedStatus.BLOCKED && relationship.user1 === viewerID.userID,
);
const targetBlockedViewer = relationships.some(
  relationship => relationship.status === directedStatus.BLOCKED && relationship.user2 === viewerID.userID,
);
const friendshipExists = relationships.some(
  relationship => relationship.status === directedStatus.FRIEND,
);
const viewerRequestedTargetFriendship = relationships.some(
  relationship => relationship.status === directedStatus.PENDING_FRIEND && relationship.user1 === viewerID.userID,
);
const targetRequestedViewerFriendship = relationships.some(
  relationship => relationship.status === directedStatus.PENDING_FRIEND && relationship.user2 === viewerID.userID,
);

const operations = [];
if (targetBlockedViewer) {
  if (viewerBlockedTarget) {
    operations.push('delete_directed');
  }
  const user_blocked = errors.user_blocked || [];
  errors.user_blocked = [...user_blocked, userID];
} else if (friendshipExists) {
  const already_friends = errors.already_friends || [];
  errors.already_friends = [...already_friends, userID];
} else if (targetRequestedViewerFriendship) {
  operations.push('friend', 'delete_directed');
} else if (!viewerRequestedTargetFriendship) {
  operations.push('pending_friend');
}

SELECT user1, user2, status
FROM relationships_directed
WHERE
(user1 IN (${userIDs}) AND user2 = ${viewer.userID}) OR
Copy link
Contributor

@Ashoat Ashoat Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's return this query to what it was before. I think I may have confused you with my previous comments. I don't think there was any need to change this

EDIT - okay, I see this is to address the new goal

Copy link
Contributor

@Ashoat Ashoat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, sorry for the initial mix-up on my end. I see where you're coming from now.

The reason you've changed the query to check for blocks from the viewer on the target is that you're trying to make sure that friend requests that fail because the target is blocked still remove any block the requester might have on the target.

That's a valid goal and I appreciate your diligence.

Next time it might help me understand more quickly if you add a note about why you've made a change like this. On my end, I'll try to read the whole PR before giving feedback, so I don't get mixed up again in the future.

As a higher level note - this is very close now. I just want the code in fetchFriendRequestRelationshipOperations to be a bit more clear and then we should be able to get this in.

for (const userID in relationshipsByUserId) {
const relationships = relationshipsByUserId[userID];

if (relationships.length === 3) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General advice when your code seem confusing or needs clarification

If you find yourself needing to explain something in a PR, that's usually a sign that the code isn't too readable. Instead of explaining over PR, I'd suggest:

  1. Preferred - rewrite the code so it's obvious what's going on
  2. Sometimes it's not possible to rewrite the code. In that case, it's always better to put a comment in the code instead of the PR

This particular case: avoiding brittle logic

In this particular case I also worry about the approach of counting rows because it is brittle.

  1. Database corruption can cause issues. For instance, imagine if two people friend request each other at the same time. If both requests check and see no current relationship, they will both create a friend request row. That could lead to three rows here
  2. What happens if somebody adds a new relationship type? Would they realize that they need to change this code?

Proposed solution

In my last review, I suggested:

if we find any corruption, we'll console.warn, and continue with the "most important" relationship. Order of importance: block > friendship > friend request

I think this is the jist of what we should be doing here. Instead of counting the number of relationships, let's check to see what type of relationships we have, and determine our action based on the priority of those relationships.

This approach would be more readable (no confusing counting of number of relationships), as well as less brittle (less likely to be broken by data corruption or new relationship types).

Code

To help explain, I've written up some code to give you an idea of how I would approach this problem. I think this code should behave identical to what you've written, but in my opinion it's more readable and clear about what's going on.

const viewerBlockedTarget = relationships.some(
  relationship => relationship.status === directedStatus.BLOCKED && relationship.user1 === viewerID.userID,
);
const targetBlockedViewer = relationships.some(
  relationship => relationship.status === directedStatus.BLOCKED && relationship.user2 === viewerID.userID,
);
const friendshipExists = relationships.some(
  relationship => relationship.status === directedStatus.FRIEND,
);
const viewerRequestedTargetFriendship = relationships.some(
  relationship => relationship.status === directedStatus.PENDING_FRIEND && relationship.user1 === viewerID.userID,
);
const targetRequestedViewerFriendship = relationships.some(
  relationship => relationship.status === directedStatus.PENDING_FRIEND && relationship.user2 === viewerID.userID,
);

const operations = [];
if (targetBlockedViewer) {
  if (viewerBlockedTarget) {
    operations.push('delete_directed');
  }
  const user_blocked = errors.user_blocked || [];
  errors.user_blocked = [...user_blocked, userID];
} else if (friendshipExists) {
  const already_friends = errors.already_friends || [];
  errors.already_friends = [...already_friends, userID];
} else if (targetRequestedViewerFriendship) {
  operations.push('friend', 'delete_directed');
} else if (!viewerRequestedTargetFriendship) {
  operations.push('pending_friend');
}


await Promise.all(promises);

return { ...friendRequestErrors, invalid_user };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I like that in your old approach, you avoid settings any error keys unless there actually are errors. In this case, we'll always be returning an empty invalid_users array, which isn't a big deal, but would probably be cleaner/more consistent if we didn't

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I've changed it because I had an inexact spread Flow warning. To avoid that warning I'm using Object.assign() now

@zrebcu411 zrebcu411 force-pushed the feature/relationship-updater branch from a4ff8de to 1780efa Compare June 24, 2020 10:12
@zrebcu411
Copy link
Contributor Author

Thank you for your detailed review and a bunch of advice The code you proposed is much more readable and definitely less brittle, I appreciate that!

|};

export type RelationshipErrors = $Shape<{|
invalid_user?: string[],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: we don't need the optionality here, $Shape applies it automatically

@Ashoat Ashoat merged commit 108677a into master Jun 24, 2020
@palys-swm palys-swm deleted the feature/relationship-updater branch December 28, 2020 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants