This post is the fourth in a series on building an app with spatialization features using Daily's real-time video and audio APIs. It opens with a brief recap of the earlier posts, which covered constructing a 2D world and sending and receiving position data between users. This installment covers managing user tracks, calculating proximity between users, and building an audio graph to produce spatial audio in the world. It walks through how users' video and audio tracks are handled, including creating textures from video tags and updating media streams when they change. The proximity update checks whether one user is within earshot or the vicinity of another, subscribes to or unsubscribes from their tracks accordingly, and adjusts audio gain and pan values based on distance. The post closes with a summary of the key points covered in this installment.
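To make the distance-based gain and pan adjustment concrete, here is a minimal sketch of the kind of calculation involved. The threshold, falloff curve, and function names (`gainForDistance`, `panForOffset`, `EARSHOT_DISTANCE`) are illustrative assumptions, not the series' actual implementation; in the browser, the resulting values would be applied to a `GainNode` and `StereoPannerNode` in the Web Audio graph.

```typescript
// Hypothetical earshot boundary in world-coordinate units; the real app
// would derive this from the world's dimensions.
const EARSHOT_DISTANCE = 300;

// Gain falls off linearly with distance, reaching 0 at the earshot
// boundary (where the track would be unsubscribed entirely).
function gainForDistance(distance: number): number {
  if (distance >= EARSHOT_DISTANCE) return 0;
  return 1 - distance / EARSHOT_DISTANCE;
}

// Pan is derived from the horizontal offset between listener and speaker,
// clamped to the [-1, 1] range expected by a StereoPannerNode.
function panForOffset(dx: number): number {
  return Math.max(-1, Math.min(1, dx / EARSHOT_DISTANCE));
}
```

In a browser, these values would typically be assigned to `gainNode.gain.value` and `pannerNode.pan.value` on each proximity update, so a user drifting away fades out and shifts toward the appropriate speaker.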