White paper: Impact of the Digital Choice Act on Social Media Services

Implementing support for the law in a social media service

This law is new, nobody has any experience with it yet, and there is much room for interpretation in the text. This is our take at this time for best practices in implementing support for it.

We distinguish between two levels of implementation SMSs can choose from:

“Sufficient base-level implementation” is what we recommend social media companies implement if their primary objective is to comply with the law and they have no further strategic objectives.
“Leading lean-in implementation” goes beyond what the law requires in order to maximize opportunities for benefiting from the existence of the law, given that competitors also need to comply with it.

Note that when choosing the “leading lean-in implementation” strategy, we do not consider it very important what exactly a law in its current version in a particular jurisdiction may require. Instead, the focus should be on what the market will look like when the current law in Utah and similar laws being contemplated in other jurisdictions (including future updates) are fully implemented, and how to take advantage of the opportunities that will have emerged.

Being where the puck is going to be before others is likely to be a significant competitive advantage.

Data export features: implementation recommendations

Sufficient base-level implementation: data export

Implementations created to address similar legislation on full data export in other jurisdictions (e.g. the CCPA/CPRA in California) appear to address most of the requirements on this subject in the Digital Choice Act. However:
All structured data needs to be exported in a structured format. JSON is recommended. (PDF will not do.)

There is no requirement that the data is in any particular format, as long as it is “portable” and “readily usable”, which means that it needs to be easily parseable with standard tools.
Each data element (content, connections etc) should be annotated with a URL that uniquely identifies the data element.

The intent of the law is that there is as little data loss as possible. Without unique identifiers in the export, data elements in different files, or in different locations in the same file, cannot be unambiguously related (e.g. which of several friends with the same name in a JSON file of friends made a post contained in another JSON file).
Further, given the intent of the bill, it should be possible to resolve a particular data element beyond the scope of a single import. For example, an importer should be able to resolve a friend in my contacts to the correct account of this friend on the destination SMS. If my friend imported their data set, and I imported mine, and both contained the same URLs, the destination SMS would be able to resolve the references to the same object, which would be what the user expects and is the objective of the bill.
Structured data does not need to follow any particular schema, as long as the structure is documented.
Media files should be exported in their originally-uploaded format.
If the export contains more than one file, a single zip file should be offered to the user for download that contains all files.

"Readily usable" cannot require that the user download dozens of files one by one, as has been implemented by some companies in response to other data rights legislation.
There should be programmatic access to trigger the creation of such a downloadable file and to download it.

We derive this from "allows the consumer to transmit the data to another controller without impediment".
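As an illustration of the export recommendations above, the following is a minimal sketch, not a definitive implementation. It assumes a hypothetical origin `https://sms-a.example`, hypothetical file names (`profile.json`, `friends.json`, `posts.json`) and a hypothetical `id`/`author` schema; none of these are prescribed by the law. It shows JSON documents whose elements carry identifying URLs, media stored unchanged, and everything bundled into one zip file.

```python
import io
import json
import zipfile

BASE = "https://sms-a.example"  # hypothetical origin of the exporting SMS


def build_export(user, friends, posts, media):
    """Return the bytes of a single zip archive holding the whole export."""
    # Every data element carries an "id" URL, so references between files
    # (here: a post's "author") can be resolved unambiguously.
    documents = {
        "profile.json": {"id": f"{BASE}/users/{user}", "name": user},
        "friends.json": [
            {"id": f"{BASE}/users/{handle}", "name": name}
            for handle, name in friends
        ],
        "posts.json": [
            {
                "id": f"{BASE}/posts/{post_id}",
                "author": f"{BASE}/users/{author}",
                "text": text,
            }
            for post_id, author, text in posts
        ],
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, document in documents.items():
            archive.writestr(name, json.dumps(document, indent=2))
        # Media files are stored byte-for-byte, in their uploaded format.
        for filename, data in media.items():
            archive.writestr(f"media/{filename}", data)
    return buffer.getvalue()


def find_author(export_bytes, post_index=0):
    """Find the friend who wrote a post by matching URLs, not names."""
    with zipfile.ZipFile(io.BytesIO(export_bytes)) as archive:
        friends = json.loads(archive.read("friends.json"))
        posts = json.loads(archive.read("posts.json"))
    by_id = {friend["id"]: friend for friend in friends}
    return by_id[posts[post_index]["author"]]
```

With two friends who are both named "Alex", `find_author` still returns the right one, because the post refers to its author by URL rather than by name.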

Leading “lean-in” implementation: data portability

We suggest starting by implementing what could be called “optimal self-round-trip data portability”. By doing so, SMSs have an easy-to-understand, easily QA’able foundation to optimize for a range of high-value data portability scenarios with other SMSs.

The self-round-trip data portability core use case is as follows:

The user exports all of their data from the SMS.
The user deletes their account and all their data from the SMS.
The user signs up with a new account on the same SMS, and imports their previously exported data.
Now the user is in the same situation they would have been if they had never deleted their account on the SMS: they have all their content, they have all their followers and following relationships, all their content and engagements are correctly integrated with other content and engagement, e.g. a comment the user left on somebody else’s post has re-appeared there below the post.
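Because the scenario above has a precise success condition, it lends itself to an automated check. The following is a minimal sketch under stated assumptions: `ToySMS` is a hypothetical in-memory stand-in for a real service, and its data model (accounts plus comments left on other users' posts) is invented for illustration only.

```python
import copy


class ToySMS:
    """Hypothetical in-memory stand-in for an SMS, for illustration only."""

    def __init__(self):
        self.accounts = {}  # handle -> {"posts": [...], "following": [...]}
        self.comments = []  # comments users left on other users' posts

    def export_account(self, handle):
        return copy.deepcopy({
            "account": self.accounts[handle],
            "comments": [c for c in self.comments if c["author"] == handle],
        })

    def delete_account(self, handle):
        del self.accounts[handle]
        self.comments = [c for c in self.comments if c["author"] != handle]

    def import_account(self, handle, archive):
        self.accounts[handle] = copy.deepcopy(archive["account"])
        self.comments.extend(copy.deepcopy(archive["comments"]))

    def snapshot(self):
        # Order-independent view of the whole service, used to compare states.
        comments = sorted(self.comments, key=lambda c: (c["on_post"], c["text"]))
        return copy.deepcopy({"accounts": self.accounts, "comments": comments})


def round_trip_is_lossless(sms, handle):
    """Run the self-round-trip scenario and report whether nothing was lost."""
    before = sms.snapshot()
    archive = sms.export_account(handle)  # step 1: export all data
    sms.delete_account(handle)            # step 2: delete account and data
    sms.import_account(handle, archive)   # step 3: sign up again and import
    return sms.snapshot() == before       # step 4: same situation as before
```

A real QA suite would run the same four steps against a staging instance and compare what other users can see, but the shape of the test stays the same.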

The following qualities should be optimized:

Minimize the number of clicks required to perform the entire scenario end to end.
Minimize the amount of time it takes to perform the entire scenario (both time spent by the user, and waiting time e.g. for batch processes to collect an exportable archive).
To the maximum extent, use data formats that are already in use elsewhere, such as standards published by well-regarded standards development organizations.
If data needs to be exported that goes beyond what existing formats or standards support, use their extensibility mechanisms (as far as they exist), again reusing approaches other organizations or products may have used before.

Once this “optimal self-round-trip data portability” scenario works well, the SMS should focus on making it work in a similar fashion with other SMSs, preferably while their own implementations for meeting these requirements are still in development; this is likely going to be a cheaper and less frustrating process than going to GA unilaterally, and then having to make things work together with other SMSs after the release. The existing round trip functionality is a great starting point that proves it can be made to work, at least if the data models (“schemas”) of the SMSs are the same.

As, however, the data models of no two SMSs are ever exactly the same, the SMS should implement graceful fallbacks for those data elements it receives from other SMSs that it does not understand directly. Good fallbacks are likely going to be a key criterion for users investigating whether they should leave another SMS and join yours. They may not if they feel their data has substantially degraded in your SMS.

As an example: if an SMS_A supports one type of “Like” only, while another SMS_B supports several types of reactions (e.g. “Heart”, “Fire”, “Sadness”, ..), when attempting to import data from SMS_B, SMS_A is faced with the question of what to do with those types of “Likes” it does not directly understand. Other than extending SMS_A itself to cover them natively (which may not be viable or desired), the best approach is to gracefully degrade by mapping the several types of “Likes” onto the single “Like” that SMS_A supports. It would likely not be acceptable to the user if SMS_A simply dropped all kinds of “Likes” it does not understand.

As a corollary, when exporting data, an SMS_B should do so in a way that another SMS_A attempting to import the data set can easily implement graceful degradation on their end. In the case of the “Likes”, for example, if the export contained a data element that represents a “Like” with an enumerated-value property that indicates the type of “Like”, the importing SMS can simply ignore that property.
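The “Likes” example can be sketched as follows. This is an illustration under assumptions, not a prescribed schema: the element layout (`type`, `id`, `object`) and the property name `reaction` are hypothetical, as is the set of reactions the importer supports.

```python
SUPPORTED_REACTIONS = {"like"}  # assumption: SMS_A only knows a plain "Like"


def import_reaction(element):
    """Map an exported reaction onto what the importing SMS supports.

    Returns None for elements that are not reactions at all, so that the
    caller can hand them to some other import routine.
    """
    if element.get("type") != "Like":
        return None
    kind = element.get("reaction", "like")
    if kind not in SUPPORTED_REACTIONS:
        # Graceful degradation: keep the reaction, drop only its subtype.
        kind = "like"
    return {
        "type": "Like",
        "id": element["id"],
        "object": element["object"],
        "reaction": kind,
    }
```

Because SMS_B exported one element type with a subtype property, rather than a separate element type per reaction, the importer needs no knowledge of “Fire” or “Sadness” to preserve the fact that the user reacted.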

We will consider interoperability and synchronization together in this section, as the boundaries between them blur in implementations.

Sufficient base-level implementation: interoperability and synchronization

Consistent with the definition of “sufficient base-level implementation”, we will only consider the requirements of the law that require an SMS_A to allow SMS_B to implement certain features (such as letting its users on SMS_B follow users on SMS_A), and will not attempt to implement interoperability symmetrically (so that users on SMS_A can also follow users on SMS_B and interact with them).

Note: such an asymmetric implementation would make for a very odd product in most cases, so this implementation may not be advisable.

To do so, SMS_A must implement the following:

A user interface in which any user A of SMS_A can choose between:
- No interoperability with any other SMSs. This should be the default. If the user chooses this option, the SMS behaves like the walled garden version of this SMS for the user.
- Interoperability with an enumerated set of other SMSs. The user can choose any subset of them. If the user chooses this option, their user-contributed content (filtered by the shared content selection below) will be available to the selected SMSs as a continuous, real-time feed.
The set of SMSs the user can choose from to share with should:
- List some of the most-used SMSs that have the ability to do so for easy point-and-click selection. This list should be pre-vetted by the SMS in terms of 1) actual technical ability to interact, and 2) standards of care sufficiently high that shared user data is considered to be “safe” there in the opinion of the SMS.
- Also enable the user to add less-well-known SMSs and other systems, as the law requires the SMS to be open to interoperate with anybody, not just other SMSs that the SMS has a relationship with or approves of.
The SMS should maintain an internal “block list” of SMSs that it believes are deceptive or hostile, and prevent users from sharing with those. It needs to be prepared to defend its choices when challenged.
An open API by which any SMS_B approved by user A can “resolve” a handle identifying user A’s account on SMS_A. The details of “resolve” are highly dependent on the protocol the SMS chooses to implement, but in the abstract it means:
- Determine whether the handle refers to a valid user account on SMS_A.
- Obtain endpoint information on how to obtain content previously created by user A (e.g. to backfill user A’s timeline on SMS_B).
- Obtain endpoint information and register for being notified to obtain content created by user A in the future, or updates to content previously created by user A.
- Obtain the information that SMS_B may need to have to verify that content “pushed” to it by SMS_A indeed came from SMS_A and user A (e.g. public keys, or shared secrets needed for verification).
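To make the abstract description of “resolve” more concrete, here is a minimal sketch that combines it with the user's sharing choice and the internal block list. All names are hypothetical: the `Account` and `Resolution` types, the endpoint paths, the `sms-a.example` origin and the block-listed host are assumptions made for illustration, and a real implementation would follow whatever protocol the SMS adopts.

```python
from dataclasses import dataclass, field
from typing import Optional

BLOCK_LIST = {"hostile.example"}  # hypothetical internal block list of SMS_A


@dataclass
class Account:
    handle: str
    public_key: str
    # SMSs the user has chosen to share with; empty means no interoperability,
    # which is the default.
    shared_with: set = field(default_factory=set)


@dataclass
class Resolution:
    handle: str
    backfill_url: str   # where to fetch previously created content
    subscribe_url: str  # where to register for future content and updates
    public_key: str     # lets SMS_B verify content pushed by SMS_A


def resolve(accounts, handle, requesting_sms) -> Optional[Resolution]:
    """Resolve a handle for SMS_B, or return None if that is not permitted."""
    account = accounts.get(handle)
    if account is None:
        return None  # handle does not refer to a valid account
    if requesting_sms in BLOCK_LIST:
        return None  # SMS_A refuses to share with this SMS
    if requesting_sms not in account.shared_with:
        return None  # the user has not approved this SMS
    base = f"https://sms-a.example/users/{handle}"
    return Resolution(
        handle=handle,
        backfill_url=f"{base}/outbox",
        subscribe_url=f"{base}/subscriptions",
        public_key=account.public_key,
    )
```

This sketch returns the same `None` for unknown handles and for unapproved requesters; whether the two cases should be distinguishable to the requester is a design decision that depends on the chosen protocol.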

While not explicitly required in the law, SMS_A should also implement functionality that:

Enables SMS_B to indicate to SMS_A that a user B on SMS_B wants to “follow” user A on SMS_A, even if user A so far is not interoperating with SMS_B.
Enables user A on SMS_A to approve or deny “follow” requests from specific users on SMS_B, and communicates this back to SMS_B.

For data synchronization, it may be sufficient to implement a mechanism by which a third party can query the user’s profile, and subscribe to an event stream by which they can be notified that a data element of the profile has changed.
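The query-plus-event-stream mechanism can be sketched in a few lines. This is an in-process illustration only: the `ProfileFeed` class and the event layout are hypothetical, and in practice the subscription would run over a network protocol rather than Python callbacks.

```python
class ProfileFeed:
    """Hypothetical profile store that notifies subscribers about changes."""

    def __init__(self, profile):
        self._profile = dict(profile)
        self._subscribers = []

    def query(self):
        # Return a copy so callers cannot modify the profile behind our back.
        return dict(self._profile)

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def update(self, key, value):
        if self._profile.get(key) == value:
            return  # nothing changed, so no event is sent
        self._profile[key] = value
        event = {"type": "Update", "field": key, "value": value}
        for callback in self._subscribers:
            callback(event)
```

A third party would call `query()` once to obtain the current profile and then apply each received event to its own copy, which keeps the copy in sync without repeated polling.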

For a “sufficient base-level implementation” it would be sufficient to make up a home-grown set of protocols that provide the above features. However, as discussed in the strategy section above, this would severely restrict interop in the real world and should better be considered part of the “fight the law” strategy. The use of a standard protocol is thus strongly recommended.

Leading “lean-in” implementation: interoperability and synchronization

For a “lean-in” implementation, an SMS should first implement all items listed in the “sufficient base-level implementation”, and next implement the scenarios called “optimal self-federation” described next.

In the “optimal self-federation scenario”, conceptually, there are two instances of the same SMS_A. (It may be difficult to imagine two instances of Facebook, for example, at facebook.com and anotherfacebook.com. However, we can be quite certain that Meta internally already runs several instances of Facebook: not just the production version that their users use, but internally staged versions for quality assurance and other kinds of development purposes. The assignment for the engineers implementing the “optimal self-federation” scenario would be to make two of these separate instances interoperate so well with each other that users would forget whether they interact with other users on the same or a different instance.) The scenario is successfully implemented if there is no difference in the features and user experience when two users A1 and A2 are interacting:

When both users A1 and A2 have accounts on the same instance of SMS_A.
When user A1 uses an account on SMS_A,1 and user A2 uses an account on SMS_A,2.

Without prejudicing format and protocol recommendations (discussed in a section below), it is useful to consider the ActivityPub-based Fediverse as it exists today as an example. For example, two instances of Mastodon (e.g. mastodon.social and hachyderm.io), interoperating with each other, come pretty close to this “optimal self-federation” scenario for users that follow each other from different SMSs.

But there are challenges as well. For example, the user experience of following a user on the same instance differs from that of following a user on a different instance, and backfill across instances is not consistently handled, leading to messages such as “more content may be available on the original profile”. These need to be resolved.

Once the self-federation works well, it would be advantageous for the SMS “leaning in” to work with other SMSs to make the experience with their SMSs just as seamless. The experience with the ActivityPub-based Fediverse makes it clear that merely implementing the standard protocols is insufficient to provide good experiences for users, which are table stakes for an SMS “leaning in”.

All the comments about selecting standard formats, how to handle differences in the data model and graceful degradation made above for the “lean-in” scenario for “data portability” apply here as well, and more so:

Building an interoperability and synchronization protocol is not simple. For example, there are lots of corner cases that are not readily apparent. Chances of success are much higher when building on a protocol that has been successfully used for this purpose for some time.
Pick a protocol that is governed in a way that gives you the ability to influence its enforcement and evolution. Protocols for purposes like this are subject to competitive capture, where competitors with their own agenda may want to drive the standard in the direction of their own competitive advantage.
Ultimately: choose a protocol that enables you to interoperate with as many other SMSs as possible.