Data Schema
This section explains the structure of the data imported via the Data Portability API.
The Data Portability API streams its response over HTTP as a zip file. The file is named after the email address used to create the user's Booking.com account, with the @ character replaced by an underscore (_). For example, the response for a user with the email address test.user@domain.com is named test.user_domain.com.zip.
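The naming rule can be expressed in a few lines of Python. This is a minimal sketch; the function name zip_name_for is hypothetical and not part of the API.

```python
def zip_name_for(email: str) -> str:
    """Return the expected archive name for an account email address."""
    # Naming rule from the text: replace "@" with "_" and append ".zip".
    return email.replace("@", "_") + ".zip"


print(zip_name_for("test.user@domain.com"))  # test.user_domain.com.zip
```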
When extracted, the zip file contains a report.json file holding the various types of user data. Some data types reference attachments, which are provided in the same zip file in separate directories named after the data type:
    test.user_domain.com
    l------ report.json
    l------ data-type-1/
            l------ attachment-1
            l------ attachment-2
            l------ more attachments...
    l------ more data types...
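A client might read report.json straight from the downloaded archive, as sketched below with Python's standard zipfile module. The function name load_report is hypothetical. The sketch also assumes that report.json may sit either at the archive root or inside a single top-level folder, since the exact member path is not stated here.

```python
import io
import json
import zipfile
from pathlib import PurePosixPath


def load_report(archive) -> dict:
    """Read and parse report.json from a Data Portability zip archive.

    `archive` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        # Assumption: match on the base name only, because the member may be
        # stored at the root or under one top-level folder.
        candidates = [
            name for name in zf.namelist()
            if PurePosixPath(name).name == "report.json"
        ]
        if not candidates:
            raise FileNotFoundError("report.json not found in archive")
        # Prefer the shallowest match in case an attachment shares the name.
        report_name = min(candidates, key=lambda n: len(PurePosixPath(n).parts))
        with zf.open(report_name) as fh:
            return json.load(fh)


# Usage with an in-memory archive standing in for a downloaded response.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("report.json", json.dumps({"FLIGHTS": []}))
buffer.seek(0)
print(load_report(buffer))  # {'FLIGHTS': []}
```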
Report.json
The report.json file contains a JSON object with all the data represented as (key, value) pairs.
The keys denote a particular type of user data, for example RESERVATION_SERVICE for reservation details and FLIGHTS for all information related to flight bookings.
The values are structured as a JSON object or an array of JSON objects.
Each JSON object can be thought of as a separate table holding part of the user data for the type denoted by the root-level key. For example, RESERVATION_SERVICE is the root-level key for the user's reservation data, as shown below. Each object always contains exactly two keys: topic and data.
{
  "RESERVATION_SERVICE": {
    "topic": "Topic 1", "data": { ... } or [ ... ]
  },
  "FLIGHTS": [
    {"topic": "Topic 1", "data": { ... } or [ ... ]},
    {"topic": "Topic 2", "data": { ... } or [ ... ]}
  ],
  ...
  "Data Type N": { ... } or [ ... ]
}
The value of the topic key is always a string and serves as the heading of the table. The value of the corresponding data key is a JSON object or array whose keys denote the column names and whose values contain the row values. Note that a value can itself contain an array of JSON objects, which represents a table nested inside the current table, and further levels of nesting are possible.
{
  "RESERVATION_SERVICE": {
    "topic": "User reservations",
    "data": [
      {"Stayer emails": "*", "Guest name": "John"},
      {"Stayer emails": "*", "Guest name": "Danny"}
    ]
  },
  "FLIGHTS": [
    {"topic": "Flights", "data": {"Route": {"Source": "NL", "Destination": "DE"}, "Date": "17th July 2024"}},
    {"topic": "Bookings", "data": [{"Country": "NL", "Date": "17th July 2024"}]}
  ],
  ...
}
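Because a root-level value may be either a single {topic, data} object or an array of them, a consumer typically normalizes both shapes before processing. The sketch below shows one way to do this; iter_sections is a hypothetical helper name, and the sample data is the example above with the elided entries left out.

```python
from typing import Any, Iterator, Tuple


def iter_sections(report: dict) -> Iterator[Tuple[str, str, Any]]:
    """Yield (data_type, topic, data) for every section in a parsed report."""
    for data_type, value in report.items():
        # A root-level value is one {topic, data} object or a list of them.
        sections = value if isinstance(value, list) else [value]
        for section in sections:
            yield data_type, section["topic"], section["data"]


report = {
    "RESERVATION_SERVICE": {
        "topic": "User reservations",
        "data": [
            {"Stayer emails": "*", "Guest name": "John"},
            {"Stayer emails": "*", "Guest name": "Danny"},
        ],
    },
    "FLIGHTS": [
        {"topic": "Flights",
         "data": {"Route": {"Source": "NL", "Destination": "DE"},
                  "Date": "17th July 2024"}},
        {"topic": "Bookings",
         "data": [{"Country": "NL", "Date": "17th July 2024"}]},
    ],
}

for data_type, topic, data in iter_sections(report):
    print(f"{data_type} / {topic}: {type(data).__name__}")
# RESERVATION_SERVICE / User reservations: list
# FLIGHTS / Flights: dict
# FLIGHTS / Bookings: list
```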
The algorithm that transforms this JSON into a readable document is adapted from Tableify. These are the main rules:
- Each {topic, data} object becomes a document section with the topic value as its header.
- A standalone object (hashmap) becomes a key: value list.
- An array of objects becomes a table. Object keys become column headers, and values become cells in the corresponding column.
- Cells can themselves be nested tables or single values.
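To make these rules concrete, here is a small recursive renderer that emits HTML. It is an illustrative sketch only, not Tableify's or Booking.com's implementation. The function names render_value and render_section are hypothetical, and the handling of arrays of plain values is an assumption, since the rules do not cover that case.

```python
from html import escape


def render_value(value) -> str:
    """Render one JSON value as an HTML fragment."""
    if isinstance(value, dict):
        # Standalone object: a "key: value" list.
        items = "".join(
            f"<li>{escape(str(k))}: {render_value(v)}</li>"
            for k, v in value.items()
        )
        return f"<ul>{items}</ul>"
    if isinstance(value, list) and value and all(isinstance(r, dict) for r in value):
        # Array of objects: a table whose columns are the union of all keys.
        columns = list(dict.fromkeys(k for row in value for k in row))
        head = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
        body = "".join(
            "<tr>"
            + "".join(f"<td>{render_value(row.get(c, ''))}</td>" for c in columns)
            + "</tr>"
            for row in value
        )
        return f"<table><tr>{head}</tr>{body}</table>"
    if isinstance(value, list):
        # Assumption: arrays of plain values are shown as a simple list.
        return "<ul>" + "".join(f"<li>{render_value(v)}</li>" for v in value) + "</ul>"
    # Single value: escaped text. Nested cells recurse through the branches above.
    return escape(str(value))


def render_section(section: dict) -> str:
    """Render a {topic, data} object as a headed document section."""
    return f"<h2>{escape(section['topic'])}</h2>{render_value(section['data'])}"


print(render_section(
    {"topic": "Bookings", "data": [{"Country": "NL", "Date": "17th July 2024"}]}
))
```

Cells are rendered by the same function as top-level values, which is what allows a table to appear inside a table cell.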
Attachments
Some types of user data include attachments. These attachments are stored in the zip file, organized in directories, each named after the corresponding user data type.
Any cell in report.json that contains the (key, value) pair ("type": "file") denotes a file attachment. A file cell has the following format:
{
  "type": "file",
  "filename": "image1.jpg",
  "size": "2500",
  "path": "images/",
  "creationDate": "2014-09-08T08:02:15Z"
}
Properties:
Key | Value |
---|---|
type | "file". Fixed value |
filename | Name of the attachment |
size | Attachment size in bytes |
path | Path of the attachment inside the root directory of the corresponding user data type |
creationDate | Creation date of the attachment, in ISO 8601 format |
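A consumer may want to turn a file cell into typed values. The sketch below is one way to do that; the FileCell class and both function names are hypothetical. It assumes that size arrives as a numeric string, as in the sample above, and that creationDate uses a trailing "Z" for UTC.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FileCell:
    filename: str
    size: int          # attachment size in bytes
    path: str          # relative to the data type's root directory
    created: datetime  # parsed from creationDate


def is_file_cell(cell) -> bool:
    """A cell is a file attachment when it carries "type": "file"."""
    return isinstance(cell, dict) and cell.get("type") == "file"


def parse_file_cell(cell: dict) -> FileCell:
    if not is_file_cell(cell):
        raise ValueError("not a file cell")
    return FileCell(
        filename=cell["filename"],
        # The sample shows size as a string, so convert it explicitly.
        size=int(cell["size"]),
        path=cell["path"],
        # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11,
        # so normalize it to an explicit UTC offset first.
        created=datetime.fromisoformat(cell["creationDate"].replace("Z", "+00:00")),
    )


cell = parse_file_cell({
    "type": "file",
    "filename": "image1.jpg",
    "size": "2500",
    "path": "images/",
    "creationDate": "2014-09-08T08:02:15Z",
})
print(cell.size, cell.created.isoformat())  # 2500 2014-09-08T08:02:15+00:00
```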
Sample report.json for the user with email address test.user@domain.com:
{
  "REVIEWS": [
    {
      "topic": "Property photos",
      "data": [
        {
          "type": "file",
          "filename": "image1.jpg",
          "size": "2500",
          "path": "property-images/",
          "creationDate": "2014-09-08T08:02:15Z"
        },
        {
          "type": "file",
          "filename": "image2.jpg",
          "size": "3000",
          "path": "customer-images/",
          "creationDate": "2014-09-08T08:02:20Z"
        }
      ]
    }
  ]
}
The corresponding zip data file will have the following contents when extracted:
    test.user_domain.com
    l------ report.json
    l------ REVIEWS/
            l------ property-images/
                    l------ image1.jpg
            l------ customer-images/
                    l------ image2.jpg
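The relationship between file cells and the extracted layout can be sketched as follows: each attachment is expected at the data type directory, then the cell's path, then its filename. The helper names iter_file_cells and attachment_paths are hypothetical, and the root directory name is taken from the example above.

```python
from pathlib import Path
from typing import Iterator


def iter_file_cells(value) -> Iterator[dict]:
    """Recursively yield every file cell found in a JSON value."""
    if isinstance(value, dict):
        if value.get("type") == "file":
            yield value
        else:
            for child in value.values():
                yield from iter_file_cells(child)
    elif isinstance(value, list):
        for child in value:
            yield from iter_file_cells(child)


def attachment_paths(report: dict, root: Path) -> Iterator[Path]:
    """Yield the expected on-disk path of each attachment under `root`."""
    for data_type, value in report.items():
        for cell in iter_file_cells(value):
            # Strip slashes so a leading "/" cannot turn the path absolute.
            yield root / data_type / cell["path"].strip("/") / cell["filename"]


report = {
    "REVIEWS": [{
        "topic": "Property photos",
        "data": [
            {"type": "file", "filename": "image1.jpg", "size": "2500",
             "path": "property-images/", "creationDate": "2014-09-08T08:02:15Z"},
            {"type": "file", "filename": "image2.jpg", "size": "3000",
             "path": "customer-images/", "creationDate": "2014-09-08T08:02:20Z"},
        ],
    }]
}

for p in attachment_paths(report, Path("test.user_domain.com")):
    print(p.as_posix())
# test.user_domain.com/REVIEWS/property-images/image1.jpg
# test.user_domain.com/REVIEWS/customer-images/image2.jpg
```

After extraction, calling exists() on each yielded path is a simple way to confirm that every referenced attachment is present.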