6.2 Technical Details
6.2.1 Baseline Format and Character Codings
Revenue declaration messages created in accordance with this standard shall be tab-separated values text files. Further details are provided below.
The messages defined in this standard are coded using UTF-8 with a big endian byte ordering.
Each Record is placed into one line terminated by a line feed (Unicode U+000A) or a carriage return and line feed pair (Unicode U+000D 000A).
Cells within a Record are separated by tab characters (Unicode U+0009).
Should a Cell contain two or more data elements, these data elements shall be separated by a pipe character (Unicode U+007C). This is referred to as a Secondary Delimiter.
All data elements in a multi-value Cell shall be of the same primitive data type.
Should a Cell contain a data element whose provenance needs to be provided, the data element shall be preceded by a string that provides a "namespace" and two colon characters (Unicode U+003A). The double colon is referred to as a Namespace Delimiter.
Typical examples are identifiers: a party ID can be communicated as "ISNI::0000000081266409", indicating that the identifier (0000000081266409) is an International Standard Name Identifier (ISNI).
In addition, if a party provides multiple IDs whose type cannot be defined by a namespace before the double colon, then the type of the identifier can precede the data element, from which it is separated by a colon (Unicode U+003A), e.g. "PADPIDA2007081601G::PartyID:123456", where 123456 is a PartyID as defined by the party with the DPID, PADPIDA2007081601G.
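The two examples above can be taken apart mechanically. The following Python sketch is illustrative only and not part of this standard: the names NamespacedValue and split_namespaced are hypothetical, and it assumes that the first single colon after the Namespace Delimiter always separates an identifier type from the data element.

```python
from typing import NamedTuple, Optional


class NamespacedValue(NamedTuple):
    namespace: Optional[str]  # text before "::", if any
    id_type: Optional[str]    # text between "::" and ":", if any
    value: str                # the data element itself


def split_namespaced(element: str) -> NamespacedValue:
    """Split 'Namespace::Value' or 'Namespace::Type:Value' into its parts."""
    namespace, delimiter, rest = element.partition("::")
    if not delimiter:
        # No Namespace Delimiter present: the whole string is the data element.
        return NamespacedValue(None, None, element)
    # Assumption: a single colon in the remainder introduces the data element.
    id_type, colon, value = rest.partition(":")
    if not colon:
        return NamespacedValue(namespace, None, rest)
    return NamespacedValue(namespace, id_type, value)
```

For instance, split_namespaced("PADPIDA2007081601G::PartyID:123456") yields the namespace PADPIDA2007081601G, the type PartyID and the value 123456.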
Delimiters shall not be surrounded by extra space characters. The same principle applies to the Secondary Delimiter.
Example: Two identifiers, 1234 and ABCD should be communicated in one Cell as "1234|ABCD" and not as "1234 | ABCD".
If a Message Sender has received data with extra white spaces, they are encouraged to trim any such extra white space characters when compiling a RevenueDeclarationMessage. They may, however, also use the data with these extra white space characters in such cases. The same principle applies to the Secondary Delimiter.
Example: When the Message Sender received a Display Artist as "John Lennon ", then the Display Artist should be communicated as "John Lennon" but may be communicated as "John Lennon ".
To communicate a primary Delimiter in a Cell, such a Cell shall not be enclosed in double quote characters. Instead, the Delimiter shall be immediately preceded by an escaping code.
To escape a primary Delimiter, the escaping code is the backslash character (Unicode U+005C). Therefore, the string A[TAB]B would have to be communicated as A\[TAB]B (with [TAB] representing the tabulator).
To escape a Secondary Delimiter, the escaping code is two consecutive backslash characters (Unicode U+005C U+005C). Therefore, the string A|B would have to be communicated as A\\|B.
To communicate a backslash character, three backslash characters need to be communicated. Therefore, the string A\B would have to be communicated as A\\\B.
This “escaping mechanism” must be used for all special characters in all Cells, whether those Cells allow multiple values or not. A non-escaped pipe character in a single-value cell is, consequently, an error. For the avoidance of doubt, escaping a character that should not be escaped, or not escaping a character that should have been escaped, will lead to an invalid message.
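The escaping rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a normative implementation: the names escape_value and parse_record are hypothetical, and the parser treats every unescaped pipe as a Secondary Delimiter, whereas a real ingestor would also need to know which Cells allow multiple values.

```python
TAB, PIPE, BACKSLASH = "\t", "|", "\\"


def escape_value(value: str) -> str:
    """Escape one data element so that it can be placed into a Cell."""
    out = []
    for ch in value:
        if ch == BACKSLASH:
            out.append(BACKSLASH * 3)         # \   becomes \\\
        elif ch == PIPE:
            out.append(BACKSLASH * 2 + PIPE)  # |   becomes \\|
        elif ch == TAB:
            out.append(BACKSLASH + TAB)       # TAB becomes \TAB
        else:
            out.append(ch)
    return "".join(out)


def parse_record(line: str) -> list[list[str]]:
    """Split one Record into Cells, and each Cell into unescaped values.

    Assumption: an empty Cell is returned as a list holding one empty string.
    """
    cells = [[""]]
    i = 0
    while i < len(line):
        if line.startswith(BACKSLASH * 3, i):
            cells[-1][-1] += BACKSLASH
            i += 3
        elif line.startswith(BACKSLASH * 2 + PIPE, i):
            cells[-1][-1] += PIPE
            i += 3
        elif line.startswith(BACKSLASH + TAB, i):
            cells[-1][-1] += TAB
            i += 2
        elif line[i] == BACKSLASH:
            # A backslash that starts none of the three escape sequences.
            raise ValueError(f"invalid escape sequence at offset {i}")
        elif line[i] == TAB:
            cells.append([""])    # primary Delimiter: start the next Cell
            i += 1
        elif line[i] == PIPE:
            cells[-1].append("")  # Secondary Delimiter: start the next value
            i += 1
        else:
            cells[-1][-1] += line[i]
            i += 1
    return cells
```

Scanning from left to right and testing the three-backslash sequence first keeps the decoding unambiguous, for example when a literal backslash is immediately followed by an escaped pipe.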
It is not permissible to include empty Records in a RevenueDeclarationMessage. Records whose first character is the hash symbol (“#”, Unicode U+0023) shall be ignored by automated ingestors. These Records are included solely to aid human readability.
It is possible to communicate empty Cells. In this case, two tab characters shall follow each other with no characters in-between. The semantics of an empty Cell is determined by the commercial relationship between Message Sender and Message Recipient, unless specifically defined in this standard.
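The Record-level rules (line terminators, comment Records, the ban on empty Records) can be sketched as a small Python generator. The name iter_records is hypothetical, and treating the final line terminator as closing the last Record rather than opening an empty one is an assumption of this sketch.

```python
from typing import Iterator


def iter_records(text: str) -> Iterator[str]:
    """Yield the raw text of each Record that an ingestor should process."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        # Assumption: the final line terminator does not start a new Record.
        lines.pop()
    for number, line in enumerate(lines, start=1):
        if line.endswith("\r"):
            line = line[:-1]  # accept CR LF as well as LF
        if line == "":
            raise ValueError(f"line {number}: empty Records are not permitted")
        if line.startswith("#"):
            continue          # comment Record, for human readers only
        yield line
```

A Record such as A[TAB][TAB]B passes through unchanged; its middle Cell is empty because the two tab characters follow each other directly.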
6.2.3 Primitive Data Types
Within this standard the following primitive data types are used. This standard does not prescribe a specific precision in which numbers are to be given. Message Sender and Message Recipient must agree on a precision that is suitable for the type of transactions at hand.
A date in the ISO 8601 format to indicate a single year (YYYY), month (YYYY-MM) or day (YYYY-MM-DD).
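A date value in one of the three permitted granularities could be checked as in the following Python sketch. The function name parse_date and the tuple it returns are hypothetical choices made for this illustration.

```python
import re
from datetime import date
from typing import Optional, Tuple

_DATE = re.compile(r"([0-9]{4})(?:-([0-9]{2})(?:-([0-9]{2}))?)?")


def parse_date(text: str) -> Tuple[int, Optional[int], Optional[int]]:
    """Return (year, month, day); month and day are None when not given."""
    match = _DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"not YYYY, YYYY-MM or YYYY-MM-DD: {text!r}")
    year, month, day = (int(g) if g else None for g in match.groups())
    # Constructing a date raises ValueError for impossible values such as
    # month 13 or 30 February; missing parts default to 1 for this check only.
    date(year, month or 1, day or 1)
    return year, month, day
```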
A duration in the ISO 8601 format (P[nY][nM][nD][T[nH][nM]nS]).
Elements including their designator may be omitted if their value is zero.
Note that the expressions “PT3M2S” (three minutes and two seconds) and “PT182S” (182 seconds) are both permitted and are equivalent.
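The equivalence of the two expressions can be checked by converting durations to seconds, as in the Python sketch below. It is illustrative only: duration_to_seconds is a hypothetical name, the sketch assumes whole-number values, and it rejects years and months because they have no fixed length in seconds.

```python
import re

_DURATION = re.compile(
    r"P(?:(?P<years>[0-9]+)Y)?(?:(?P<months>[0-9]+)M)?(?:(?P<days>[0-9]+)D)?"
    r"(?:T(?:(?P<hours>[0-9]+)H)?(?:(?P<minutes>[0-9]+)M)?(?:(?P<seconds>[0-9]+)S)?)?"
)


def duration_to_seconds(text: str) -> int:
    """Convert a duration made of days, hours, minutes and seconds to seconds."""
    match = _DURATION.fullmatch(text)
    # "P" on its own, or a "T" with nothing after it, carries no element.
    if match is None or text == "P" or text.endswith("T"):
        raise ValueError(f"not a valid duration: {text!r}")
    parts = {name: int(value) if value else 0
             for name, value in match.groupdict().items()}
    if parts["years"] or parts["months"]:
        raise ValueError("years and months cannot be converted to seconds")
    hours = parts["days"] * 24 + parts["hours"]
    return (hours * 60 + parts["minutes"]) * 60 + parts["seconds"]
```

Both duration_to_seconds("PT3M2S") and duration_to_seconds("PT182S") return 182.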
A sequence of 0-n strings in accordance with a defined data type separated by Secondary Delimiters.
If only one data item is to be provided, no Secondary Delimiter shall be included.
If the cardinality of such an element is "M", at least one such data item must be provided.
Cells that may contain multiple values may be related. For example, one Multiple-String Cell may be for names of composers/lyricists and it may be followed by a Multiple-String Cell for identifiers of these composers/lyricists. In those cases, both Cells must contain the same number of Secondary Delimiters and there may only be 0-1 data items between the Secondary Delimiters.
"12" or "12|54|123" or "12||123"
(without the quotes)
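The rule for related multi-value Cells can be sketched in Python as a positional pairing. The name pair_related_cells is hypothetical, the sample names in the usage note are invented for illustration, and for brevity the sketch assumes that neither Cell contains an escaped pipe character.

```python
from typing import List, Optional, Tuple


def pair_related_cells(
    first: str, second: str
) -> List[Tuple[Optional[str], Optional[str]]]:
    """Pair the data items of two related Cells by position."""
    # Assumption: no escaped pipes, so a plain split is sufficient here.
    first_items = first.split("|")
    second_items = second.split("|")
    if len(first_items) != len(second_items):
        raise ValueError(
            "related Cells must contain the same number of Secondary Delimiters"
        )
    # An empty slot between two delimiters means "no data item" (None).
    return [(a or None, b or None) for a, b in zip(first_items, second_items)]
```

With the Cells "Ann|Bob|Cy" and "12||123", the second name is paired with no identifier, while the first and third names are paired with 12 and 123.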
A sequence of characters with a length of at least one character. This standard does not define a maximum length. Strings may not contain non-printable characters (Unicode U+0000 to U+001F).
A sequence of digits to represent positive or negative integer numbers, positive or negative decimal fractions or zero. See Clause 6.4 regarding precision of amounts.
The character used to separate the integers from fractions is the dot (“.”, Unicode U+002E).
Thousands separators or any other digit grouping shall not be used.
Numbers may be represented with a trailing decimal point, a trailing “.0” or with no trailing characters. For example, the number five can be represented as “5”, “5.” or “5.0”. Multiple trailing “0” should not be used, so the value 5.5 should not be represented as “5.50” or “5.5000”. When the number 0 (zero) is to be communicated, it shall be represented as 0 (and not 0.0 or 0.000000).
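A Message Sender could produce the preferred representation with a helper like the Python sketch below. The name format_decimal is hypothetical, and the sketch always picks the shortest permitted form (no trailing decimal point and no trailing zeros), which is only one of the representations allowed above.

```python
from decimal import Decimal


def format_decimal(value: Decimal) -> str:
    """Render a number with a dot separator, no grouping, no trailing zeros."""
    if value == 0:
        return "0"                # zero is always communicated as 0
    text = format(value, "f")     # fixed-point notation, never an exponent
    if "." in text:
        # Strip trailing zeros only after a decimal point, so that an
        # integer such as 1000 keeps its zeros.
        text = text.rstrip("0").rstrip(".")
    return text
```

For example, format_decimal(Decimal("5.50")) returns "5.5" and format_decimal(Decimal("0.000")) returns "0".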
A string containing either “true” or “false”.
A string in the form IdType::ID where IdType identifies the identification scheme. Examples are ISNI for an ISO 27729 International Standard Name Identifier, IPI for a CISAC Interested Party Identifier or IPN for a SCAPR International Performer Number. The ID element then contains the identifier in the format defined by the identification scheme. Only one identifier can be communicated for each party in a PartyID field.
DDEX Party ID
A string of 18 characters in accordance with the DDEX Party ID standard.
Note that the DDEX Party ID standard stipulates that DDEX Party IDs do not contain dashes in computer-to-computer communications. Therefore, DDEX Party IDs, when included in a
A string from a pre-defined allowed value set. Allowed value sets and their allowed values are listed, defined and provided in Clause 8.
To override a DDEX-defined value, the DDEX-defined value
Such user-defined values will need to be agreed between Message Sender and Message Recipient and users are encouraged to raise these issues with DDEX with a view to having these values added to the list of DDEX-defined allowed values.